19 Predictive People Analytics
Predictive analytics uses historical data to produce forecasts or predictions about future events or outcomes. It does this by using statistical modeling, machine learning algorithms, and/or data mining approaches to find relationships and patterns that can be used to forecast future behavior. Predictive analytics often seeks to answer the question, ‘What is likely to happen?’
Let’s say a company wants to forecast how much employee attrition will occur over the next six months. It can compile historical data on employees, such as their demographics, performance indicators, pay and benefits information, amount of training, and attrition rates. Using techniques such as regression, decision trees, or other machine learning methods, the company can create a ‘predictive model’ that attempts to predict attrition while taking into account factors like tenure, job satisfaction, performance evaluations, training history, and compensation. The model can produce a probability score that indicates how likely it is that a worker (or a given number of workers) will leave within a certain time frame. This approach can be used both to forecast how much attrition may occur and to identify factors that appear to be predictive of attrition. For example, if a model shows that the percentage increase in pay an employee has received is significantly ‘predictive’ of attrition, the company could use that factor to help forecast expected attrition. It would also have evidence that pay raises, or a lack of them, may be an important factor in an employee’s decision to leave the company.
Approaches to predictive modeling differ widely. Some seek to define probabilities for individuals, some seek to categorize likelihood (such as low, medium, or high risk of attrition), and some seek to forecast volumes of attrition for workforce planning purposes. How a predictive model is designed should match the goal of the organization and also take into account the very real impacts that making predictions can have on actual people.
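To make the idea of a ‘probability score’ more concrete, here is a minimal sketch in Python of one possible approach: a small logistic regression fitted by gradient descent. Everything in it is an assumption made for illustration. The records in `history` are invented, the two predictors (tenure and percentage pay increase) are just examples of the factors mentioned above, and the function names and the 0.33 and 0.66 cutoffs used for the low, medium, and high bands are hypothetical. A real project would use far more data, a tested statistics library, and proper validation.

```python
import math

# Hypothetical historical records, invented for illustration only:
# (tenure_years, pct_pay_increase, left_company) where 1 = left, 0 = stayed.
history = [
    (0.5, 0.0, 1), (1.0, 1.0, 1), (1.5, 0.0, 1), (2.0, 2.0, 1),
    (2.5, 1.0, 0), (3.0, 4.0, 0), (4.0, 3.0, 0), (5.0, 5.0, 0),
    (6.0, 2.0, 0), (1.0, 5.0, 0), (0.8, 0.5, 1), (7.0, 0.0, 1),
]


def sigmoid(z):
    """Squash any real number into a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, learning_rate=0.05, epochs=5000):
    """Fit an intercept and two weights by batch gradient descent on log loss."""
    w = [0.0, 0.0, 0.0]  # intercept, tenure weight, pay-increase weight
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for tenure, raise_pct, left in rows:
            x = (1.0, tenure, raise_pct)
            predicted = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
            error = predicted - left
            for j in range(3):
                grad[j] += error * x[j] / n
        w = [wi - learning_rate * g for wi, g in zip(w, grad)]
    return w


def attrition_probability(w, tenure, raise_pct):
    """Score one employee: estimated probability of leaving."""
    return sigmoid(w[0] + w[1] * tenure + w[2] * raise_pct)


def risk_band(p, low=0.33, high=0.66):
    """Turn a probability into a coarse category (cutoffs are arbitrary)."""
    if p < low:
        return "low"
    if p < high:
        return "medium"
    return "high"


weights = fit_logistic(history)

# Score two hypothetical employees with the same tenure but different raises.
for tenure, raise_pct in [(1.0, 0.0), (1.0, 5.0)]:
    p = attrition_probability(weights, tenure, raise_pct)
    print(f"tenure={tenure}y raise={raise_pct}% -> p={p:.2f} ({risk_band(p)})")

# Summing individual probabilities gives a rough expected number of leavers,
# which is one way to move from individual scores to a volume forecast.
expected_leavers = sum(attrition_probability(weights, t, r) for t, r, _ in history)
print(f"expected leavers in this group: {expected_leavers:.1f}")
```

The same fitted model serves the three purposes described above: the raw probability is an individual score, `risk_band` turns it into a category, and the sum of probabilities is a volume estimate. Which of these is reported, if any, is a design decision that should reflect the goal of the organization and the impact on the people being scored.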
With predictive insights at hand, the company can proactively implement strategies like retention initiatives, stay interviews, and development opportunities, or seek to resolve specific issues identified in the model. By using predictive analytics to identify employees who are at risk of leaving, the company may take action to keep top talent, increase engagement, and lower attrition rates. But it is important to note that just because a predictive model says that something may happen, that doesn’t mean that it will.
Predictive analytics is heavily hyped in people analytics today. Those who are new to using data to inform people decisions can sometimes be drawn in by the promise of the word “predictive.” They may erroneously believe that the results really do indicate what will happen. As a people analytics professional, it’s important to make sure others understand that no prediction is ever guaranteed to be accurate and that a predictive model is not actually about the future; it is about the past. It is built using data from the past to make a best guess about what might happen in the future, and that guess holds only if things continue to follow the same patterns. It might be better to call this “perhaps analytics,” because we are only guessing that things will continue as before, and any change in the world changes the likely accuracy of our prediction.
Personal opinion: In people analytics, the best predictions are the ones that turn out to be wrong.
Wait, what? Yep, you heard me. I believe that the best predictive analytics in people analytics are those that end up being the furthest from reality in the end. In fact, the predictive model I am the most ashamed of in my career was an annual predictive attrition model that really did seem to predict what would happen. It was amazing statistically: it had a crazy low p-value, a high adjusted R-squared, low error estimates, and beautifully low false prediction rates, and the final number of people who actually did leave the company that year ended up being 99% of the number I predicted it would be. Mathematically, I couldn’t have built a better model. And, at the time, I was very proud of myself. However, in retrospect, I consider that model and what I did with it one of the most embarrassing things I’ve done in my people analytics career. I had these insights, I had a pretty good idea of what was going to happen, where, and when, but I didn’t do anything about it. That year should have been spent preparing the talent acquisition team for backfilling those positions, creating knowledge transfer opportunities, and devising retention or alternate work strategies.
It is true that, because we can never know for sure whether our models will be that accurate, we may not always want to act on all of them. And it is sometimes the case that you may be building models to understand patterns and not necessarily to take action on all aspects. But I personally got into people analytics so data could be used to help inform decisions and make work better for employees. I am most proud of the predictive models I built that were statistically sound (and could be validated through other means), but that ended up being far from what was predicted in the end. Not because they were ‘wrong,’ but because the insights were used to take action and more positive futures were realized from what was learned.
Predictive Analytics Skills
If you are attempting to predict what might happen, based on what has happened in the past, focus on building the following predictive analytics skills often used in people analytics:
- Regression Analysis: This technique estimates relationships between variables in order to predict future values. For example, you could conduct a regression analysis on historical employee data, including factors like years of experience, training hours, performance ratings, and even tenure with the company, to help you estimate how these variables relate to salary growth, which can help the organization establish fair and equitable compensation structures.
- Causal Inference: This approach goes a step further than identifying relationships (through things like correlation or regression) and seeks to establish causal relationships between variables. In people analytics, causal inference can be used to determine whether specific workforce practices (like new training programs or wellness initiatives) actually lead to improvements in employee engagement or performance.
- Decision Trees: At its simplest, this technique works like a flowchart (though it can be made very complex if desired). Imagine a series of questions, each with a yes or no answer, that ultimately lead you to a specific outcome. Decision trees work similarly, but they learn these questions and the resulting outcomes by analyzing historical data. In people analytics, decision trees can be useful for classifying employees based on various factors. For example, you could build a decision tree to predict employee attrition. The tree might ask questions like: “Does the employee have less than two years of experience?” or “Has the employee received a high number of performance reviews in the last year?” Based on the answers to these questions for each employee, the decision tree would then categorize them as high, medium, or low risk for leaving the company. A key advantage of decision trees is their interpretability. Unlike more opaque ‘black box’ algorithms, decision trees have a logic behind their predictions that is more easily visualized and explained, which can be really useful when you need to explain your reasoning to others. They also make it a bit easier to see which factors are most influential in the final outcome.
- Time Series Analysis: This method focuses on analyzing time-dependent data to forecast future trends. For example, you might apply a time series analysis to employee absenteeism data collected over several years. By identifying seasonal patterns and trends, you can develop proactive strategies to manage staffing levels and ensure smooth operations during peak absenteeism periods.
- Survival Analysis: This technique is related to time series analysis but goes beyond simply predicting whether an event (like employee turnover) will occur and focuses on the time it takes for that event to occur. Survival analysis can help you understand things like the duration of employee lifecycles within the organization and identify factors that influence employee longevity. For example, imagine you’re studying a group of 100 new hires. Instead of just predicting which people are likely to leave, survival analysis could help you estimate how long you would expect those employees to stay with the company on average.
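As an illustration of the survival analysis idea in the last item, here is a minimal sketch in Python of the Kaplan-Meier estimator, one common survival analysis method that is chosen here purely for illustration. The ten hires, their durations, and the function names are all hypothetical. The one concept the sketch adds is that some hires are still employed when the data is pulled, so we only know they stayed at least that long; the estimator keeps them in the calculation until that point rather than discarding them.

```python
def kaplan_meier(durations, left_flags):
    """Estimate the share of hires still employed after each departure month.

    durations:  months each hire was observed
    left_flags: True if the hire left at that month, False if they were
                still employed when observation ended
    Returns a list of (month, estimated_share_still_employed) pairs.
    """
    # Only months where at least one person actually left change the curve.
    event_months = sorted({d for d, left in zip(durations, left_flags) if left})
    share_remaining = 1.0
    curve = []
    for month in event_months:
        # Everyone observed at least this long was "at risk" of leaving.
        at_risk = sum(1 for d in durations if d >= month)
        leavers = sum(
            1 for d, left in zip(durations, left_flags) if left and d == month
        )
        share_remaining *= 1.0 - leavers / at_risk
        curve.append((month, share_remaining))
    return curve


def median_tenure(curve):
    """First month at which half or fewer of the hires are estimated to remain."""
    for month, share in curve:
        if share <= 0.5:
            return month
    return None  # the curve never dropped to 50% in the observed window


# Hypothetical cohort of ten new hires (invented numbers).
months_observed = [3, 6, 6, 9, 12, 12, 15, 18, 24, 24]
left_company = [True, True, False, True, True, False, True, False, True, False]

curve = kaplan_meier(months_observed, left_company)
for month, share in curve:
    print(f"month {month:>2}: {share:.0%} estimated still employed")
print("estimated median tenure (months):", median_tenure(curve))
```

The median is used here rather than the average because, when some hires have not yet left, an average tenure cannot be calculated directly from the observed data. Extending this to "factors that influence employee longevity" would require a regression-style survival model, which is beyond this sketch.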