28 Data Management and Governance
To be a successful people analytics professional, you need a solid foundation in data management and governance. In fact, people analytics isn’t possible without them. Data management is the process of structuring, securing, and preserving data to maintain its usability, quality, and availability. From data production and collection through storage, retrieval, and eventual destruction, it encompasses any action you might take to ensure the appropriate use of data. Your approach to data management should align with an overarching framework and guidelines for ensuring quality, security, ethics, and privacy. This overarching framework is what we call data governance, and it is usually formalized through policies, standards, and procedures. Data governance indicates how data should be handled, used, and protected. You can think of data governance as the rules for what should happen and data management as the actions taken to follow those rules.
Putting It Into Context: A People Analytics Example
Data management and governance include many components that are easier to understand in context than through dry definitions. So, I’ll discuss them within a people analytics example and highlight important data management and governance components in bold text while using italics to draw your attention to key terms or items for consideration.
Example: A company wants to boost employee satisfaction by analyzing data from an employee engagement survey and acting on the resulting insights. They have asked you to help make this happen.
Here are the many different data management and governance aspects you may need to consider for this scenario:
First, you need to define how you will measure engagement. Next, you would design a method for data creation. In this case, the company has already decided on a survey, but you will still need to write and format questions that provide valid and reliable measurements and insights relevant to engagement. You also must determine how and when to distribute the survey to employees and whether the information gathered will be kept anonymous or confidential.
Before you can send the survey, you need systems in place to handle the secure collection, loading, and storage of this information. How and where will the data first be entered? (Will it be an online survey? Paper surveys? Will it be administered by a third-party vendor, or directly by you?) Where will it be stored once it has been gathered? (Will it be in a centralized database? A data repository? Will it be hosted locally or in the cloud?) When making these decisions, you will need to plan for your expected data volume and data processing needs.
You also need to make sure you have established, or are following, data governance policies so that the methods you choose comply with privacy, security, and other requirements. These policies should cover and define aspects such as roles, responsibilities, ownership, employee communication, training, remediation for data breaches, and more. Wherever the data is stored, access should be restricted to authorized individuals, and the storage and usage policies must adhere to data privacy laws, regulations, and standards. You may need to apply techniques like anonymization or de-identification to ensure privacy, and you will need to confirm that data security measures are in place and working as intended to prevent unauthorized access, breaches, or misuse of the data. Your role in protecting the data extends beyond access: you will also be responsible for protecting the data against inappropriate use.
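As one illustration of the de-identification step mentioned above, here is a minimal Python sketch that drops direct identifiers and replaces the employee ID with a keyed hash. The field names (employee_id, name, email, respondent_key) and the key handling are assumptions made for this example, not part of any particular survey tool. Note that this is pseudonymization rather than full anonymization: anyone holding the key can re-link responses, so the key must be stored separately and governed by the same access policies.

```python
import hashlib
import hmac

# Hypothetical names for direct-identifier fields; adjust to the actual survey export.
DIRECT_IDENTIFIERS = {"name", "email"}


def pseudonymize_id(employee_id: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 digest so the raw ID never appears in the analysis file."""
    return hmac.new(secret_key, employee_id.encode("utf-8"), hashlib.sha256).hexdigest()


def deidentify(record: dict, secret_key: bytes) -> dict:
    """Drop direct identifiers and replace employee_id with a pseudonymous key."""
    cleaned = {
        field: value
        for field, value in record.items()
        if field not in DIRECT_IDENTIFIERS and field != "employee_id"
    }
    cleaned["respondent_key"] = pseudonymize_id(str(record["employee_id"]), secret_key)
    return cleaned


if __name__ == "__main__":
    # Placeholder key for illustration only; a real key belongs in a secrets store.
    key = b"replace-with-a-secret-held-outside-the-dataset"
    raw = {"employee_id": "E1001", "name": "Pat Lee", "email": "pat@example.com", "q1": "Agree"}
    print(deidentify(raw, key))
```

A keyed hash is used here instead of a plain hash because employee IDs are often short and guessable, which would make an unkeyed hash easy to reverse by brute force. Whether this level of protection is sufficient depends on your organization's governance policies and applicable privacy law.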
The data should be used only for the approved purposes for which employee consent was obtained during the data collection process, and it is your job to make sure it stays that way. When it’s time to use the data for its approved purpose, you will need to extract the information from its data storage location and load it into some kind of tool to be analyzed. Depending on the storage system and analytical tools used, this may range from simple downloads of the data to more advanced data integration processes.
Data is rarely collected in a way that is immediately ready for use. You will likely have to transform it first. Sometimes this means changing the actual structure of the data. For example, if you want to use Microsoft Excel, the data needs to be arranged in rows and columns, or saved as comma-separated values, so you can view it in a spreadsheet. You may also need to make changes to the data itself. For example, if you received responses on a Likert-type scale of options from “strongly agree” to “strongly disagree,” you may need to recode those words to numerical values so you can analyze them quantitatively. Even when the data is well structured and recoded, survey results almost always contain errors, missing values, or inconsistencies. So, you will need to ensure data quality through data validation and data cleaning processes that focus on assessing and addressing any issues regarding accuracy, completeness, validity, and reliability.
Only once the survey data is prepared and trustworthy can you start to analyze and report your findings. But your journey doesn’t end after you share your findings. You will need to adhere to your established data governance policies regarding data retention and data deletion, which dictate how long and in what ways the information can be stored or used.
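The recoding and basic quality checks described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the five labels (including the "neutral" midpoint), the 1 to 5 coding direction, and the field names q1, q2, and respondent_key are hypothetical choices, and a real survey would use whatever scale and question IDs were actually administered.

```python
# Assumed 5-point coding; the direction (5 = most favorable) is a convention chosen here.
LIKERT_CODES = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}


def recode_response(text):
    """Map a Likert label to its numeric code; return None for blank or unrecognized input."""
    if text is None:
        return None
    return LIKERT_CODES.get(text.strip().lower())


def recode_and_check(responses, questions):
    """Recode each question and count the values that could not be recoded."""
    recoded = []
    issues = {question: 0 for question in questions}
    for row in responses:
        new_row = dict(row)  # copy so the raw responses stay untouched
        for question in questions:
            code = recode_response(row.get(question))
            if code is None:
                issues[question] += 1  # missing, blank, or misspelled answer
            new_row[question] = code
        recoded.append(new_row)
    return recoded, issues


if __name__ == "__main__":
    rows = [
        {"respondent_key": "a1", "q1": "Strongly Agree", "q2": "Disagree"},
        {"respondent_key": "b2", "q1": "", "q2": "agre"},
    ]
    recoded_rows, issue_counts = recode_and_check(rows, ["q1", "q2"])
    print(recoded_rows)
    print(issue_counts)
```

Counting unrecodable values per question, rather than silently dropping them, gives you a simple completeness check: a question with an unusually high count may point to a data entry problem or a confusing item, which is worth investigating before analysis.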
And, since engagement surveys are usually a recurring activity, you may need to create standardized reporting methods for how survey results will be shared in the future.
This scenario is meant only as an example, but it illustrates how even something as simple as a survey requires a tremendous amount of thoughtful data governance (the setting of policies and procedures for how the data should be managed) and data management (the steps taken to follow those policies and procedures). Data governance and management are important for any type of work that involves data (which is probably everything these days!), but they are especially crucial in people analytics. We deal with private and sensitive information, and the decisions made from that data affect people’s lives, so the importance of privacy, security, and accuracy cannot be overstated.
In my opinion, data governance and management are skills that separate people analytics practitioners from people analytics leaders. Only when we are good stewards of people data can we be good stewards of the people analytics profession.
Looking for more guidance on data governance? ISO 30439 "Human Resource Management: Safe Handling of Data" is a standard published by the International Organization for Standardization specifically designed to support those involved in human resource management. It provides steps that an organization can follow to ensure people data is handled appropriately at all stages of the data management process.