Poor data quality usually has three causes:
- The processes, responsibilities and accountabilities are not regulated clearly enough.
- A harmonized data system landscape is missing.
- Certain quality assurance measures are not implemented.
Data is only the gold of the 21st century if its fineness is high enough.
The eight most important measures of modern data quality management are as follows:
1. Make data and processes transparent for everyone
The foundation of data quality management is transparency. Employees must all have the same view of the data and processes.
This is why complete transparency of processes, data sets and data structures is crucial:
- Make the data and processes visible, transparent and comprehensible for all employees in the company.
- Ensure that the processes are updated regularly.
2. Implement a data release process
Create a process that regulates the release of data. The three most important points here are:
- Introduce a standard release process that enforces the dual control (four-eyes) principle within a system.
- It is important that this process is explicit - with approval documented in the system - for entering, changing and deleting data.
- The process must be known to all those involved and is best carried out using a workflow.
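The dual control principle described above can be illustrated with a small sketch. It is not taken from any particular workflow system; the class `ChangeRequest`, its fields and the user names are hypothetical and only show the idea that a change is released once a second person has approved it and that the approval is documented.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

ALLOWED_ACTIONS = ("create", "change", "delete")


@dataclass
class ChangeRequest:
    """Hypothetical request to create, change or delete a data record."""
    record_id: str
    action: str
    requested_by: str
    approved_by: Optional[str] = None
    approved_at: Optional[datetime] = None

    def __post_init__(self) -> None:
        if self.action not in ALLOWED_ACTIONS:
            raise ValueError(f"unknown action: {self.action!r}")

    def approve(self, approver: str) -> None:
        # Dual control: the requester may not approve their own request.
        if approver == self.requested_by:
            raise PermissionError("requester and approver must be different people")
        if self.approved_by is not None:
            raise ValueError("request has already been approved")
        # The approval is documented on the request itself.
        self.approved_by = approver
        self.approved_at = datetime.now(timezone.utc)

    @property
    def released(self) -> bool:
        return self.approved_by is not None


request = ChangeRequest("CUST-001", "change", requested_by="alice")
request.approve("bob")  # a second person releases the change
```

In a real workflow tool the approval would typically be stored in the system's audit log rather than on an in-memory object.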
3. Develop comprehensible data governance
Introduce data governance - define which data may be entered and added when, how, by whom and in which system.
Such data governance describes all important key points, rules and processes for
- data entry,
- data modification and
- data deletion,
as well as the responsibilities regarding
- the data,
- the process to be used and
- the respective system properties.
The data governance document must contain all of this information and be accessible and understandable to everyone.
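One way to make such rules checkable is to write them down as data. The following is a minimal sketch under that assumption; the `Rule` type, the roles, the data types and the system names are invented for illustration and do not come from a specific governance framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical governance rule: who may do what to which data, and where."""
    data_type: str
    action: str   # "create", "change" or "delete"
    role: str
    system: str


# Illustrative rule set; a real one would be derived from the governance document.
RULES = {
    Rule("customer", "create", "data_steward", "core"),
    Rule("customer", "change", "data_steward", "core"),
    Rule("customer", "delete", "chief_data_officer", "core"),
}


def is_allowed(data_type: str, action: str, role: str, system: str) -> bool:
    """Return True only if an explicit rule permits the operation."""
    return Rule(data_type, action, role, system) in RULES


print(is_allowed("customer", "change", "data_steward", "core"))  # True
print(is_allowed("customer", "delete", "data_steward", "core"))  # False
```

Anything not explicitly listed is denied, which keeps the rule set easy to read and review.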
4. Assign data ownership
Make employees responsible for the data and processes!
It is important that clear responsibility for the data, processes and the associated data quality is assigned at both the management level of the company (Chief Data Officer - CDO) and the employee level (data steward).
This responsibility must be clear, transparent and documented for all employees, who should also receive regular training on their tasks and responsibilities.
5. Harmonize data systems
Harmonize your data system landscape!
Pipelines, databases and systems often grow organically and under very different conditions. This leads to confusing structures and incompatible formats.
A minimum goal would be to connect the systems in which data is maintained directly with each other, for example via the SAP document format IDoc. This makes it possible, for example, to automatically propagate a change to a data record in one system to all other systems.
Ideally, however, you should have a single system for data entry, data changes and data deletion, from which you can update all other systems by distributing your data records from the core system to the other systems.
One way or another, you should regularly and systematically analyze the sources, entries and quality of the data in the respective systems.
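The single-core-system approach can be sketched as follows. This is a simplified illustration that uses in-memory dictionaries in place of real systems; the function `distribute` and the system names are assumptions, and a production setup would use the interfaces of the systems involved.

```python
from typing import Dict

Record = Dict[str, str]
System = Dict[str, Record]  # record id -> record


def distribute(record_id: str, core: System, targets: Dict[str, System]) -> None:
    """Push one record from the core system to every target system.

    A record that no longer exists in the core is treated as deleted
    and is removed from all targets as well.
    """
    record = core.get(record_id)
    for system in targets.values():
        if record is None:
            system.pop(record_id, None)
        else:
            # Copy the record so a target cannot modify the core's data.
            system[record_id] = dict(record)


core = {"C1": {"name": "Acme Corp"}}
targets = {"crm": {}, "erp": {"C1": {"name": "Acme"}}}

distribute("C1", core, targets)  # both targets now hold the core version
```

Because all entries, changes and deletions happen in the core, the target systems never need to be reconciled against each other.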
6. Install regular reporting on the data
Run analyses and define standard reports and dashboards for the data, data quality and data types.
It is important that there is regular - ideally monthly - reporting on the data for all employees:
- In this reporting, all new, changed and deleted data is shown.
- The number of errors during data entry and modification is listed.
- The data report is best sent automatically and regularly with a direct link to the reporting system.
These reports form the basis for the analysis of data quality and data usage.
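The core of such a report, the list of new, changed and deleted records, can be derived by comparing two snapshots of the data. The sketch below assumes that each snapshot is a dictionary keyed by record ID; the function name `change_report` and the sample records are hypothetical.

```python
from typing import Dict, List


def change_report(previous: Dict[str, dict], current: Dict[str, dict]) -> Dict[str, List[str]]:
    """Compare two snapshots and list new, changed and deleted record IDs."""
    prev_ids, curr_ids = set(previous), set(current)
    return {
        "new": sorted(curr_ids - prev_ids),
        "changed": sorted(
            rid for rid in prev_ids & curr_ids if previous[rid] != current[rid]
        ),
        "deleted": sorted(prev_ids - curr_ids),
    }


last_month = {
    "C1": {"name": "Acme"},
    "C2": {"name": "Globex"},
    "C3": {"name": "Initech"},
}
this_month = {
    "C1": {"name": "Acme"},
    "C2": {"name": "Globex Corp"},
    "C4": {"name": "Umbrella"},
}

print(change_report(last_month, this_month))
# {'new': ['C4'], 'changed': ['C2'], 'deleted': ['C3']}
```

A scheduled job could run this comparison monthly and feed the result into the dashboard or the e-mailed report.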
7. Ensure that data records are complete
Make sure that data records are always up-to-date and complete.
Each data record has field contents or attributes that must be filled in when the record is created. Therefore, define a "golden record" for each type of data record that specifies which fields must not remain empty.
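A completeness check against such a definition might look like the following sketch. The field names in `REQUIRED_FIELDS` are purely illustrative; in practice they would come from the golden record definition for the respective data type.

```python
from typing import Iterable, List

# Hypothetical mandatory fields for a customer record.
REQUIRED_FIELDS = ("customer_id", "name", "country")


def missing_fields(record: dict, required: Iterable[str] = REQUIRED_FIELDS) -> List[str]:
    """Return the required fields that are absent, None or blank."""
    missing = []
    for field_name in required:
        value = record.get(field_name)
        if value is None or str(value).strip() == "":
            missing.append(field_name)
    return missing


record = {"customer_id": "C1", "name": "  ", "city": "Berlin"}
print(missing_fields(record))  # ['name', 'country']
```

Running this check at data entry prevents incomplete records from being created, and running it periodically over existing data shows where records need to be completed.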
8. Document all information about the data
All information about the data must be summarized in a document.
This document - be it a guideline, an SOP or a procedural instruction - contains all tasks, processes, responsibilities and rules for the data records.
It serves as the company's central reference: after reading it, everyone in the company should understand the meaning and purpose of the data - and the importance of high data quality.
Conclusion: Data quality requires a suitable infrastructure, clearly defined workflows and quality assurance measures
In order to use data in the value creation process, it must achieve and maintain a certain quality. This requires three things:
- A suitable infrastructure. Companies bring together data from very different sources. Only a data system landscape that is as harmonized as possible can cope with this.
- Clearly defined workflows. This includes comprehensible data governance - i.e. rules on who is allowed to work with which data and when - a clear assignment of responsibilities and very well documented information on the data.
- Diverse quality assurance measures. The most important of these are regular employee training, processes to ensure that data sets are complete and regularly updated reports on the status of the available data.