Data Governance: Check and Check Again
We are progressing in our data governance by building out the framework. We are moving data into areas where the business can use it, but how do we know they should use it? More importantly, how do they know that they CAN use it?
So much of what we do relies on trust. The business trusts that we will understand what they need so that we can deliver reliable data. The first time they find an issue, that trust is broken. Part of our framework builds out data validation processes, but how do we put that into practice?
One of the first things we will need to do is check ourselves. Your data governance framework can be used to create natural data quality checkpoints. Is your data pipeline ready to migrate? Did you check it? When your process captured the requirements, did you ask who in the business would validate the work? Your processes are the first place to start checking your data.
Some specific examples of how you can use your data governance framework to create natural data quality checkpoints are:
- Require that all data be validated before it is loaded into a data warehouse. This could be done using data validation rules or by running the data through a data quality tool.
- Require that data be audited on a regular basis. Even long-running, stable data feeds can fall victim to bad data, so you have to check periodically that the data is still correct and complete. This could be done using a data audit tool or by manually reviewing the data.
- Set up alerts to notify you when data quality issues are detected. This will help you to identify and address data quality issues quickly.
- Create a data quality dashboard to track key data quality metrics. This will help you to identify trends in data quality and any areas where improvement is needed. In the long run, this is probably one of the best ways to automate detection of known issues, using a set of standard KPIs.
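To make the first checkpoint concrete, here is a minimal sketch of rule-based validation before a load. It assumes a hypothetical "orders" feed; the field names (order_id, amount, status), the allowed status values, and the validate_rows helper are all made up for illustration and do not come from any particular tool.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
Rule = Tuple[str, Callable[[Row], bool]]

# Hypothetical rules for an illustrative "orders" feed.
# Each rule is a readable name plus a check that returns True when the row passes.
RULES: List[Rule] = [
    ("order_id is present",
     lambda r: r.get("order_id") not in (None, "")),
    ("amount is a non-negative number",
     lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0),
    ("status is a known value",
     lambda r: r.get("status") in {"open", "shipped", "cancelled"}),
]


def validate_rows(
    rows: List[Row], rules: List[Rule] = RULES
) -> Tuple[List[Row], List[Tuple[Row, List[str]]]]:
    """Split rows into those that pass every rule and those that fail any rule."""
    valid: List[Row] = []
    rejected: List[Tuple[Row, List[str]]] = []
    for row in rows:
        # Collect the name of every rule this row fails.
        failures = [name for name, check in rules if not check(row)]
        if failures:
            rejected.append((row, failures))
        else:
            valid.append(row)
    return valid, rejected


if __name__ == "__main__":
    sample = [
        {"order_id": "A1", "amount": 10.0, "status": "open"},
        {"order_id": "", "amount": -5, "status": "lost"},
    ]
    good, bad = validate_rows(sample)
    print(f"{len(good)} row(s) ready to load, {len(bad)} row(s) rejected")
    for row, reasons in bad:
        print(row, "failed:", reasons)
```

Only the rows in the first list would go on to the warehouse. The rejected rows keep the names of the rules they failed, which gives the business validator something specific to review.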
Here are some specific examples of how you can automate data quality validations:
- Use data validation rules to validate data before it is loaded into a data warehouse.
- Use a data quality tool to identify and correct data quality issues on a regular basis.
- Set up automated alerts to notify you when data quality issues are detected.
- Create a data quality dashboard to track key data quality metrics and to identify any areas where automation could be used to improve data quality.
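The alerting and dashboard ideas above can be sketched in a few lines as well. This is only an illustration under stated assumptions: the two KPIs (completeness and uniqueness), the column names, and the threshold values are hypothetical, and a real setup would send the alert messages to email, chat, or a monitoring tool rather than printing them.

```python
from typing import Dict, List

Row = Dict[str, object]


def completeness(rows: List[Row], column: str) -> float:
    """Share of rows where the column is filled in (not None or empty string)."""
    if not rows:
        return 0.0
    filled = sum(1 for r in rows if r.get(column) not in (None, ""))
    return filled / len(rows)


def uniqueness(rows: List[Row], column: str) -> float:
    """Share of filled-in values in the column that are distinct."""
    values = [r.get(column) for r in rows if r.get(column) not in (None, "")]
    if not values:
        return 0.0
    return len(set(values)) / len(values)


def check_thresholds(
    metrics: Dict[str, float], thresholds: Dict[str, float]
) -> List[str]:
    """Return one alert message for each metric that is missing or below its minimum."""
    alerts: List[str] = []
    for name, minimum in thresholds.items():
        value = metrics.get(name)
        if value is None or value < minimum:
            alerts.append(f"{name} is {value}, below the minimum of {minimum}")
    return alerts


if __name__ == "__main__":
    # Hypothetical customer feed used only for this example.
    feed = [
        {"customer_id": 1, "email": "a@example.com"},
        {"customer_id": 2, "email": ""},
        {"customer_id": 2, "email": "b@example.com"},
        {"customer_id": 3, "email": None},
    ]
    kpis = {
        "email_completeness": completeness(feed, "email"),
        "customer_id_uniqueness": uniqueness(feed, "customer_id"),
    }
    # Hypothetical minimums; in practice the business would agree on these.
    limits = {"email_completeness": 0.9, "customer_id_uniqueness": 0.7}
    for message in check_thresholds(kpis, limits):
        print("ALERT:", message)
```

Storing the KPI values from each run gives you the history a dashboard needs to show trends, while the threshold check covers the alerting side.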
There are many different tools out there that you can use to automate these checks. Just a few are:
- Informatica: Informatica offers a range of data quality and data governance solutions, including Informatica PowerCenter, which can be used to automate a variety of data validation tasks.
- Talend: Talend offers a cloud-based data integration and data quality platform that includes a variety of data validation features.
- Trillium Software: Trillium Software offers a range of data quality and data governance solutions, including Trillium Quality Center, which can be used to automate a variety of data validation tasks.
- IBM: IBM offers a range of data quality and data governance solutions, including IBM Information Governance Catalog, which can be used to automate a variety of data validation tasks.
- SAP: SAP offers a range of data quality and data governance solutions, including SAP Information Steward, which can be used to automate a variety of data validation tasks.