What is data governance, and why and how should an organization implement it?
When it comes to data governance or data management, many experts describe data as the “oil of the 21st century”. Just as oil powered industrial growth in the 20th century, data today enables organizations to generate revenue, provided they know how to use it. For most organizations, data creates value, that is, revenue. Understanding which data generates revenue, and how poor data quality affects revenue generation, is a central challenge for data governance.
The question is “How can data governance enable organizations to generate revenue?”
Data governance is a set of procedures that ensure that important data within a company is managed in a structured way and that the data can be trusted. It clearly establishes who is responsible for any negative consequences of low data quality. It acts as a control that ensures that data entered by a member of the operational team or by automated processes meets clear standards, such as business rules, data definitions, and the data integrity constraints in the data model. It enables data consistency, increases data availability and security, improves data processing performance, and makes the data auditable.
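To make the idea of a control on data entry more concrete, here is a minimal sketch in Python. The record fields (customer_id, email, country) and the three rules are hypothetical and chosen only for illustration; a real organization would take its rules from its own data definitions and data model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    """A named business rule: check() returns True when the record complies."""
    name: str
    check: Callable[[dict], bool]


# Hypothetical rules for a hypothetical customer record.
RULES = [
    Rule("customer_id is present", lambda r: bool(r.get("customer_id"))),
    Rule("email contains '@'", lambda r: "@" in str(r.get("email", ""))),
    Rule(
        "country is a two-letter code",
        lambda r: isinstance(r.get("country"), str) and len(r["country"]) == 2,
    ),
]


def validate(record: dict) -> List[str]:
    """Return the names of all rules that the record violates."""
    return [rule.name for rule in RULES if not rule.check(record)]


good = {"customer_id": "C-001", "email": "ana@example.com", "country": "HR"}
bad = {"customer_id": "", "email": "ana.example.com", "country": "Croatia"}

print(validate(good))  # no violations
print(validate(bad))   # all three rules are violated
```

Because every rule has a name, a rejected record can be reported together with the rule it broke, which helps route the problem to whoever is responsible for that data.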
Data governance is essentially the improvement of business processes in the field of strategic data management. Its purpose is to orchestrate people, processes, and technology so that the organization can make better use of its data assets and trust them more. This ensures that data is consistently defined (in a data dictionary) and well understood throughout the organization, which in turn increases the use of data and trust in it. The result is more efficient decision-making at all levels, as well as compliance with the regulatory requirements the organization may face.
How to achieve this?
It is first necessary to determine the organization's current stage of maturity from the perspective of data governance. There are several methodologies for assessing maturity, and they differ in how they define the maturity levels themselves. In general, the higher the degree of adoption of data governance, the higher the return for the organization and the lower the risk. This is illustrated in the following graph.
Once the level of maturity is identified, it is necessary to define the mission and vision of the project and decide on the methodology of data governance implementation. The methodology contains data management principles and processes, roles and responsibilities, organizational structure, data quality requirements and control measures.
The activities that the methodology should include are:
- defining policies, standards, procedures, architecture, metrics
- developing a project plan (roadmap)
- developing information language catalogs as a basis for further activities of implementation and enforcement of data governance procedures
- communicating data throughout the organization
- promoting the importance of data assets
- implementing data management projects
Some of the outcomes of data governance implementation are:
- setting up business participants as information owners – defining roles and responsibilities for a specific data set
- positioning problems related to business data as cross-functional (a problem not only for IT but also for other functions)
- managing data separately from applications: data quality must be ensured before the data is entered into applications.
Basic terms in data governance: dataset, data dictionary, data owner, data steward…
A dataset is a set of data to which data governance is applied. Roles are assigned to each dataset, such as data owner, data steward, and data user. Data stewards are given responsibilities that define who is responsible for which term, category, property, and so on.
A data dictionary is a centralized repository of metadata: information about the meaning, relationships, sources, and usage of data (tables, attributes).
A business glossary explains the meaning of business terms and their relationship to the data dictionary. Each term can be described by a short or long description, or by an attached document, and the terms are classified hierarchically.
An information governance catalog presents the data dictionary and the business glossary in one place. It shows what a particular business term means and which data assets it refers to.
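The relationship between these three concepts can be sketched as a small data structure. This is only an illustration under stated assumptions: the class names (DictionaryEntry, GlossaryTerm, Catalog), their fields, and the sample table and term are all hypothetical and do not correspond to any particular governance product.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DictionaryEntry:
    """Technical metadata for one column (data dictionary)."""
    table: str
    column: str
    data_type: str
    source_system: str


@dataclass
class GlossaryTerm:
    """A business term (business glossary) linked to dictionary entries."""
    name: str
    description: str
    parent: str = ""    # parent term, for hierarchical classification
    steward: str = ""   # data steward responsible for the term
    columns: List[str] = field(default_factory=list)  # "table.column" keys


class Catalog:
    """Holds the data dictionary and the business glossary in one place."""

    def __init__(self) -> None:
        self.dictionary: Dict[str, DictionaryEntry] = {}
        self.glossary: Dict[str, GlossaryTerm] = {}

    def add_entry(self, entry: DictionaryEntry) -> None:
        self.dictionary[f"{entry.table}.{entry.column}"] = entry

    def add_term(self, term: GlossaryTerm) -> None:
        self.glossary[term.name] = term

    def assets_for(self, term_name: str) -> List[DictionaryEntry]:
        """Return the data assets that a business term refers to."""
        term = self.glossary[term_name]
        return [self.dictionary[k] for k in term.columns if k in self.dictionary]


catalog = Catalog()
catalog.add_entry(DictionaryEntry("invoice_line", "net_amount", "DECIMAL(12,2)", "ERP"))
catalog.add_term(
    GlossaryTerm(
        name="Net revenue",
        description="Invoiced amount after discounts, excluding tax.",
        parent="Revenue",
        steward="Finance data steward",
        columns=["invoice_line.net_amount"],
    )
)

for asset in catalog.assets_for("Net revenue"):
    print(asset.source_system, asset.table, asset.column, asset.data_type)
```

The glossary term stores only keys into the dictionary, so the business definition and the technical metadata can be maintained by different people while staying linked.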
Lineage is a graphical representation of the traceability of a term, from a report back to the columns and tables of the source systems that contribute to the value behind the business term.
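Behind the graphical view, lineage can be thought of as a directed graph that is walked from a report back to its sources. The sketch below assumes a hypothetical lineage map in which every asset lists the assets it is derived from; the asset names are invented for illustration.

```python
from collections import deque
from typing import Dict, List

# Hypothetical lineage: each asset maps to the assets it is derived from.
LINEAGE: Dict[str, List[str]] = {
    "report.monthly_revenue": ["dwh.fact_sales.net_amount"],
    "dwh.fact_sales.net_amount": [
        "erp.invoice_line.amount",
        "erp.invoice_line.discount",
    ],
    "erp.invoice_line.amount": [],
    "erp.invoice_line.discount": [],
}


def source_columns(asset: str) -> List[str]:
    """Trace an asset upstream and return the assets with no further upstream.

    An asset that has no recorded upstream is treated as a source itself.
    """
    seen = set()
    sources = set()
    queue = deque([asset])
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited; guards against cycles
        seen.add(current)
        upstream = LINEAGE.get(current, [])
        if not upstream:
            sources.add(current)
        queue.extend(upstream)
    return sorted(sources)


print(source_columns("report.monthly_revenue"))
```

For the sample map, the trace ends at the two ERP invoice columns, which is the kind of answer a lineage view gives when someone asks where a reported figure comes from.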
From all of the above, one might think that data governance is a task for the organization's IT department and that only IT should deal with it. This is simply not true, because the IT department does not create data. Data is created by business users. Data governance should link the IT and business sides of an organization and create a “common vocabulary”, so that time and resources are not wasted on, for example, developing unnecessary products, services, or functionality. Misinterpreted or contradictory data can lead to a misunderstanding of user needs and to decisions that have negative consequences for the organization.