What is data governance, and why and how should an organization implement it?
When talking about data governance or data management, many experts say that “data is the oil of the 21st century”. As oil powered industrial growth in the 20th century, data today enables organizations to generate revenue, provided they know how to use it. For most organizations, data creates value, i.e., revenue. Understanding which data generates revenue, and how poor data quality affects that revenue, is a central challenge for data governance.
The question is, “How can data governance enable organizations to generate revenue?”
Data governance is a set of procedures that ensure that important data within a company is managed in a structured way and that the data can be trusted. It also clearly establishes who is responsible for any negative consequences of low data quality. It acts as a control that ensures that data entered by a member of the operational team or by an automated process meets clear standards, such as business rules, data definitions, and the data integrity constraints in the data model. It enables data consistency, increases data availability and security, improves data processing performance, and enables data revision.
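As a minimal sketch of such a control, the Python snippet below checks a record against a few rules at the point of entry. The rule names, the fields (customer_id, email, country) and the list of allowed country codes are hypothetical and serve only as illustration; a real organization would take them from its own business rules and data definitions.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Rule:
    name: str                      # hypothetical rule identifier
    field: str                     # attribute the rule applies to
    check: Callable[[Any], bool]   # returns True when the value is acceptable


# Hypothetical rules for a customer record (assumptions, not a standard).
RULES = [
    Rule("customer_id_is_positive_int", "customer_id",
         lambda v: isinstance(v, int) and v > 0),
    Rule("email_contains_at_sign", "email",
         lambda v: isinstance(v, str) and "@" in v),
    Rule("country_in_reference_list", "country",
         lambda v: v in {"HR", "DE", "AT"}),
]


def validate(record: dict) -> list:
    """Return the names of all rules that the record violates."""
    return [rule.name for rule in RULES
            if not rule.check(record.get(rule.field))]
```

A record would be accepted only when validate returns an empty list; otherwise the returned rule names tell the person or process entering the data exactly which standard was not met.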
Data governance is essentially the improvement of business processes in the field of strategic data management. Its purpose is to orchestrate people, processes, and technology to enable the organization to make better use of data assets and increase trust in them. This ensures that data is consistently defined (in a data dictionary) and well understood throughout the organization, which increases both the use of data and trust in it. The result is more efficient decision-making at all levels, as well as compliance with the regulatory requirements placed on the organization.
How to achieve this?
It is first necessary to determine the organization's stage of maturity from the perspective of data governance. There are several methodologies for determining the level of maturity, and they also differ in how they define the levels themselves. In general, the higher the degree of adoption of data governance, the higher the return for the organization and the lower the risk. This is illustrated in the following graph.
Once the level of maturity is identified, it is necessary to define the mission and vision of the project and decide on the methodology of data governance implementation. The methodology contains data management principles and processes, roles and responsibilities, organizational structure, data quality requirements and control measures.
The activities that the methodology should include are:
- defining policies, standards, procedures, architecture, metrics
- development of project plan (roadmap)
- developing a catalog of information assets as a basis for further implementation of data governance procedures
- communicating data throughout the organization
- promoting the importance of data assets
- implementation of data management projects
Some of the outcomes of data governance implementation are:
- setting up business participants as information owners – defining roles and responsibilities for a specific data set
- positioning problems related to business data as multifunctional (not only a problem of IT but also other functions)
- data is managed separately from applications – data quality must be ensured before the data is entered into applications
Basic terms in data governance: dataset, data dictionary, data owner, data steward…
A dataset is a set of data over which data governance is to be performed. Roles are assigned to it, such as data owners, data stewards, and data users. Data stewards are given responsibilities that define who is responsible for which term, category, property, and so on.
The data dictionary is a centralized repository of metadata: information about the meaning, relationships, sources, and usage of data (tables, attributes).
The business glossary explains the meaning of business terms and their relationship to the data dictionary. Each term can be described by a short or long description, or by an attached document, and the terms are classified hierarchically.
The information governance catalog presents the data dictionary and business glossary in the same place. It offers an understanding of what a particular business term means and to which data assets it refers.
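To make the relationship between these three concepts more concrete, here is a minimal Python sketch of a catalog that holds data dictionary entries and glossary terms side by side. The class names, fields and sample values (the orders table, the Revenue term) are assumptions chosen for illustration and do not describe any particular product.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class DictionaryEntry:
    """Technical metadata for one column (data dictionary)."""
    table: str
    column: str
    description: str
    source_system: str


@dataclass
class GlossaryTerm:
    """Business meaning of a term (business glossary)."""
    name: str
    description: str
    parent: Optional[str] = None                  # parent term in the hierarchy
    columns: list = field(default_factory=list)   # (table, column) keys


class Catalog:
    """Presents dictionary entries and glossary terms in one place."""

    def __init__(self):
        self.entries = {}
        self.terms = {}

    def add_entry(self, entry):
        self.entries[(entry.table, entry.column)] = entry

    def add_term(self, term):
        self.terms[term.name] = term

    def assets_for(self, term_name):
        """Data dictionary entries that a business term refers to."""
        term = self.terms[term_name]
        return [self.entries[key] for key in term.columns if key in self.entries]

    def hierarchy(self, term_name):
        """Path from the top-level term down to the given term."""
        path = []
        current = term_name
        while current is not None:
            path.append(current)
            current = self.terms[current].parent
        return list(reversed(path))


# Hypothetical sample content.
catalog = Catalog()
catalog.add_entry(DictionaryEntry("orders", "amount", "Order value in EUR", "crm"))
catalog.add_term(GlossaryTerm("Finance", "All finance-related terms"))
catalog.add_term(GlossaryTerm("Revenue", "Income from sales",
                              parent="Finance", columns=[("orders", "amount")]))
```

With this sample content, catalog.assets_for("Revenue") returns the entry for the amount column of the orders table, and catalog.hierarchy("Revenue") returns the path from Finance down to Revenue. The sketch assumes that every parent term exists and that the hierarchy contains no cycles.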
Lineage is a graphical representation of the traceability of a term, from a report back to the columns and tables of the source system involved in creating the value behind the business term.
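Behind the graphical view, lineage can be thought of as a directed graph. The following Python sketch assumes a hypothetical graph in which every node (a report field or a column) maps to the nodes it is derived from, and walks it back to the source columns. All node names are invented for the example.

```python
# Hypothetical lineage graph: each node maps to the nodes it is derived from.
UPSTREAM = {
    "report.monthly_revenue": ["dwh.fact_sales.net_amount"],
    "dwh.fact_sales.net_amount": ["crm.orders.amount", "crm.orders.discount"],
    "crm.orders.amount": [],
    "crm.orders.discount": [],
}


def trace_to_sources(node, graph=UPSTREAM):
    """Return the sorted source columns that feed the given node."""
    sources = set()
    visited = set()
    stack = [node]
    while stack:
        current = stack.pop()
        if current in visited:
            continue  # already handled; also protects against cycles
        visited.add(current)
        parents = graph.get(current, [])
        if not parents:
            sources.add(current)  # nothing upstream, so this is a source
        stack.extend(parents)
    return sorted(sources)
```

For the sample graph, trace_to_sources("report.monthly_revenue") returns the two crm.orders columns, which is the same information a lineage diagram conveys visually.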
From all the above, one might think that data governance is the task of the IT department in the organization and that only IT should deal with it. This is simply not true because IT clothes