Is your Data Governance Ready to Handle the Risks Posed by Artificial Intelligence?

Data governance means organizing data at the enterprise level. Artificial Intelligence, at its core, involves generating insights from data. Machine Learning algorithms match present situations to past ones and apply time-tested solutions. As the algorithm processes more data and its decisions are validated, the output becomes more accurate.

There is a direct relationship between the quality and quantity of data and the accuracy or potency of output.

Poor-quality data causes the algorithm to make faulty conclusions, leading to unintended results. The benefits gained from introducing AI to the process fritter away. Worse, the AI project may turn counterproductive and harm the enterprise.

Consider a case of a hospital applying AI to MRI scans. Poor data quality may cause the algorithm to produce incorrect diagnoses. Doctors may end up recommending wrong or unnecessary medication to patients.

Even data sets that are accurate can be problematic when they encode real-world biases. For instance, an AI model used to predict future criminal behaviour could develop a bias against ethnic minorities.

Technology does not have a solution to such issues. Higher processing power may train algorithms faster, but does not change the quality of the results. In fact, it may speed up the application of wrong decisions.

The solution rather lies in data governance.

1. Get the basics right

Data governance means organizing data at the enterprise level. There is no one best way to go about the task. The best framework depends on what the enterprise plans to do with the data. Nevertheless, all data governance frameworks address the basics, such as:

  • Defining ownership of data.
  • Rendering clarity on the data sources and methods of data collection.
  • Classifying data on importance and sensitivity, with access controls.
  • Laying down protocols for data storage and transmission, including compliance with regulations such as the European Union’s General Data Protection Regulation (EU GDPR) and the California Consumer Privacy Act (CCPA).
  • Establishing policies and processes related to data handling and processing.
  • Cleansing the data. A comprehensive data governance framework assesses source data for reliability, accuracy, and quality, and ensures both structured and unstructured data are checked against these parameters. (A minimal catalog-entry sketch covering these basics follows the list.)
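As an illustration of how these basics fit together, the sketch below models a single catalog entry in Python. The field names, sensitivity tiers, and example values are assumptions made for illustration, not a prescribed schema.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Sensitivity(Enum):
    """Illustrative sensitivity tiers; real tiers depend on the enterprise."""
    PUBLIC = "public"
    INTERNAL = "internal"
    CONFIDENTIAL = "confidential"
    RESTRICTED = "restricted"   # e.g. data covered by GDPR / CCPA


@dataclass
class DataAsset:
    """One catalog entry covering ownership, lineage, and access control."""
    name: str                       # e.g. "patient_mri_scans"
    owner: str                      # accountable business owner, not just IT
    source: str                     # where the data comes from
    collection_method: str          # how it was collected (survey, sensor, API, ...)
    sensitivity: Sensitivity
    allowed_roles: List[str] = field(default_factory=list)  # access control list
    retention_days: int = 365       # storage / retention protocol


def can_access(asset: DataAsset, role: str) -> bool:
    """Simple access check derived from the catalog entry."""
    return role in asset.allowed_roles


# Usage example (hypothetical values)
mri_scans = DataAsset(
    name="patient_mri_scans",
    owner="radiology_department",
    source="hospital_pacs_system",
    collection_method="clinical imaging workflow",
    sensitivity=Sensitivity.RESTRICTED,
    allowed_roles=["radiologist", "ml_engineer_deidentified"],
)

print(can_access(mri_scans, "marketing_analyst"))  # False
```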

2. Ensure transparency

Transparency is always a virtue. As the enterprise grows organically, it draws in data from several sources. Many such sources come up through ad hoc means and remain outside the direct control of enterprise IT. Functional departments may silo their data, citing confidentiality. Unless enterprise IT has complete visibility of data and data sources, it cannot apply governance policies. Transparency becomes more critical for AI, where outputs and decisions depend on data.

  • Make every process visible and traceable. Increased visibility and traceability enable robust review and vetting of results and reduce the risk of unintended effects.
  • Establish data pipelines. A good data pipeline buffers between the original data sources and the point of consumption, and feeds the Machine Learning algorithms. AI success depends on data pipelines feeding the algorithms with relevant data. If the pipelines feed wrong or irrelevant data, the system generates insights that drift away from the objectives.
  • Have systems in place to filter data entering the pipeline. Ask the following questions when connecting a data source to the pipeline (a minimal filter along these lines is sketched after this list).
    • Where is the data coming from?
    • What does the data represent?
    • Is there anything to change before feeding the data into the algorithm?
  • Create audit trails for algorithms. Pay special attention to the underlying variables and the selection process of such variables.
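A minimal sketch of such a filter, assuming records arrive as Python dictionaries and that the three questions are reduced to simple checks, might look as follows. The approved-source list, required fields, and audit-log format are all illustrative assumptions.

```python
import json
import logging
from datetime import datetime, timezone
from typing import Iterable, Iterator

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("pipeline_audit")

APPROVED_SOURCES = {"crm_export", "sensor_feed"}       # assumed source whitelist
REQUIRED_FIELDS = {"source", "customer_id", "value"}   # assumed minimal schema


def filter_records(records: Iterable[dict]) -> Iterator[dict]:
    """Admit only records from approved sources with the expected fields,
    and write an audit-trail entry for every decision."""
    for record in records:
        known_source = record.get("source") in APPROVED_SOURCES
        complete = REQUIRED_FIELDS.issubset(record)
        accepted = known_source and complete

        # Audit trail: what came in, from where, and whether it was admitted.
        logger.info(json.dumps({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "source": record.get("source", "unknown"),
            "accepted": accepted,
        }))

        if accepted:
            # Any transformation before feeding the algorithm goes here,
            # e.g. normalizing units or dropping free-text fields.
            yield record


# Usage example with two hypothetical records
incoming = [
    {"source": "crm_export", "customer_id": 42, "value": 3.1},
    {"source": "shadow_spreadsheet", "customer_id": 7, "value": 9.9},
]
clean = list(filter_records(incoming))  # only the first record passes
```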

3. Improve data quality

A good framework defines data quality requirements and establishes metrics to monitor quality.

To ensure the desired data quality:

  • Set up a data governing body to define the policies and methodology related to data. The governing body develops consistent principles for data services and data collection mechanisms.
  • Address data-related issues and problems without delay. Any delay risks faulty outcomes.
  • Establish monitoring protocols. After AI models are deployed, monitor their output against the defined quality metrics. A feedback loop maintains the quality of the insights over time and ensures biases do not creep into the model (a minimal monitoring sketch follows this list).
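The sketch below illustrates one possible monitoring protocol: compare accuracy on recently validated decisions against a baseline and alert when it drops past a threshold. The metric, the threshold, and the alerting mechanism are assumptions; in practice they would come from the governing body's policies.

```python
from statistics import mean
from typing import Sequence


def monitor_accuracy(
    recent_correct: Sequence[bool],
    baseline_accuracy: float,
    max_drop: float = 0.05,
) -> bool:
    """Return True (and alert) if accuracy on recent, validated decisions
    has dropped more than `max_drop` below the baseline."""
    if not recent_correct:
        return False
    current = mean(1.0 if ok else 0.0 for ok in recent_correct)
    drifted = (baseline_accuracy - current) > max_drop
    if drifted:
        # In a real deployment this would notify the data governing body
        # and trigger retraining on fresh data.
        print(f"ALERT: accuracy fell from {baseline_accuracy:.2f} to {current:.2f}")
    return drifted


# Usage example: feedback from the last 200 reviewed decisions (hypothetical)
feedback = [True] * 170 + [False] * 30               # 85% judged correct
monitor_accuracy(feedback, baseline_accuracy=0.93)   # prints an alert
```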

4. Consider the ethical implications

Traditional data governance is concerned mostly with the technical integrity of the data. With Artificial Intelligence, the implications of the data also matter. As enterprises rely on algorithms, data replaces the human brain as the arbiter of fairness. As such, data governance for AI requires a new moral dimension. Data governance becomes the filter that ensures fair, reliable, and responsible outcomes.

There is no gold standard for “ethical data governance.” Ethics relates to the enterprise and the cultural values around which the enterprise operates. But as basic rules of thumb, data governance standards should ensure the algorithm:

  • Treats everyone fairly.
  • Promotes workforce, customer, and partner safety.
  • Respects user privacy and protects data.
  • Ensures accountability for output decisions.

A good example of data governance in action is Deloitte’s Trustworthy AI framework. The framework lists six pillars to sustain the trust of customers and employees when applying analytics for decision-making.
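One way to make the first rule, treating everyone fairly, measurable is a periodic demographic-parity check on model outputs. The sketch below is illustrative only; the groups, the sample data, and the acceptable gap are assumptions that an enterprise's own ethics policy would define.

```python
from collections import defaultdict
from typing import Iterable, Tuple


def positive_rate_by_group(decisions: Iterable[Tuple[str, bool]]) -> dict:
    """decisions: (group_label, positive_outcome) pairs."""
    totals: dict = defaultdict(lambda: [0, 0])  # group -> [positives, count]
    for group, positive in decisions:
        totals[group][0] += int(positive)
        totals[group][1] += 1
    return {group: pos / n for group, (pos, n) in totals.items()}


def parity_gap(decisions: Iterable[Tuple[str, bool]]) -> float:
    """Largest difference in positive-outcome rate between any two groups."""
    rates = positive_rate_by_group(decisions)
    return max(rates.values()) - min(rates.values())


# Usage example with hypothetical decisions for two groups
sample = [("group_a", True)] * 80 + [("group_a", False)] * 20 \
       + [("group_b", True)] * 55 + [("group_b", False)] * 45
gap = parity_gap(sample)          # 0.25
if gap > 0.1:                     # acceptable gap is an assumed policy threshold
    print(f"Fairness review needed: parity gap = {gap:.2f}")
```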

5. Ensure dynamism and flexibility

Data governance for artificial intelligence systems requires a different approach from conventional data governance. Addressing data governance in an AI-driven environment needs new, purpose-built frameworks.

  • Reconcile stringency with flexibility. AI results in an increased flow of data, increasing the risks. Such a situation calls for more stringent policies to manage the data. But the system also needs the flexibility to support fast-paced changes.
  • Focus on holistic governance. The traditional piecemeal approach toward data governance will not work for AI. AI tools deal with voluminous data sets that integrate different data streams. The focus shifts from governing individual data elements to governing the entire data pipeline.
  • Institute dynamic governance. Effective data governance is never a one-off exercise. AI makes governance more dynamic than ever before. Effective governance requires periodic evaluation of outcomes. Retraining models for course-correction would need new training datasets. The data governance framework would also have to change to accommodate the changing requirements.
  • Prioritize interoperability. The business environment has become highly fragmented, and most businesses function through an ecosystem of partners and outsourced providers. Interoperability becomes an essential requirement in such a setting. Data governance needs to adopt protocols that manage the shared space among partners; for instance, it needs to clarify access rights (a minimal access-rights sketch follows this list).
  • Ensure consistency across geographies. Enterprises today collaborate with partners spread out across the world. Data governance policies must reflect regulations across geographies, languages, and time zones.
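As an example of clarifying access rights in a shared partner ecosystem, the sketch below encodes which partner may perform which operation on which dataset. The partner names, datasets, and operations are hypothetical.

```python
from typing import Dict, Set

# Hypothetical access-rights matrix for a partner ecosystem:
# partner -> dataset -> permitted operations
ACCESS_POLICY: Dict[str, Dict[str, Set[str]]] = {
    "logistics_partner": {"shipment_events": {"read", "append"}},
    "analytics_vendor": {
        "shipment_events": {"read"},
        "customer_profiles": set(),   # explicitly no access
    },
}


def is_permitted(partner: str, dataset: str, operation: str) -> bool:
    """Check a partner's request against the shared-space policy."""
    return operation in ACCESS_POLICY.get(partner, {}).get(dataset, set())


# Usage examples
print(is_permitted("analytics_vendor", "shipment_events", "read"))    # True
print(is_permitted("analytics_vendor", "customer_profiles", "read"))  # False
```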

Applied the right way, data governance for AI streamlines processes, enhances decision-making, and enables powerful new products. Good data governance practices accelerate data science initiatives. Poor data governance increases the cost of fixing errors down the line.

Find out how to design a data governance framework that delivers value.
