As adoption of machine learning grows, companies must become data experts, or risk results that are inaccurate, unfair or even dangerous. Here’s how to combat bias.
copyright by searchenterpriseai.techtarget.com
As companies step up the use of machine learning-enabled systems in their day-to-day operations, they become increasingly reliant on those systems to help them make critical business decisions. In some cases, the systems operate autonomously, making it especially important that the automated decision-making works as intended.
However, machine learning-based systems are only as good as the data that’s used to train them. If there are inherent biases in the data used to feed a machine learning algorithm, the result could be systems that are untrustworthy and potentially harmful.
In this article, you’ll learn why bias in AI systems is a cause for concern, how to identify different types of biases and six effective methods for reducing bias in machine learning.
Why is eliminating bias important?
The power of machine learning comes from its ability to learn from data and apply that learning to new data the systems have never seen before. However, one of the challenges data scientists face is ensuring that the data fed into machine learning algorithms is not only clean, accurate and, in the case of supervised learning, well labeled, but also free of inherent bias that can skew results.
Supervised learning, one of the core approaches to machine learning, depends especially heavily on the quality of the training data. So it should be no surprise that when biased training data is used to teach these systems, the results are biased systems. Biased systems that are put into production can cause problems, especially when used in automated decision-making, autonomous operation or facial recognition software that makes predictions or renders judgment on individuals.
Some notable examples of the bad outcomes caused by algorithmic bias include: a Google system that misidentified images of minorities in an offensive way; automated credit applications from Goldman Sachs that have sparked an investigation into gender bias; and a racially biased program used to sentence criminals. Enterprises must be hyper-vigilant about bias: Any value delivered by AI and machine learning systems in terms of efficiency or productivity will be wiped out if the algorithms discriminate against individuals and subsets of the population.
However, bias is not only limited to discrimination against individuals. Biased data sets can jeopardize business processes when applied to objects and data of all types. For example, take a model that was trained to recognize wedding dresses. If the model was trained using Western data, then wedding dresses would be categorized primarily by identifying shades of white. This model would fail in non-Western countries where colorful wedding dresses are more commonly accepted. Errors also abound where data sets have bias in terms of the time of day when data was collected, the condition of the data and other factors.
All of the examples described above represent some sort of bias that was introduced by humans as part of their data selection and identification methods for training the model. Because the systems technologists build are necessarily colored by their own experiences, they must be very aware that their individual biases can jeopardize the quality of the training data. Individual bias, in turn, can easily become a systemic bias as bad predictions and unfair outcomes are automated.
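The wedding dress example suggests a simple first check: count how each group or collection condition is represented in the training data before any model is trained. The sketch below is one minimal way to do that in Python. The `regions` metadata, the function names and the 10% threshold are all illustrative assumptions, not something prescribed by this article.

```python
from collections import Counter


def group_shares(labels):
    """Return each group's share of the data set as a fraction of all rows."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}


def underrepresented(labels, min_share=0.10):
    """List groups whose share falls below min_share (an arbitrary cutoff)."""
    shares = group_shares(labels)
    return sorted(group for group, share in shares.items() if share < min_share)


# Hypothetical metadata: the region each training image was collected from.
regions = ["western"] * 90 + ["south_asian"] * 6 + ["east_asian"] * 4

print(group_shares(regions))
print(underrepresented(regions))
```

The same count can be run on any attribute recorded with the data, such as the time of day an image was captured. A skewed count does not prove the resulting model will be biased, but it flags where to look.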
How to identify and measure bias
Part of the challenge of identifying bias is due to the difficulty of seeing how some machine learning algorithms generalize their learning from the training data. In particular, deep learning algorithms have proven to be remarkably powerful in their capabilities. This approach to neural networks leverages large quantities of data, high performance compute power and a sophisticated approach to efficiency, resulting in models with profound abilities.
Deep learning, however, is a “black box.” It’s not clear how the neural network arrived at an individual decision. You can’t simply query the system and determine with precision which inputs resulted in which outputs. This makes it hard to spot and eliminate potential biases when they arise in the results. Researchers are increasingly turning their focus to adding explainability to neural networks. Verification is the process of proving the properties of neural networks. However, because of the size of neural networks, it can be hard to check them for bias.[…]
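Even when a model’s internals can’t be inspected, its outputs can still be compared across groups. The sketch below shows one common outcome-based measure: the rate of positive predictions per group, and the ratio of the lowest rate to the highest. The group labels, predictions and function names are hypothetical, and this is only one of several possible measures, not a method taken from this article.

```python
from collections import defaultdict


def selection_rates(groups, predictions):
    """Fraction of positive (1) predictions for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, pred in zip(groups, predictions):
        totals[group] += 1
        positives[group] += pred
    return {group: positives[group] / totals[group] for group in totals}


def disparate_impact_ratio(rates):
    """Lowest group selection rate divided by the highest."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # no group receives positive outcomes, so the rates are equal
    return min(rates.values()) / highest


# Hypothetical model outputs: 1 = approved, 0 = denied.
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
predictions = [1, 1, 1, 0, 1, 0, 0, 0]

rates = selection_rates(groups, predictions)
print(rates)
print(disparate_impact_ratio(rates))
```

A ratio near 1 means the groups receive positive outcomes at similar rates. A ratio well below 1 does not explain why the model behaves that way, but it signals that the results deserve a closer look.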
Read more: searchenterpriseai.techtarget.com