Knowledge of the processes taking place at each step of a typical machine learning deployment is crucial for transparency in its applications.
Organizations in a growing range of industries rely on machine learning solutions, and in turn their underlying algorithms, to carry out extensive analysis. Because these solutions can resolve business issues and work in tandem with human knowledge, uptake of machine learning technology has been high. But in-built and untraceable bias in the algorithms used in some applications has brought AI and ML under scrutiny. Regulation is imminent: Gartner predicts that by 2023 over half of G7 nations will have associations to supervise AI and ML development and deployment.
Recent headlines, such as the Amazon AI-driven recruitment program that demonstrated bias against female applicants, have made AI and machine learning (ML) ethics a priority issue. Algorithms used in these deployments must be humble and honest to tackle bias concerns. AI and ML tools must therefore be developed with some key features to ensure the future of the technology is bias free and holds users accountable for the decisions made. Users must always be able to understand how an algorithm has come to a decision, and validate or adjust it accordingly. For this to happen, visibility into the process is essential, putting responsibility back in users’ hands.
Important to provide step-by-step guidance through model selection
With so many model types, choosing and applying the best model for an analysis can be difficult. Deep neural network models, for example, are inherently less transparent than probabilistic methods, which typically operate in a more ‘honest’ and transparent manner.
Many machine learning tools lack features that actively involve users: they are fully automated, with no opportunity to review and select the most appropriate model. Although this lets users prepare data and deploy a machine learning model rapidly, it leaves little to no opportunity for visual inspection to identify data and model issues.
To perform effectively, an ML solution needs to help identify and advise on resolving possible bias in a model at every stage. That starts with data preparation and continues through model creation, where the solution should visualize what the chosen model is doing and provide accuracy metrics. It ends with deployment, where the solution should evaluate model certainty and alert users when a model requires retraining.
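As a rough illustration of the deployment-stage check, the sketch below tracks the confidence of recent predictions and flags when their average falls below a cut-off. The class name `CertaintyMonitor`, the window size and the 0.7 threshold are assumptions made for this example, not features of any particular platform.

```python
from collections import deque
from statistics import mean


class CertaintyMonitor:
    """Hypothetical helper: flags when a deployed model may need retraining.

    The window size and threshold are assumed values for illustration.
    """

    def __init__(self, window=100, threshold=0.7):
        self.threshold = threshold
        # A bounded deque keeps only the most recent `window` confidences.
        self.recent = deque(maxlen=window)

    def record(self, confidence):
        """Store the model's confidence (0.0 to 1.0) for one prediction."""
        if not 0.0 <= confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")
        self.recent.append(confidence)

    def needs_retraining(self):
        """Return True once the window is full and mean confidence is too low."""
        if len(self.recent) < self.recent.maxlen:
            return False  # not enough predictions observed yet
        return mean(self.recent) < self.threshold
```

A real platform would probably combine a signal like this with accuracy on freshly labelled data, since a model can be confidently wrong.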
Testing features shift power back to users
To further increase visibility during data preparation and model deployment, we should look towards ML platforms with built-in testing features, where users can test the model on a new data set and receive scores for its performance. This helps users identify bias and make changes to the model accordingly.
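One way such a test could surface bias is to break the score down by group instead of reporting a single overall figure. The sketch below does this for accuracy. The function name `accuracy_by_group` and its inputs are hypothetical, chosen for illustration only.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: accuracy} for predictions on a held-out test set.

    y_true, y_pred and groups are parallel sequences: the true label,
    the predicted label and the group membership of each test record.
    """
    if not (len(y_true) == len(y_pred) == len(groups)):
        raise ValueError("inputs must have the same length")

    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        if truth == pred:
            correct[group] += 1

    # Every group in `total` has at least one record, so no division by zero.
    return {group: correct[group] / total[group] for group in total}
```

A large gap between groups does not prove the model is biased, but it tells the user where to look before deciding whether to adjust the data or the model.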
When a selected model is rolled out, the most effective platforms will also extract extra features from data that are otherwise difficult to identify and help the user understand what is going on with the data at a granular level, beyond the most obvious insights.
The overall objective here is to put power directly into the hands of the users, enabling them to actively explore, visualize and manipulate data at each step, rather than simply delegating to an ML tool and risking the introduction of bias.
Control your bias
Problems such as bias can be introduced into the machine learning process as early as the initial data upload and review stages. There are hundreds of parameters to take into consideration during data preparation, so it can often be difficult to strike a balance between removing bias and retaining useful data.
One parameter that can clearly introduce bias is gender. Gender might be a useful parameter when looking to identify specific disease risks or health threats, but using it in many other scenarios is completely unacceptable if it risks introducing bias and, in turn, discrimination. Machine learning models will inevitably exploit any parameters, such as gender, in the datasets they have access to, so it is vital for users to understand the steps a model takes to reach a specific conclusion.
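A simple first check on whether a model's decisions differ by a sensitive parameter such as gender is to compare the share of positive decisions each group receives. The sketch below is a minimal version of that idea. The function names are hypothetical, and the ratio of lowest to highest rate is only one of several possible measures.

```python
def selection_rates(decisions, attribute):
    """Return {attribute value: share of positive decisions}.

    decisions: sequence of truthy/falsy model outcomes (e.g. 1 = shortlisted).
    attribute: parallel sequence holding the sensitive value for each record.
    """
    counts = {}
    for decision, value in zip(decisions, attribute):
        positives, total = counts.get(value, (0, 0))
        counts[value] = (positives + int(bool(decision)), total + 1)
    return {value: p / t for value, (p, t) in counts.items()}


def disparate_impact_ratio(rates):
    """Ratio of the lowest to the highest selection rate (1.0 means parity)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was selected in any group, so the rates are equal
    return min(rates.values()) / highest
```

What counts as an acceptable ratio depends on the application and the applicable regulation, so any alert threshold built on this would be an assumption for the user to set.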
Behind the scenes of machine learning
Taking the complexity out of the data science procedure will help users discover and address bias faster – and better understand the expected accuracy and outcomes of using a particular model.
Machine learning tools with built-in explainability allow users to demonstrate the reasoning behind applying ML to tackle a specific problem, and ultimately justify the outcome. First steps towards this explainability would be features in the ML tool to enable the visual inspection of data – with the platform alerting users to potential bias during preparation – and metrics on model accuracy and health, including the ability to visualize what the model is doing.
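An alert during preparation could be as simple as warning when a group makes up only a small share of the uploaded data. The sketch below shows one such check. The function name `underrepresented_groups` and the 20% default cut-off are assumptions for illustration.

```python
from collections import Counter


def underrepresented_groups(values, min_share=0.2):
    """Return {group: share} for groups whose share of rows is below min_share.

    values: one entry per row, e.g. the contents of a single column.
    min_share: assumed cut-off; a sensible value depends on the data set.
    """
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return {}  # empty column, nothing to flag
    return {
        group: count / total
        for group, count in counts.items()
        if count / total < min_share
    }
```

A flagged group is a prompt for the user to inspect the data, not an automatic verdict that the data set is unusable.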
ML platforms can take transparency further by introducing full user visibility, tracking each step through a consistent audit trail. This records how and when data sets have been imported, prepared and manipulated during the data science process. It also helps ensure compliance with national and industry regulations—such as the European Union’s GDPR “right to explanation” clause—and helps effectively demonstrate transparency to consumers. […]
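To make the idea of an audit trail concrete, the sketch below records each data-handling step with a timestamp and chains the entries together with hashes, so that later edits to the log can be detected. The class `AuditTrail`, its field names and the hash chaining are illustrative assumptions, not a description of how any specific platform stores its records.

```python
import hashlib
import json
from datetime import datetime, timezone


class AuditTrail:
    """Hypothetical append-only log of data science steps."""

    def __init__(self):
        self._entries = []

    @staticmethod
    def _digest(entry):
        # Hash every field except the stored hash itself.
        payload = {k: v for k, v in entry.items() if k != "hash"}
        encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
        return hashlib.sha256(encoded).hexdigest()

    def record(self, action, dataset, details=None):
        """Append one step; `details` must be JSON-serializable."""
        entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "action": action,
            "dataset": dataset,
            "details": details or {},
            # Link to the previous entry so reordering or removal is visible.
            "prev_hash": self._entries[-1]["hash"] if self._entries else None,
        }
        entry["hash"] = self._digest(entry)
        self._entries.append(entry)
        return entry

    def verify(self):
        """Return True if no entry has been altered, removed or reordered."""
        prev = None
        for entry in self._entries:
            if entry["prev_hash"] != prev or entry["hash"] != self._digest(entry):
                return False
            prev = entry["hash"]
        return True

    def entries(self):
        """Return shallow copies of the entries, oldest first."""
        return [dict(entry) for entry in self._entries]
```

For example, calling `trail.record("import", "applicants.csv", {"rows": 1000})` followed by `trail.record("drop_column", "applicants.csv", {"column": "gender"})` would leave two linked entries showing how and when the data was imported and changed. The file name and details here are made up.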