Scientists have identified human biases in datasets used to train machine learning models for computer-aided syntheses.
They found that models trained on a small randomised sample of reactions outperformed those trained on larger human-selected datasets. The findings highlight the importance of including experiments that people might consider unimportant when developing computer programs for chemists.
Machine learning models are a valuable tool in chemical synthesis, but they are trained on data from the literature, where positive results are favoured. The dark reactions – the experiments that were tried but didn’t work – are usually left out. ‘Including these failures is essential for generating predictive machine learning models,’ says Joshua Schrier of Fordham University, US. He was part of a team that studied hydrothermal syntheses of amine-templated metal oxides and found that people’s choices of reaction parameters had introduced biases into the literature.
‘We considered extra dark reactions – a class of reactions that humans don’t even attempt, not because of scientific or practical reasons, but simply because it’s humans who make the decisions,’ Schrier says. ‘We found that chemists tend to be stuck in a rut when planning new experiments, and this gets reinforced by social cues. There’s a tendency to follow the crowd, as defined by precedent in the literature.’ This results in systematic overrepresentation of some reagents and reaction conditions in experimental datasets, he says. ‘We found evidence for this both in crystallographic databases and in our collection of digitised dark reactions from laboratory notebooks.’
The researchers evaluated over 5000 amine-templated metal oxide structures deposited in the Cambridge Structural Database and found that 17% of the known amine reactants (70 ‘popular’ molecules) occur in 79% of the reported structures, while the remaining 83% (345 ‘unpopular’ molecules) are present in just 21% of the structures. They also analysed unpublished experimental records for hydrothermal vanadium borate reactions from their Dark Reactions Project and found similar biases in the pH and amine quantities used.
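The arithmetic behind these percentages can be checked with a few lines of Python. This is only an illustrative sketch built from the figures quoted above, not the researchers' own analysis. The variable names are invented for the example, and the over-representation ratio (share of structures divided by share of amines) is a measure introduced here for illustration, not one reported in the study.

```python
# Counts of amines quoted in the article (Cambridge Structural Database survey).
popular_amines = 70
unpopular_amines = 345
total_amines = popular_amines + unpopular_amines

# Share of the known amine reactants that falls in each group.
popular_share_of_amines = popular_amines / total_amines
unpopular_share_of_amines = unpopular_amines / total_amines

# Share of reported structures containing each group, as quoted in the article.
popular_share_of_structures = 0.79
unpopular_share_of_structures = 0.21

# Assumed illustrative measure: share of structures relative to share of amines.
# A value above 1 means the group is over-represented; below 1, under-represented.
popular_ratio = popular_share_of_structures / popular_share_of_amines
unpopular_ratio = unpopular_share_of_structures / unpopular_share_of_amines

print(f"Popular amines: {popular_share_of_amines:.0%} of amines, "
      f"over-representation ratio {popular_ratio:.2f}")
print(f"Unpopular amines: {unpopular_share_of_amines:.0%} of amines, "
      f"over-representation ratio {unpopular_ratio:.2f}")
```

On these figures, the 70 popular amines account for about 17% of the known reactants but appear in structures several times more often than an even spread would predict, while the 345 unpopular amines appear far less often than their numbers would suggest.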
‘We removed this bias by intentionally rejecting the standard approach to these exploratory reactions,’ says Alexander Norquist of Haverford College, US, who was also involved in the study. He points out that there was no difference in the reaction performance when the ‘unpopular’ amines were used. ‘We created two machine learning models. One used the biased data and the other used random experiments. The model from the random experiments was stronger and better. In a laboratory test with unseen reagents, it was able to predict new reactions more successfully and discover new compounds that would be totally missed by a model trained on the anthropogenic biased data.[…]