Machine-learning systems could help flag hateful, threatening or offensive language
Copyright by www.scientificamerican.com
Social platforms large and small are struggling to keep their communities safe from hate speech, extremist content, harassment and misinformation. Most recently, far-right agitators posted openly about plans to storm the U.S. Capitol before doing just that on January 6. One solution might be AI: developing algorithms to detect and alert us to toxic and inflammatory comments and flag them for removal. But such systems face big challenges.
The prevalence of hateful or offensive language online has been growing rapidly in recent years, and the problem is now rampant. In some cases, toxic comments online have even resulted in real life violence, from religious nationalism in Myanmar to neo-Nazi propaganda in the U.S. Social media platforms, relying on thousands of human reviewers, are struggling to moderate the ever-increasing volume of harmful content. In 2019, it was reported that Facebook moderators are at risk of suffering from PTSD as a result of repeated exposure to such distressing content. Outsourcing this work to machine learning can help manage the rising volumes of harmful content, while limiting human exposure to it. Indeed, many tech giants have been incorporating algorithms into their content moderation for years.
One such example is Google’s Jigsaw, a company focusing on making the internet safer. In 2017, it helped create Conversation AI, a collaborative research project aiming to detect toxic comments online. However, a tool produced by that project, called Perspective, faced substantial criticism. One common complaint was that it created a general “toxicity score” that wasn’t flexible enough to serve the varying needs of different platforms. Some Web sites, for instance, might require detection of threats but not profanity, while others might have the opposite requirements.
Read more: www.scientificamerican.com