How Platforms Rate on Hate: Measuring Antisemitism and Adequacy of Enforcement Across Reddit and Twitter
How much hate exists online, and how large is its reach? Is it possible to independently evaluate the claims tech companies make about the amount of hate on their platforms and how effectively they are addressing it? To answer these questions, ADL Center for Technology and Society (CTS) is building the Online Hate Index (OHI), a machine learning system that detects hate targeting marginalized groups on online platforms. This report presents the inaugural findings of the OHI’s antisemitism classifier, a new artificial intelligence tool that joins the rich knowledge of ADL’s antisemitism experts with that of trained volunteers from the Jewish community who have experienced antisemitism.
We used this antisemitism classifier and our human reviewers to filter and analyze representative samples of English-language posts over the week of August 18-25, 2021, across both Twitter and Reddit. We should make clear that this analysis is not an indictment of Reddit and Twitter. In fact, we could only do this analysis because of these companies’ commitments to transparency and data-sharing with third parties, a lead we call on other platforms to follow. For example, this analysis would not be possible on Facebook, the world’s largest social media platform. While Reddit and Twitter have far more to do, they have both made substantial recent strides in addressing antisemitism and hate online. In this light, we offer our recommendations to help them better address these broader societal problems of online—and offline—hate and antisemitism.
Extrapolating from the late August 2021 Twitter and Reddit samples, we estimate:
The potential reach of antisemitic tweets in that one week alone was 130 million people on Twitter. An equivalent estimate of the reach of antisemitic content on Reddit is not available.
That extraordinary reach was driven by an estimated 27,400 antisemitic tweets posted on Twitter that week, a figure our machine learning tool enabled us to calculate; on Reddit, we found 1,980 antisemitic comments.
The rate of antisemitic content on Twitter was 25% higher than it was on Reddit during that week.
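The estimates above follow a standard survey-style extrapolation: label a representative sample, compute the rate of antisemitic posts in it, and scale that rate up to the platform's total volume for the week. As a rough illustration, the arithmetic can be sketched as follows (every number in the snippet is a hypothetical placeholder, not an actual sample size or volume from this report; only the method reflects the text):

```python
# Sketch of sample-based extrapolation, with hypothetical inputs.

def extrapolate_count(hateful_in_sample: int, sample_size: int,
                      platform_total: int) -> float:
    """Scale a labeled sample's hate rate up to platform-wide volume."""
    rate = hateful_in_sample / sample_size  # per-post rate in the sample
    return rate * platform_total            # estimated count for the full week

def relative_rate_difference(rate_a: float, rate_b: float) -> float:
    """How much higher rate_a is than rate_b, as a fraction (0.25 = 25%)."""
    return rate_a / rate_b - 1

# Illustrative placeholder numbers only:
estimate = extrapolate_count(hateful_in_sample=50,
                             sample_size=100_000,
                             platform_total=200_000_000)
```

A comparison like "25% higher on Twitter than on Reddit" comes from the per-post *rates*, not the raw counts, since the two platforms' total volumes differ by orders of magnitude.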
A month later, we evaluated company enforcement against the antisemitic content we had found. Then, more than two months after the initial investigation, we repeated the analysis on the same sample. We found that the great majority of the antisemitic content had remained on the platforms for months, in clear violation of company guidelines on hate content, revealing the continuing inadequacy of the companies’ content moderation.
Moreover, even after ADL eventually reported the content that had remained online for more than two months after the initial discovery, the companies failed to remove more than half of that original antisemitic content.
We returned to the antisemitic content about a month after the initial discovery to see if the platforms had removed it, but little had changed. On Twitter, 79% of the original antisemitic tweets remained, and on Reddit, 74% of the antisemitic comments remained.
We returned again more than two months after the initial discovery and found that at least 70% of the anti-Jewish content was still on the platforms.
Finally, on November 10, 2021, more than two months after the initial discovery, we contacted the platforms directly to report the antisemitic content from our samples that remained online.
One week after that notification, we returned to the representative samples and found that 56% of the antisemitic Reddit comments and 57% of the antisemitic tweets were still online.
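The enforcement figures in the timeline above are simple retention rates: the share of originally identified items still online at each checkpoint. A minimal sketch of that calculation, using hypothetical counts rather than the report's underlying sample tallies:

```python
# Sketch of the retention-rate calculation behind the timeline percentages.
# The counts below are hypothetical placeholders, not the report's data.

def retention_rate(still_online: int, originally_found: int) -> float:
    """Fraction of originally identified items still online at a checkpoint."""
    return still_online / originally_found

# Illustrative checkpoints for a hypothetical sample of 100 items:
checkpoints = {
    "one month": retention_rate(79, 100),          # e.g. 79% still up
    "two-plus months": retention_rate(70, 100),    # e.g. 70% still up
    "one week after reporting": retention_rate(57, 100),
}
```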
Among the hundreds of millions of tweets and the tens of millions of Reddit comments posted during the week in question, 27,000 antisemitic tweets and 2,000 antisemitic Reddit comments may not sound like much relative to overall volume, but these figures are in line with the relatively small yet disproportionately harmful levels of toxic and abusive content that experts have found across online platforms. One such study indicates that approximately 0.001–1% of content on mainstream online platforms may contain some form of abuse, a category that includes not only content that targets people based on their identities, but also more generally abusive content. More niche platforms may have levels of abuse closer to 5–8%.
Research consistently shows the impact of hate and harassment online is significant, and even a single targeted comment can affect a person’s life to an extraordinary degree. In our most recent annual survey of Online Hate and Harassment, published in March 2021, 41% of American adults reported experiencing online hate and harassment, and 27% reported being subjected to severe online harassment (defined as sexual harassment, stalking, physical threats, swatting, doxing, and sustained harassment). Thirty-three percent of all respondents in our survey reported experiencing online hate that was identity-based, targeting individuals or groups on the basis of, for example, race, religion, ethnicity, gender, or sexual orientation. The scale of the harm caused by antisemitism online comes into even sharper focus when the extraordinary reach of the content we find is considered in conjunction with ADL’s recent research showing that 31% of Jewish Americans report being targeted on platforms because they are Jewish.
This report almost certainly undercounts the overall prevalence of antisemitism on the platforms we researched: ADL’s classifier detects only English-language, text-based antisemitism, excluding videos, audio, and images. The tool is also better at detecting explicit language than subtle language, though our ongoing training of the classifier will continue to improve that capability.
Indeed, it’s particularly dismaying that so much of the content we discovered was blatantly antisemitic. It was not even debatable.
To the best of our knowledge, the ADL Online Hate Index and this investigation represent the first cross-platform measurement of the prevalence of antisemitic content on social media undertaken by an independent civil society organization utilizing an AI tool that is meticulously trained by experts in antisemitism and volunteers from the targeted community. This enables the first-ever independent, AI-assisted, community-based measurement of identity-based hate across an entire platform.
The ability to independently measure hate content at scale, and compare results between and among different platforms, is crucial to understanding how much hate exists online. It makes possible a better understanding of what internal or external triggers may increase or decrease the amount of hate online. It is also essential for independently determining if companies’ anti-hate policies, practices, and product changes work and if their claims on that score can be verified. This work also provides a model for other civil society organizations rooted in targeted and marginalized groups who wish to take active roles in training similar classifiers to identify the specific types of online hate that target their communities.
It is worth emphasizing again that Twitter and Reddit, to their credit (and in marked contrast to the world’s largest social media platform, Facebook), make their data far more accessible to independent researchers for this kind of analysis. While there are areas for improvement, we commend both platforms for this transparency. We hope it serves as an example of how to advance the fight against online hate.