Twitter is experimenting with several options and practices to reduce toxicity and hate speech on the platform.
Kayvon Beykpour, Twitter's Product Lead, in a conversation with Wired discussed the approaches Twitter is experimenting with to tackle the growing problem of hate speech on the platform.
Beykpour mentions that one approach under experimentation is using machine learning to identify tweets containing hate speech and de-rank them.
So tweets that are most likely to be blocked or reported for abuse would appear lower in the thread. Twitter normally surfaces replies that are more likely to get reactions, but it de-ranks replies that, while reaction-worthy, are abusive.
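The ranking idea can be sketched as an engagement score penalized by a model-predicted abuse probability. This is a minimal illustrative sketch; the field names, weights, and scoring formula are assumptions, not Twitter's actual system.

```python
# Hypothetical reply ranking: surface engaging replies, but demote those a
# model predicts are likely to be blocked or reported for abuse.
# All scores and weights here are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Reply:
    text: str
    engagement_score: float  # predicted likelihood of reactions (0..1)
    abuse_score: float       # predicted likelihood of being reported (0..1)

def rank_replies(replies, abuse_penalty=2.0):
    """Sort replies by engagement minus a weighted abuse penalty."""
    def score(r):
        return r.engagement_score - abuse_penalty * r.abuse_score
    return sorted(replies, key=score, reverse=True)

replies = [
    Reply("Great point, thanks for sharing", engagement_score=0.6, abuse_score=0.05),
    Reply("You absolute idiot", engagement_score=0.9, abuse_score=0.8),
]
ranked = rank_replies(replies)
# The abusive reply is "reaction-worthy" (higher engagement) yet ranks lower.
```

The key design choice mirrors the article: abusive replies are not removed, only pushed lower in the thread.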
Twitter is building AI that can detect hate speech by precisely defining what its rules are and collecting sample data of tweets that violate those rules.
The AI models being built will only predict whether tweets violate Twitter’s rules. Beykpour adds that some tweets users consider abusive may not violate Twitter’s rules, “and that’s where it gets tricky”.
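The supervised setup described here, a model trained on sample tweets labeled as rule-violating or not, might look like the following sketch. The toy dataset and model choice (TF-IDF features with logistic regression) are illustrative assumptions; Twitter's actual models and training data are not public.

```python
# Minimal sketch of a rule-violation classifier trained on labeled sample
# tweets. Dataset and labels are invented for illustration only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

tweets = [
    "I hate you and everyone like you",   # labeled as violating the rules
    "go back to where you came from",     # labeled as violating the rules
    "have a wonderful day everyone",      # labeled as not violating
    "loved the game last night",          # labeled as not violating
]
labels = [1, 1, 0, 0]  # 1 = violates the platform's rules

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(tweets, labels)

# Predict rule violations for new tweets (0 or 1 per tweet).
predictions = model.predict(["what a lovely evening", "I hate all of you"])
```

A word-level model like this illustrates the limitation the article goes on to describe: it reacts to individual words, so a tweet that quotes or denounces hateful language can score the same as one that promotes it.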
Twitter has previously been involved in several controversies surrounding this tricky area. Twitter’s systems seemingly detect individual words rather than the actual meaning of a sentence. So even tweets that denounce hateful conduct rather than promote it get removed, and occasionally the account even gets banned.
Several users have complained about this too.
Moreover, users have also pointed out that hateful tweets from high-profile personalities such as Donald Trump stay on the platform, without any consequences.
Twitter has defended itself by saying that tweets from such personalities, while potentially hateful, need to stay on the platform to provide context.
Twitter may have to train its AI models to understand tweets rather than merely detect keywords in them, or the back-and-forth debate over suspensions might never end.