Twitter launches prompt encouraging users to rethink hateful replies

Twitter prompt

As conversations on Twitter often turn into heated debates and hateful arguments, the platform is now launching a prompt that asks people to review offensive or potentially harmful language before tweeting.

The prompts are an improved version of the ones being tested in 2020, with changes made after incorporating user feedback and have now been rolled out on iOS and Android. The users would have the option to Tweet, Edit, or Delete the reply after seeing the prompt by Twitter.

Insults, strong language, or hateful remarks would trigger the algorithm to show the prompt. Early tests found that nuances such as sarcasm, and friendly banter were also triggering the prompt as it ould not differentiate such replies from potentially hateful language.

Detection inconsistencies have corrected on the basis of results generated from early experiments and feedback gathered from the public, and Twitter also mentions that “the tests ultimately resulted in people sending less potentially offensive replies across the service and improved behavior on Twitter”.

Also Read: Twitter Updates: Vaccination Prompt & Professional Profiles

Twitter has also made a few more alterations to the algorithms powering the prompts and the detection systems. The nature of the relationship between the author and replier, based on the frequency of their interactions and the likelihood to have an understanding of the preferred tone of communication, will also be considered.

In situations wherein underrepresented communities use the language in non-harmful ways, would be better accounted for by the modifications to the technology.

Improvement to the technology to more accurately detect strong language, including profanity, and an easier way for people to let Twitter know if they found the prompt helpful or relevant, have also been integrated.

Early experiments found that if prompted, 34% of people revised their initial reply or decided to not send their reply at all, and people were less likely to receive offensive and harmful replies back. After being prompted once, people composed, on average, 11% fewer offensive replies in the future.