Safety Mode: Twitter to trial new account autoblock feature

1 Sep 2021

A phone with Twitter’s safety mode feature is shown on-screen. The black smartphone is depicted against a blue background.

Image: Twitter

When enabled, Twitter’s new setting will temporarily block accounts sending tweets that it deems harmful, offensive or repetitive.

Twitter is trialling a new feature called Safety Mode, which will try to detect when accounts are sending potentially harmful or insulting tweets. When the feature flags a tweet, the user’s account will automatically block the sender for seven days.

Twitter said its technology will “assess the likelihood of a negative engagement by considering both the tweet’s content and the relationship between the tweet author and replier”. This means that blocks won’t come into effect between two users who interact frequently, so playful banter shouldn’t be mistaken for offensive content.
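As a rough illustration only, the decision Twitter describes could be sketched in Python along the following lines. Twitter has not published how its system works, so the `Reply` type, the `harm_score` input, the `should_autoblock` function and both thresholds are hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass

# Hypothetical values: Twitter has not disclosed any real thresholds.
HARM_THRESHOLD = 0.8
FREQUENT_INTERACTIONS = 10


@dataclass
class Reply:
    author: str        # account that sent the reply
    recipient: str     # account with Safety Mode turned on
    harm_score: float  # assumed output of a content classifier, 0.0 to 1.0


def should_autoblock(reply: Reply, past_interactions: int) -> bool:
    """Combine the two signals named in the article: the tweet's content
    and the relationship between the two accounts."""
    # Accounts that interact frequently are exempt, so banter is not flagged.
    if past_interactions >= FREQUENT_INTERACTIONS:
        return False
    # Otherwise the decision rests on the (assumed) content score.
    return reply.harm_score >= HARM_THRESHOLD
```

In this sketch, a high-scoring reply from a stranger would trigger a block, while the same reply from an account with a long history of interaction would not.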

The feature won’t be rolled out to everyone just yet, but will be available to select members of a feedback group. Those in the group will be able to turn on Safety Mode through Twitter’s privacy settings.

When Safety Mode detects harmful content, the offending account will be unable to follow the user’s account, see their tweets or send any direct messages until the seven days have elapsed.
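The time-limited nature of the block can be pictured with a short Python sketch. It assumes a simple in-memory store, and the `AutoblockList` class and its method names are hypothetical. Only the seven-day duration and the ability to undo a block come from Twitter's description.

```python
from datetime import datetime, timedelta

BLOCK_DURATION = timedelta(days=7)  # the seven-day period stated by Twitter


class AutoblockList:
    """Hypothetical per-user record of temporary blocks."""

    def __init__(self) -> None:
        # Maps a blocked account name to the moment its block expires.
        self._expires_at = {}

    def block(self, account: str, now: datetime) -> None:
        self._expires_at[account] = now + BLOCK_DURATION

    def undo(self, account: str) -> None:
        # Twitter says autoblocks can be undone at any time in settings.
        self._expires_at.pop(account, None)

    def is_blocked(self, account: str, now: datetime) -> bool:
        expiry = self._expires_at.get(account)
        return expiry is not None and now < expiry
```

While `is_blocked` returns true for an account, that account would be unable to follow the user, see their tweets or send direct messages. Once seven days have passed, the check returns false without any further action.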

Safety Mode users will be able to see the tweets that caused the automated intervention, as well as the details of the blocked account. Before the block period ends, the user will receive a notification with a recap of this information so they can make any further necessary decisions.

“We won’t always get this right and may make mistakes, so Safety Mode autoblocks can be seen and undone at any time in your settings,” Twitter explained. “We’ll also regularly monitor the accuracy of our Safety Mode systems to make improvements to our detection capabilities.”

Twitter also said it had held feedback sessions beforehand with expert partners in online safety, mental health and human rights. Its Trust and Safety Council played a large part in this, with contributing members asked for their opinions and advice.

“As members of the Trust and Safety Council, we provided feedback on Safety Mode to ensure it entails mitigations that protect counter-speech while also addressing online harassment towards women and journalists,” said Article 19, a human rights organisation centred on digital rights and equality.

“Safety Mode is another step in the right direction towards making Twitter a safe place to participate in the public conversation without fear of abuse.”

Twitter particularly emphasised online gender-based violence as an area of concern. It said that it held a number of discussions on how best to implement customisation for features like Safety Mode.

The social media platform has lately been busy tweaking its features. It received flak for its updated font and visual design, and also retired Fleets, its temporary content feature. The company previously said that it planned to spend additional funds this year on research and development.