Twitter releases enormous datasets of Russian tweets for data scientists

17 Oct 2018

A Twitter sign on a building. Image: wolterke/Depositphotos

Data journalists and researchers will now get a chance to search through millions of tweets linked with operatives in Russia and Iran.

At the beginning of 2018, Twitter revealed the findings of its investigation into the 2016 US presidential election and the platform’s effect on it. Twitter is widely believed to have played a key role in the unexpected election of Donald Trump. In the months that followed his victory, reports emerged that teams of dedicated operatives in Russia and Iran had worked to stir up hatred and division on the social media platform to encourage a Trump victory.

Now, after promising the US Congress that it was going to thoroughly investigate what happened prior to the election, Twitter is dumping millions of tweets online in huge datasets.

In a blogpost, the company said the datasets contain all the accounts and content associated with “potential information operatives” found on the platform since 2016.

In numbers, Twitter said the datasets comprise 3,841 accounts affiliated with the Internet Research Agency – a Russia-based company that attempts to influence online discussions – and a further 770 accounts possibly linked with Iran.

‘They will adapt and change’

Content-wise, this includes more than 10m tweets as well as 2m images, GIFs, videos and Periscope broadcasts. All of this, authors Vijaya Gadde and Yoel Roth said, has been done “with the goal of encouraging open research and investigation of these behaviours from researchers and academics around the world”.

The company did admit, however, that it will likely never completely eliminate the efforts of the Internet Research Agency and other operatives as “they will adapt and change as the geopolitical terrain evolves worldwide and as new technologies emerge”.

Meanwhile, here in Ireland, Twitter has come under investigation from the Irish Data Protection Commission (DPC) over its refusal to provide a specific user with data about how he is tracked when he clicks on links in tweets.

University College London researcher Michael Veale said that he asked Twitter to provide him with the personal data the company had collected from him when he clicked on links in other people’s tweets, but Twitter refused. The Irish DPC received a complaint from Veale, and is now investigating the issue.


Colm Gorey was a senior journalist with Silicon Republic