Twitter has released a data store of posts from 3,841 accounts identified as connected to the Internet Research Agency (IRA), the Russian "troll factory" that used Twitter and Facebook to conduct an "influence campaign" aimed at causing political turmoil during the 2016 US presidential election and at undermining the political process in other countries, including Germany and Ukraine. The company has also released a second data set covering 770 accounts believed to be part of an Iranian influence campaign.

Totaling over 360 gigabytes—including more than 10 million tweets and associated metadata, and over 2 million images, animated GIFs, videos and Periscope streams—the data store provides a picture of how state-sponsored agencies have used the Twitter platform. Some of the content dates back as far as 2009.

In a post announcing the release, Twitter Legal, Policy and Trust & Safety lead Vijaya Gadde and Twitter's head of Site Integrity Yoel Roth wrote that Twitter was providing the data "with the goal of encouraging open research and investigation of [state-sponsored influence and information campaigns] from researchers and academics around the world."

The archive of the IRA's tweet metadata alone is 5.4GB of comma-separated data when expanded. The user IDs and screen names of many accounts (those with fewer than 5,000 followers) have been concealed with hash values to "reduce the potential negative impact on real or compromised accounts," a Twitter spokesperson said in a statement on the data archive. The hash values still allow individual accounts to be analyzed without exposing the actual names associated with them.
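To illustrate how hashed identifiers can conceal names while still letting researchers group tweets by account, here is a minimal Python sketch. It is not Twitter's actual method: the hash function (SHA-256), the salt, and the column names `userid`, `user_screen_name`, and `follower_count` are all assumptions made for the example. Only the 5,000-follower threshold comes from the description above.

```python
import csv
import hashlib
import io

# Threshold taken from the article: accounts with fewer followers are concealed.
FOLLOWER_THRESHOLD = 5000


def pseudonymize(value: str, salt: str = "example-salt") -> str:
    """Return a stable hash of a value. SHA-256 and the salt are assumptions."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


def conceal_row(row: dict) -> dict:
    """Hash the identifying fields of a row if the account is below the threshold."""
    out = dict(row)
    if int(row["follower_count"]) < FOLLOWER_THRESHOLD:
        out["userid"] = pseudonymize(row["userid"])
        out["user_screen_name"] = pseudonymize(row["user_screen_name"])
    return out


# Made-up sample data with hypothetical column names.
SAMPLE = """userid,user_screen_name,follower_count,tweet_text
111,small_account,120,first post
111,small_account,120,second post
222,big_account,90000,hello
"""

rows = [conceal_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]

for r in rows:
    print(r["userid"][:12], r["follower_count"], r["tweet_text"])
```

Because the same input always produces the same hash, both tweets from the small account carry an identical concealed ID and can still be analyzed together, while the large account's ID is left unchanged.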

[...] Gadde and Roth noted that Twitter expects these sorts of campaigns to continue and said that Twitter's Site Integrity team will "continue to proactively combat nefarious attempts to undermine the integrity of Twitter, while partnering with civil society, government, our industry peers, and researchers to improve our collective understanding of coordinated attempts to interfere in the public conversation."