SoylentNews is people

posted by mrpg on Wednesday December 27 2017, @09:00AM
from the announcement dept.

Starting on Jan. 1, 2018, the U.S. Library of Congress will archive tweets only selectively, rather than collecting nearly every public tweet:

Since 2010, Library of Congress has been archiving every single public tweet: Yours, ours, the president's. But today, the institution announced it will no longer archive every one of our status updates, opinion threads, and "big if true"s. As of Jan. 1, the library will only acquire tweets "on a very selective basis."

The library says it began archiving tweets "for the same reason it collects other materials – to acquire and preserve a record of knowledge and creativity for Congress and the American people." The archive stretches back to Twitter's beginning, in 2006.

But as anyone who's been following along can attest, Twitter and the way it's used has changed since then. First and foremost from a collection perspective: the sheer number of tweets.

"The volume of tweets and related transactions has evolved and increased dramatically since the initial agreement was signed," the library explains in a white paper accompanying the accouncement[sic].

The library doesn't say how many tweets [it] has in its collection now, but in 2013, it said it had already amassed 170 billion tweets, at a rate of half a billion tweets a day.

[...] Another issue: Twitter only gives the library the text of tweets – not images, videos, or linked content. "Tweets now are often more visual than textual, limiting the value of text-only collecting," the library says.

The library also has to figure out how to effectively manage deleted tweets, which aren't part of the archive.


This discussion has been archived. No new comments can be posted.
  • (Score: 2) by All Your Lawn Are Belong To Us (6553) on Wednesday December 27 2017, @05:39PM (#614800)

    Probably a coincidence, but it is an interesting concept. Of course, they never had to take all tweets in the first place, but since they started I wonder if anyplace else is trying something similar (Internet Archive? Something else?)

    I'm sure they'll have standards and if researchers are actually going to query the data they'll have to know what the new limits are. But AFAIK, the LoC has always had carte blanche in determining what it thinks is necessary to preserve. Somehow I doubt that the two Stradivarius violins they have (per Wikipedia) will ever be used in a CBO study. ;)

    --
    This sig for rent.