Library of Congress To Receive Entire Twitter Archive

samzenpus posted more than 2 years ago | from the that's-a-lot-more-than-144-characters dept.

Twitter 106

An anonymous reader writes "The Library of Congress and Twitter have signed an agreement that will see an archive of every public Tweet ever sent handed over to the library's repository of historical documents. 'We have an agreement with Twitter where they have a bunch of servers with their historic archive of tweets, everything that was sent out and declared to be public,' said Bill Lefurgy, the digital initiatives program manager at the library's national digital information infrastructure and preservation program. Researchers will be able to look at the Twitter archive as a complete set of data, which they could then data-mine for interesting information."

106 comments

Even deleted ones? (5, Interesting)

Anonymous Coward | more than 2 years ago | (#38295564)

I deleted my Twitter account and it's been 30 days. Does Twitter still keep those tweets for posterity on their servers through some manner of legal acrobatics?

Re:Even deleted ones? (5, Insightful)

Shoe Puppet (1557239) | more than 2 years ago | (#38295598)

I stopped selling the book I wrote and it's been 30 days. May the world still have copies of it through some manner of legal acrobatics?

Once you have published something, you cannot expect to be able to pull it back.

Re:Even deleted ones? (4, Insightful)

Anonymous Coward | more than 2 years ago | (#38295760)

Twitter says they're going to delete it after thirty days. There's a marked difference between 'delete' and 'archive'. I have no issue if someone cut and pasted the last 3200 tweets from my Twitter account but the fact Twitter says they'll delete the tweets, not archive them, is deceptive.

Re:Even deleted ones? (2)

RealGrouchy (943109) | more than 2 years ago | (#38300008)

Twitter says they're going to delete it after thirty days.

No they don't.

Why can't I see all my Tweets? My Tweet count is _,___. Are they lost?

The good news is they're not lost or gone! We have all your Tweets. The bad news is that we currently only allow you to see the 3200 most recent Tweets (this could also be construed as good news, as that number could be lower than 3200). We do not currently plan to change this limit, but we welcome your feedback - just send a mention to @feedback.

From the Twitter FAQs [twitter.com] .

- RG>

Re:Even deleted ones? (1)

Anonymous Coward | more than 2 years ago | (#38295774)

Bad Analogy. Twitter still hosting the content is the equivalent of a publisher still selling the books.

Re:Even deleted ones? (0)

Anonymous Coward | more than 2 years ago | (#38296208)

Bad Analogy. Twitter still hosting the content is the equivalent of a publisher still selling the books.

No, it's not. It's the equivalent of book stores still selling the books that they have on inventory.

Anything that you yourself have placed online is self-published. Twitter is just the medium you've chosen.

Re:Even deleted ones? (0)

Anonymous Coward | more than 2 years ago | (#38296622)

When you post a tweet, you allow twitter to publish your tweet.
When twitter allows you to delete a tweet, twitter implicitly allows you to revoke the right to publish the tweet.

how difficult can this be ?

Re:Even deleted ones? (1)

19thNervousBreakdown (768619) | more than 2 years ago | (#38295824)

Legal acrobatics? You published it! That means that anyone in the whole internet who asked for it, got it. There should be no more expectation that you can take that back than you should be able to stop people from remembering what you said out loud. Less. Twitter themselves couldn't take it back if they wanted to.

Re:Even deleted ones? (1)

petman (619526) | more than 2 years ago | (#38299288)

Interesting that you analogized tweeting with saying something out loud. I'm pretty sure if someone were to record things random people say in a public place and then publish the recordings, there would be legal repercussions.

Re:Even deleted ones? (1)

shentino (1139071) | more than 2 years ago | (#38297282)

I think your tweets will disappear faster if you send them a DMCA notice and threaten to sue their pants off.

Some pastebin links to legally hot material have mysteriously gone 404.

Re:Even deleted ones? (1)

Anonymous Coward | more than 2 years ago | (#38298772)

I guess that's why they call it a FAQ:
"Private account information and deleted tweets will not be part of the archive."

http://blogs.loc.gov/loc/2010/04/the-library-and-twitter-an-faq/

Any from anyone? (3, Insightful)

AdamJS (2466928) | more than 2 years ago | (#38295584)

Even if it's in their TOS that you lose all rights to the IP contained in a given tweet, this will more than guarantee some lawsuits from some very large groups.

Re:Any from anyone? (4, Insightful)

Baloroth (2370816) | more than 2 years ago | (#38295642)

Why? Anyone who made a public tweet with the expectation of being able to retain some control over it is, well, a moron... oh wait, never mind. You're probably right.

Re:Any from anyone? (1)

AdamJS (2466928) | more than 2 years ago | (#38303374)

IIRC corporates have sued over confidential information that was recorded on unlisted (but still public!) satellite channels.

I think the DNC may have also taken action over the Clinton video that appears in Spin (long before, of course), but I can't remember exactly.

Re:Any from anyone? (1)

El_Muerte_TDS (592157) | more than 2 years ago | (#38295962)

No, you don't lose "Intellectual Property". You just gave Twitter, and everybody else, a non-exclusive right to distribute your "Intellectual Property". But it's still yours.

Then again, given twitter's size limit, it's not protected under the general interpretation of copyright. It's only 160 characters.

Re:Any from anyone? (1)

Anonymous Coward | more than 2 years ago | (#38296066)

Twitter is only 140 characters, actually.

Re:Any from anyone? (1)

ksd1337 (1029386) | more than 2 years ago | (#38296282)

I think most web services work this way. IANAL, but I'm sure a signature is required for an actual copyright transfer.

Re:Any from anyone? (1)

shentino (1139071) | more than 2 years ago | (#38297292)

Twitter can still be served with a DMCA takedown notice.

Imagine if someone tweeted the PS3 root keys?

Re:Any from anyone? (1)

AdamJS (2466928) | more than 2 years ago | (#38303396)

But if all of it so far is going into the public domain (perhaps I'm wrong on that point, I really don't know) then there's nothing to take down?

Re:Any from anyone? (1)

shentino (1139071) | more than 2 years ago | (#38303672)

It's not in the public domain just because it gets posted to twitter.

Copyrighted content remains copyrighted no matter where you post it.

Slapping something on twitter that wasn't yours to begin with doesn't magically make it subject to twitter's terms. You have violated the implied warranty of authority by attempting to act without the permission of the copyright holder as their agent.

Which means you get busted for infringement and your rogue post to twitter gets taken down in compliance with the DMCA. It's no different from posting a copyrighted video on Youtube and having it deleted when someone drops a DMCA notice on their legal department.

Oh great... (5, Funny)

BlastfireRS (2205212) | more than 2 years ago | (#38295616)

...now the inane mumblings and poor grammar of the Twitter Age will be remembered throughout history. I was kinda hoping we'd eventually be able to forget all of this ever happened.

Re:Oh great... (1)

Anonymous Coward | more than 2 years ago | (#38295948)

Same with the previous generations and Usenet. Heck, even Slashdot archives everything. Just face up to it: if you don't want your comments to be online forever, don't upload them in the first place! The RIAA and MPAA are having to learn this the hard way.

Re:Oh great... (1)

chuckinator (2409512) | more than 2 years ago | (#38296564)

...now the inane mumblings and poor grammar of the Twitter Age will be remembered throughout history. I was kinda hoping we'd eventually be able to forget all of this ever happened.

Obligatory inane comment about wasted taxpayer money.

Re:Oh great... (1)

Anonymous Coward | more than 2 years ago | (#38296630)

Sure, why not. A historical rune inscription that I am rather fond of reads
"oli er oskeyndr auk strodhinn i rassinn",
which apparently means roughly "Oli has been taken in his unwiped ass."
We tend to think of the study of history as a dignified, if not outright dull pursuit, but there's a lot of vulgarity there, in both senses.

I'm sure that seen with the hindsight of twenty years, or of a hundred years, the texts on Twitter will have a very distinctive feel of the decade about them. And I think that as long as you remember that they are mostly banter from a selected set of people, you could get a pretty good feel of the Zeitgeist of the time from them.

yes, but... (1)

owlnation (858981) | more than 2 years ago | (#38295622)

How much space will this take up?

Re:yes, but... (4, Interesting)

TheSpoom (715771) | more than 2 years ago | (#38295644)

By itself probably a lot, but remember it's mostly text. They'll be able to compress the hell out of it.

Re:yes, but... (0)

Anonymous Coward | more than 2 years ago | (#38295696)

(50 million per day) * (3 years) * (160 bytes) = 7.97246027 terabytes

50 million tweets per day was the number for early 2010, but on the other hand that is a gross overestimate of the early years. Either way, it's almost certainly under 10TB.

FWIW, 10TB is one of the accepted definitions of a "library of congress" (that refers to the printed collection though, in plain text).
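
The arithmetic above can be checked with a short script. This is only a back-of-the-envelope sketch: the 50 million tweets per day, the 3 years, and the 160 bytes per tweet are the commenter's rough figures, while the mean year length and the use of binary terabytes (2^40 bytes) are assumptions chosen here because they reproduce the quoted result. The function name is made up for illustration.

```python
# Back-of-the-envelope estimate of the raw (uncompressed) tweet archive size.
# All inputs are rough figures from the comment, not measured values.

BYTES_PER_TIB = 2 ** 40      # assumption: "terabyte" means the binary unit
DAYS_PER_YEAR = 365.2422     # assumption: mean year length


def archive_size_bytes(tweets_per_day, years, bytes_per_tweet):
    """Return the estimated uncompressed archive size in bytes."""
    return tweets_per_day * DAYS_PER_YEAR * years * bytes_per_tweet


# The comment's figure of 160 bytes per tweet.
size_160 = archive_size_bytes(50_000_000, 3, 160)
# The 140-character limit pointed out in the replies, at one byte per character.
size_140 = archive_size_bytes(50_000_000, 3, 140)

print(f"160 bytes/tweet: {size_160 / BYTES_PER_TIB:.2f} TiB")
print(f"140 bytes/tweet: {size_140 / BYTES_PER_TIB:.2f} TiB")
```

Under these assumptions both variants land under 10 TB, which is the point the comment is making; metadata, multi-byte characters, and growth after early 2010 are ignored.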

Re:yes, but... (0)

Anonymous Coward | more than 2 years ago | (#38295756)

140 characters, duh.

Re:yes, but... (1)

Anonymous Coward | more than 2 years ago | (#38295806)

unknown, however the library of congress will be increased by approximately 1.5 libraries of congress.

Re:yes, but... (1)

kryliss (72493) | more than 2 years ago | (#38296218)

So how do we measure the amount of data on anything as compared to the library of congress now that the size has changed?!?!? All previous measurements now have to be remeasured using the new Library of Congress size model.

Re:yes, but... (1)

dzfoo (772245) | more than 2 years ago | (#38297354)

Hum... On standard measuring conventions, it'll take exactly one Library of Congress.

Re:yes, but... (1)

VortexCortex (1117377) | more than 2 years ago | (#38302050)

How much space will this take up?

I hope NONE. Otherwise they'll screw up my standard unit of measure: Libraries of Congress.

Pooping (5, Funny)

stevegee58 (1179505) | more than 2 years ago | (#38295636)

All my pooping tweets preserved for all posteriority. (intentional misspelling)

Re:Pooping (0)

Anonymous Coward | more than 2 years ago | (#38299568)

Now I can tell my grandchildren that they can find my mark on humanity in the Library of Congress.

I think I made a fake account 4 years ago and posted "i'm a twitter shitter"

LsoC (-1)

Anonymous Coward | more than 2 years ago | (#38295652)

So, how many libraries of congress is this?

I TOLD YOU! (-1)

Anonymous Coward | more than 2 years ago | (#38295730)

Twitter, Facebook, Google+, Youtube. Everything can and will be used against you.

Results are in (4, Funny)

mr1911 (1942298) | more than 2 years ago | (#38295742)

Researchers will be able to look at the Twitter archive as a complete set of data, which they could then data-mine for interesting information.

Nothing interesting was found.

Re:Results are in (0)

Anonymous Coward | more than 2 years ago | (#38296654)

And nothing of value was gained.

This begs one simple question (3, Funny)

karmicoder (2205760) | more than 2 years ago | (#38295794)

Why?

Re:This begs one simple question (0)

Anonymous Coward | more than 2 years ago | (#38296294)

raises, not begs

Re:This begs one simple question (0)

Anonymous Coward | more than 2 years ago | (#38296404)

This does not "beg" the question, it "raises" the question.

I know it is popular among the stupid to be confused about the phrase "begging the question," but you are a geek, you should hold yourself to a higher standard.

Re:This begs one simple question (2)

geekoid (135745) | more than 2 years ago | (#38296406)

Because it's a great documentation of early online society.
Plus there is a ton of social and behavioral data to be found.

I know it's hip to poo-poo twitter on /., but the vast majority of users are normal people with fine spelling tweeting about things that interest them.

Re:This begs one simple question (2, Insightful)

Anonymous Coward | more than 2 years ago | (#38296462)

So future generations can look back on the golden age of the internet when everybody was talking and nobody was listening.

Re:This begs one simple question (1)

Joe Tie. (567096) | more than 2 years ago | (#38300746)

I'm not sure if you're being sarcastic, but these are of immense historical importance. Often times the big historical events are only seen as such in retrospect. So people's reactions to them tend to be heavily based on conjecture and memory rather than solid data. Say what you want about twitter, but it serves as a minute by minute log of the emotional state of people within seconds of anything happening. And yes, there is some selection bias going on in that it's only data from the kind of person who uses twitter. But that's still a million times better than the couple articles by newspaper writers looking for a story, and who'd probably not be interviewing the common man on the street for what seemed like a blurb at the time. The short answer though is that it's not for you and me. It's for the generations not yet born.

#n hashtag is supposed to help (1)

jslarve (1193417) | more than 2 years ago | (#38295810)

I don't know how correct it is, but @PogoWasRight uses it. I asked her why, and she said that tweets with the #n are ignored by Library of Congress import.

Re:#n hashtag is supposed to help (0)

Anonymous Coward | more than 2 years ago | (#38296200)

Twitter should have done the right thing and made it an option within the user's profile/settings page.

Re:#n hashtag is supposed to help (1)

geekoid (135745) | more than 2 years ago | (#38296426)

They offer a service. If people didn't want the twits....tweets to be public they shouldn't use it.

I'm not sure what all the hullabaloo is anyways; overall, this is a good thing.

Time to put on my tinfoil hat (5, Insightful)

davesque (1911272) | more than 2 years ago | (#38295842)

I've thus far stayed out of the privacy debate, but this is starting to scare me. Where is our right to oblivion, as Jeffrey Rosen put it (see this article [npr.org] )? We call it a right because it represents a fundamental part of the human psyche. Thus, we can either adapt our system to account for it or face the consequences later when the system breaks down. I have to put in a dissenting vote for this idea.

Re:Time to put on my tinfoil hat (1)

Tyrannosaur (2485772) | more than 2 years ago | (#38295918)

wait, you vote for not storing public information?

Re:Time to put on my tinfoil hat (1)

tomhudson (43916) | more than 2 years ago | (#38296642)

It's called "the anonymity of the crowd." If you think about it, following you around in a public place is called stalking for that reason. You have a right to go about your public business without undue and/or unwanted scrutiny, though less than you used to.

It's the same with the tweets - you agreed to post them for people to read in near-real time, not to be fodder for people to look at "forever and ever, world without end, amen and pass the gravy."

There's also the problem of context, both literal and historical. And that people can and do change their opinions, and that sometimes they have legit reasons for wanting stuff to be forgotten in the dustbin of time.

There's also the problem of "emergent data" - conclusions that are drawn when you combine various data sets. For example, given enough data from enough different sources, someone might be able to link you, with 90%+ certainty, to the poster looking for help in a support forum for rape and abuse victims, then use that as an excuse to pass you over for a job because you might have "issues".

The long and the short of it is that people didn't sign up for this, and Twitter is wrong to do it. And so is the LoC. If Twitter were doing this directly, there'd be h*** to pay. That the LoC is using taxpayer funds just makes it worse.

Mark parent troll/Ignorant. (1)

p43751 (170402) | more than 2 years ago | (#38298042)

No it's not called anonymity of the crowd. If You want an example: You are writing something! with your name on it! and posted it on a searchable part of the web.
How You can claim anonymity on something like that is a mystery for me.
It's nothing you said in a crowd where no one knows You at all.
Neither is it something You said to a friend in a crowd where no one else knows You.
It's something You published on the web under Your name.

Your naivety about writing something on the web and expecting it to be forgotten just shows how little You know about computers and after that I just lost interest.

after that I just lost interest.........

Sarah Palin? Is that You?

Re:Mark parent troll/Ignorant. (1)

tomhudson (43916) | more than 2 years ago | (#38299002)

First, the concept of "anonymity of the crowd" is protected by law in places like Canada. You know, places where the government isn't bought and sold so readily because of limits on political campaign financing ...

Second, nobody authorized twitter (or anyone else) to turn over the entire posting history to researchers.

Re:Time to put on my tinfoil hat (0)

Anonymous Coward | more than 2 years ago | (#38299306)

It's called the Internet and digital information. There's no such thing as anonymity of the crowd in a digital age.

Re:Time to put on my tinfoil hat (1)

tomhudson (43916) | more than 2 years ago | (#38299550)

Actually, there is. Example: Cyber-stalking is the net equivalent of stalking. Both are done in public, and both are illegal. Or are you going to argue that both forms of stalking have suddenly become legal?

Just because something is posted in a public forum does NOT give 3rd parties the right to use it beyond the original agreed-upon terms, and nowhere did anyone give express consent to let their posts be aggregated by some 3rd party, such as the Library of Congress. The LoC was not a party to any agreement with the users, and the users only agreed to allow their posts to be stored and shared for the purpose of posts on twitter, not other services, and not a government agency.

Re:Time to put on my tinfoil hat (1)

geekoid (135745) | more than 2 years ago | (#38296446)

It's public information. It has exactly NOTHING to do with privacy.
Let me know when twitter gives you the option for tweets to be private, and then gives out THAT information.

Idiot.

Re:Time to put on my tinfoil hat (1)

compgenius3 (726265) | more than 2 years ago | (#38296624)

Well, they do give you the option to "protect" your tweets, making it so that you have to approve everyone who can see them. Will those be archived as well?

Re:Time to put on my tinfoil hat (1)

davesque (1911272) | more than 2 years ago | (#38297318)

That flamebait reply is also public. Imagine this: Some years later, you are going through an important job application process. The company you want to get hired at queries your name in a "public" online records archive, finds this post where you rashly label someone an idiot, and decides you are unfit to work for them because it gives the impression of a hot temper.

Or perhaps they don't even personally view the post, but it was factored into a kind of "personality score sheet" by a data mining bot. They submitted the query to a company that offers that kind of data mining and personality assessment via public online archives as a service.

Does it really seem that far-fetched? And when the persons involved in this Twitter archive project openly propose the possible use of data mining?

Re:Time to put on my tinfoil hat (1)

RealGrouchy (943109) | more than 2 years ago | (#38300018)

People's diaries and letters are published all the time, especially after they're dead and no longer have a say in the matter. And those are things that aren't initially published in a public medium.

- RG>

Computational linguists rejoice (1)

Anonymous Coward | more than 2 years ago | (#38295862)

A wild corpus appears!

Re:Computational linguists rejoice (0)

Anonymous Coward | more than 2 years ago | (#38300074)

I use my sword to detect good on it.

How big? (1)

cashman73 (855518) | more than 2 years ago | (#38295924)

Does anyone know how big the Twitter archive is? In terms of Libraries of Congress? Because with this new "donation", the size of the Library of Congress could double, and it will increase with every tweet.

Re:How big? (4, Funny)

Amouth (879122) | more than 2 years ago | (#38295970)

well now that the Twitter archive is part of the Library of Congress it can only be reflected as a portion of the Library of Congress.

Re:How big? (1)

sco08y (615665) | more than 2 years ago | (#38298084)

well now that the Twitter archive is part of the Library of Congress it can only be reflected as a portion of the Library of Congress.

So does this mean we've invented the recursive unit?

Yikes (0)

Anonymous Coward | more than 2 years ago | (#38295946)

As a taxpayer, I'm pretty disgusted that we'd spend our money to preserve tweets. What does preserving tweets have over anything else? Why not Facebook comments too? My goodness....what next, we preserve my Netflix recent list and comments on videos? Sigh.

Except (0)

Anonymous Coward | more than 2 years ago | (#38296292)

Things posted by E.U. citizens living in the E.U. should be covered under E.U. law and weren't they just saying we had the right to have everything permanently deleted? If not, then the U.S. cannot complain when E.U. citizens ignore U.S. copyright laws online.

Re:Except (0)

Anonymous Coward | more than 2 years ago | (#38296392)

The EU is bound by treaty to respect US copyrights. The same cannot be said of EU data protection laws on servers domiciled in the US.

A Joke (-1)

Anonymous Coward | more than 2 years ago | (#38296458)

Now only if they could do the same with financial data... oh wait, they would never do that with the current control Wall St. has.

oh yeah (0)

Anonymous Coward | more than 2 years ago | (#38296934)

time to start the tweet cannons and tweet some nice innuendo like "do John Boehner and Newt Gingrich still have gay butseks ?"

Great. (1)

Anachragnome (1008495) | more than 2 years ago | (#38297022)

Great. Now the taxpayers are on the hook for Twitter's backup maintenance costs. Seriously. They don't even need their own storage anymore. I'm sure, since the Library of Congress is a publicly available entity, they'll have full access to the data-sets. They can just pipe everything straight to the LIB servers then access them at will, at any time. And who the fuck is paying for all that bandwidth?

Next we'll see the entire Facebook data-sets, Google cache data...

Twitter Terms of Service (2)

Altanar (56809) | more than 2 years ago | (#38299358)

Twitter Terms of Service: http://twitter.com/tos [twitter.com] "By submitting, posting or displaying Content on or through the Services, you grant us a worldwide, non-exclusive, royalty-free license (with the right to sublicense) to use, copy, reproduce, process, adapt, modify, publish, transmit, display and distribute such Content in any and all media or distribution methods (now known or later developed). You agree that this license includes the right for Twitter to make such Content available to other companies, organizations or individuals who partner with Twitter for the syndication, broadcast, distribution or publication of such Content on other media and services, subject to our terms and conditions for such Content use."

Finally.. (0)

Anonymous Coward | more than 2 years ago | (#38301870)

... I can stop worrying that future generations will be unaware that "I'm drinking orange juice, lol"

And nothing of value was gained? (0)

Anonymous Coward | more than 2 years ago | (#38306446)

And nothing of value was gained?
