
Facebook Crawler Speaks Back

CmdrTaco posted more than 4 years ago | from the everybody-litigate-now dept.

The Courts 317

Last week we ran a story about Facebook suing to get a crawled dataset offline. This week we have a bit of a response written by Pete Warden, the guy who actually did the crawling. He followed robots.txt, and then Facebook's lawyers went after him. It's actually a quite interesting little tale and worth your time.
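For readers unfamiliar with what "followed robots.txt" means in practice, here is a minimal sketch of how a crawler might check a URL against a site's robots.txt rules before fetching it. It uses Python's standard `urllib.robotparser`. The robots.txt content, the `ExampleCrawler` user agent, and the `example.com` URLs are all made up for illustration; this is not Facebook's actual file, and it is not necessarily how Pete Warden's crawler worked.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the parsed rules permit this user agent to fetch url."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(EXAMPLE_ROBOTS_TXT)
# "ExampleCrawler" is a hypothetical user-agent name.
print(may_fetch(parser, "ExampleCrawler", "https://example.com/people/someone"))
print(may_fetch(parser, "ExampleCrawler", "https://example.com/private/data"))
```

Note that a check like this only tells the crawler what the site operator has asked robots to do; it says nothing about the site's terms of service, which is the distinction much of the discussion below turns on.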


317 comments


Pretty naive (5, Insightful)

elrous0 (869638) | more than 4 years ago | (#31747338)

Did this guy really think he could just give away the data that Facebook sells (or intends to sell) to third parties and NOT have them sue him for it? It's no secret that the business model of most of the social sites and big search engines factors in the massive amounts of data they collect on users as a major corporate asset, to be used internally for data mining and also sold (supposedly after being anonymized) to advertisers and other third parties. It takes a babe in the woods to think he can just waltz in and take that away with a "But your robots.txt didn't say I *couldn't* do it" defense, without expecting a big legal fight.

Is the guy in the right? Probably. Would he have a case? Probably. Does either of those facts matter if he doesn't have the big $ needed to hire lawyers and fight through several courts? Nope.

Re:Pretty naive (5, Insightful)

Anonymous Coward | more than 4 years ago | (#31747372)

If he's in the right, but not having as much money as a big corporation means he'll lose anyway, then your U.S. court system is broken. Please fix it.

Re:Pretty naive (3, Insightful)

elrous0 (869638) | more than 4 years ago | (#31747578)

Sure, I'll just go to Congress and explain to them that they should pass a law that will be universally opposed by the corporations that give them millions in campaign contributions every year--because it's the right thing to do.

Re:Pretty naive (2, Funny)

Shakrai (717556) | more than 4 years ago | (#31747674)

I'll just go to Congress and explain to them that they should pass a law that will be universally opposed by the corporations that give them millions in campaign contributions every year

You do realize that corporations can't (legally) contribute to campaigns in the United States, right?

Re:Pretty naive (4, Informative)

elrous0 (869638) | more than 4 years ago | (#31747718)

I hope you're being sarcastic. If not, I have some bad news [npr.org] for you.

Re:Pretty naive (4, Insightful)

Archangel Michael (180766) | more than 4 years ago | (#31748264)

Want to fix the ELECTION laws, while not breaking the First Amendment Rights to Free Speech? It is really quite simple. One simple rule.

Only people (persons, not legal entities) who are eligible to vote can donate to political campaigns.

This doesn't stop corporations from running ads; they just have to do it on their own, and out in the open where everyone can see who they are telling people to vote for. They have to buy their own ads to tell people to vote for Harry Reid or Mitch McConnell.

This also goes for Unions and all other organized groups. Make them buy their own ads for their own causes.

Simple rule, clear, concise, straightforward and solves all sort of problems with current campaign laws, without any bias towards or against anyone.

AND that is why it won't ever be implemented.

And I'm sure that there is someone that is going to be upset because their favorite group won't be able to donate money to a candidate/campaign while at the same time restricting anyone that might oppose them (it) from doing likewise at the same time.

Re:Pretty naive (1, Informative)

CrimsonAvenger (580665) | more than 4 years ago | (#31747960)

You do realize that corporations can't (legally) contribute to campaigns in the United States, right?

Actually, they can. They weren't even forbidden to do so before McCain-Feingold was largely overturned. They were merely limited in the amounts they could contribute.

Now, of course, they can contribute freely to any campaigns....

Re:Pretty naive (0)

Anonymous Coward | more than 4 years ago | (#31747974)

That law was recently repealed (not a joke; you can Google the story). It will take some time for the political momentum to hit mainstream media outlets, but now, for example, Bank of America can donate to a political campaign.

Back before the law that was recently repealed in the U.S.A., we used to have politicians openly owned by corporations. For example, someone would be introduced as "The Senator of US Steel" rather than, say, "The Senator from Pennsylvania". The corruption got so bad that a law was created to stop this type of campaign finance. But because grandchildren and great-grandchildren usually forget the lessons their ancestors had to learn the hard way, the U.S. Supreme Court has decided to repeal the law that prevented corporate financing of political campaigns.

I predict it will take about 10 years before a U.S. politician has the balls to be introduced as "The Senator for Microsoft".

Re:Pretty naive (1)

Dorkmaster Flek (1013045) | more than 4 years ago | (#31748202)

Uh, I think you're thinking of Canada. One of the things I actually love about our system, actually.

Re:Pretty naive (1)

captainpanic (1173915) | more than 4 years ago | (#31747786)

So, you already gave up.
"We'd like to have a fair system, but hey, we can't win anyway."

Your government is a good government and a true representation of the people that live in the USA. The US government doesn't fix this problem, because its people just don't care.

More on topic: I believe that regarding sites like facebook, we're going through a phase of "awareness" where the general public has no clue of how much this website knows about them.
And therefore I salute the guy who tried to screw Facebook and get it out in the open. I realize that he also did it for the money though :)

Re:Pretty naive (1)

elrous0 (869638) | more than 4 years ago | (#31747858)

Not caring and not being naive are two very different things. There are ways to change certain aspects of the system, but appealing to Congress's sense of morality is not one of them--especially in an era of ridiculously expensive election campaigns that absolutely depend on corporate and special interest sponsorship.

Re:Pretty naive (3, Insightful)

Lumpy (12016) | more than 4 years ago | (#31747990)

Nope, I haven't given up; I'm hoping for an uprising. The problem is that 99.997% of all Americans are placated with their cable TV. Fat, dumb and happy is the American way. Almost nobody here will even inconvenience themselves for "freedom". Then we have these "tea party" idiots: loudmouths simply looking for 10 minutes of fame who really have no desire to protect freedom.

Re:Pretty naive (3, Insightful)

rilian4 (591569) | more than 4 years ago | (#31748318)

Interesting. You want people to fight for their freedoms but when someone does, all of a sudden they're a bunch of idiots. Not saying I agree w/ everything the Tea Party stands for but at least they're willing to stand for it in public and fight for it. If they don't get loud, no one listens. Most of them probably have no desire for fame or fortune. They simply want their freedom. If you really want people to stand up for their rights, be prepared for the consequences.

Re:Pretty naive (2, Informative)

Yvanhoe (564877) | more than 4 years ago | (#31748192)

Or you could support Lawrence Lessig's Fix Congress First [fixcongressfirst.org] initiative which proposes to do just that.

Re:Pretty naive (1)

Yvanhoe (564877) | more than 4 years ago | (#31748254)

Don't forget to bring pitchforks and torches.
Yes, people used to do that to fix systems.

Re:Pretty naive (0)

Anonymous Coward | more than 4 years ago | (#31747638)

I don't think the issue is that he's right but our courts cost too much to prove it. I think it's more a matter of "He might be right, according to some strict interpretations, but other than that it's a grey area." That means that given the right argument, from the right lawyer, to the right judge, he could very well lose the case.

Which has implications for everyone. Not just this guy. Which sucks.

Is our system broken? In some instances. Can we fix it? Not without disregarding the whole thing altogether.

Re:Pretty naive (5, Insightful)

Lumpy (12016) | more than 4 years ago | (#31747958)

WE have the best court system money can buy!

Re:Pretty naive (5, Interesting)

qoncept (599709) | more than 4 years ago | (#31747422)

matter if he doesn't have the big $ needed to hire lawyers

Thank you. I ran an open source project for a few years and came home one night to find that my webhost had taken its site down after being contacted by a company with a similar name. The company claimed they'd tried to contact me and explained how my project was causing them harm, but the simple fact of the matter was that my project's name did not infringe on theirs.

I ended up renaming the project. I've told the story dozens of times, and the response is always the same. "That's BS! They can't do that! Go to court!" People don't understand that $20 a month in unmanaged Google ads doesn't cover lawyers the same way that company's actual paying customers do.

Re:Pretty naive (1, Interesting)

Anonymous Coward | more than 4 years ago | (#31747620)

Naive perhaps, but also depressing. Our system does not offer equal justice. It only offers justice for those who can afford to pay handsomely for it and thus guarantees injustice for those who cannot. Hurray.

Of course, even coming up with a hypothetical system of justice that would solve this inequity is incredibly difficult so the system we have endures.

Re:Pretty naive (2, Insightful)

elrous0 (869638) | more than 4 years ago | (#31747946)

A good start would be a "loser pays [pointoflaw.com]" system similar to what they have in much of Europe. It gives people who legitimately have a strong case a chance to find a lawyer, and discourages frivolous lawsuits and lawsuits aimed only at intimidation (so-called "SLAPP" lawsuits).

Re:Pretty naive (2, Insightful)

kalirion (728907) | more than 4 years ago | (#31748312)

And then if your lawyer loses the case, you get to pay for the company's team of 20 $1000/hr lawyers?

Re:Pretty naive (5, Interesting)

Pharmboy (216950) | more than 4 years ago | (#31747724)

American justice might be blind, but it knows what money smells like. One more reason why we need judicial reform to prevent abuses like this. Of course fighting it wouldn't be worth it, as even if you won, your "winnings" would have only been the ability to continue using the name. Another good example is http://www.nissan.com [nissan.com], where he actually fought and won, at a great price. His name is Nissan, and his computer business and name existed back when the cars were called "Datsun", but they sued anyway. This is another one of those "We are bigger than you, thus more deserving of the domain name than you" cases.

Re:Pretty naive (1)

neumayr (819083) | more than 4 years ago | (#31747996)

Interesting. In Germany, we have insurance that pays the legal fees when you need to protect your rights. It's called "rights protection insurance", and it's absurd that it is required. It's just a quick patch for a broken legal system.

But they do lower the price of being actually able to fight for your rights, allowing more people to do so...

Re:Pretty naive (1)

John Hasler (414242) | more than 4 years ago | (#31748186)

Such insurance is available in the USA.

Re:Pretty naive (1)

Altus (1034) | more than 4 years ago | (#31748322)

yea, but I'm betting you need to have the insurance before someone comes to sue you.

If you knew that someone was going to come after you for the name of your open source project you probably wouldn't have used that name in the first place.

It's tough to justify paying for insurance to ensure your own rights, at least before you have experienced being the little guy in a lawsuit, and by then it is too late.

Re:Pretty naive (0)

Anonymous Coward | more than 4 years ago | (#31747482)

Facebook already gives the data away for free. All he did was aggregate it.

As far as legalities; allowing crawlers via robots.txt is an open invitation to crawl, index, and publish results.

He should have hung in there - I'm sure he could have found a law firm to work on that pro bono, with the chance of scoring a jackpot (plus a resume entry) by annihilating Facebook's claims.

Re:Pretty naive (3, Insightful)

elrous0 (869638) | more than 4 years ago | (#31747608)

The only way to "score a jackpot" in a case like this is to have it declared a civil rights case (meaning the losing party has to pay the lawyer's fees of the winner), and that doesn't seem very likely here.

Re:Pretty naive (0)

Anonymous Coward | more than 4 years ago | (#31747536)

Did this guy really think he could just give away the data that Facebook sells (or intends to sell) to third parties and NOT have them sue him for it? It's no secret that the business model of most of the social sites and big search engines factors in the massive amounts of data they collect on users as a major corporate asset

You brought up an interesting question in my mind. I am in the process of making a web browser game. I intended to fund it through micropayments and advertisements. Let's say I get several thousand users... I never considered what I would do if some company approached me and wanted to buy user data. Even if it is as simple as anonymous statistics that I am selling, I don't know how I would respond.

Take the money:
a. The game gets to continue being hosted (to the enjoyment of my users)
b. I continue to make money (being a starving programmer isn't fun, as many of you know)
c. I have to compromise on some of my ideas on privacy "If you have something that you don't want anyone to know, maybe you shouldn't be doing it in the first place," [youtube.com] right? Right?

Deny the money:
a. Turn down potential extra money. Less money = game development is slower
b. I stay a starving programmer. Not really, I have a 9 to 5. But every dollar in this economy helps
c. I get to take the moral high ground! Yay!

I'm not sure what the net good is.

Re:Pretty naive (0)

Anonymous Coward | more than 4 years ago | (#31747892)

Is that you Overload?

His startup "Mailana" is "Anal I Am" in reverse (1, Funny)

tomhudson (43916) | more than 4 years ago | (#31747590)

FTFA:

I'm a software engineer, my last job was at Apple but for the last two years I've been working on my own startup called Mailana. The name comes from 'Mail Analysis', and my goal has been to use the data sitting around in all our inboxes to help us in our day-to-day lives.

All Facebook is doing is nailing his "anal".

re you sig (0)

Anonymous Coward | more than 4 years ago | (#31747808)

> LGBT used the toilet war [transboutique.com]

You understood wrong, someone said "Yeah we boys _got screws_ around".

This is the wrong forum.

Re:Pretty naive (1)

digitalchinky (650880) | more than 4 years ago | (#31747636)

I'm not sure how good his chances would be in court, though how is this any different than people scraping the train or bus timetables and such - after all, it's just a bunch of 'facts' when you get down to it. If facebook didn't want that stuff to be public, then maybe they'd put a little more effort in to their privacy mechanisms.

Re:Pretty naive (5, Interesting)

julesh (229690) | more than 4 years ago | (#31747698)

It takes a babe in the woods to think he can just waltz in and take that away with a "But your robots.txt didn't say I *couldn't* do it" defense, without expecting a big legal fight.

Yes. Apart from anything else, he's just about entirely missing Facebook's point. Facebook don't give a shit how he accesses their site; this has nothing to do with the fact that he spidered it in a way that their robots.txt file allows, and everything to do with the fact that he was *redistributing their data* without consent.

Now, the question becomes whether what he was distributing falls under fair use. This is a very tricky question, and has nothing to do with how he acquired it.

Re:Pretty naive (2, Informative)

John Hasler (414242) | more than 4 years ago | (#31748012)

> *redistributing their data*

No one owns data. Data is not protected by copyright in the US.

Re:Pretty naive (3, Insightful)

Lumpy (12016) | more than 4 years ago | (#31748024)

It ain't their data, it's the owners' data. They are simply hijacking ownership.

Re:Pretty naive (1)

John Hasler (414242) | more than 4 years ago | (#31748220)

> It aint their data, it's the owners data.

Under US law data cannot be owned.

Re:Pretty naive (1)

pla (258480) | more than 4 years ago | (#31748074)

and everything to do with the fact that he was *redistributing their data* without consent.

Just as a point of clarification - Does FaceBook claim to own the copyright to information entered by the users of their site?

If not, the strongest claim FB could make would seem to boil down to theft of service ("stealing" their bandwidth via his spider). And the very fact that they have a robots file defines what they consider fair game in that regard (if they allow Google and Yahoo etc to do it, tough to say "no" to some random academic doing research).

If so... A lot of people who think nothing of uploading pictures of themselves in a drunken gang-bang on the beach might not feel the same way if Facebook claim the right to use that picture, complete with names, as the cover/poster for their new book/movie.


Now, A lot of people have said that he probably can't afford to take this to court even though in the right... Except, he doesn't need to. FB did nothing but threaten, with what most people seem to consider very little on which to base their claims. Solution? Release the data and force them to prove their case. And even if they somehow pull off a Chewbacca... They couldn't undo the release itself, so still lose.

Re:Pretty naive (1)

neumayr (819083) | more than 4 years ago | (#31748126)

As far as I know, Facebook requires an account to get to see or collect anything on their site.
Which means he agreed to some kind of TOS. The robots.txt argument seems to fail there - the TOS most likely takes precedence over what the robots.txt file allows.

Re:Pretty naive (4, Interesting)

whisper_jeff (680366) | more than 4 years ago | (#31747890)

If they want to sell data (as they clearly do given that's what their business model is built upon) then they should take greater precautions to ensure that it is protected. If they leave that information out in the open, for anyone with a hint of insight to find, then they should not be surprised to find their valuable data in the hands of someone else. He didn't delve into their private information - he simply accessed publicly available information that anyone with an internet connection could view.

Facebook got lucky - the data was gathered by just an average Joe without the backing to fight a legal battle. Had it been someone significantly larger, the result may have been "go ahead and sue - we'll see you in court." And, quite frankly, I'd be shocked if Facebook would win that sort of battle. And that's a battle that Facebook decidedly does not want to lose - it would mean the end of their business...

I'd be curious to learn if that information is still available (as I am certain it is...) because someone/some company might decide that's pretty valuable _PUBLIC_ information and might, just might, decide they're willing to battle Facebook's legal team for it... Expensive legal battle over very valuable marketing data... If you have the resources for the fight, it might be a fight worth waging...

Facebook may have gotten lucky once but they may not be so lucky next time...

Re:Pretty naive (1)

poetmatt (793785) | more than 4 years ago | (#31747918)

That's a pretty dead-on first post. He absolutely has a case though; in fact it's quite solid. It's public information. If it were private, that would be another story. I do agree he probably doesn't have the money, but the gamble is that if the case is solid enough, the judge might prevent Facebook's lawyers from going after him - aka ANTI-SLAPP or equivalent.

I have no idea if that would happen or not, but it's certainly possible. Depends on how clued in the judge is to the interwebs.

Yea he could. (1)

unity100 (970058) | more than 4 years ago | (#31748328)

because, they put a robots.txt file in their root folder which allowed him to crawl everything.

It's Facebook's fault.

To keep the lawyers happy... (0)

Anonymous Coward | more than 4 years ago | (#31747368)

...you are supposed to scan scumbags.txt, not robots.txt.

Re:To keep the lawyers happy... (0)

Anonymous Coward | more than 4 years ago | (#31747576)

A lawyer may have contacted him, but I guarantee it was at the behest of an MBA.

Ballsy. (2, Funny)

Pojut (1027544) | more than 4 years ago | (#31747388)

Stupid, but ballsy. Gotta give credit where it's due.

Re:Ballsy. (4, Insightful)

hansamurai (907719) | more than 4 years ago | (#31747496)

Not really ballsy considering he didn't actually let Facebook's challenge of "The only legal way to access any web site with a crawler was to obtain prior written permission" go to court. Maybe he should have gone to the EFF for help as the repercussions of a judge actually deciding in Facebook's favor would have been devastating to the web.

Re:Ballsy. (1)

Pojut (1027544) | more than 4 years ago | (#31747518)

I meant it was ballsy to assume a beast as huge as Facebook would let him do this.

I can understand not wanting to go bankrupt, but I agree with you and others...he likely could have found someone willing to work on this case at no charge. Still, the guy seems quite talented and capable...I'm sure he will find a way to get the professional recognition he deserves.

Re:Ballsy. (0)

Anonymous Coward | more than 4 years ago | (#31747812)

Maybe he should have gone to Google to fund his legal bills.

Re:Ballsy. (1)

smith6174 (986645) | more than 4 years ago | (#31747526)

I agree, ballsy. I say do the crawling anyway! It is obvious that the information has value, and is publicly available. In the intelligence world this kind of stuff is known as "open-source intelligence" and is where an estimated 80% of info is found. I hope the guy didn't sign the agreement. If so, he is probably the only person prohibited from doing the same thing again.

Mark Zuckerberg (5, Interesting)

prayag (1252246) | more than 4 years ago | (#31747404)

Mark Zuckerberg is the most unethical guy in the industry today, as is obvious from the origins of Facebook, his infamous hacking of journalists' passwords during the thefacebook era, and countless other fiascoes that make the news from time to time. Everyone who has ever dealt with him has bad things to say about him.
If he is the face of the next generation entrepreneurs, then god saves the industry.

Re:Mark Zuckerberg (0, Offtopic)

Ornlu (1706502) | more than 4 years ago | (#31747694)

If he is the face of the next generation entrepreneurs, then God save the industry.

There. Fixed that for ya.

Re:Mark Zuckerberg (1, Offtopic)

Pojut (1027544) | more than 4 years ago | (#31747772)

Capitalizing god's name means applying a human characteristic to an omnipotent and all-powerful force...in other words, it's as silly as applying one sex or the other to god.

Re:Mark Zuckerberg (1, Informative)

Anonymous Coward | more than 4 years ago | (#31748110)

Fictional or not, in the context it is a name. Names are usually capitalized, whether it is God or Bugs Bunny.

Re:Mark Zuckerberg (1)

bluewolfcub (1681832) | more than 4 years ago | (#31748166)

I would have regarded the change from "saves" to "save" a more important one, since the former implies god will definitely save the industry...

Re:Mark Zuckerberg (1, Troll)

Jumperalex (185007) | more than 4 years ago | (#31747814)

"If he is the face of the next generation entrepreneurs, then [insert imaginary friend(s)] save the industry"

There. Fixed that for ya.

Annoying having someone tell you about your own beliefs isn't it?

Re:Mark Zuckerberg (1)

Idiomatick (976696) | more than 4 years ago | (#31748128)

If he is the face of the next generation entrepreneurs, then G-d save the industry.

There.

Publicity (2, Interesting)

rwa2 (4391) | more than 4 years ago | (#31747412)

The guy's work looks somewhat interesting. I don't see why he can't just make it a facebook app or something that just happens to crossover onto the rest of the internet as well, maybe that would have helped him fly under their radar if it was seen as something that enhanced facebook.

But seems like his problem all along was lack of publicity, which /. will surely help with.

That said, call me old-school, but I've had more fun with things like ircstats [humdi.net] . So I'm mostly still waiting for this new social crap to catch up.

Arachnophobia (4, Insightful)

mfh (56) | more than 4 years ago | (#31747430)

I might be alone here but spiders revolt me to a point where I simply respect them and leave them alone.

But that said, Google operates a spider, pretty much. So we have to look at any potential spider on the internet like we might look at Google. If he followed the Robots.txt as Facebook set it up and he didn't try to misunderstand it, then there isn't anything they can do. Although, I'm pretty sure the Facebook EULA says you can't spider them so he's SOL anyway if that's the case. This should be a long and drawn out case unless there is a settlement.

Facebook is ripe. People put up EVERYTHING about themselves on there. I never accept a friend request unless I know the person and I offer a challenge question often. If it's not responded to adequately, I simply ignore them. But in the end there isn't much you can do. If you put it on Facebook -- consider it public, like if it was in the phone book.

Re:Arachnophobia (2, Informative)

mfh (56) | more than 4 years ago | (#31747454)

Disregard this, he settled.

Re:Arachnophobia (1)

drinkypoo (153816) | more than 4 years ago | (#31747466)

I might be alone here but spiders revolt me to a point where I simply respect them and leave them alone.

Spiders in my house are OK; spiders in my bathroom must die. Incidentally, this is why I don't run google desktop :D

If he followed the Robots.txt as Facebook set it up and he didn't try to misunderstand it, then there isn't anything they can do.

Would that this were true.

This should be a long and drawn out case unless there is a settlement.

Too true.

Re:Arachnophobia (0)

Anonymous Coward | more than 4 years ago | (#31747624)

Spiders in my house are OK; spiders in my bathroom must die.

Surely your bathroom is in your house?!

Re:Arachnophobia (1)

RalphSleigh (899929) | more than 4 years ago | (#31747752)

Yes it is, but spiders in my bathroom is a more specific declaration and so will override the spiders in my house clause.

Re:Arachnophobia (1)

omnichad (1198475) | more than 4 years ago | (#31747520)

I may be going out on a limb, but I doubt that it matters if he operated the spider in a legal manner. Selling data from Facebook isn't the same thing as the attempt at fair use that Google engages in.

Re:Arachnophobia (3, Funny)

mfh (56) | more than 4 years ago | (#31747644)

Google sells our information by what we like. They do it in a way that somewhat protects our privacy and it's part of their service. Gmail targets ads directly to you based on keywords in your emails. If you had enough money you could know what people are talking about by how the ads played out. Therefore there is no real privacy on Google email, and Google reads our emails.

Google collects all kinds of websites and offers search. They build stats and sell off residual information based on information collected. This is why they have so many PhDs there, so they can understand what everything means on the internet to capitalize on it. Okay, they say they are not evil, but that doesn't mean they don't sell the info to people who are fucking evil. In fact most of the people who deal with Google daily for business transactions (AdSense, AdWords, etc.) would eat babies, given the chance.

Re:Arachnophobia (1)

Herkum01 (592704) | more than 4 years ago | (#31747860)

Yahoo has my email, does that mean they own my email account? I don't think so. When people say that somehow Facebook owns this data it is a load of crap. They no more own that data than Google owns the content of the web sites they have crawled. They provide a place to host information, they provide a way to relate users to each other and users have a way of sharing it. In fact all the users of the system enter the information, not Facebook. So at what point do you people believe that this magically became Facebook's data instead of their users?

Re:Arachnophobia (1)

omnichad (1198475) | more than 4 years ago | (#31748056)

Because their TOS says that they own whatever you put into it. As far as I know, that would stand up in court.

Re:Arachnophobia (1)

John Hasler (414242) | more than 4 years ago | (#31748300)

No one owns data under USA law. Their TOS may get them some sort of license for any copyrightable content (creative expression on Facebook? I suppose there is some...) but it is very unlikely that it can get them ownership of the copyrights: that requires an explicit instrument of conveyance.

Re:Arachnophobia (5, Informative)

OnlyJedi (709288) | more than 4 years ago | (#31747658)

From the Statement of Rights and Responsibilities [facebook.com] , Section 3 "Safety":

2. You will not collect users' content or information, or otherwise access Facebook, using automated means (such as harvesting bots, robots, spiders, or scrapers) without our permission.

The question then becomes how enforceable is the agreement? Sure, if he has an account Facebook can close it, but if he is just accessing Facebook without an account do they have a case? Last I saw you can browse parts of profiles without being logged in, and without ever agreeing to any terms.

Re:Arachnophobia (1)

John Hasler (414242) | more than 4 years ago | (#31748054)

> ...if he is just accessing Facebook without an account do they have a case?

No.

Re:Arachnophobia (1)

Idiomatick (976696) | more than 4 years ago | (#31748234)

That is in the EULA though. There is still plenty of publicly available data where the EULA doesn't apply at all. Ignoring the fact that EULAs are completely unenforceable besides for the transfer of knowledge (checking it shows you read it but you don't have to follow it).

There's something I don't understand (2, Interesting)

Thanshin (1188877) | more than 4 years ago | (#31747432)

Assuming what he did produces a valuable result.

If it's defensible in court by an entity with enough cash or lawyer might, why is there no such entity doing the same thing and then fighting facebook in court?

If it isn't defensible in court, why does it matter that he didn't fight because he didn't have the money?

Re:There's something I don't understand (1)

0xdeadbeef (28836) | more than 4 years ago | (#31747502)

You should invest in SCO.

Re:There's something I don't understand (1)

Idiomatick (976696) | more than 4 years ago | (#31748266)

Who wants to go through the trouble to put themselves at big legal risk just to be right?? With no possible payoff. And guaranteed loss of time and money.

obviously this is abusive (4, Insightful)

circletimessquare (444983) | more than 4 years ago | (#31747434)

this is what the guy should do:

1. engage the lawsuit

the downside is financial exposure. so incorporate your work in such a way that it can't hit your personal finances. the upside is massive exposure. you will achieve some level of fame: the guy who finally gave the robots.txt convention a legal status quo. this will help you professionally, as well as make your life story

2. whine to google

you are completely right that google shouldn't have to get permission every time it wants to crawl the site. therefore GET GOOGLE TO DEFEND YOU

Re:obviously this is abusive (5, Interesting)

ikoleverhate (607286) | more than 4 years ago | (#31747544)

how about if he rejigged his crawler to get the data from the google cache instead? So he'd never get anything from facebook or enter into any implied agreement with them.

brilliant (1)

circletimessquare (444983) | more than 4 years ago | (#31747790)

mod +6

Re:obviously this is abusive (0)

Anonymous Coward | more than 4 years ago | (#31747794)

Then he'd be breaking Google's TOS.

Re:obviously this is abusive (1)

bhtooefr (649901) | more than 4 years ago | (#31747688)

Either that, or there's another tactic that you could use.

Do step 1.

But then, instead of doing step 2, get a crap lawyer. Intentionally lose the case.

Then, Google will lobby Congress to push through a law legalizing robots.txt, which will trump the case law.

Re:obviously this is abusive (1)

idontgno (624372) | more than 4 years ago | (#31747844)

Then, Google will lobby Congress to push through a law legalizing robots.txt, which will trump the case law.

Only if the case hinges entirely on robots.txt

The real "infringement" isn't crawling to collect this data, it's actually collecting it. If you were insane enough to collect usage, friendship network, and other statistics by hand-clicking Facebook pages and tallying numbers with a pad of paper and a pencil, Facebook would still be down your throat.

Those numbers, in Facebook's ego-inflated universe, belong to Facebook. That's their marketing magic, their secret treasure. The demographics and aggregated characteristics of their usership. No one else is allowed to duplicate that. Just ask 'em.

So a law ennobling robots.txt would be as useful as snow shovels on the Titanic: you could push the ice chips off the deck, but that ship is still gonna sink.

Re:obviously this is abusive (3, Insightful)

elrous0 (869638) | more than 4 years ago | (#31747692)

Most lawyers work for money. It's nice to think that the little guy in the right can take on the big guy and win in court. But real life isn't a movie. Most of the time the little guy fighting a case like this ends up broke, whether he wins or loses. It's also nice to think that he could just go to the EFF and get a lawyer for free, but something tells me it's not that simple (I suspect the EFF is already swamped with what few lawyers they have).

Re:obviously this is abusive (0)

Anonymous Coward | more than 4 years ago | (#31748252)

Then you have a really f*cked up judicial system.
The guy in the right should always win and suffer no financial loss whatsoever from the procedure. Otherwise someone with lots of money could bankrupt anyone he wanted by accusing him of some random shit and forcing him to defend himself in court long enough.

Re:obviously this is abusive (1)

Animaether (411575) | more than 4 years ago | (#31748222)

the guy who finally gave the robots.txt convention a legal status quo

Good lord, I hope not. There's two sides to that coin.. if you're going to give legal clout to "it wasn't listed in robots.txt therefore it's legal to index it" you give the same legal clout to the notion "it was listed in robots.txt, so your crawler which disregards it is in violation of legal statutes". Next they would suggest that the pages behind the little "Terms of use" links hidden away somewhere should get legal clout as well.

robots.txt is entirely voluntary. A crawler -may- follow it, it -may- also completely disregard it. If a website owner doesn't want a crawler to see something, actively block it. If they come to the realization that some crawlers don't readily identify themselves (e.g. through the user-agent string), or there's a new crawler in town every month and they don't want to keep adding their identification, then maybe they're just going to have to put the data they don't want to share with the world behind a login.

Ditto terms of use. No - I do -not- have to agree to their 'terms of use' as described in some page in order to be allowed to visit their site. I request its content, they give it to me, end of story. Don't like it? Again - put it behind a login, or block me.

One of the few rights they have are copyrights.. in which case, yes, Facebook -should- be eyeing Google or Archive.org and similar services that do in fact redistribute their content. That doesn't stop anyone from being fully allowed to aggregate data that happens to be presented within a copyrighted document, though (not sure which case applies in the U.S., Feist v Rural? ianal and all that)

legal? (0)

Anonymous Coward | more than 4 years ago | (#31747490)

I'm pretty sure robots.txt doesn't count as a legal document.
and it's not the fact that he downloaded it all, it's the fact that he is distributing it.

Re:legal? (0)

Anonymous Coward | more than 4 years ago | (#31747770)

Does robots.txt have the same force as an executed contract? No, certainly not. But it is a public claim of the permitted uses of your site, and as such has the same sort of legal force as, for example, a sign at the edge of your property that says "Rules for use of this property...". If someone else comes along and reasonably relies upon your public claims, you can't later sue them for that reliance. If your sign says "Come on in, Use the pool", you can later ask them to leave, but you can't sue them for their original use of your property.

Now, this is more complicated by his duplication and distribution of the material; there may also be copyright claims to be made. But from a simple contract-property standpoint robots.txt is a perfectly valid legal document, and so long as his reliance upon it was reasonable and prudent he has no liability for those original actions.

Re:legal? (2, Interesting)

hrvatska (790627) | more than 4 years ago | (#31748136)

robots.txt isn't a legal document, it's an accepted industry standard way for web sites to limit what web spiders and other web robots can search for. Facebook's robots.txt file basically welcomes everyone to come on in and search their site. Complaining that someone used the data that you gave them permission to access is like realtors complaining that someone is visiting open houses they sponsor and then publishing an analysis of houses for sale based on data gathered during those visits. If Facebook doesn't like that others can aggregate data on their site, they should get the industry to agree to a new standard tag that permits crawling but forbids aggregation.
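For anyone who hasn't looked at how a crawler actually consults robots.txt, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt contents, the "ExampleBot" user agent, and the example.com URLs are all made up for illustration; this is not Facebook's real file and not the crawler from the story. It only shows the mechanism, and says nothing about what is legally permitted.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents, invented for this example.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Ask whether the parsed rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(SAMPLE_ROBOTS_TXT)

# A well-behaved crawler makes this check before every request.
print(may_fetch(parser, "ExampleBot", "https://example.com/people/some-profile"))
print(may_fetch(parser, "ExampleBot", "https://example.com/private/data"))
```

Note that the check lives entirely in the crawler. Nothing on the server enforces it, which is why the file works as a convention rather than as an access control.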

"Don't be evil"... (1)

jockeys (753885) | more than 4 years ago | (#31747564)

Do I really need to say anything else at this point?

Re:"Don't be evil"... (1)

mfh (56) | more than 4 years ago | (#31747666)

Most of the income Google makes comes from people looking to game the internet for money. So if Google's source of income comes from evil people, how are they clean?

I suspect this was totally legal (3, Interesting)

Halo- (175936) | more than 4 years ago | (#31747628)

I am not anything even approaching a lawyer, but I suspect his actions were probably legal. The Internet is a public medium; unless you specifically put walls around content, it has the same protection as if you posted fliers on a physical bulletin board in a public place. Yes, you retain copyright over your content, but you have ZERO ability to say "by reading this, you agree to additional terms". If I want to produce a review of all the fliers posted around town, I can. If I want to make excerpts (within "Fair Use") I can. Pretty much the only thing I can't legally do is deface them or copy them outright. Unless he was doing this from a logged in account, I can't see how they can limit what sorts of derivative works he makes. (So long as the derivative doesn't violate copyright)

Ooo, deja vu (5, Insightful)

lxt (724570) | more than 4 years ago | (#31747634)

It's sort of ironic that Facebook is trying to stop someone crawling public profiles on their site, because that's exactly what Mark Zuckerberg did while he was at Harvard (I was a grad student in the CS department at the time).

Pre-Facebook, Zuckerberg created a site that let Harvard students compare each other, a bit like Hot or Not. Obviously nobody was going to go to a site that wasn't populated with their classmates, so he basically crawled the websites of the various residential houses that put their students' info online (but behind passwords and auth) and copied it into his own site.

He actually got into a fair bit of trouble for this, and ended up being sent to Harvard's ad-board for discipline (I think he got put on probation, but I'm not entirely sure).

The key difference here is that this guy actually did everything by the book and followed robots.txt, whereas Mark Zuckerberg didn't.

Create and release dataset anon? (0)

Anonymous Coward | more than 4 years ago | (#31747678)

What is stopping someone from crawling for the same data, and posting it anonymously on something like Wikileaks or the like?

The information that could be gleaned from this dataset is immense, and the one set of data could be analyzed in different ways for years to come.

strange brew that's also good for you (0)

Anonymous Coward | more than 4 years ago | (#31747684)

That would be kombucha.

facebook is evil (0)

Anonymous Coward | more than 4 years ago | (#31747756)

80% of the people on it, and the people who run it. laugh zuckerberg.... you evil queen, you will become IRRELEVANT

I will persist.

Facebook's privacy policy (5, Informative)

whencanistop (1224156) | more than 4 years ago | (#31747762)

Facebook's privacy policy [facebook.com] says:

“Everyone” Privacy Setting. Information set to “everyone” is publicly available information, may be accessed by everyone on the Internet (including people not logged into Facebook), is subject to indexing by third party search engines, may be associated with you outside of Facebook (such as when you visit other sites on the internet), and may be imported and exported by us and others without privacy limitations. The default privacy setting for certain types of information you post on Facebook is set to “everyone.” You can review and change the default settings in your privacy settings. If you delete “everyone” content that you posted on Facebook, we will remove it from your Facebook profile, but have no control over its use outside of Facebook.

I'd also like to point out in their terms [facebook.com] :

When you publish content or information using the "everyone" setting, it means that everyone, including people off of Facebook, will have access to that information and we may not have control over what they do with it.

Re:Facebook's privacy policy (1)

crivens (112213) | more than 4 years ago | (#31747920)

"we may not have control over what they do with it."

Haha - guess Facebook has everything covered. You can't sue Facebook if your info gets into the wrong hands. But the info can't get into the wrong hands because Facebook won't allow it. Unless they give it out. That's why we shouldn't use Facebook.

Privacy ... (3, Funny)

zuperduperman (1206922) | more than 4 years ago | (#31747800)

So, Mark, you say Facebook has a reasonable expectation of privacy for its data? Isn't privacy passe now? Or did I hear you wrong?

Facebook did not sue. (2, Interesting)

John Hasler (414242) | more than 4 years ago | (#31747980)

Threats of legal action are not a lawsuit. He didn't get sued. He got bluffed. I don't blame him for caving in, but he shouldn't mislead people by referring to the receipt of threats from lawyers as being sued (this is the sort of error I expect from the Slashdot editors, of course).

Re:Facebook did not sue. (0)

mbone (558574) | more than 4 years ago | (#31748206)

That was exactly my thought. I don't see any sign (in reading various articles about this) that Facebook actually sued.

Note: Just because a company says they will sue doesn't mean they will sue.

Apparently, he didn't even get a lawyer, so who knows what he actually agreed to? He certainly doesn't.

There is no privacy (1)

fortapocalypse (1231686) | more than 4 years ago | (#31748048)

Privacy is a misguided concept introduced at the same time as when cavemen tried to hide behind trees whilst relieving themselves. It is public data ergo it is public data. Case solved.

will the real EFF please standup (1)

the100rabh (947158) | more than 4 years ago | (#31748116)

will the real EFF please standup