Media Privacy Social Networks Your Rights Online

'Scrapers' Dig Deep For Data On Web 158

srwellman writes "The practice of Web 'scraping' is growing as many firms offer to collect personal, and potentially incriminating, data about users from their social networking profiles and discussions. Many companies even collect online conversations and personal details from social networks, job sites and forums where people might discuss their lives and even potentially sensitive data, such as health issues. These scrapers operate in a legal grey area leaving many users exposed." We ban scrapers like this regularly here simply for not adhering to the rules spelled out in robots.txt.
This discussion has been archived. No new comments can be posted.

  • Like Google? (Score:3, Interesting)

    by bonch ( 38532 ) * on Wednesday April 13, 2011 @12:32PM (#35809300)

    Firms offer to harvest online conversations and collect personal details from social-networking sites, résumé sites and online forums where people might discuss their lives.

    You mean like Google already does for its advertisers? In fact, one of the related links in the article is a story about Google titled Google Agonizes on Privacy as Ad World Vaults Ahead [wsj.com], discussing their plans for utilizing their vast archive of valuable user data. The battle for online privacy was lost long ago.

    • by blair1q ( 305137 )

      This is a new form of privacy of which the news has not come to Harvard.

      I'm pretty sure information posted for the entire planet to read is not private.

      Out on the street, a huckster can size you up in about ten seconds, with 90% accuracy. Online, in text, you're not wearing that tribal-armband tattoo, so it might take a few minutes to figure out you're a joiner with delusions of individuality.

      Time to revise my motto: The Internet is not secure, and open forums are not private.

      • by Anonymous Coward

        > I'm pretty sure information posted for the entire planet to read is not private

        Well, that's what I think too, but amazingly, about 98% of humanity doesn't seem to agree. It seems to me that they're insane if they expect something posted to the whole world to be private, but there are SO many who think that way, I'm not sure what to make of it.

        • The majority of humanity probably think posting something to facebook or whatever is similar to writing "Got totally plastered on holiday" on the back of a postcard and posting it to their local (something that people do)

          Sure, it's public but after a few years it will have vanished without trace.

          Tim.

      • by hoggoth ( 414195 )

        / sheepishly pulls sleeve over tribal armband tattoo...

    • Re:Like Google? (Score:5, Insightful)

      by betterunixthanunix ( 980855 ) on Wednesday April 13, 2011 @01:16PM (#35809838)

      The battle for online privacy was lost long ago.

      Only because one side of the battle never bothered to fight. Nobody was forced to go to social networking websites and post their life story, anyone could encrypt their email and IM conversations, and ad blocking software is widely available. Large amounts of the information that these companies are aggregating could have been made far more difficult to obtain if the majority of computer users could have been bothered.

      Sadly, the Internet has become more of an adversarial game than a way to unite people.

      • The battle for online privacy was lost long ago.

        Only because one side of the battle never bothered to fight. Nobody was forced to go to social networking websites and post their life story, anyone could encrypt their email and IM conversations, and ad blocking software is widely available. Large amounts of the information that these companies are aggregating could have been made far more difficult to obtain if the majority of computer users could have been bothered. Sadly, the Internet has become more of an adversarial game than a way to unite people.

        forced to use social tools? no.

        encryption available? yes

        understood by anyone in the general public? nope

      • by Anonymous Coward

        It's ridiculous to expect users to anticipate and thwart privacy invasions. These companies could be shut down overnight (or at least rendered illegal) with common-sense legislation. The problem is not users, it is their bought-and-paid-for "representative" government(s) which sell out their constituents to be deceived and abused by sleazy industries.

        • by plover ( 150551 ) *

          It's ridiculous to expect users to anticipate and thwart privacy invasions. These companies could be shut down overnight (or at least rendered illegal) with common-sense legislation. The problem is not users, it is their bought-and-paid-for "representative" government(s) which sell out their constituents to be deceived and abused by sleazy industries.

          It's "ridiculous"? Someone held a gun to your head and told you to post your oh-so-pitiful life story on line? They made you post that picture of you drinking with some friends at a stripper bar, or the story about that time you were snorting coke off a hooker's ass? You think some all-powerful government should come and save your irresponsible neck from someone else trying to make a buck off your drunken stupidity, and do so by censoring your writings from them? And you think that doesn't sound ridicul

        • "It's ridiculous to expect users to anticipate and thwart privacy invasions. These companies could be shut down overnight (or at least rendered illegal) with common-sense legislation. The problem is not users, it is their bought-and-paid-for "representative" government(s) which sell out their constituents to be deceived and abused by sleazy industries."

          Not really. I mean yes, in part. Some of what OP was talking about is completely free (as in freely available to anybody) public information. But OP doesn't like scrapers because (1) if used irresponsibly they can hit servers too hard for comfort, and (2) while the information might be freely available, it takes "normal" people a lot of time to go online and sort through all that information, while a scraper can grab it and sift it in a very short time indeed.

          But OP doesn't seem to be accounting for a co

      • The battle for online privacy was lost long ago.

        So if I post to a public forum I should expect privacy?

        What about CC companies selling data, that was going on before the internet, and seems more intrusive than many of these situations.

        Sadly, the Internet has become more of an adversarial game than a way to unite people.

        I think all those countries having revolutions in the middle east might disagree with you.

      • by jd ( 1658 )

        There's that and there's the fact that the US (one of the largest consumers of data) has no data privacy laws and has been pressuring places that do (such as the EU) to violate their own laws. The laws don't solve the problem in and of themselves, what they do is make the public more* aware that the problem even exists. (*You can have more than nothing.)

        The older ITAR laws and RSA patents didn't help - it effectively criminalized any effort to produce a product, since you'd need to sell the product in the U

      • Open source has an uphill battle educating the masses as more uneducated people join it with zero expectations of passing some required level of readiness prior to being let loose online.

        Merge a good version of a "secure" OS, like Debian, say, Ubuntu with a paranoid version out there where your proposed security is ON by default --no need to know where to get Adblock for grandma's firefox. Test and tweak to ensure the security doesn't cripple the top 50 websites, (youtube, facebook, myspace, hotmail, google

  • by Tigger's Pet ( 130655 ) on Wednesday April 13, 2011 @12:38PM (#35809358) Homepage

    I'm not on FB, Twitter, MyCloud or whatever else, so there's no data out there about me. If there's nothing to harvest then they can't harvest it - I'd rather be classified as 'boring' or 'not with it' (whatever the fuck 'It' is), than have stuff out there that might come back to bite me in the ass in 10 or 20 years time.

    • by yog ( 19073 ) *

      Definitely avoid using a real or traceable name in online discussion forums and social sites. Also, avoid embedding your real name into your email address, such as "JohnSurfer@cox.net" or the like.

      Unfortunately, my real name is embedded in one of my email addresses, and it's all over the web by now. I guess I can eventually switch to a different address, but the damage is done.

      If you have someone's name, you can now obtain their current and past addresses, their age, their schools, possibly where they wor

      • Re: (Score:2, Funny)

        by Anonymous Coward

        I suppose if you have nothing to hide and have avoided getting too controversial in your online discussions, or too outrageous in your social network photos and statuses, you're probably safe from major problems.

        Yep. That's why my pic on chatroulette is an exact average size penis.

      • Definitely avoid using a real or traceable name in online discussion forums and social sites. Also, avoid embedding your real name into your email address, such as "JohnSurfer@cox.net" or the like.

        That's unlikely to help. I'm afraid this fight [randomwalker.info] is already [33bits.org] lost [schneier.com]

        • Bollocks. Utter nonsense. The people who have "lost" this "fight" are only the ones who were never "fighting" in the first place!

          They weren't using different information (or even names and locations) on different sites. They weren't using different IP addresses and MAC addresses. They weren't... doing ANYTHING. Because they didn't even know they had to. That's a pretty weird definition of a "fight".

          Pardon me, but (as is probably the case with most internet users in the US today) getting repeatedly sod
      • by sakti ( 16411 ) on Wednesday April 13, 2011 @01:29PM (#35810056) Homepage

        IMO it's better to have an easy to find public 'you' online for these people to track. You use that for everything 'safe'. You then use multiple anonymous accounts for anything you don't want tracked.

        If you have nothing tracking online I think it might start looking more suspicious than not. Plus having nothing might encourage 'them' to dig in and try to relate you to your anonymous account(s).

        • fundamentally that's what I do.
          There is a real me on FB. Then there is me here (and this ID is shared across multiple sites) which would not be too hard to link to the real me.
          For stuff I really don't want tied to me in re. job interviews, non-gov't background checks etc. I use other identities. For something that I would be afraid of coming out in a relatively thorough discovery && || government background check I simply don't post it on line. At all.
          -nB

      • Check out my name. I have several email addresses under that name with different providers, and under different names as well. I have for years. And none of those email accounts are attached to my "real" name or personal information, in any way. And most of them were established from different IP addresses. Also: other people use that name. That is one of the reasons I chose it.

        I fully believe (because history clearly demonstrates as much) that the ability to communicate privately and anonymously is esse
    • Re: (Score:2, Funny)

      by Anonymous Coward

      That's OK, Phillip Wilkerson of Midland, MI. We still know all about you. Tell Donna and the kids hi for us. Don't forget to pick up dog food on your way home from the tanning salon.

      Sincerely,

      Google

    • A real pro would be able to do it based on this comment of yours.

      http://slashdot.org/comments.pl?sid=2031640&cid=35457796 [slashdot.org]

      • Well done - you can track my previous postings on /. Do you want a prize? I'm now accepted as one of the 6.5 million people in the UK who have their DNA on record because this country stores DNA samples from everyone convicted (and many who are not convicted). Assuming of course that I'm not just posting things to try and make a point and gain Karma points - just like all the people on here who post about "My wife had this happen to her..." - we know that they haven't got a wife or they wouldn't be on here

        • I was trying to be polite.

          I was half way to a contextual analysis based on some of your more creative phrases but I ran out of time to rule out false positives. At a minimum I think you post on at least five sites and cross referencing those is almost enough. The last trick requires one of the web admins (for easy sake start with slashdot) to use the new geolocation trick based on public nets to narrow it down. The point is that it's a When-Not-If world out there so plan your future expecting to be tracked

    • by ceoyoyo ( 59147 )

      In eight years on Slashdot I wonder if you've ever accidentally posted something that might link to you. I can't be bothered to find out, but I'm sure that information might be valuable to someone.

      Of course, you probably drag cookies around like everyone else anyway.

    • Even though you never post a thing, someone else may post something about you. You may already be tagged in multiple photos on Facebook. You may have loan applications visible on the web. Your information is not entirely under your control - with pervasive digital storage, constant security challenges, and an increasing cultural trend to blurring the line between public and private, there is a growing chance that your information will leak out into the public.
  • by blair1q ( 305137 ) on Wednesday April 13, 2011 @12:38PM (#35809368) Journal

    That Anonymous Coward guy is going to have a mailbox full of goatse spam.

    • That Anonymous Coward guy is going to have a mailbox full of goatse spam.

      With the kinds of responses he's posted to some of my posts, let me assure you... he already does!

  • Comment removed based on user account deletion
  • by swanzilla ( 1458281 ) on Wednesday April 13, 2011 @12:40PM (#35809392) Homepage
    Example 'scrape' FTA:

    He used a pseudonym on the message boards, but his PatientsLikeMe profile linked to his blog, which contains his real name.

    I don't think we need to dig any deeper to come to the conclusion that this guy is an idiot.

    • Re:Bravo (Score:5, Funny)

      by TypoNAM ( 695420 ) on Wednesday April 13, 2011 @01:04PM (#35809716)

      He used a pseudonym on the message boards, but his PatientsLikeMe profile linked to his blog, which contains his real name.

      I don't think we need to dig any deeper to come to the conclusion that this guy is an idiot.

      Indeed, Joseph Swanson.

    • Slashdot is filled to the brim with people who take the time to create an alias and then list their homepage on their profile, which of course, is displayed in a link on the same line as their alias in the post they just made.

      I click on those homepages whenever I read something really stupid or ridiculous or inflammatory or completely polar opposite my perspective. Which is to say, I click on them A LOT. I am amazed at how many of these "homepages" are links to commerce sites, or sites advertising some ki

      • by plover ( 150551 ) *

        "Why," I inevitably ask myself, "would I ever buy anything from you, you knucklehead, you?"

        You aren't supposed to buy from them. The link isn't there for your benefit. It's an SEO trick, part of the strategy in trying to raise the page rank for that site.

        If you run a blog, you'll find you'll get commenters that say stuff like "hi, your site is a good understand! one for my book marks." It's flatteringly nice, and obviously English isn't their native tongue, so you thank them for their kind words. And with luck, you may not follow the link in their user name, which you might then discover lin

  • by Nero Nimbus ( 1104415 ) on Wednesday April 13, 2011 @12:47PM (#35809478)
    This was talked about back in October:

    http://yro.slashdot.org/story/10/10/15/1340244/Data-Miners-Scraping-Away-Our-Privacy?from=rss [slashdot.org]

    I thought the guy in the picture looked familiar...
  • "We ban scrapers like this regularly here simply for not adhering to the rules spelled out in robots.txt." Hah! robots.txt doesn't stop any decent crawler
    • by Anonymous Coward

      Getting banned sure will though.

      • by yacc143 ( 975862 )

        Well, considering that there are two additional escalation steps:

        *) emulate a human-like access pattern that works at a human-speed.

        *) passively record data via a proxy when you normally browse.

        Add to this multiple IP addresses, and catching your scraper becomes so much more problematic.

    • I think the point they're making is that crawlers which do not obey the rules spelled out in robots.txt are blocked.
    • by sharkey ( 16670 )
      Actually, it stops ALL "decent" crawlers. It's the ones that behave indecently that ignore robots.txt.
    • When they say ban, they mean IP ban presumably. As in, the robot doesn't follow robots.txt, and because of this, they get their ass kicked, and banned. That makes a lot more sense I think.
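
For anyone wondering what "adhering to the rules spelled out in robots.txt" looks like from the crawler's side, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt content, the "ExampleBot" agent name and the example.com URLs are made up for illustration; a real crawler would fetch the site's actual file instead of parsing a hard-coded string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Hypothetical crawler name; a polite bot identifies itself honestly.
AGENT = "ExampleBot"


def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A well-behaved crawler asks before every request and skips disallowed URLs.
allowed_public = parser.can_fetch(AGENT, "http://example.com/index.html")
allowed_private = parser.can_fetch(AGENT, "http://example.com/private/data.html")

print("index.html allowed:", allowed_public)
print("private/data.html allowed:", allowed_private)
```

Nothing forces a crawler to run a check like this, which is the point several posters make: the file is a request, and the ban is the enforcement.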

  • Known robots, and scrapers

    IP addresses that do not honor /robots.txt.

    and IP addresses that robotically submit spam on robots.txt disallowed HTML feedback forms

    Much web scraping can be automatically detected.

    Sites like Facebook/social networking sites are perfect places to trap/detect scrapers, if they would be willing to contribute to a DNSBL
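
The detection idea above (flag IP addresses that request paths robots.txt disallows) can be sketched roughly as follows. This is only an illustration under stated assumptions: the log lines are invented, the format is assumed to be Common Log Format, and the "/trap/" and "/feedback/" prefixes and the find_offenders helper are hypothetical names, not anything Slashdot or a DNSBL actually uses.

```python
import re
from collections import Counter

# Hypothetical path prefixes that robots.txt disallows (assumption).
DISALLOWED_PREFIXES = ("/trap/", "/feedback/")

# Start of a Common Log Format line: client IP, two fields, [timestamp], "METHOD path
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:GET|POST|HEAD) (\S+)')


def find_offenders(log_lines, min_hits=1):
    """Return the set of client IPs that requested a disallowed path."""
    hits = Counter()
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        ip, path = match.groups()
        if path.startswith(DISALLOWED_PREFIXES):
            hits[ip] += 1
    return {ip for ip, count in hits.items() if count >= min_hits}


# Invented sample entries using documentation-reserved IP ranges.
SAMPLE_LOG = [
    '203.0.113.7 - - [13/Apr/2011:12:40:01 +0000] "GET /trap/page1 HTTP/1.1" 200 512',
    '198.51.100.4 - - [13/Apr/2011:12:40:02 +0000] "GET /index.html HTTP/1.1" 200 1024',
    '203.0.113.7 - - [13/Apr/2011:12:40:03 +0000] "POST /feedback/form HTTP/1.1" 200 64',
]

offenders = find_offenders(SAMPLE_LOG)
print(sorted(offenders))
```

As other posters note, an IP address is not a person, so a list like this is a starting point for review or rate limiting rather than proof of who is scraping.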

    • A good place to begin would be to examine the robots.txt of large sites to see what they're blocking. Sometimes they leave helpful comments in the text files as well. The most interesting I've come across so far is Wikipedia's robots.txt file [wikipedia.org] which has comments for every disallow or series of disallows.
      • by mysidia ( 191772 )

        The most interesting I've come across so far is Wikipedia's robots.txt file [wikipedia.org] which has comments for every disallow or series of disallows.

        Well.. it bothers the hell out of me that I can't Google VfD/Afd/Page for deletion Articles on Wikipedia, because a few people were annoyed there were VfD articles about their nonnotable vanity page on WP. Wtf are the Wiki people thinking? Sometimes interesting points arise in a discussion, and it would be useful to be able to search those discussions i

        • Sometimes, bots can be detected by their patterns or behavior. If a bot doesn't want to comply with robots.txt and ends up sucking a site's bandwidth, the site may ban it automatically if it's configured to do so. Not sure if Wiki does this, though

          Listing Firefox/MSIE in robots.txt also wouldn't do anything because those are browsers, not web crawlers, so they don't have to even acknowledge the robots.txt standard. Though, that's not to say that it wouldn't be fun, let alone downright tempting, to disallow

          • by mysidia ( 191772 )

            Listing Firefox/MSIE in robots.txt also wouldn't do anything because those are browsers, not web crawlers, so they don't have to even acknowledge the robots.txt standard.

            Shouldn't affect users.... but I was thinking some of the 'evil bots' might be using an API/framework for making bots, where they supplied the fake UA field to, and that framework might be so gracious as to _force_ the bot application developer to comply (?)

            I was also wondering if FF/MSIE might have some auto-crawler features that

            • Shouldn't affect users.... but I was thinking some of the 'evil bots' might be using an API/framework for making bots, where they supplied the fake UA field to, and that framework might be so gracious as to _force_ the bot application developer to comply (?)

              Yeah, there are some frameworks and free-to-use bots all around, but because of the diversity of bots and their uses as well as the functions of various servers, it'd be hard to control their behavior so simply. That's part of the reason why robots.txt

        • Something tells me they don't bother too much about IPs of bots that don't honor and use generic user agents.

          Perhaps (unlike some) they're not stupid enough to think there's a 1:1 correspondence between users and IP addresses?

      • A good place to begin would be to examine the robots.txt of large sites to see what they're blocking. Sometimes they leave helpful comments in the text files as well. The most interesting I've come across so far is Wikipedia's robots.txt file [wikipedia.org] which has comments for every disallow or series of disallows.

        After reading this the first thing I thought was, "Now we need a meta-robots.txt file to stop robots from scraping the robots.txt file."

    • by BillX ( 307153 )

      There are a few specialist blacklists popping up. Here is one [stopforumspam.com] specifically for listing spam robots that attack the most popular forum softwares (phpBB, SMF, etc). What I would really like to see is one that lists all the latest "scrapers to detect when people say negative things about your company/product and C&D them" services. I'd sign onto that in a minute - a no-brainer security measure for yourself, your blog and your forum users.

  • I've always wondered -- how would this work for future politicians from our generation?

    All your comments, history etc are probably available in a multitude of places, and anyone with enough motivation can go around digging and find some pretty serious material. Combined with the fact that most people know (or care) little to nothing about privacy, you will have an entire generation of users with a good chunk of their private lives and opinions shared out on the Internet for everyone to see.

    And knowing how w

    • There are already pretty damning video clips of many US politicians that are widely available. It doesn't seem to have any real impact on their ability to get (re) elected here. Watching the Daily Show for a week, you will come up with numerous examples.

      Unless of course you're referring to the effects these sorts of things might have on the political proceedings in smoke filled rooms.

  • What's to stop me from 'scraping' the info? What's to stop me from simply downloading the entire site with something like this [httrack.com]? Slowly if needed to avoid arousing suspicion..

    • Slowly if needed to avoid arousing suspicion..

      How slowly? Could you download all Slashdot comments in a profitable amount of time? You would also have to use a download pattern that is not obviously automated (e.g. sequentially requesting each link on a page).

      In short, it is not the easiest thing to do. It is like trying to pass the Turing test (which software is getting pretty good at doing, as it so happens).

      • by hoggoth ( 414195 )

        Run a separate scraper from different IP addresses for each "category" on Slashdot. Each scraper will read all of the articles in that category and refresh the comments from time to time (random intervals) just like a human would. That would be pretty hard to detect.

      • Depends. Am I allowed to use a botnet? From a previous story, I know that you can buy machines on botnets for about five cents each. For a dollar, I could have 20 machines, all grabbing one Slashdot story per minute (probably slow enough not to be seen as a spider). That's about a million Slashdot stories every four days. Maybe make it a million a week to make sure. Spread it over a big botnet and you can get the entire archive in an hour or so, without it looking like anything other than a few hundre
        • That's about a million Slashdot stories every four days.

          If you're only interested in unique ones it'd be more like a few thousand.
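
A quick check of the rate arithmetic in the botnet comment above, taking its own figures at face value (20 machines, one story per machine per minute). The variable names are just for this sketch.

```python
# Figures quoted in the comment above.
MACHINES = 20
STORIES_PER_MACHINE_PER_MINUTE = 1
MINUTES_PER_DAY = 24 * 60

# Total throughput across all machines.
stories_per_day = MACHINES * STORIES_PER_MACHINE_PER_MINUTE * MINUTES_PER_DAY
stories_in_four_days = stories_per_day * 4

# How long a million requests would take at that rate.
days_for_one_million = 1_000_000 / stories_per_day

print(stories_per_day)                  # 28800
print(stories_in_four_days)             # 115200
print(round(days_for_one_million, 1))   # 34.7
```

At that rate, four days yields roughly a hundred thousand requests, so a million would take about five weeks unless the number of machines or the per-machine rate goes up about tenfold.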

  • I did expect the Spanish Inquisition!

  • Soon as I click to read the comments, the ad on the right is for a web scraping solution.
  • by Anonymous Coward

    You're telling me that stuff on a public web site is public?

  • by garcia ( 6573 ) on Wednesday April 13, 2011 @01:28PM (#35810034)

    Because the public sector has very little time to handle FOIA requests and they sometimes cost more money to complete than I'm willing to pay (usually because they don't do much of their own data work in-house and have to call on a contractor to do it for me), I use their websites to glean the data I want.

    Last week I gave a talk about using SAS to do screen scraping and then perform analysis on the data of jail inmate registries [sas.com] and level 3 sex offenders in MN. I have dashboards of the data available on my website and as I mentioned in my presentation it has even been used to help one county avoid what could have been a serious privacy issue. [lazylightning.org]

    So while there are any number of pitfalls to screen scraping (not understanding the meaning of the data and trends, being fed incomplete or purposefully incorrect data, or even being banned outright) screen scraping can be great for learning about and reporting on the public sector when they are physically or financially incapable or simply unwilling to do it themselves.

  • by sdguero ( 1112795 ) on Wednesday April 13, 2011 @01:59PM (#35810386)
    The company was SEM/SEO, then they moved to social optimization and scraping. It was a black art, like the SEO stuff, and totally dependent on the provider (in this case facebook and twitter) to not change anything. It's the same basic problem as with SEO and Google; if facebook's (or Google's) API coughs, the social media scrapers (or SEM/SEO people) get pneumonia. If Facebook wants to stop it, they can do so fairly easily.

    Unfortunately for privacy, a huge part of FB's business model (like Google) is selling that data to the scrapers and the scrapers' clients.
  • Face it, the type of people who go into marketing have very little to offer this world. Their whole reason for existence is to hopefully sell something to somebody who might not otherwise buy it. The only redeeming aspect of marketing is that it is a non-violent sinkhole in which to drop money, vs say a war in some God forsaken desert.

    Have you ever met a marketing/advertising person who actually liked people?

    • by Jeng ( 926980 )

      Marketing Marketing Marketing, where the real money from the movie is made!

      I was going to post a response agreeing with you, but the more I think on it, well....

      Marketing subsidizes my entertainment choices, considering how much Geico spends on advertising I think basic cable would collapse if Geico stopped advertising.

      Marketing also helps the company I'm at. Our marketing consists of our catalog and website with our products and pricing. Without that how would our customers know what to buy from us? Som

  • Collecting data about others is somewhat an essential freedom. But my view and the modern view differ as most people do not feel the same way. But if we take the usual view any company collecting data about a specific person could be charged with stalking. We usually think of a pervert stalking a child or pretty girl. But stalking is stalking regardless of whether it is a corporation or a pervert. The motive for the stalking is irrelevant. Considering the current mood huge civil suits might take p

  • Add a line in your acceptable use / EULA section stating that you expect the user of the account to be human and that any attempt to scrape the data off of the server is fined at $100,000 per message, plus $10,000 to each message author.

    • Add a line in your acceptable use / EULA section stating that you expect the user of the account to be human and that any attempt to scrape the data off of the server is fined at $100,000 per message, plus $10,000 to each message author.

      And also, you reserve the right to sue the Tooth Fairy for lost unicorns.

      There is no "legal gray area" in scraping. By publishing data on a public webserver, you give consent to clients for viewing it. And what does "the user of the account to be human" mean, anyway? Presumably, humans will eventually view the data downloaded by the scraper. Challenge of the day: give me a legally watertight definition of "web browser" that includes user agents like Lynx (which downloads data from a remote server and presen

      • by hrieke ( 126185 )

        Sure- Automated process that stores the results in a database or is otherwise used in a system where the results are aggregated and retrievable for 4th party consumption with a method to tie back to a person.

        That wasn't difficult at all. Just because I write something for consumption to the members of a particular web site (assuming that it's NOT out in the public like Slashdot's or any other comment system), I would not expect it to be slurped up and sold by 3rd parties. On a member's only web site, such

  • by istartedi ( 132515 ) on Wednesday April 13, 2011 @03:08PM (#35811090) Journal

    The report is back sir, and the results are disturbing. Almost everybody likes sex, and a lot of them are weird. The ones that don't like sex have very strange hobbies. The ones that don't abuse illegal drugs are abusing legal drugs, and almost nobody weighs what they say or looks like their online picture. What should we do?

    (boss pauses for a moment) "Don't hire anybody ever again".

  • Our SiteTruth [sitetruth.com] system does some "scraping". We're looking for the name and address of the company behind the web site, so we can check the business out. We also look for ad links and a few other things, like BBBonline seals, which we check. We use a user agent name of SiteTruth.com site rating system. We don't look very deeply into a site; if after examining the most likely 20 pages, we haven't found out who runs the site, we figure they're not going to tell us. The site is down-rated accordingly.

    Our expe

    • Hmm. Sitetruth seems to be a little flawed. Not the least because it considers itself to be a little questionable, and secondly because it doesn't consider the possibility that a subdomain might have more authoritative information than the main domain (for example, "store.company.com" might have an EV certificate, giving you a high assurance of identity and location, while the main site at "www.company.com" has no high assurance sources). I also notice the complete lack of contact information. Ironic, f

      • by Animats ( 122034 )

        for example, "store.company.com" might have an EV certificate, giving you a high assurance of identity and location, while the main site at "www.company.com" has no high assurance sources

        It's rare to see that. Know of a significant example? One might expect it for "store.yahoo.com", but that site won't even accept a HTTPS connection. Neither will "disney.go.com". Citibank has separate certs for "www.citibank.com" and "online.citibank.com".

        Contact information is on the "about" page.

        • Ah, there it is - why didn't I see that email address before. I might email you guys some specific examples now that I can see how.

  • If the scrapers are already not following the rules laid out in the robots.txt file, what's to say they'll honor your ban. They'll find some way around any technical means of blocking them, in time.
    • I'm pretty sure by ban he meant an entry in an .htaccess file banning the IP, not a line in a text file saying "please keep out"
  • On this topic, here are some bad practices in HR that need to end:
    1. Hiring based on stereotypes is NOT a good idea. [com.com]
    2. The purpose of HR should not be to minimize legal liability.
    3. The illusion that celebrities are perfect needs to end.
    4. Filtering people based on health problems to minimize health insurance costs is not a good idea.
    5. Not hiring people based on debt creates a paradox for those who have to pay it off.
    And as a side note, companies with seriously broken HR often have other problems too.

    • by Jiro ( 131519 )

      If you don't try to minimize legal liability, you'll find yourself with more legal liability than you need. And legal liability really hurts.

      • by yuhong ( 1378501 )

        But it should not be the primary purpose of HR.

      • If you don't try to minimize legal liability, you'll find yourself with more legal liability than you need. And legal liability really hurts.

        Liability only hurts if you have done something actionable.

        • Liability only hurts if you have done something actionable.

          Anything is actionable, in the sense that somebody can sue you for it. And even if the case is laughed out of court in five minutes you're still looking at a few grand in legal fees, wasted time etc.

  • Would that be legal? Could I set up a company that collected DNA samples without their owners' permission (say, by tying the hair clippings from a salon to the CC that paid for the cut)? Could I sell that info to the government?

    If no one's done it, someone should, if for no other reason than to scare the shit out of people and hopefully wake them up.

    • by King_TJ ( 85913 )

      Umm..... yes, someone obviously could do it, but you'd probably have some difficulty linking up the clippings you found to specific individuals. (I mean, would you propose the hair stylists themselves start indexing their customers' hair clippings? They'd be the ones who know their clients' names, addresses and phone numbers since everyone's in their computer system already. If they started acting as the data collectors for this type of operation, it would cause a big loss of business when people started

  • ...between generations. I'm not sure how children or students will take you seriously once they will be able to see every dumb thing you did when you were their age.
