
What Cybercrime Stats Have In Common With Sexual Braggadocio 69

An anonymous reader writes "Microsoft researchers have rubbished figures from cybercrime surveys, deeming them subject to the types of distortions that have long bedeviled sex surveys. All it takes is a few self-styled Don Juans to hopelessly distort the sex-survey figures. Similarly, cybercrime surveys tend to get dominated by a minority of responses, normally those who have or think they have lost a great deal as a result of hacking or malware attack, and are vocal about it. 'Cybercrime surveys are so compromised and biased that no faith whatever can be placed in their findings,' the researchers write."
This discussion has been archived. No new comments can be posted.


  • Can't you just exclude the outliers from the analysis?
    • Re:outliers? (Score:5, Insightful)

      by Ruke ( 857276 ) on Thursday June 09, 2011 @03:56PM (#36392860)

      Firstly: No. Outliers are part of a data set, and it's dishonest to simply dismiss data that does not fit with your expectations.

      Secondly: The over-reporters aren't outliers. There is systematic error in asking people to self-report loss due to security breaches. People either fail to respond to polls due to internal security procedures, or they tend towards overestimating their own loss. It's not simply that there's one guy out there saying he lost $5 billion due to hackers; it's that people who respond to the poll tend to overestimate their real losses by some unknown percentage.

      • by Anonymous Coward

        Much like Microsoft overestimates their losses from people using their software without paying them?

        • Much like Microsoft overestimates their losses from people using their software without paying them?

          If Microsoft say that (made up numbers) a billion people are using pirated copies of Windows for which Microsoft would charge $50 each, then they can legitimately argue that they have lost $50 billion in revenue. It is up to people then to disprove the assumptions made.

          • by gnick ( 1211984 )

            If Microsoft say that (made up numbers) a billion people are using pirated copies of Windows for which Microsoft would charge $50 each, then they can legitimately argue that they have lost $50 billion in revenue. It is up to people then to disprove the assumptions made.

            If Microsoft says that a billion people are using pirated copies of Windows that they would have otherwise paid $50 for, then they have $50B in losses. But if there are a billion people using pirated copies that never would have bought the product were it not free, it just means that there is software out there that Microsoft claims is worth $50B. That's a big distinction.
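The distinction drawn above can be put into numbers. The following is a minimal sketch, not anyone's actual methodology: the function name `lost_revenue` and the `conversion_rate` parameter (the share of pirates who would otherwise have paid) are assumptions introduced for illustration, and the billion-copy and $50 figures are the made-up numbers from the comments.

```python
def lost_revenue(pirated_copies: int, unit_price: float, conversion_rate: float) -> float:
    """Estimate lost revenue, assuming only `conversion_rate` of pirates would have paid.

    conversion_rate is a hypothetical parameter: 1.0 means every pirate would
    have bought a copy, 0.0 means none of them would have.
    """
    if not 0.0 <= conversion_rate <= 1.0:
        raise ValueError("conversion_rate must be between 0 and 1")
    return pirated_copies * unit_price * conversion_rate


# Made-up figures from the thread: a billion copies at $50 each.
copies = 1_000_000_000
price = 50.0

for rate in (1.0, 0.1, 0.0):
    estimate = lost_revenue(copies, price, rate)
    print(f"conversion rate {rate:.0%}: ${estimate:,.0f} lost")
```

The headline $50B figure is simply the case where the conversion rate is assumed to be 100%. The whole argument is about which rate is realistic, and nobody in the thread claims to know it.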

      • It's not dishonest if you *say that you are excluding them*, and explain why you are doing so. It's not like the whole field of robust statistics doesn't exist. Real statisticians filter data for stuff like people misplacing decimal points, and so on, all the time.
      • Secondly: The over-reporters aren't outliers.

        You don't know that. The fact that someone says he lost $1000 doesn't mean that he did. The paper actually opens with several examples of over-reporters whose data was shown to be incorrect.

        • by Ruke ( 857276 )
          I think we might have a difference in understanding in what "outlier" means. An outlier isn't a data point that is shown to be incorrect; it's a data point that is numerically distant from the rest of the points in a set. The difficulty with this data set is that it's not just the extraordinarily high values that are incorrect, but that the statistically-average values are under suspicion as well. There might very well be one large company who actually did lose $30 million due to a security breach, and 100
          • I think we might have a difference in understanding in what "outlier" means. An outlier isn't a data point that is shown to be incorrect; it's a data point that is numerically distant from the rest of the points in a set. The difficulty with this data set is that it's not just the extraordinarily high values that are incorrect, but that the statistically-average values are under suspicion as well. There might very well be one large company who actually did lose $30 million due to a security breach, and 100 small companies who reported losing $25,000 when they actually lost something closer to $2000. The problem is that the incorrect values aren't outliers; there's a whole bunch of them, so they don't look statistically different from the rest of the data.

            No, I think we're on the same page as to what constitutes an outlier. The point the paper makes is that for some surveys 75% of the average comes from an outlier or two. This is exactly the case with the 2007 ID theft survey they mention in the intro: the answers from 2 people (in a survey of over 4000) made a 3x difference in the average (and were found to be fabricated). It's quite possible that some of the non-outlier answers were fabricated also, but they don't have the same influence on the estimate.
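To see how two answers out of roughly 4000 can triple an average, here is a small sketch using Python's standard `statistics` module. The data is entirely made up and is not taken from the survey or the paper: 3998 respondents reporting $100 and 2 reporting $400,100 were chosen only so that the arithmetic produces a 3x shift. The helper `trimmed_mean` is a hypothetical name for a basic robust estimator.

```python
import statistics


def trimmed_mean(values, trim_count):
    """Mean after dropping `trim_count` values from each end of the sorted data."""
    if trim_count < 0 or 2 * trim_count >= len(values):
        raise ValueError("trim_count leaves no data to average")
    ordered = sorted(values)
    return statistics.mean(ordered[trim_count:len(ordered) - trim_count])


# Made-up survey: 3998 modest answers plus 2 extreme ones (illustrative only).
typical = [100] * 3998
extreme = [400_100] * 2
responses = typical + extreme

mean_all = statistics.mean(responses)          # pulled up by the two extremes
mean_without = statistics.mean(typical)        # the same survey without them
median_all = statistics.median(responses)      # barely notices the extremes
robust = trimmed_mean(responses, trim_count=2)

print(f"mean with extremes:    {mean_all}")
print(f"mean without extremes: {mean_without}")
print(f"median:                {median_all}")
print(f"trimmed mean:          {robust}")
```

The median and the trimmed mean ignore the two extreme answers, while the plain mean does not. Neither estimator helps with the other problem raised in this thread, namely that the ordinary-looking answers may be inflated too.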

      • by Kjella ( 173770 )

        Secondly: The over-reporters aren't outliers. There is systematic error in asking people to self-report loss due to security breaches. People either fail to respond to polls due to internal security procedures, or they tend towards overestimating their own loss. It's not simply that there's one guy out there saying he lost $5 billion due to hackers; it's that people who respond to the poll tend to overestimate their real losses by some unknown percentage.

        Well at least a sexual braggart knows (or should know) how many he's really slept with. How much business would we lose if we lost our customer data? We might get a decent figure of the people who'll instantly halt business with us. It might be one of the reasons people leave us later, but nobody really knows if it was the tipping point. Trying to guess what amount of new business we've lost is hopeless. Likewise if some competitor now is sitting on our data and stealing customers left and right. That is if

        • Well at least a sexual braggart knows (or should know) how many he's really slept with.

          Once you get past a couple of hundred, you sort of lose count. Apparently.

    • Can't you just exclude the outliers from the analysis?

      It depends on whether the outlier data is correct. If you're surveying wealth and some guy claims to be worth $50 billion, you need to figure out if he's telling the truth or not. Outliers have a huge effect on the average, that's the point of the sex-survey. The average number of partners reported by men is 5x higher than reported by women. But if you throw out the outliers among the men the averages are almost the same. Point of the paper is that in cyber-crime surveys they never even examine outlier res

      • The average number of sex partners by one sex (assuming hetero sex only and a closed population) is going to differ from the average number of sex partners by the other sex in a way that depends only on their relative numbers. To calculate, you sum the number of distinct pairs of people who engaged in sex, and divide either by the number of people of that sex in the population. Therefore, under realistic conditions, the averages should be almost the same. (The distributions may be different, and the ave

  • Press releases by a single company, on the other hand are mostly well balanced views on reality, where distorted views are balanced by having a large population providing information.
    • Turn to page 4 of the associated pdf, and look at the figure. 'Chart title'? 'Axis title'? Yeah, this is real professional looking.
  • by Qzukk ( 229616 ) on Thursday June 09, 2011 @03:47PM (#36392760) Journal

    Hackers emailed me a grenade that blew up my PC! [theregister.co.uk]

    It's true!

  • by jandrese ( 485 ) <kensama@vt.edu> on Thursday June 09, 2011 @03:49PM (#36392790) Homepage Journal
    Some of the worst offenders of this are outfits like the RIAA and MPAA that grossly overstate the impact of piracy in order to legitimize themselves. When a single kid with Limewire deserves a fine larger than the GDP of the entire world for a decade, you know the metrics have lost all basis in reality.
  • by Hatta ( 162192 ) on Thursday June 09, 2011 @03:50PM (#36392798) Journal

    Everyone exaggerates how many systems they've penetrated.

    • A couple of years ago, I tried to penetrate Lustrust [luxtrust.lu], but I didn't manage. Indeed, the security hole that I was aiming for was protected by 2 big phat lunar firewalls. However I still managed to deface it, and the defacement stayed for a couple of days...
    • by Anonymous Coward

      I have penetrated many beautiful systems in my day.

      Oh, you were talking about hacking computers?

    • Well, when your intrusion fingerprint is so small, who can tell if you've even done it at all?
  • You mean the guy down the hall in my apartment building with the mustard stains on his shirt *may* be exaggerating when he tells everyone he's the world's greatest hacker and that his "I could be a millionaire if I wanted to be, but I don't hack for money, so that's why I live here" claim could *possibly* be bullshit?

  • Do the folks at Microsoft speak it?

    Verbing weirds language.

  • by gstoddart ( 321705 ) on Thursday June 09, 2011 @03:57PM (#36392870) Homepage

    Unverified self-reported numbers that come from such people are used as the basis for calculating losses that are based on, at best, guesstimates.

    Unfortunately, this is also how Microsoft comes up with numbers for piracy ... they pull them out of their ass, and build guesstimates to suggest they've lost eleventy trillion dollars to piracy. Same goes for the RIAA/MPAA and the BSA. They have no objective numbers.

    Microsoft just doesn't like these ones because their OS is at the heart of much of it.

    You can't go dissing the methodology when you don't want them to be true, and using the methodology when it suits you. Although, corporations don't seem concerned by such things as logical inconsistencies.

    • You can use more accurate numbers to estimate the rate of piracy because they don't rely on self reported surveys. If you can determine how many licenses that Microsoft issued in the region and compare it to computers that are running windows update, you can get fairly accurate statistics.

      The amount of profits lost is subject to more debate because you don't know what percentage of sales are lost as a result of piracy. Microsoft will likely overstate this effect while pirates will understate this effect.

      • by Eivind ( 15695 )

        Agreed. If you can know with reasonable accuracy how many copies of Windows were sold, and how many were installed - then you can make a reasonably good estimate of piracy.

        But it's -real- tricky to estimate economic losses. The maximum is easy, that's just the standard retail price times the count of pirated copies.

        The minimum however, is negative. It's entirely possible that if everyone had to pay full price, the result would be that Windows lost its dominant position as an OS, and thus that sales would be

  • Sex surveys can use a large number of samples (up to the entire population if funding permits) which will eliminate the outlier bias problem. Victims of cyber crime are a smaller population from which to sample. And those victims are not representative of the Internet community. They are either attractive targets or too stupid to secure their systems. It's like asking blonds with big tits how many partners they've had and extrapolating.

    From TFA:

    It's well enough established that men claim to have more female sexual partners in sex surveys than women claim male partners, a discrepancy that can't be explained by sampling error alone.

    That can be explained by a few women I know. They can take on t

    • by Anonymous Coward

      The BBC did a study during one of its sex ed type shows (don't remember which one). They asked males and females about their sexual partners then asked them the same questions under a lie detector. The males tended to overstate a little bit (to impress people) while the females understated more (to not seem like sluts).

      The lie detector checked results came out with males and females having the same number of partners.

      • by PPH ( 736903 )

        Not an anonymous survey? Of course people will exaggerate.

        The Kinsey Reports [wikipedia.org], based on anonymized data are remarkably accurate. Consider that their figures for male homosexuality were collected at a time (1948) when such behavior was not socially acceptable, let alone the basis for bragging.

    • by Toonol ( 1057698 )
      Men's distribution of partners follows a fairly simple bell curve, with the peak at around four. Women's is more complicated; it peaks at fewer, but trails off more slowly. At the high end, many dozens of partners or more, there are more women than men.

      So, a typical woman may have fewer partners than a typical man, even while it all balances out, due to those wonderful girls at the far end of the spectrum.
      • It peaks at four? (????)

        I'm either a manslut, or you're a priest.

        OK, so I do live in Brazil, we're a more open society etc etc etc but ARE YOU A FUCKING AMISH?

        I'm well past four and I don't even think I'm at MY peak yet.

        • by PPH ( 736903 )

          It peaks at four? (????)

          I'm either a manslut, or you're a priest.

          Remember, this is Slashdot. Where everyone's girlfriend's last name is JPEG.

          I'm well past four and I don't even think I'm at MY peak yet.

          I hit four when I went to band camp in high school.

        • It peaks at four? (????)

          I'm either a manslut, or you're a priest.

          OK, so I do live in Brazil, we're a more open society etc etc etc but ARE YOU A FUCKING AMISH?

          I'm well past four and I don't even think I'm at MY peak yet.

          For most slashdotters it peaks at one, if they've got a sister who needs help with her maths homework on a regular basis..

      • It can't be a bell curve, since the number can't be less than zero. It can't be approximately a bell curve either, since it definitely isn't symmetric.
    • It's well enough established that men claim to have more female sexual partners in sex surveys than women claim male partners, a discrepancy that can't be explained by sampling error alone.

      That can be explained by a few women I know. They can take on three men at a time. So unless you correct the survey for them, the numbers won't match.

      No, it can't. Suppose one woman sleeps with 100 guys. One woman increased her count by 100, and 100 guys increased their count by 1 each. The average number of heterosexual sex-partners that men and women have had is the same. Do you need me to draw you a diagram?
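The counting argument above can be checked with a toy population. This is a sketch under the same assumptions the comment makes (heterosexual pairings only, closed population, and here equal numbers of men and women); the names `partner_counts`, `men`, `women` and the `w0` label are made up for illustration.

```python
import statistics
from collections import Counter


def partner_counts(pairs, men, women):
    """Return per-person partner counts for men and for women.

    `pairs` is an iterable of (man, woman) tuples; repeated pairs count once.
    """
    distinct = set(pairs)
    by_man = Counter(m for m, _ in distinct)
    by_woman = Counter(w for _, w in distinct)
    # Counter returns 0 for people who appear in no pair.
    return [by_man[m] for m in men], [by_woman[w] for w in women]


# Toy population: 100 men, 100 women, one woman paired with every man.
men = [f"m{i}" for i in range(100)]
women = [f"w{i}" for i in range(100)]
pairs = [(m, "w0") for m in men]

men_counts, women_counts = partner_counts(pairs, men, women)

print("mean, men:    ", statistics.mean(men_counts))
print("mean, women:  ", statistics.mean(women_counts))
print("median, men:  ", statistics.median(men_counts))
print("median, women:", statistics.median(women_counts))
```

Every pairing adds one to the men's total and one to the women's total, so with equal group sizes the two means have to match however the pairings are distributed. The medians are free to differ, which is why the shape of the distribution and the average are separate questions.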

      • by PPH ( 736903 )

        Now, think about what happens to a statistical sample if you miss that one woman.

        • That might make sense if we were talking about a single study but the data is consistently out, in the same manner, across hundreds of studies over several decades. You can't just accidentally miss the really promiscuous women every single time (well you could but it is statistically improbable).

          The only two viable explanations I've ever heard (and I use the term loosely for the second) are that either there is consistent incorrect self reporting or extremely promiscuous women have such a higher rate of death

          • Maybe they are inaccurate in identifying the GENDER of the person they had sex with? what if 50 guys had sex with a tranny, and all report it as a woman, while that tranny identifies (on the anonymous survey) as a man who had sex with 50 men? Or maybe the women are only reporting the MEN they slept with, but some are excluded because the woman afterwards decides some of her partners didn't deserve to be called men?
  • by Anonymous Coward
    When Microsoft collects $5 per computer license (MAR - Microsoft Authorized Refurbishers) on used PCs donated to small schools and internet cafes in African nations, with incomes below $1,000 per year... for used PCs which already had a licensed version of Microsoft... and the people who copy the old license back on for free are "cybercriminals", and the billionaire people who take the $5 from countries where that money could save a child's life from malaria ... It seems to me to be kind of difficult to descri
  • when I saw the phrase "sexual braggadocio" I thought this would be a problem caused by boasting of uber-hackers, vis-a-vis Swordfish : "With a gun pressed to my head, I hacked the defense grid with one hand, the World Bank with the other, and the CIA with voice recognition software, while getting a blowjob from two women"
    • Yeah when I read the headline, "What Cybercrime Stats Have In Common With Sexual Braggadocio 25," I thought they were going to find a link between bragging about sex and committing Cybercrime.
    • And three hacks?

      Dude! Everyone knows that it's one babe per hack. By accepting such low standards you're screwing up the surveys.

      • oh......third woman......lesseee...."with the dominatrix pressing the lubed suppressor of her Mac-10 into my rectum,,,,....."
  • by Anonymous Coward

    So this 'over reporting' and overstating and exaggeration of loss applies to the MPAA/RIAA too right? After all, just two months ago they claimed losses in the trillions of US dollars, and yet the sum they quoted was many times more than their entire industries made -adjusted for inflation- worldwide for the entire time of their existence. Yet draconian laws created by them and enacted by bribed corrupt politicians and enforced by punishment-does-in-no-way-fit-the-crime law enforcement agencies happens at

    • Not sure this work really talks about RIAA. I don't think the RIAA estimates were done from self-report surveys, but they're still just made-up numbers. It seems to be the rule in anything related to cyber-foo that you make up loss estimates, and nobody questions them so long as a) they're big and b) bigger than last year's numbers and c) you use them to claim a "growing crisis."
  • Microsoft assume no one is really getting hacked.

    Sorry nerds, some of us really are sexual tyrannosaurs.

    • Sorry nerds, some of us really are sexual tyrannosaurs.

      Sorry, I'm not getting your metaphor. Does that mean you haven't had a chance to reproduce in millions of years, or that you have stubby almost useless appendages?

  • Maybe the people complaining about cybercrime just need to get laid.

