Internet Census 2012 Data Examined: Authentic, But Chaotic and Unethical

timothy posted about 3 months ago | from the could-have-been-worse dept.

The Internet 32

An anonymous reader writes "A team of researchers at TU Berlin and RWTH Aachen presented an analysis of the Internet Census 2012 data set (here's the PDF) in the July edition of the ACM SIGCOMM Computer Communication Review journal. After its release on March 17, 2013 by an anonymous author, the Internet Census data created an immediate media buzz, mainly due to its unethical data collection methodology, which exploited default passwords to form the Carna botnet. The now-published analysis suggests that the released data set is authentic and not faked, but also reveals a rather chaotic picture. The Census suffers from a number of methodological flaws and also lacks metadata, which renders the data unusable for many further analyses. As a result, the researchers have not been able to verify several claims that the anonymous author(s) made in the published Internet Census report. The researchers also point to similar but legal efforts to measure the Internet and remark that the illegally measured Internet Census 2012 is not only unethical but might have been overrated by the press."

32 comments

Census of the trusting and lazy (0)

Anonymous Coward | about 3 months ago | (#47551033)

Since only the paranoid and diligent weren't compromised.

Re:Census of the trusting and lazy (1)

Anonymous Coward | about 3 months ago | (#47551141)

Or maybe you could read the linked summary and learn that it was not a survey of compromised machines; instead, the compromised machines were used to do the actual survey.
But thanks for your uninformed and lazy comment.

I didn't RTFA (0)

Anonymous Coward | about 3 months ago | (#47551039)

What do passwords have to do with an Internet census? Was it a census about passwords or about users?

I didn't RTFA (0)

Anonymous Coward | about 3 months ago | (#47555349)

Please refer to the /. post from last year explaining what happened: http://tech.slashdot.org/story/13/03/20/1520218/botnet-uses-default-passwords-to-conduct-internet-census-2012

Why not just get the metadata from the NSA? (2)

WillAffleckUW (858324) | about 3 months ago | (#47551081)

They illegally and unconstitutionally collect it anyway, especially on Americans, and give a copy of the feed illegally and unconstitutionally to the CIA and GCHQ.

Among others.

Anagram near miss (1)

tepples (727027) | about 3 months ago | (#47551093)

"Authentic" would be an anagram of "unethical" if it weren't for that darned "l".

Re:Anagram near miss (1)

disposable60 (735022) | about 3 months ago | (#47551105)

Try it in French: l'Authentic

Re:Anagram near miss (0)

Anonymous Coward | about 3 months ago | (#47552423)

Try it in real French: l'authentique.

Re:Anagram near miss (0)

Anonymous Coward | about 3 months ago | (#47552531)

mon amour <3

Unethical (2, Interesting)

maevius (518697) | about 3 months ago | (#47551127)

Unethical? Whatever.
Having read the original "census", it was a cool hack and no harm was done, nothing more. I'm pretty sure he/they didn't go for a rigorous scientific process when this was done.

Re:Unethical (0)

Anonymous Coward | about 3 months ago | (#47551243)

Pretty much says exactly that on the original site [sourceforge.net]

The why is also simple: I did not want to ask myself for the rest of my life how much fun it could have been or if the infrastructure I imagined in my head would have worked as expected. I saw the chance to really work on an Internet scale, command hundred thousands of devices with a click of my mouse, portscan and map the whole Internet in a way nobody had done before, basically have fun with computers and the Internet in a way very few people ever will. I decided it would be worth my time.

"but might have been overrated by the press" (1)

Anonymous Coward | about 3 months ago | (#47551189)

Shocking! Just simply shocking!

I wonder (2)

NotInHere (3654617) | about 3 months ago | (#47551205)

Why is using other people's idle machines (he used only machines whose load was under a certain threshold) more unethical than tormenting and killing mice in the name of science? I don't think the latter is unethical when done responsibly, but I wonder why they put things above biological life?

Re:I wonder (1)

Anonymous Coward | about 3 months ago | (#47551269)

but I wonder why they put things above biological life?

Same reason why it is illegal to steal, even if it's only food, and you are really hungry. Understandable, sure. Forgivable, maybe. But still illegal.

Re:I wonder (0)

Anonymous Coward | about 3 months ago | (#47552437)

That depends on the country.

There are countries where, if you steal a slice of bread for your own consumption, they will not charge you. There are states where you do it three times and you get life without parole. There are states where they cut off the hand that stole. There were societies where the concept of theft did not exist.

Re:I wonder (0)

Anonymous Coward | about 3 months ago | (#47551273)

Why is using other people's idle machines (he used only machines whose load was under a certain threshold) more unethical than tormenting and killing mice in the name of science? I don't think the latter is unethical when done responsibly, but I wonder why they put things above biological life?

Are you some sort of communist?

Re:I wonder (1)

NotInHere (3654617) | about 3 months ago | (#47551317)

What he did was illegal, and if he were found I'd have no problem with him being punished according to the law. But it is not unethical, not when he uses default passwords and causes no harm.
No, I'm not.

Biased, much ? (2)

aepervius (535155) | about 3 months ago | (#47551315)

We do not "torment" and kill mice gratuitously, a choice of words which certainly shows quite an inherent bias here. Usually you have to go through an ethics committee for animal experimentation (although the barrier is lower for lab mice). Furthermore, most animal experiments have a clear goal: to help develop cures or models for human health. If you can't differentiate that from people misusing the computers of others, then I can't help you.

Re:Biased, much ? (1)

NotInHere (3654617) | about 3 months ago | (#47551539)

I don't think we should cover up animal experimentation with flowery words. I have no doubt animal experiments are OK; as you've said, they mostly help human health, but we should at least call what we do to the animals what it is. What would you call it?

Of course, an internet census is not such an "ethical" goal as healing people, so my comparison might be a bit shaky from this perspective.

Certainly not torture or torment (1)

aepervius (535155) | about 3 months ago | (#47560987)

Firstly, not all animals used in experiments are killed or suffer. But even for those that do, one of the goals of the ethical guidelines is to avoid animal pain as much as possible. In fact, in some cases we go further out of our way to avoid unnecessary pain to animals in labs than we do for humans at the end of life in hospitals.

You simply have a warped view of lab experimentation that does not match what goes on in medical labs. Now you may have a point with *cosmetic* experimentation, but you won't find me defending that.

Re:I wonder (1)

weilawei (897823) | about 3 months ago | (#47551385)

When you use a machine, it ceases to be idle. It incurs bandwidth and power costs. That's (one of) the unethical bits.

Re:I wonder (0)

Anonymous Coward | about 3 months ago | (#47552361)

Yeah, an extremely tiny amount. But murdering mice is okay.

Re:I wonder (1)

penguinoid (724646) | about 3 months ago | (#47551697)

Why is using other people's idle machines (he used only machines whose load was under a certain threshold) more unethical than tormenting and killing mice in the name of science? I don't think the latter is unethical when done responsibly, but I wonder why they put things above biological life?

Well, because now we can cure even the most obscure diseases that afflict mice.

Re:I wonder (0)

Anonymous Coward | about 3 months ago | (#47552385)

Times are difficult and I do not know the answer to your question, but I do know that the name Putin was missing in this thread. Here, I FTFY.

Re:I wonder (0)

Anonymous Coward | about 3 months ago | (#47553301)

Why is using other people's idle machines (he used only machines whose load was under a certain threshold) more unethical than tormenting and killing mice in the name of science? I don't think the latter is unethical when done responsibly, but I wonder why they put things above biological life?

It's still unethical to experiment on somebody else's mouse without permission.

OS Fingerprints! (3, Interesting)

d33tah (2722297) | about 3 months ago | (#47551759)

Apparently the researchers didn't analyze OS fingerprints at all. There is some metadata that the original researcher(s) forgot to remove (as well as a lot more mess). Service fingerprints are interesting as well. I did a lot of research on this data set and I have to say that while messy, this is also a really amazing data set. This article is IMHO biased.

Re:OS Fingerprints! (0)

Anonymous Coward | about 3 months ago | (#47573007)

"Apparently the researchers didn't analyze OS fingerprints at all."

Did you look into their paper? This is apparently not true. They focused on the ICMP data set but also looked into others, in particular the service probes that you mentioned. One of their validation sets is using that data set.

Re:OS Fingerprints! (1)

d33tah (2722297) | about 3 months ago | (#47573029)

"Apparently the researchers didn't analyze OS fingerprints at all."

Did you look into their paper? This is apparently not true. They focused on the ICMP data set but also looked into others, in particular the service probes that you mentioned. One of their validation sets is using that data set.

Okay, point taken about the service fingerprints, but I still see no mention of the OS fingerprints. If they had looked at the data format that is there, they could have gotten much more out of the set. (They'd also have found more mess, by the way, as there was some weird bug that destroyed quite a few samples there.)

Unfortunately, the data is partially fake (1)

Mr. Spock (25061) | about 3 months ago | (#47553865)

The methodology used in the paper to verify the data was to perform their own scans of networks that were known to have hosts, and then compare the results to the published 2012 Internet Census data. They got a high match rate. I evaluated the data slightly differently. I scanned network segments that I know to be empty, unused, or entirely behind firewalls. In these cases (for segments /24 and larger) there are still records in the Internet Census data. These records are completely made up. Try going to the search engine [exfiltrated.com] and typing in a network you know to be completely empty as scanned from the outside. A network that has been allocated but never used would be best. It's a lot of fun, and shows the Internet Census data is partially falsified and likely to be of no scientific value. Don't avoid the data because it's immoral, just avoid it because it's incorrect.
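
For anyone who wants to repeat this kind of check offline, here is a minimal sketch in Python. It assumes you already have a list of IP addresses extracted from the census records and a list of prefixes you believe to be empty. The function name and the sample inputs are hypothetical: the addresses come from the reserved documentation ranges, not from the census itself.

```python
import ipaddress


def records_in_empty_networks(census_ips, empty_networks):
    """Return the census IPs that fall inside networks believed to be empty."""
    nets = [ipaddress.ip_network(n) for n in empty_networks]
    hits = []
    for ip_text in census_ips:
        ip = ipaddress.ip_address(ip_text)
        # A record inside a supposedly empty prefix deserves a closer look.
        if any(ip in net for net in nets):
            hits.append(ip_text)
    return hits


# Hypothetical inputs: documentation-range addresses, not real census data.
sample_records = ["192.0.2.10", "198.51.100.7", "203.0.113.200"]
known_empty = ["192.0.2.0/24", "203.0.113.0/24"]

suspicious = records_in_empty_networks(sample_records, known_empty)
print(suspicious)
```

A hit does not prove a record is fabricated. It only flags records worth checking by hand.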

Unfortunately, the data is partially fake (0)

Anonymous Coward | about 3 months ago | (#47555341)

Nice catch! Could you please share a couple of faked IPs? I'd be interested to see what they look like.

Unfortunately, the data is partially fake (0)

Anonymous Coward | about 3 months ago | (#47555381)

Are you sure that it is fake? AFAIK, most of the data sets don't list a source IP, so we don't know where the potentially faked records were measured from. It could be a reply from a transparent proxy that is reported in the data. It's so messy that this might be impossible to distinguish. That would be another reason for me why the data is trash and of no value.

Results of the census (1)

roca (43122) | about 3 months ago | (#47553987)

I assumed "Authentic, But Chaotic and Unethical" was the description of the Internet resulting from the census.
