MD5 Collision Source Code Released

Zonk posted more than 8 years ago | from the collisiontacular dept.

Security 411

SiliconEntity writes "The crypto world was shaken to its roots last year with the announcement of a new algorithm to find collisions in the still widely-used MD5 hash algorithm. Despite considerable work and commentary since then, no source code for finding such collisions has been published. Until today! Patrick Stach has announced the availability of his source code for finding MD5 collisions and MD4 collisions (Coral cache links provided to prevent slashdotting). MD4 collisions can be found in a few seconds (but nobody uses that any more), while MD5 collisions (still being used!) take 45 minutes on a 1.6 GHz P4. At last we will be able to implement various attacks which have been purely hypothetical until now. This more than anything should be the final stake in the heart of MD5, now that anyone can generate collisions whenever they want."

leet (1)

Dragoonkain (704719) | more than 8 years ago | (#14038137)

leet

SHA1 (5, Funny)

mysqlrocks (783488) | more than 8 years ago | (#14038144)

So is SHA1 the recommended alternative?

Re:SHA1 (3, Informative)

Mind Booster Noori (772408) | more than 8 years ago | (#14038214)

SHA-1 is not the solution [schneier.com].

Re:SHA1 (1)

mboverload (657893) | more than 8 years ago | (#14038279)

It was a JOKE people!

Re:SHA1 (5, Informative)

psykocrime (61037) | more than 8 years ago | (#14038225)

So is SHA1 the recommended alternative?

No, see:

http://www.computerworld.com/securitytopics/security/story/0,10801,99852,00.html [computerworld.com]

and

http://www.computerworld.com/softwaretopics/software/story/0,10801,105875,00.html [computerworld.com]

I like this quote:

"SHA-1 is a wounded fish in shark-infested waters, but I'm more worried about MD5 because it's used everywhere," said Niels Ferguson, a cryptographer at Microsoft Corp. "Try to switch away from SHA-1 as quickly as you can, but switch away from MD5 first," he said when asked what recommendations he has regarding the algorithms during a panel discussion at the conference.

Re:SHA1 (5, Informative)

Anonymous Coward | more than 8 years ago | (#14038431)

No, MD5 and SHA1 were found to have better than brute-force attacks within a few months of each other.

Crypto people are recommending SHA-256 or SHA-512, which are like SHA-1 in name only.

Obviously check the hash length beforehand and make sure your database column is wide enough.

When migrating existing hashes to the new hash be careful not to store the old hash anywhere -- that can be the weak link in the chain. For example, generating passwords and having the MD5 around lets attackers generate valid inputs and then try them against the more computationally complex hash. It gives them an approach to attacking your stronger hash.

Take a copy of your database and hash all the existing passwords into SHA-512 form. You'll need some way of distinguishing the MD5-to-SHA512 hashes from the plain SHA512 hashes, so add a date column with today's date in it. Then write a function "hashString" as a wrapper that can identify when something was hashed, and go down a different branch of code based on that.

The first branch does MD5 then SHA512, the second branch does SHA512, and it does this based on the date column.

And, of course, re-salt both branches.
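For anyone who wants to see the two branches spelled out, here is a rough Python sketch using only the standard `hashlib`. The names `hash_string`, `migrate_legacy_md5` and `MIGRATION_DATE`, and the way the salt is prepended, are my own assumptions for illustration; they are not from any particular system.

```python
import hashlib
from datetime import date

# Hypothetical cut-over date: rows stamped on or before it hold
# SHA-512(salt + MD5 hex digest); later rows hold SHA-512(salt + password).
MIGRATION_DATE = date(2005, 11, 15)

def migrate_legacy_md5(md5_hex: str, salt: bytes) -> str:
    """One-off conversion of a stored MD5 hex digest; the MD5 is then discarded."""
    return hashlib.sha512(salt + md5_hex.encode("ascii")).hexdigest()

def hash_string(password: str, salt: bytes, hashed_on: date) -> str:
    """Return the digest to compare against the stored value for this row."""
    data = password.encode("utf-8")
    if hashed_on <= MIGRATION_DATE:
        # Legacy branch: rebuild the old MD5 hex digest first, then wrap it.
        data = hashlib.md5(data).hexdigest().encode("ascii")
    return hashlib.sha512(salt + data).hexdigest()
```

A password set on the migration day itself would be ambiguous with a plain date column, so a timestamp or an explicit scheme/version column is probably the safer marker in practice.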

Managed to get just the last few lines... (4, Funny)

Saint Aardvark (159009) | more than 8 years ago | (#14038147)

...before even the Coral cache was Slashdotted, and it turns out they've written it in LISP:

))))))) ))))))))

(With sincere apologies to Bryce Jasmer [netfunny.com] .)

shaken to our what? (3, Insightful)

tomstdenis (446163) | more than 8 years ago | (#14038152)

It's important news but not really that shocking. MD5 was not something professionals would recommend for a few years already.

Any half-way intelligent cryptographer would have suggested SHA-1, TIGER or perhaps HAVAL since quite some time already.

Tom

Re:shaken to our what? (2, Insightful)

ZachPruckowski (918562) | more than 8 years ago | (#14038235)

It's important news but not really that shocking. MD5 was not something professionals would recommend for a few years already.

But a household user can crack in an hour on a normal mid-line computer something that "a few years" ago professionals might have recommended. That's the news. If low-end PCs can crack encryption that's only a few years outdated, then the hounds are nipping at the heels of the cryptography industry. And imagine what hackers could do with more powerful computers (yes, I know there is a non-applicability issue with some forms of encryption). Or the government with all those Cray super-monstrosities.

MD5 is not an encryption algo (1, Insightful)

Junta (36770) | more than 8 years ago | (#14038325)

It is a hash algo. It's used not to protect the content of anything, just to provide a method to validate content integrity, to show nothing accidental or intentional happened to change it.

Re:MD5 is not an encryption algo (1)

san (6716) | more than 8 years ago | (#14038443)

It's also the default algorithm to hash passwords (i.e. if you type in your password, it gets hashed into an MD5 sum which is then compared to what the MD5-ed password should be, thereby avoiding plaintext password storage).

Luckily, most sane systems use salt [wikipedia.org] so this algorithm won't work out of the box.
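In case "salt" is unfamiliar, a minimal Python sketch of the idea follows. The function names are made up for illustration, and plain salted SHA-256 is used only to keep the example short; a dedicated password-hashing scheme (for example `hashlib.pbkdf2_hmac`) would be the more usual choice.

```python
import hashlib
import hmac
import os

def make_record(password: str, salt: bytes = None):
    """Return (salt, digest) to store instead of the plaintext password."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per account
    digest = hashlib.sha256(salt + password.encode("utf-8")).hexdigest()
    return salt, digest

def check_password(password: str, salt: bytes, digest: str) -> bool:
    """Re-hash the typed password with the stored salt and compare."""
    candidate = hashlib.sha256(salt + password.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, digest)
```

Because every account gets its own salt, two users with the same password end up with different stored digests, and an input prepared in advance for the bare hash does not match once the salt is mixed in.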

Re:shaken to our what? (1)

Mind Booster Noori (772408) | more than 8 years ago | (#14038249)

SHA-1 is not the solution [schneier.com]. Take a look at SHA-224, SHA-256, SHA-384, and SHA-512.

Re:shaken to our what? (0, Redundant)

tomstdenis (446163) | more than 8 years ago | (#14038465)

I was talking about 5 years ago. Though I wouldn't trust SHA-2 today anyways. The design is just not sound.

It's practically secure [SHA-2] as to be useful today but I wouldn't be happy about it.

Tom

WHIRLPOOL (0)

Anonymous Coward | more than 8 years ago | (#14038321)

Why not WHIRLPOOL (http://en.wikipedia.org/wiki/WHIRLPOOL [wikipedia.org] ), it is made by one of the makers of AES, Vincent Rijmen.

SHA-1??? (3, Insightful)

jd (1658) | more than 8 years ago | (#14038353)

SHA-1 has known attacks, although none have (yet) proven to be useful for an exploit. The SHA-2 family (eg: SHA-256) are "unproven" and not part of the FIPS-180 standard (so cannot be used for US Government work), but I would regard them as being "probably safer" than SHA-1 for secure work.


TIGER is good, as is Whirlpool. Whirlpool has the advantage that it uses AES as the basis, and AES is regarded as secure. It was also certified for secure work by NESSIE - a European group trying to do for the EU what NIST does for the US - which means that it's probably certified for use in secure environments in Europe.


According to the Hashing Function Lounge, there are other hashing functions that have not been broken (eg: cellhash and fft-hash) but these are sufficiently obscure that a lack of a known exploit may be through lack of study and not through the presence of good security. It would make them good for beating skript-kiddies, as they won't have the skills to find exploits and those skilled enough at finding them aren't studying those algorithms much. (I don't like security through obscurity, but technically these aren't obscured as anyone CAN study the algorithm.)

So you found a collision, big deal (3, Interesting)

goombah99 (560566) | more than 8 years ago | (#14038358)

Maybe someone could explain why collisions are a serious problem for MD5. Or at least in what instances they are. I can see that in some cases, such as password hashing, this could be a problem. But for many other uses I don't see what harm a collision does. Maybe I misunderstand, but as I understand it MD5s are normally used in a checksum manner to sign or provide a fingerprint of a document. If you have an original document and compute its MD5 then it can match some certified MD5 checksum. If someone were to generate a fake document they could not design it to match the MD5 fingerprint. They could create some bit of gibberish that did match it, but not a document that was useful as a forgery.

I guess if one were trying to deliberately peddle gibberish, like say if you were the RIAA trying to destabilize a torrent net, then that would be all you care about. But for more general issues I don't get it.

Or is it now possible to generate a collision that also contains some intentional content like a binary program or source code or bank statement? That would be bad.

It seems like even in the case of gibberish generation some simple hacks could extend the useful life of this. For example, if you were to MD5 a whole document and then MD5 the concatenation of every other byte in the document, it would be pretty hard to find two collisions that had that property simultaneously. Sure, I don't doubt there are better ways to hash something than adding hacks to MD5. I'm just saying that it seems like a simple hack might well be good enough to extend its useful lifespan for passwords and file sharing.

But I invite you to explain to me why I'm wrong.

Re:So you found a collision, big deal (5, Informative)

Krischi (61667) | more than 8 years ago | (#14038429)

See this: http://www.cits.rub.de/MD5Collisions/ [cits.rub.de]

It demonstrates the generation of two postscript files with the same MD5 hash that nevertheless display completely differently.

Re:shaken to our what? (0, Flamebait)

Anonymous Coward | more than 8 years ago | (#14038476)

Christ, so now you're implying you're a professional cryptographer, Tom? A bug-ridden open source crapfest is not professional software development.

SHA-1 isn't much safer, try SHA-256 or higher (0)

Anonymous Coward | more than 8 years ago | (#14038511)

Any half-way intelligent cryptographer would have suggested SHA-1, TIGER or perhaps HAVAL since quite some time already.

Actually collisions in SHA-1 were confirmed in February of this year, and refined in August. Any half-way decent cryptographer would be using SHA-256 or, better yet, SHA-384 or SHA-512. We've got the disk-size and bandwidth these days not to be worried about a few extra bytes. Bruce Schneier's initial article [schneier.com] on this is instructive.

1.6GHz P4? (0, Redundant)

CCFreak2K (930973) | more than 8 years ago | (#14038166)

My desktop PC is a Pentium 4 at 1.5GHz, and even that thing is considerably slower compared to my 1.5GHz Celeron-M. A modern PC could crack it even faster.

MD5 broken, not useless (2, Insightful)

Anonymous Coward | more than 8 years ago | (#14038170)

Keeping in mind where MD5 is broken is important, so that good uses for this tool are not disposed of out-of-hand.

md5 is still good for keeping track of if your files have changed. It should not be used for document signing.

Re:MD5 broken, not useless (1)

slashdot-me (40891) | more than 8 years ago | (#14038202)

md5 is still good for keeping track of if your files have changed

But only for non-security uses. Don't use MD5 in your intrusion detection system.

Re:MD5 broken, not useless (0)

Anonymous Coward | more than 8 years ago | (#14038537)

Even tracking file changes is only good on a trusted system. Otherwise a hacker can change the files, e.g. a BitTorrent piece whose hash was established days ago, giving the attacker time to generate another valid piece.

So it'll help with accidental errors and it's a hell of a lot better than CRC, but it's not reliable against malicious attacks.

So what the hell do I do now? (4, Interesting)

jeblucas (560748) | more than 8 years ago | (#14038175)

I'm essentially crypto ignorant. About all I've known to do was verify MD5 hashes on downloads. Now that this is by-and-large pointless, how do I check the veracity of things like Linux ISOs [linuxiso.org], video drivers [nvidia.com], etc, ad infinitum?

Re:So what the hell do I do now? (1)

k4_pacific (736911) | more than 8 years ago | (#14038224)

I wouldn't worry about it too much, generating MD5 collisions large enough to be useful requires a multi-million dollar particle accelerator, so I wouldn't worry too much until someone figures out how to modify a microwave oven to do this.

Re:So what the hell do I do now? (5, Insightful)

DreadSpoon (653424) | more than 8 years ago | (#14038236)

Do nothing.

MD5 has not been invalidated for those uses. Checking the MD5 sum of an ISO download is not done for security purposes, it's done so that you can make sure you didn't get a bad byte or two somewhere in that 650MB. I mean, if hackers could upload a malware-filled ISO to the FTP server, they could upload a new MD5SUMS file too, right?

Re:So what the hell do I do now? (4, Informative)

yamla (136560) | more than 8 years ago | (#14038319)

That's not what MD5 sums are used for. TCP/IP already has packet integrity. MD5 sums are indeed used to make sure you don't have a malware-filled ISO. The trick is that you grab the MD5 sum from a trusted source, then you can grab the ISO image from any mirror site. Assuming MD5 is safe (obviously not the case), you know your downloaded ISO is exactly the same as the one distributed from the central repository.
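As a rough sketch of that workflow in Python: take the digest string from the trusted site, download the image from any mirror, and compare. The helper names are hypothetical, and SHA-256 is assumed as the default here only because the thread recommends it over MD5; pass whatever algorithm the distributor actually publishes.

```python
import hashlib

def file_digest(path: str, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so a 650MB ISO never has to fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def matches_trusted(path: str, trusted_hex: str, algorithm: str = "sha256") -> bool:
    """Compare the mirror's file against the digest copied from the trusted source."""
    return file_digest(path, algorithm) == trusted_hex.strip().lower()
```

The check is only as good as the channel the digest came over, which is the point being made above: the sum has to come from somewhere the mirror operator cannot touch.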

Re:So what the hell do I do now? (1)

harrkev (623093) | more than 8 years ago | (#14038356)

Wrong. If the ISO has some spare room, you could throw in a trojan or two, and just throw in some random data in a junk file to make the hashes match. This is a big deal.

Re:So what the hell do I do now? (1)

Buzz_Litebeer (539463) | more than 8 years ago | (#14038400)

The main problem is it shouldn't be possible to have a functional CD image that matches the MD5 hash of a legit ISO.

Though... I guess if someone had the time they might be able to make something that IS like the regular image, tacks in some trojans, and then adds junk data to the ISO until it matches.

But I don't think that it's possible to do that.

Re:So what the hell do I do now? (1)

camcorder (759720) | more than 8 years ago | (#14038381)

What if they do not change the MD5 file but put up another ISO that has the same MD5 hash as the original ISO file? It might take a considerably long time to detect that kind of intrusion. But it's evident that the best way to increase trustworthiness of downloads is using asymmetric hashing methods (aka signing).

Re:So what the hell do I do now? (1)

diegocgteleline.es (653730) | more than 8 years ago | (#14038401)

Which is why FTP mirrors never should mirror MD5SUMS files

Re:So what the hell do I do now? (0)

Anonymous Coward | more than 8 years ago | (#14038245)

This does not make validating the MD5 digest pointless. It's not like you can create arbitrary text and have it be hashed to any MD5. This just generates random text that will match a given MD5 digest.

This is more of a concern for cryptography, not message hashing.

Re:So what the hell do I do now? (1)

TetryonX (830121) | more than 8 years ago | (#14038261)

Generally on larger files it becomes increasingly difficult to have the same file size AND md5 hash. On most repositories, md5 being cracked will have little effect since it doesn't actually matter much about whether or not the file is exactly what it says it is if you are getting it from a secure repository. IANAC, but I figure it would be a lot more difficult to generate hashes at a quick rate and match the correct filesize.

File verification is more or less just a way to determine whether or not the file became corrupted between the server and your computer, so again no worries. If the file was already poisoned on the server, I can almost guarantee that the md5 will reflect the md5 of the poisoned file, and not a specially crafted poisoned file.

Re:So what the hell do I do now? (1)

discordja (612393) | more than 8 years ago | (#14038265)

Well, for the day to day I don't think the impact is overly important. Mind you, no one has been suggesting MD5 for important security measures for a few years now (and pretty soon no one will suggest SHA-1 either; it's a never-ending cycle as processing capacity grows). So always download your files from reliable sources and check the hash against the known good one provided by the distributor, and more likely than not you will have a good CD image or installer or whatever. So I wouldn't be too concerned; there is not going to be a shift in the community to abandon MD5 tomorrow for the mundane checks.

Re:So what the hell do I do now? (3, Informative)

Zocalo (252965) | more than 8 years ago | (#14038369)

Most things use multiple checksums, for instance on the updated copy of Lynx I just grabbed for Fedora:

$ rpm --checksig lynx-2.8.5-23.2.x86_64.rpm
lynx-2.8.5-23.2.x86_64.rpm: (sha1) dsa sha1 md5 gpg OK
$

So, even if it is also possible to generate collisions for DSA and GPG keys as well as SHA1 and MD5, the chances of being able to generate a collision for all four checksums/signatures at the same time are quite likely infinitesimally small. And that's just for a random file; things are going to get much more complex if you want that random file to pass as whatever format the original was supposed to be and actually deliver a payload that might do something useful for the cracker.

The cycle begins anew... (2, Funny)

rcbarnes (875915) | more than 8 years ago | (#14038184)

Great. Now that MD5 is dead, the slow/theoretical attacks on SHA1 can be the focus of collision research. I look forward to changing hash algorithms again from SHA1 in a year. :-/

Re:The cycle begins anew... (1)

Mind Booster Noori (772408) | more than 8 years ago | (#14038270)

SHA-1 is not the solution [schneier.com]. Take a look at SHA-224, SHA-256, SHA-384, and SHA-512.

Should I care? (5, Funny)

SlashAmpersand (918025) | more than 8 years ago | (#14038188)

This is all really interesting theoretically, but who has the money to run a 1.6 GHz P4?

Re:Should I care? (0)

Anonymous Coward | more than 8 years ago | (#14038212)

Yeah that isn't funny at all.

quick and dirty benchmark (factorial) (1, Interesting)

Anonymous Coward | more than 8 years ago | (#14038457)

a very simple performance check i love to run on every computer i come across:

put windows calculator in scientific mode (to keep things comparable, please use windose calculator, not some custom written C+ program....)

type in 100,000

hit the n! button

ignore the warnings that it will take a long time, don't even bother clicking on "Continue", because the calculation is still going.

and report how long it takes to complete a factorial of 100,000

please report what CPU you have.

P4 3.2Ghz and Athlon 3200+ both do it in about 80 seconds....

Re:quick and dirty benchmark (factorial) (1)

klep (26544) | more than 8 years ago | (#14038522)

kcalc (KDE calculator) on a P4 3.4Ghz does it in less than 20 seconds.

Re:quick and dirty benchmark (factorial) (0)

Anonymous Coward | more than 8 years ago | (#14038524)

Athlon 2000+

121 seconds

(I clicked continue :-p)

Replacement Hash Functions (5, Informative)

Anonymous Coward | more than 8 years ago | (#14038190)

Recommended replacements are SHA (preferably SHA-2), WHIRLPOOL and/or RIPEMD.

http://en.wikipedia.org/wiki/SHA-2 [wikipedia.org]
http://en.wikipedia.org/wiki/WHIRLPOOL [wikipedia.org]
http://en.wikipedia.org/wiki/RIPEMD-160 [wikipedia.org]

Re:Replacement Hash Functions (1)

From A Far Away Land (930780) | more than 8 years ago | (#14038468)

Might I suggest a pre-emptive upgrade, and instead of replacing MD5 with MD6, go with MD7 or even MD8 to avoid the need to upgrade after cracks are found for MD6 or 7. Or just use Blowfish 1024bit?

Why? (3, Insightful)

mboverload (657893) | more than 8 years ago | (#14038192)

Why not just use 2 different algorithms? Yes, it's possible. Or hell, use 3. Can someone tell me why this isn't a standard practice? Even if one has a weakness, you still have the other to back it up.

I use HMAC-SHA-256 with PasswordMaker.

https://passwordmaker.org/passwordmaker.html [passwordmaker.org]

Re:Why? (5, Insightful)

einhverfr (238914) | more than 8 years ago | (#14038324)

Even if SHA1 and MD5 have attackable collisions the chances are very low that you can find a meaningful collision that affects both algorithms.

Re:Why? (1)

mboverload (657893) | more than 8 years ago | (#14038343)

Although I was thinking of a hash of a hash, that works as well and fits into my question.

Re:Why? (1)

Zone-MR (631588) | more than 8 years ago | (#14038445)

A hash of a hash? You mean something like SHA1(MD5('data')) ?

In that case, if MD5 is broken and 'data' can be constructed to give the same MD5 hash, the SHA1 hash provides no extra security. In fact calculating a hash of a hash reduces the security, as if ANY of the algorithms in the chain are broken, the final result can be manipulated easily.
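To make the chaining point concrete, here is a toy Python demo. MD5 truncated to 16 bits stands in for a "broken" inner hash so that a collision can be brute-forced in well under a second; that truncation, and the names `weak_inner` and `chained`, are my own assumptions for illustration and not the real MD5 attack.

```python
import hashlib
from itertools import count

def weak_inner(data: bytes) -> bytes:
    # Stand-in for a broken hash: only the first 16 bits of MD5 are kept.
    return hashlib.md5(data).digest()[:2]

def chained(data: bytes) -> str:
    # SHA1(weak(data)): the outer hash only ever sees the inner digest.
    return hashlib.sha1(weak_inner(data)).hexdigest()

def find_inner_collision():
    """Brute-force two different messages with the same inner digest."""
    seen = {}
    for i in count():
        msg = b"message-%d" % i
        tag = weak_inner(msg)
        if tag in seen:
            return seen[tag], msg
        seen[tag] = msg

a, b = find_inner_collision()
print(a != b, chained(a) == chained(b))  # True True
# Hashing the full input directly with SHA-1 is not affected by the inner collision:
print(hashlib.sha1(a).hexdigest() == hashlib.sha1(b).hexdigest())
```

Any pair that collides in the inner hash collides in the chain automatically, so the outer SHA-1 adds nothing against that attack; checking two independent digests of the original data is a different construction.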

Re:Why? (1)

nahdude812 (88157) | more than 8 years ago | (#14038364)

As I understand it:

Every re-hashing increases the collision rate, so generally for highest security, you use one unbroken hashing algorithm. Yes, using more than one hashing algorithm would reduce the chances that your whole line of defense is invalidated one day, but it increases the chance that a single hole could be found.

Re:Why? (1)

mysqlrocks (783488) | more than 8 years ago | (#14038375)

Why not just use 2 different algorithms?

If this was in fact more secure then perhaps it would have already been released bundled as one algorithm. Why make people use two sets of algorithms unless the goal is to confuse potential crackers? However, most uses of MD5 involve the recipient knowing the algorithm used so that wouldn't work.

Re:Why? (2, Informative)

ninjaz (1202) | more than 8 years ago | (#14038489)

Why not just use 2 different algorithms? Yes, it's possible. Or hell, use 3. Can someone tell me why this isn't a standard practice? Even if one has a weakness, you still have the other to back it up.
I noticed that NetBSD's source-based package management system, pkgsrc [netbsd.org] , already does this using SHA1 and RMD160 (apparently RIPEMD-160 [wikipedia.org] is the official name for the digest). Here's what it looks like in the archive fetching phase of a package installation:

=> Checksum SHA1 OK for unzip-5.52/unzip552.tar.gz.
=> Checksum RMD160 OK for unzip-5.52/unzip552.tar.gz.
===> Extracting for unzip-5.52nb2

One might also imagine that colliding two different hash types at the same time would be much more difficult than only one at a time, anyway.
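A rough Python equivalent of that kind of double check is sketched here. Note one substitution: pkgsrc uses SHA1 plus RMD160, but whether `hashlib` offers RIPEMD-160 depends on the underlying OpenSSL build, so this sketch assumes SHA-1 plus SHA-256 instead. The function names are illustrative only.

```python
import hashlib

def digests(data: bytes, algorithms=("sha1", "sha256")) -> dict:
    """Compute several independent digests over the same bytes."""
    return {name: hashlib.new(name, data).hexdigest() for name in algorithms}

def verify_all(data: bytes, expected: dict) -> bool:
    """Accept the data only if every published digest matches."""
    actual = digests(data, tuple(expected))
    return all(actual[name] == value.lower() for name, value in expected.items())
```

A forged archive would have to collide under every listed algorithm at once, which is the property the parent is getting at.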

Collisions on demand (0)

Anonymous Coward | more than 8 years ago | (#14038196)

But will my insurance company cover the damage?

Right now it's time to... (2, Interesting)

Tackhead (54550) | more than 8 years ago | (#14038203)

> This more than anything should be the final stake in the heart of MD5, now that anyone can generate collisions whenever they want."

In other words, right now it's time to...
...LICK OUT THE KAMS, NOTHERGUCKERS!

bittorrent? (4, Insightful)

rayde (738949) | more than 8 years ago | (#14038206)

doesn't bittorrent use md5 to verify the sections of files it has downloaded? will this facilitate poison seeds? or does BT use something more complex than md5?

Re:bittorrent? (4, Informative)

Wesley Felter (138342) | more than 8 years ago | (#14038304)

BitTorrent uses SHA-1.

Re:bittorrent? (1)

Mind Booster Noori (772408) | more than 8 years ago | (#14038372)

SHA-1? Wow, you're safe then... [schneier.com] :-P

Re:bittorrent? (1)

JPyun (911266) | more than 8 years ago | (#14038447)

Hash functions in file sharing are not used for security. They are used to ensure the file didn't become corrupted in transfer. SHA-1 is perfectly fine for this.

Re:bittorrent? (1)

Mind Booster Noori (772408) | more than 8 years ago | (#14038504)

Parent talked about poisoned seeds, and SHA-1 isn't fine to prevent them...

Re:bittorrent? (0)

Anonymous Coward | more than 8 years ago | (#14038332)

http://www.bittorrent.com/FAQ.html#corruptdl [bittorrent.com]

How do I know the download isn't corrupted?

BitTorrent does cryptographic hashing (SHA1) of all data. When you see "Download succeeded!" you can be sure that BitTorrent has already verified the integrity of the data. The integrity and authenticity of a BitTorrent download is as good as the original request to the tracker. Checking the MD5/CRC32/other hash of a file downloaded via BitTorrent is redundant.

WTF this source is useless (3, Interesting)

RootsLINUX (854452) | more than 8 years ago | (#14038211)

I guess someone thinks he's a little too cool to comment his code properly. Yeah, like "/* B2 */" tells me anything useful you insensitive clog!

Re:WTF this source is useless (0)

Anonymous Coward | more than 8 years ago | (#14038529)

Here's an example of what he means:
/* B1 */
Q0[ 4] = (random() | 0xba040010) & ~(0x443b19ee | 0x00000601);
Q0[ 4] |= (Q0[ 3] & 0x00000601);
Q1[ 4] = Q0[ 4] - 0x7dffffe2;
 
X0[19] = RR(Q0[ 4] - Q0[ 3], 22) - F(Q0[ 3], Q0[ 2], Q0[ 1]) - B0 - 0xc1bdceee;
X1[19] = RR(Q1[ 4] - Q1[ 3], 22) - F(Q1[ 3], Q1[ 2], Q1[ 1]) - B1 - 0xc1bdceee;
if(X0[19] != X1[19])
  continue;
Apparently, 1 and 2-character identifier names are all the rage in China. And really, there's no need to bother with explaining the use of 0xba040010 or 0x443b19ee. It should really be self-evident.

I mean seriously, what the hell is the significance of X0[19] and X1[19], what are they used for and what does it mean if they are or are not equal, and why continue the loop until they are equal? What does the call to macro 'F' do? How about the call to 'RR'?

Oh wait, there it is:
#define F(x, y, z) (z ^ (x & (y ^ z)))
#define G(x, y, z) F(z, x, y)
#define H(x, y, z) (x ^ y ^ z)
#define I(x, y, z) (y ^ (x | ~z))
 
#define RL(x, y) (((x) << (y)) | ((x) >> (32 - (y))))
#define RR(x, y) (((x) >> (y)) | ((x) << (32 - (y))))
It's so... obvious.. now..

God, I agree, this stuff is completely unreadable. It is absolutely useless for any learning value at all, other than as a study on how not to write code that is going to be released.
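For what it's worth, the macros do line up with the MD5 specification (RFC 1321): `F` and `G` are algebraically rewritten forms of the spec's round functions, `RL`/`RR` are 32-bit left/right rotations, and 0xc1bdceee with a shift of 22 is the constant and shift of the fourth step of round 1. So the `RR(...) - F(...) - ...` lines look like an MD5 step being run backwards to solve for a message word, though that reading is my guess and not something the source says. A small Python check of the identities, with 32-bit masking added by me because Python integers are unbounded:

```python
import random

MASK = 0xFFFFFFFF  # emulate C's 32-bit unsigned arithmetic

def F(x, y, z):  # as in the posted macro
    return (z ^ (x & (y ^ z))) & MASK

def G(x, y, z):  # as in the posted macro
    return F(z, x, y)

def f_rfc(x, y, z):  # RFC 1321: F = (X and Y) or (not X and Z)
    return ((x & y) | (~x & z)) & MASK

def g_rfc(x, y, z):  # RFC 1321: G = (X and Z) or (Y and not Z)
    return ((x & z) | (y & ~z)) & MASK

def rl(x, n):  # rotate left by n bits, 0 < n < 32
    return ((x << n) | (x >> (32 - n))) & MASK

def rr(x, n):  # rotate right by n bits, 0 < n < 32
    return ((x >> n) | (x << (32 - n))) & MASK

rng = random.Random(0)
for _ in range(1000):
    x, y, z = (rng.getrandbits(32) for _ in range(3))
    n = rng.randrange(1, 32)
    assert F(x, y, z) == f_rfc(x, y, z)
    assert G(x, y, z) == g_rfc(x, y, z)
    assert rr(rl(x, n), n) == x  # RR undoes RL
print("all identities hold")
```

None of this explains the magic bit masks, which presumably encode the differential-path conditions from the collision papers.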

The end of edonkey (1)

Cone83 (683686) | more than 8 years ago | (#14038223)

I think this will be the end of edonkey. AFAIK edonkey still uses md4 hashes. If it really is possible to find an md4 collision in a few seconds, I'm sure the MI will soon create a script that randomly downloads files, and reshares a collision.

Re:The end of edonkey (4, Informative)

n0dalus (807994) | more than 8 years ago | (#14038362)

No. This only helps you find collisions in two randomly generated strings.
It is still very difficult to produce a colliding file given a pre-existing file on the network.
It should also be noted that edonkey splits a file into 9500KB chunks, and then into smaller chunks again, and hashes each one. It would be far more difficult to produce a chunk that causes collisions on all three levels.
Anyway, I expect an eMule extension will come out soon to allow for sharing of SHA1 hashes between clients (if it doesn't already exist).

Torrent Poisoning (0, Redundant)

dduardo (592868) | more than 8 years ago | (#14038226)

Does this mean the RIAA/MPAA can poison torrents by generating files with the same hash?

Re:Torrent Poisoning (0)

Anonymous Coward | more than 8 years ago | (#14038292)

Well, no, at least not if someone doesn't give them the idea!

Re:Torrent Poisoning (1)

Mind Booster Noori (772408) | more than 8 years ago | (#14038337)

Maybe it's time for you to start using Anonymous p2p [wikipedia.org] networks like GNUnet [gnunet.org] ...

Re:Torrent Poisoning (0)

Anonymous Coward | more than 8 years ago | (#14038398)

Does this mean the RIAA/MPAA can poison torrents by generating files with the same hash?

Could I buy some of that hash from you, Professor Jennings?

Re:Torrent Poisoning (0)

Anonymous Coward | more than 8 years ago | (#14038420)

Not at all.

Bittorrent splits files up into pieces of about 64KB, then makes a hash of _each_ of those, as well as the entire file. They'd have to find collisions for at least one of the pieces and the whole file at the same time, and even then it's useless since things like MP3 have error recovery from corrupt data.

How long till we see... (0)

Anonymous Coward | more than 8 years ago | (#14038230)

How long till we see it integrated into cracking tools? Many Linux distros use MD5 for storing the passwords in the /etc/shadow file. How long will it take before they move on to SHA1 or Tiger or etc?

Re:How long till we see... (0)

Anonymous Coward | more than 8 years ago | (#14038307)

As far as I can see, it is pretty damn important for anything using MD5 for security purposes to update. Maybe it isn't so important for your CRAM-MD5 email authentication, but anything more important like on your corporate servers, etc, will need something better than MD5, and fast.

Next year, CRAM-WHIRLPOOL authentication, hehe.

Re:How long till we see... (0)

Anonymous Coward | more than 8 years ago | (#14038451)

If they can get at your password file on your corporate servers, you have bigger problems than what hashing function is being used.

Source?? (0)

Anonymous Coward | more than 8 years ago | (#14038246)

I wouldn't exactly call this source code... this gives almost no explanation as to why it works. I could have gotten about as much knowledge from decompiling a binary of this.

Re:Source?? (1)

yup2000 (182755) | more than 8 years ago | (#14038414)

yeah, it is a horrible example of source code. No comments whatsoever. It looks like the programmer deliberately took the comments out. Not very professional.

am i missing something? (1)

Intangion (816356) | more than 8 years ago | (#14038250)

This just means they can find more than one input that comes up with the same output. It doesn't mean they can come up with an input to match a specified output (which can usually be protected anyway), so it really doesn't change much, right? Or am I missing something...

Re:am i missing something? (0)

Anonymous Coward | more than 8 years ago | (#14038428)

google the term "rainbow table"

This is misleading - MD5 is still useful (5, Insightful)

hoggoth (414195) | more than 8 years ago | (#14038252)

This new algorithm does not ruin the usefulness of MD5 hashes. The algorithm can generate two documents that have the same MD5 hash, an MD5 collision. But it can NOT generate an MD5 collision starting with an existing document. In practical terms, this means a file that has been signed with an MD5 hash is STILL secure. Nobody can replace the file with a different file that will have the same MD5 hash. However someone can prepare in advance two documents with the same MD5 hash and trick someone into believing one document is really the other. So if you trust the original source (a Linux distro for example) you can be confident you are downloading the original document.
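The gap between the two problems shows up even with plain brute force on a toy hash. The Python sketch here truncates SHA-256 to 16 bits (my own stand-in so both searches finish quickly; the real MD5 attack is a differential attack, not brute force) and counts tries for "any two colliding messages" versus "a message matching a hash someone else chose".

```python
import hashlib
from itertools import count

def tiny_hash(data: bytes) -> int:
    # SHA-256 truncated to 16 bits: a deliberately weak toy digest.
    return int.from_bytes(hashlib.sha256(data).digest()[:2], "big")

def find_any_collision():
    """Attacker chooses BOTH messages (the kind of attack in the story)."""
    seen = {}
    for tries in count(1):
        msg = b"free-%d" % tries
        h = tiny_hash(msg)
        if h in seen:
            return seen[h], msg, tries
        seen[h] = msg

def find_second_preimage(original: bytes):
    """Attacker must match the digest of a message they did not choose."""
    target = tiny_hash(original)
    for tries in count(1):
        msg = b"forged-%d" % tries
        if tiny_hash(msg) == target:
            return msg, tries

a, b, collision_tries = find_any_collision()
forged, preimage_tries = find_second_preimage(b"the real ISO")
print("collision found after", collision_tries, "tries")
print("second preimage found after", preimage_tries, "tries")
```

For an n-bit digest the first search needs on the order of 2^(n/2) tries and the second on the order of 2^n, so for a 16-bit toy that is typically a few hundred versus tens of thousands. The released code makes the first kind of search cheap for full MD5; it does nothing for the second.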

Re:This is misleading - MD5 is still useful (1)

didit (820432) | more than 8 years ago | (#14038346)

So, this means that encrypted passwords are still safe too, right? (Not to mention that everybody should be using shadow passwords by now and that root privileges are required to read /etc/shadow.)

Re:This is misleading - MD5 is still useful (2, Insightful)

MoogMan (442253) | more than 8 years ago | (#14038421)

Nobody can replace the file with a different file that will have the same MD5 hash.

Yet.
 

one case I can think of... (0)

Anonymous Coward | more than 8 years ago | (#14038425)

Filesharing networks that use MD5 hashes to verify a file would be severely affected. Companies like Overmind that spam the networks could use the collision to generate junk files with matching hashes. Then when clients start downloading, they'll get pieces of the broken file instead of the real one, causing the whole download to be corrupt.

Re:This is misleading - MD5 is still useful (1)

miller60 (554835) | more than 8 years ago | (#14038552)

This is also important for SSL certificates, many of which use MD5. Existing certificates relying on MD5 are still secure, and new ones can be issued using different hashes. But this is one more motivator for NIST and the security community to decide on a way forward and start making it happen.

Collisions do not mean the end of MD5 (5, Insightful)

afaik_ianal (918433) | more than 8 years ago | (#14038276)

This more than anything should be the final stake in the heart of MD5, now that anyone can generate collisions whenever they want.

No, no, no. This does not allow an attacker to generate any collision they like. They cannot find data that collides with a piece of data I provide them with. All they can do is provide me with 2 pieces of data that happen to collide.

This means that an attacker can theoretically provide 2 different documents to people with the same hash, but they cannot easily produce a document that has the same hash as a document I have written.

(Disclaimer: I haven't actually been able to RTFA (it's /.'d), but unless they have made an enormous breakthrough since this was last reported, this attack has very few implications for those of us who use MD5.)

"broken" does not mean broken (5, Informative)

Edgewize (262271) | more than 8 years ago | (#14038300)

This program is an efficient way to generate two source blocks with the same resulting MD5. This program does NOT allow you to match an arbitrary MD5 hash. That may come some day, but unless I've missed a very important paper somewhere, it has not happened yet.

This does not totally invalidate MD5 for verification. This attack still does not let you poison a torrent feed, etc, unless you are the author of the original source data and you engineered the data specifically to be vulnerable to this attack.

Re:"broken" does not mean broken (1)

petermgreen (876956) | more than 8 years ago | (#14038488)

btw torrent block hashes are sha1 not md5

Will people stop saying this? (2, Insightful)

ivan256 (17499) | more than 8 years ago | (#14038301)

This more than anything should be the final stake in the heart of MD5

No, no it won't be. It won't be, because MD5 is useful for many things where the existence of an "easy" (in quotes because easy is relative) method of generating collisions is irrelevant.

It won't even kill off the use of MD5 checksums as a signature for verifying authenticity, because if your data is smaller than the checksum there may not be a collision at all, and an exploit wouldn't matter.

This is an important discovery, but it doesn't make MD5 useless any more than CRC32 is useless.

Got Salt? (2, Insightful)

Anonymous Coward | more than 8 years ago | (#14038303)

OK, so clearly a scripted attack against MD5 is bad.

But aren't most people using MD5 using salted (as opposed to unsalted) hashes? (For those unclear on the difference, a "salted" hash basically uses a local seed as part of its MD5 hash, in addition to the value to be encrypted.)

Doesn't seem likely that salted hashes can be easily broken by this technique, although clearly it's a concern that, should the salt value become known, all your passwords, etc, become breakable...
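For anyone who wants to see what "salted" means in practice, here is a minimal sketch. It is an assumption-laden illustration, not how any particular system does it: the helper names are invented, the salt is simply prepended, and a single MD5 pass is used only because MD5 is the topic here. In the usual scheme the salt is stored in the clear next to the digest; its job is to make precomputed tables useless, not to stay secret.

```python
import hashlib
import os

def salted_md5(password: str, salt: bytes) -> str:
    """Hash salt + password; the same password gives a new digest per salt."""
    return hashlib.md5(salt + password.encode("utf-8")).hexdigest()

def make_record(password: str):
    """Create a (salt, digest) pair to store instead of the password."""
    salt = os.urandom(8)  # fresh random salt for every stored password
    return salt, salted_md5(password, salt)

def check(password: str, salt: bytes, digest: str) -> bool:
    """Recompute with the stored salt and compare against the stored digest."""
    return salted_md5(password, salt) == digest

salt, digest = make_record("hunter2")
print(check("hunter2", salt, digest))  # the right password verifies
print(check("hunter3", salt, digest))  # a wrong one does not
```

Note that the collision attack does not help here either way: the attacker would need an input matching a digest that already exists, which is the preimage problem, not the collision problem.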

Not -that- bad (2, Interesting)

Parity (12797) | more than 8 years ago | (#14038309)

The only attacks that these md5 collisions allow are denial-of-service/destruction-of-data attacks, they don't generally allow the compromise of protected data or access to systems or suchlike. The collision blocks that are generated are effectively random data. It has yet to be shown how to -craft- a collision block.

If we could craft a collision block that contained a specified string at a specified position, that would be another issue altogether.

The ability to find collision blocks easily does suggest that crafted collision blocks might be possible, but for now, you have as good a chance of getting a viable exploit out of /dev/random as out of a collision block.

This doesn't mean we shouldn't look to other options for the newest releases of high-security software, but it doesn't mean that the md5 algorithm should be purged from our systems altogether either. It's still extremely valuable at detecting accidental corruption, and useful-with-caveats at detecting malicious corruption (45 minutes to discover a block of data that matches the sum is not really useful in either speed or resulting data for any kind of man in the middle attack, for example, so using md5 to validate network packets is safer than using it to validate disk files).

Of course, the black hats may know more than we do about md5 weaknesses, but 'may know' is just as true of any other algorithm.

MD5 and verification (4, Insightful)

n0dalus (807994) | more than 8 years ago | (#14038311)

Just because collisions can be generated doesn't mean that MD5 is dead.
It might only take minutes to calculate two random strings with the same hash, but it would still take a very long time to calculate a second string that collides with a pre-existing string. So even though it is now cryptographically weak, it can still be used effectively to check the integrity of files.

Re:MD5 and verification (1)

flynt (248848) | more than 8 years ago | (#14038551)

And wouldn't that second string have to be 'meaningful' in the same context as the original, like code to be run?

Coral cache? (5, Funny)

Viper Daimao (911947) | more than 8 years ago | (#14038340)

(Coral cache links provided to prevent slashdotting)

I'm sorry, you must be new here.

Lost (0)

Anonymous Coward | more than 8 years ago | (#14038363)

great, will someone crack http://thedharmainitiative.org/ [thedharmainitiative.org] already?

See pam_unix2 for blowfish in shadow files (1, Informative)

Anonymous Coward | more than 8 years ago | (#14038396)

$2$blah instead of $1$blah MD5 http://www.thkukuk.de/pam/pam_unix2/ [thkukuk.de] . Quite useful.

For windows users (1)

Psionicist (561330) | more than 8 years ago | (#14038437)

If you get the error "error getting crypto context..." replace

if(!CryptAcquireContext(&cryptHandle, NULL, NULL, PROV_RSA_FULL, CRYPT_NEWKEYSET ))

with

if(!CryptAcquireContext(&cryptHandle, NULL, NULL, PROV_RSA_FULL, 0 ))

To actually run the program you have to convert your MD5 to four ints. Take your MD5, such as 098f6bcd4621d373cade4e832627b4f6

Divide it in four pieces and convert them to dec.

Hex
=======
098f6bcd
4621d373
cade4e83
2627b4f6

Dec
=======
160394189
1176621939
3403566723
640136438

Good luck.
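The hex-to-decimal step above can be scripted instead of done by hand. This is a small sketch that reproduces the same conversion (split into four 8-digit pieces, read each piece as a hexadecimal number); the function name is made up for the example.

```python
def md5_hex_to_words(md5_hex: str) -> list:
    """Split a 32-character MD5 hex string into four decimal integers."""
    if len(md5_hex) != 32:
        raise ValueError("expected 32 hex characters")
    # Each 8-character slice is one 32-bit value written in hex.
    return [int(md5_hex[i:i + 8], 16) for i in range(0, 32, 8)]

words = md5_hex_to_words("098f6bcd4621d373cade4e832627b4f6")
print(words)  # [160394189, 1176621939, 3403566723, 640136438]
```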

This is bad. (1)

Boinger69 (673392) | more than 8 years ago | (#14038440)

This is bad. Now it's only a matter of time before code like this is used to corrupt P2P networks whose primary file checking is based on MD5 hashes.

Re:This is bad. (1)

JPyun (911266) | more than 8 years ago | (#14038509)

You can't make an arbitrary hash. P2P is fine.

Next Version (1, Redundant)

winphreak (915766) | more than 8 years ago | (#14038446)

So... When will MD6 come about? (yes, a weak version number pun, I know)

hash collision while file sizes still match? (0)

Anonymous Coward | more than 8 years ago | (#14038477)

So, not having read the FA, does this tool enable me to find collisions for different files *that are the same size*? or can I use size to discriminate?
 

MD5 is still useful. (1, Informative)

Theovon (109752) | more than 8 years ago | (#14038484)

This program generates ARBITRARY collisions. Given a tarball of a Linux kernel, you can generate some other file with the same MD5 hash. But can you generate a collision that is also a valid tarball that unpacks cleanly and compiles? The chances of that are so remote that I don't see it happening any time soon.

Here's the real trick. Take your kernel tarball X, and your hacked version Y. Using this collision finder, can you find some garbage Z such that Y appended with Z has the same hash as X? (Tar will, however, complain about extra stuff at the end of the tarball, but it would unpack and compile.) That's a MUCH harder problem than finding arbitrary collisions and would take a HELL of a lot longer to produce than 45 minutes on a PC.

It's not the end of the world... (0)

Anonymous Coward | more than 8 years ago | (#14038530)

C'mon folks - yes there are plenty of uses for MD5 and yes MD5 is being used for a lot of applications now. However, it doesn't completely destroy its usefulness until we find a better replacement in the apps. Yes, 2 items can have the same hash - however it is unlikely to occur without trying to do it intentionally.

Remember the hashes, keys, doors, traps and such to keep unwanted folks from touching your data are only a way of keeping honest people honest. If you have honest people around - there is nothing to worry about.

Still not a preimage attack (1)

^BR (37824) | more than 8 years ago | (#14038531)

It's the same attack that has already been spoken about, just now it is available for the masses.

The scope of the attack is that one can generate two files having the same MD5 sum, provided the file can contain a random-looking section in the middle, i.e. possible in many binary formats but not in well-formed ASCII text.

What the attack doesn't do is, given an MD5 hash, find a byte stream that hashes to the same value. So passwords stored as their MD5 sums are still safe, you can't attack the RADIUS protocol with it, and constructs like HMAC-MD5 used in SSL and IPsec are still safe. What you cannot trust anymore is, for example, the mechanism used to check distfiles on some BSDs, where the port system checks the MD5 of the freshly downloaded archive. Nothing proves it is the same one that the porter used for the port (OpenBSD has been safe for a few years, checking not only MD5 but SHA1 and RIPEMD; dunno for the other BSDs). And certificate authorities that don't modify the CSRs submitted to them are also vulnerable to people forging certificates. Any serious CA won't be caught making that mistake again.

One of the big lessons of these attacks on cryptographic hashes is: do not ever sign the hash of a document you didn't generate or at least modify (the document, not the hash).
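The "check more than one digest" idea mentioned for distfiles can be sketched roughly as follows. This is not the actual ports code, just an illustration with invented helper names; it uses MD5 and SHA1 because both are always present in Python's hashlib, while RIPEMD availability depends on the local OpenSSL build, so it is left out here.

```python
import hashlib

def digests(data: bytes, algorithms=("md5", "sha1")) -> dict:
    """Compute one hex digest per named algorithm over the same data."""
    return {name: hashlib.new(name, data).hexdigest() for name in algorithms}

def verify(data: bytes, expected: dict) -> bool:
    """Accept the data only if EVERY recorded digest matches."""
    actual = digests(data, tuple(expected))
    return all(actual[name] == expected[name].lower() for name in expected)

# The porter records the digests once, from the archive they actually built.
recorded = digests(b"contents of the original archive")

print(verify(b"contents of the original archive", recorded))  # matches
print(verify(b"contents of a swapped archive", recorded))     # rejected
```

A forger would then need a pair of files that collide under all the recorded algorithms at once. Extending the tuple (for example with "sha256") raises that bar further without changing the rest of the code.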

Weak code. (4, Funny)

kg_o.O (802342) | more than 8 years ago | (#14038562)

This code is weak. I fired it up like 20 minutes ago and still haven't r00ted my box.