
Use BitTorrent To Verify, Clean Up Files

timothy posted more than 6 years ago | from the but-anything-bittorrent-is-for-terrorists dept.

Networking

jweatherley writes "I found a new (for me at least) use for BitTorrent. I had been trying to download beta 4 of the iPhone SDK for the last few days. First I downloaded the 1.5GB file from Apple's site. The download completed, but the disk image would not verify. I tried to install it anyway, but it fell over on the gcc4.2 package. Many things are cheap in India, but bandwidth is not one of them. I can't just download files > 1GB without worrying about reaching my monthly cap, and there are Doctor Who episodes to be watched. Fortunately we have uncapped hours in the night, so I downloaded it again. md5sum confirmed that the disk image differed from the previous one, but it still wouldn't verify, and fell over on gcc4.2 once more. Damn." That's not the end of the story, though — read on for a quick description of how BitTorrent saved the day in jweatherley's case.

jweatherley continues: "I wasn't having much success with Apple, so I headed off to the resurgent Demonoid. Sure enough they had a torrent of the SDK. I was going to set it up to download during the uncapped night hours, but then I had an idea. BitTorrent would be able to identify the bad chunks in the disk image I had downloaded from Apple, so I replaced the placeholder file that Azureus had created with a corrupt SDK disk image, and then reimported the torrent file. Sure enough it checked the file and declared it 99.7% complete. A few minutes later I had a valid disk image and installed the SDK. Verification and repair of corrupt files is a new use of BitTorrent for me; I thought I would share a useful way of repairing large, corrupt, but widely available, files."
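
For anyone who wants to try the same trick, the general shape of it is sketched below. The file names and paths are invented, and the forced re-check itself happens in whatever BitTorrent client you use (re-adding the torrent in Azureus triggers it; most clients also have an explicit "force re-check" option):

    # grab a .torrent for the same release you already (mostly) have,
    # stop it, then drop your corrupt copy where the client expects the payload
    mv ~/torrents/iphone_sdk_beta4.dmg ~/torrents/iphone_sdk_beta4.dmg.empty   # the placeholder
    cp ~/Downloads/iphone_sdk_beta4.dmg ~/torrents/iphone_sdk_beta4.dmg        # the corrupt download
    # restart / force re-check: every piece whose SHA-1 hash matches is kept,
    # and only the handful of bad pieces get fetched from the swarm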


Nice (5, Interesting)

Goldberg's Pants (139800) | more than 6 years ago | (#23295424)

Awesome idea. I've done this in the past with stuff. If a corrupt version was on one tracker, I'd save the files, get a new torrent and import the old files. Saves a lot of bandwidth wasting.

Re:Nice (0)

Anonymous Coward | more than 6 years ago | (#23295562)

How exactly did you go about doing this?

Re:Nice (5, Interesting)

Goldberg's Pants (139800) | more than 6 years ago | (#23295704)

Okay, I had some AVI's and a bunch of them had issues. All I did was copy them out to a different directory, then find a GOOD torrent (with the same rips) then make sure the filenames match exactly. Chucked them in the directory and voila. It checks them all and uses what data it can that you already have and replaces the rest.

Done this with RAR archived stuff as well. (Multipart rars on torrents are retarded, but that's another issue entirely.)

Re:Nice (5, Funny)

peipas (809350) | more than 6 years ago | (#23295812)

Okay, my friend had some AVI's and a bunch of them had issues.
There, fixed it for you. You're welcome.

Re:Nice (0)

Anonymous Coward | more than 6 years ago | (#23295884)

Stories about AVIs can be told in the first person. It's only when stories involve pirated TV and movies that it becomes grammatically correct to speak of them in terms of friends.

Re:Nice (2, Insightful)

CastrTroy (595695) | more than 6 years ago | (#23295974)

Reminds me of the "PARS" I used to get off usenet. I think it was bacally a RAR split up into hundreds of pieces, with parity information in each of the files. You only needed to download a certain percentage of the files to reconstruct the original file. It was great, because often pieces of the file would go missing, or become corrupted somewhere along the way.

Re:Nice (1)

Random Destruction (866027) | more than 6 years ago | (#23296640)

Kind of. Par files are parity files, and you need one par file for each missing rar file.

Re:Nice (2, Informative)

X0563511 (793323) | more than 6 years ago | (#23296706)

And then, Par2 came along, and allowed more flexibility.

We still use them, on usenet anyways.

Re:Nice (2, Informative)

Christophotron (812632) | more than 6 years ago | (#23296758)

No, actually that's how the old par system worked. In the newer, more advanced .par2 system, the individual .rar files are divided into "blocks" and each par file can recover a certain number of "blocks". It's much more advanced than the old par files that you are referring to, and it's similar to the hashing mechanism in BitTorrent. I haven't seen any of the old-style pars in a long time.

For example, if you are missing a total of 3 blocks (one block from 3 different files) you only need to download a very small par2 file that says "+3 blocks" and it will repair the three missing blocks. Of course, if you are missing a lot more data, even entire files, you can get several of the larger "+128" par files and it'll repair everything (assuming there is enough parity data). Often you can even request additional parity blocks, but that's only necessary if you have a *really* crappy nntp provider.
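
For comparison, repairing on the usenet side with the par2cmdline tool looks roughly like this (file names are examples):

    par2 verify release.par2          # reports how many recovery blocks you still need
    par2 repair release.par2 *.rar    # rebuilds the damaged or missing .rar volumes from the blocks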


Re:Nice (2, Insightful)

Hes Nikke (237581) | more than 6 years ago | (#23296990)

Done this with RAR archived stuff as well. (Multipart rars on torrents are retarded, but that's another issue entirely.)
any idea why the multipart RAR torrents tend to have healthier swarms than single file torrents of the same content? it pisses me off!

Re:Nice (3, Interesting)

i.of.the.storm (907783) | more than 6 years ago | (#23297056)

I think it has to do with the way the "scene" releases things; they usually do it via multipart rars or something like that. I saw something to that effect in the comments on a torrent a while ago. I think the reason is that things in the "scene" get distributed in ways that aren't BitTorrent, so the breaking up into pieces makes sense there. I'm still not entirely sure what the "scene" entails, or how they differ from the people that put the torrents up, so I don't know the whole answer to that.

Re:Nice (4, Interesting)

ThePhilips (752041) | more than 6 years ago | (#23295728)

I do not know what the GP meant precisely, but I had a similar experience.

Some game (a very old RPG) was available on Overlord and on BitTorrent, and isn't sold anymore. The problem was that BitTorrent had only a single seed with minuscule upload speed - in several days I had downloaded only a few megs. I then tried Overlord, and in a few days I had the game almost complete - but another snag hit me: whether by mistake or intentionally, the file was poisoned and three parts couldn't be downloaded. I was ready to throw everything away - antique games interest me little (but a friend had recommended it as a milestone RPG I had to play). Then suddenly I was enlightened: I fed the incomplete ISO of the game to BitTorrent. The BT client happily announced something like 98% of the file complete, and in less than one night downloaded the rest of the file.

Re:Nice (5, Insightful)

empaler (130732) | more than 6 years ago | (#23295866)

I assume you then continued seeding? :)

Re:Nice (5, Informative)

gomiam (587421) | more than 6 years ago | (#23295742)

It should be quite simple. Let's say torrentA leaves you with a corrupt/incomplete filesetA (one or more files, it doesn't really matter). Let's suppose torrentB contains the files in filesetA, perhaps with different names in its own filesetB.

OK, you load torrentB in your favorite BitTorrent client and start it up. It will automatically create 0-sized files with the names in filesetB (at least, all clients I know do that). Stop the transfer of torrentB, and substitute the 0-sized files in filesetB with the corresponding files from filesetA (may require some renaming). As you restart torrentB, your BitTorrent client will recheck the whole filesetB, keeping the valid parts in order to avoid downloading them. Voilà! You have migrated files from one torrent to another.

Note: You should make sure that the files you are substituting in are the same files you want to download through torrentB or, at least, keep a copy around until you see that the restart check accepts most of their contents.

Re:Nice (1)

Drantin (569921) | more than 6 years ago | (#23296226)

Most BitTorrent clients I've used also allow you to create files at their full size when you start a torrent, rather than 0-byte ones. Helps prevent fragmentation, and keeps you from finding downloads stopped in the middle because the hard drive filled up...

"Saves a lot of bandwidth wasting." (0)

Anonymous Coward | more than 6 years ago | (#23296358)

Wouldn't the methodology just save a lot of bandwidth?
If you save the wasting, does that mean that you subsequently have to waste that bandwidth elsewhere, so that the entropy of the universe remains constant?
Would a sufficient quantity of hoarded bandwidth wasting go unstable, become a bureaucratic singularity, and emerge as a new government, complete with its own non-event horizon?
I worry about stuff like that.

What broken software were you using? (0, Offtopic)

evanbd (210358) | more than 6 years ago | (#23295458)

TCP/IP provides data integrity guarantees. So, if your ISP wasn't mucking with your packets (and their checksums), either Apple was sending the wrong bits or your hardware or software was screwing with them. My vote is it's not Apple.

I suggest you diagnose your computer problems, rather than relying on BitTorrent to fix them for you.

Re:What broken software were you using? (5, Insightful)

Dice (109560) | more than 6 years ago | (#23295504)

I asked the same question. Wikipedia answered it [wikipedia.org] .

Re:What broken software were you using? (3, Insightful)

kcbanner (929309) | more than 6 years ago | (#23295524)

It's networking - shit happens. Some of his bits got thrown out of a router somewhere as heat, or maybe a packet timed out and didn't quite make it.

Re:What broken software were you using? (0)

Anonymous Coward | more than 6 years ago | (#23295608)

safari, of course.

Re:What broken software were you using? (3, Informative)

Anonymous Coward | more than 6 years ago | (#23295868)

It's obvious you have no clue how the Internet actually works. Shit happens, but the Internet is designed for it. Dropped packets cause retransmission, not corrupted data; the Internet drops packets *by design* and the entire system is designed around that. Flipped bits happen, but they are detected by multiple checksums which make it astronomically unlikely for corrupt data to remain undetected. Nope; if you receive corrupt data, the blame is squarely on some piece of software fiddling with your packets and changing the checksums to match. Maybe it's the crappy cheap NAT router, or the ISP's deep-packet-inspection P2P filter, or their (not so) transparent HTTP proxy. But whatever the cause, it's almost certain that software is to blame.

I'd bet $100 that if he did the same download over HTTPS, thus preventing software meddling of the packet contents, it would come out perfect.

Re:What broken software were you using? (3, Informative)

SanityInAnarchy (655584) | more than 6 years ago | (#23295988)

It's obvious you have no clue how the Internet actually works. Shit happens, but the Internet is designed for it... Maybe it's the crappy cheap NAT router
I'm fairly sure that's what GP meant.

Oh, and TCP checksumming isn't perfect.

Re:What broken software were you using? (4, Informative)

Skapare (16644) | more than 6 years ago | (#23296170)

Flipped bits happen, but they are detected by multiple checksums which make it astronomically unlikely for corrupt data to remain undetected.

I actually saw this happen once ... the astronomically unlikely [1]. TCP accepted the corrupt packet. I'm sure it will never happen again. Fortunately, rsync caught it in the next run.

One problem I ran into once with a certain Intel NIC was that a certain data pattern was always being corrupted. TCP always caught it and dropped the packet. There was no progress beyond that point because the hardware defect always corrupted that data pattern. Turns out there was a run of zeros followed by a certain data byte (I tried a different data byte and different run lengths, and those never got corrupted). What the NIC did was drop 4 bytes and put 4 bytes of garbage at the end. I suspect it was a clocking synchronization error. I got around the problem by adding the -z option to rsync (which I normally would not have done with an ISO of mostly compressed files). Another way would have been to do the rsync through ssh, either as a session agent (like rsync itself can do) or as a forwarded port (how I do it now for a lot of things).

[1] ... approximately 1 in 2^31-1 chance that the TCP checksum will happen to match when the data is wrong (variance depending on what causes the error in the first place) ... which approaches astronomically unlikely. Take 1 Terabyte of random bits. Calculate the CRC-32 checksum for each 256 byte block. Sort all these checksums. You will find 2 (or more) data blocks with the same checksum (or a repeating pattern in your RNG). Why? Because CRC-32 has 2^32-1 possible states, and you have 2^32 random checksums.

But whatever the cause, it's almost certain that software is to blame.

Agreed. Since it is at least software's responsibility to detect and fix it, if the problem happens, the famous finger of fault points at the software.

I'd bet $100 that if he did the same download over HTTPS, thus preventing software meddling of the packet contents, it would come out perfect.

Your $100 is safe.

Re:What broken software were you using? (0)

Anonymous Coward | more than 6 years ago | (#23296124)

Or maybe he used Firefox to download it.

In my experience, Firefox regularly mangles large downloads. I have never once downloaded a DVD-sized file with Firefox and had it come through uncorrupted.

I stopped using it a few months ago, though. Perhaps they've fixed it. I was using wget to get anything over 500 megs at the end.

Re:What broken software were you using? (5, Interesting)

Anonymous Coward | more than 6 years ago | (#23295558)

Those who have never developed P2P software might never understand why they all need to use strong checksums to detect data corruption, and why bad blocks actually do appear in the wild, frequently.

You'd be shocked - SHOCKED - at how much data gets corrupted routinely - by errant antivirus software, flaky network equipment, plain ol' line noise that the checksums don't detect (which will happen much more often than you expect, see also the birthday paradox), or misbehaving routers that think any occurrence of 0xC0A80102 obviously must be an internal IP address and needs to be changed to your external one. Even if that's in the middle of a ZIP file. Oops.
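
For the curious, that hex constant really is just a private LAN address spelled out byte for byte, which is why a naive NAT rewrite that scans raw payload bytes can clobber file data:

    $ printf '%d.%d.%d.%d\n' 0xC0 0xA8 0x01 0x02
    192.168.1.2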

Encryption actually aids this somewhat, as the same byte patterns don't get repeated, so if there's an errant IDS changing things for example, it tends not to fire the second time.

I've done this before for file repairs. Works a treat, but you sort of wish that torrent used a Merkle hash tree such as the modified THEX standard Tiger Tree Hash. SHA-1's so last century.

Re:What broken software were you using? (4, Informative)

complete loony (663508) | more than 6 years ago | (#23295826)

The TCP checksum offloading on nForce 4 motherboards (I have one) was notorious for corrupting TCP packets and allowing them to be received by the application. That's the most likely kind of failure that would be able to reproduce this problem.

Re:What broken software were you using? (2, Informative)

CastrTroy (595695) | more than 6 years ago | (#23296028)

I had the same problem. What's really terrible is that I don't think they ever fixed the problem. That drove me nuts for a few weeks trying to figure out why all my downloads were corrupted.

Re:What broken software were you using? (1)

complete loony (663508) | more than 6 years ago | (#23296090)

I'm pretty sure they fixed the problem a while ago. Download the latest drivers and you should be ok, assuming you still use the motherboard that is.

Re:What broken software were you using? (1)

homer_ca (144738) | more than 6 years ago | (#23296314)

Yeah, it's fixed with the latest driver, but they had to disable most of the TCP offloading. I had the same problem on my NF4 board. I chucked the NV Active Armor firewall software and never had a problem since.

http://techreport.com/discussions.x/9483 [techreport.com]

Re:What broken software were you using? (4, Informative)

Tawnos (1030370) | more than 6 years ago | (#23296642)

TCP has a 16-bit checksum. That means there's a 1 in 2^16 chance of an error getting by the checksum. Let's assume, for a moment, that the packets were sent 1 kB at a time (the Ethernet max is greater than this, but it's an easy number). In a 1.5 GB file (assuming base 10 throughout for simplicity), this means a total of 1,500,000 packets must be transmitted. If every one of those packets arrived with an error, then relying only on the TCP checksum, about 22 of them would be corrupt but allowed through. Even though there are additional checks at layer 2, the fact is that when dealing with large amounts of data, relying on TCP for data integrity is not enough.
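
Spelled out in shell arithmetic (worst case, i.e. every packet damaged):

    $ echo $(( 1500000000 / 1000 ))   # a 1.5 GB file sent as 1 kB packets
    1500000
    $ echo $(( 1500000 / 65536 ))     # of those, the ones a 16-bit checksum could wave through
    22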

Good for seeding stuck torrents, too (5, Interesting)

b4dc0d3r (1268512) | more than 6 years ago | (#23295460)

If I happen to see a stuck torrent (many leechers, no seeds), sometimes I can find a good version of the file I already have - so I start the torrent, stop it, replace the single good file (sometimes you need more if the file is smaller than the part size), and upload a few Kb to finish the torrent. Then sit back and watch as everyone fills up.

Re:Good for seeding stuck torrents, too (1)

Therefore I am (1284262) | more than 6 years ago | (#23296724)

Hi Johnny Appleseed! Nice to see you back. Did those trees just get boring?

Anonymous Coward (3, Informative)

Anonymous Coward | more than 6 years ago | (#23295468)

Those of us who use BitTorrent for *ahem* illegal purposes have been doing this since the beginning. The only way to get rare and complete downloads was to take the files to other trackers and match them against another md5 to finish the download.

It's like getting parity files over on usenet to fix that damned .r23 file which is just a bit too short for some reason :)

Scheduling (4, Informative)

FiestaFan (1258734) | more than 6 years ago | (#23295476)

Many things are cheap in India, but bandwidth is not one of them. I can't just download files > 1GB without worrying about reaching my monthly cap, and there are Doctor Who episodes to be watched. Fortunately we have uncapped hours in the night
I don't know about other bittorrent clients, but uTorrent lets you set download speed caps by hour (like 0 during the day and unlimited at night).

Re:Scheduling (2, Informative)

urbanriot (924981) | more than 6 years ago | (#23295910)

I don't know about other bittorrent clients, but uTorrent lets you set download speed caps by hour (like 0 during the day and unlimited at night).

Azureus also has an excellent scheduling plugin written for it - http://students.cs.byu.edu/~djsmith/azureus/index.php [byu.edu]

Re:Scheduling (1)

SeaFox (739806) | more than 6 years ago | (#23295924)

BitComet has too since version 0.84 at least.

!new (2, Insightful)

gustgr (695173) | more than 6 years ago | (#23295482)

For heavy BT users this tactic is very common, provided the file(s) you are trying to download are fairly widely available from different sources.

Re:!new (1)

Llamalarity (806413) | more than 6 years ago | (#23295708)

I haven't done so myself, but I have read of this technique being used to "upgrade" a late beta version to the final release. As long as only a few files have changed, it should work.

Re:!new (2, Insightful)

SanityInAnarchy (655584) | more than 6 years ago | (#23296044)

It's an older concept than that, even. Goes back to the strange Debian habit of using a tool called Jigdo -- it would provide essentially a recipe for building an ISO out of all the files needed, where the files were mostly available from standard Debian mirrors. ISOs were available from far fewer mirrors than standard Debian packages, you see.

So, you'd use Jigdo, and if all went well, it'd assemble a working image. But if a few packages couldn't be downloaded, you could always take your mostly-complete Jigdo file and use rsync with an rsync-capable mirror. (Or, more recently, BitTorrent on Ubuntu -- but that's another story.)
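
If you've never seen it, a jigdo run is a single command (the URL and file names here are invented; jigdo-lite prompts for a mirror and assembles the ISO locally):

    jigdo-lite http://cdimage.debian.org/.../debian-dvd-1.jigdo
    # if a couple of packages 404, finish the image against an rsync mirror
    # that carries the built ISO; rsync's delta transfer patches the gaps:
    rsync -av rsync://some.mirror.example/debian-cd/debian-dvd-1.iso .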

I don't think this tactic is very common, though, as most people seem to have no fucking clue how BitTorrent works. I've seen torrents with gigantic multipart RARs, with an SFV of those. Let's see... so, my torrent software is already checksumming everything, and RAR has a builtin checksum too, or at least, acts like it does (it says "ok" or not) -- and on top of that, there's an SFV checksum (crappy CRC32), too. Never mind that RAR saves you at most a few megabytes (video is already compressed), which, based on the size of these files, you'll spend more time unpacking the RAR than you would downloading the extra couple megs. Or that, once you unpack and throw away the RAR, you can't seed that torrent from the working video. Or that multipart anything is retarded on BitTorrent, as the torrent is splitting it into 512k-4meg chunks anyway.

Whoops, end of rant. Oh, by the way, that wasn't about me, it was about my friend. Wink wink.

Re:!new (5, Informative)

Anonymous Coward | more than 6 years ago | (#23296454)

I don't think this tactic is very common, though, as most people seem to have no fucking clue how BitTorrent works. I've seen torrents with gigantic multipart RARs, with an SFV of those. Let's see... so, my torrent software is already checksumming everything, and RAR has a builtin checksum too, or at least, acts like it does (it says "ok" or not) -- and on top of that, there's an SFV checksum (crappy CRC32), too. Never mind that RAR saves you at most a few megabytes (video is already compressed), which, based on the size of these files, you'll spend more time unpacking the RAR than you would downloading the extra couple megs. Or that, once you unpack and throw away the RAR, you can't seed that torrent from the working video. Or that multipart anything is retarded on BitTorrent, as the torrent is splitting it into 512k-4meg chunks anyway.
People who aren't aware of the full situation often make this complaint. These multipart rar files are "scene releases".

First of all, scene releases are _never_ compressed; it's always done with the -m0 (store) switch, which makes it basically equivalent to the unix split program. If a file is to be compressed, it is done with a zip archive, and the zip archive is placed inside the rar archive. This is because rar archives can be created/extracted easily with FOSS software, but cannot easily be de/compressed. This was more of an issue before Alexander Roshal released source code (note: not FOSS) to decompress rar archives. (A quick store-mode example follows the sixth point below.)

Second, people often have parts of, or complete, scene releases and are unwilling to unrar them (often because they're on an intermediary, like a shell account somewhere where the law isn't a problem).

Third, people follow "the scene" and try and download the exact releases that are chosen by the social customs of the scene (I am not going to detail those here), thus, "breaking up" (ie, altering) the original scene release is seen as rude.

Fourth, the archive splitting is in precise sizes so that fitting the archives onto physical media works better; typically the total size is some rough factor of 698 MB, ~4698 MB or ~8500 MB (CD, single-layer DVD, dual-layer DVD).

Fifth, archives are split due to poor data integrity on some transfer protocols (though this is largely historical nowadays); redownloading a corrupted 14.3mb archive is easier than redownloading a 350mb file.

Sixth, traffic at this scale is measured in terabytes, with some releases being tens, or sometimes hundreds, of gigabytes in size. Thus, there are efficiency arguments for archive splitting: effective use of connections, limited efficiency of software (sftp scales remarkably poorly, though that is beginning to change - not that sftp is used everywhere), use of multiple coordinated machines and so on. This is an incomplete list of reasons; it is almost as though every time a new challenge is presented to the scene, splitting in some way helps to solve it.
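
To make the first point concrete, a store-only multipart archive is created and unpacked roughly like this (switches from memory, file names invented):

    rar a -m0 -v50m release.rar payload.avi    # -m0 = store, no compression; 50 MB volumes
    unrar x release.part1.rar                  # reassembles the payload with no decompression cost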

AC because I'm not stupid enough to expose my knowledge of this either to law enforcement, or to the scene (who might just hand me over for telling you this - it has been done). Suffice to say that this is more complex than you understand, and that even this level of incomplete explanation is rare.

Re:!new (1)

AnarkiNet (976040) | more than 6 years ago | (#23296494)

As far as I know, the reason for torrents composed of multipart RARs is that the "source" of those parts usually comes from sites that allow anonymous users to upload files up to a certain size. So the "true" pirates, the ones that do the actual release to the web of the game/software/whatever, will upload it to a direct hosting site in small chunks. The torrent author is just not smart enough to upload *after* unpacking the RARs.

Stupid Private Trackers, too (1)

feld (980784) | more than 6 years ago | (#23296714)

No, I've seen this as a requirement for a few private trackers. It put me off posting, as I'm not going to waste my time.

Any MD5s on Apple's page? (3, Interesting)

CSMatt (1175471) | more than 6 years ago | (#23295486)

Are there even MD5 hashes on Apple's download pages for such large files? Judging by how the article was written and the lack of hashes on the QuickTime and iTunes download sites, it doesn't seem like they even bother.

Re:Any MD5s on Apple's page? (1)

wizardforce (1005805) | more than 6 years ago | (#23295496)

I would think that the only reasonable thing to do would be to provide md5s; they are such large files, after all, and they could be corrupted.

Re:Any MD5s on Apple's page? (3, Informative)

Anonymous Coward | more than 6 years ago | (#23295762)

Yes, there are- though most of the latest ones are SHA-1 digests. They're not usually seen in the "public front page" download areas and aren't universal, but are generally present for the downloads for updates and security patches through links from the tech literature and developer sections.

Hardware Failure is your bigger concern (4, Interesting)

Bazar (778572) | more than 6 years ago | (#23295508)

One should be more concerned as to why your files are becoming corrupted.

I'd say its a safe bet that the files from apple.com are in perfect condition.

Which means it either became corrupted in transit to, or on arrival to your machine.

Which raises the question: is your memory defective?
Run memtest86 to check your memory:
http://www.memtest86.com/ [memtest86.com]

Check if your Harddrives have SMART and are reporting anything. A disk checker would also be a good idea.
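
On Linux, smartmontools makes the SMART check a one-liner (the device name is an example):

    smartctl -H /dev/sda    # overall health verdict
    smartctl -a /dev/sda    # full attribute dump: reallocated sectors, pending sectors, CRC errors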

The other idea that springs to mind is if you're behind some proxy with the above problems, although I doubt anyone would want to proxy a 1.5 gig file.

Fact is, if files are being corrupted on your disk, it's just a matter of time before something more important is hit by corruption.

Re:Hardware Failure is your bigger concern (5, Interesting)

Anonymous Coward | more than 6 years ago | (#23295590)

Could also be one's router.

There was a problem with D-Link routers back in the day that hit a lot of P2P users. If you placed your machine in the DMZ, the router basically did a search and replace on all packets, replacing the bitstring representing the global address with the bitstring representing the local address. On large files this didn't just hit the IP header, but the data as well, corrupting it. If you didn't use the DMZ functionality, just port mapping, it worked fine; so if you were using BitTorrent, you'd get repeated hash fails on some parts that would never fix themselves, because BitTorrent has no capability to work around that (as opposed to eMule's extensions).

Re:Hardware Failure is your bigger concern (1)

log0n (18224) | more than 6 years ago | (#23295820)

As per the topic, Bittorrent fixed the problems - didn't cause them - so a failing router is not likely the problem. 99% likely it's bad ram.

Re:Hardware Failure is your bigger concern (0)

Anonymous Coward | more than 6 years ago | (#23295908)

What about the possibility of bad RAM in the router? :P

Re:Hardware Failure is your bigger concern (3, Informative)

BobPaul (710574) | more than 6 years ago | (#23295942)

As per the topic, Bittorrent fixed the problems - didn't cause them - so a failing router is not likely the problem.
You misunderstood his comment; please read it again. In his story, BitTorrent didn't cause any problem either--it identified a problem using the same mechanism (hash checks of file pieces) that solved the problem in the OP.

While I agree that bad RAM is most likely the issue, it's still possible that bad RAM in a router, or something goofy going on in a router such as the firmware bug described, could have caused problems. The bits were mangled before they were written to the disk. They could have been mangled by anything that processed those bits as they traversed from Apple's website to his HD, including Apple's website and the HD itself. That embedded devices tend to be more reliable does not mean they don't break and do weird things sometimes.

Re:Hardware Failure is your bigger concern (0)

Anonymous Coward | more than 6 years ago | (#23295738)

Probably not the case. I too got a bad download, on two of three different machines, but all are in perfect hardware condition. More likely that Apple has multiple servers with different copies of the same disk image, one (or more) of which is corrupt, and you just happened to get that exact image. After a few more tries, I got an image that worked perfectly.

Re:Hardware Failure is your bigger concern (0)

Anonymous Coward | more than 6 years ago | (#23295766)

Agreed -- it is probably not the poster's issue. For the first iPhone SDK beta it took me three attempts at download to get a valid disk image.

Re:Hardware Failure is your bigger concern (5, Interesting)

cheesybagel (670288) | more than 6 years ago | (#23295898)

Maybe, maybe not.

IIRC TCP/IP only guarantees a maximum error rate on the order of 10^-5 per bit. Well, the thing is, 1.5 gigabytes is over 10^10 bits in length. So even at such an error rate, it is not guaranteed that your file will arrive without bit errors.

Re:Hardware Failure is your bigger concern (1)

SnEptUne (1264814) | more than 6 years ago | (#23296528)

As far as I know, TCP/IP does not guarantee an error rate; it is all statistical. It isn't unusual for a large file to get corrupted over a network cable, especially a crappy one, such as those in a dial-up connection. The protocol just isn't made to handle high noise.

Re:Hardware Failure is your bigger concern (1)

icebike (68054) | more than 6 years ago | (#23295912)

One should be more concerned as to why your files are becoming corrupted.

I'd say its a safe bet that the files from apple.com are in perfect condition.

Which means it either became corrupted in transit to, or on arrival to your machine.

Which raises the question:
Is the OP, or any of the seeders, on Comcast?

Re:Hardware Failure is your bigger concern (1)

rastoboy29 (807168) | more than 6 years ago | (#23296098)

I have certain people who play my game who simply _cannot_ download from my website--although it works great for me and most others.

I generally suspect malware on their clients, but I don't know for sure and it has long baffled me, because it is not rare at all.  Something like 40% or so.  Surely the malware problem is not so bad that 40% of net users can't download a 130 MB file via http without corruption?

Re:Hardware Failure is your bigger concern (0)

Anonymous Coward | more than 6 years ago | (#23296474)

It is more likely due to data being corrupted in transit during the download, especially since many non-Western countries use satellite links (which are lower quality as far as data transmission goes) instead of fiber optics.

This may be off topic:

ISPs can also use more bandwidth illegally, i.e. without paying the government or regulating authority for it; it is easier to cheat or hide their usage via a satellite link. Although I doubt they do that in India, I can tell you that in Lebanon it is difficult to get a decent connection from the ISPs for VoIP or online FPS games. Latency via satellite is ~650 ms vs. ~150 ms via fiber. It is a good thing that the government has its own ISP (although more expensive), since the government uses nothing but fiber optics.

Re:Hardware Failure is your bigger concern (0)

Anonymous Coward | more than 6 years ago | (#23296550)

It could also be that it is being downloaded in a normal web browser. I have found that Firefox and IE corrupt stuff all the time. Actually, it is usually the result of a misconfigured web server not setting the MIME type correctly (e.g. for binaries).

This is why I use wget or something similar when downloading really huge files. This also has the advantage of making restarting and continuing the download easier if it dies part of the way through.
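
e.g. (the URL is a placeholder):

    wget -c http://example.com/big.iso    # -c resumes a partial download instead of starting over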

the new dr. who sucks... (0, Flamebait)

jjeffries (17675) | more than 6 years ago | (#23295548)

But Torchwood is usually pretty good, imho.

Re:the new dr. who sucks... (0)

Anonymous Coward | more than 6 years ago | (#23295676)

You must be gay. Most of the new torchwood is guys kissing guys or girls kissing girls (not that I have a problem with the 2nd part).

And the new Dr Who is hit and miss, but there are a lot of really good episodes.

Re:the new dr. who sucks... (1)

sp332 (781207) | more than 6 years ago | (#23295718)

Are you kidding? Sure, some episodes are slow or don't really work, but the second episode of the first "series" (that's "season" in the US) of the new Dr. Who is in my top five favorite sci-fi TV episodes of all time, including all the Star Treks and Babylon 5.

Re:the new dr. who sucks... (1)

mrmeval (662166) | more than 6 years ago | (#23295922)

The first season is written by a schizophrenic that likes lesbian porn. Too many contradictory episodes and only two good ones. They need to get rid of the immortal, the generic asian chick and the geek, do something about the cop that doesn't know how to use a real gun and put a spine in her and flush the traitor down a toilet as organic residue. They really need to drown some of the writers.

They could be the first series to kill off all but one of their franchise characters and several of the support crew!

Re:the new dr. who sucks... (2, Informative)

zippthorne (748122) | more than 6 years ago | (#23296096)

To be fair, very few British cops know how to use guns. At least, if the gun control advocates on my side of the pond can be believed.

Re:the new dr. who sucks... (1)

1u3hr (530656) | more than 6 years ago | (#23296242)

The first season is written by a schizophrenic that likes lesbian porn. Too many contradictory episodes and only two good ones. They need to get rid of the immortal, the generic asian chick and the geek, do something about the cop that doesn't know how to use a real gun and put a spine in her and flush the traitor down a toilet as organic residue. They really need to drown some of the writers.

Obviously you didn't see the finale of series 2. Two of your wishes were fulfilled.

Torchwood is pretty silly (especially in supernatural episodes with ghosts, Death Incarnate, zombies), but still watchable.

Re:the new dr. who sucks... (1)

Kjella (173770) | more than 6 years ago | (#23296164)

Oh please, some of the episodes, like the whole John Hart ones, are just incredibly poor. And they have the most absurd sexual relationships ever, and I don't mean the gay thing. They go from deep kissing to completely psychotic let's-kill-each-other mode in two seconds flat. And then they just let the guy go who almost killed the crew, and would have killed you were you not immortal. I much prefer the "normal" characters over the man himself; the less of him the better. I'd like a good action/sci-fi/CSI flick, which it is at times, but Jack seems to be the "particle of the day" of the series.

Doctor Who is good fun, light entertainment. It's a guy flying around in a blue police box and you're not supposed to take it so seriously, particularly since there are time paradoxes cropping up all over the place. And I think these series show why I'm soooooooooo glad Star Trek didn't go with a time agency series, and why they should have kept it out of Enterprise too. You go from self-healing time to self-destroying time to being prevented from certain events to changing much bigger events to anti-time to changes that ripple through time slowly/quickly/not at all, and you'll never be self-consistent.

It's fun for the odd episode but destroys the whole logic. For example, now at the end of Stargate Atlantis, Doctor McKay returned Colonel Sheppard from the future - and only sent with him the location of where to find Teyla. WTF? He could have given him 25 years of science and technology, all wrapped up on the data crystal (the same kind that contains, for example, the entire replicator code...). Why didn't he? Because he "can't" use that solution. It's like the convenient non-interference with the timeline in Star Trek. It allows you to actually do a little time travel without creating so many issues.

Been using bittorrent and rsync for this for years (5, Informative)

DiSKiLLeR (17651) | more than 6 years ago | (#23295552)

I've used bittorrent for this purpose many times in years gone by.

Especially with our slow links, or worse yet, on dialup (if I go enough years back) in Australia.

Before BitTorrent I would use rsync. That required me to download the large file to a server in the US on a fast connection, then rsync my copy against the server's copy to fix whatever was corrupt in mine.

It works beautifully. :)
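
The repair-over-rsync trick is a one-liner when the mirror speaks the rsync protocol (host and path here are invented):

    rsync -av --inplace rsync://mirror.example.org/isos/debian-dvd.iso ./debian-dvd.iso
    # rsync's delta algorithm re-fetches only the blocks whose rolling checksums differ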

Good for game files too (5, Interesting)

trawg (308495) | more than 6 years ago | (#23295560)

We have been doing this for ages [ausgamers.com] for certain high-demand game files that we mirror. While offering torrents for some of our download mirrors is only mildly useful (as we're in Australia we're trying to keep bandwidth on-shore to cut down international traffic, and BT doesn't really help this), it is extremely helpful for the VAST number of users who appear either to have massively crazy Internet problems or to be simply unable to drive an HTTP-based downloader and resume downloads.

When a large number of users are having problems downloading or resuming a particular file, I simply create a torrent for them, give them some vague instructions about how to resume it, and then generally I never hear from them again. They're happy because they don't have to download a 4 GB game client again from scratch, they don't have to worry about resuming/corrupt downloads, and because it's a torrent it probably feels like they're getting something for free that they shouldn't be.

Moron (-1, Flamebait)

Anonymous Coward | more than 6 years ago | (#23295574)

If you are uncapped at night, then you are uncapped. Personally, I have a monthly cap of 30GB. Night or day, if I download, it counts. You're just too lazy (stupid?) to use your ISP's contract to its full advantage. Your comment is an insult to anyone that's suffering with a real cap. It falls into the same category as the oft-used "bricked" term, which is getting thrown around too often, against its actual meaning.

Re:Moron (0)

Anonymous Coward | more than 6 years ago | (#23296940)

Yeah, if I had a 30GB cap, I'd be just as pissed off and irritable as you are. But I don't. (Nelson voice): Ha ha!

This helps fill in pieces from Newsgroups. (1)

NFN_NLN (633283) | more than 6 years ago | (#23295646)

Big deal, I do this all the time. It also helps when you're downloading files via torrent and supplementing with pieces from the newsgroups. This combination works well because newsgroups often have RAR'd binaries that are missing files. Find a similar package available on a torrent site and fill in the missing files. Hell, you can start the torrent first and do a Force Check as you add each piece. Why not just download the whole thing via torrent then? Well, NNTP is local and much faster... Had I known this was worthy of a Slashdot submission I would have done it all a long time ago.

Or synchronize with yourself... (5, Interesting)

greerga (2924) | more than 6 years ago | (#23295658)

For even more fun, if you have two differently-corrupted copies of a file and a torrent to go with it, then you can have BitTorrent stitch them together into a valid file without involving any third parties.

I used Azureus's internal tracker ability and two computers on a local network with the torrent modified to track on one of the machines, and one corrupted copy of the file on each.

Obviously only works if they don't have corruption in common, but it also doesn't require the original torrent file tracker to work anymore.

Re:Or synchronize with yourself... (1)

Saberwind (50430) | more than 6 years ago | (#23295936)

Just yesterday I wrote a simple program to consolidate two partially-downloaded copies of a file that existed in two dying torrents (no complete copies existed in circulation), based on the premise that the files had runs of zeroes wherever blocks hadn't been downloaded. The result was a more complete copy, which I was then able to redistribute.
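
Not the parent's actual program, but a rough sketch of the same idea in shell, assuming 1 MiB pieces and that un-downloaded pieces really are runs of zero bytes:

    A=copyA.bin; B=copyB.bin; OUT=merged.bin; BS=1048576
    SIZE=$(wc -c < "$A")
    PIECES=$(( (SIZE + BS - 1) / BS ))
    cp "$A" "$OUT"
    i=0
    while [ "$i" -lt "$PIECES" ]; do
        # if this piece of copy A contains nothing but zero bytes, take it from copy B
        if [ "$(dd if="$A" bs=$BS skip=$i count=1 2>/dev/null | tr -d '\0' | wc -c)" -eq 0 ]; then
            dd if="$B" of="$OUT" bs=$BS skip=$i seek=$i count=1 conv=notrunc 2>/dev/null
        fi
        i=$((i + 1))
    done
    # then load the original .torrent against merged.bin and force a re-check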

Re:Or synchronize with yourself... (1)

BobPaul (710574) | more than 6 years ago | (#23295984)

but it also doesn't require the original torrent file tracker to work anymore
More importantly, this can be done without access to the Internet. A tracker is already made unnecessary by DHT; it's just that on the Internet it sometimes takes a while for Azureus to find someone else with the file through DHT.

What a novel idea!!! (2, Interesting)

WarJolt (990309) | more than 6 years ago | (#23295706)

Using BitTorrent for its actual, legal, intended use. I love it!!!

I'm not a lawyer though. I just hope it doesn't violate apples NDA. Please please please follow the rules. Don't want to see you in prison or slapped with a large fine.

BitTorrent has received a bad reputation because of pirates, but there are legitimate uses. I do believe that Doctor Who episodes aren't public domain, though, so shame on you for that. Might want to be careful what you admit to on /.

Re:What a novel idea!!! (1)

Anonymous Freak (16973) | more than 6 years ago | (#23295848)

I have never used BitTorrent to download anything that I did not already have the legal right to possess. That is basically what the OP did. He had a legal right to possess the file, just the technical inability to get it officially.

Most of my BitTorrenting is "100% legal" (Linux distros, UBCD, publicly released media (new NIN, Star Trek fan films,) etc.) Those I make sure to seed to at least a 2:1 ratio, often more. But some of it is to download items that I have the legal right to have, but do not have ready access to. (My closed-source OS install CD when I'm not at home, for example.) Those I make sure to stay as close to the line of legality as possible by, unfortunately, seeding as little as possible.

Re:What a novel idea!!! (0)

Anonymous Coward | more than 6 years ago | (#23297000)

I just hope it doesn't violate apples NDA. Please please please follow the rules.
zomg someone might violate copyrights on the intarwebs? that's unpossible!

p2p is here to stay regardless of whether anyone "follows the rules" or not. if at first you don't succeed just torrent the bitch and get it over with. why waste your time(/money/bandwidth) doing anything else?

Here's one for you (0, Flamebait)

gEvil (beta) (945888) | more than 6 years ago | (#23295750)

Hey guys, check this out! I just found out that you can send emails to multiple people AT THE SAME TIME by putting a comma between their email addresses! Pretty cool, huh?

Invites to Demonoid (-1, Offtopic)

cyberzephyr (705742) | more than 6 years ago | (#23295776)

Do you have any invites to demonoid? I would appreciate one if you do.

Re:Invites to Demonoid (0)

Anonymous Coward | more than 6 years ago | (#23295918)

Just for the sake of chaos, here are 5:
lcw82wrfd7vxyf6iuzq2i6l3kyrbos1lyykdd1bjxq9v5
6xeeoo52jhmaijrjodvhehaeqn3w70keuwxvajby
ky8cswluxe0jh2km2rw5tbpc37agdnogk32bq5r98
mfqtowgp6l2gial5leeardj1hw91lv9mey2rgc0s
xdnlqkijbc4fu105hil2jql3g8h9ri61uvtw3g
http://www.demonoid.com/register.php?with_invite=1 [demonoid.com]

Re:Invites to Demonoid (0)

Anonymous Coward | more than 6 years ago | (#23296006)

Thanks!

Re:Invites to Demonoid (1)

cyberzephyr (705742) | more than 6 years ago | (#23296032)

Thanks a bunch. FYI /. users, i took the first one the other 4 are free!

Re:Invites to Demonoid (0)

Anonymous Coward | more than 6 years ago | (#23296066)

Bottom 4 are already used. Didn't need to try the first one. Get it while it's hot!

Re:Invites to Demonoid (0)

Anonymous Coward | more than 6 years ago | (#23296230)

Okay, to celebrate Demonoid's re-opening, here's some more:

f5mptgleeecic81ppcn2hzkjugrg4b4sdglrwe4e
enz2yuz1gsv17mpetil8ltsmq1e17cbtw11fc9uvoa
p6gzu1iguz4o0aep93l5h1jujwt13pg5q9wy5
5ubabvxlkj4z8jmr0iu8kreil7xcf7jkp2ia2252442
yubtly2w8ghvae5839faz5mmancawheh0vgf70merdm

Re:Invites to Demonoid (1)

Kredal (566494) | more than 6 years ago | (#23296390)

Thanks! The top code worked for me, and the bottom one was already used... (:

Re:Invites to Demonoid (1)

smart.id (264791) | more than 6 years ago | (#23296630)

Got the second one, thanks.

Anyone know how to do this with USENET binaries? (0)

Anonymous Coward | more than 6 years ago | (#23295874)

I can have 60% of a file downloaded but have BitTorrent only see 10%, I'm guessing because it's missing an article somewhere around there. Any client that zeroes missing file parts instead of simply not writing them? Is that possible? What about with par2 files?

Same for Rsync (0)

Anonymous Coward | more than 6 years ago | (#23295902)

I used to buy Debian CD's from a Linux shop in Sydney Australia. A few times I'd get badly burned CDR's from them, so I would take an image of the bad discs with dd, then rsync that image to fix it and then burn the fixed image.

Worked perfectly every time. I'd rather use BitTorrent for that though. Probably be quicker.

simpler home-brew technique (1)

v1 (525388) | more than 6 years ago | (#23296048)

I wrote this bash script [vftp.net] to do basically the same thing. It uses openssl (built into most Unixes, and OS X specifically) to create 1 MB check files, basically the same as torrent files. Follow the instructions and it's easy to fix a corrupt download from someone who has a good copy, with the minimum required data transfer. The person with the bad file runs option 1 to make the check file and sends that to the person with the good file. They run option 2 which identifies bad chunks and exports them, which they send back to the first person. Run option 3 and the exports are patched into their download and it's fixed.
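
The script itself is at the link above; the core idea, in a few lines of the same flavour (a sketch, not the actual script - 1 MB chunks, openssl for the digests):

    FILE=download.dmg; BS=1048576
    SIZE=$(wc -c < "$FILE")
    CHUNKS=$(( (SIZE + BS - 1) / BS ))
    i=0
    while [ "$i" -lt "$CHUNKS" ]; do
        dd if="$FILE" bs=$BS skip=$i count=1 2>/dev/null \
            | openssl dgst -md5 | awk -v n="$i" '{print n, $NF}'
        i=$((i + 1))
    done > chunks.md5
    # the side with the good copy runs the same loop, diffs the two chunk lists,
    # and dd's out only the chunks whose digests differ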

Last time I used it, we repaired a 3.8 GB transfer by exchanging 11 MB of data. (The transfer had been resumed multiple times and apparently one of the transfers glitched its offset or something.)

This is easier than BT because using BT can have a bit of a learning curve for seeding. Beta but appears stable. Feedback encouraged.

Re:simpler home-brew technique (2, Insightful)

Just Some Guy (3352) | more than 6 years ago | (#23296236)

The person with the bad file runs option 1 to make the check file and sends that to the person with the good file. They run option 2 which identifies bad chunks and exports them, which they send back to the first person. Run option 3 and the exports are patched into their download and it's fixed.

Isn't that almost exactly how rsync works?

I hope you verified the file (1)

cerelib (903469) | more than 6 years ago | (#23296246)

This is not really anything BitTorrent specific, but good use of available tools. However, I hope you then checksum verified the completed file with an MD5 from Apple or somebody who has downloaded directly from them. While you probably weren't a target of an attack, you did download software from an unknown source. An attacker could download the SDK, insert malicious code, compute a new set of MD5 sums for the torrent file, upload to pirate bay or some tracker, and then seed the torrent expecting that nobody will attempt an external verification.

CRC (1)

the brown guy (1235418) | more than 6 years ago | (#23296300)

I had a shitty old hard drive that was failing CRC (cyclic redundancy checks). The file I had downloaded was 4 gigs and there were a few corrupt pieces, but by copying it to another hard drive and replacing just the corrupt pieces, I saved myself a shitload of bandwidth.

What asshole tagged this '!news'? (1)

sootman (158191) | more than 6 years ago | (#23296516)

OK, maybe not tonight-at-eleven news, but this is a totally clever hack, which is exactly what many people on Slashdot live for.

On a related note, I came up with a roundabout way to do something similar to help a friend who was having trouble moving large files. On the remote end, split [hmug.org] the file into small chunks. Then md5 [hmug.org] them all and save those results into a text file. Then, ftp them, and when they arrive, md5 them all again and compare your values to what's in the text file. If any don't match, re-download them; else cat [hmug.org] them all together and you should be good.
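
In command form, that recipe is roughly the following (chunk size arbitrary; on OS X substitute md5 for md5sum):

    # remote end
    split -b 50m bigfile.iso chunk_
    md5sum chunk_* > chunks.md5
    # local end, after ftp'ing chunks.md5 and the chunk_* files over
    md5sum -c chunks.md5        # re-fetch any chunk that fails
    cat chunk_* > bigfile.iso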

I don't think this would have worked for the submitter, even if he knew someone with a known-good copy of the file, because I imagine these things work linearly, so if the bad part of the file was at the halfway mark, every chunk after that would have the wrong checksum. His method was very, very clever.

Torrent Distribution Network - Results: Awesome (4, Interesting)

erexx23 (935832) | more than 6 years ago | (#23296652)

I have been using Torrents for this very reason.

I was being required to copy sometimes 10-20 GB of virtual machine image files from server to PC, or PC to PC, on up to 40 machines at one time.
This was taking way too long and copies were not perfect.
Restoration of VM images presented the same problem.
Updating a VM meant redistribution of the entire file to all machines again.

Using µTorrent and my own tracker changed all that.

I came up with the following solution using all available resources.
First I started by copying all images to a separate partition on the workstations (about 200 GB of VMs).
Then I created my own internal tracker and web page to host the torrents (a rough sketch of that step follows the results list below).

The results were:
1. Extremely efficient use of all available network hard drive space.
2. Utilizes every machine on the network to distribute the files.
3. Works extremely well restoring or redistributing the VMs to any one machine or several machines at once. (The more the better.)
4. 100% accuracy in distribution.
5. The ability to quickly modify any one image on any machine, recreate the torrent (hash), and then update that image across hundreds of machines very quickly.
In other words, modifying a file means the machines only have to download the bits that changed, not the whole image again.
6. With µTorrent any machine can be used as the tracker.
7. The tracker is also the "master" file server; however, any machine can be used to modify and upload a change.
Just recreate and re-upload the new torrent, replacing the old one. Remember that a torrent file-serving network is not a server-centric file-sharing system.
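
The per-image torrent creation can be scripted too; with the mktorrent command-line tool (not what the parent used - they used µTorrent's built-in creator) it's one line per image, with an invented internal tracker URL:

    mktorrent -a http://tracker.internal.lan:6969/announce -o winxp-base.torrent winxp-base.vmdk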

In the old days before BitTorrent (1)

Orion Blastar (457579) | more than 6 years ago | (#23296660)

I used to download Linux ISO files directly from FTP or web sites.

Nothing upset me more than downloading an ISO only to find out that after I burned it to CD/DVD, it had CRC errors and random lockups during an install.

After BitTorrent with error correcting, the problem was solved. It works for other things as well.

Commercial software companies can offer ISO downloads via BitTorrent trackers and send the install CD Key via email. That way customers just burn the CD/DVD and install the key they got in email.

Same thing with media files: download via BitTorrent, then enter an unlock key you get via email when you buy it.

Businesses are stupid if they ignore the benefits of BitTorrent.

Even piracy doesn't hurt that much as most people want to try the software before they buy it. It is like kicking the tires before buying a car and taking it out for a test drive before signing the papers to buy it.

Fix up jigdo file (1)

YoungHack (36385) | more than 6 years ago | (#23296854)

I've used it to finish up the last 3% of a jigdo build when I was missing a file or two. Worked great.

How does this sit with RIAA? (1)

Fluffeh (1273756) | more than 6 years ago | (#23296922)

I wonder if you could legitimately argue that you were verifying the data in a personal backup of media that you had?

Unless I am mistaken, it is perfectly legal to make a backup of data that you own right? So, if you already own an item, would downloading it to have a backup be a legal thing to do?

And if that's the case, I wonder what the legal implications are in cases where the RIAA comes down on people who have been "participating in file sharing" activities.

Another nice tool for this: rsync (1)

swillden (191260) | more than 6 years ago | (#23296932)

Assuming you can find a source that serves a known-good file via rsync, it's a very efficient way to fix up a damaged copy.

I once had to download a CD image over a dialup connection when I was at a client site in Mexico. I did the initial download via FTP, but it got corrupted and the MD5 sum didn't match the correct value. It had taken almost two full days to download the first time (over a weekend, so shipping a CD wouldn't have been faster), but rsync was able to find and correct the corrupted sections in less than five minutes.

Rsync is also an unbeatable tool for making incremental backups. I use it (rather, I use rdiff-backup, which uses rsync) to back up a server with almost 30 GiB of data, nightly, over a standard cable modem connection. Last night's, for example, took 57 minutes to run, found 527 changed files totaling 1.36 GiB of 26.2 GiB total. I don't know how much it actually downloaded, but I'm sure it was much less than 1.36 GiB.
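
For reference, an rdiff-backup run of that sort is a single command (host and paths invented):

    rdiff-backup /var/www backupuser@offsite.example.com::/backups/www
    # only rsync-style deltas cross the wire; older increments stay restorable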

µTorrent's Relocate feature. (0)

Anonymous Coward | more than 6 years ago | (#23297048)

Latest µTorrent (1.8+) also allows you to point to a file name that is different on your hard drive. You don't need to worry about file names matching up any more if the bits are identical.