
The Many Paths To Data Corruption

Zonk posted more than 6 years ago | from the luke-alan-cox-is-your-guru dept.

Data Storage

Runnin'Scared writes "Linux guru Alan Cox has a writeup on KernelTrap in which he talks about all the possible ways for data to get corrupted when being written to or read from a hard disk drive. Much of the information applies to all operating systems. He prefaces his comments by noting that the details are entirely device specific, then dives right into a fascinating and somewhat disturbing trace of data's path from the drive, through the cable, onto the bus, and into main memory and the CPU cache. He also discusses the transfer of data via TCP and cautions, 'unfortunately lots of high performance people use checksum offload which removes much of the end to end protection and leads to problems with iffy cards and the like. This is well studied and known to be very problematic but in the market speed sells not correctness.'"


121 comments

Keep your porn on separate physical drives! (5, Funny)

eln (21727) | more than 6 years ago | (#20609727)

The most common way for young data to be corrupted is to be saved on a block that once contained pornographic data. As we all know, deleting data alone is not sufficient, as that will only remove the pointer to the data while leaving the block containing it undisturbed. This allows a young piece of data to easily see the old porn data as it is being written to that block. For this reason, it is imperative that you keep all pornographic data on separate physical drives.

In addition, you should never access young data and pornographic data in the same session, as the young impressionable data may get corrupted by the pornographic data if they exist in RAM at the same time.

Data corruption is a serious problem in computing today, and it is imperative that we take steps to stop our young innocent data from being corrupted.

Re:Keep your porn on separate physical drives! (3, Funny)

king-manic (409855) | more than 6 years ago | (#20609859)

In addition, you should never access young data and pornographic data in the same session, as the young impressionable data may get corrupted by the pornographic data if they exist in RAM at the same time.

Indeed, young pornographic data is disturbing. Fortunately, there is a legal social firewall of 18.

Re:Keep your porn on separate physical drives! (1)

rts008 (812749) | more than 6 years ago | (#20613665)

From 'eln:21727' "The most common way for young data to be corrupted is to be saved on a block that once contained pornographic data."

Unfortunately, as to your "Fortunately there is a legal social firewall of 18.", it depends on which block in which city you are cruising as to whether or not they may be at least/over 14, much less 18.

At least that's what a traveling friend of mine told me....honest!

Re:Keep your porn on separate physical drives! (1)

Neanderthal Ninny (1153369) | more than 6 years ago | (#20610265)

Boy, your expert on this subject! I wonder if the FBI is watching you.

Re:Keep your porn on separate physical drives! (1)

dwater (72834) | more than 6 years ago | (#20611929)

> Boy, your expert on this subject!

Who's "Boy", how you know Boy is an expert, and what makes Boy the poster's?

Re:Keep your porn on separate physical drives! (0)

Anonymous Coward | more than 6 years ago | (#20612583)

Me, "Tarzan" --- Him, "Boy" --- That, "Jane"
"Cheeta Bad! --- Cheeta, No make fun Boy"

Ungowa,
Tarzan

Re:Keep your porn on separate physical drives! (1, Insightful)

Anonymous Coward | more than 6 years ago | (#20610733)

Funny, but there's a bit of truth in it too. If data corruption happens in the filesystem, it can cause files to become interlinked or point to "erased" data, which might be a surprise you don't want if you keep porn on the same hard disk as data that is going to be published.

Paul Cylon (3, Funny)

HTH NE1 (675604) | more than 6 years ago | (#20609765)

There must be 50 ways to lose your data.

Re:Paul Cylon (0)

Anonymous Coward | more than 6 years ago | (#20609853)

That's not redundant; you're just upset that he spelled "lose" correctly!

Re:Paul Cylon (1, Insightful)

HTH NE1 (675604) | more than 6 years ago | (#20609909)

Ah well. Perhaps I should have been a bit cleverer and said, "There must be 110010 ways to lose your data."

Re:Paul Cylon (1)

mcpkaaos (449561) | more than 6 years ago | (#20612359)

That or someone is playing fast and lose with their mod points.

benchmarks (4, Insightful)

larien (5608) | more than 6 years ago | (#20609829)

As Alan Cox alluded to, there are benchmarks for data transfers, web performance, etc., but none for data integrity; it's kind of assumed, even if it perhaps shouldn't be. It also reminds me of various cluster software which will happily crash a node rather than risk data corruption (Sun Cluster & Oracle RAC both do this). What do you *really* want? Lightning-fast performance, or the comfort of knowing that your data is intact and correct? For something like a rendering farm, you can probably tolerate a pixel or two being the wrong shade. If you're dealing with money, you want the data to be 100% correct, otherwise there's a world of hurt waiting to happen...

Re:benchmarks (5, Interesting)

dgatwood (11270) | more than 6 years ago | (#20610161)

I've concluded that nobody cares about data integrity. That's sad, I know, but I have yet to see product manufacturers sued into oblivion for building fundamentally defective devices, and that's really what it would take to improve things, IMHO.

My favorite piece of hardware was a chip that was used in a bunch of 5-in-1 and 7-in-1 media card readers about four years ago. It was complete garbage, and only worked correctly on Windows. Mac OS X would use transfer sizes that the chip claimed to support, but the chip returned copies of block 0 instead of the first block in every transaction over a certain size. Linux supposedly also had problems with it. This was while reading, so no data was lost, but a lot of people who checked the "erase pictures after import" button in iPhoto were very unhappy.

Unfortunately, there was nothing the computer could do to catch the problem, as the data was in fact copied in from the device exactly as it presented it, and no amount of verification could determine that there was a problem because it would consistently report the same wrong data.... Fortunately, there are unerase tools available for recovering photos from flash cards. Anyway, I made it a point to periodically look for people posting about that device on message boards and tell them how to work around it by imaging the entire flash card with dd bs=512 until they could buy a new flash card reader.

In the end, I moved to a FireWire reader and I no longer trust USB for anything unless there's no other alternative (iPod, iPhone, and disks attached to an Airport Base Station). While that makes me somewhat more comfortable than dealing with USB, there have been a few nasty issues even with FireWire devices. For example, there was an Oxford 922 firmware bug about three years back that wiped hard drives if a read or write attempt was made after a spindown request timed out or something. I'm not sure about the precise details.

And then, there is the Seagate hard drive that mysteriously will only boot my TiVo about one time out of every twenty (but works flawlessly when attached to a FW/ATA bridge chipset). I don't have an ATA bus analyzer to see what's going on, but it makes me very uncomfortable to see such compatibility problems with supposedly standardized modern drives. And don't get me started on the number of dead hard drives I have lying around....

If my life has taught me anything about technology, it is this: if you really care about data, back it up regularly and frequently, store your backups in another city, ensure that those backups are never all simultaneously in the same place or on the same electrical grid as the original, and never throw away any of the old backups. If it isn't worth the effort to do that, then the data must not really be important.

Re:benchmarks (1)

unitron (5733) | more than 6 years ago | (#20612549)

And then, there is the Seagate hard drive that mysteriously will only boot my TiVo about one time out of every twenty (but works flawlessly when attached to a FW/ATA bridge chipset).

And then there's my 80 Gig Western Digital that was very flaky (as soon as the warranty was up) in BX chipset (or equivalent) motherboard PCs, but I used it to replace the original drive in a Series 1 stand-alone Philips TiVo and it's been working flawlessly in it for about a year now. Before you blame WD, I'm writing this on a BX chipset PC that's been running another WD 80 Gig that's almost identical (it came off the assembly line a few months earlier), and it's been working fine since before I got the newer one that's now in the TiVo. Go figure.

By the way, you aren't the only one running a hard drive cemetery. :-)

Re:benchmarks (1)

IvyKing (732111) | more than 6 years ago | (#20613273)

In the end, I moved to a FireWire reader and I no longer trust USB for anything unless there's no other alternative (iPod, iPhone, and disks attached to an Airport Base Station). While that makes me somewhat more comfortable than dealing with USB, there have been a few nasty issues even with FireWire devices.


I don't recall seeing anything with regard to FireWire vs. USB that would give FireWire an advantage in data integrity (though I may be missing some finer points of the respective specs). OTOH, I have seen specs (one of the LaCie RAID-in-a-box drives) that give a 10 to 20% performance advantage to FW despite the 'lower' peak speeds - one reason is that FW uses separate pairs for transmit and receive.

Re:benchmarks (1)

dgatwood (11270) | more than 6 years ago | (#20617399)

There's no technical reason for FW drives to be more reliable. The limited number of FireWire silicon vendors, however, does mean that each one is likely to get more scrutiny than the much larger number of USB silicon vendors, IMHO.

Re:benchmarks (1)

straybullets (646076) | more than 6 years ago | (#20615705)

then the data must not really be important

Yep, that's it: loads of useless data, produced by a society barely able to perform some relatively weak techno tricks while completely failing to solve basic issues. Something is wrong in this biometric cash-flow production model.

Re:benchmarks (1)

BSAtHome (455370) | more than 6 years ago | (#20610173)

To paraphrase an RFC: Good, Fast, Cheap; pick two, you can't have all three.

End-to-end (4, Informative)

Intron (870560) | more than 6 years ago | (#20609847)

Some enterprise server systems use end-to-end protection, meaning the data block is longer. If you write 512 bytes of data + 12 bytes or so of check data and carry that through all of the layers, it can prevent the data corruption from going undiscovered. The check data usually includes the block's address, so that data written with correct CRC but in the wrong place will also be discovered. It is bad enough to have data corrupted by a hardware failure, much worse not to detect it.
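For illustration, here is a minimal Python sketch of the scheme described above: each 512-byte block carries a small tag covering both its contents and its intended address, so a block written to (or read from) the wrong place fails verification even though its CRC is internally consistent. The 8-byte tag layout is invented for the example; real schemes such as T10 DIF differ in the details.

    import struct
    import zlib

    SECTOR_SIZE = 512

    def protect(data: bytes, lba: int) -> bytes:
        """Append a hypothetical 8-byte tag: CRC32 of the data plus the 32-bit block address."""
        assert len(data) == SECTOR_SIZE
        tag = struct.pack(">II", zlib.crc32(data), lba)
        return data + tag

    def verify(block: bytes, expected_lba: int) -> bytes:
        data, tag = block[:SECTOR_SIZE], block[SECTOR_SIZE:]
        crc, lba = struct.unpack(">II", tag)
        if zlib.crc32(data) != crc:
            raise IOError("data corrupted in flight or at rest")
        if lba != expected_lba:
            raise IOError("block written to or read from the wrong address")
        return data

    # A block that lands at the wrong LBA is caught even though its CRC is fine.
    stored = protect(b"\x42" * SECTOR_SIZE, lba=1000)
    verify(stored, expected_lba=1000)        # passes
    try:
        verify(stored, expected_lba=1064)    # misplaced write/read
    except IOError as err:
        print(err)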

Iffy cards? Try crappy drivers (-1, Troll)

timecop (16217) | more than 6 years ago | (#20609851)

What "iffy cards" support TCP checksum offloading? You're not going to find that feature on a typical Realtek nic that's in a typicalslashdotuser's Linux box.
I think what the poster meant is there are dodgy Linux drivers for the said hardware, and quote, "for drivers/ide there are *lots* of problems with error handling", so to me it seems the problem is quite Linux specific and doesn't affect the rest of us who use a proper OS. I have no problem using Intel server GbE nics w/checksum offloading in Windows 2003 Server R2.

Re:Iffy cards? Try crappy drivers (0)

Anonymous Coward | more than 6 years ago | (#20610105)

Your whole post makes no sense.
  1. AC was specifically talking about Ethernet checksums.
  2. A "typical Slashdot user's" Linux box doesn't process "datasets of several Petabytes" looking for data corruption.
  3. Which poster are you talking about? The words came straight from Alan Cox.
  4. Do you have any problems using said card with an alternate OS?

Re:Iffy cards? Try crappy drivers (1)

cnettel (836611) | more than 6 years ago | (#20614521)

Simple: the bus can be faulty (or the NIC-to-bus connection). The memory can go bad. If you do checksum offloading, you only verify the integrity at the point where the data enters your machine. If you move the check to the CPU, the stretch of the path where data can be corrupted without detection is shorter.

Hello ZFS (4, Informative)

Wesley Felter (138342) | more than 6 years ago | (#20609971)

ZFS's end-to-end checksums detect many of these types of corruption; as long as ZFS itself, the CPU, and RAM are working correctly, no other errors can corrupt ZFS data.

I am looking forward to the day when all RAM has ECC and all filesystems have checksums.
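For readers unfamiliar with the idea, a rough Python sketch (a toy model, not ZFS's actual on-disk format) of why keeping the checksum with the *pointer* to a block, rather than beside the block itself, lets a filesystem both detect corruption and heal it from a mirror copy:

    import hashlib

    def sha256(data: bytes) -> bytes:
        return hashlib.sha256(data).digest()

    class MirroredStore:
        """Toy model: two 'disks' hold copies of each block; the expected
        checksum lives with the reference to the block, not next to the data."""
        def __init__(self):
            self.disks = [{}, {}]

        def write(self, addr: int, data: bytes) -> bytes:
            for disk in self.disks:
                disk[addr] = data
            return sha256(data)          # the caller stores this in the parent block

        def read(self, addr: int, expected: bytes) -> bytes:
            for disk in self.disks:
                data = disk[addr]
                if sha256(data) == expected:
                    for other in self.disks:
                        other[addr] = data   # "self-heal" any bad copies
                    return data
            raise IOError(f"block {addr}: all copies fail their checksum")

    store = MirroredStore()
    csum = store.write(7, b"important bytes")
    store.disks[0][7] = b"important bytfs"   # silent corruption of one copy
    assert store.read(7, csum) == b"important bytes"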

No (3, Interesting)

ElMiguel (117685) | more than 6 years ago | (#20610171)

as long as ZFS itself, the CPU, and RAM are working correctly, no other errors can corrupt ZFS data.

Sorry, but that is absurd. Nothing can absolutely protect against data errors (even if they only happen in the hard disk). For example, errors can corrupt ZFS data in a way that turns out to have the same checksum. Or errors can corrupt both the data and the checksum so they match each other.

This is ECC 101 really.

Re:No (1)

Wesley Felter (138342) | more than 6 years ago | (#20610241)

For example, errors can corrupt ZFS data in a way that turns out to have the same checksum. Or errors can corrupt both the data and the checksum so they match each other.

You can use SHA as the checksum algorithm; the chance of undetected corruption is infinitesimal.

Re:No (5, Funny)

Slashcrap (869349) | more than 6 years ago | (#20610501)

Or errors can corrupt both the data and the checksum so they match each other.

This is about as likely as simultaneously winning every current national and regional lottery on the planet. And then doing it again next week.

And if we're talking about a 512 bit hash then it's possible that a new planet full of lotteries will spontaneously emerge from the quantum vacuum. And you'll win all those too.

Re:No (2, Funny)

TruthfulLiar (927336) | more than 6 years ago | (#20613459)

> And if we're talking about a 512 bit hash then it's possible that a new planet full of lotteries will spontaneously emerge from the quantum vacuum. And you'll win all those too.

If this happens, be sure to keep the money from the quantum vacuum lotteries in a separate account, or it will annihilate with your real money.

Re:No (1)

StarfishOne (756076) | more than 6 years ago | (#20614377)

I'm amazed that none of you have ever heard of the Girlfriend Money experiment: when a girlfriend (especially your own) looks at a certain amount of money, she'll cause the collapse of the money's superposition. This _always_ results in the money disappearing both completely and instantaneously. ;o

Girlfriend? (0)

Anonymous Coward | more than 6 years ago | (#20614815)

You must be new here.

Re:Girlfriend? (2, Funny)

StarfishOne (756076) | more than 6 years ago | (#20615405)

No, girlfriend waveforms can collapse in such a way that one can actually have one. This may not happen often when combined with the /. waveform.. but every now and then it does happen. ;)

Re:No (0)

Anonymous Coward | more than 6 years ago | (#20610325)

Um... do you have a reference for that type of corruption happening in ZFS? First, the checksums aren't kept with the data; second, if that happens, the data is regenerated from parity information. ZFS is self-healing. And by parity information, I mean the RAID parity information. So even if data gets corrupted, it is repaired.

Re:No (0)

Anonymous Coward | more than 6 years ago | (#20610871)

A checksum is very small compared to the data. Even an ideally distributed 128-bit (= 16-byte) checksum matches 2^3968 different contents of a 512-byte block of data. Change the block to one of those contents and the error won't be detected. The only thing which stands between you and that kind of error is the very small likelihood of it: only 1 in 2^128 random contents match the checksum. Suppose a block had been read a thousand times per second since the beginning of time, returning different random data every time; you would still not expect even one of those reads to have slipped past a proper 128-bit checksum. That's good enough for most applications.
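Putting rough numbers on that (the reads-per-second figure and the age of the universe are the parent's hypothetical, and the checksum is assumed to be ideal):

    # Ideal 128-bit checksum over a 512-byte (4096-bit) block:
    # 2**(4096 - 128) distinct block contents share any given checksum value,
    # but a random corruption has only a 1-in-2**128 chance of hitting one.
    p_undetected = 1 / 2**128

    # A thousand reads per second for ~13.8 billion years.
    reads = 1000 * 13_800_000_000 * 365 * 24 * 3600

    print(f"expected undetected errors: {p_undetected * reads:.1e}")
    # -> about 1e-18, i.e. you should not expect even one.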

Re:Hello ZFS (3, Informative)

harrkev (623093) | more than 6 years ago | (#20610409)

I am looking forward to the day when all RAM has ECC and all filesystems have checksums.
Not gonna happen. The problem is that ECC memory costs more, simply because there is 12.5% more memory. Most people are going to go as cheap as possible.

But, ECC is available. If it is important to you, pay for it.

Re:Hello ZFS (1)

Wesley Felter (138342) | more than 6 years ago | (#20610493)

Intel or AMD could force ECC adoption if they wanted to; the increase in cost would be easily hidden by Moore's Law.

Re:Hello ZFS (1)

drsmithy (35869) | more than 6 years ago | (#20612135)

Not gonna happen. The problem is that ECC memory costs more, simply because there is 12.5% more memory. Most people are going to go for as cheap as possible.

It'll happen for the same reason RAID5 on certain types of arrays will be obsolete in 4-5 years. Eventually memory sizes are going to get so big that the statistical probability of a memory error will effectively guarantee they happen too frequently to ignore.

Re:Hello ZFS (1)

QuoteMstr (55051) | more than 6 years ago | (#20612849)

Obsolete? What would you replace it with then?

Re:Hello ZFS (2, Interesting)

drsmithy (35869) | more than 6 years ago | (#20613101)

Obsolete? What would you replace it with then?

RAID6. Then a while after that, "RAID7" (or whatever they call triple-parity).

In ca. 4-5 years[0], the combination of big drives (2TB+) and raw read error rates (roughly one error every 12TB or so) will mean that during a rebuild of 6+ disk RAID5 arrays after a single drive failure, a second "drive failure" (probably just a single bad sector, but the end result is basically the same) will be - statistically speaking - pretty much guaranteed. RAID5 will be obsolete because it won't protect you from array failures (because every single-disk failure will become a double-disk failure). RAID6 will only give you the same protection as RAID5 today (because you will be vulnerable to a third drive failing during the rebuild in addition to the second) and "RAID7" will be needed to protect you from "triple disk failures".

On a more positive note, with current error rates, RAID10 should last until ca. 10TB drives before SATA array elements have to be "triple mirrored" (although this is far enough down the track that I expect the basic assumptions here to have changed). "Enterprise" hardware also has (much) longer to go, because the read error rate is better and the drives are typically (much) smaller.

(Even today, IMHO, anyone using drives bigger than 250G in 6+ disk arrays without either RAID6 or RAID10 is crazy.)

[0]This is actually being pretty generous. It's certain we'll see 2TB drives well before then, but I'm taking a timeframe where they will be "common" rather than "high end".
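The argument above is easy to sanity-check. Assuming roughly one unrecoverable read error per 12 TB read (about the commonly quoted 1-in-10^14-bits figure) and a rebuild that has to read every surviving disk in full, a rough sketch:

    def p_rebuild_hits_ure(drive_tb: float, surviving_disks: int,
                           tb_per_error: float = 12.0) -> float:
        """Chance that reading every surviving disk in full during a RAID5
        rebuild hits at least one unrecoverable read error (crude model:
        each terabyte read is treated as an independent trial)."""
        tb_read = drive_tb * surviving_disks
        return 1 - (1 - 1 / tb_per_error) ** tb_read

    # Six-disk RAID5 of 2 TB drives: a rebuild reads 5 * 2 = 10 TB.
    print(f"{p_rebuild_hits_ure(2.0, 5):.0%}")    # ~58% with these assumptions
    # The same array built from 250 GB drives: only ~10%.
    print(f"{p_rebuild_hits_ure(0.25, 5):.0%}")

Whether that counts as "pretty much guaranteed" depends on the error rate you assume, but the trend with growing drive and array sizes is the point.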

Re:Hello ZFS (1)

renoX (11677) | more than 6 years ago | (#20614331)

>The problem is that ECC memory costs more, simply because there is 12.5% more memory.

The big issue is that ECC memory doesn't cost only 12.5% more than regular memory; otherwise you'd see lots of knowledgeable (or correctly guided) people buying ECC.

Re:Hello ZFS (1)

MarsDefenseMinister (738128) | more than 6 years ago | (#20612605)

Just so everybody knows, ZFS is available for Linux as a FUSE module. It's easy to get it working, and lots of fun to tinker with. I have it set up right now in a test configuration with an old 80 gig drive and an 11 gig drive - 91 gigs total, in external USB enclosures. And I created files on an NFS server the same size as each of the drives, and told ZFS to use those files as mirrors. On a 100 megabit link. And surprisingly enough, it's actually not too slow to use!

But the reason I have it set up is not to use it that way forever, but to learn about ZFS administration to assess how well it'll work in an average Linux geek's setup. So far, I think it'll work very well for many people, even if you've just got one drive. ZFS can keep multiple copies of important data and repair data errors even if you're not in a mirrored or RAID configuration.

Other schemes (1)

jd (1658) | more than 6 years ago | (#20613225)

Now, as far as I know, there are many schemes for correcting and detecting errors. Some, like FEC, fix infrequent, scattered errors. Others, like turbocodes, fix sizeable blocks of errors. This leads to two questions: what is the benefit in using plain CRCs any more? And since disks are block-based not streamed, wouldn't block-based error-correction be more suitable for the disk?

Re:Other schemes (1)

Wesley Felter (138342) | more than 6 years ago | (#20616865)

Now, as far as I know, there are many schemes for correcting and detecting errors. Some, like FEC, fix infrequent, scattered errors. Others, like turbocodes, fix sizeable blocks of errors. This leads to two questions: what is the benefit in using plain CRCs any more?

CRCs are only used for detecting errors. Once you've detected a bad disk block, you can use replication (RAID 1), parity (RAID 4/5/Z), or some more advanced FEC (RAID 3/6/Z2) to correct the error. The benefit of CRCs is that you can read only the data blocks (saving bandwidth), check the CRC, and ignore the check blocks if the CRC is correct; you only need to read check blocks from disk when the CRC is wrong.
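A sketch of that read path in Python, assuming a simple RAID-4-style stripe where the parity block is just the XOR of the data blocks: the per-block CRCs are checked first, and parity is only consulted when a CRC fails.

    import zlib
    from functools import reduce

    def xor_blocks(blocks):
        return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

    def read_stripe(data_blocks, parity_block, crcs):
        """Return the stripe's data, reconstructing any block whose stored CRC
        does not match; the parity block is only needed in that case."""
        out = list(data_blocks)
        for i, (blk, crc) in enumerate(zip(data_blocks, crcs)):
            if zlib.crc32(blk) != crc:
                others = [b for j, b in enumerate(data_blocks) if j != i]
                out[i] = xor_blocks([parity_block] + others)
        return out

    # Toy stripe: three 4-byte data blocks plus their parity.
    d = [b"AAAA", b"BBBB", b"CCCC"]
    parity = xor_blocks(d)
    crcs = [zlib.crc32(b) for b in d]
    d[1] = b"BxBB"                      # silent corruption on one disk
    assert read_stripe(d, parity, crcs) == [b"AAAA", b"BBBB", b"CCCC"]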

I think this has happened to me (4, Interesting)

jdigital (84195) | more than 6 years ago | (#20609977)

I think I suffered from a series of Type III errors (RTFA). After merging lots of poorly maintained backups of my /home file system, I decided to write a little script to look for duplicate files (using file size as a first indicator, then md5 for ties). The script would identify duplicates and move files around into a more orderly structure based on type, etc. After doing this I noticed that a small number of my mp3s now contain chunks of other songs in them. My script was only working with whole files, so I have no idea how this happened. When I refer back to the original copies of the mp3s, the files are uncorrupted.

Of course, no one believes me. But maybe this presentation is on to something. Or perhaps I did something in a bonehead fashion totally unrelated.
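For anyone curious, the size-then-hash approach described above looks roughly like the sketch below (not the poster's actual script; the root path is illustrative). A final byte-for-byte compare of candidate pairs would guard against the remote chance of an md5 collision.

    import hashlib
    import os
    from collections import defaultdict

    def find_duplicates(root):
        """Group files under root by size, then confirm duplicates by md5."""
        by_size = defaultdict(list)
        for dirpath, _, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                by_size[os.path.getsize(path)].append(path)

        groups = defaultdict(list)
        for size, paths in by_size.items():
            if len(paths) < 2:
                continue                              # unique size, cannot be a duplicate
            for path in paths:
                h = hashlib.md5()
                with open(path, "rb") as f:
                    for chunk in iter(lambda: f.read(1 << 20), b""):
                        h.update(chunk)
                groups[(size, h.hexdigest())].append(path)
        return {k: v for k, v in groups.items() if len(v) > 1}

    for (size, digest), paths in find_duplicates("/home").items():
        print(size, digest, paths)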

Re:I think this has happened to me (1)

jdigital (84195) | more than 6 years ago | (#20610045)

Of course, the article that I was referring to is here [web.cern.ch]. Much more informative than AC's post, if I may say...

Re:I think this has happened to me (1)

KevinColyer (883316) | more than 6 years ago | (#20614269)

I have experienced having small chunks of other songs inside mp3 files on my mp3 player. Of course being a cheap player I assumed it was the player... I have a few problems when writing to it from Linux. I shall look more closely now!

Perhaps the FAT filesystem is interpreted differently on the player to how Linux expects it to be? (Or vice versa.)

Re:I think this has happened to me (1)

mikolas (223480) | more than 6 years ago | (#20614751)

I have had the same kind of problems a few times. I store all my stuff on a server with RAID5, but there have been a couple of times when transferring music from the server (via SMB) to MP3 player (via USB) has corrupted files. I never solved the problem as the original files were intact so I did not go through the effort. However, after reading the article I just might do something about it as I got a bit more worried about the data integrity of my lifetime personal file collection that I store on the server.

MySQL? (4, Funny)

Jason Earl (1894) | more than 6 years ago | (#20610059)

I was expecting an article on using MySQL in production.

Re:MySQL? (1)

glwtta (532858) | more than 6 years ago | (#20611015)

They said many paths - MySQL is just the most common.

Re:MySQL? (0)

Anonymous Coward | more than 6 years ago | (#20611257)

Mod parent insightful, not funny. :P

Just wait till the banks' data gets munged (1)

Colin Smith (2679) | more than 6 years ago | (#20610065)

That'll get fixed lickety split.

 

Hah (1)

the eric conspiracy (20178) | more than 6 years ago | (#20610305)

That is nothing compared to the actual storage technology. Attempting to recover data packed at a density of 1 GB/sq.in. from a disk spinning at 10,000 revolutions per minute where the actual data is stored in a micron thin layer of rust on the surface of the disk is manifestly impossible.

Re:Hah (3, Insightful)

Cajun Hell (725246) | more than 6 years ago | (#20610371)

Sometimes I think we're lucky this stuff works at all.

Re:Hah (1)

Lorkki (863577) | more than 6 years ago | (#20612203)

Some people call it luck, others call it engineering.

oblig xkcd (0)

Anonymous Coward | more than 6 years ago | (#20616849)

Hah-Ironing things out. (0)

Anonymous Coward | more than 6 years ago | (#20612197)

"Attempting to recover data packed at a density of 1 GB/sq.in. from a disk spinning at 10,000 revolutions per minute where the actual data is stored in a micron thin layer of rust on the surface of the disk is manifestly impossible."

It's not always iron oxide [storagereview.com]. Sometimes it's cobalt [freepatentsonline.com].

...but in the market speed sells not correctness. (3, Interesting)

ozzee (612196) | more than 6 years ago | (#20610403)

Ah - this is the bane of computer technology.

One time I remember writing some code and it was very fast and almost always correct. The guy I was working with exclaimed "I can give you the wrong answer in zero seconds" and I shut up and did it the slower way that was right every time.

This mentality of speed at the cost of correctness is prevalent. For example, I can't understand why people don't spend the extra money on ECC memory *ALL THE TIME*. One failure over the lifetime of the computer and you have paid for your RAM. I have assembled many computers and unfortunately there have been a number of times where ECC memory was not an option. In almost every case where I have used ECC memory, the computer was noticeably more stable. Case in point: the most recent machine that I built has never (as far as I know) crashed, and I've thrown some really nasty workloads its way. On the other hand, a couple of notebooks I have have crashed more often than I care to remember, and there is no ECC option. Not to mention the ridicule I get for suggesting that people invest the extra $30 for a "non server" machine. Go figure.

Suggesting that stability is the realm of "server" machines, and that end-user machines should be relegated to a realm of lowered standards of reliability, makes very little sense to me, especially when the investment of $30 to $40 is absolutely minuscule if it prevents a single failure. What I think (I can see the lawsuit coming) is that memory manufacturers will sell marginal-quality products to the non-ECC crowd because there is no way of validating memory quality.

I think there needs to be a significant change in the marketing of products to ensure that metrics of data integrity play a more significant role in decision making. It won't happen until the consumer demands it and I can't see that happening any time soon. Maybe, hopefully, I am wrong.

Re: ...but in the market speed sells not correctne (1, Insightful)

Anonymous Coward | more than 6 years ago | (#20611011)

I can't understand why people don't spend the extra money on ECC memory *ALL THE TIME*. One failure over the lifetime of the computer and you have paid for your RAM.

I do understand it. They live in the real world, where computers are fallible no matter how much you spend on data integrity. It's a matter of diminishing returns. Computers without ECC are mostly stable, and when they're not, they typically exhibit problems on a higher level. I've had faulty RAM once. Only one bit was unstable, and only one test of the many Memtest routines triggered the defect. Even a fault that small caused problems with every other verified CD burn. Given that lots of other causes of data integrity violations exist, many of which can't be avoided because they're rooted in the imperfections of human nature, it is more effective to have procedures in place to deal with problems than to try to avoid them 100%.

Re: ...but in the market speed sells not correctne (1)

ozzee (612196) | more than 6 years ago | (#20611215)

They live in the real world, where computers are fallible ...

Computers are machines and don't need to be designed to be fallible. ECC is a small insurance policy to avoid problems exactly like the one you described. How much time did you spend burning CDs that were no good, or running various memtests, not to mention the possibly corrupted data you ended up saving and other unknown consequences? Had you bought ECC RAM, your problem would have been corrected, or at the very least detected; not to mention that memory manufacturers would need to push up their quality standards. The extra money for ECC would become minuscule if everyone bought only ECC RAM.

Equating human fallibility to physical fallibility makes little sense. If we applied the standards you're proposing for electrical engineering to, say, civil engineering, it would be OK for a building to topple because buildings are "fallible" - not good.

If I could validate every data path in my computer at up to a 20% premium I would. That's better than an insurance policy. It happens to be that RAM is especially susceptible to errors that are very difficult to diagnose or even repeat and so reducing the probability of such errors is desirable, and at a small price like ECC RAM, it's a bargain of an insurance policy.

BTW, I'm not saying that you don't put procedures in place to deal with hardware failure. I'm saying that treating the problem may be far more effective than treating the symptoms.

Re: ...but in the market speed sells not correctne (1, Interesting)

Anonymous Coward | more than 6 years ago | (#20612163)

Computers are machines and don't need to be designed to be fallible. ECC is a small insurance policy to avoid problems exactly like the one you described. How much time did you spend on burning CD's that were no good, or running various memtests

That's beside the point. Computers ARE fallible, with or without ECC RAM. That you think they could be perfect (infallible) is testament to the already low rate of hardware defects which harm data integrity. It's good enough. I've experienced and located infrequent defects in almost every conceivable component of a computer system. An ECC error does not mean that the RAM is faulty. It could be caused by an aging capacitor, by a badly designed mainboard or a bunch of other reasons. An error just tells you that something is wrong. You still have to look for the cause.

If I could validate every data path in my computer at up to 20% premium, I would too. Unfortunately that is impossible, and not just because 20% is too small a premium to expect perfection. A stray particle from our sun could flip a bit in the processor and you'd be none the wiser. A seldom triggered off-by-one error in your favorite software could cause equally catastrophic mistakes as a flipped bit in main memory, and it wouldn't be caught by ECC RAM or any other available automatic integrity check. I'm not equating human fallibility to hardware problems. I'm explaining that at the current rate of faults in RAM modules, it is not the most common problem, which is precisely why it's rarely diagnosed correctly on the first try. That makes it a type of error which people don't want to pay money to avoid, as long as it can be found somehow. It turns out that it is surprisingly easy to detect too, because RAM rarely sits unused for very long, so even spurious defects show up on higher levels with a frequency that causes them to be noticed quickly. People have to be on the lookout for other defects and user errors all the time, they don't need to do anything extra to know that something is wrong when bad RAM is the cause. It just shows on a different level.

It is much more important to have working high level checks, otherwise you're going to miss lots of flaws. That's why mission critical systems run data through redundant systems with different implementations by different people and compare the results. A "whole system parity check", if you will. RAID is designed with the same philosophy: Cheap, possibly faulty hardware is used and errors are detected on a higher level and corrected if possible. Real world systems just place the checks much closer to the user, or even beyond the user, where laws allow for correction of mistakes post factum. A flipped bit in the exponent of a financial transaction does not mean you lose a lot of money. It means you end up having to correct that error. But the real world gives you that opportunity, so you're fine with saving money by not trying to achieve infallibility.

Re: ...but in the market speed sells not correctne (1)

ozzee (612196) | more than 6 years ago | (#20613697)

Computers ARE fallible, with or without ECC RAM.

Yes. They are, but considerably less fallible with ECC. Remember, "I can give you the wrong answer in zero seconds." There's no point in computing at all unless there is a very high degree of confidence in computational results. Software is just as fallible as hardware but again, I can, and do, make considerable effort in making it less fallible.

I am the default sysadmin for a family member's business. There was a time when the system was fraught with constant network failures. Printers stopped working, the WAN stopped working, etc., without any apparent explanation. So-called "experts" were called out to fix the problem, which was fixed and then broken again the next day. Each and every time there was a failure, the cost was significant - huge, far more than ECC RAM. Just the time it took to rule out memory failure was more than the cost of ECC RAM. Under your fatalistic model, he would be relegated to continual system failures. I reconfigured the systems, made some DHCP addresses virtually static, dropped the PPPoE from the box and put it on a Linux server, etc., etc. It's been well over a year now and the only failures have been the occasional power interruption and my forgetting to set services to run on reboots. MUCH, MUCH more reliable.

As for flipping bits: DRAM is particularly susceptible to background radiation, while the CPU data paths are not, for various reasons, and there is far more silicon devoted to memory than there is to the CPU - hence using ECC over main memory is a huge deal. I have had a lot of experience, and I know that memory is a significant cause of reliability issues; again, the cost of ECC RAM is minuscule compared to the benefits.

ECC corrects errors. So even if you have a faulty bit, it is both detected AND corrected, which means you don't have to worry about something that would otherwise have stopped you in your tracks.

I find it very strange that you draw a distinction between "mission critical" systems and some user's machine. I have very high expectations, and whenever there is a fault, I will diagnose it in order to remove it. Anything which is unrepeatable wastes a lot of my time, and ECC RAM removes a large set of those problems - for an extra 20% investment, it's not even above the noise of other costs.

Re: ...but in the market speed sells not correctne (1)

51mon (566265) | more than 6 years ago | (#20614877)

This mentality of speed at the cost of correctness is prevalent...

I used to sell firewalls. People always wanted to know how fast it would work (most were good up to around 100 Mbps, at a time when most people had 2 Mbps pipes at most); very few people asked detailed questions about what security policies it could enforce, or about the correctness and security of the firewall device itself.

Everyone knew they needed something, but very few had a clue about selecting a good product. Speed they understood; network security, in comparison, is pretty tough. Other forms of correctness are, I think, also more difficult to comprehend.

How many people know the safety rating of their automobile? Okay probably the wrong people to ask.

Re: ...but in the market speed sells not correctne (0)

Anonymous Coward | more than 6 years ago | (#20615137)

How many people know the safety rating of their automobile?

Excellent example. If driving were so dangerous that buying a car for maximum safety would be commonplace, then people wouldn't drive. The general safety of cars is high enough that you don't need to worry about how safe a particular car is exactly. Same with RAM. If people thought that the defect rate of RAM warranted ECC RAM, then they wouldn't trust their computers at all.

Missing option. (1)

Neanderthal Ninny (1153369) | more than 6 years ago | (#20610469)

I remember that, a long time ago, cosmic rays (actually the electromagnetic field disruption they caused) created some of those errors.

what we have lost (4, Insightful)

cdrguru (88047) | more than 6 years ago | (#20610491)

It amazes me how much has been lost over the years towards the "consumerization" of computers.

Large mainframe systems have had data integrity problems solved for a long, long time. It is today unthinkable that any hardware issues or OS issues could corrupt data on IBM mainframe systems and operating systems.

Personal computers, on the other hand, have none of the protections that have been present since the 1970s on mainframes. Yes, corruption can occur anywhere in the path from the CPU to the physical disk itself, or during a read operation. There is no checking, period. And not only are failures unlikely to be quickly detected, but they cannot be diagnosed to isolate the problem. All you can do is try throwing parts at the problem, replacing functional units like the disk drive or controller. These days, there is no separate controller - it's on the motherboard - so your "functional unit" can almost be considered to be the computer.

How often is data corrupted on a personal computer? It is clear it doesn't happen all that often, but in the last forty years or so we have actually gone backwards in our ability to detect and diagnose such problems. Nearly all businesses today are using personal computers to at least display information, if not actually maintain and process it. What assurance do you have that corruption is not taking place? None, really.

A lot of businesses have few, if any, checks that would point out problems that could cost thousands of dollars because of a changed digit. In the right place, such changes could lead to penalties, interest and possible loss of a key customer.

Why have we gone backwards in this area when compared to a mainframe system of forty years ago? Certainly software has gotten more complex, but basic issues of data integrity have fallen by the wayside. Much of this was done in hardware previously. It could be done cheaply in firmware and software today, with minimal cost and minimal overhead. But it is not done.

Re:what we have lost (1)

glwtta (532858) | more than 6 years ago | (#20610983)

Yeah, go figure, cheap stuff is built to lower standards than really high-end stuff.

A lot of businesses have few, if any, checks that would point out problems that could cost thousands of dollars because of a changed digit.

I would think it's extremely unlikely that such random corruption would happen on some byte somewhere which actually gets interpreted as a meaningful digit; it's much more likely to either corrupt some format or produce some noticeable garbage somewhere (not "wrong-yet-meaningful" data). Or just go completely unnoticed - I recently discovered (don't ask how) that you can write relatively large chunks of garbage to many parts of an Excel file without producing any noticeable effect.

Why have we gone backwards in this area when compared to a mainframe system of fourty years ago?

You could ask that about almost anything: we've had some spectacular advances in audio quality, so why do we settle for lossy formats and the "passable-at-best" sound of iPods?

Answer: It's Good Enough (TM).

Re:what we have lost (0)

Anonymous Coward | more than 6 years ago | (#20611419)

But consumer level stuff is so much cheaper that I could buy maybe 100 PCs instead of one mainframe. I can run multiple redundant machines each checking the other if I really want to. If a machine dies, I throw it in the trash and move to the next and at the end it's still cheaper than a mainframe.

Plus technology is reaching such a quality and cheapness level that the overhead of redundant checking is too much of a cost or disadvantage for the minuscule chance that something may be corrupted.

Re:what we have lost (1)

ozzee (612196) | more than 6 years ago | (#20613733)

Plus technology is reaching such a quality and cheapness level that the overhead of redundant checking is too much of a cost or disadvantage for the minuscule chance that something may be corrupted.

Are you sure? The actual cost is minuscule and significantly less than the cost of potential errors.

Re:what we have lost (3, Interesting)

suv4x4 (956391) | more than 6 years ago | (#20611821)

Why have we gone backwards in this area when compared to a mainframe system of fourty years ago?

For the same reason why experienced car drivers crash in ridiculous situations: they are too sure of themselves.

The industry is so huge that each separate area of our computers just accepts that the rest is a magic box that should magically operate as written in the spec. Screw-ups don't happen too often, and when they happen they are often not detectable, hence nobody has woken up to it.

That said, don't feel bad; we're not going downwards. It just so happens that speed and flashy graphics will play the important role for another couple of years. Then, after we max this out, the industry will seek to improve another parameter of its products, and sooner or later we'll hit the data integrity issue again :D

Look at hard disks: does the casual consumer need more than 500 GB? So now we see the advent of faster hybrid (flash+disk) storage devices, or pure flash memory devices.

So we've tackled storage size, and we're about to tackle storage speed. And when it's fast enough, what's next? Encryption and additional integrity checks. Something for the bullet list of features...

Re:what we have lost (1)

hxnwix (652290) | more than 6 years ago | (#20617461)

There is no checking, period.
I'm sorry, but for even the crappiest PC clones, you've been wrong since 1998. Even the worst commodity disk interfaces have had checking since then: UDMA uses a 16-bit CRC for data; SATA uses a 32-bit CRC for commands as well. Most servers and workstations have had ECC memory for even longer. Furthermore, if you cared at all about your data, you already had end-to-end CRC/ECC.

Yeah, mainframes are neat, but they don't save you from needing end-to-end integrity checking unless you really don't give a damn about it in the first place, being the sort of person who wakes up every morning to a tall steaming glass of IBM koolaid...

RAM = the weakest link (3, Interesting)

DigiShaman (671371) | more than 6 years ago | (#20610499)

It's well known that ECC and other forms of error correction are found at many levels of software and hardware. For example, hard drives have their own internal error correction, while the file system they're formatted with may have another. Also worth mentioning are the CPU, serial buses, network adapters (both the physical IEEE 802.x connection and the TCP/IP stack), and other forms of software error correction.

Basically, the modern computer has various hardware and software layers of error correction stacked on top of each other if not at least by themselves.

We do have a weak link with desktops regarding RAM, however. While modern workstations and servers are generally installed with ECC RAM, our desktops are not. Also worth mentioning, most custom-built clone PCs are for the desktop market. This has become a huge problem given that the voltage and timing requirements don't leave much room for tolerance. The fact that memory density has been going up only makes the chances of "bit flips" even worse. I can't tell you how many times I've run into data corruption due to improper RAM settings. Running a few passes of Memtest86+ will reveal this nasty issue. Hell, even Windows Vista now includes a utility to check for faulty RAM read/write issues; that's how big the problem has become in the industry. As such, the desktop market severely needs to embrace ECC RAM like the server and workstation market has. These days, not using ECC is asking for trouble. And yes, you would take a 1 to 2% performance hit, but so what? Data integrity is more important.

Note: The newer Intel P965 chipset does not support ECC memory, while their older 965x does. A crying shame too, given that the P965 was designed for Core 2 Duo and quad-core CPUs.
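As background for the ECC discussion: ECC DIMMs typically use a SECDED code over 64-bit words, which works on the same principle as the classic Hamming(7,4) code. A toy Python version shows how a single flipped bit is located and corrected from the parity bits alone:

    def hamming74_encode(nibble):
        """Encode 4 data bits into a 7-bit Hamming(7,4) codeword (positions 1..7)."""
        d = [(nibble >> i) & 1 for i in range(4)]
        code = [0] * 8                           # index 0 unused; positions 1..7
        code[3], code[5], code[6], code[7] = d
        code[1] = code[3] ^ code[5] ^ code[7]    # parity over positions 1,3,5,7
        code[2] = code[3] ^ code[6] ^ code[7]    # parity over positions 2,3,6,7
        code[4] = code[5] ^ code[6] ^ code[7]    # parity over positions 4,5,6,7
        return code[1:]

    def hamming74_decode(bits):
        """Correct any single flipped bit and return the 4 data bits."""
        code = [0] + list(bits)
        s1 = code[1] ^ code[3] ^ code[5] ^ code[7]
        s2 = code[2] ^ code[3] ^ code[6] ^ code[7]
        s3 = code[4] ^ code[5] ^ code[6] ^ code[7]
        syndrome = s1 + 2 * s2 + 4 * s3          # equals the position of the bad bit
        if syndrome:
            code[syndrome] ^= 1                  # single-bit correction
        return sum(code[p] << i for i, p in enumerate((3, 5, 6, 7)))

    word = 0b1011
    codeword = hamming74_encode(word)
    codeword[4] ^= 1                             # a "cosmic ray" flips one bit
    assert hamming74_decode(codeword) == word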

Re:RAM = the weakest link (1)

IvyKing (732111) | more than 6 years ago | (#20613195)

We do have a weak link with desktops regarding RAM, however. While modern workstations and servers are generally installed with ECC RAM, our desktops are not.


The major failing of the original Apple Xserve 'supercomputer' cluster was the lack of ECC - ISTR an estimate of a memory error every few hours (estimate made by Del Cecchi on comp.arch), which would severely limit the kinds of problems that could be solved on the system. I also remember the original systems being replaced a few months later with Xserves that had ECC.


And yes, you would take a 1 to 2% performance hit, but so what? Data integrity is more important.


A 1 to 2% performance hit is less costly than having to do multiple runs to make sure the data didn't get munged.


Note: The newer Intel P965 chipset does not support ECC memory, while their older 965x does. A crying shame too, given that the P965 was designed for Core 2 Duo and quad-core CPUs.


You're right about the crying shame - what you have is a high end games machine. Perhaps AMD still has a chance if their chipsets support ECC RAM. Something similar came up a few years ago on one of the Sun newsgroups about the latest Apple box being able to run rings around a much more expensive Sun box - the one limitation of the Apple box was lack of ECC.

Re:RAM = the weakest link (2, Insightful)

KonoWatakushi (910213) | more than 6 years ago | (#20614049)

You're right about the crying shame - what you have is a high end games machine. Perhaps AMD still has a chance if their chipsets support ECC RAM.

The nice thing about AMD is that with the integrated memory controller, you don't need support in the chipset. I'm not sure about Semprons, but all of the Athlons support ECC memory. The thing you have to watch out for is BIOS/motherboard support. If the vendor doesn't include the necessary traces on the board or the configuration settings in the BIOS, it won't work. It is worth noting that unbuffered ECC ram will work in non-ECC boards, but without actually using the ECC bits, so you have to make sure that the board explicitly supports ECC, and is not merely compatible.

It is a shame though, and however nice a chip the Core2 is, AMD is the obvious choice if you care about your data.

Re:RAM = the weakest link (1)

BiggerIsBetter (682164) | more than 6 years ago | (#20613239)

The issue with chipsets is about market segmentation rather than newer=better. P965 is a "mainstream desktop" chipset, while say, a 975X is a "performance desktop" and/or "workstation" chipset and so supports ECC. The performance hit isn't a factor, but the price hit for the extra logic apparently is.

Re:RAM = the weakest link (1, Informative)

Anonymous Coward | more than 6 years ago | (#20614299)

Sad given that ECC logic is so simple it's basically FREE.

What's worse? It IS free!
Motherboard chips (e.g. south bridge, north bridge) are generally limited in size NOT by the transistors inside but by the number of IO connections. There's silicon to burn, so to speak, and therefore plenty of room to add features like this.

How do I know this? Oh wait, my company made them.... We never had to worry about state-of-the-art process technology because it wasn't worth it. We could afford to be several generations behind for exactly this reason.

Re:RAM = the weakest link (1, Informative)

Anonymous Coward | more than 6 years ago | (#20613369)

Note: The newer Intel P965 chipset does not support ECC memory, while their older 965x does. A crying shame too, given that the P965 was designed for Core 2 Duo and quad-core CPUs.

You meant 975x, not 965x. The successor of the 975X is the X38 (Bearlake-X) chipset, which supports ECC DRAM. It should debut this month.

Re:RAM = the weakest link (1)

DigiShaman (671371) | more than 6 years ago | (#20613479)

You meant 975x, not 965x.

Correct. A typo on my part.

checksum offload (1)

goarilla (908067) | more than 6 years ago | (#20610575)

Doesn't checksum offload mean that that functionality gets offloaded to another device - like, say, an expensive NIC - and thus removes that overhead from the CPU?

Re:checksum offload (1, Interesting)

Anonymous Coward | more than 6 years ago | (#20610879)

Correct. However, there are two problems. Firstly, it's not an expensive NIC these days - virtually all Gigabit Ethernet chips do at least some kind of TCP offload, and if such a chip miscomputes the checksum (or fails to detect an error) because it's a cheap chip, you're worse off than doing it in software.

Also, offloading doesn't protect against errors on the card or the PCI bus. (If the data was corrupted on the card, or on the bus after the checksum validation but before it got to system RAM, for any reason, the corruption would not be detected. But if the checksum validation happened in software after the data was written to RAM, the corruption would be detected by the OS. It would assume it's a network transmission error rather than a bad network card, but (in TCP) it would arrange for a retransmission of the data.)
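The software checksum being traded away here is just the 16-bit ones'-complement sum from RFC 1071. Computing (or at least re-verifying) it on the CPU after the data has reached system RAM is what extends the protection across the card and the bus; a minimal Python version:

    def internet_checksum(data: bytes) -> int:
        """RFC 1071 16-bit ones'-complement checksum, as used by IP/TCP/UDP."""
        if len(data) % 2:
            data += b"\x00"                            # pad odd-length input
        total = 0
        for i in range(0, len(data), 2):
            total += (data[i] << 8) | data[i + 1]
            total = (total & 0xFFFF) + (total >> 16)   # fold carries back in
        return ~total & 0xFFFF

    segment = b"some TCP payload plus pseudo-header"
    expected = internet_checksum(segment)

    # A receiver that checksums in software catches a bit flipped on the bus:
    corrupted = bytearray(segment)
    corrupted[3] ^= 0x04
    assert internet_checksum(bytes(corrupted)) != expected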

Re:checksum offload (1)

shird (566377) | more than 6 years ago | (#20611313)

Yes, but then there is the risk the data gets corrupted between the NIC and CPU. Doing the checksum at the CPU checks the integrity of the data at the end-point, rather than on its way to the CPU.

How much does scrubbing cost? (2, Interesting)

skeptictank (841287) | more than 6 years ago | (#20610675)

Can anyone point me toward some information on the hit to CPU and I/O throughput for scrubbing?

Re:How much does scrubbing cost? (1)

feld (980784) | more than 6 years ago | (#20610847)

I was wondering the same thing... I don't have scrubbing enabled on my Opteron workstation. I should do a memory benchmark test or two and turn it on to see how it compares.

Re:How much does scrubbing cost? (1)

Detritus (11846) | more than 6 years ago | (#20611123)

Scrubbing for RAM is an insignificant amount of overhead. All it involves is doing periodic read/write cycles on each memory location to detect and correct errors. This can be done as a low-priority kernel task or as part of the timer interrupt-service-routine.

Re:How much does scrubbing cost? (1)

skeptictank (841287) | more than 6 years ago | (#20613027)

If a system were operating in an environment where a failure was more likely, would it be desirable to increase the frequency of access to a given memory location? It seems reasonable that this would be the case. I am looking at an application that could be exposed to a higher level of cosmic rays than would be normal for ground-based workstations.

Re:How much does scrubbing cost? (1)

Detritus (11846) | more than 6 years ago | (#20613427)

If you want to be thorough about it, you need to determine the acceptable probability of an uncorrectable error in the memory system, the rate at which errors occur, and the scrub rate needed to meet or exceed your reliability target. If you scrub when the system is idle, you will probably find that the scrub rate is much higher than the minimum rate needed to meet your reliability target. In really hostile environments, you may need a stronger ECC and/or a different memory organization.
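To make that sizing exercise concrete, here is a crude model (made-up numbers, Poisson-style assumptions): SECDED ECC corrects one bad bit per word, so an uncorrectable event needs two independent errors to land in the same word within one scrub interval, and scrubbing more often shrinks that window.

    def expected_double_bit_errors(words, errors_per_word_per_hour,
                                   scrub_interval_h, lifetime_h):
        """Expected number of uncorrectable (double-bit) events over the lifetime,
        assuming independent single-bit errors and scrubbing that clears each
        correctable error at the end of its interval."""
        lam = errors_per_word_per_hour * scrub_interval_h
        p_double = lam * lam / 2                  # ~ P(2 or more errors) for tiny lam
        return p_double * words * (lifetime_h / scrub_interval_h)

    # Hypothetical: 8 GiB of 64-bit ECC words, one bit error per GiB per month.
    words = 8 * 2**30 // 8
    rate = 8 / words / (30 * 24)                  # errors per word per hour
    for hours in (24.0, 1.0, 1 / 60):             # scrub daily, hourly, every minute
        risk = expected_double_bit_errors(words, rate, hours, lifetime_h=5 * 365 * 24)
        print(f"scrub every {hours:7.3f} h -> expected double-bit errors: {risk:.1e}")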

Speed Kills (0)

Anonymous Coward | more than 6 years ago | (#20610839)

The subject is the comment.

Timely article ... (2, Interesting)

ScrewMaster (602015) | more than 6 years ago | (#20611027)

As I sit here having just finished restoring NTLDR to my RAID 0 drive after the thing failed to boot. I compared the original file and the replacement, and they were off by ONE BIT.

Re:Timely article ... (1)

dotgain (630123) | more than 6 years ago | (#20613061)

A while ago I had an AMD K6-2 which couldn't gunzip one of the XFree86 tarballs (invalid compressed data - CRC error). I left memtest running over 24 hours, which showed nothing; copying the file onto another machine (using the K6 as a fileserver) and gunzipping it there worked. I eventually bumped into someone with the same mobo and the same problem, and figured binning the mobo was the fix.

To be honest, most of the comments about ECC RAM here have convinced me that it's worth it just for the extra peace of mind - all those poppy MP3s I've got and never thought much about could well have been caused by corruption at my end all this time.

YOU FAIl IT (-1, Offtopic)

Anonymous Coward | more than 6 years ago | (#20611159)

AT&T and Berkeley reciprocating bad Was aftEr a long ago, many of yogu isn't a lemonade revel in our gay resulted in the

Re:YOU FAIl IT (1)

ScrewMaster (602015) | more than 6 years ago | (#20611977)

Apparently this idiot has a bad network card. It's obviously corrupting outgoing packets.

HEY. (2, Funny)

yoyhed (651244) | more than 6 years ago | (#20611745)

TFA doesn't list ALL the possible ways data can be corrupted. It fails to mention the scenario of Dark Data (an evil mirror of your data, happens more commonly with RAID 1) corrupting your data with Phazon. In this case, the only way to repair the corruption is to send your data on a quest to different hard drives across the world (nay, the GALAXY) to destroy the Seeds that spread the corruption.

Re:HEY. (1)

DeadChobi (740395) | more than 6 years ago | (#20616113)

Praise be to the almighty Torrent, that he may deliver us from data corruption!

my path to corruption.... (0)

Anonymous Coward | more than 6 years ago | (#20611849)

Back in the good old MS-DOS days I managed to get one of the first VESA Local Bus IDE cards, which promised great transfer rates over ISA cards. Well, I played with the jumper settings to enable fast DMA transfers, but the hardware (or driver) was not up to spec... I booted up, and the first thing I did was issue some "dir /s /p" commands (that was the unofficial visual speed test back then). I noticed that the listings got more corrupted with each pass (ouch); everything got screwed, so I had to go back to the default jumper settings and let the format begin... LOL

I've seen it happen (1)

Hans Lehmann (571625) | more than 6 years ago | (#20611935)

I previously had a Shuttle desktop machine running Windows XP. One day I started noticing that when I copied files to a network file server, about 1 out of 20 or so would get corrupted, with larger files getting corrupted more often than smaller ones. Copying them to the local IDE hard drive caused no problems, and other machines did not have problems copying files to the same file server. I spent a lot of time swapping networking cards, etc. and not getting anywhere, until I plugged in a USB drive and noticed that files were also getting corrupted when copied to it.
I then ran tests with large random files, doing diffs between the originals and the copies. The errors were always single bytes that had changed; the file size never changed. Interestingly, whenever there was a changed byte, the seventh and eighth bytes preceding the error always held the same two values, although having those two values next to each other in a file did not always cause the error. The problem turned out to be a bad motherboard; the data path to some destinations like the NIC and USB ports would corrupt data, while the path to the IDE connectors would not.
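A comparison like the one described above is easy to script. This sketch (filenames are illustrative, and it assumes the files fit in memory) reports each differing offset together with the bytes just before it, which is exactly how a pattern like "the seventh and eighth bytes preceding the error are always the same" becomes visible:

    def diff_bytes(path_a, path_b, context=8):
        """Yield (offset, original_byte, copied_byte, preceding_bytes) for every
        position where the two equally sized files differ."""
        with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
            a, b = fa.read(), fb.read()
        for offset in range(min(len(a), len(b))):
            if a[offset] != b[offset]:
                yield offset, a[offset], b[offset], a[max(0, offset - context):offset]

    for off, orig, copy, ctx in diff_bytes("original.bin", "copied.bin"):
        print(f"offset {off:#010x}: {orig:#04x} -> {copy:#04x}, preceded by {ctx.hex()}")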

Re:I've seen it happen (1)

kg261 (990379) | more than 6 years ago | (#20614897)

And I have seen this happen on the IDE as well. In my case, the fan for the bridge chip had failed causing a bit error on disk writes every few hundred megabytes. This went on for I do not know how many months before I actually did a file copy and CMP to find the errors. Ethernet and other ports were fine.

Much Ado About Very Little (1)

Jane Q. Public (1010737) | more than 6 years ago | (#20613397)

I was a early "adopter" of the Internet... and when I was on a slow dial-up line, even with checksums being done on-the-fly via hardware, and packets being re-sent willy-nilly due to insufficient transmission integrity, my data seemed to get corrupted almost as often as not.

Today, with these "unreliable" hard drives, and (apparently, if we believe the post) less hardware checking being done, I very, very seldom receive data, or retrieve data from storage, that is detectably corrupted. My CRCs almost invariably check out after the download and storage, and I am a happy camper. :o)

Once in a great while, I get a file or other data stream that has unacceptable corruption in it. But that has been very rare in recent years, and I have little cause to complain.

Commercial interests that have big investments in their data might, of course, be justified in taking stiffer measures than the average consumer. Nothing new about that.

I just don't see the problem here. The "free market" (Microsoft and certain others notwithstanding) has reached solutions that are acceptable to the customers. Where is the issue?

Re:Much Ado About Very Little (1)

Detritus (11846) | more than 6 years ago | (#20613557)

The free market doesn't always produce socially desirable results. Manufacturers can also get trapped in a race to the bottom. Just look at the current quality of floppy disk drives and their media. I can remember when they actually worked.

ECC memory & Intel (0)

Anonymous Coward | more than 6 years ago | (#20613729)

A serious weakness in modern PCs is the lack of ECC memory. I think this is caused primarily by Intel. To create market segmentation, Intel's mainstream chipsets (i815, i845, i865, i915, i945, i965 and later) do not support ECC memory. I believe this really is market segmentation, and not a real cost reduction, because all mainstream chipsets before the i815, like the i440 LX & BX, support ECC.
A side effect of this is that it's now very expensive to build a home PC with ECC memory, because you have to buy an expensive mainboard with Intel's premium chipset (i875, i955, i975, etc.) to get ECC support.

I have seen this many times, unfortunately. :-( (2, Interesting)

Terje Mathisen (128806) | more than 6 years ago | (#20614351)

We have 500+ servers worldwide, many of them containing the same program install images, which by definition should be identical:

One master, all the others are copies.

Starting maybe 15 years ago, when these directory structures were in the single-digit GB range, we started noticing strange errors, and after running full block-by-block compares between the master and several slave servers we determined that we had end-to-end error rates of about 1 in 10 GB.

Initially we solved this by doubling the network load, i.e. always doing a full verify after every copy, but later on we found that keeping the same hw, but using sw packet checksums, was sufficient to stop this particular error mechanism.

One of the errors we saw was a data block where a single byte was repeated, overwriting the real data byte that should have followed it. This is almost certainly caused by a timing glitch which over-/under-runs a hardware FIFO. Having 32-bit CRCs on all Ethernet packets as well as 16-bit TCP checksums doesn't help if the path across the PCI bus is unprotected and the TCP checksum has been verified on the network card itself.

Since then our largest volume sizes have increased into the 100 TB range, and I do expect that we now have other silent failure mechanisms: basically, any time or location where data isn't explicitly covered by end-to-end verification is a silent failure waiting to happen. On disk volumes we try to protect against this by using file systems which can protect against lost writes as well as misplaced writes (i.e. the disk reports writing block 1000, but in reality it wrote to block 1064 on the next cylinder).

NetApp's WAFL is good, but I expect Sun's ZFS to do an equally good job at a significantly lower cost.

Terje
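The "full verify after every copy" step mentioned above is conceptually just the following (paths are illustrative; note that the read-back of the destination may be served from the OS cache rather than the platters, so it mainly catches corruption introduced on the way to the destination's memory):

    import hashlib
    import shutil

    def copy_and_verify(src, dst, chunk=1 << 20):
        """Copy src to dst, then re-read both sides and compare SHA-256 digests."""
        def digest(path):
            h = hashlib.sha256()
            with open(path, "rb") as f:
                for block in iter(lambda: f.read(chunk), b""):
                    h.update(block)
            return h.hexdigest()

        shutil.copyfile(src, dst)
        if digest(src) != digest(dst):
            raise IOError(f"verification failed: {src} -> {dst}")

    copy_and_verify("/master/images/app.img", "/replica/images/app.img")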

Re:I have seen this many times, unfortunately. :-( (1)

saik0max0r (1152903) | more than 6 years ago | (#20615209)

NetApp's WAFL is good, but I expect Sun's ZFS to do an equally good job at a significantly lower cost.

Hard to say for certain, as comparing WAFL and ZFS ignores the often-overlooked additional integration that you get with a Filer, as opposed to a more general-purpose system running ZFS, which is still pretty green when it comes to this sort of stuff:

http://mail.opensolaris.org/pipermail/zfs-discuss/2006-November/036124.html [opensolaris.org] .

It's been my experience that data corruption typically occurs in RAM (hence ECC), or at the HBA, cabling or drive level itself. The difference in firmware behavior between an integrated system and some drive you bought at Fry's is where true "End to End" data protection comes into play.

Adding up that level of integration makes the file system merely a component in a very very expensive system ;)

BS (0)

Anonymous Coward | more than 6 years ago | (#20614357)

If data corruption were so likely, you would get segfaults all the time, because loaded binaries would have wrong machine code bytes in them. And even 1 bit wrong in assembly is enough for a segv.

Every time you downloaded your fav Ubuntu package, it would be corrupted, so either bzip would fail or the program would crash if you tried to run it. Not to mention configuration files, and the fact that debugging would be impossible in such a case.

But I've never seen such segfaults on Linux, except on a box whose motherboard was damaged.

So Alan is full of it.