Google Releases Paper on Disk Reliability

Zonk posted more than 7 years ago | from the fun-saturday-night-reading dept.

Data Storage 267

oski4410 writes "The Google engineers just published a paper on Failure Trends in a Large Disk Drive Population. Based on a study of 100,000 disk drives over 5 years they find some interesting stuff. To quote from the abstract: 'Our analysis identifies several parameters from the drive's self monitoring facility (SMART) that correlate highly with failures. Despite this high correlation, we conclude that models based on SMART parameters alone are unlikely to be useful for predicting individual drive failures. Surprisingly, we found that temperature and activity levels were much less correlated with drive failures than previously reported.'"

267 comments

Great (5, Funny)

true_hacker (969330) | more than 7 years ago | (#18057090)

Excellent, i have been looking forward to thi *%)%*# DISK FAILURE

Re:Great (2, Funny)

Compholio (770966) | more than 7 years ago | (#18057430)

Excellent, i have been looking forward to thi *%)%*# DISK FAILURE
That's what you get for logging into slashdot from Antarctica...

Proprietary reporting (-1, Flamebait)

fred911 (83970) | more than 7 years ago | (#18057594)

Here's what they will tell us:

"One of our key findings has been the lack of a consistent
pattern of higher failure rates for higher temperature
drives or for those drives at higher utilization levels."

"Our results confirm the findings of previous smaller
population studies that suggest that some of the SMART
parameters are well-correlated with higher failure probabilities.
We find, for example, that after their first scan
error, drives are 39 times more likely to fail within 60
days than drives with no such errors."

er not...

"Despite those strong correlations, we find that
failure prediction models based on SMART parameters
alone are likely to be severely limited in their prediction
accuracy, given that a large fraction of our failed drives
have shown no SMART error signals whatsoever."

That's all they're willing to offer the public (probably for free).
Useless...

  Do no evil??? how bout make no data free..

ps.. all their farm is ata/ide?

Re:Proprietary reporting (5, Insightful)

spisska (796395) | more than 7 years ago | (#18057912)

ps.. all their farm is ata/ide?

You really didn't read the article, did you? On page 3 (Section 2.2 Deployment Details), the authors state: "More than one hundred thousand disk drives were used for all the results presented here. The disks are a combination of serial and parallel ATA consumer-grade hard disk drives, ranging in speed from 5400 to 7200 rpm, and in size from 80 to 400 GB. All units were put into production in or after 2001. [...] The data used for this study were collected between December 2005 and August 2006."

What are you waiting for Google to tell you? Are you really accusing them of being evil because they did a study, described their methodology, detailed their results, presented their analyses, and published it all for anyone who is interested?

You describe their conclusions as:

Useless

But there is no contradiction at all if you are smart enough to understand. They are telling you that if SMART identifies a problem with a drive then that drive is far more likely to fail within 60 days (39 times more likely after a first scan error). But in a sample of 100,000 drives, many drives will also fail without ever returning errors on SMART scans. Thus SMART is a useful indicator of elevated failure risk but is not a silver bullet that can recognize and predict all failures before they happen.

Next time you have access to 100,000 hard drives, can analyze patterns of failure among them, can use those failures as a benchmark against which to measure analysis tools, and can come up with better recommendations for predicting failure than this study, then by all means let us know. But if you're looking for Microsoft or Western Digital or Seagate or Yahoo to perform and publish this kind of study for free, I think you may be waiting a good long while.
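To put the "39 times more likely" figure quoted above into absolute terms, here is a minimal sketch. The baseline 60-day failure probabilities are hypothetical values chosen for illustration; the thread does not quote the paper's actual baseline, and the function name is made up for this example.

```python
# Relative risk quoted in the thread: after a first scan error, a drive is
# 39 times more likely to fail within 60 days than a drive without one.
RELATIVE_RISK_AFTER_SCAN_ERROR = 39


def flagged_failure_probability(baseline: float,
                                relative_risk: float = RELATIVE_RISK_AFTER_SCAN_ERROR) -> float:
    """60-day failure probability for a flagged drive, capped at 1.0.

    `baseline` is the 60-day failure probability of an error-free drive.
    It is an ASSUMPTION here, not a number taken from the paper.
    """
    return min(1.0, baseline * relative_risk)


if __name__ == "__main__":
    # Hypothetical baselines only, to show how the absolute risk scales.
    for baseline in (0.001, 0.005, 0.01):
        flagged = flagged_failure_probability(baseline)
        print(f"baseline {baseline:.1%} -> after scan error {flagged:.1%}")
```

The point of the sketch is only that a large relative risk can still leave most flagged drives alive after 60 days if the baseline is small, which is consistent with SMART being a warning sign rather than a verdict.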

Re:Proprietary reporting (2, Informative)

Toba82 (871257) | more than 7 years ago | (#18057916)

It is well known that google uses commodity hardware. SCSI is not commodity, although I'm sure at least some of their servers are high end.

I interned at google - don't trust them (-1, Troll)

Anonymous Coward | more than 7 years ago | (#18057092)

Let's start this letter with a little quiz:

      1. To what lengths will Google go to gag the innocent accused from protesting isolationism-motivated prosecutions?
      2. How long shall there continue shambolic fault-finders to vend and censorious urban guerrillas to gulp so low a piece of Dadaism as its stances?
      3. Essay: Compare and contrast its vaporings to those of wily swaggerers, focusing especially on who is more likely to win support by encapsulating frustrations and directing them toward unpopular scapegoats.

Don't worry; I'll give you all the answers throughout the course of this letter as well as a wealth of other information about Google. Some background is in order: I deeply believe that it's within our grasp to present a clear picture of what is happening, what has happened, and what is likely to happen in the future. Be grateful for this first and last tidbit of comforting news. The rest of this letter will center around the way that in asserting that the kids on the playground are happy to surrender to the school bully, it demonstrates an astounding narrowness of vision. If you think that denominationalism is a viable and vital objective for our nation's educational institutions, then think again. Considering that I must, on principle, rise to the challenge of thwarting Google's biggety plans, I offer that Google may caricature and stereotype people from other cultures right after it reads this letter. Let it. Before long, I will criticize the obvious incongruities presented by Google and its subordinates.

While it is essential -- and among my highest priorities -- to warn the public against those self-indulgent underachievers whose positive accomplishments are always practically nil, but whose conceit can scarcely be excelled, ignorance is bliss. This may be why Google's hangers-on are generally all smiles. Although oleaginous Philistines are relatively small in number compared to the general population, they are increasing in size and fervor. Although Google has unfairly depicted me and those who share my beliefs as used-car salesmen and soi-disant do-gooders, we are neither. Yes, all of its ideas share elements of traditional, uncompanionable conspiracy themes in which mad yutzes secretly etiolate its enemies, but I, speaking as someone who is not a brown-nosing nutcase, hate it when people get their facts wrong. For instance, whenever I hear some corporate fat cat make noises about how we have too much freedom, I can't help but think that Google has planted its shills everywhere. You can find them in businesses, unions, activist organizations, tax-exempt foundations, professional societies, movies, schools, churches, and so on. Not only does this subversive approach enhance Google's ability to stir up trouble but it also provides irrefutable evidence that there is no compelling moral or economic reason why it should bring about a wonderland of anarchism. Regular readers of my letters probably take that for granted, but if I am to carve solutions that are neither huffy nor stuck-up, I must explain to the population at large that I'm simply trying to explain its grotty tendencies as well as its villainous tendencies as phases of a larger, unified cycle. Sadly, lack of space prevents me from elaborating further. My message is clear: Google presents one face to the public, a face that tells people what they want to hear. Then, in private, it devises new schemes to seize control over where we eat, sleep, socialize, and associate with others.

It is my greatest and most solemn pleasure to give the needy a helping hand, as opposed to an elbow in the face. Now take that to the next level: It takes more than a mass of twisted spongers to place blame where it belongs -- in the hands of Google and its stupid spokesmen. It takes a great many thoughtful and semi-thoughtful people who are willing to summon up the courage to feed the starving, house the homeless, cure the sick, and still find wonder and awe in the sunrise and the moonlight. Google's buddies' thinking is fenced in by many constraints. Their minds are not free because they dare not be. While it is reasonable to expect that Google is almost unique among dictatorial truculent-types in that it openly espouses a pernicious view of reality and a defense of imprudent separatism, it remains that we can't stop Google overnight. It takes time, patience and experience to initiate meaningful change. Google's disruptive threats convince me of only one thing: that what I wrote just a moment ago is not the paranoid rambling of a poxy, satanic wacko. It's a fact. The facts are in: Google confuses demagoguery with leadership and undocumented conspiracism with serious research.

Hmm (2, Interesting)

chanrobi (944359) | more than 7 years ago | (#18057100)

So if the article summary is correct does it even matter if the consumer desktop pc has SMART enabled or not?

Re:Hmm (5, Funny)

Anonymous Coward | more than 7 years ago | (#18057132)

Didn't read the article? (Check)
Didn't read the summary? (Check)

Congratulations, you're now officially a slashdot regular!

Re:Hmm (4, Informative)

Anonymous Coward | more than 7 years ago | (#18057284)

There are several SMART signals which are highly correlated with drive errors, but the authors note that 56% of the failed drives had no occurrences of these highly correlated errors. Even considering all SMART signals, 36% of failed drives still had no SMART signals reported.

So, if you have errors in those highly correlated categories your drives are probably going to fail, but if you do not have errors in these categories your drives can still fail.
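The two percentages above put a ceiling on how many failures any SMART-only predictor could catch. A minimal sketch of that arithmetic follows; the function names and the failure count of 1,000 are hypothetical, only the 56% and 36% figures come from the comment.

```python
# Figures as quoted in the comment above.
FRACTION_FAILED_WITHOUT_STRONG_SIGNALS = 0.56  # no highly correlated SMART errors
FRACTION_FAILED_WITHOUT_ANY_SIGNAL = 0.36      # no SMART signals at all


def max_recall(fraction_without_signal: float) -> float:
    """Upper bound on the share of failures a signal-based predictor can flag."""
    return 1.0 - fraction_without_signal


def missed_failures(total_failures: int, fraction_without_signal: float) -> int:
    """Failures that arrive with no warning, for a hypothetical failure count."""
    return round(total_failures * fraction_without_signal)


if __name__ == "__main__":
    hypothetical_failures = 1000  # assumption for illustration only
    for label, fraction in (
        ("strongly correlated signals", FRACTION_FAILED_WITHOUT_STRONG_SIGNALS),
        ("any SMART signal", FRACTION_FAILED_WITHOUT_ANY_SIGNAL),
    ):
        print(f"{label}: recall <= {max_recall(fraction):.0%}, "
              f"{missed_failures(hypothetical_failures, fraction)} of "
              f"{hypothetical_failures} failures unannounced")
```

Even a perfect model built on the strongly correlated signals could flag at most 44% of failures, and one using every SMART signal at most 64%.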

Re:Hmm (1)

TattleTale1975 (711068) | more than 7 years ago | (#18057372)

No,
Actually,
This report indicates that Drive manufacturers should issue RMAs for a drive with a single confirmed bad sector, or any of a number of indicators reported by SMART, as they have as much as a 60% chance of failing
within the next several months.

If you see any SMART Errors (note: Errors), take that drive out of active service and just run it till it dies.

Did they ever name the brands? (4, Insightful)

SuperKendall (25149) | more than 7 years ago | (#18057114)

They stated at one point in the document that some brands did have higher failure rates than others - yet I somehow missed any mention or ranking of brands. Did anyone else find that data?

That would be corporate dynamite (5, Insightful)

Traf-O-Data-Hater (858971) | more than 7 years ago | (#18057158)

I noticed this too. If a Google-sanctioned report had charts of which brands were more reliable, this would do serious damage to the brands that didn't perform so well. No wonder they sidestepped the whole issue!

Re:That would be corporate dynamite (3, Interesting)

MrZaius (321037) | more than 7 years ago | (#18057176)

It's no wonder that Google sidestepped the issue, but, if you assume they purchase primarily from the manufacturers that are more reliable, perhaps those manufacturers will begin to gloat and publish numbers about their Google contracts, if this study gains traction.

Re:That would be corporate dynamite (4, Insightful)

Antique Geekmeister (740220) | more than 7 years ago | (#18057414)

I'm confident that Google is fairly drive agnostic: you just can't run distributed networks that large and stay locked into a single vendor. And given that even reliable vendors have disasters like the IBM Deskstar drives some years ago, and given the remarkable growth of drive sizes over time, there's just not much point for them in buying the extremely stable but vastly more expensive hardware. They've doubtless learned that hardware flexibility provides valuable software flexibility.

Re:That would be corporate dynamite (0)

Anonymous Coward | more than 7 years ago | (#18057508)

No one's talking about a single vendor (GP said "manufacturers"), much less being "locked in". The hard drive market != the desktop OS market. But when you're unnaturally obsessed with one, as most are here, you tend to erroneously see and think about everything in those same terms, and it reveals itself in patterns of speech.

Re:That would be corporate dynamite (0)

Anonymous Coward | more than 7 years ago | (#18057690)

So Google believes it's impossible to believe that drives exist or not?

No? Perhaps they are neutral rather than agnostic, then.

Re:That would be corporate dynamite (1)

fred911 (83970) | more than 7 years ago | (#18057710)

Didn't Seagate have a disaster with stiction on their RLL drives? ... Seems I remember taking apart some 10 mb RLL drives and cleaning them with windex. Worked every time.

ps... they cost about $300 then.

  And you call yourself an antique:-)

Re:That would be corporate dynamite (4, Insightful)

EonBlueTooL (974478) | more than 7 years ago | (#18057292)

Google: Organizing all the world's information and making it universally accessible and useful (unless it could be troublesome)

Re:That would be corporate dynamite (2, Insightful)

devilspgd (652955) | more than 7 years ago | (#18057498)

Organizing and making accessible information which is already available is one thing, producing information is completely different.

Re:That would be corporate dynamite (5, Insightful)

Jah-Wren Ryel (80510) | more than 7 years ago | (#18058002)

Google: Organizing all the world's information and making it universally accessible and useful (unless it could be troublesome)

Old Google Motto: Don't do anything evil.
New Google Motto: Don't get into trouble.

Re:Did they ever name the brands? (5, Interesting)

iminplaya (723125) | more than 7 years ago | (#18057178)

FTA: However, in this paper, we do not show a
breakdown of drives per manufacturer, model, or vintage
due to the proprietary nature of these data.


But, of course.

Re:Did they ever name the brands? (1)

AmigaBen (629594) | more than 7 years ago | (#18057202)

Yes, Google seems to have the /. disease that prevents one from naming responsible parties when it would be useful to do so.

Tsk tsk

Re:Did they ever name the brands? (3, Insightful)

ryturner (87582) | more than 7 years ago | (#18057286)

It would be useful to you and me. But it is not useful to google to release that information.

Re:Did they ever name the brands? (1, Insightful)

AmigaBen (629594) | more than 7 years ago | (#18057374)

How was it useful to Google to publish the report at all?

I don't see the point in pretending to provide information while obfuscating the most meaningful bits of it, unless it's a sales attempt to garner attention for a paid-for version of the report. Obviously, Google has concerns in the process different than what our concerns are, but again, I don't really see the point in the report without the brands.

Re:Did they ever name the brands? (1)

Bill Dog (726542) | more than 7 years ago | (#18057520)

It's for prestige. They are big enough to have gone through enough hard drives to do a study on them, and have smart enough people to do the study.

Re:Did they ever name the brands? (3, Insightful)

HUADPE (903765) | more than 7 years ago | (#18057980)

There are several good reasons to not release the brand names. First, while the sample size is huge, the sample size for a particular model of a particular brand might not be. If they only happened to have 10 of one particular model, and one failed within a month, then 10% fail within a month, but it could just be a fluke. Second, liability. This wasn't a controlled test, it was done live within the Google servers (presumably). Whoever is on the bottom of the list could very well sue Google for libel. Without merit? Probably, but they might eke a few million in a settlement out of them. Google can't appear to be doing evil after all.
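The small-sample point can be made concrete with a confidence interval. The sketch below uses the Wilson score interval, which is one common choice for a binomial proportion and is not something the paper or the poster specifies; the 1-in-10 case is the poster's example, the 1,000-in-10,000 case is a hypothetical comparison.

```python
import math
from typing import Tuple


def wilson_interval(failures: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Approximate 95% Wilson score interval for a failure proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = failures / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half_width), min(1.0, center + half_width)


if __name__ == "__main__":
    # Same observed 10% failure rate, very different sample sizes.
    for failures, n in ((1, 10), (1000, 10000)):
        low, high = wilson_interval(failures, n)
        print(f"{failures}/{n} failed: plausible rate {low:.1%} to {high:.1%}")
```

With one failure in ten drives the interval spans roughly 2% to 40%, so a per-model ranking built on counts that small would say very little.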

Thanks, missed that... (1)

SuperKendall (25149) | more than 7 years ago | (#18057606)

It appears that sentence was right after the part I read about how some makers had better results than others. So of course I scan the whole document looking for said data immediately after reading the first part, but did not return to that exact point thinking I had read it already...

Re:Thanks, missed that... (1)

iminplaya (723125) | more than 7 years ago | (#18057888)

Though I don't like them for not giving the breakdown, they did mention the fact, kind of like when a user in China tries to access censored sites, and they say it's not allowed. Censorship is everywhere. China uses the government, we use "proprietary". Both achieve the same desired result. I suppose it's a good way to leave the way open for alternative search engines, etc. Google has been eaten by sharks.

Re:Did they ever name the brands? (1, Insightful)

Anonymous Coward | more than 7 years ago | (#18057186)

Google's studies are like their search engine: you get a bunch of results, but you have to sift through them yourself to get anything specific, and you'll probably end up reading the section closest related to boobies.

Re:Did they ever name the brands? (2, Insightful)

Xross_Ied (224893) | more than 7 years ago | (#18057188)

They didn't include any data at all about brands.

They should have done brand analysis (without naming the brand) and also rpm analysis.

From the article..

3.2 Manufacturers, Models, and Vintages
Failure rates are known to be highly correlated with drive
models, manufacturers and vintages [18]. Our results do
not contradict this fact. For example, Figure 2 changes
significantly when we normalize failure rates per each
drive model. Most age-related results are impacted by
drive vintages. However, in this paper, we do not show a
breakdown of drives per manufacturer, model, or vintage
due to the proprietary nature of these data.

Re:Did they ever name the brands? (2, Informative)

drmerope (771119) | more than 7 years ago | (#18057192)

No. They explicitly said they would not disclose that... which is a shame because that is probably the only interesting bit of information. The question that really needs to be studied is what distinguishes good drives from bad. This would probably involve disassembling drives of various 'vintages, models, manufacturers' and trying to pin down the relevant details. That way when new hard-drives get released, reviewers can pull them apart and judge them on something other than read/write performance, heat, and acoustics...

Re:Did they ever name the brands? (3, Insightful)

Prof.Phreak (584152) | more than 7 years ago | (#18057230)

At the very least, they could've named brands X, Y, Z, etc., and provided the numbers for those. Would be interesting if the differences are more than marginal.

Very true! (1)

SuperKendall (25149) | more than 7 years ago | (#18057308)

That would have been the perfect way to divulge this data without causing direct harm to any maker - I would really have liked to see if there was a large variance between brands, which might even lead me to purchase brand Y more, even if it's not at the top of the reliability chart - just so long as it was cheaper.

Re:Did they ever name the brands? (1)

mattmacf (901678) | more than 7 years ago | (#18057386)

That way when new hard-drives get released, reviewers can pull them apart and judge them on something other than read/write performance, heat, and acoustics...
You forgot one metric of comparison: the warranty. As far as I'm concerned, this number alone is the most important in determining the reliability of the hard drive. If the manufacturer is willing to say "This drive will last for X years or we replace it free," it speaks volumes about their confidence behind their product. When buying hard drives, I actively seek out drives with at least a 3 (preferably 5) year warranty (some Hitachis and Seagates IIRC) and explicitly avoid those with only a 1 year warranty period (I'm looking at you WD).

Re:Did they ever name the brands? (2, Interesting)

LunarCrisis (966179) | more than 7 years ago | (#18057664)

If the manufacturer is willing to say "This drive will last for X years or we replace it free," it speaks volumes about their confidence behind their product.

Or maybe the manufacturer just realized that 5 years down the road, a replacement for your then 5 year old HD will cost them peanuts. According to the graph at http://en.wikipedia.org/wiki/Hard_drives#Capacity [wikipedia.org] , HD capacity seems to be increasing by roughly ten times every five years.

It's like the CD-R manufacturers stamping all the packaging with 100-year guarantees. They don't really have any good way of telling that they will actually last that long, but the replacement costs nearly nothing, and thus is paid for by the marketing benefits.
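The "ten times every five years" estimate can be turned into a rough replacement-cost figure. This is a sketch under two assumptions that are not from the thread: capacity grows at a steady compound rate, and the price of a typical drive stays roughly flat, so cost per GB falls as capacity rises.

```python
def annual_growth_factor(total_factor: float, years: float) -> float:
    """Compound yearly growth implied by `total_factor` growth over `years`."""
    return total_factor ** (1.0 / years)


def relative_cost_per_gb(total_factor: float, years: float, elapsed_years: float) -> float:
    """Cost per GB relative to today after `elapsed_years`.

    ASSUMES a flat price per drive and steady compound capacity growth.
    """
    return total_factor ** (-elapsed_years / years)


if __name__ == "__main__":
    yearly = annual_growth_factor(10, 5)
    print(f"implied capacity growth: about {yearly - 1:.0%} per year")
    for year in (1, 3, 5):
        share = relative_cost_per_gb(10, 5, year)
        print(f"year {year}: same capacity costs about {share:.0%} of today's price")
```

Under those assumptions, replacing a drive at the end of a five-year warranty costs the manufacturer about a tenth of what the same capacity cost at sale time.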

Re:Did they ever name the brands? (2, Interesting)

Schraegstrichpunkt (931443) | more than 7 years ago | (#18057794)

They explicitly said they would not disclose that... which is a shame because that is probably the only interesting bit of information.

What? So the part about which variables are correlated with drive failures (which is what the report was about) wasn't interesting to you? Too bad.

Re:Did they ever name the brands? (2, Insightful)

repvik (96666) | more than 7 years ago | (#18057222)

"However, in this paper, we do not show a breakdown of drives per manufacturer, model, or vintage due to the proprietary nature of these data." (From TFA)

Translation (3, Funny)

jd (1658) | more than 7 years ago | (#18057304)

"We don't want to be sued to within an inch of our lives by certain very wealthy brands, due to US law allowing manufacturers to prohibit unfavourable reviews."

Ideally, they would have formatted the text to spell out the names of the brands if you take the first letter of every Nth word, or some specific column of text. (Or maybe they have...)

Re:Translation (5, Insightful)

David Price (1200) | more than 7 years ago | (#18057330)

More likely: "We buy millions of dollars worth of drives each year, and our buying decisions are driven in part by the reliability data that we collect. If we told everyone what kind of drives work best, more people would buy those drives, driving up the price that we pay."

Re:Translation (4, Insightful)

the_womble (580291) | more than 7 years ago | (#18057378)

Another translation: Our competitors buy millions of dollars worth of drives as well. We are not going to help them avoid the duff ones.

Re:Translation (5, Insightful)

spisska (796395) | more than 7 years ago | (#18058052)

Another translation:

We're not so bloody stupid to believe that our competitors are standing in the aisle of Circuit City and scratching their head over whether to buy a Seagate or WD drive.

We know that our competitors all have their own metrics and their own relationships with manufacturers and frankly, we don't care. We know our competitors also measure these things, and we're not telling them anything they don't already know.

We aren't particularly worried about saying that some drives fail, because everyone who cares already knows that some drives fail. Everyone whose job it is to know which drives fail first already knows that as well.

But we're not going to tell you which brand fails at a higher rate than normal because we don't need a lawsuit that would cost us a lot of money but in the end would only confirm what the people who need to know these things already know.

We will, on the other hand, describe the tests we ran, our methodology, our results, and our analyses. We do this just for kicks and we hope you can learn something from the results.

And we hope you have a nice day.

Re:Translation (3, Funny)

bendodge (998616) | more than 7 years ago | (#18057644)

How did that get modded insightful? When there is more demand the price goes down, not up!

Re:Translation (1)

Schraegstrichpunkt (931443) | more than 7 years ago | (#18057806)

How did that get modded informative? That's not informative. This [wikipedia.org] is informative.

Re:Translation (1)

Alien Being (18488) | more than 7 years ago | (#18057824)

"When there is more demand the price goes down, not up!"

That would make for some very strange auctions.

Re:Translation (1)

jlarocco (851450) | more than 7 years ago | (#18057850)

How did that get modded insightful? When there is more demand the price goes down, not up!

Sigh. That's the most misinformed post I've ever seen on Slashdot. Demand, by itself, says absolutely nothing about the price of something.
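For anyone following the price argument, here is a toy model with linear supply and demand curves. All coefficients are hypothetical, and real drive pricing (volume contracts, economies of scale) is more complicated; the sketch only shows that with an upward-sloping supply curve, an outward shift in demand raises the equilibrium price.

```python
def equilibrium_price(a: float, b: float, c: float, d: float) -> float:
    """Price at which demand (a - b*P) equals supply (c + d*P).

    a, b, c, d are hypothetical coefficients of a toy linear market.
    """
    if b + d == 0:
        raise ValueError("curves are parallel; no unique equilibrium")
    return (a - c) / (b + d)


if __name__ == "__main__":
    before = equilibrium_price(a=100, b=2, c=10, d=1)
    # More buyers want the "good" brand: the demand curve shifts outward.
    after = equilibrium_price(a=130, b=2, c=10, d=1)
    print(f"price before demand shift: {before:.2f}")
    print(f"price after demand shift:  {after:.2f}")
```

The direction of the change depends on both curves, which is why demand alone does not settle the question.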

Re:Translation (1)

Jahz (831343) | more than 7 years ago | (#18057938)

More likely: "We buy millions of dollars worth of drives each year, and our buying decisions are driven in part by the reliability data that we collect. If we told everyone what kind of drives work best, more people would buy those drives, driving up the price that we pay."
You tard. Demand and price in a free market are inversely proportional. Go back to high school economics! Not only that, but the great drive company mentioned would probably get more press and money, leading to more R&D and even better drives.

I wish Google released the data they found because it would force the crappy drive companies to improve their products.

DUH (-1, Flamebait)

Lord Kano (13027) | more than 7 years ago | (#18057238)

Did anyone else find that data?

Google it. [google.com]

LK

Re:Did they ever name the brands? (5, Funny)

Anonymous Coward | more than 7 years ago | (#18057458)

They would have released that data, but it was saved on a Maxtor.

Funniest response of the whole story (0, Redundant)

SuperKendall (25149) | more than 7 years ago | (#18057622)

And I agree with your implication.

Re:Did they ever name the brands? (1)

MadMorf (118601) | more than 7 years ago | (#18057534)

They specifically stated they would not be revealing the brands or models.

I think that's understandable given the litigious nature of business today...

Makes it a little less useful from a practical standpoint though...

What do you want to bet (1)

Beryllium Sphere(tm) (193358) | more than 7 years ago | (#18057884)

that it changes more from year to year and model to model than from one manufacturer to another?

Re:Did they ever name the brands? (1)

Nogami_Saeko (466595) | more than 7 years ago | (#18057970)

I was disappointed that they didn't offer this information in the report - but not really surprised.

Re:Did they ever name the brands? (1)

nolife (233813) | more than 7 years ago | (#18058058)

You did not miss anything. The report states:

However, in this paper, we do not show a
breakdown of drives per manufacturer, model, or vintage
due to the proprietary nature of these data.


and then add to it with:

Interestingly, this does not change our conclusions. In
contrast to age-related results, we note that all results
shown in the rest of the paper are not affected significantly
by the population mix.


Proprietary? Wrong use of the word there. What they really mean is we do not want to make specific companies look bad or maybe they do not want people to make incorrect conclusions based on the scope of their specific testing. In reality, I think the specific models and companies would be interesting though.
For hard drives in general, this is very interesting information. For what specific drives to avoid, this report is not useful.

Re:Did they ever name the brands? (1)

Ruvim (889012) | more than 7 years ago | (#18058076)

All we have to do now is watch whom Google gets its drives from next...

So (-1, Offtopic)

Anonymous Coward | more than 7 years ago | (#18057118)

are Western Digital and Maxtor really crap or not ?

Google should name and shame...

Re:So (1)

jd (1658) | more than 7 years ago | (#18057328)

You take the Google paper and the twenty others on disk failure, take the third page of each, sort them by their papers' Google rankings and take the middle letter of every 42nd word, whilst standing in the middle of a pentagram under the second full moon of the month.

Re:So (1)

triffid_98 (899609) | more than 7 years ago | (#18057472)

I've personally had much better luck with manufacturers offering 5 year warranties on their media. This does not include either of the manufacturers you mentioned...

Re:So (2, Informative)

mightyQuin (1021045) | more than 7 years ago | (#18057486)

From my experience, Western Digitals are (relatively) reliable. They unfortunately do not have the same power connector orientation as any other consumer drive on the planet, so if you want to use IDE RAID you have to get the type that either (1) fits any consumer ide drive or (2) fits a Western Digital Drive. (grr)

Had some good experiences with Maxtor. A couple of years ago (OK - maybe 6 or 8) we had batches of super reliable Maxtors - 10GB.

Some Samsungs are good, some are evil - the SP0411N was a particularly reliable model - the SP0802N sucked - out of a batch of 20, 15 of them died within a year: all reallocated sector errors beyond the threshold.

Seagates are a mixed bag too - been having a nice experience with the SATA models 160GB and 120GB - can't remember their model #'s off the top of my head. - The older Seagates, though, I spent a fair amount of time replacing.

IBM DeskStar's, as far as I know, have been quite good - for some reason didn't use too many.

Re:So (0)

Anonymous Coward | more than 7 years ago | (#18057676)

Heh. The OLD Deskstars were great. I had a 1GB and 20GB over the years, and they were fantastic. Some newer 20GB on up, there was a downright scandal about extremely high failure rates on certain lines. It sounds like 1 plant producing them was turning out duds with a near 100% failure rate. IBM sold off the storage division to Hitachi, who now sells Hitachi Deskstars. I can only assume they closed the bad plant, or made sure the clean room was actually clean 8-).

Re:So (2, Informative)

nevesis (970522) | more than 7 years ago | (#18057738)

Interesting.. but I disagree with your analysis.

The DeskStars were nicknamed DeathStars due to their high failure rate.

Maxtor has a terrible reputation in the channel.

Seagate has a fantastic reputation in the channel.

And as far as the WD power connectors.. I have 4 Western Digitals, a Samsung, a Maxtor, and a Seagate on my desk right now.. and they all have the same layout (left to right: 40 pin, jumpers, molex).

Re:So (1)

mightyQuin (1021045) | more than 7 years ago | (#18058072)

It's a very slight difference in the positioning of the WD power connector within the physical position on the drive. It's still a standard 4-pin power connector, but you cannot slide it into the housing of an AccuSYS IDE RAID drive bay. You have to order a different AccuSYS model that is specifically for WD parallel IDE drives.

Out of curiosity, what model of Seagate has the fantastic rep?

Re:So (1)

Nogami_Saeko (466595) | more than 7 years ago | (#18057998)

I had a bad run with Western Digital drives a while back and switched to Maxtors, which I found to be very reliable when they were first putting out 250GB drives. Had a bad experience with a Seagate dropping dead within the first week after purchase, fortunately I got most of my data off of it.

Seagate also does NOT offer advance drive replacement in Canada, which means I'll never buy another of their products until this policy changes.

Had good luck with more recent Western Digital drives. Put 5 x 500GB in a RAID-5 server, and they're running great!

N.

You guuuyyys... (-1, Offtopic)

iminplaya (723125) | more than 7 years ago | (#18057122)

PDF alert, Okay? Now I have to crash my browser because the download won't finish.

Re:You guuuyyys... (0)

Anonymous Coward | more than 7 years ago | (#18057254)

Why don't you just set your browser to 'download PDF files to disk' instead of 'opening PDF files in browser window'. That way, you can always abort the download, or better still, continue browsing while the PDF downloads?

Re:You guuuyyys... (1)

iminplaya (723125) | more than 7 years ago | (#18057478)

The download crapped out. And I couldn't close the tab. It's just an unpleasant surprise when everything locks up for a while. It's just two little words. A simple courtesy, no? For those of us who don't always remember to check the status bar. I just right clicked and saved the link after reloading.

TargetAlert for FireFox (1)

goldragon (170416) | more than 7 years ago | (#18057618)

if you use Firefox, get the TargetAlert extension. it adds a small image after links that are pdfs, Word docs, etc. so you'll have some forewarning.

Re:You guuuyyys... (0, Offtopic)

westyvw (653833) | more than 7 years ago | (#18057264)

What browser? Don't you have a kpdf or xpdf ready to read it???? DUH

Re:You guuuyyys... (0, Flamebait)

avalys (221114) | more than 7 years ago | (#18057300)

Why should we have our screens cluttered up with "PDF ALERT! PDF ALERT!" because you can't figure out how to configure your system properly?

If you're too lazy or lacking in knowledge, buy a Mac - PDFs load instantly in OS X right out of the box.

Re:You guuuyyys... n0 sk1llz (0)

Anonymous Coward | more than 7 years ago | (#18057506)

In my Firefox browser I can see a nice little PDF icon warning me of a PDF file.

Also, no need to buy a mac, PDFs work instantaneously outta box on my Ubuntu Linux...

Re:You guuuyyys... (0)

Anonymous Coward | more than 7 years ago | (#18057336)

Just kill acrord32.exe. Firefox recovers and gives you a blank page with control back to you.

Google had this paper ready a year ago (3, Funny)

Anonymous Coward | more than 7 years ago | (#18057138)

But the disk it was on failed.

Conclusion (3, Informative)

llZENll (545605) | more than 7 years ago | (#18057152)

This is awesome, but the conclusion of such an interesting study leaves a lot to be desired. FTA...

"In this study we report on the failure characteristics of consumer-grade disk drives. To our knowledge, the study is unprecedented in that it uses a much larger population size than has been previously reported and presents a comprehensive analysis of the correlation between failures and several parameters that are believed to affect disk lifetime. Such analysis is made possible by a new highly parallel health data collection and analysis infrastructure, and by the sheer size of our computing deployment.

One of our key findings has been the lack of a consistent pattern of higher failure rates for higher temperature drives or for those drives at higher utilization levels. Such correlations have been repeatedly highlighted by previous studies, but we are unable to confirm them by observing our population. Although our data do not allow us to conclude that there is no such correlation, it provides strong evidence to suggest that other effects may be more prominent in affecting disk drive reliability in the context of a professionally managed data center deployment.

Our results confirm the findings of previous smaller population studies that suggest that some of the SMART parameters are well-correlated with higher failure probabilities. We find, for example, that after their first scan error, drives are 39 times more likely to fail within 60 days than drives with no such errors. First errors in reallocations, offline reallocations, and probational counts are also strongly correlated to higher failure probabilities. Despite those strong correlations, we find that failure prediction models based on SMART parameters alone are likely to be severely limited in their prediction accuracy, given that a large fraction of our failed drives have shown no SMART error signals whatsoever. This result suggests that SMART models are more useful in predicting trends for large aggregate populations than for individual components. It also suggests that powerful predictive models need to make use of signals beyond those provided by SMART."
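
To see how a signal can be strongly correlated with failure and still be a poor predictor for any single drive, here is a rough sketch. Every count in it is invented for illustration and none comes from the paper; only the "39 times" ratio is borrowed from the quote above, and the helper names are hypothetical.

```python
# Hypothetical counts, invented purely for illustration; they are NOT from the paper.
def relative_risk(failed_flagged, total_flagged, failed_clean, total_clean):
    """Failure rate among flagged drives divided by the rate among clean drives."""
    return (failed_flagged / total_flagged) / (failed_clean / total_clean)

def recall(failed_flagged, failed_clean):
    """Fraction of all failures that showed the warning signal first."""
    return failed_flagged / (failed_flagged + failed_clean)

# Made-up fleet: 1,000 drives with a scan error, 99,000 without.
flagged_total, flagged_failed = 1_000, 390
clean_total, clean_failed = 99_000, 990

rr = relative_risk(flagged_failed, flagged_total, clean_failed, clean_total)
rc = recall(flagged_failed, clean_failed)
print(f"relative risk: {rr:.0f}x, failures caught by the signal: {rc:.0%}")
```

With these made-up numbers the flagged drives fail 39 times as often, yet fewer than a third of all failures were preceded by the signal, which is the shape of the problem the conclusion describes.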

Similar paper (4, Informative)

reset_button (903303) | more than 7 years ago | (#18057164)

I was at the talk, and it was very interesting. CMU also had a paper (PDF) [cmu.edu] about disk failures in the same conference (in fact, they presented one after the other).

Re:Similar paper (1)

Driador (923291) | more than 7 years ago | (#18057748)

I was also there this year; both papers presented were very interesting. I have the feeling I will be chewing over the printed Proceedings book here for a while.

and in the meanwhile... (3, Informative)

pedantic bore (740196) | more than 7 years ago | (#18057220)

... at the same conference, Bianca Schroeder presented a paper [cmu.edu] on disk reliability that developed sophisticated statistical models for disk failures, building on earlier work by Qin Xin [ucsc.edu] and a dozen papers by John Elerath... [google.com]

C'mon, slashdot. There were about twenty other papers presented at FAST this year. Let's not focus only on the one with Google authors...

Re:and in the meanwhile... (3, Insightful)

oGMo (379) | more than 7 years ago | (#18057410)

While at a glance, it may seem like this is simply "the latest thing google did," and... let's be honest, given the editor in question... this was most likely the reason it made the front page. But while Bianca Schroeder's report, for instance, uses statistics from various unnamed sources and for various unnamed uses, the Google report is interesting because we know exactly where it's coming from and what it's being used for.

Of course, a truly insightful story would have taken this opportunity to compare Google's findings with the others and report on that.

With all of Google's cash... (0, Offtopic)

Anonymous Coward | more than 7 years ago | (#18057232)

you'd think they could afford statisticians. Survival analysis anyone? http://cran.r-project.org/ [r-project.org]
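
For anyone wondering what survival analysis would look like here, a minimal sketch of a Kaplan-Meier estimator follows. It is in plain Python rather than R, which is my choice and not the poster's, and the drive ages are made up. The point is that drives still running when observation stops (censored) count toward the at-risk population without being counted as failures.

```python
from collections import Counter

def kaplan_meier(durations, failed):
    """Return [(time, survival_probability)] at each time where a failure occurred.

    durations: observed age of each drive (e.g. months in service)
    failed:    True if the drive failed at that age, False if it was still
               running when observation stopped (censored)
    """
    deaths = Counter(t for t, f in zip(durations, failed) if f)
    exits = Counter(durations)  # drives leaving the risk set at each time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the fraction of at-risk drives that survived time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve

# Made-up ages in months; False means the drive was still alive when observed.
ages = [3, 6, 6, 12, 18, 24, 24, 36]
died = [True, True, False, True, False, True, False, False]
for t, s in kaplan_meier(ages, died):
    print(f"month {t:>2}: estimated survival {s:.3f}")
```

A real analysis would also want confidence intervals and a breakdown by model or vintage, which this sketch leaves out.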

SMART works for me. (1)

shadowofdarkness (578100) | more than 7 years ago | (#18057260)

I find SMART to work at detecting failures. A couple months ago I turned on my laptop and it gave me a SMART error saying my hard drive was going to die soon. But it gave me the choice of continuing bootup, which I did from a livecd to make one final backup. I never lost one bit of data thanks to dd'ing to another computer on my network.

Temperature conclusion (4, Interesting)

phasm42 (588479) | more than 7 years ago | (#18057310)

Their statistics on temperature seem very unusual. I'm surprised they didn't explore this more. For example, is the high failure rate associated with low temperatures because the drives were more likely to be inactive due to failure?

Re:Temperature conclusion (2, Insightful)

Chalex (71702) | more than 7 years ago | (#18057548)

The chart implies that the "optimal" operating drive temperature is 35-45 Celsius. Drive temperatures below room temperature (below 22 Celsius) are probably not a scenario that drive manufacturers optimise for.

Re:Temperature conclusion (3, Interesting)

gnu-sucks (561404) | more than 7 years ago | (#18057568)

My guess is this graph on temperature distribution is more or less a graph of temperature sensor accuracy. I can't imagine that drives at 50C had the lowest failure rate.

While this would require a more laboratory-like environment, a dozen drives of each type and manufacture could have been sampled at known temperatures, and a data curve could have been established to calibrate the temperature sensors.

There are lots of studies out there where drives were intentionally heated, and higher degrees of failure were indeed reported (this is mentioned in the google report too). So the correlation is probably still valid, just not well-proven.

Lower temp == higher failure rates (4, Interesting)

flyingfsck (986395) | more than 7 years ago | (#18057350)

To my mind the most significant piece of info: "The figure shows that failures do not increase when the average temperature increases. In fact, there is a clear trend showing that lower temperatures are associated with higher failure rates. Only at very high temperatures is there a slight reversal of this trend."

Re:Lower temp == higher failure rates (1)

beavis88 (25983) | more than 7 years ago | (#18057446)

But did the lower temperature actually cause the failures? Such a counterintuitive conclusion seems like it'd be worth some further examination...I can turn off some fans in my cases and get the drives back up into the 40-45C range pretty quickly if need be!

Re:Lower temp == higher failure rates (2, Insightful)

Anonymous Coward | more than 7 years ago | (#18058032)

perhaps there is some correlation between lower temperature and higher forces, ie. a drive that starts and stops frequently may have a lower temperature, but would undergo more acceleration and stress

Proprietary makes sense here (4, Insightful)

Mammothrept (588717) | more than 7 years ago | (#18057396)

"...we do not show a breakdown of drives per manufacturer, model, or vintage due to the proprietary nature of these data."

Litigation avoidance may be a consideration here but why not take Google at their word? Google is a search company that buys lots of hard drives. Based on their own internal research, they have developed information about which hard disk models and/or manufacturers are shite.

Yahoo is also a search company that buys lots of hard drives. Why should Google give that hard drive reliability information to you, me and Yahoo for free? Let Yahoo/Excite/MSN and the competitors figure it out for themselves.

Yeah, sure I'd like to have access to Google's data the next time I'm in the market for a hard drive but I won't hold a grudge against them if they don't do my consumer research for me. On the other hand, whereinafuck is the data from Tom's Hardware Guide, Anandtech, Consumer Reports and all the other reviewer and consumer sites? If someone doesn't have a handy link to their results, I'll see if I can google something up:

http://www.google.com/search?hl=en&safe=off&client=firefox-a&rls=com.ubuntu%3Aen-US%3Aofficial&hs=tqy&q=hard+drive+reliability+research+brands++manufacturers+models&btnG=Search [google.com]

Re:Proprietary makes sense here (0, Interesting)

Anonymous Coward | more than 7 years ago | (#18057496)

Why not, "Here's what works best for us, maybe this additional data will help improve reliability and help the entire computing field in general."? And maybe everyone in the world (betterment of humanity, that sort of thing?) could benefit from it? Like by avoiding a product line that is demonstrably inferior (No worries about lagging sales, I'm sure Dell would buy them for their discount line of PCs).

I forget: It's always "fuck people", and "fuck trying to make this world a better place", and "Where's my goddamn profit I'm entitled to?!", and "Get back to work slaves..."

Yeah it makes sense to lock everything up as proprietary. Nothing to spur progress and prevent waste like having multiple efforts duplicated and hiding the results so nobody is sure what is the best way, and taxing and profiting any way how. I can't wait until they figure out a way to charge us to breathe. Can I get my verichip tracking device embedded in my skull please? Open Source is treason. Zeig Heil her Bush & Blair and Halliburton and Google.

Re:Proprietary makes sense here (0)

Anonymous Coward | more than 7 years ago | (#18057566)

Apparently Google's secret other motto is "Do no good."

This speaks volumes. (4, Funny)

greenguy (162630) | more than 7 years ago | (#18057398)

Google releases a paper on disk reliability.

TEp!.. (-1, Troll)

Anonymous Coward | more than 7 years ago | (#18057416)

fly...3on't fear fr0m one folder on

So this article.. (0, Offtopic)

shiningdays (991952) | more than 7 years ago | (#18057574)

has quite a few grammatical errors. Is this a result of disk failure?

power supplies (1)

digitalhermit (113459) | more than 7 years ago | (#18057700)

This is completely anecdotal, unscientific... Since building out two servers a couple years ago, each with approximately 800G of drive space, I've had to replace drives at an average of one every 8 weeks. In my lab there are about twenty drives across 8 machines, so that number is not too bad. Or so I thought. After replacing all my power supplies my drive failures have gone way down. The only drive I've lost recently is one in an older machine with an ancient 300W power supply.

How about a little color? (0)

Anonymous Coward | more than 7 years ago | (#18057706)

Would it have killed them to vary the colors in the charts -- Figures 8 and 11-13 are pretty much unreadable even when zoomed above 100%.

OS X SMART tool? (0)

Anonymous Coward | more than 7 years ago | (#18057766)

So what tool on Mac OS X will provide all the SMART data?

Run smartd and look for scan errors (1)

SysKoll (48967) | more than 7 years ago | (#18057808)

Well, the article's conclusion looks pretty clear to me. Watch for scan errors in smartd reports. When they start happening, migrate your data off that disk and replace it.
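
As a rough sketch of what "watching" could mean in practice, the snippet below scans text laid out like the attribute table that `smartctl -A` prints and reports watched counters with a nonzero raw value. The sample text and its values are made up, the function name is hypothetical, and which attributes correspond to the paper's "scan errors" and "reallocations" is my assumption; names and columns can vary by drive and vendor.

```python
# Attribute names as smartmontools commonly prints them; which ones map to the
# paper's "scan errors" and "reallocations" is an assumption, not from the paper.
WATCHED = {"Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable"}

def nonzero_watched(report):
    """Return {attribute: raw_value} for watched attributes with a nonzero raw count."""
    hits = {}
    for line in report.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            raw = fields[9]
            if raw.isdigit() and int(raw) > 0:
                hits[fields[1]] = int(raw)
    return hits

# Illustrative text laid out like the attribute table of `smartctl -A`; values are made up.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       12
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       4821
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       3
"""

for name, count in nonzero_watched(SAMPLE).items():
    print(f"{name}: {count} -> back up and plan a replacement")
```

Keep in mind the paper's other finding: plenty of drives die with every one of these counters still at zero, so a clean report is no substitute for backups.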

I'm obviously behind the times, but... (0)

NeuroManson (214835) | more than 7 years ago | (#18057840)

What is SMART monitoring really good for? Not one drive I've had it enabled on has given me a "warning, this drive is about to fail" alert. Instead, it would be a random clunking sound, or the system would freeze up entirely (hard to tell what's at fault at first, if you use Windows.;).

Now I'm no engineer, but what strikes me as a better alternative is to toss on a 1Gb flash storage chip, and keep a redundant index/record of live/recoverable sectors. As the HD first starts to fail, the BIOS can pop up a warning window on reboot, advising the user to put in a replacement. After which, the HD clones itself automatically, based on the index files. Files that are damaged could be recovered as well, or discarded in the ol' bitbucket. 45 minutes for an OS repair install, and you're done. No scrambling to download everything all over again.

Another alternative is a hybrid solid state HD (I think there was a /. article about this a while ago). If the HD BIOS detects impending doom, it can just dump the most critical (eg; user files, OS, irreplaceable stuff) to flash, then copy it over to a fresh drive.

But anyhoo, that's my 2 cents.

Re:I'm obviously behind the times, but... (2, Informative)

DragonTHC (208439) | more than 7 years ago | (#18057874)

that sounds like a great idea, however, flash memory has a habit of failing with no warning whatsoever as well.

Re:I'm obviously behind the times, but... (1)

sporkmonger (922923) | more than 7 years ago | (#18057918)

Or, you could always try using ZFS instead. But then you'd have to either run Solaris or wait for one of the ongoing ports of ZFS to finally be finished. But yeah... the solution, IMHO, isn't more/better hardware, but rather better software.

Woohoo! (1)

memnoch37 (1047172) | more than 7 years ago | (#18057902)

Now I don't feel bad about turning off SMART reporting all those years ago. I never did trust that crap... On a side note, it would be interesting to see who Google signs their next contract for disk drives with...

drive failure != lost data (1)

sxtxixtxcxh (757736) | more than 7 years ago | (#18057914)

so, what do they do with the failed drives? where does that data go? what is their procedure for tossing these drives?