Intel Confirms Data Corruption Bug, Halts New SSDs

ScuttleMonkey posted more than 4 years ago | from the solid-state-death dept.

Data Storage 137

CWmike writes "Intel has confirmed that its new consumer-class X25-M and X18-M solid-state disk drives (SSDs) suffer from data corruption issues and said it has pulled back shipments to resellers. The X25-M (2.5-inch) and X18-M (1.8-inch) SSDs come out of a joint venture with Micron and use that company's 34-nanometer lithography technology. That process allows for a denser, higher-capacity product with a lower price tag than Intel's previous offerings, which were based on 50-nanometer lithography. Intel says the data corruption problem occurs only if a user sets a BIOS password on the 34-nanometer SSD, then disables or changes the password and reboots the computer. When that happens, the SSD becomes inoperable and the data on it is irretrievable. This is not the first time Intel's X25-M and X18-M SSDs have suffered from firmware bugs: the company's first generation of drives suffered from fragmentation issues resulting in performance degradation over time, and Intel issued a firmware upgrade as a fix."

137 comments

Test before you ship (4, Interesting)

alain94040 (785132) | more than 4 years ago | (#28933773)

Maybe they should have used HW/SW co-verification (like Seagate in that study [eve-team.com] - an example of how a storage company tests their firmware).

For you software developers out there who enjoy free debuggers, you should know that we hardware designers also have our own debuggers. Except they are a little bit more expensive (think $500,000+) and can be quite bulky. But they are the only way to really test firmware before taping out a chip.

Typical redditor (0, Troll)

Sybert42 (1309493) | more than 4 years ago | (#28933903)

This is the land of computer science people--they are the majority.

Re:Typical redditor (1)

QuoteMstr (55051) | more than 4 years ago | (#28933925)

What the hell is that supposed to mean? Data structures and algorithms don't suddenly work differently when they're synthesized from Verilog instead of compiled from C.

Re:Typical redditor (3, Insightful)

Anonymous Coward | more than 4 years ago | (#28934119)

Yes, they do.

C doesn't have voltage or current leaks.

Re:Typical redditor (1)

QuoteMstr (55051) | more than 4 years ago | (#28934155)

So how do voltage and current leaks invalidate the universal mathematical principles of computer science? I'm beginning to get a whiff of anti-intellectualism here.

Re:Typical redditor (4, Insightful)

Movi (1005625) | more than 4 years ago | (#28934317)

Because suddenly your code becomes time-based, e.g. it matters WHEN x=0 becomes x=1, and what happens in between.

Believe me, this kicks you in the balls really hard. I still remember the frustration on my Altera course, where in simulation everything worked fine, but once flashed onto an FPGA everything went to shit.

Re:Typical redditor (2, Interesting)

atmurray (983797) | more than 4 years ago | (#28935475)

So? It's just a set of different paradigms. It's just like using a different programming language. 99.9% of the time, if your code works during functional verification testing (which doesn't simulate the physics of the hardware), it will work fine in timing/hardware verification and then also in real hardware (so long as you don't violate any timing constraints, which your synthesis tool will tell you about). That's one of the reasons why RTL synthesis tools from vendors like Cadence are so insanely expensive: they let you go from functional verification, which verifies the syntax and semantics of your code, to hardware verification, which lets you ensure your design will work as expected in actual hardware. If you're getting "kicked in the balls really hard" then it's probably because you need to brush up on your VHDL/Verilog, just like if you're getting segfaults when writing C you're doing something wrong. It doesn't mean that the process is any less deterministic.

Re:Typical redditor (3, Insightful)

Obfuscant (592200) | more than 4 years ago | (#28935671)

... just like if you're getting segfaults when writing C you're doing something wrong. It doesn't mean that the process is any less deterministic.

If you are getting segfaults in C you usually ASSUME that the processor you are running on is acting in a deterministic manner and ASSUME the problem is your code.

The DIFFERENCE is that SOMETIMES the underlying hardware is not acting deterministically because it is a PHYSICAL system that has physical flaws or imperfections. Like leakage currents that are JUST a tiny bit too much, or depend on the state of the neighboring circuit or the temperature.

In other words, I've written C code that had "segfaults" and it wasn't the fault of the C code, it was memory issues that resulted in problems. And I've written C code that suffered from a buggy compiler, too. I've also written code that "misread" about 1% of the characters typed in at the terminal, and it wasn't the code that was at fault, it was the UART.

I don't know anything about the source of Intel's problem, but I will say that they can send me ALL of the "defective" SSDs and I'll give them a home where I promise never to set a password on the disk or change it after I do.

Re:Typical redditor (4, Informative)

NP-Incomplete (1566707) | more than 4 years ago | (#28935659)

On a chip, adding 2^256-1 and 1 may not equal 2^256 when:
  1. Your destination register is 256 bits.
  2. Your destination register is in a different clock domain.
  3. Your timing constraints are wrong.
  4. Your power grid cannot support switching 256 registers.

Functional simulations will only catch #1.
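For the software folks, case #1 is the only one of those that exists in pure code. A minimal C sketch of that purely functional failure, modeling a 256-bit register as four 64-bit words (the multi-word ripple-carry add is just an illustration, not anyone's actual RTL):

    #include <stdint.h>
    #include <stdio.h>

    /* Model a 256-bit register as four 64-bit limbs, least significant first. */
    typedef struct { uint64_t limb[4]; } reg256;

    /* Ripple-carry addition, truncated to 256 bits exactly as a 256-bit
     * destination register would truncate it. */
    static reg256 add256(reg256 a, reg256 b) {
        reg256 r;
        uint64_t carry = 0;
        for (int i = 0; i < 4; i++) {
            uint64_t s = a.limb[i] + carry;
            carry = (s < carry);              /* carry out of a + carry_in */
            r.limb[i] = s + b.limb[i];
            carry += (r.limb[i] < s);         /* carry out of the full sum */
        }
        /* The final carry (the 257th bit) is simply dropped. */
        return r;
    }

    int main(void) {
        reg256 max = {{ ~0ULL, ~0ULL, ~0ULL, ~0ULL }};   /* 2^256 - 1 */
        reg256 one = {{ 1, 0, 0, 0 }};
        reg256 sum = add256(max, one);
        /* Prints all zeros: the result wrapped, it is not 2^256. */
        printf("%016llx %016llx %016llx %016llx\n",
               (unsigned long long)sum.limb[3], (unsigned long long)sum.limb[2],
               (unsigned long long)sum.limb[1], (unsigned long long)sum.limb[0]);
        return 0;
    }

Cases #2-#4 can make even this "correct" logic return garbage intermittently on silicon, and no amount of staring at the C (or at the RTL source) will show why.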

Re:Typical redditor (1)

chgros (690878) | more than 4 years ago | (#28934877)

C doesn't have voltage or current leaks.

But C has a lot more loops and pointers, which makes verification a lot harder (I work on a static analysis tool for C/C++, and it's also very expensive ;) )
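A tiny made-up illustration of why loops plus pointers hurt (not from any particular tool's test suite): whether the writes through dst below stay in bounds depends on a loop condition and an allocation size that only meet at run time, so the analyzer has to reason about every n a caller could pass.

    #include <stdlib.h>

    /* Copies n ints. A static analyzer must prove that every dst[i] write
     * stays inside the malloc'd allocation, for every possible n. */
    int *copy_ints(const int *src, size_t n) {
        int *dst = malloc(n * sizeof *dst);
        if (dst == NULL)
            return NULL;
        for (size_t i = 0; i <= n; i++)   /* off-by-one: reads and writes one element past the end */
            dst[i] = src[i];
        return dst;
    }

Catching that off-by-one requires relating the loop bound to the allocation size symbolically, which is exactly the kind of reasoning that gets expensive as the loops and pointer chains pile up.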

Re:Typical redditor (0)

Anonymous Coward | more than 4 years ago | (#28935573)

Nonsense. In C/C++ I can set breakpoints or use a debugger to see anything I want. In hardware it is often impossible to watch the signal at a given part of a circuit because it is sensitive to my probes, too small, inside a chip, etc. For that reason I have to figure out which part of the hardware is faulty by diagnosing its effects on other circuits that I can get to. The process of elimination can often be hard and lengthy - especially with analogue circuitry.

Re:Typical redditor (1)

chgros (690878) | more than 4 years ago | (#28936145)

I guess in hardware static analysis is easier, and dynamic analysis is harder.
The link I saw seemed to indicate static analysis.

Re:Typical redditor (1)

ihavnoid (749312) | more than 4 years ago | (#28935563)

Yes, it doesn't work that way. If you have ever tried to design something using Verilog or VHDL, and tried to turn it into a real-world design, either an FPGA or a real chip, you will see that things aren't so easy.

I learned it the hard way during the last year of my undergraduate course. The simulation worked perfectly - correct input, correct output. On the other hand, making it work on the FPGA was a horrible, horrible, horrible job. It took 2 weeks of trying this, trying that, still with no clue.

Although the problem was a small behavior/synthesis mismatch, I found out that this was going to be a horrible job, because you may have bosses thinking just like you, asking you to complete the implementation in a few days. The truth is, each synthesis job (the equivalent of compiling) takes hours (if not days) to complete, and it is almost certain that it won't run on the first try. Believe me, there is a reason there is a multi-billion dollar market for designing and verifying chips, where a huge portion of that is verification and debugging.

Firmware is in a similar state. You have to work around hardware bugs, e.g. you have to avoid calling some instruction that is supposed to work, and did work in simulation, because the processor screws itself up when that instruction is called once every million times. Not calling that instruction may be possible, but identifying the problem gets really dirty.

Now I write simulators and models for simulation, rather than writing HDL code that should end up inside some FPGA or ASIC. I am much happier now, since Intel and AMD did a lot of work to verify and fix their dirty bugs, and I can trust the underlying hardware.

Re:Typical redditor (0)

Anonymous Coward | more than 4 years ago | (#28934037)

This is the land of nerds--News for nerds. Stuff that matters.

Or are you just new around here?

Re:Typical redditor (1, Funny)

Anonymous Coward | more than 4 years ago | (#28934147)

"Or are you just new around here?"

I would ask the same of you, replying to an obvious troll like that :P

Re:Test before you ship (5, Informative)

Anonymous Coward | more than 4 years ago | (#28934069)

As a professional FW tester, I can say 1) firmware can be tested more easily than the hardware verification the parent is talking about, and 2) the parent is confusing HW verification with firmware verification. Don't confuse HW verification with firmware verification, and don't confuse software testing with hardware verification. They are vastly different from each other, and have their own sets of tools and methods (try sitting through a STAR East or STAR West seminar as a FW tester - it is a total waste of time).

I can (and do) test firmware on buggy hardware all day long - it's not an issue.

Re:Test before you ship (0)

Anonymous Coward | more than 4 years ago | (#28934137)

FPGA-based HW/SW development beats any co-verification methodology if the CPU/MPU actually exists in silicon. There is nothing like at-speed validation.

BTW $500k for ModelSim seems a little steep :). You must be talking about an emulator.

Re:Test before you ship (1, Insightful)

Anonymous Coward | more than 4 years ago | (#28934235)

For you software developers out there who enjoy free debuggers, you should know that we, hardware designers, also have our own debuggers. Except they are a little bit more expensive (think $500,000+) and can be quite bulky. But they are the only way to really test firmware before taping-out a chip.

Or, if you had designed your FW properly (as a piece of modularized software running with stubs and drivers for testability), you could have tested it before dumping it to a live EPROM. Or are you proposing that this was a real hardware fault, and not a problem with the firmware?

Sorry, your software is not a unique snowflake. I know you think it's special because it runs in an embedded environment, but if you choose to ignore the best practices software developers have spent the last 60 or 70 years developing, then you do so at your own peril. Your failure to do things properly is not because your discipline is light-years more complicated than ours; it's simply because you think you are too good to learn from us.

Re:Test before you ship (0, Flamebait)

rtb61 (674572) | more than 4 years ago | (#28935167)

Nah, c'mon, everyone knows what happens when it's a software fault rather than a hardware fault: you simply lie about it for the first few months while you create a patch, blame the fault on user configurations, hardware, drivers and other applications, then secretly incorporate a fix in the next bug, er, security fix.

Weird, isn't it: hardware costs more to design and far more to produce, yet it has real warranties, not B$ fantasy warranties, and it is way more reliable to 'boot' (get it, heh).

Here is a hint: if you want to know which product is the most reliable, RTFW - read the F*****G warranty.

Re:Test before you ship (0)

Anonymous Coward | more than 4 years ago | (#28935529)

Intel develops and manufactures the most complicated artifacts in the universe; I think they probably know a...little something about testing. No matter what you do, sometimes bugs just make it through.

Or just settle and Gag Order (1)

Anonymous Cowar (1608865) | more than 4 years ago | (#28935571)

At least they're not like some companies [slashdot.org] that ignore that there is a (tiny tiny tiny) problem and just gag their customers.

Ugh... summary.... (3, Informative)

blahplusplus (757119) | more than 4 years ago | (#28933833)

"The company's first generation of drives suffered from fragmentation issues resulting in performance degradation over time."

The performance degradation in the Intel X-25 is not because of a "firmware bug". All SSDs will suffer performance degradation whether or not their writing/wear-leveling algorithms have been updated via firmware.

Re:Ugh... summary.... (4, Informative)

ShadowRangerRIT (1301549) | more than 4 years ago | (#28933997)

The X25-M's initial firmware was unusually bad; the degradation was more rapid and more severe than necessary. Thus, they issued a firmware update [slashdot.org] . The results were quite impressive [pcper.com] . It not only reduced the perf degradation, but it seems to have made writes faster across the board.

Re:Ugh... summary.... (2, Informative)

blahplusplus (757119) | more than 4 years ago | (#28934205)

"Although Intel acknowledged that all of its SSDs will suffer from reduced performance because of significant fragmentation, the type of write levels needed to reproduce PC Perspective's results aren't likely for everyday users, whether they're running Windows and Apple's Mac OS X. Even so, it still released the firmware upgrade to slow fragmentation."

Re:Ugh... summary.... (0)

Anonymous Coward | more than 4 years ago | (#28934261)

Then what did the firmware fix, if not performance degradation? It didn't eliminate the degradation completely, just the fragmentation from the firmware bug.

Re:Ugh... summary.... (2, Informative)

cecom (698048) | more than 4 years ago | (#28934487)

The X25-M's initial firmware was unusually bad; the degradation was more rapid and more severe than necessary.

Unusually bad? More severe than necessary? Not really. Even with this supposed degradation, it was ages ahead of any and all competition. What was unusually bad was the complete lack of understanding from all the reviewers who did not grasp basic principles and the fundamental limitations of flash and yet rushed ahead with their articles. Those poor fools expected the drive to behave like a regular HDD - they weren't prepared for the unavoidable deterioration in performance.

I expect they will be similarly surprised when some drives stop working, because flash has a very limited number of rewrites. Wear leveling improves the situation, but it just postpones the inevitable. For example, if the drive is full to capacity and you start rewriting a single sector at full speed, you will get to the 10,000-rewrite limit relatively quickly.

Re:Ugh... summary.... (1)

maxume (22995) | more than 4 years ago | (#28934589)

Between spare sectors and the fact that sectors are not physical things (they are mapped), no, you won't hit the 10000 rewrite limit relatively quickly.

To put it more clearly, recent wear leveling algorithms move full sectors, spreading writes over the entirety of the actual physical storage.

Re:Ugh... summary.... (2, Informative)

cecom (698048) | more than 4 years ago | (#28934917)

Don't answer with generalities unless you have really thought about it. Wear-leveling is based on heuristics; since it cannot predict the future it is always possible to construct scenarios which will hit the worst case. And if it is theoretically possible, it will happen.

Imagine a simple case and go from there. Imagine a flash with 5 blocks total, 4 sectors per block. The logical capacity is 16 sectors; the extra block is over-provisioned for wear leveling, etc. Now, imagine that you have the 4 blocks neatly filled with occupied sectors and the 5-th block is erased.

What happens if you want to write to a random sector? The sector is written in the erased space in the 5th block and its physical position is updated in the map. If you repeat that operation 3 more times, the 5th block will get filled with 4 used sectors, and each of the other 4 blocks will have one invalid sector on the average. So far so good.

What happens if you want to rewrite a random sector now, though? Tough luck. You need to erase a whole block, pack all valid sectors in it, and write the modified sector.

From now on you get one erase per sector write. Not only that, but you get 3 additional writes. That is called write amplification and is unavoidable in the worst case.

Now, tell me, how will wear leveling have helped this? Wear leveling works well only when there is plenty of free space. And even then it is possible to construct artificial bad scenarios.
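To make the bookkeeping concrete, here is a rough C simulation of the scenario described above - 5 blocks of 4 sectors, 16 logical sectors, one over-provisioned block - written under the assumptions in the parent post, not as a model of any real drive's FTL. Once every slot holds data, single-sector rewrites start costing whole-block erases plus copy-back programs, which is the write amplification being described.

    #include <stdio.h>
    #include <stdlib.h>

    enum { BLOCKS = 5, PER_BLK = 4, LOGICAL = (BLOCKS - 1) * PER_BLK };
    enum { ERASED = -1, INVALID = -2 };

    static int flash[BLOCKS][PER_BLK]; /* which logical sector each slot holds */
    static int loc[LOGICAL];           /* logical sector -> block*PER_BLK + offset */
    static long erases, programs;

    /* Program one sector into any erased slot. Returns 1 on success. */
    static int program_free(int lsec) {
        for (int s = 0; s < BLOCKS * PER_BLK; s++)
            if (flash[s / PER_BLK][s % PER_BLK] == ERASED) {
                flash[s / PER_BLK][s % PER_BLK] = lsec;
                loc[lsec] = s;
                programs++;
                return 1;
            }
        return 0;
    }

    /* Rewrite one logical sector the way the parent post describes. */
    static void rewrite(int lsec) {
        int old = loc[lsec];
        if (program_free(lsec)) {                  /* easy case: a free slot exists */
            flash[old / PER_BLK][old % PER_BLK] = INVALID;
            return;
        }
        /* No free slot: buffer the victim block, erase it, write the data back. */
        int b = old / PER_BLK, buf[PER_BLK], n = 0;
        for (int o = 0; o < PER_BLK; o++)
            if (flash[b][o] != INVALID)
                buf[n++] = flash[b][o];            /* includes lsec's (new) data */
        erases++;
        for (int o = 0; o < PER_BLK; o++)
            flash[b][o] = ERASED;
        for (int i = 0; i < n; i++) {
            flash[b][i] = buf[i];
            loc[buf[i]] = b * PER_BLK + i;
            programs++;                            /* copy-back writes count too */
        }
    }

    int main(void) {
        for (int b = 0; b < BLOCKS; b++)
            for (int o = 0; o < PER_BLK; o++)
                flash[b][o] = ERASED;
        for (int l = 0; l < LOGICAL; l++)          /* fill the drive to capacity */
            program_free(l);
        erases = programs = 0;                     /* measure steady state only */
        for (int i = 0; i < 100000; i++)
            rewrite(rand() % LOGICAL);
        printf("100000 logical rewrites -> %ld erases, %ld flash programs\n",
               erases, programs);
        return 0;
    }

The exact erase/program ratio it prints depends on how the invalid sectors happen to be scattered, but the point survives: with no free space, each logical write drags extra physical writes (and erases) along with it.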

Re:Ugh... summary.... (1)

maxume (22995) | more than 4 years ago | (#28935143)

Ok, pin down 'relatively quickly' for a drive with 156 million logical sectors and tens of thousands (or more?) of reserve sectors, instead of made-up specifics. Even in the event that Intel decided to include only 1 spare sector and is overstating write limits by a factor of 2, you should get something like 700 billion writes (for the 80 gigabyte consumer model!), given the exactly degenerate data layout that you specify.

Assuming you only write 1 bit for each of those sector deletions, that's still almost 100 gigabytes before you reach 1/2 the stated write factor. And that is a silly, silly, silly, silly degenerate case, not something to actually spend time thinking about when considering throwing one of the drives in a laptop.
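Spelling the parent's back-of-the-envelope numbers out in code, under the parent's own assumptions (roughly 156 million 512-byte logical sectors for the 80 GB model, and the rated 10,000 rewrite cycles halved to 5,000; none of these are Intel's published figures):

    #include <stdio.h>

    int main(void) {
        double sectors = 156e6;          /* ~80 GB worth of 512-byte logical sectors */
        double cycles  = 10000 / 2.0;    /* rated 10,000 rewrites, assume half */

        double total_writes = sectors * cycles;         /* worst-case sector rewrites */
        double gb_at_1_bit  = total_writes / 8 / 1e9;   /* GB written at 1 bit per rewrite */

        printf("sector rewrites before wear-out: ~%.0f billion\n", total_writes / 1e9);
        printf("data pushed even at 1 bit per rewrite: ~%.0f GB\n", gb_at_1_bit);
        return 0;
    }

That works out to roughly 780 billion rewrites and just under 100 GB of traffic even if every rewrite carried a single bit, which is where the "silly degenerate case" conclusion comes from.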

Too early to adopt (1)

Ilgaz (86384) | more than 4 years ago | (#28935537)

What makes Intel a hard disk vendor anyway? Yes, it is still a disk. Expertise, which Intel doesn't have, is a huge factor, along with software support.

The other alternatives? OCZ and Samsung. What kind of software support do they give? Zero. Samsung can't even produce pages without English spelling mistakes.

Call me old-fashioned; I am waiting and will continue to wait until Seagate or Western Digital does real stuff, not "we can do it too" stuff, if you understand what I mean.

Re:Too early to adopt (3, Insightful)

magarity (164372) | more than 4 years ago | (#28935823)

What makes Intel a hard disk vendor anyway? Yes, it is still a disk
 
It's solid state mass storage, where "solid state" = "chips". A disk is a spinning thingy, which is completely different. Since Intel designs and makes chips (see: "solid state" = "chips"), it is a perfect choice for them to make solid state mass storage devices out of chips.
 
Have I mentioned the relationship between "solid state" and "chips" and how "solid state" != "spinning thingy"?

Re:Too early to adopt (1)

Ilgaz (86384) | more than 4 years ago | (#28935867)

Well, as a video guy, I can easily say we will keep on using our SCSI magnetic "thingies" until something that fast and that reliable, which won't "wear out", comes along from a trustworthy vendor.

Do you know the technology and expertise required to make a consumer-priced 1 TB drive? We're not talking about RAMAC here.

Intel had better stay in its core business, a CISC CPU monopoly, and leave storage to the people who actually know it. A BIOS password change results in data loss? Come on, really.

Re:Too early to adopt (1)

binary paladin (684759) | more than 4 years ago | (#28935945)

Not that I'm on either side of this but...

Is there some kind of brand of HD that doesn't "wear out"?

Re:Too early to adopt (1)

DaleGlass (1068434) | more than 4 years ago | (#28936555)

Well, as a video guy, I can easily say we will keep on using our SCSI magnetic "thingies" until something that fast and that reliable, which won't "wear out" comes up from a trustable vendor.

Then you need to start looking for some new way to store data, since traditional hard disks eventually wear out as well. Like any mechanical device, they suffer wear, and sooner or later something will get out of tolerance.

Everything wears out at some point. Moving hard disk parts wear out mechanically, all chips will eventually suffer enough electromigration to fail, and electrolytic capacitors will eventually dry out and fail.

Re:Too early to adopt (1)

kirillian (1437647) | more than 4 years ago | (#28936175)

Perhaps...I have heard pretty good things about OCZ's quality control, however.

More importantly, however, I have my doubts that Western Digital and Seagate are about to jump into the SSD realm head first. They have the rest of their business to think about. Even if the technology is moving to SSDs in the future, those who run these companies have to consider that they are responsible to their investors NOW. So they will continue to make money from their current business model, which makes their investors happy. I have a hard time seeing them spend the money to invest in this new technology, which COULD potentially be left behind in a couple of years. It seems to me to be a bit of a catch-22: I could, in one sense, see them actually investing in SSDs, but I feel they are more likely to follow the lead other companies have taken lately - assume the chicken will keep laying golden eggs.

Re:Ugh... summary.... (4, Informative)

Krizdo4 (938901) | more than 4 years ago | (#28934035)

The performance degradation in the Intel X-25 is not because of a "firmware bug".

Bugs can cause slowdowns, too

Though it's highly regarded, Intel's X25-M SSD had a firmware bug that adjusted the priorities of random and sequential writes, leading to a major fragmentation problem that dropped throughput dramatically. The issue was originally uncovered by PC Perspective after two months of testing. Those tests showed that write speeds dropped from 80MB/sec. to 30MB/sec. over time, and read speeds dropped from 250MB/sec. to 60MB/sec. for some large block writes.

https://www.techworld.com.au/article/302571/ssd_performance_--_slowdown_inevitable?pp=3 [techworld.com.au]

Before firmware update

the result suggested a write speed of 30 MB/sec.

http://pcper.com/article.php?aid=691&type=expert&pid=3 [pcper.com]

After firmware update

After composing myself, I did the same file copy I had tried earlier. 76 MB/sec.

http://pcper.com/article.php?aid=691&type=expert&pid=4 [pcper.com]

Not a firmware bug?

Re:Ugh... summary.... (1)

blahplusplus (757119) | more than 4 years ago | (#28934419)

""Although Intel acknowledged that all of its SSDs will suffer from reduced performance because of significant fragmentation, the type of write levels needed to reproduce PC Perspective's results aren't likely for everyday users, whether they're running Windows and Apple's Mac OS X. Even so, it still released the firmware upgrade to slow fragmentation.."

Re:Ugh... summary.... (1)

MobyDisk (75490) | more than 4 years ago | (#28936081)

Adding an optimization does not mean that the previous revision had a bug.

Re:Ugh... summary.... (1)

Eil (82413) | more than 4 years ago | (#28934467)

The performance degradation in the Intel X-25 is not because of a "firmware bug". All SSD's will suffer performance degradation whether or not their writing/wear leveling algorithms have been updated via firmware.

You're missing several months of history here.

Back in February, several reviewers found that the X-25's performance fell to unacceptably low levels after a certain threshold was reached. Intel tried to deny it, saying that you'd never see the problem in real-world usage and that only benchmarking the disk in a certain way would trigger the behavior. Which may be true, but the hardcore "Pimp My PC" crowd aren't going to spend hundreds of dollars on a disk that has even a remote chance of being triggered into a non-recoverable slow mode.

Intel relented and released a firmware update to fix the issue, and the benchmarkers and reviewers saw the fragmentation problem vanish [pcper.com]. It was a big deal because Intel positioned the disks as the high end of the SSD market, and they were able to overcome most of the downsides to using SSDs in place of mechanical disks. (Except the price.)

Product Killing Bugs (0)

Anonymous Coward | more than 4 years ago | (#28933855)

Drivers and firmware are Intel's biggest weakness, and a possible showstopper for Larrabee. This is just another example on top of years of historical failures (e.g., all the Intel IGPs with appalling drivers, or late drivers - up to a year to deliver promised features).

Anyway, corruption bugs on storage are a product killer in the marketplace.

I find this disturbing (0)

Lord Byron Eee PC (1579911) | more than 4 years ago | (#28933861)

Not the bug, but the fact that it's in the firmware. Are we looking at a future where we not only have to download updates to fix bugs in our applications and operating systems, but in our hardware as well? Even worse, having a bug in a storage device is absolutely unacceptable. It's one thing when my webcam doesn't work, but if I lose all of my data, that's another.

To Intel's credit though, unlike Seagate, at least they are admitting there's a problem.

Re:I find this disturbing (0)

Anonymous Coward | more than 4 years ago | (#28933889)

They've seen, first hand, what not admitting a problem can do. Several times, in fact. Also, you're quite correct: storage media that doesn't work correctly is worse than a non-working computer.

Re:I find this disturbing (5, Insightful)

jtownatpunk.net (245670) | more than 4 years ago | (#28933987)

Future? You must be new to computers. I updated the firmware in my very first 80's printer to give it more features. Had to pop out the old chips and put in the new ones. I upgraded the firmware in modems from several different manufacturers (some more than once) to add features and fix bugs. I've updated the firmware (BIOS) on most of my motherboards. I've updated the firmware on optical drives. I've updated the firmware on a scanner. I've updated the firmware on SCSI controllers. I've updated the firmware on hard drives. I've updated the firmware on switches and routers. Hell, I've updated the firmware on keyboards.

This is hardly a new phenomenon.

Re:I find this disturbing (1)

digitalhermit (113459) | more than 4 years ago | (#28934413)

Hehe..
I remember updating my modem to support the .bis at some point.. Also remember upgrading TOS ROMs on my ST :D..

Re:I find this disturbing (3, Interesting)

Ungrounded Lightning (62228) | more than 4 years ago | (#28936099)

I remember updating the HARDWARE of my modem: Changing the swamping resistors to reduce the Q of the filters and broaden the passbands so the Rx side would work at 300 as well as the original 110 baud. B-)

Re:I find this disturbing (2, Interesting)

SBrach (1073190) | more than 4 years ago | (#28934481)

Dell has released updated firmware for my laptop's BIOS 17 times.

Re:I find this disturbing (2, Informative)

couchslug (175151) | more than 4 years ago | (#28934923)

Aircraft (F-16 among others) flight control firmware has been updated by reprogramming UVPROMs for many years.

Re:I find this disturbing (1)

guruevi (827432) | more than 4 years ago | (#28936471)

I remember when firmware updates meant baking your chips under UV light and then plugging them into something you soldered onto a perf-board and connected to the parallel port.

Re:I find this disturbing (1)

stonecypher (118140) | more than 4 years ago | (#28934019)

Are we looking at a future where we not only have to download updates to fix bugs in our applications and operating systems, but our hardware as well?

No, we're looking at a past like that. Lest you forget, both the 486 and the Pentium had firmware updates too (the Pentium FDIV bug being the better remembered of the two). My first firmware update was a bugfix for a 300-baud acoustic coupler, way back in 1983 or thereabouts.

Can't imagine why you think this is anything new; even video game consoles have been doing this for ten years now.

Re:I find this disturbing (0)

Anonymous Coward | more than 4 years ago | (#28934063)

Yeah, I work with servers. New firmware comes out all the time.

Re:I find this disturbing (0, Offtopic)

foobsr (693224) | more than 4 years ago | (#28934111)

Are we looking at a future where we not only have to download updates to fix bugs in our applications and operating systems, but our hardware as well?

No, it is all about updating your wetware, and it was anticipated a long, long time ago that things would be much worse [philipkdick.com].

CC.

Re:I find this disturbing (1)

Movi (1005625) | more than 4 years ago | (#28934381)

From my perspective it's actually beginning to be quite common for HW manufacturers to release broken hardware. I've had 2 run-ins with required firmware upgrades to graphics boards (both NVIDIA):

#1 8800GTX 512MB, whose video BIOS claimed it only had 256MB. I guess the Windows drivers had their own VRAM enumeration procedure, but this put other drivers into a hang (OS X - yeah, I know a hackintosh is bad - and nouveau). I had to pull the vbios from the board, hex-edit it (4 offsets), then flash it back. Thankfully all went well and now it's reporting what it should have been reporting in the first place. Why the card lied about this, I have no clue.

#2 9800GTX 512 - would hang on any driver reload in Windows. I spent DAYS figuring this out, first with WinXP (finally some older drivers managed to load), then with Windows 7 - multiple builds, multiple versions (x86, x64), BIOS settings, hackery. Finally something occurred to me: "what if this card is lying too?" Went to check for a BIOS update - huzzah: "Fixes Windows driver reload hang."

On both of these occurrences, I wouldn't imagine a normal PC user pulling this off. So I guess releasing broken hardware which is then "fixed" is the norm. Now that I think about it: the AMD Phenom look-aside cache bug, countless ATI Mac firmware, SMC and EFI updates. This is actually common, no?

Re:I find this disturbing (1)

Pyrion (525584) | more than 4 years ago | (#28934469)

I got one to add that I'm still working on:

GTX 285 - hangs with a blue/black screen of death both at idle and in games, although far more frequently at idle; for some people it happens so early and often that an RMA is their only option. For me it happens within 3-5 days of bootup. What I think the problem is: the card is designed to throttle down when it's not being fully utilized, but I suspect the voltage regulators weren't designed to handle this, so even during full utilization, when the BIOS runs at its default profiles, you'll have massive voltage spikes and drops (I can only monitor the 3.3V sensor voltage for this in RivaTuner, but it appears to affect everything: fan speeds, core/memory clock rates, stability). I suspect that after some time the voltage regulators drop the voltage long enough that there isn't enough voltage to maintain everything in video RAM, which causes the card to hang. The fix, which I'm still testing because circumstances beyond my control haven't allowed me to reach a week of uptime: force the card to run in 3D performance mode in RivaTuner.

It consumes far more power and runs hotter, but I'll take both of these (it'd still be less than falling back to SLI 8800GT's) if the damn thing stays stable. No voltage craziness so far either.

Re:I find this disturbing (1)

Ilgaz (86384) | more than 4 years ago | (#28935771)

Normally, a good, supported modern device will eventually have bugs fixed with a firmware update. Companies can't really test millions of different configurations, usage patterns, or a "one in a million" issue. Some companies like Apple have gone beyond that and will even ship "double-click in the GUI" firmware updates. Of course, it is all fail-safe.

I always pick hardware which *does have* firmware updates on the vendor's site, with good documentation and release notes. For example, LaCie keeps updating its FireWire and more advanced drives - not because they can't be used without updating, but because engineers find little issues that could be a problem in rare cases, operating system issues, performance enhancements, etc.

One thing, of course: always read the documentation and apply a firmware update only if it will benefit you, especially where BIOS updates are concerned.

Re:I find this disturbing (1)

spire3661 (1038968) | more than 4 years ago | (#28936543)

If you did happen to lose all your data because of this one particular bug, then you have no one to blame but yourself. Storage fails, A LOT. Plan accordingly.

Well.. (1)

mikkelm (1000451) | more than 4 years ago | (#28933897)

I find it difficult to really blame them for this. What an obscure bug. How do you QA yourself out of something like that without spending more than you did on your R&D?

Re:Well.. (1)

hf256 (627209) | more than 4 years ago | (#28933923)

I would have agreed with you on the obscure part if it only occurred when the password is disabled. But occurring on a password change and reboot seems like a more obvious case to me.

Re:Well.. (2, Insightful)

ShadowRangerRIT (1301549) | more than 4 years ago | (#28934103)

Not really. Making an educated guess from the article, it appears that this is implemented as a simple controller lockout, not actual encryption. So swapping the flash memory into another controller (common computer forensics technique) would bypass it. Most people paranoid enough to want a disk password want real encryption, so using Intel's half-measure of a password is likely a very uncommon scenario. The tests are probably very simple; glossing over this case would be an understandable, if not desirable, oversight.

Re:Well.. (1)

Pyrion (525584) | more than 4 years ago | (#28934021)

Take a down payment from your users as a massive discount in exchange for them signing on as "beta testers." If they actually find something wrong with the product and send in problem reports, then they get to keep the product for just that initial down payment so long as they keep sending in problem reports. If no problem reports come in within a given amount of time, bill them the remainder of the MSRP on the product, since it obviously works well enough for their uses.

I guarantee you something like this would've been found far quicker if these drives were in absurdly high demand due to an absurdly low cost in exchange for something Windows users have taken for granted - purchasing a "new" product in order to effectively become a beta tester. After all, Windows releases are never really "done" until Microsoft stops issuing updates for them. Why pay full price for something you know damn well hasn't been tested to death and beyond like the attention a product gets when consumers get a hold of it and start finding things QA never anticipated?

Re:Well.. (4, Interesting)

rickb928 (945187) | more than 4 years ago | (#28934113)

Is this a cost issue, or a thoroughness issue?

No, we don't catch every possible scenario here, either, but we do try very, very hard. I know one of the coders in Intel's RAID driver group, and he goes crazy with this stuff. And he just writes Linux drivers. I do not envy him - this past year, every bug he's had to fix has been caused by someone else's code. Someone not writing Intel drivers. And he gets slammed every time for bad testing, as if he could test all the rest of the kernel team's stuff, not to mention every fly-by-night Chinese hardware outfit. They're killing him.

I can't even say 'ext4', he just goes insane. Though he chuckles when I whisper 'ReiserFS', and opens another beer.

I'm glad I'm not in that line of work.

Re:Well.. (1)

syousef (465911) | more than 4 years ago | (#28935379)

I can't even say 'ext4', he just goes insane. Though he chuckles when I whisper 'ReiserFS', and opens another beer.

Perhaps a competitor has discovered this and hired someone to whisper "ReiserFS ReiserFS ReiserFS" in his ear repeatedly. That would explain the bugs. He's coding drunk.

BIOS password on a disk? (0)

Anonymous Coward | more than 4 years ago | (#28933945)

Forgive me if this is a really dumb question... But how do you BIOS password a disk?

BIOS passwords are for preventing the computer from booting or locking users out of the BIOS and have no impact on the disks in the system, no?

Re:BIOS password on a disk? (3, Informative)

ShadowRangerRIT (1301549) | more than 4 years ago | (#28934027)

They probably meant a hard disk password. Depending on the implementation, this means either disk-supported full-disk encryption, or a simple firmware interlock that prevents reading through the controller without the password (but could be bypassed with forensic tools that read the disk surface directly).
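A purely hypothetical sketch of the second kind of scheme, just to make the distinction concrete (this is not Intel's firmware, and none of these names come from any real drive): the password only gates the command path, while the stored bits stay in the clear.

    #include <stdbool.h>
    #include <string.h>

    /* Hypothetical controller-side interlock: nothing here encrypts the
     * stored contents, it only refuses to serve reads while locked. */
    struct drive_state {
        bool locked;
        char password[32];
    };

    static bool unlock(struct drive_state *d, const char *pw) {
        if (d->locked && strncmp(d->password, pw, sizeof d->password) == 0)
            d->locked = false;
        return !d->locked;
    }

    static int read_sector(struct drive_state *d, unsigned lba, void *buf) {
        if (d->locked)
            return -1;            /* reject the command; the data itself is untouched */
        /* ... copy sector 'lba' from the medium into buf ... */
        (void)lba; (void)buf;
        return 0;
    }

    int main(void) {
        struct drive_state d = { .locked = true, .password = "hunter2" };
        char buf[512];
        int before = read_sector(&d, 0, buf);   /* fails: drive is locked */
        unlock(&d, "hunter2");
        int after  = read_sector(&d, 0, buf);   /* succeeds after unlocking */
        return (before == -1 && after == 0) ? 0 : 1;
    }

Because the medium holds plaintext, anything that reads the storage behind this check (swapping the controller, desoldering the flash, imaging the platters) recovers the data, which is why the security-conscious usually want real encryption instead.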

Re:BIOS password on a disk? (0)

Anonymous Coward | more than 4 years ago | (#28935809)

They probably meant a hard disk password. Depending on the implementation, this means either disk-supported full-disk encryption, or a simple firmware interlock that prevents reading through the controller without the password (but could be bypassed with forensic tools that read the disk surface directly).

This is an SSD, which means there's no "surface" to speak of as in traditional spinning-platter hard drives. I'm curious, though, how difficult it would be to read directly from the SSD's chips, bypassing a simple firmware interlock.

Re:BIOS password on a disk? (1)

Tycho (11893) | more than 4 years ago | (#28936469)

That would be why the ATA standard requires the data to be encrypted with AES, so removing the physical flash chips and attempting to read them would do no good without the encryption key. The data would also only be in 512-byte blocks, with some ECC code and an unknown physical-to-logical mapping. Good luck decrypting and reconstructing the contents of a 160GB drive 512 bytes at a time with an unknown and complex type of error-checking code.

I've seen this before (1)

argent (18001) | more than 4 years ago | (#28934007)

Intel says the data corruption problem occurs only if a user sets up a BIOS password on the 34-nanometer SSD, then disables or changes the password and reboots the computer.

What does this mean? The flash drive has a password lockout? If so:

(1) a password lockout on a drive is daft; you either want to encrypt the drive or not worry about it.

(2) flash drives trashing themselves irretrievably when you reboot after enabling passwords? I've seen that before, on "secure" thumb drives. I won't have anything to do with that kind of hardware lockout or encryption after that.

Re:I've seen this before (1)

LearnToSpell (694184) | more than 4 years ago | (#28934079)

a password lockout on a drive is daft, you want to encrypt the drive or not worry about it.

That's hardly daft. I have motion-detecting laser bullets in my foyer, but I still lock my front door.

Re:I've seen this before (4, Insightful)

Grishnakh (216268) | more than 4 years ago | (#28934273)

Why bother though? If someone breaks in, you'll have to fix or replace your front door, even though the motion-detecting laser robots zapped him. If you just leave your front door unlocked instead, intruders can just walk in, the laser-wielding robots can zap them, and then automatically dispose of the bodies for you too. This way, the intruders won't cause any damage.

Re:I've seen this before (4, Funny)

NotQuiteReal (608241) | more than 4 years ago | (#28934545)

To keep out the innocent neighbor kids or the maid who comes on the wrong day. You only want to dispose of bodies that deserve it.

You'll sleep better that way.

Re:I've seen this before (3, Funny)

Grishnakh (216268) | more than 4 years ago | (#28934611)

The maid I can understand, but if your neighbor's kids are anything like mine, they're not innocent.

Re:I've seen this before (1)

Pyrion (525584) | more than 4 years ago | (#28935005)

If only because your homeowners insurance requires it for them to maintain full liability?

Re:I've seen this before (1)

argent (18001) | more than 4 years ago | (#28935163)

You have things backwards.

Encrypting the drive ... in software, mind, not in the drive's firmware ... is like locking the front door. It's simple, safe, works for all doors, and is unlikely to break down and kill someone accidentally.

Putting a password on the drive is like leaving the door unlocked and booby-trapped.

Feature Not A Bug (5, Insightful)

mrbene (1380531) | more than 4 years ago | (#28934061)

Seriously, I'd say this is in the By Design bucket. For the security conscious - set a BIOS password. If the (feds/aliens/wife/others) remove the password, all access to the data is gone.

Brilliant! Secure!

Mind you, not being able to change my password once every other day might hinder my current security model.

Re:Feature Not A Bug (1)

Tycho (11893) | more than 4 years ago | (#28936549)

It is important to set the password on the hard drive itself and delete the password in the BIOS when "they" come. Setting a BIOS password for the computer itself is the only option on many desktop computers and would be a waste of time. When "they" come, they will boot the computer, see the password prompt, giggle madly, mock you, turn the computer off, disassemble it, remove the drives and happily read the contents of the hard drives on another computer. For really stupidly broken motherboards, regardless of original cost or manufacturer, resetting the CMOS using the CMOS jumper is something "they" might do as well, and it may actually work more often than not against BIOS passwords.

Seriously though, the government advises those contractors and employees who work with sensitive data and use a laptop to set the hard drive password (and thus have the data encrypted with AES), and either to have a prompt for the hard drive password at boot or to delete the hard drive encryption key from the BIOS in order to quickly and easily make the data on the drive useless.

According to Intel (1)

SlashDev (627697) | more than 4 years ago | (#28934097)

"the data corruption problem occurs only if a user sets up a BIOS password on the 34-nanometer SSD, then disables or changes the password and reboots the computer". A password protected SSD? Can someone please explain? I must be new to computers...

Re:According to Intel (1)

Wesley Felter (138342) | more than 4 years ago | (#28934373)

Yes, you must be new to computers since hard disks have had passwords for years. It was a popular feature in the "enterprise" market before full-disk encryption became practical.

Re:According to Intel (1)

Ilgaz (86384) | more than 4 years ago | (#28935687)

And we see the very practical (!) results of them. It has nothing to do with being new to "computers"; I just saw it in 2009 while setting up a Lenovo ThinkPad BIOS, and something told me it was one feature to stay away from on that machine. I'd better pay the PGP guys.

SD cards have passwords too, and as far as I know they are also hardware-based. Funny thing is, that is one thing the phone vendors hate, since the passwords create problems with the firmware upgrade process, which is already a very risky thing on a smartphone. Nokia states "please remove the password of your memory card before the update process". It seems some interesting things have happened.

Re:According to Intel (1)

ihavnoid (749312) | more than 4 years ago | (#28935655)

Password protection has been supported for a long time, and is part of the standard ATA specification. Although it typically has nothing to do with full-disk encryption, it was more or less enough to keep honest people honest, and to add a little bit of cost and effort to bypassing it.

Many RAID controllers use this feature to prevent the user from connecting a RAID-formatted hard drive to a normal ATA controller and thereby accidentally destroying all the data. Unlocking the drive is a non-issue, since they use the same password that you might find after a few minutes of googling, and if the RAID controller that locked it is available, you can unlock it without any problem.

Non-destructive fw update coming + rave on G2 (2, Informative)

owlstead (636356) | more than 4 years ago | (#28934107)

Although this bug should have been caught faster, it seems that it is possible to update the firmware without any data loss (fortunately I have put it in a laptop, so power outages are no problem). I've looked at the Intel site and the flash utility seems to be simply bootable from CD - if this is the last bug I'll be a very happy punter indeed.

My 80 GB G2 SSD replaced a not-too-fast laptop drive. I'm now trying Linux, but I'll try Vista as well just for fun - I'll just write my 80 GB to an external drive using GParted. These drives come highly recommended even if they did slow down to 50% of their performance (which, it seems, they don't). I unzipped Eclipse and JavaDoc to it, and I could see that the archiver that unzipped the .zip has some performance issues reading the index - it took longer than the unzipping, gunzipping and untarring (the Eclipse gunzipping/untarring took less than 2 seconds - yikes). The only thing faster is the tmpfs in RAM which I used to compile the OpenJDK on my "workstation". Starting Eclipse now takes less time on my laptop than on my workstation, even though the laptop has half as many cycles.

Re:Non-destructive fw update coming + rave on G2 (1)

D Ninja (825055) | more than 4 years ago | (#28934225)

My 80 GB G2 SSD replaced a not too fast laptop drive. I'm now trying Linux, but I'll try Vista as well just for fun - I'll just write my 80 GB to an external drive using Gparted. These drives come highly recommended even if they would slow down to 50% of performance (which, it seems, they don't). I unzipped Eclipse to it and JavaDoc and I could see that the archiver that unzipped the .zip has some performance issues reading the index. It took longer than the unzipping and gunzipping and untarring (the Eclipse gunzipping/untarring took less than 2 seconds - yikes). The only thing faster is the tmpfs in RAM which I used to compile the OpenJDK in on my "workstation". Starting Eclipse takes now less time on my laptop than on my workstation even though it got twice as few cycles.

This just goes to show how much of a bottleneck traditional hard drives really are. A friend of mine recently replaced his hard drive with an SSD and I was extremely impressed by the speed improvement - so much so that I'm considering installing an SSD in my computer as the primary drive and using the second drive as backup space.

Re:Non-destructive fw update coming + rave on G2 (1)

Pyrion (525584) | more than 4 years ago | (#28935025)

If your OS is small enough, skip the Flash SSD altogether, get 4GB of cheap DDR memory and a Gigabyte i-RAM SSD and put your OS on that.

Next "Ask Slashdot"... (3, Funny)

neokushan (932374) | more than 4 years ago | (#28934171)

"How to recover lost/corrupted files from an SSD?"

Re:Next "Ask Slashdot"... (-1)

Anonymous Coward | more than 4 years ago | (#28934255)

By pissing on it.

Re:Next "Ask Slashdot"... (0)

Anonymous Coward | more than 4 years ago | (#28935491)

Send me your SSD and $40 USD. I will return your data with a ShamWow, but only if you act within the next 10 minutes.

I know who should answer (1)

Ilgaz (86384) | more than 4 years ago | (#28935569)

The ones who flame us whenever we say "it is early, don't beta-test storage hardware" should step up and answer them - especially when it is, predictably, personal memories with no backup.

In the enterprise environment the X-25 was originally designed for, data loss is not a huge problem: they have all kinds of backups, verification, mirroring and cool filesystems like ZFS. When it comes to the personal data of an ordinary OS X or Windows user, the problem begins. Whoever suggests an untested technology to ordinary people should leave a phone number or working mail address so they can get called when thousands of irreplaceable personal JPEGs are gone forever.
 

Re:I know who should answer (1)

Tycho (11893) | more than 4 years ago | (#28936565)

And the CSR on the other end when called should not be able to mute, end, or transfer the call without supervisor assistance.

At least it's not Seagate (1, Informative)

Anonymous Coward | more than 4 years ago | (#28934307)

Conservatively, 40% of Seagate's high-capacity (1TB+) drives have suffered from a firmware bug which bricked the drive. Seagate has promised free data recovery + firmware fix on affected units - not many people know this! So if your SATA or external Seagate has failed recently on boot, you may be able to recover the drive and your data free. Customer support is very sketchy but if you keep trying for the free data recovery you will succeed. http://www.engadget.com/2009/01/19/seagate-offers-fix-free-data-recovery-for-disks-affected-by-fir/2 [engadget.com]

I hope some dodgy dealers sell these (0)

Anonymous Coward | more than 4 years ago | (#28934461)

I would never put a password on my drive, so no corruption for me, but I could use this to get a cheaper price, and I *really* want to put silent drives in my multimedia PC.

Discount? (0)

Anonymous Coward | more than 4 years ago | (#28934857)

I can live without the password feature...

Solid State Disk Revolution (3, Insightful)

JakFrost (139885) | more than 4 years ago | (#28935055)

This really seems like a very unlikely way to trigger the problem on these drives for most users, since in my personal and professional experience I have yet to see anyone who actually knows about BIOS passwords, much less about setting a password on the drive using the ATA secure drive password feature. I am surprised that this was even caught by anyone, unless it was a complete fluke or there actually are people or companies using this type of feature for security. (I don't doubt it, but I haven't seen it.)

I personally own the first-generation Intel X25-M 80GB MLC SSD [intel.com] and I have written about it extensively here on this forum. I heard rumors that the new TRIM feature will only be made available on this second-generation release of these drives, but I'm unsure if that is really true. I'm on the fence right now about whether I should sell my G1 drive and upgrade to the G2 for that feature and a little more performance, because I am so happy with this drive's performance and with the current 8820 firmware that solved the fragmentation and slowdown issues.

If you are one of those folks still sitting around not knowing what to do while all of this solid-state disk news is coming out, then you are missing the biggest paradigm shift in computing performance since the move from floppy disks to hard drives.

With the upcoming re-release of this newly affordable drive around 2009-08-28 (the Intel X25-M G2 80GB MLC SSD at ~$230 USD from Newegg [newegg.com] or ZipZoomFly [zipzoomfly.com]), you should definitely dig down deep, save a little money, and buy one of these drives to experience the biggest performance and responsiveness improvement to your computer that you could imagine.

If you need a primer on the SSD revolution check out my previous post regarding the articles to read.

Required Reading for Solid State Drives (Score 1) [slashdot.org]

Re:Solid State Disk Revolution (1)

maxume (22995) | more than 4 years ago | (#28935279)

Does your OS support TRIM yet? If not, you shouldn't be on the fence, prices are plummeting (the newer, faster drives from Intel are cheaper...) and it isn't going to help you any until you upgrade your OS anyway.

Re:Solid State Disk Revolution (1)

Ilgaz (86384) | more than 4 years ago | (#28935637)

I am extremely old-fashioned when it comes to hard drives. I'm not buying until something with a normal price comes out from my two vendors, Seagate and Western Digital. They have been doing storage for years.

Basically, Intel is a CPU vendor/monopoly, not a GPU vendor or a hard disk manufacturer.

Re:Solid State Disk Revolution (1)

karnal (22275) | more than 4 years ago | (#28936037)

Intel makes chips.

Graphic cards have chips. Given, they don't necessarily pander to the high end.

Flash drives have chips. Intel can make chips.

Intel. Chips. Enjoy.

Re:Solid State Disk Revolution (1)

Luthair (847766) | more than 4 years ago | (#28936649)

Intel and other SSD manufacturers are getting a free ride on reliability and performance. When these types of problems occur in the storage world, it can be game over for the manufacturer.

Re:Solid State Disk Revolution (1)

gordyf (23004) | more than 4 years ago | (#28936601)

"I dunno about this chip-based storage from the biggest chip manufacturer in the world. I'm gonna wait until a company that has never made Flash makes Flash-based storage instead."

Yeah, that makes total sense.

Was stuff like this not expected? (1, Funny)

Anonymous Coward | more than 4 years ago | (#28935287)

It is called the bleeding edge for a reason.

Problem is, as hardware becomes more complicated I think we're going to see more and more issues like this. It seems that it's mostly engineers who end up writing the code at this level, especially when dealing with hardware, and they just can't write software for crap. I have worked with many over the years and there is not one I would consider capable of writing something that needed to be very reliable.

ee

What took them so long to report this? (4, Informative)

AllynM (600515) | more than 4 years ago | (#28935419)

Welcome to 2 weeks ago:

http://www.pcper.com/comments.php?nid=7544 [pcper.com]

Allyn Malventano
Storage Editor, PC Perspective

OCz SSD (0)

Anonymous Coward | more than 4 years ago | (#28936569)

Ooh, fucksticks. Now I need to research OCZ SSD drives (and I'm completely not happy about spending 900 bucks for a 250GB drive) to see if the bug applies here.
