
Optimizing Linux Systems For Solid State Disks

Soulskill posted more than 5 years ago | from the bit-by-bit dept.

Data Storage

tytso writes "I've recently started exploring ways of configuring Solid State Disks (SSDs) so they work most efficiently in Linux. In particular, Intel's new 80GB X25-M, which has fallen down to a street price of around $400 and thus within my toy budget. It turns out that the Linux Storage Stack isn't set up well to align partitions and filesystems for use with SSDs, RAID systems, and 4k sector disks. There is also some interesting configuration and tuning that we need to do to avoid potential fragmentation problems with the current generation of Intel SSDs. I've figured out ways of addressing some of these issues, but it's clear that more work is needed to make it easy for mere mortals to efficiently use next generation storage devices with Linux."
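As background for the alignment problem the summary describes: a partition plays well with an SSD's erase blocks when its starting LBA, counted in 512-byte sectors, lands on an erase-block boundary. A minimal sketch (the 128 KiB erase-block size is an assumption for illustration, not a figure from the article):

```python
SECTOR = 512  # bytes per LBA sector

def is_aligned(start_lba, block_bytes=128 * 1024):
    """True if a partition's first byte falls on an erase-block boundary."""
    return (start_lba * SECTOR) % block_bytes == 0

# Classic DOS partitioning starts the first partition at LBA 63:
print(is_aligned(63))    # False: misaligned
# Starting at LBA 2048 (a 1 MiB offset) aligns it:
print(is_aligned(2048))  # True
```

The legacy DOS offset of LBA 63 fails this test for any power-of-two block size above 512 bytes, which is why later partitioning tools moved to a 1 MiB default.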


207 comments


fp (-1, Troll)

Anonymous Coward | more than 5 years ago | (#26940907)

fp

Still too expensive... (0)

Anonymous Coward | more than 5 years ago | (#26940917)

I really don't care about performance when they retail for $400. Talk to me when I can get an 80GB one for under $50.

Re:Still too expensive... (1)

von_rick (944421) | more than 5 years ago | (#26940975)

The more people buy it, the sooner it will get under $50. But considering the recent financial conditions, people would rather let others buy the SSD so that they can get it for $50 in August 2010. I'm afraid this time it's gonna take longer than that to see a tenfold reduction in storage device costs.

Re:Still too expensive... (1)

the_humeister (922869) | more than 5 years ago | (#26941247)

I've considered getting a large-capacity CF card (16 GB or 32 GB) to use as a solid state drive for my laptop. The CF + adapter combination is a lot cheaper than these new SSDs. So why should I get a SSD vs. a CF card?

Re:Still too expensive... (4, Informative)

NekoXP (67564) | more than 5 years ago | (#26941283)

> So why should I get a SSD vs. a CF card?

10 times better performance and wear-leveling worth a crap.

Re:Still too expensive... (2, Informative)

tinkerghost (944862) | more than 5 years ago | (#26941705)

So why should I get a SSD vs. a CF card?

Your CF card is going to use the USB interface which maxes out at about 40Mbps as opposed to using an internal SSD's SATAII interface which maxes at 300Mbps. Not quite an order of magnitude, but close.

On the other hand, if you're going to use an external SSD connected to the USB port, then you wouldn't see any difference between the 2 in terms of speed. Lifespan might be longer w/ the SSD due to better wear leveling, but in either case you're probably going to lose or break it before you get to the fail point.

Re:Still too expensive... (5, Informative)

Anonymous Coward | more than 5 years ago | (#26941857)

A real SSD has several advantages over using CF cards, but not for the reasons you state.

With a simple plug adapter, CF cards can be connected to an IDE interface, so speeds won't be limited by interface speed. The most recent revision of the CF spec adds support for IDE Ultra DMA 133 (133 MB/s).

A couple of additional points, just because I love nitpicking:
- A USB 2.0 mass storage device has a practical maximum speed of around 25 MB/s, not 40 Mb/s.
- The so-called SATA II interface (that name is actually incorrect and is not sanctioned by the standardization body) has a maximum speed of 300 MB/s, not Mb/s.
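Since several posts in this thread mix up megabits and megabytes, here is the arithmetic spelled out (the 25 MB/s practical USB figure is the parent's; the rest is standard unit conversion):

```python
def mbit_to_mbyte(mbit_per_s):
    """Convert megabits per second to megabytes per second (8 bits/byte)."""
    return mbit_per_s / 8

# USB 2.0 signals at 480 Mb/s; dividing by 8 gives the raw byte rate,
# though protocol overhead keeps real mass-storage transfers near 25 MB/s:
print(mbit_to_mbyte(480))  # 60.0

# SATA's "3 Gb/s" link uses 8b/10b encoding (10 wire bits per data byte),
# so the byte rate is 3000 / 10, i.e. the familiar 300 MB/s:
print(3000 / 10)  # 300.0
```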

Re:Still too expensive... (1)

KingMotley (944240) | more than 5 years ago | (#26942767)

Except that the latest-gen SSDs exceed 250MB/sec throughput. If the latest CF spec just added 133MB/sec, then that would be a huge bottleneck in throughput.

Re:Still too expensive... (2, Informative)

karnal (22275) | more than 5 years ago | (#26942331)

Why is this informative? CF with an adapter is NOT USB.

From my experience, using an adapter puts it on the native interface - notably, with CF, it's easiest to put the device into a machine that has a native IDE (not SATA) interface. CF is pin compatible with IDE.

Now, in the current offering of SLC/MLC "drives", you can actually get better read/write speeds since they "RAID" the internal chips, for lack of a better term. I'm using a Transcend ATA-4 CF device that gets around 30MB/sec read/write in a machine in my garage; it's an SLC device that isn't their top of the line, but it was more cost-effective.

So, using the IDE/ATA-4 interface on the CF card gets lower CPU utilization than a USB device. It still doesn't hit the 40MB/sec you quoted, but 40MB/sec is a pipe dream on USB in my experience.

Re:Still too expensive... (2, Informative)

Dr. Ion (169741) | more than 5 years ago | (#26942637)

Your CF card is going to use the USB interface

This is Informative?

CF cards are actually IDE devices. The adapters that plug CF into your IDE bus are just passive wiring; no protocol adapter is needed.

It's trivial to replace a laptop drive with a modern high-density CF card, and sometimes a great thing to do.

The highest-performance CF cards today use UDMA for even higher bandwidth.

High-Speed USB can't reasonably get over 25MB/sec from the cards using a USB-CF adapter, but you can do better by using the card's native bus.

Re:Still too expensive... (2, Interesting)

couchslug (175151) | more than 5 years ago | (#26942297)

If it's an older laptop or the mechanical hard disk died, go for it. Addonics makes SATA CF adapters, so you are not restricted to IDE CF adapters.

Re:Still too expensive... (1)

jimmyhat3939 (931746) | more than 5 years ago | (#26942363)

No doubt. But, I really think that within 5 years you're going to see most laptops using only an SSD.

Mere mortals need more toy budget (4, Insightful)

wjh31 (1372867) | more than 5 years ago | (#26940921)

I think the bigger challenge will be in getting mere mortals to have a $400 toy budget to afford the SSD

Hyperinflation to the rescue (1, Funny)

Anonymous Coward | more than 5 years ago | (#26940931)

Your government is working towards it.

Re:Mere mortals need more toy budget (1)

Carrion Creeper (673888) | more than 5 years ago | (#26940967)

I for one hope he is successful so that when SSDs become more affordable, or even the default, Linux will be nicely optimized.

Re:SSD's should have no problem with fragmentation (3, Insightful)

von_rick (944421) | more than 5 years ago | (#26941031)

From economics, let's turn our attention to optimizing this toy of ours. The thing with SSDs is that they don't have a read/write head to worry about. This means that no matter where the data is stored in the device, all we need to do is specify the fetch location and the logic circuits select that block to extract the data from the desired location. From what I've heard, SSDs have an algorithm that assigns different blocks to store the data so that the memory cells in a single location aren't overused.

Re:SSD's should have no problem with fragmentation (0)

Anonymous Coward | more than 5 years ago | (#26941089)

Yes, that's true. But the important thing is ensuring that the OS/filesystem breaks the data up into appropriate sized chunks that match up with the block size that the disk controller uses. This has nothing to do with fragmentation.

Re:SSD's should have no problem with fragmentation (5, Interesting)

v1 (525388) | more than 5 years ago | (#26941395)

I don't think this is going to be a significant problem when compared to normal seek time problems.

Let's say we have 100k of data to read. 512-byte blocks would require 200 reads; 4k blocks would require 25 reads.

For rotating discs: If the data is contiguous, we have to hope that all the blocks are on the same track. If they are, then there is 1 (potentially very costly) seek to get to the track with all the blocks on it. The cost of the seek depends on the track it's going to, the track it's on, and whether or not the drive is sleeping or spun down. Otherwise we also get to do another very short seek, which adds a bit of time, to get to the next adjacent track. Worst case, all 200 blocks are on different tracks, scattered randomly on the platter, requiring 200 seeks. Ouch ouch ouch.

For SSDs: What matters is the number of cells we have to read. Cells will be 4k in size. All seek times are essentially zero. Best case, all data is contiguous and the start block is at the start of a cell; read time boils down to how fast the flash can read 25 cells. Worst case, the data is 100% fragmented, such that each of the 200 512-byte blocks resides in a different cell, requiring 200 cell reads (an 8-fold increase in time required). There will also be overhead in copying out the 512-byte data from each buffer and assembling things, but this time is negligible for this comparison.
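The best-case and worst-case read counts above can be written out as a small model (assuming, per the comment, 512-byte filesystem blocks and 4k flash cells; for 100k of contiguous data that is 25 cell reads, so full fragmentation is an 8x penalty):

```python
import math

BLOCK = 512   # filesystem block size in bytes
CELL = 4096   # flash cell (page) size in bytes

def cell_reads(total_bytes, fragmented):
    """Cell reads needed for a file: contiguous data packs 8 blocks
    per cell; fully fragmented data costs one cell read per block."""
    blocks = total_bytes // BLOCK
    return blocks if fragmented else math.ceil(total_bytes / CELL)

size = 100 * 1024  # the 100k example from the comment
print(cell_reads(size, fragmented=False))  # 25
print(cell_reads(size, fragmented=True))   # 200
```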

While the 8x time increase looks significant, it's important to compare the probabilities involved, and just how bad things get. The most important difference between how these two drives react is the space between fragments. The "worst case" for the SSD, 100% fragmentation, is highly unlikely. I don't even want to think about what a spinning disc would do if asked to perform a head seek for 100% of the blocks in, say, a 1MB file. The read head would probably sing like a tuning fork at the very least. 2000 cell reads compared to 2000 seeks: the SSD will win handily every single time, even if the tracks on the disc are close.

If the spacing between fragments is anything near normal, say 30-100k, then there will be some seeking going on with the disc, and there will be some wasted cell reads with the SSD. But compare having to do one extra cell read with having to do an extra head seek: again, the SSD wins hands down. The advantage of the SSD actually shrinks as fragmentation goes down, because on the disc most fragments cause a head seek, each of which significantly widens the time gap. Also, a spinning disc will read in the blocks much faster than the cells on an SSD.

I realize the OP was more describing the possibility of "not so much bang for the buck as you were expecting" due to fragmentation, and I know the above is more about comparing the two than about what happens to the SSD. But if you consider the effects of fragmentation on a spinning disc, and then weigh how that impact compares with an SSD, it's easy to see that the fragmentation that sent you running for the defrag tool yesterday may not even be noticeable with an SSD. So I'd call this a "non-issue".

What I'm waiting for is for them to invest the same dev time in read speeds as in write speeds. SSDs don't appear to be doing any interleaved reads; they do it for writes because writes are so slow. Though at this point I wonder if read speeds are just plain running into a bus speed limit on the SSDs?

Another file strategy - file segregation by f(x) (4, Insightful)

spineboy (22918) | more than 5 years ago | (#26942001)

Why not functionally group files to decrease or eliminate fragmentation? Or maybe this is already done.
For example: I have a large collection of MP3 files. They essentially do not change; I don't edit them, and rarely erase them. The file system could look at the type of file (MP3 vs. DOC) and place it accordingly. It could also look at the time of the last change to the file and place it in a certain area. Older, unchanged files would go in a tightly packed area that is optimized and not fragmented.

Re:SSD's should have no problem with fragmentation (1)

jimmyhat3939 (931746) | more than 5 years ago | (#26942447)

Good analysis. The statistics I've read indicate that SSDs don't perform all that much better than hard drives in real-world scenarios. I think this is part of the reason why. On the other hand, they do use less energy, which is a clear positive for a laptop.

Re:SSD's should have no problem with fragmentation (1)

vux984 (928602) | more than 5 years ago | (#26942529)

On the other hand, they do use less energy, which is a clear positive for a laptop.

And thus they are cooler. A clear positive for any system, but especially a laptop.
They are also silent and don't vibrate.
They are also, from what I understand, more reliable.

I'm seriously considering flash drives for my desktop PC... they just need one more capacity jump and I think they'll be worth it. $400 for 128GB is a touch small... but I'll go for it at $400 for 256GB. On my main PC I'm only using 236GB of my 500GB drive, and I could easily move 150GB of that onto my 1TB external eSATA drives that I turn on when I need them.

Re:SSD's should have no problem with fragmentation (1)

Jurily (900488) | more than 5 years ago | (#26941199)

This means that no matter where the data is stored in the device, all we need to do is specify the fetch location and the logic circuits select that block to extract the data from the desired location.

Which is why you don't need head-optimized I/O schedulers like Anticipatory, which waits a couple of ms after every read to see if there's more from that area, thus saving on seek times.

SSDs must be optimized differently. For instance, they can't write arbitrarily small pieces of data, only whole blocks. Thus, if you want to optimize for them, you'd better make sure to write whole blocks at a time where possible, and not have small files cross block boundaries if they don't have to.
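The whole-block constraint can be illustrated with a hypothetical cost model (the 4k page size and the function are assumptions for illustration, not any real drive's parameters): a small write costs a read-modify-write for every flash page it touches, so boundary crossings matter.

```python
PAGE = 4096  # assumed smallest programmable flash unit, for illustration

def pages_touched(offset, length):
    """Count how many flash pages a write of `length` bytes at byte
    `offset` dirties; each dirty page costs a read-modify-write."""
    first = offset // PAGE
    last = (offset + length - 1) // PAGE
    return last - first + 1

# A 100-byte write straddling a page boundary dirties two pages:
print(pages_touched(4090, 100))  # 2
# The same write aligned to a page boundary dirties only one:
print(pages_touched(4096, 100))  # 1
```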

Re:SSD's should have no problem with fragmentation (0)

Anonymous Coward | more than 5 years ago | (#26941221)

Every mass storage device since cassette tapes reads/writes a whole block at a time.

Re:SSD's should have no problem with fragmentation (1)

diskis (221264) | more than 5 years ago | (#26941647)

Yes, but for SSDs the blocks are larger, which causes problems when essentially all software is optimized for smaller blocks.

What?! (0)

Anonymous Coward | more than 5 years ago | (#26941877)

You mean after all the hoopla the Linux people made about the Anticipatory Scheduler, the code is nothing more than:

wait_awhile()

What a ripoff.

Re:SSD's should have no problem with fragmentation (-1)

Anonymous Coward | more than 5 years ago | (#26941421)

Your clumsy use of technical sounding but inappropriate jargon, coupled with the fact you've totally failed to grasp the issue, belies your ignorance and renders your attempt to sound insightful completely risible.

Re:Mere mortals need more toy budget (2, Insightful)

KibibyteBrain (1455987) | more than 5 years ago | (#26940973)

Well, they will obviously go down in price eventually. The real price issue won't be affordability but rather value. Do most consumers out there really want what would seem to average out to a slightly faster drive, or an order of magnitude or two more storage? There have always been fast-drive solutions in the past, and they have never been very popular and quickly become obsolete. Eventually some sort of SSD will take over the market, but I don't believe this sort of compromised-experience business model will sell them, unless cloud storage and internet everywhere becomes mainstream fast.

Re:Mere mortals need more toy budget (1)

Average (648) | more than 5 years ago | (#26941047)

Sure. There are *lots* of considerations beyond speed to want SSDs.

First is battery life. Batteries suck. Laptops pulling 5 or 6 watts total make that suck more bearable. SSDs are part of that.

There's also noise. Hard drives have gotten much quieter. But in a dead-silent conference room, I want dead-silence.

Even form factor is an issue. A 2.5" drive is a notable chunk of a small notebook, and 1.8" drives are, generally, quite slow. SSDs can be worked into the design.

Re:Mere mortals need more toy budget (3, Informative)

piripiri (1476949) | more than 5 years ago | (#26941073)

Sure. There are *lots* of considerations beyond speed to want SSDs

And SSD drives are also shock-resistant.

"shock-resistant" (1, Funny)

Anonymous Coward | more than 5 years ago | (#26942283)

Sure. There are *lots* of considerations beyond speed to want SSDs

And SSD drives are also shock-resistant.

The drives will be shocked when they see what I have in my pr0n collection.

Re:Mere mortals need more toy budget (1)

beaviz (314065) | more than 5 years ago | (#26942355)

Sure. There are *lots* of considerations beyond speed to want SSDs

And SSD drives are also shock-resistant.

But... Are they resistant to shouting [sun.com] ?

Re:Mere mortals need more toy budget (1)

berend botje (1401731) | more than 5 years ago | (#26941185)

For dead-silence you might be better off with getting a LED backlight. In my laptop I can't hear the hard drive over the whine of the backlight converter.

Re:Mere mortals need more toy budget (0)

Anonymous Coward | more than 5 years ago | (#26941259)

News Flash: your backlight converter is busted.

Re:Mere mortals need more toy budget (2, Interesting)

Anonymous Coward | more than 5 years ago | (#26941355)

As other components become less noisy, the "solid state" electronics' acoustic noise becomes audible. It isn't necessarily faulty electronics, just badly designed with no consideration for vibrations due to electromagnetic fields changing at audible frequencies. These fields subtly move components and this movement causes the acoustic noise. Most often it is a power supply or regulation unit which causes high pitched noises. Old tube TV sets often emit noise at the line frequency of the TV signal (ca. 15.6kHz for PAL, ca. 15.8kHz for NTSC).

Re:Mere mortals need more toy budget (1)

andreyvul (1176115) | more than 5 years ago | (#26941773)

By old, you mean every single CRT TV.
I've heard the 16kHz whine whenever I mute the sound.
CRT monitors are exempt because VGA line frequency is > 22 kHz.

Re:Mere mortals need more toy budget (1)

berend botje (1401731) | more than 5 years ago | (#26941693)

Thanks for the info. However, it seems most converters are busted, as I can hear them on quite a lot of laptops or tft screens.

I don't mean to frighten you, but perhaps you should have your ears checked next time you get a physical. If you've spent considerable time around heavy machinery or loud music, you may have lost the ability to hear high-pitched sounds. As this happens gradually, it generally isn't noticed.

Really, get it checked out and (when applicable) change your habits regarding exposure to loud sounds.

Re:Mere mortals need more toy budget (1)

bcrowell (177657) | more than 5 years ago | (#26941689)

Sure. There are *lots* of considerations beyond speed to want SSDs.

Another example: I have a tiny NSLU2 network appliance that I use as a music server. In the out-of-the-box configuration, it runs Linux from a ROM, but you can add an external drive via a USB cable and boot Linux off of that. It doesn't have SATA, so that wasn't an option.

I'm not sure why this guy paid $400 for an 80GB SSD. I just upgraded my music server to a 64GB SSD, and it only cost $100. Maybe the one he got is a fancier, faster drive?

For my application, none of these filesystem performance things are really an issue at all. I'm almost always reading, almost never writing, and the bottleneck for speed, when there is one, is always the ARM CPU.

It's great if people enjoy tinkering with the latest technology, but the impression I get is that this just isn't the right time to be switching your desktop machine to SSD. Price per GB is going down rapidly, but is still very poor compared to platters. Performance with SSD technology is potentially much better than with platters, but it will probably be a few more years until (a) operating systems are optimized for SSDs, and (b) all the drives on the market are really optimized for performance the way they should be.

Re:Mere mortals need more toy budget (1)

rcw-home (122017) | more than 5 years ago | (#26942145)

I'm not sure why this guy paid $400 for an 80GB SSD. I just upgraded my music server to a 64GB SSD, and it only cost $100. Maybe the one he got is a fancier, faster drive?

Price/GB for SSDs seems to be largely proportional to the number of write operations per second the SSD can handle. Once a handful of manufacturers solve that particular puzzle, I expect prices will drop significantly.

Re:Mere mortals need more toy budget (1)

conureman (748753) | more than 5 years ago | (#26941215)

I've been wrestling this idea around as a sound-studio solution, and it seems that an external storage unit makes the most sense, with a DRAM card for the currently working files. Almost affordable, anyway.

Re:Mere mortals need more toy budget (1)

amclay (1356377) | more than 5 years ago | (#26941097)

Obviously he needs to overclock his SSD. That would be epic.

Re:Mere mortals need more toy budget (0)

Anonymous Coward | more than 5 years ago | (#26941253)

yes

Re:Mere mortals need more toy budget (-1, Troll)

jellomizer (103300) | more than 5 years ago | (#26941419)

This story says to me.
"Hey look, I am a rich kid who can afford expensive gear. And I have the time to optimize it. See how much better I am than the rest of you."

Re:Mere mortals need more toy budget (1, Insightful)

Anonymous Coward | more than 5 years ago | (#26942159)

Well maybe you should check who the story submitter is.
If he doesn't "have the time to optimize it", we're in deep trouble :-)

Re:Mere mortals need more toy budget (1)

Hatta (162192) | more than 5 years ago | (#26941739)

You can buy a 32GB SSD for less than $100 [oempcworld.com] today. Is that within the budget of mere mortals?

Agreed .. But equally important is ... (1, Interesting)

Anonymous Coward | more than 5 years ago | (#26940933)

Yes, we do need progress in that area. However, for many of us who require better-than-average data security, the matter of SSDs' read/write behaviour makes the devices extremely vulnerable to analysis and discovery of data that the owner believes to be inaccessible to others: 'secure wiping', or the lack thereof, is the issue. As I understand it, 'secure wiping' programs fail to do their job on SSDs. It's been reported among 'criminals' that SSDs are a 'forensic analyst's dream come true', and so it must be for corporate spies, etc., who have a yen for theft of private data.

Re:Agreed .. But equally important is ... (2, Informative)

ultrabot (200914) | more than 5 years ago | (#26940977)

However, for many of us who require better-than-average data security, the matter of SSD's read/write behaviour makes the devices extremely vulnerable to analyses and discovery of data the owner/author of which believes to be inaccessible to others: 'secure wiping', or lack thereof, is the issue.

Obviously you should be encrypting your sensitive data.

Also, it should be no problem to write a bootable CD/USB that does a complete wipe. Just write over the whole disk, erase, repeat. No wear leveling will get around that.

Re:Agreed .. But equally important is ... (3, Insightful)

Antique Geekmeister (740220) | more than 5 years ago | (#26941193)

Such tools already exist. Even the venerable "dd if=/dev/zero of=/dev/sda" is extremely effective at flushing a drive well beyond the ability of any but the most well-equipped recovery services, and it's a lot faster than the "overwrite with zeroes, then ones, then 101010..., then 010101..., then random data" approach used by some people with too much time on their hands and too much paranoia for casual data.

Re:Agreed .. But equally important is ... (1)

WNight (23683) | more than 5 years ago | (#26942629)

Yes, dd, especially with random data, is pretty much as secure as any commercial product. But they all fail to touch the hidden blocks the drive has remapped because of potential failure.

Re:Agreed .. But equally important is ... (1, Informative)

Kjella (173770) | more than 5 years ago | (#26941211)

Also, it should be no problem to write a bootable cd/usb that does a complete wipe. Just write over the whole disk, erase, repeat. No wear leveling will get around that.

At least for OCZ drives, the user capacity is several gigs lower than the raw capacity, like 120GB usable out of 128GB. I don't know about your data, but pretty much anything can be left in those 8GB. The only real solution is to not let sensitive data touch the disk unencrypted.

Re:Agreed .. But equally important is ... (3, Informative)

raynet (51803) | more than 5 years ago | (#26941543)

Unfortunately, flash SSDs usually have some percentage of sectors you cannot directly access; these are used for wear leveling and bad-sector remapping. So when you dd from /dev/zero, it is quite possible that some part of the original data is left intact. And there can be quite a lot of those sectors: I recall reading about one SSD that had 32GiB of flash in it but 32GB available to the user, so 2250MiB was used for wear leveling and bad sectors (it helps to get better yields if you can tolerate several bad 512KiB cells).
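The spare-area figure in the parent follows directly from the GiB-vs-GB mismatch; the numbers below just redo that arithmetic (both capacities are the parent's hypothetical round numbers):

```python
GIB = 1024 ** 3   # binary gigabyte (gibibyte), how flash chips are sized
GB = 10 ** 9      # decimal gigabyte, how user capacity is usually quoted
MIB = 1024 ** 2

raw_flash = 32 * GIB      # physical flash in the example drive
user_capacity = 32 * GB   # capacity actually exposed to the user
spare_mib = (raw_flash - user_capacity) / MIB

# Roughly 2250 MiB is held back for wear leveling and bad sectors:
print(f"{spare_mib:.0f} MiB reserved")
```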

Re:Agreed .. But equally important is ... (1)

WNight (23683) | more than 5 years ago | (#26942625)

Agreed. And not just SSDs. Regular HDs remap sectors if they think they're failing. But usually they do so without you noticing a failure, which means that an almost perfectly readable copy of that sector has simply been remapped. No amount of overwriting will ever hit that sector because the drive is sure it's doing you a favor.

The info is still there, just a few debug commands away.

No Money, Mo Problems (1)

flakblas (1366723) | more than 5 years ago | (#26940951)

I know right? Send some cheddar my way Mr. Gates.

Re:No Money, Mo Problems (3, Funny)

larry bagina (561269) | more than 5 years ago | (#26941241)

No worries. Once Barack Obama(1) pays for your house and car, he'll pay off your credit card bills.
  1. future generations of Americans

Re:No Money, Mo Problems (1)

flakblas (1366723) | more than 5 years ago | (#26941343)

Nice.

Re:No Money, Mo Problems (0)

Anonymous Coward | more than 5 years ago | (#26942035)

It is. From the way Americans spend money they don't have to buy things they want to talk about owning around the water cooler the President is doing exactly what Americans want him to do.

Of course, the Republicans will shriek, but just what is their platform:

1. Fiscal Responsibility (historical). You've got to be kidding.

2. Freedom from overly intrusive government (historical). You've got to be kidding (and, yes, I do know we're "at war".)

3. Put in place a cadre of puppet sycophants that will establish American government as an Evangelical Theocracy. Bingo.

Of course, the detractors will trot out a chorus line of sad families in truly desperate straits through no fault of their own. But as much as people might not think so, these cases are the minority. But why should the Republicans suddenly care about collateral damage? They never have before; here or in other countries.

The President is doing anything he can think of to kick-start the economy from the ground up. The stated alternative of 'companies should pay no taxes' and 'banks can falsify assets' is what put us here, aided and abetted by a stupid, selfish populace. Reaganomics works about as well as Communism; it's a good thing they are both going to die. America is richer than the ol' CCCP, so Reaganomics is taking longer to expire. Probably needs a head-shot.

Please try and remember that Obama is well right of the Welfare State the Repubs are screaming about. (I do wish he'd get rid of that bitch Pelosi though. She's bi-partisanship's Typhoid Annie.)

Is it only linux? (4, Interesting)

jmors (682994) | more than 5 years ago | (#26940979)

This article makes me wonder if any OS is really properly optimized for SSDs. Has there been any analysis as to whether or not windows machines properly optimize the use of solid state disks? Perhaps the problem goes beyond just linux?

Re:Is it only linux? (2, Informative)

Jurily (900488) | more than 5 years ago | (#26941071)

unfortunately the default 255 heads and 63 sectors is hard coded in many places in the kernel, in the SCSI stack, and in various partitioning programs; so fixing this will require changes in many places.

Looks like someone broke the SPOT rule.

As for other OSes:

Vista has already started working around this problem, since it uses a default partitioning geometry of 240 heads and 63 sectors/track. This results in a cylinder boundary which is divisible by 8, and so the partitions (with the exception of the first, which is still misaligned unless you play some additional tricks) are 4k aligned.
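The divisibility claim about Vista's geometry is easy to check (512-byte sectors, so a 4 KiB block spans 8 sectors):

```python
def cylinder_sectors(heads, sectors_per_track):
    """Sectors per cylinder under a given (fake) CHS geometry."""
    return heads * sectors_per_track

# The legacy 255-head geometry gives 16065 sectors per cylinder, which
# is not divisible by 8, so cylinder-aligned partitions miss 4 KiB
# boundaries:
print(cylinder_sectors(255, 63) % 8)  # 1
# Vista's 240-head geometry gives 15120, which is divisible by 8:
print(cylinder_sectors(240, 63) % 8)  # 0
```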

Re:Is it only linux? (5, Insightful)

NekoXP (67564) | more than 5 years ago | (#26941339)

Yeah, hard disk manufacturers.

Since they moved to large disks which require LBA, they've been fudging the CHS values returned by the drive to get the maximum size available to legacy operating systems. Since when did a disk have 255 heads? Never. It doesn't even make sense anymore when most hard disks are single platter (therefore having 1 or 2 heads) and SSDs don't even have heads.

What they need to do is define a new command structure for accurately determining the best structure on the disk - on an SSD this would report the erase block size or so; on a hard disk, how many sectors are in a cylinder - without fucking around with some legacy value designed in the 1980s.

Re:Is it only linux? (1, Interesting)

Anonymous Coward | more than 5 years ago | (#26941641)

Somebody please mod the parent up to 5.

Yeah, hard disk manufacturers.

Since they moved to large disks which require LBA, they've been fudging the CHS values returned by the drive to get the maximum size available to legacy operating systems. Since when did a disk have 255 heads? Never. It doesn't even make sense anymore when most hard disks are single platter (therefore having 1 or 2 heads) and SSDs don't even have heads.

What they need to do is define a new command structure for accurately determining the best structure on the disk - on an SSD this would report the erase block size or so; on a hard disk, how many sectors are in a cylinder - without fucking around with some legacy value designed in the 1980s.

With drive electronics as complex as they are nowadays, you'd think the OS wouldn't need to know much. Just give it a couple of stats to allow the file system to align properly, and stop with all this CHS translation.

chs no longer used (1, Informative)

Anonymous Coward | more than 5 years ago | (#26941959)

I haven't yet found a SATA device (even DOMs) that requires CHS addressing.

Clearly it was a mistake to use hardware quirks to address sectors, but then again, ATA became a de facto standard before anyone realized it might become one.

Re:Is it only linux? (1)

Dr. Ion (169741) | more than 5 years ago | (#26942659)

A bigger problem is our reluctance to move off 512-byte sectors. Who needs LBA granularity that fine?

That's two sectors per kilobyte, dating back to the floppy disk. And we still use this quantum on TB hard disks.

Re:Is it only linux? (0)

Anonymous Coward | more than 5 years ago | (#26941125)

That's not the way advances in compatible systems are achieved. First the new technology has to be fast enough to be a drop-in replacement for its predecessor. After a while, other parts of the system adapt to the changed characteristics of the whole and thereby realize more of the potential which came with the new technology.

Re:Is it only linux? (2, Informative)

mxs (42717) | more than 5 years ago | (#26941245)

Of course it goes beyond just Linux. Microsoft is aware of the problem and working on improving its SSD performance (they already did some things in Vista as the article states, and Windows 7 has more in store; google around to find a few slides from WinHEC on the topic).

The problem with Windows w.r.t. optimizing for SSDs is that it LOVES to do lots and lots of tiny writes all the time, even when the system is idle (and more so when it is not). Try moving the "prefetch" folder to a different drive. Try moving the system event log files to a different drive. And try to keep an eye out for applications that use the system drive for small writes extensively (or muck about in the registry a lot). These are the hard parts. The easier parts would be to make sure hibernation is disabled, pagefiles are not on the SSD (good luck getting Windows to not use pagefiles at all; possible, but painful even if you have a dozen gigs of memory), prefetching is disabled, the filesystem is properly aligned, printer spools are moved, etc. With only the tools Windows provides, it is painful to attempt to prolong your SSD's life (this is not just about performance; remember that you only have a limited number of erases before the drive becomes toast).

There are some solutions; MFT for Windows (http://www.easyco.com/) provides a block device that consolidates many small writes into larger ones and does not overwrite anything unless absolutely necessary (i.e. changes are written onto the disk sequentially; overwriting only takes place once you run out of space). It is very, very costly, but it does its job well. Performance skyrockets, drive longevity improves by an order of magnitude.

You can also use hacks such as Windows SteadyState; This also streamlines writes (but adds another layer of indirection). Performance improves, but you get to deal with SteadyState-issues. EFT also works (and is less of a GUI-y system, though largely providing the same services even on Windows 2000/XP); you have got to be careful though, if your system tends to lose power or crash, all the changes since the last boot will be lost; EFT can be made to write out all the changes it has accumulated -- but after that, the only way to reenable it is to restart the system.

Windows is not particularly nice to SSDs when used as a system disk. For a data partition it is not quite as bad (although if you deal with many small writes, you might still run into heaps of trouble). The optimizations described here for Linux are applicable to Windows as well (aligning filesystem blocks to erase blocks and 4k NAND sectors). You would also want to move things that do lots of small writes to a different (spinning) disk - system logs, for instance, and most spool directories. You'd also want to make absolutely sure that you do not have access-time updates enabled; each of those is, essentially, a write (even if ultimately consolidated).

Re:Is it only linux? (1)

maxume (22995) | more than 5 years ago | (#26942287)

Wear leveling algorithms would have to be nearly brain dead for a few megabytes a minute to kill a disk with 100 gigabytes and 100,000 write cycles (that's billions of minutes, which is hundreds of years).

Parsimony suggests that optimizing for the characteristics of the device is a good idea, but SSD wear isn't something that desktop users even need to think about.
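
The parent's back-of-the-envelope math checks out; a quick sketch (the 5 MB/min write rate stands in for "a few megabytes a minute", and the 100,000-cycle figure is the parent's assumption):

```python
# Rough lifetime estimate under ideal wear leveling, using the parent's numbers.
capacity = 100e9     # bytes: the 100 GB drive
cycles = 100_000     # assumed erase cycles per cell
write_rate = 5e6     # bytes/min: "a few megabytes a minute"

total_writable = capacity * cycles     # bytes the drive can absorb, perfectly leveled
minutes = total_writable / write_rate  # 2e9 minutes - the "billions of minutes"
years = minutes / (60 * 24 * 365)
print(f"{minutes:.0e} minutes, ~{years:.0f} years")
```

At these numbers the drive outlives its owner by a wide margin, which is the parent's point.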

Re:Is it only linux? (0)

Anonymous Coward | more than 5 years ago | (#26941279)

Apparently Vista, Windows 7, and OS X automatically align the partition when formatting a disk. XP needs to be tweaked by hand.

Re:Is it only linux? (1)

eharvill (991859) | more than 5 years ago | (#26941541)

Does that include the system drive or just the data drives?

Re:Is it only linux? (1)

Fackamato (913248) | more than 5 years ago | (#26941429)

Yes. The soon-to-be-released OCZ Vertex is discussed in this forum, with a poll from an OCZ guy on how the firmware will be optimized... many IO/s or many MB/s? http://www.ocztechnologyforum.com/forum/forumdisplay.php?f=186 [ocztechnologyforum.com] Partition alignment is important, as are some registry tweaks. Disable prefetch and search indexing, and probably some other services that are useless and/or just waste the SSD's life span instead of enhancing performance.

Re:Is it only linux? (1)

HartDev (1155203) | more than 5 years ago | (#26941673)

Do you think this will be addressed in Linux before anything else? MS has dropped the ball twice now, with Windows ME and Vista. And Linux is used a lot in servers, which don't really need a solid state drive as much as a laptop does.

Ironically I was just going out to buy a small one (3, Informative)

earthforce_1 (454968) | more than 5 years ago | (#26941011)

If I mount /home on a separate drive, (good to do when upgrading) the rest of the Linux file system fits nicely on a small SSD.

Re:Ironically I was just going out to buy a small (0)

Anonymous Coward | more than 5 years ago | (#26941169)

I've been doing this for years with CF cards.

Put the volatile stuff on a spindle, the rest on a CF card.

Re:Ironically I was just going out to buy a small (0)

Anonymous Coward | more than 5 years ago | (#26941907)

"coincidentally", not "ironically".

Forget Disk paradigm (0)

Anonymous Coward | more than 5 years ago | (#26941143)

Why not use it as 'permanent' RAM? I'd be more than happy with just an enormous hashmap on it, and an easy API to handle it.
Forget about using it as a disk - it isn't one.

Toy budget (1)

conureman (748753) | more than 5 years ago | (#26941147)

Most of us can't afford to worry about this, but does the Fusion-io suffer from this issue?

What is the specific issue? (1)

DJRumpy (1345787) | more than 5 years ago | (#26941161)

Surely it's not the block size? I know little about filesystems beyond the basics. Windows lets you specify the block size to be used; I assumed that Linux did the same. I have no idea about OS X either.

Are there standard block sizes in use for Linux and OS X filesystems? Can they be modified when they are formatted? If so, and the issue really is due to block size and the resulting fragmentation, this would seem like an easy fix. Linux and OS X already resist fragmentation; I won't speak to MS's efforts there, as they state NTFS does, but the implementation seems very different in the real world.

Some of you FS gurus fill us in here. How hard is it to implement something like variable block sizes, or to allow specifying the block size at format time?

No. Not Now. Not Ever. I'm Coming For All of You! (5, Funny)

Anonymous Coward | more than 5 years ago | (#26941177)

> Vista has already started working around this problem, since it uses a default partitioning geometry of 240 heads and 63 sectors/track. This results in a cylinder boundary which is divisible by 8, and so the partitions (with the exception of the first, which is still misaligned unless you play some additional tricks) are 4k aligned. So this is one place where Vista is ahead of Linux...

Although the technology it is used in is repugnant, NTFS has always been the One True Filesystem. It descended from DIGITAL's ODS2 (On Disk Structure 2) which traces back to the original Five Models (PDP 1, 8, 10, 11 and 12). You see, ODS was written by passionate people with degrees and rich personal lives in Massachusetts who sang and danced before the fall of humanity to the indignant Gates series who assimilated their young wherever possible and worked them into early graves during his epic battle with the Steves before the UNIX enemy remerged after a 25 year sleep and nuked the United States, draining all of its technological secrets to the other side of the world. Gates, realizing what he's done, now travels the universe seeking to rebuild his legacy by purifying humanity while the Steve series attempts to rebuild itself. Some of the original Five are still around, left to logon to Slashdot and witness what's left of the shadow of humanity still in the game as they struggle blindly around in epic circles indulging new and different ways to steal music, art and technology to make up for their lack of creativity long ago bred out of them by the Gates series.

Re:No. Not Now. Not Ever. I'm Coming For All of Yo (1)

kclittle (625128) | more than 5 years ago | (#26941243)

I have mod points, but cannot find the "Totally Bonkers" mod...

Re:No. Not Now. Not Ever. I'm Coming For All of Yo (0)

Anonymous Coward | more than 5 years ago | (#26941273)

Don't do drugs, man.

Re:No. Not Now. Not Ever. I'm Coming For All of Yo (1)

Gogogoch (663730) | more than 5 years ago | (#26941433)

Please mod the parent funny; so say we all.

One True File System (0)

Anonymous Coward | more than 5 years ago | (#26942375)

Although the technology it is used in is repugnant, NTFS has always been the One True Filesystem.

I thought ZFS was.

Why pretend these are ordinary disks? (4, Insightful)

jensend (71114) | more than 5 years ago | (#26941187)

SSDs keep gaining more sophisticated controllers that try to make the SSD seem like an ordinary hard drive, but at the end of the day the differences are great enough that they can't all be plastered over that way (the fragmentation/long-term-use problems the story linked to are a good example). I know that (at present - this could and should be fixed) making these things run on a regular hard drive interface and tolerate a regular FS is important for Windows compatibility, but it seems like a lot of cost could be avoided and a lot of performance gained by having a more direct flash interface and using flash-specific filesystems like UBIFS, YAFFS2, or LogFS. I have to wonder why vendors aren't pursuing that path.

Re:Why pretend these are ordinary disks? (4, Interesting)

NekoXP (67564) | more than 5 years ago | (#26941401)

Because Intel and the rest want to keep their wear-leveling algorithm and proprietary controller as much of a secret as possible so they can try to keep on top of the SSD market.

Moving wear-levelling into the filesystem - especially an open source one - effectively also defeats the ability to change the low-level operation of the drive when it comes to each flash chip - and of course, having a filesystem and a special MTD driver for *every single SSD drive manufactured* when they change flash chips or tweak the controller, could get unwieldy.

Backing them behind SATA is a wonderful idea, but this reliance on CHS values I think is what's killing it. Why is the Linux block subsystem still stuck in the 20MB hard-disk era like this?

Is 1 Disk Raid the solution? (0)

WittyName (615844) | more than 5 years ago | (#26941317)

Partition the drive into BlockSize/4KB logical disks. Make sure the alignment is correct, then RAID these into one big disk.

This gives us one usable disk with maybe 128KB clusters.

Small files would need to share a cluster, but they would have done that anyway.

Take a look at Maemo . . . (1)

PolygamousRanchKid (1290638) | more than 5 years ago | (#26941321)

. . . which runs on the Nokia N800/N810 "Internet Tablets" (www.maemo.org). They might have done some tweaking, since this is Linux running on SSDs.

Re:Take a look at Maemo . . . (2, Interesting)

DragonTHC (208439) | more than 5 years ago | (#26941367)

Don't forget android.

Re:Take a look at Maemo . . . (1)

kamatsu (969795) | more than 5 years ago | (#26942201)

Dunno about Maemo, but Android uses the flash-optimized YAFFS for its storage, and doesn't include support for traditional Linux filesystems like ext2 or ext3.

Re:Take a look at Maemo . . . (1)

ADRA (37398) | more than 5 years ago | (#26942617)

Maemo and several other embedded systems have been using flash-based disk storage for years. The problem is that an SSD isn't exposed as a flash storage device; it's a hard-drive interface wrapped around a flash device.

Since Linux can't see the flash devices themselves, it can't properly implement a flash based hard-drive interface.

repeated re-write issues? (1)

supernova87a (532540) | more than 5 years ago | (#26941381)

When I saw the headline, I was thinking not so much of the fragmentation issues, but of the repeated rewriting of logs and other small, frequently accessed files that SSDs are susceptible to (maximum number of rated write cycles). Have there been any developments in that area?

Re:repeated re-write issues? (4, Informative)

nedlohs (1335013) | more than 5 years ago | (#26941529)

It will outlast a standard hard drive by orders of magnitude so it's completely not an issue.

With wear leveling and the technology now supporting millions of writes it just doesn't matter. Here's a random data sheet: http://mtron.net/Upload_Data/Spec/ASIC/MOBI/PATA/MSD-PATA3035_rev0.3.pdf [mtron.net]

"Write endurance: >140 years @ 50GB write/day at 32GB SSD"

Basically, the device will fail before it runs out of write cycles. You can overwrite the entire device twice a day and it will last longer than your lifetime. Of course, it will fail due to other issues before then anyway.

Can there be a mention of SSDs without this out-dated garbage being brought up?
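
As a sanity check, the datasheet's endurance line implies an erase-cycle rating of roughly 80,000 full-drive overwrites (back-of-envelope arithmetic, not vendor data):

```python
# Work backwards from the quoted endurance figure to implied erase cycles.
years = 140
daily = 50e9       # bytes written per day, per the datasheet
capacity = 32e9    # bytes: the 32GB SSD

total_written = years * 365 * daily        # lifetime bytes absorbed
implied_cycles = total_written / capacity  # full-drive overwrites
print(round(implied_cycles))               # -> 79844
```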

Re:repeated re-write issues? (4, Informative)

A beautiful mind (821714) | more than 5 years ago | (#26942047)

There are a few tricks up the manufacturer's sleeve to make this slightly better than it really is:

1. large block size (120k-200k?) means that even if you write 20 bytes, the disk physically writes a lot more. For logfiles and databases (quite common on desktops too, think of index dbs and sqlite in firefox for storing the search history...) where tiny amounts of data are modified, this can add up rapidly. Something writes to the disk once every second? That's 16.5GB / day, even if you're only changing a single byte over and over.

2. Even if the memory cells do not die, due to the large block size, fragmentation will occur (most of the cells will have a small amount of space used in them). There have been a few articles reporting that even devices with advanced wear-leveling technology like Intel's exhibit a large performance drop (less than half the read/write performance of a new drive of the same kind) after a few months of normal usage.

3. According to Tom's Hardware [tomshardware.com], unnamed OEMs told them that all the SSD drives they tested under simulated server workloads got toasted after a few months of testing. Now, I wouldn't necessarily consider this accurate or true, but I sure as hell would not use SSDs in a serious environment until this is proven false.
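
The parent's first point is easy to verify; assuming a 192 KiB erase block (an invented but plausible size within the 120k-200k range mentioned above), one tiny write per second really does add up to tens of gigabytes of physical writes per day:

```python
# Write amplification of a one-byte-per-second writer against large erase blocks.
erase_block = 192 * 1024        # bytes: assumed erase-block size
seconds_per_day = 24 * 60 * 60

physical = erase_block * seconds_per_day  # bytes physically rewritten per day
print(round(physical / 1e9, 1))           # -> 17.0 (GB/day, for 1 byte/s of payload)
```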

Re:repeated re-write issues? (2, Informative)

berend botje (1401731) | more than 5 years ago | (#26942105)

All nice and dandy, but these figures aren't exactly honest. In a normal scenario your filesystem consists largely of static data. Those blocks/cells are never rewritten, so the writes (for logfiles etc.) are concentrated on a small part of the disk, wearing it out rather more quickly.

Having had a few Compact Flash disks wear out recently, I'm not exactly anxious to replace my server disks with SSDs.

What is different about SSD's? (1)

deathguppie (768263) | more than 5 years ago | (#26941461)

From what I can quickly scrape together off the Internet (IANASE - I am not a software engineer), the biggest difference seems to be the lack of a need for error checking, disk defrag, etc. Since a normal spinning HDD does not actually delete a file but just removes the markers, the filesystem treats all areas the same and does the same things to both real and non-real data to keep the disk state sane. On an SSD, all of this leads to a lot of unneeded disk usage and premature degradation of the drive itself.

There seems to be more about Data set management but I don't quite understand it.. maybe someone more knowledgeable could explain it?

Re:What is different about SSD's? (1)

ADRA (37398) | more than 5 years ago | (#26942739)

Flash devices have the inherent weakness that if you write to the same place on the disk, say, 10,000 times, that part of the disk will stop working.

It's kind of like a corrupt sector (piece of the disk) on your regular hard drive, but instead of being triggered by drive defects or head crashes, it's based on a write count.

Why is this a big deal? Say I have a file called foose.txt. I decide that my neat program will open the file, increment a number, then close the file again. It sounds pretty simple, but imagine if this ran 20 times a minute. That's 1,200 writes an hour, or 28,800 in one day.

If I were running this application against a raw flash device, I would have killed that 10,000-write flash sector in a single day.

What the sophisticated management software in SSDs does is notice that I'm writing to that sector too many times and move my file somewhere else instead. So I'm still using up 28,800 writes on the device as a whole, but 28,800 writes spread over hundreds or thousands of sectors is a lot better than killing a single sector.

That's why selective and flexible placement of writes on a flash-based disk is so important. Many Linux flash filesystems do basically the same thing an SSD does behind the scenes, but since Linux can't see behind the veil that is the SSD, we can't use flash filesystems on top of SSD disks. Because of this, I imagine the author would like Linux devs to better support SSDs by getting non-flash filesystems to handle them better than they do today.
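
A toy model of the spreading the parent describes - an idealized round-robin leveler, with all numbers invented for illustration:

```python
import itertools

SECTORS = 1000           # assumed number of flash sectors on the device
wear = [0] * SECTORS

def write_leveled(n_writes):
    """Spread repeated writes round-robin across every sector."""
    for _, s in zip(range(n_writes), itertools.cycle(range(SECTORS))):
        wear[s] += 1

write_leveled(28_800)        # one day of the parent's 20-writes-a-minute file
print(max(wear), min(wear))  # busiest sector saw 29 erases, not 28,800
```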

Don't SSD's have a pre-set number of writes? (2, Funny)

DJRumpy (1345787) | more than 5 years ago | (#26941639)

I'm just sitting here thinking: doesn't an SSD have a preset number of writes in it due to its nature?

Does it really matter if they spread these writes around on the hard drive when the number of writes the drive is capable of doing is still the same in the end?

To drastically oversimplify, let's say that each block can be written to twice. Does it really matter if they used up the first blocks on the drive and just spread towards the end of the partition with general usage, rather than jumping all over to try to spread the writes around?

Am I thinking about this the wrong way? What benefit does it give them to spread the writes around if the total number of writes doesn't change? Doesn't it just further fragment the files with little gain?

Re:Don't SSD's have a pre-set number of writes? (2, Informative)

berend botje (1401731) | more than 5 years ago | (#26942135)

Say you have 100 cells and can write 10 times to each cell.

Write to every cell nine times: 100 * 9 = 900 writes, and you still have a completely working disk.

Concentrate those 900 writes on the first few cells: you now have 90 defective cells. In fact, since you still have to rewrite the data to working cells, you have lost your data, as there aren't enough working cells left.
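
The parent's arithmetic, spelled out:

```python
# 100 cells, 10 writes each: even vs. concentrated wear.
cells, limit = 100, 10

even = cells * (limit - 1)  # nine writes to every cell
print(even)                 # -> 900, and every cell still works

dead = 900 // limit         # the same 900 writes hammered ten-at-a-time
print(dead)                 # -> 90 cells burned out
```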

Re:Don't SSD's have a pre-set number of writes? (0)

Anonymous Coward | more than 5 years ago | (#26942165)

Not sure why your post is modded funny.

You are correct - there is a limited count on the number of times you can write to the SSD. Actually, the number of times you can erase is what is limited, but it amounts to the same thing.

The problem is that most file-systems want to keep important book-keeping data in a fixed location on a disk so that they can find it. That isn't a good idea with SSDs because you will quickly "burn out" that location because of all the read/erase/write cycles on it. When that location is gone, the SSD will be unusable to that file-system.

Most hardware wear-leveling (the "smarts" inside the SSD) works by fooling the file-system. It will remap locations on the SSD so that all of the SSD wears equally. Even when a filesystem thinks it is repeatedly erasing/writing a single location, the SSD will spread it out across all available locations.

On the other hand, file-systems that are "flash aware" will do the wear-leveling themselves, even on raw flash chips that don't have hardware wear-leveling.

The question that arises is which is better - doing wear-leveling in SSD or in the file-system? There are pros and cons to each method.
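
The hardware-side remapping the parent describes can be sketched as a logical-to-physical translation table (purely illustrative - real flash translation layers are far more involved):

```python
# A tiny flash-translation-layer model: logical blocks are remapped so that
# rewriting one logical block spreads erases across many physical blocks.
class TinyFTL:
    def __init__(self, n_phys):
        self.mapping = {}                # logical block -> physical block
        self.free = list(range(n_phys))  # unused physical blocks
        self.erases = [0] * n_phys       # erase count per physical block

    def write(self, logical):
        # Place every write on a fresh physical block...
        phys = self.free.pop(0)
        if logical in self.mapping:
            old = self.mapping[logical]
            self.erases[old] += 1        # ...and recycle the stale copy.
            self.free.append(old)
        self.mapping[logical] = phys

ftl = TinyFTL(4)
for _ in range(100):
    ftl.write(0)               # the same logical block, 100 times over
print(ftl.erases)              # erases spread nearly evenly over all 4 blocks
```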

Re:Don't SSD's have a pre-set number of writes? (1)

DJRumpy (1345787) | more than 5 years ago | (#26942367)

So in effect, instead of 'burning out' a specific section of an SSD, they will simply burn out the entire disk at once due to wear leveling? Seems to me they are robbing Peter to pay Paul, and the end result is still the same, albeit with far more fragmentation. If you are then forced to defragment your SSD to get the performance back, you are in effect killing your SSD with all the erase/writes that defragging will cause.

I think I prefer the slow and progressive method rather than waking up some morning to find that it's burned out once wear leveling can't find any good blocks to erase.

That leads to another few questions. What about data stored in a location that was written to for the final time (say, the 10th erase/write cycle in a block, using the analogy above)? The data is still retrievable, right? It simply can't be written to again?

I wonder if the wear leveling algorithms will also store files that are read-only in nature in blocks that are close to failure?

Last but not least, do the OSes turn off the 'last accessed' property that is commonly used across most of them? Seems that would lead to much more rapid failure.

The responses prove... (0)

Anonymous Coward | more than 5 years ago | (#26942207)

that there's way too much effort and overhead for so little gain, and the fatal problem of SSDs' limited lifespan is just too much to overcome.

SSDs are awesome as a simple storage medium for stuff you don't change much, i.e. a replacement for floppies/optical media/etc. They are NOT, however, a replacement for hard drives, and it's sad that people continue to push them in that direction when it is utterly futile and, frankly, stupid to do so.

AC because this is a harsh truth no one wants to admit, and it would therefore be modded down to oblivion by mods who believe it's a troll.

take a look at zfs (0)

Anonymous Coward | more than 5 years ago | (#26942317)

It seems to me that Sun's ZFS filesystem is ready for SSD storage. The copy-on-write strategy would seem to avoid hot spots, as ZFS picks new blocks from the free pool rather than rewriting the same block.


Thinkpad X300 came with defrag tools (2, Insightful)

Britz (170620) | more than 5 years ago | (#26942647)

I purchased an X300 ThinkPad for the company this week and took a close look at it. I thought expensive business notebooks came without crapware, and I was sure the X300 would be optimized. But they had defrag runs scheduled! I always thought defrag was a no-no for SSDs. Now I'm not sure anymore. I uninstalled it first, but who knows?
