
US Supercomputer Uses Flash Storage Drives

timothy posted more than 5 years ago | from the every-subsystem-counts dept.


angry tapir writes "The San Diego Supercomputer Center has built a high-performance computer with solid-state drives, which the center says could help solve science problems faster than systems with traditional hard drives. The flash drives will provide faster data throughput, which should help the supercomputer analyze data an 'order of magnitude faster' than hard drive-based supercomputers, according to Allan Snavely, associate director at SDSC. SDSC intends to use the HPC system — called Dash — to develop new cures for diseases and to understand the development of Earth."


Wow.. (2, Funny)

i_want_you_to_throw_ (559379) | more than 5 years ago | (#29330575)

Imagine a beo...... umm.. nevermind

Re:Wow.. (1)

Larryish (1215510) | more than 5 years ago | (#29330923)

"... intends to use the HPC system -- called Dash -- to develop new diseases for cures...

There, fixed that for ya.

a trade of speed for lifespan (0)

Anonymous Coward | more than 5 years ago | (#29330585)

Sounds like trading lifespan for performance; maybe a fair trade, but by how much?

Re:a trade of speed for lifespan (1)

gravos (912628) | more than 5 years ago | (#29330653)

Well, it's more like trading cost for speed. Many supercomputers are about maximum performance and damn the expense, so it may be a reasonable tradeoff. With proper redundancy, SSDs and hard disks are equally capable of long-term data integrity.

Re:a trade of speed for lifespan (0)

Anonymous Coward | more than 5 years ago | (#29331155)

If they're enterprise-grade SSDs, their lifespan will likely be far in excess of an HDD's, too.

Problems to solve with it: (1)

ctrl-alt-canc (977108) | more than 5 years ago | (#29330597)

1) design SSDs with a longer lifespan [slizone.com]

Re:Problems to solve with it: (2, Informative)

Barny (103770) | more than 5 years ago | (#29330605)

You haven't been following that thread closely enough; you'll note the new Patriot SSDs have a 10-year warranty, but of course a "supercomputer" wouldn't use those.

Other PCIe-based SSDs I have seen around claim up to a 50-year lifespan.

Damage.Inc here btw

Re:Problems to solve with it: (2, Informative)

gabebear (251933) | more than 5 years ago | (#29330657)

The article says it's using Intel SSDs hooked up via SATA, which come with the regular 3-year disk warranty.

Re:Problems to solve with it: (1)

Barny (103770) | more than 5 years ago | (#29330861)

They say they are using Intel SSDs via SATA; Intel's higher-end drives typically have a 2M-hour MTBF, almost double that of most HDDs these days.

http://www.intel.com/design/flash/nand/extreme/index.htm [intel.com]

Re:Problems to solve with it: (1)

Xiterion (809456) | more than 5 years ago | (#29332023)

MTBF isn't the only parameter to care about. I'm highly suspicious of the exceedingly high MTBF numbers given for SSDs. What are they doing, giving you the MTBF with absolutely zero data being written to the flash chips? I mean, honestly, a 228 *year* MTBF should raise some red flags about how relevant the number is.

Re:Problems to solve with it: (1)

Barny (103770) | more than 5 years ago | (#29333751)

FusionIO SSDs are rated for up to 48 years of continuous use at 5TB written/erased per day.

Not sure if Intel rates theirs that high.

Re:Problems to solve with it: (1)

Xiterion (809456) | more than 5 years ago | (#29333941)

Interesting. I guess 43 years with 5 TB/day turnover and an 80 GB drive is consistent with flash rated for 1 million erase/write cycles. The main reason a claimed life of 200+ years makes me wonder is that other electronic components often have a far shorter life, limited by things like bond-wire failures. Also, once a company starts bragging about lifespans that are not only 5 times longer than most people keep the devices but more than twice the lifespan of the potential users, it becomes a spec of marginal usefulness. Also, curse you for giving me a new shiny toy to lust over.
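
A quick back-of-the-envelope check of those figures, as a Python sketch (the 5 TB/day, 43-year and 80 GB numbers are the ones quoted above; the rest is plain arithmetic):

    # Endurance sanity check using the figures from the comments above.
    writes_per_day_tb = 5
    years = 43
    capacity_tb = 0.08                      # 80 GB drive
    total_written_tb = writes_per_day_tb * 365 * years
    cycles = total_written_tb / capacity_tb
    print(f"~{total_written_tb:,} TB written, ~{cycles:,.0f} erase/write cycles per cell")
    # -> roughly 78,475 TB and ~981,000 cycles: consistent with ~1M-cycle SLC flash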

Re:Problems to solve with it: (1)

afidel (530433) | more than 5 years ago | (#29345913)

MTBF has NOTHING to do with a device's lifetime; it describes the average time between failures across devices of that type. If you have 200 of these running, you should expect about 1 to fail per year. Compare that with the ~2% per year failure rate I've experienced in my datacenter over the last 3 years using mostly Seagate enterprise-class drives in a large number of different chassis from a variety of vendors. If you have thousands of these running you might have a failure per day, but that's still about one fourth the failures you would have had with traditional disks.
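
For anyone who wants the arithmetic behind that, a small Python sketch (the 2M-hour MTBF and the 200-drive fleet size come from the comments above; the conversion itself is standard):

    # Convert an MTBF spec into an expected annualized failure rate for a fleet.
    mtbf_hours = 2_000_000
    hours_per_year = 24 * 365.25
    afr = hours_per_year / mtbf_hours          # annualized failure rate per drive
    fleet = 200
    print(f"AFR per drive: {afr:.2%}")
    print(f"Expected failures per year in a {fleet}-drive fleet: {fleet * afr:.1f}")
    # -> ~0.44% per drive, ~0.9 failures/year, i.e. roughly the "1 per year" quoted above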

Re:Problems to solve with it: (2, Informative)

nbert (785663) | more than 5 years ago | (#29330663)

My favorite computer magazine [heise.de] once tested an ordinary USB flash drive and it still worked after 16 million write cycles on the same file. Since they are using Intel SATA SSDs at SDSC, I'm assuming those drives are SLC, which lasts ~10x longer than (cheaper) multi-level cell (MLC) flash.

But even if drives start to fail they'll just replace them like they do with any other supercomputer setup, so it's more a cost factor than a problem.

Re:Problems to solve with it: (1)

gabebear (251933) | more than 5 years ago | (#29330943)

SATA SSDs have wear-leveling implemented in the disk controller. If the drive was mostly empty, then moving the file to a new block every time it was written would be trivial. I don't read German, but unless the disk was first filled to capacity, the test wasn't very useful in determining the reliability of SSDs.
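
A toy illustration of that point (this is not any real controller's algorithm, just a sketch of why a mostly empty drive barely wears under such a test):

    # Rewriting one logical block 16 million times, with and without wear-leveling
    # across the free blocks of a mostly empty drive.
    free_physical_blocks = 1000
    rewrites = 16_000_000                                   # like the heise test above
    erases_without_leveling = rewrites                      # same physical block every time
    erases_with_leveling = rewrites / free_physical_blocks  # spread across free blocks
    print(f"per-block erases: {erases_without_leveling:,} vs ~{erases_with_leveling:,.0f}")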

Re:Problems to solve with it: (0)

Anonymous Coward | more than 5 years ago | (#29330697)

If you're going to build an ultra-fast supercomputer, there's a more pressing problem that you could solve with it: figure out why the average user can't handle a little RTFM.

By the way, your subject should have been "Problems with it to solve" or "Problems to solve for it". But that's okay, I'm sure English is your first/only language.

Re:Problems to solve with it: (0)

Anonymous Coward | more than 5 years ago | (#29330903)

Actually, no, GP meant what he/she said. It could alternatively have been worded: "Problems to solve using it". But that's OK. I'm sure reading comprehension is your first/only skill.

Re:Problems to solve with it: (2, Insightful)

bubbaD (182583) | more than 5 years ago | (#29331389)

"But that's okay, I'm sure English is your first/only language." That seems to be a really lame attempt to insult native English users. There's no grammatical rules against "problems to solve with it." Even "To problem solve with it" is acceptable because the rule against split infinitives is considered obsolete and old fashioned. English has amazing flexibility. It is the perl of human languages!

Re:Problems to solve with it: (1)

Phoghat (1288088) | more than 5 years ago | (#29338509)

Then what is the C++?

Re:Problems to solve with it: (1)

Rockoon (1252108) | more than 5 years ago | (#29330851)

You've been out of the loop for a while, I see.

A 250GB SSD can have over 2.5PB (petabytes) written to it before it cannot be written to anymore.

Re:Problems to solve with it: (1)

maxume (22995) | more than 5 years ago | (#29336207)

10 years from now, there are going to be people with 7 year old flash drives fretting about the fact that they wear out.

BS (1)

gweihir (88907) | more than 5 years ago | (#29330615)

FLASH is about read access time. Throughput can be gotten far cheaper with conventional drives and RAID1.

The rest is the usual nonsense for the press.

Re:BS (1)

Rockoon (1252108) | more than 5 years ago | (#29330901)

You are talking about 'cheaper' with regard to supercomputers? Their supercomputer has 68 of these [appro.com], which are so expensive they won't even give you a price tag unless you call them for a quote.

Re:BS (1)

afidel (530433) | more than 5 years ago | (#29345941)

Uh, no enterprise-class vendor just gives out pricing without working through a salesman or VAR, mostly because they want the lead, but also because they want to make sure the solution being quoted is the correct one for the customer. If you're serious it's not hard to get quotes; we had quotes from a half dozen different vendors for almost a dozen different solutions when we recently purchased a new SAN.

Re:BS (2, Informative)

pankkake (877909) | more than 5 years ago | (#29330913)

FLASH is about read access time. Throughput can be gotten far cheaper with conventional drives and RAID1.

You mean RAID0 [wikipedia.org]. Note that you could do RAID0 with flash drives and have both.

Re:BS (1)

smallfries (601545) | more than 5 years ago | (#29331923)

You mean throughput on sequential reads. What makes you assume that is the type of throughput they are measuring?

Can they find a cure for wear levelling? (1)

Smidge207 (1278042) | more than 5 years ago | (#29330637)

You will note the new Patriot SSDs have a 10-year warranty, but of course a "supercomputer" wouldn't use those. Other PCIe-based SSDs I have seen around claim up to a 50-year lifespan. Damage.Inc here btw

Re:Can they find a cure for wear levelling? (1, Informative)

Barny (103770) | more than 5 years ago | (#29330889)

No, I'm Batman!

Re:Can they find a cure for wear levelling? (1)

Barny (103770) | more than 5 years ago | (#29333783)

Yeah, I am burning karma, but it's there to be burnt.

Why did the OP just wholesale copy one of my posts from earlier in the thread? Including the comment at the end, which states my user name from the forum my original post was referring to.

Re:Can they find a cure for wear levelling? (1)

MartinSchou (1360093) | more than 5 years ago | (#29333793)

No you're not!

Everybody knows that Dr. Sheldon Cooper is Batman!

Re:Can they find a cure for wear levelling? (0)

Anonymous Coward | more than 5 years ago | (#29331249)

Multiple account sock puppets are NOT cool, Barny [slashdot.org]

Cost savings? (5, Insightful)

gabebear (251933) | more than 5 years ago | (#29330641)

"Hard drives are still the most cost-effective way of hanging on to data," Handy said. But for scientific research and financial services, the results are driven by speed, which makes SSDs makes worth the investment.

Why is the supercomputer ever being turned off? Why not just add more RAM?

SSD is cheaper than DDR (~$3/GB vs ~$8/GB), but also ~100 times slower.
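
To put rough numbers on that, a quick sketch using the per-GB prices quoted above (these are the commenter's 2009-era figures, not current ones):

    # Cost of 1 TB of storage at the quoted prices; the "~100 times slower" figure
    # above is the commenter's speed comparison and isn't computed here.
    tb_in_gb = 1024
    ssd_cost = 3 * tb_in_gb    # ~$3/GB
    ram_cost = 8 * tb_in_gb    # ~$8/GB
    print(f"1 TB of SSD: ~${ssd_cost:,}   1 TB of DRAM: ~${ram_cost:,}")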

Exactly. Just use RAM. (1)

EWAdams (953502) | more than 5 years ago | (#29330689)

RAM + uninterruptible power supply, and you're done. The only thing you need storage for is loading apps and data to begin with.

Re:Exactly. Just use RAM. (0)

Anonymous Coward | more than 5 years ago | (#29331753)

And you can do that either with a netboot server or just have a local disk. But the best performance will come from staying in RAM and reading in / writing out to a parallel file system.

What are they doing that they need that much local storage anyway? If the data sets they are working on are that big, writing out interim results and reading them back in is going to really hurt. Even though the file copies can be run in the background, moving that kind of data around on the interconnect is going to hurt running processes. Or they have to add another interface to move the data over a separate path.

Maybe they are using their compute nodes as their parallel file system as well, but if that's true, they either don't have many users or the data sets aren't that big. And if that's what they are doing, they tie up their compute cluster as users store and retrieve data sets from longer-term storage (e.g. tape).

I'm sure someone ran the numbers but I thought the trend in cluster computing was towards diskless nodes with lots of memory.

Re:Exactly. Just use RAM. (2, Interesting)

dkf (304284) | more than 5 years ago | (#29333411)

If the data sets they are working on are that big, writing out interim results and reading them back in is going to really hurt.

They're a supercomputing centre, so yes, the data sets are that big. And the users like taking copies of them and moving them around; there are even reasons for doing this that aren't linked to recovering from a crash (such as being able to rerun a simulation from part way through, rather than having to wade through the whole lot from the beginning).

Re:Cost savings? (2, Interesting)

maxume (22995) | more than 5 years ago | (#29330703)

It could be a technical issue (i.e., they are targeting simplicity). Hooking up 1 TB of SSDs involves 4 SATA cables; hooking up an additional terabyte of RAM involves finding special widgets that hold as much RAM as possible, plus the parts to make them talk to the nodes.

Re:Cost savings? (0)

Anonymous Coward | more than 5 years ago | (#29330819)

but the ram can be read at a much faster rate

Re:Cost savings? (1)

gabebear (251933) | more than 5 years ago | (#29330891)

True, I'm sure they have a valid technical reason, but the article completely fails to point out what that would be.

If the "special I/O nodes" are connected using a 10Gb network, then the 4x3Gb SATA drives would let them saturate the network bandwidth.

Re:Cost savings? (0)

Anonymous Coward | more than 5 years ago | (#29331051)

TFA says there's 768GB RAM in total, which is almost as big as the 1TB Flash array.

Re:Cost savings? (1)

maxume (22995) | more than 5 years ago | (#29336143)

That's 768 GB of RAM across all of the nodes whereas the SSDs are 1 TB per node.

Re:Cost savings? (1)

Eugene (6671) | more than 5 years ago | (#29331559)

Space, heating and electricity will all be factors in building pure RAM drives.

Re:Cost savings? (4, Informative)

MartinSchou (1360093) | more than 5 years ago | (#29331133)

Space requirements.

The biggest DDR3 SO-DIMM modules I could find were 4 GB. They are 30 mm x 66.7 mm [valueram.com], and the standard allows for the following [jedec.org]:

The DDR3 SO-DIMM is designed for a variety of maximum component widths and maximum lengths; refer to the applicable raw card for the exact component size allowed. Components used in DDR3 SO-DIMMs are also limited to a maximum height (as shown in dimension "A" of MO-207) of 1.35 mm. [page 19]

You now have an absolute minimum size of 2,701.35 mm^3 (1.35 mm x 30 mm x 66.7 mm), or 675.3375 mm^3/GB. This is a very very idealized minimum by the way.

An Intel 2½" drive is 49,266.28 mm^3 (100.4 mm x 7 mm x 70.1 mm) [intel.com] and currently maxes out at 160 GB leaving you with 307.91425 mm^3/GB. That's 46% of the space that would be needed for DDR3 RAM. Add to that that Intel's 2nd generation SSDs are only using one side of the PCB, and you can expect the storage space requirements to be halved.

Then there's the fact that the SSDs are directly replaceable. In other words, they don't need to rebuild the computer, buy super special boards or anything like that - you can replace a harddrive with an SSD without having to spec out a new supercomputer.

If you wanted to replace the system with something that could provide 1 TB of RAM per node, you would need a VERY expensive system. Even with 8 GB modules, you would need to somehow fit 128 of them onto a board. I'd really love to see the mother- and daughter-boards involved in that.

In the end, it doesn't just come down to the raw price or speed of the storage device (RAM vs SSD vs HDD vs tape), but also to all the other factors involved, such as space, power, heat and the infrastructure you need to use it (i.e. a brand-new supercomputer that can support 1 TB RAM/node vs the 48 GB it supports at the moment).

Or to use a really bad car analogy: some company has found out that using a BMW M5 Touring Estate [bmw.co.uk] gives them faster deliveries than using a Ford Transit. Now you're suggesting that they should be delivering stuff via aeroplanes. Yes, it's much faster, but you need a brand new transport infrastructure built up around it, which you also need to factor into your cost assessments.

Re:Cost savings? (1)

gabebear (251933) | more than 5 years ago | (#29331417)

I really doubt it's the space requirements. The cooling system is likely going to be several times the size of the computer.

You have a couple facts wrong.
  • They only have 4 nodes with these drives
  • Each node has 16 DDR2 slots that hold 4GB sticks

They aren't maxing out the RAM slots on each node and they seem to be relying on these IO nodes to increase performance. I'd like to know how/why and this article doesn't explain anything.

DRAM/DDR drives aren't anything new; hooking up 1TB of DDR would be expensive, but so is 1TB of X25-E drives
http://techreport.com/articles.x/16255/12 [techreport.com]
http://www.ddrdrive.com/ [ddrdrive.com]

Re:Cost savings? (1)

BikeHelmet (1437881) | more than 5 years ago | (#29335309)

Long-running simulations can run completely awry if one of the DIMMs dies part-way in.

Being able to record snapshots for later reuse or verification helps ensure the correctness of the simulation.

Re:Cost savings? (1)

gabebear (251933) | more than 5 years ago | (#29337267)

Long-running simulations can run completely awry if one of the DIMMs dies part-way in.

Being able to record snapshots for later reuse or verification helps ensure the correctness of the simulation.

Sure, but mechanical disks make more sense for storing snapshots. They have 768GB of RAM and 4TB of SSDs in their cluster.

Re:Cost savings? (1)

BikeHelmet (1437881) | more than 5 years ago | (#29337327)

Sure, but mechanical disks make more sense for storing snapshots. They have 768GB of RAM and 4TB of SSDs in their cluster.

Perhaps. But 768GB would take a really long time to write to disk. Maybe they don't want to lose all that time? The fastest SSD setups I've seen have multi-GB/sec throughput.

If you can read and write 2GB per second, you can use flash as a sort of "slow RAM" - although I'm not saying they're doing that in this case.
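
Putting numbers on that (simple arithmetic; the ~100 MB/s single-HDD rate is an illustrative assumption, the 2 GB/s figure is from the comment above):

    # Time to dump 768 GB at various sustained write rates.
    data_gb = 768
    for label, rate_gb_s in [("single HDD (~100 MB/s)", 0.1),
                             ("fast SSD array (~2 GB/s)", 2.0)]:
        print(f"{label}: {data_gb / rate_gb_s / 60:.1f} minutes")
    # -> roughly 128 minutes vs about 6.4 minutes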

Re:Cost savings? (0)

Anonymous Coward | more than 5 years ago | (#29339303)

The SATA SSDs they are using only transfer data about twice as fast as a mechanical disk. Making the RAID array twice as big with regular disks would give you equal bandwidth.

SSDs are great because of their low latency... which doesn't matter when archiving snapshots.

Re:Cost savings? (1)

BikeHelmet (1437881) | more than 5 years ago | (#29343241)

The SATA SSDs they are using only transfer data about twice as fast as a mechanical disk.

True. Too bad they aren't using ioDrives.

Re:Cost savings? (1)

afidel (530433) | more than 5 years ago | (#29345991)

If they're doing 48GB/node, they had better have specced that machine more than a year ago, because everyone has known since this time last year that Nehalem is the best density solution and has the best $/MIPS out there, and almost all of the Nehalem boards support 72GB/node using cheap 4GB DIMMs. It really has been stupid to use anything else, except possibly Cell, since the beginning of this year.

Depends on how/when/where/what you use SSD's for (-1, Offtopic)

Anonymous Coward | more than 5 years ago | (#29332071)

" Why not just add more RAM? SSD is cheaper than DDR ( ~$3/GB vs ~$8/GB ), but also ~100 times slower. - by gabebear (251933) on Sunday September 06, @08:39AM (#29330641) Homepage

System RAM is SHARED RAM, first of all - more than 1 thing is "going on" in it, @ ALL times (this is not the case w/ using SSD's for specialized tasks (& they tend to EXCEL in webserver or DB server environs & tasks. Proof thereof is here -> http://techreport.com/articles.x/17183/8 [techreport.com] for ALL KINDS of "Back Office/Server Class" type tasks - this is, by this point, a WIDELY recognized industry fact though...))

Personally, for more "end-user" type tasks here @ home? Well - I use SSD's here, "true" ones, meaning NOT based on FLASH RAM (with its slower write cycles & inferior longevity).

----

1.) A CENATEK RocketDrive (2gb PC-133 SDRAM, PCI 2.2 133mb/sec. bus transfer rates)

2.) A GIGABYTE IRAM (4gb DDR-400 RAM, SATA 1 150mb/sec. bus transfer rates)

----

I use TRUE SSD's in this manner here for performance gains:

----

1.) Pagefile.sys placement (all alone by itself on the CENATEK RocketDrive on a 2gb NTFS partition, uncompressed, so it is a "dedicated task" there & that one only).

2.) WebBrowser Program Caches (all of them in IE, FireFox, & Opera) - &, on an NTFS compressed partition, so the files are even TINIER & pickup that much faster into memory (small offset due to decompression of data into memory, but, today's CPU's & RAM speeds make up for that - on GIGABYTE IRAM)

3.) OS and application logs (like eventlogs & far more from apps + the OS also - on GIGABYTE IRAM) - again, on an NTFS compressed partition, for the same reasons as above.

4.) %Temp% &/or %tmp% environment alteration (so app & OS 'temp ops' take place in a higher speed environs & off the main disk too - on GIGABYTE IRAM)

5.) %Comspec% placement (cmd.exe on Windows NT-based OS' - on GIGABYTE IRAM)

6.) PRINT SPOOLER location (on GIGABYTE IRAM)

----

So, that all "said & aside"? What kind of performance gains do I see, & how do they work? Ok:

----

A.) Faster seek/access to said files, especially since they're small & OF BOTH "READ/WRITE NATURE" (which normal RAM types FLY on, vs. FLASH, & no "writeback caching" required really).

B.) A lot less "read/write head movement contention" on my main OS + Programs bearing HDD's, simply by moving said files + activities from my main HDD's

C.) No fragmentation of my main OS + Programs bearing diskdrive from said activities &/or files I moved from my main OS + Program bearing HDD's

----

That's also "borne out" by tests OTHERS RAN, an example of the likes of which, is here -> http://hothardware.com/Articles/Gigabyte-IRAM-Storage-Device1/?page=3 [hothardware.com] OR here -> http://techreport.com/articles.x/17183/5 [techreport.com] as well as many other technically oriented websites online.

(The gains seen? Hey, they only make complete sense, as the types of RamDrives/RamDisks/SSD's I use here are (respectively as listed above) based on PC-133 SDRAM &/or DDR-400 here, because they're F A S T E R by far & do not need "writeback caching" to offset write performance hits FLASH RAM has)... FLASH based units are fine for reads, but not so fine for writes (though writeback caches CAN offset this some).

Plus - as you can see above? Well - I do a great deal of tasks that need BOTH read and WRITE speeds here on SSD's, & I go into them above in my 2nd list (&, they work - you can try them yourself IF you have an SSD of the type I use especially (not FLASH RAM based)).

Nor do they require measures (that have overheads mind you) like "garbage cleanup" &/or "wear-levelling" engines to function properly, or to extend their lives (the CENATEK? I've been using it everyday since 2002 with no problems @ all, as an example thereof - let's see a FLASH RAM based SSD last THAT long!)

E.G.-> Once my power went out, before I had a UPS, & the pagefile.sys reverted to the C:\ drive root (typical default), & I was like "My system seems a LOT slower, wtf?" & then I noted that my drive was gone & I had to redo it via diskmgmt.msc & set the pagefile.sys, & other ops there, & reboot. Once I did?? All was "up to speed" again. You can SERIOUSLY tell when this happens (almost made me think I "sucked in a spyware/malware" almost, & yes, the diff. in performance/snappiness of the system was IMMEDIATELY APPARENT & QUITE NOTICEABLE, as to a difference).

It just works, & for ALL of the above... Those are just some ideas/"food 4 thought" on this note, & I invite others to "extend" on them, or, offer diff. ideas here in this exchange also.

HOWEVER - PERSONALLY? WELL - I think we're being sold "FLASH" ram based SSD's, FIRST, for what many folks have noted - in 'planned product obsolescence'... & the future of these units?

WELL - I think that 64-bit capability will LASTLY "usher in" drives like I use here: "REAL RAM" on them (DDR etc.) instead of FLASH - so you have longer life (the CENATEK I am using has been going strong on read/write tasks since 2002, no hassles, for example, but has a 4gb per board size limit (16gb when 4 boards are striped/spanned though) &, for LARGER SIZED "True SSD's" as I call them (not based on FLASH RAM).

Well, that is once drivers for the kinds of SSD I use can go over 4gb memory addressability?

Then, & ONLY THEN, will we be using these units to their FULL capability, AND performance potentials... w/out the need for "wear levelling" or "garbage cleanup" processing occurring, nor the limited lifespan of FLASH units, vs. ones that use "TRUE RAM", as the ones I do, as examples today (but, which are limited to 4gb per board memory onboard due to 32-bit driver nature on them), & with LARGER SIZES than the ones I have (again, 4gb per board, 16gb spanned/striped).

(SO - Call it a "hunch"... as to what I think the future is on these units, & it's NOT "FLASH" RAM based, not when high-end performance is concerned, and when 64-bit becomes more "mainstream"...)

APK

P.S.=> Been "into this stuff" since the days of the software based RamDrive (circa 1991-2001, when I wrote one myself (APK Ramdisk - don't install this on anything but Windows 2000/XP & do so on an UNPATCHED build of the OS first, so it picks up under service pack &/or hotfixes later as a LEGACY DRIVER though) & later did work for EEC Systems/SuperSpeed.com & their ramdisk softwares which the work was used to place them as a FINALIST @ Microsoft Tech-Ed 2001-2002, 2 yrs. in a row, in the hardest category there - SQLServer Performance Enhancement (which it works great for, as the results in the URL I first post above clearly shows in how SSD's &/or RAMDrives can clearly enhance both DB server performance AND Webserver performance as well).

Later, I got "into" Solid-State Boards (2002, via the CENATEK RocketDrive (2gb PC-133 SDRAM, PCI 2.2 133mb/sec. bus transfer rates) & more currently using a faster one called the Gigabyte "IRAM" (4gb DDR-400 RAM, SATA 1 150mb/sec. bus transfer rates) & they work for things like the above + how I use them, for better overall system performance (better than HDD's do, due to less latency & tremendous access/seek speeds, especially w/ smaller files) - &, yes, I can "keep state" between boots (CENATEK's unit has a backing power supply, & the GIGABYTE unit has a Lithium Ion battery for that), & can even boot up from the IRAM, but I choose not to (WD Velociraptor, 2 WD "Raptor X's", a 74gb Raptor, + 36gb Raptor as my 10,000 rpm disks I use here, they're all quite quick is why & larger (plus, I had a system that ran & has continued to do so, no problems, since 2006 - no sense in "messing with a watch that runs", & redoing that setup so I just opted to use my SSD's as offloading units @ home, to ease the burden on my HDD's & it works) is all... apk

Off topic? (0)

Anonymous Coward | more than 5 years ago | (#29361275)

The post was about ramdisks. The ac who signs off as apk replied about ramdisks. He did so with a quote from the original poster and he replied in direct response to that poster's question, and with some good ideas. I'd like to know how he was considered off topic. Somebody's a tad trigger happy with the down mods I'd say.

Re:Cost savings? (3, Insightful)

SpaFF (18764) | more than 5 years ago | (#29332179)

There are plenty of reasons why supercomputers have to be shut down... besides the fact that even with generators and UPSes, facility outages are still a fact of life. What if there is a kernel vulnerability (insert 50 million ksplice replies here... yeah yeah yeah)? What if firmware needs to be updated to fix a problem? You can't just depend on RAM for storage. HPC jobs use input files that are tens of gigabytes and produce output files that can be multiple terabytes. The jobs can run for weeks at a time. In some cases it takes longer to transfer the data to another machine than it takes to generate/process the data. You can't just assume that the machine will stay up to protect that data.

Where and how is it used? (2, Interesting)

joib (70841) | more than 5 years ago | (#29330643)

TFA isn't particularly detailed, beyond saying SSD's are used on "4 special I/O nodes".

One obvious thing would be to use SSD's for the Lustre MDS while using SATA as usual for the OSS's. That could potentially help with the "why does ls -l take minutes" issue familiar to Lustre users on heavily loaded systems, while not noticeably increasing the cost of the storage system as a whole.

I won't be impressed (1)

judolphin (1158895) | more than 5 years ago | (#29330843)

Until supercomputers use SD cards.

Re:I won't be impressed (1)

drinkypoo (153816) | more than 5 years ago | (#29331029)

With SDXC going up to 104MB/sec and 2 TB, it's only a matter of time.

Re:I won't be impressed (1)

BikeHelmet (1437881) | more than 5 years ago | (#29335323)

"up to"

Hehe... ;D

I'll take some of that 5.0gbit USB 3.0 as well, please.

Re:I won't be impressed (1)

drinkypoo (153816) | more than 5 years ago | (#29378907)

I've gotten quite good speeds on SD cards. The usual problem is a lackluster USB interface, but they don't all connect via USB. There's nothing wrong with SD that eliminating the USB connection won't solve.

Hardware guys.... (1)

drdrgivemethenews (1525877) | more than 5 years ago | (#29333001)

I've lost track of how many times hardware dudes have jammed a bunch of the newest fastest hardware into a box to achieve "100x" the "performance" of prior systems. Without a sliver of irony, or the slightest effort to analyze how software will use all this new hardware. Or what the serviceability of the new machine will be. Or any of the hundred other things that will combine to turn their "100x" into "1.25x".

--------

Boot time is O(1).

Disk speed should be irrelevant! (1)

Terje Mathisen (128806) | more than 5 years ago | (#29333043)

I think it was Amdahl who said that a "supercomputer is a machine which is fast enough to turn cpu-bound problems into io-bound problems", which means that disk speed could become a limiting factor.

I have trouble seeing how having SSD arrays can make a big difference though!

All current supercomputers have enough RAM to handle the entire problem set, simply because _all_ disks, including SSDs, are far slower than RAM.

A supercomputer, like those which are used by oil companies to do seismic processing, does need fast disk, but only in order to load the input data, and this is an almost totally sequential process.

Regular disk arrays are just as fast as SSD arrays for sequential IO, so unless they have found a supercomputer problem which requires significant amounts of random access disk IO, having SSDs available should only provide marginal speedups.

Terje

Re:Disk speed should be irrelevant! (1)

pehrs (690959) | more than 5 years ago | (#29333653)

I can only agree, but note that they are talking about a very small HPC system (5.2 teraflops) and claim that they can significantly speed up data searches. There are certainly a few scenarios where you need to quickly and frequently search through a sparse, permanent dataset that is an order of magnitude too large for your RAM and can benefit from the lower latency of SSD storage.

However, for general-purpose HPC systems the SSD is still a disk, and therefore way too slow for anything involved in the computation itself. The extra ~30x factor in the cost of storage will be very hard to justify.

Re:Disk speed should be irrelevant! (1)

kramulous (977841) | more than 5 years ago | (#29334175)

The article specifically talks about document searching. Some of the document-searching jobs people run here require >110 terabytes of memory (per job). On a cluster that is pretty difficult to keep in RAM.

A lot of supercomputers are clusters and those clusters typically don't have huge amounts of memory ... they are high on compute.

Flash wear-and-tear (1)

planetoid (719535) | more than 5 years ago | (#29333285)

So I gather flash storage technology is a lot less prone to "write failures after 'x' amount of write operations" than it was 5 or 10 years ago?

This is one of the reasons I don't trust solid state drives. Sure, I've had my fair share of crashes for traditional platter drives in my life, but if you have a program that writes thousands of times to the media every hour... what's the lifespan going to be on that flash storage?

I haven't been paying attention to tech news -- maybe some clever inventor improved flash to not have this problem anymore and nobody told me?

Re:Flash wear-and-tear (1)

rrohbeck (944847) | more than 5 years ago | (#29333437)

That's what you have SMART for. Just run smartd [sourceforge.net] or add smartctl [sourceforge.net] to your own scripts. Intel SSDs report the wear parameter in SMART attribute 233.
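
A minimal sketch of how you might poll that from a monitoring script (it assumes smartmontools is installed, sufficient privileges, and a drive that reports attribute 233 as its wear indicator, as Intel's do; the device path is just an example):

    import subprocess

    def media_wearout(device="/dev/sda"):
        # Parse `smartctl -A` output and return the normalized value of attribute 233.
        out = subprocess.run(["smartctl", "-A", device],
                             capture_output=True, text=True, check=True).stdout
        for line in out.splitlines():
            fields = line.split()
            if fields and fields[0] == "233":
                return int(fields[3])   # normalized VALUE column, typically counts down from 100
        return None

    print(media_wearout())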

SSDs and databases (2, Interesting)

Richard_J_N (631241) | more than 5 years ago | (#29334123)

I've just gone through the process of setting up a pair of servers (HP DL380s) for Linux/Postgres. Our measurements show that the Intel X25-E SSDs beat regular 10k rpm SAS drives by a factor of about 12 for fdatasync() speed. This is important for a database system, as a transaction cannot COMMIT until the data has really, really hit permanent storage. [It's unsafe to use the regular disk's write cache, and personally, I don't trust a battery-backed write cache on the RAID controller much either.] So not having to wait for a mechanical seek is really useful. Read speeds are also better (10x less latency), and the sustained throughput is about 2x as good.
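
A minimal sketch of the kind of fdatasync() timing comparison described above (the file path is a placeholder; run it once against a file on the SSD and once against one on the SAS array):

    import os, time

    def avg_commit_latency(path, n=200, record=b"x" * 512):
        # Append a small record and force it to stable storage each time,
        # the way a database COMMIT must.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
        try:
            start = time.perf_counter()
            for _ in range(n):
                os.write(fd, record)
                os.fdatasync(fd)
            return (time.perf_counter() - start) / n
        finally:
            os.close(fd)

    print(f"avg commit latency: {avg_commit_latency('/path/on/device/commit.log') * 1000:.2f} ms")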

So, yes, SSDs are a good idea for database loads, where the interaction is with the real world, and where once a transaction has completed, some other real-world process has happened. BUT, most supercomputer workloads are, in principle, re-startable (i.e. if you lose an hour's work due to a hardware failure, you can just re-run the simulation code, and throw away the intermediate state).

So, for simulations, the cost of dataloss is an hour of re-work, not irretrievable information. Given that, we can get much better performance by storing everything in RAM, enabling all the write-caches, and sticking with standard SATA, provided that, every so often, the data is flushed out to disk. If something goes wrong, just revert to the last savepoint, which could be an hour ago, rather than having to be 10ms ago.

[BTW, HP "don't support" SSDs in their servers, but the Intel SSD X25-E disks do work just fine. Though I did, unfortunately, have to buy some of HP's cheapest SAS drives ($250 each) just to obtain the mounting kits for the SSDs.]

SSDs & databases (good) & ENDUSER stuff to (0)

Anonymous Coward | more than 5 years ago | (#29335017)

"This is important for a database system, as a transaction cannot COMMIT until the data has really, really hit permanent storage. [It's unsafe to use the regular disk's write cache, and personally, I don't trust a battery-backed write cache on the RAID controller much either. So not having to wait for a mechanical seek is really useful. Read speeds are also better (10x less latency), and the sustained throughput is about 2x as good. So, yes, SSDs are a good idea for database loads, where the interaction is with the real world, and where once a transaction has completed, some other real-world process has happened." - by Richard_J_N (631241) on Sunday September 06, @04:38PM (#29334123)

Exactly! I noted that back in 1996, for EEC Systems (now SuperSpeed.com) - which, in turn, helped lead to a GOOD review in "Windows NT Magazine" (now Windows IT Pro) in the April 1997 Issue "Back Office Performance" pg. #61 topic (cover story), & for their ramdisk softwares (SuperDisk - whilst I improved their diskdriver block device driver diskcache, SuperCache I/II, by up to 40% more on paid contract to they)...

They took the same idea you expound on now, when I noted it back then, & that idea worked to place EEC Systems/SuperSpeed.com as a FINALIST @ Microsoft Tech-Ed 2001-2002, 2 yrs. in a row, in the hardest category there - SQLServer Performance Enhancement (which it works great for, as the results in the URL I first post above clearly shows in how SSD's &/or RAMDrives can clearly enhance both DB server performance AND Webserver performance as well).

IT JUST WORKS!

Others have noted it as well, but NOT ONLY FOR DB performance gains - also for WEBSERVERS, FILESERVERS, & MORE... an example thereof being here -> http://techreport.com/articles.x/17183/8 [techreport.com]

HOWEVER, I'd like to note some "creative uses" of these units (same ideas I put out for CENATEK, which for years was featured on their main page as "An Independent users review" of their SSD product, of which I am a proud & happy owner of no less) Albeit, this time, for more "home/end user" type application:

System RAM is SHARED RAM, first of all - more than 1 thing is "going on" in it, @ ALL times (this is not the case w/ using SSD's for specialized tasks (& they tend to EXCEL in webserver or DB server environs & tasks. Proof of that much is from the responder I replied to, and, from techreport above (see that url))).

Personally, for more "end-user" type tasks here @ home? Well - I use SSD's here, "true" ones, meaning NOT based on FLASH RAM (with its slower write cycles & inferior longevity).

----

1.) A CENATEK RocketDrive (2gb PC-133 SDRAM, PCI 2.2 133mb/sec. bus transfer rates)

2.) A GIGABYTE IRAM (4gb DDR-400 RAM, SATA 1 150mb/sec. bus transfer rates)

----

I use TRUE SSD's in this manner here for performance gains:

----

1.) Pagefile.sys placement (all alone by itself on the CENATEK RocketDrive on a 2gb NTFS partition, uncompressed, so it is a "dedicated task" there & that one only).

2.) WebBrowser Program Caches (all of them in IE, FireFox, & Opera) - &, on an NTFS compressed partition, so the files are even TINIER & pickup that much faster into memory (small offset due to decompression of data into memory, but, today's CPU's & RAM speeds make up for that - on GIGABYTE IRAM)

3.) OS and application logs (like eventlogs & far more from apps + the OS also - on GIGABYTE IRAM) - again, on an NTFS compressed partition, for the same reasons as above.

4.) %Temp% &/or %tmp% environment alteration (so app & OS 'temp ops' take place in a higher speed environs & off the main disk too - on GIGABYTE IRAM)

5.) %Comspec% placement (cmd.exe on Windows NT-based OS' - on GIGABYTE IRAM)

6.) PRINT SPOOLER location (on GIGABYTE IRAM)

----

So, that all "said & aside"? What kind of performance gains do I see, & how do they work? Ok:

----

A.) Faster seek/access to said files, especially since they're small & OF BOTH "READ/WRITE NATURE" (which normal RAM types FLY on, vs. FLASH, & no "writeback caching" required really).

B.) A lot less "read/write head movement contention" on my main OS + Programs bearing HDD's, simply by moving said files + activities from my main HDD's

C.) LESS fragmentation of my main OS + Programs bearing diskdrive (which I call "the long-term performance gain" here) from said activities &/or files I moved from my main OS + Program bearing HDD's

----

Those same types of results, good ones, were also "borne out" by tests OTHERS RAN, an example of the likes of which, is here -> http://hothardware.com/Articles/Gigabyte-IRAM-Storage-Device1/?page=3 [hothardware.com] OR here -> http://techreport.com/articles.x/17183/5 [techreport.com] as well as many other technically oriented websites online.

(The gains seen? Hey, they only make complete sense, as the types of RamDrives/RamDisks/SSD's I use here are (respectively as listed above) based on PC-133 SDRAM &/or DDR-400 here, because they're F A S T E R by far & do not need "writeback caching" to offset write performance hits FLASH RAM has)... FLASH based units are fine for reads, but not so fine for writes (though writeback caches CAN offset this some).

Plus - as you can see above? Well - I do a great deal of tasks that need BOTH read and WRITE speeds here on SSD's, & I go into them above in my 2nd list (&, they work - you can try them yourself IF you have an SSD of the type I use especially (not FLASH RAM based)).

Nor do they require measures (that have overheads mind you) like "garbage cleanup" &/or "wear-levelling" engines to function properly, or to extend their lives (the CENATEK? I've been using it everyday since 2002 with no problems @ all, as an example thereof - let's see a FLASH RAM based SSD last THAT long!)

E.G.-> Once my power went out, before I had a UPS, & the pagefile.sys reverted to the C:\ drive root (typical default), & I was like "My system seems a LOT slower, wtf?" & then I noted that my drive was gone & I had to redo it via diskmgmt.msc & set the pagefile.sys, & other ops there, & reboot. Once I did?? All was "up to speed" again. You can SERIOUSLY tell when this happens (almost made me think I "sucked in a spyware/malware" almost, & yes, the diff. in performance/snappiness of the system was IMMEDIATELY APPARENT & QUITE NOTICEABLE, as to a difference).

It just works, & for ALL of the above... Those are just some ideas/"food 4 thought" on this note, & I invite others to "extend" on them, or, offer diff. ideas here in this exchange also.

HOWEVER - PERSONALLY? WELL - I think we're being sold "FLASH" ram based SSD's, FIRST, for what many folks have noted - in 'planned product obsolescence'... & the future of these units?

WELL - I think that 64-bit capability will LASTLY "usher in" drives like I use here: "REAL RAM" on them (DDR etc.) instead of FLASH - so you have longer life (the CENATEK I am using has been going strong on read/write tasks since 2002, no hassles, for example, but has a 4gb per board size limit (16gb when 4 boards are striped/spanned though) &, for LARGER SIZED "True SSD's" as I call them (not based on FLASH RAM).

Well, that is once drivers for the kinds of SSD I use can go over 4gb memory addressability?

Then, & ONLY THEN, will we be using these units to their FULL capability, AND performance potentials... w/out the need for "wear levelling" or "garbage cleanup" processing occurring, nor the limited lifespan of FLASH units, vs. ones that use "TRUE RAM", as the ones I do, as examples today (but, which are limited to 4gb per board memory onboard due to 32-bit driver nature on them), & with LARGER SIZES than the ones I have (again, 4gb per board, 16gb spanned/striped).

(SO - Call it a "hunch"... as to what I think the future is on these units, & it's NOT "FLASH" RAM based, not when high-end performance is concerned, and when 64-bit becomes more "mainstream"...)

APK

P.S.=> Been "into this stuff" since the days of the software based RamDrive (circa 1991-2001, when I wrote one myself (APK Ramdisk - don't install this on anything but Windows 2000/XP & do so on an UNPATCHED build of the OS first, so it picks up under service pack &/or hotfixes later as a LEGACY DRIVER though) & later did work for EEC Systems/SuperSpeed.com (see notes above on that for details @ the start of this post reply here).

Later, I got "into" Solid-State Boards (2002, via the CENATEK RocketDrive (2gb PC-133 SDRAM, PCI 2.2 133mb/sec. bus transfer rates) & more currently using a faster one called the Gigabyte "IRAM" (4gb DDR-400 RAM, SATA 1 150mb/sec. bus transfer rates)... glad I did too! Software vs. Hardware?? Software is NICE, but hardware is usually ALWAYS better (9/10 times, & especially for performance' sake)...

Yes, & they work for things like the above + how I use them, for better overall system performance!

(MUCH better than HDD's do, due to less latency & tremendous access/seek speeds, especially w/ smaller files)

AND? Hey - &, yes, I can "keep state" between boots (CENATEK's unit has a backing power supply, & the GIGABYTE unit has a Lithium Ion battery for that), & can even boot up from the IRAM, but I choose not to (WD Velociraptor, 2 WD "Raptor X's", a 74gb Raptor, + 36gb Raptor as my 10,000 rpm disks I use here, they're all quite quick is why & larger (plus, I had a system that ran & has continued to do so, no problems, since 2006 - no sense in "messing with a watch that runs", & redoing that setup so I just opted to use my SSD's as offloading units @ home, to ease the burden on my HDD's & it works) is all... apk

Re:SSDs and databases (1)

StuartHankins (1020819) | more than 5 years ago | (#29335755)

We moved from the battery-backed 5i controller on the DL380's (7x 15K drives in an MSA30) to the BL460c's and the EVA4400 (using basically the same 15K drives, but with 16 of them -- Exchange, a .8TB SQL server and a few fileservers also access this space).

The disk speed increase was enormous -- it really blew us away. What used to take between 3 and 4 hours can be done in about 8 minutes now.

IMHO using physical drives is much safer than using SSDs, and to scale up all we do is add additional shelves & drives. If you have more than a handful of servers, get a SAN. It's easier management than having to play with each individual machine; I don't have the time or patience for that.

Re:SSDs and databases (1)

Richard_J_N (631241) | more than 5 years ago | (#29336141)

Could you be more specific about what actually gave the improvement? Was it just something simple, e.g. RAID 5 -> RAID 6?

My main point though was that for supercomputer simulations (but not email or warehouse management), it's OK to risk data, and then just re-start the simulation from an hour ago. So why not just enable the disk write-caches or put the database on a RAMdisk? Without the safety requirements (such as no write-cache), the benefits of SSDs aren't needed.

BTW, I am very happy with the Intel SSDs for reliability - though just to be safe, we're using a RAID 1 array of two of them per server, and a RAIS array of two servers for the system (DRBD keeps all the RAID pairs in sync between primary and hot-spare server).

Re:SSDs and databases (1)

StuartHankins (1020819) | more than 5 years ago | (#29336633)

The management overhead using local vs consolidated storage is significant. We've been able to reduce worries and speed disk access with the SAN. That was a win for the techs and management.

There's not a direct comparison with RAID on an individual server and RAID on the EVA4400. Yes it's still a RAID5 (safety is most important to us) but the leveling aspect of the SAN provides additional performance. If I want to increase performance I add drives to the SAN and give them to that slice I've allocated for this server. No rebuilding necessary. When you're talking TB of storage that's sweet. Look into it if you've got the budget -- around 60k gets you 16 x 146GB drives and another 8 x 400GB drives for your second tier storage.

Re:SSDs and databases (1)

StuartHankins (1020819) | more than 5 years ago | (#29336703)

I know -- bad form to reply to myself etc etc. but it occurred to me that you may not be familiar with how this SAN applies storage.

Your RAID 5 partition allocated to this server (think of it like a slice of the whole pie) is smaller than the total SAN storage. In a "normal" single-server storage environment you probably allocate all space among the local drives. Each hardware RAID partition usually goes on separate drives, so that if you have 7 drives and need 2 partitions one of which is RAID-5, at least 3 of the drives must be used to create the RAID-5. In this SAN, it stripes across *all* drives even though it obviously doesn't take up the whole drive of each. Your other slices do the same thing. A RAID-1 partition is striped on the same drives which are striped as RAID-5, similar to how Linux software RAID works (without the performance penalties). So your data has more spindles which increase as you add drives, also each spindle has less data to deal with for your slice (unless you grow your slice).

You group drives together -- primary storage with your fast drives and secondary with your big slow drives for instance. If all the SAN storage for that group isn't allocated (for instance we have like 2.5 TB out of 3.5TB allocated) and a drive fails, the total SAN storage for that group is reduced by the amount of whatever drive(s) failed yet you still have a spare -- until there isn't any more total unused storage for it to sacrifice. Like a "super" RAID5.

When a drive dies, HP gets a message from the system, and we get an email that a drive died (which drive, capacity, model etc). It ships out from HP that same day and we go down to the colo and replace it. (That's a colo issue -- outside techs have to be escorted and replacing a drive is simple so we just drive there and don't request an HP tech). So management-wise it's a snap.

I haven't yet tested SAN-to-SAN replication (we're in this recession or something, funding is tough to get) but that's a whole other level of benefits.

Re:SSDs and databases (1)

Richard_J_N (631241) | more than 5 years ago | (#29355387)

Thanks for your informative posts. Sadly we can't afford a SAN anyway, though it might be a nice idea in future. What I don't understand is, how can a SAN improve the time for fdatasync() - i.e. for the data to be flushed to physical disk, and then control to return to the application? This is essential for database stuff.

As to my disinclination to trust battery backed cache - if the power goes out, it means we have about 4 hours to get it back. If that also fails, we have dataloss.

Re:SSDs and databases (1)

afidel (530433) | more than 5 years ago | (#29346067)

There are 3rd parties that sell the HP disk sleds for a LOT less than $250 apiece! Oh, and I'd like to find out why you don't trust BBWC (battery-backed write cache); tons of enterprises rely on it every day, and I don't hear tons of horror stories about problems caused by it.

I am blind! (0)

Anonymous Coward | more than 5 years ago | (#29340821)

I am blind!
