
Build Your Own $2.8M Petabyte Disk Array For $117k

Soulskill posted more than 4 years ago | from the we-know-exactly-what-you'd-do-with-that-much-storage dept.

Data Storage

Chris Pirazzi writes "Online backup startup BackBlaze, disgusted with the outrageously overpriced offerings from EMC, NetApp and the like, has released an open-source hardware design showing you how to build a 4U, RAID-capable, rack-mounted, Linux-based server using commodity parts that contains 67 terabytes of storage at a material cost of $7,867. This works out to roughly $117,000 per petabyte, which would cost you around $2.8 million from Amazon or EMC. They have a full parts list and diagrams showing how they put everything together. Their blog states: 'Our hope is that by sharing, others can benefit and, ultimately, refine this concept and send improvements back to us.'"
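The arithmetic behind those headline figures, as a quick sketch (the per-pod cost and capacity are from the article; the ~$2.8M per petabyte vendor figure is the comparison quoted in the summary):

    # Rough cost-per-petabyte math using the figures quoted above.
    POD_COST_USD = 7_867            # material cost of one 4U pod
    POD_CAPACITY_TB = 67            # raw storage per pod
    VENDOR_COST_PER_PB = 2_800_000  # approximate Amazon/EMC figure from the summary

    pods_per_pb = 1000 / POD_CAPACITY_TB          # decimal terabytes per petabyte
    diy_cost_per_pb = pods_per_pb * POD_COST_USD  # ~$117,400

    print(f"pods per PB: {pods_per_pb:.1f}")
    print(f"DIY cost per PB: ${diy_cost_per_pb:,.0f}")
    print(f"vendor premium: ~{VENDOR_COST_PER_PB / diy_cost_per_pb:.0f}x")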



Not ZFS? (2, Insightful)

pyite (140350) | more than 4 years ago | (#29284975)

Good luck with all the silent data corruption. Shoulda used ZFS.

Re:Not ZFS? (0)

Anonymous Coward | more than 4 years ago | (#29285167)

So what? Use opensolaris instead of debian on yours then.

Re:Not ZFS? (4, Interesting)

anilg (961244) | more than 4 years ago | (#29285403)

Get both Debian and ZFS.. Nexenta. Links in my sig.

Yeah, but with Amazon you get FREE SHIPPING !! (2, Insightful)

Anonymous Coward | more than 4 years ago | (#29285177)

I love free shipping, even if it costs me more !! I like FREE STUFF !!

Re:Not ZFS? (2, Insightful)

Lord Ender (156273) | more than 4 years ago | (#29285693)

Are you saying that with the more expensive system, disks never fail and nobody ever has to get up in the night?

You know why Amazon charges that much? (4, Insightful)

Nimey (114278) | more than 4 years ago | (#29284995)

Support.

Re:You know why Amazon charges that much? (-1, Troll)

Anonymous Coward | more than 4 years ago | (#29285169)

Support.

Like the manufacturers of yo mama's bras? She's got big titties!

Re:You know why Amazon charges that much? (5, Funny)

bytethese (1372715) | more than 4 years ago | (#29285237)

For the 2.683M difference, that support better come with a "happy ending" for the entire staff...

Re:You know why Amazon charges that much? (4, Funny)

drooling-dog (189103) | more than 4 years ago | (#29285401)

Damn. I was going to offer support for half of that price until I saw this new requirement...

Re:You know why Amazon charges that much? (3, Funny)

machine321 (458769) | more than 4 years ago | (#29285491)

For 2.683M, you can probably afford to outsource that part.

Re:You know why Amazon charges that much? (5, Insightful)

Richard_at_work (517087) | more than 4 years ago | (#29285501)

And backup, redundancy, hosting, cooling etc etc. The $117,000 cost quoted here is for raw hardware only.

Re:You know why Amazon charges that much? (5, Insightful)

interval1066 (668936) | more than 4 years ago | (#29285625)

Backup: depends on the backup strategy. I could make this happen for less than an additional 10%. But ok, point taken.

Redundancy: You mean as in plain redundancy? These are RAID arrays are they not? You want redundancy at the server level? Now you're increasing the scope of the project which the article doesn't address. (Scope error)

Hosting: Again, the point of the article was the hardware. That's a little like accounting for the cost of a trip to your grandmother's, and factoring in the cost of your grandmother's house. A little out of scope.

Cooling: I could probably get the whole project chilled for less than 6% of the total cost, depending on how cool you want the rig to run.

I think you're looking for a wrench in the works where none exist.

Re:You know why Amazon charges that much? (5, Insightful)

MrNaz (730548) | more than 4 years ago | (#29285627)

Redundancy can be had for another $117,000.
Hosting in a DC will not even be a blip in the difference between that and $2.7m.

EMC, Amazon etc are a ripoff and I have no idea why there are so many apologists here.

Re:You know why Amazon charges that much? (1)

geekprime (969454) | more than 4 years ago | (#29285679)

So you're saying EMC provides all that?

Please think before you post.

Re:You know why Amazon charges that much? (4, Interesting)

MoonBuggy (611105) | more than 4 years ago | (#29285699)

The lowest cost of an (apparently) comparable solution on their site is from Dell, at $826,000 per PB. That includes hardware and support but still requires hosting, cooling and so on at extra cost. To quote backup and redundancy as part of the cost seems misleading, since none of the solutions appear to include that.

Basically, comparing favourably to the Dell units simply requires that one can get support for less than $709,000. If you want to throw in backup and redundancy, then buy twice as many units - you've still got change from half a million compared to the single Dell unit to cover the extra power, support and cooling costs, not to mention that support costs don't necessarily scale linearly.
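The break-even this comment describes, as a minimal sketch using only the per-petabyte prices quoted above:

    # Per-petabyte prices quoted in the comment above.
    DELL_PER_PB = 826_000
    POD_PER_PB = 117_000

    support_break_even = DELL_PER_PB - POD_PER_PB   # support must cost < $709,000
    redundant_pair = 2 * POD_PER_PB                 # buy two pods' worth for redundancy
    headroom = DELL_PER_PB - redundant_pair         # ~$592,000 left for power/support/cooling

    print(support_break_even, redundant_pair, headroom)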

Re:You know why Amazon charges that much? (4, Insightful)

johnlcallaway (165670) | more than 4 years ago | (#29285571)

It's great having someone tell you they will be there in three hours to replace your power supply, when you then have to dedicate a staff person to escort them on the shop floor because some moron in security requires it. If they had just left a few spare parts you could do it yourself, because everything just slides into place anyway.

That 2.683M also pays for salaries, pretty building(s), advertising, research, conventions, and more advertising.

I could hire a couple of dedicated staff to have 24x7 support for far less than 2.683M, plus a duplicate system worth of spare parts.

This stuff isn't rocket science. Most companies don't need high-speed, fiber-optic disk array subsystems for a significant amount of their data, only for a small subset that needs blindingly fast speed. The rest can sit on cheap arrays. For example, all of my network-accessible files that I open very rarely but keep on the network because they get backed up. All of my 5 copies of database backups and logs that I keep because it's faster to pull them off of disk than request a tape from offsite. And it's faster to back up to disk, then to tape.

BackBlaze is a good example of someone that needs a ton of storage, but not lightning-fast access. Having a reliable system is more important to them than one that has all the tricks and trappings of an EMC array that probably 10% of all EMC users actually use, but they all pay for.

A Very Shortsighted Article (3, Insightful)

eldavojohn (898314) | more than 4 years ago | (#29284999)

Before realizing that we had to solve this storage problem ourselves, we considered Amazon S3, Dell or Sun Servers, NetApp Filers, EMC SAN, etc. As we investigated these traditional off-the-shelf solutions, we became increasingly disillusioned by the expense. When you strip away the marketing terms and fancy logos from any storage solution, data ends up on a hard drive.

That's odd, where I work we pay a premium for what happens when the power goes out, what happens when a drive goes bad, what happens when maintenance needs to be performed, what happens when the infrastructure needs upgrades, etc. This article left out a lot of buzzwords, but they also left out the people who manage these massive beasts. I mean, how many hundreds (or thousands) of drives are we talking here?

You might as well add a few hundred thousand a year for the people who need to maintain this hardware and also someone to get up in the middle of the night when their pager goes off because something just went wrong and you want 24/7 storage time.

We don't pay premiums because we're stupid. We pay premiums so we can relax and concentrate on what we need to concentrate on.

Re:A Very Shortsighted Article (4, Informative)

SatanicPuppy (611928) | more than 4 years ago | (#29285127)

The focus of the article was only on the hardware, which was extremely low cost to the point of allowing massive redundancy...This is not an inherently flawed methodology.

If you can deploy cheap 67 terabyte nodes, then you can treat each node like an individual drive, and swap them out accordingly.

I'd need some actual uptime data to make a real judgment on their service vs their competitors, but I don't see any inherent flaws in building their own servers.

Re:A Very Shortsighted Article (5, Insightful)

Desler (1608317) | more than 4 years ago | (#29285161)

The point is that the costs of services like Amazon or NetApp, etc. include the costs for support, server maintenance, upgrades, etc. Comparing those to just the bare minimum price for this company to construct their server is highly misleading.

Re:A Very Shortsighted Article (4, Informative)

staeiou (839695) | more than 4 years ago | (#29285187)

We don't pay premiums because we're stupid. We pay premiums so we can relax and concentrate on what we need to concentrate on.

They actually do talk about that in the article. The difference in cost for one of the homegrown petabyte pods from the cheapest suppliers (Dell) is about $700,000. The difference between their pods and cloud services is over $2.7 million per petabyte. And they have many, many petabytes. Even if you do add "a few hundred thousand a year for the people who need to maintain this hardware" - and Dell isn't going to come down in the middle of the night when your power goes out - they are still way, way on top.

I know you don't pay premiums because you're stupid. But think about how much those premiums are actually costing you, what you are getting in return, and if it is worth it.

Re:A Very Shortsighted Article (1)

IGnatius T Foobar (4328) | more than 4 years ago | (#29285603)

I know you don't pay premiums because you're stupid. But think about how much those premiums are actually costing you, what you are getting in return, and if it is worth it.

It's called a "Cost/Benefit Analysis" and every PHB in the world knows how (and when!) to do one.

Re:A Very Shortsighted Article (1)

TheLinuxSRC (683475) | more than 4 years ago | (#29285191)

In the article he does mention that this solution is not for everyone and that failover and other features are outside the scope of the article. However, for his particular usage this is a nice solution.

My question is, where does one acquire the case he uses? My company currently stores a lot of video and the 10TB 4U machines I have been building are quickly running out of space. This would be an ideal solution for my needs.

Re:A Very Shortsighted Article (1)

joelmax (1445613) | more than 4 years ago | (#29285361)

FTA: the cases were custom built.

Re:A Very Shortsighted Article (4, Informative)

Tx (96709) | more than 4 years ago | (#29285197)

We don't pay premiums because we're stupid. We pay premiums because we're lazy.

There, fixed that for you ;).

Ok, that was glib, but you do seem to have been too lazy to read the article, so perhaps you deserve it. To quote TFA, "Even including the surrounding costs -- such as electricity, bandwidth, space rental, and IT administrators' salaries -- Backblaze spends one-tenth of the price in comparison to using Amazon S3, Dell Servers, NetApp Filers, or an EMC SAN." So they aren't ignoring the costs of IT staff administering this stuff as you imply; they're telling you the costs including the admin costs at their datacentre.

Re:A Very Shortsighted Article (0)

Anonymous Coward | more than 4 years ago | (#29285437)

We don't pay premiums because we're stupid. We pay premiums because we're lazy.

There, fixed that for you ;).

Ok, that was glib, but you do seem to have been too lazy to read the article, so perhaps you deserve it. To quote TFA, "Even including the surrounding costs -- such as electricity, bandwidth, space rental, and IT administrators' salaries -- Backblaze spends one-tenth of the price in comparison to using Amazon S3, Dell Servers, NetApp Filers, or an EMC SAN." So they aren't ignoring the costs of IT staff administering this stuff as you imply; they're telling you the costs including the admin costs at their datacentre.

So why don't they include their own salaries for the people that are doing all this work? You know, so they get a fair comparison on price? Oh, right, because the article is shortsighted.

Not that shortsighted for their purposes (5, Insightful)

Overzeetop (214511) | more than 4 years ago | (#29285223)

Yeah, this only works if you're the geeks building the hardware to begin with. The real cost is in setup and maintenance. Plus, if the shit hits the fan, the CxO is going to want to find some big butts to kick. 67TB of data is a lot to lose (though it's only about 35 disks at max cap these days).

These guys, however, happen to be the geeks, the maintainers, and the people-whose-butts-get-kicked-anyway. This is not a project for a one or two man IT group that has to build a storage array for their 100-200 person firm. These guys are storage professionals with the hardware and software know-how to pull it off. Kudos to them for making it and sharing their project. It's a nice, compact system. It's a little bit of a shame that there isn't OTS software, but at this level you're going to be doing grunt work on it with experts anyway.

FWIW, Lime Technology (lime-technology.com) will sell you a case, drive trays, and software for a quasi-RAID system that will hold 28TB for under $1500 (not including the 15 2TB drives - another $3k on the open market). This is only one-fault tolerant, though failure is more graceful than a traditional RAID. I don't know if they've implemented hot spares or automatic failover yet (which would put them up to two-fault tolerance on the drives, like RAID6).

Re:A Very Shortsighted Article (0)

Anonymous Coward | more than 4 years ago | (#29285283)

We don't pay premiums because we're stupid. We pay premiums so we can relax and concentrate on what we need to concentrate on.

Yes, but how much is this convenience worth to your company? How many people can you hire with $2.6M?

If money is no concern, why not outsource your whole business to somebody else? Why do anything at all?

Re:A Very Shortsighted Article (3, Interesting)

parc (25467) | more than 4 years ago | (#29285343)

At 67T per chassis and 45 drives documented per chassis, they're using 1.5T drives. 1 petabyte would then be 667 drives.

The worst part of this design that I see (and there's a LOT of bad to see) is the lack of an easy way to get to a failed drive. When a drive fails you're going to have to pull the entire chassis offline. Google did a study in 2007 of drive failure rates (http://labs.google.com/papers/disk_failures.pdf) and found the following failure rates over drive age (ignoring manufacturer):
3mo: 3% = 20 drives
6mo: 2% = 13 drives
1yr: 2% = 13 drives
2yr: 8% = 53 drives

Their logic is probably along the lines of "we're already paying someone to answer the pager in the middle of the night," but jeez, you're going to have to take a node offline every 2-3 days for the first year and then almost 2 a day after that!
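A quick sketch of the per-petabyte numbers behind that estimate (drive size inferred from 67 TB over 45 drives; the failure rates are the ones quoted above from the Google study):

    # Expected failed drives per petabyte, using the rates quoted above.
    drive_tb = 67 / 45                      # ~1.5 TB per drive
    drives_per_pb = round(1000 / drive_tb)  # ~667 drives

    failure_rate_by_age = {"3mo": 0.03, "6mo": 0.02, "1yr": 0.02, "2yr": 0.08}
    for age, rate in failure_rate_by_age.items():
        print(age, round(rate * drives_per_pb), "expected drive failures")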

Re:A Very Shortsighted Article (4, Insightful)

Anarke_Incarnate (733529) | more than 4 years ago | (#29285671)

You will more than likely NOT have to take a node offline. The design looks like they place the drives into slip down hot plug enclosures. Most rack mounted hardware is on rails, not screwed to the rack. You roll the rack out, log in, fail the drive that is bad, remove it, hot plug another drive and add it to the array. You are now done.

They went RAID 6, even though it is slow as shit, for the added failsafe mechanisms.

Re:A Very Shortsighted Article (2, Informative)

fulldecent (598482) | more than 4 years ago | (#29285541)

>> You might as well add a few hundred thousand a year for the people who need to maintain this hardware and also someone to get up in the middle of the night when their pager goes off because something just went wrong and you want 24/7 storage time.

>> We don't pay premiums because we're stupid. We pay premiums so we can relax and concentrate on what we need to concentrate on.

Or... you could just buy ten of them and use the left over $1m for electricity costs and an admin that doesn't sleep

Libraries (0)

TheBiGW (982686) | more than 4 years ago | (#29285007)

What's that in Libraries of Congress?

Re:Libraries (0)

Anonymous Coward | more than 4 years ago | (#29285115)

Demon Seed FTW.

But, anyways, that's 51.2 LOC [lesk.com], or about 3 Proteus, IIRC.

Ripoff (4, Insightful)

asaul (98023) | more than 4 years ago | (#29285027)

Looks like a cheap downscale undersized version of a Sun X4500/X4540.

And as others have pointed out, you pay a vendor because in 4 years they will still be stocking the drives you bought today, whereas for this setup you will be praying they are still on eBay.

Re:Ripoff (3, Insightful)

Anonymous Coward | more than 4 years ago | (#29285209)

why wouldn't you just build an entirely new pod with current disks and migrate the data? You could certainly afford it.

Re:Ripoff (1)

pyite (140350) | more than 4 years ago | (#29285357)

why wouldn't you just build an entirely new pod with current disks and migrate the data? You could certainly afford it.

Maybe because there's no need to update and you just want to be able to replace broken drives?

Re:Ripoff (1, Interesting)

Anonymous Coward | more than 4 years ago | (#29285349)

No, it's the google model: when a drive dies it's dead and doesn't matter anymore; when a server dies it's dead and doesn't matter anymore. The infrastructure built on top of the pods takes care of replicating data so a failure only removes one of several copies of the data.

cheap drives too (2, Informative)

pikine (771084) | more than 4 years ago | (#29285351)

Reliant Technology sells you a NetApp FAS 6040 for $78,500 with a maximum capacity of 840 drives, without the hard drives (source: Google Shopping). If you buy a FAS 6040 with the drives, most vendors will use more expensive, lower-capacity 15K RPM drives instead of the 7200 RPM drives the Backblaze pod uses, and this makes up a lot of the price difference. The point is, you could buy NetApp and install it yourself with cheap off-the-shelf consumer drives and end up spending about the same magnitude amount of money. I estimate that NetApp would cost just 1.5x the amount.

NetApp FAS 6040 at $78,500 + 840 x 1.5TB drives at $120 each = $179,300, which gives you 1.26PB. Cost per petabyte is $142,500, only slightly more expensive than Backblaze's $117,000 from the article. The real story is that Backblaze is able to show a competitive edge of $30,000, or roughly 20% cheaper.
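The parent's arithmetic, as a sketch (and note the reply below: putting consumer drives behind a NetApp controller isn't actually supported, so treat this purely as the comment's hypothetical):

    # The parent comment's hypothetical DIY NetApp build.
    controller = 78_500              # FAS 6040, no drives
    drives = 840 * 120               # 840 consumer 1.5 TB drives at ~$120 each
    capacity_pb = 840 * 1.5 / 1000   # 1.26 PB raw

    total = controller + drives      # $179,300
    per_pb = total / capacity_pb     # ~$142,000 per PB
    print(f"${per_pb:,.0f}/PB vs ~$117,000/PB for the pods ({per_pb / 117_000:.1f}x)")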

Re:cheap drives too (1, Interesting)

Anonymous Coward | more than 4 years ago | (#29285681)

The point is, you could buy NetApp and install it yourself with cheap off-the-shelf consumer drives and end up spending about the same magnitude amount of money.

You haven't bought a NetApp (or an EMC, Compellent, or XXX brand SAN) before - it doesn't work that way.

You get to buy NetApp Shelves of NetApp drives which sit behind your NetApp Controller. The drives, while mechanically identical to those you buy from NewEgg, run a special FW version. If you did manage to get it working, you sure as hell aren't going to get any support from your storage vendor.

Some of the newer NetApp controllers can sit in front of another SAN, but a bunch of commodity drives does not a SAN make.

Consumer drives don't work behind a pair of SAN controllers from ANY dominant storage vendor. Period. It sucks - maybe this should be what we're aiming to change.

Re:Ripoff (5, Interesting)

timeOday (582209) | more than 4 years ago | (#29285427)

Depends on how it works. Hopefully (or ideally) it's more like the google approach - build it to maintain data redundancy, initially with X% overcapacity. As disks fail, what do you do then? Nothing. When it gets down to 80% or so of original capacity (or however much redundancy you designed in), you chuck it and buy a new one. By then the tech is outdated anyways.

Re:Ripoff (1)

sarkeizen (106737) | more than 4 years ago | (#29285577)

This is the oft-repeated rationale. Personally, I don't see it as so cut-and-dried. Four years from now you may throw this thing away, but it also realizes its ROI way sooner than the branded hardware + support contract (considering that the cost of support increases over time, it's always possible that you will NEVER get a positive ROI on a product). The truth is, to get the most out of your money you have to run the numbers in each case. Not only that, but you should do so at each renewal period.

For example, we own a plethora of Nortel equipment, much of which is still useful but is also EOL. We pay a premium in support for these products, many of which could be had cheaply on the secondary market (this is not limited to eBay, BTW). These devices are part of a much larger system, so system replacement is expensive. The correct solution is to budget for replacement (outright or incremental), calculate your failure rate, and buy and store replacement units (don't forget to calculate disposal costs).

Instead, the admins act stupidly: they request hundreds of thousands of dollars to replace the system outright right away. When I ask them to justify this they hem and haw about labour used in maintaining the system, or service interruptions (implicitly falling for the "newer is better" fallacy). However, they never seem able to come up with figures for this, i.e. how much time do you spend resetting this hardware when it fails? How much downtime do we incur with it?

So we replace it...incurring significant downtime of course.

Anyway all that said I think their device has merit...I think for smaller shops having redundant power would be useful.
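To make the "run the numbers at each renewal" point concrete, here's a minimal sketch; every figure in it is an illustrative assumption, not anything from the thread:

    # Illustrative only: all numbers here are assumptions.
    YEARS = 4
    support = [40_000 * 1.10 ** y for y in range(YEARS)]  # vendor support, assumed +10%/yr
    spares_and_disposal = 25_000                          # secondary-market spares + disposal (assumed)
    in_house_labour = 5_000 * YEARS                       # extra swap/repair labour (assumed)

    print(round(sum(support)), spares_and_disposal + in_house_labour)
    # Re-run with real figures at every renewal period before deciding.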

Re:Ripoff (1)

afidel (530433) | more than 4 years ago | (#29285587)

Uh, for the difference in price you could buy and build 3x the number of nodes needed and keep them powered off, and still come out hundreds of thousands cheaper. In reality you might need, say, 20% extra nodes and about the same in spare HDDs over the 5-year life of the system (any more than 5 years and it's probably not worth the power to keep them going). I have to question why they put the OS on a single HDD; flash would have been cheaper and more reliable. I also have to wonder WTF is up with using non-ES drives, the ES drives only cost a couple percent more and are actually built to run 24x7. Oh, and anyone running a storage business with non-ECC RAM is NOT someone I'm going to trust my data to!

Re:Ripoff (1)

Delgul (515042) | more than 4 years ago | (#29285601)

Yeah... therefore, what we do at our company is to buy extra drives and put them on the shelf. For the money you save you can easily put a replacement drive (or even two or three, but this is overkill) on the shelf for every drive you put in the array. You will still be saving _massive_ amounts of money...

Re:Ripoff (1, Interesting)

Anonymous Coward | more than 4 years ago | (#29285619)

Looks like a cheap downscale undersized version of a Sun X4500/X4540.

Or, if you also want software in appliance form, along with flash accelerator drives and support, the Sun Storage 7210 [sun.com] which holds 46 TB in its 4U chassis and is expandable to 142 TB.

Sun has been undercutting NetApp prices with these ZFS-based "Unified Storage" systems, especially since they don't charge for software features (NFS, CIFS, HTTP, replication, etc.) separately like NetApp does.

By the way, if you want to try the software, there's a VMware/VirtualBox VM image [sun.com] of the storage appliance. You can replace the simulated drives with real ones if you like.

That's great but what about all the hidden costs? (1, Insightful)

Desler (1608317) | more than 4 years ago | (#29285035)

That's all fine and dandy, but where is my support going to come from when this server has issues? Are they throwing in free maintenance and upgrades to this server when it no longer meets requirements? If not, this figure is highly disingenuous.

Re:That's great but what about all the hidden cost (2, Informative)

CoolCash (528004) | more than 4 years ago | (#29285141)

If you check out what the company does, they are an online backup company. They don't host servers on this array, just back up data from your desktop. They just need massive amounts of space, which they make redundant.

Re:That's great but what about all the hidden cost (2, Insightful)

hodagacz (948570) | more than 4 years ago | (#29285195)

They designed and built it so they should know how to support it. If someone else builds one, just learning how to get that beast up and running is excellent hands on training.

Re:That's great but what about all the hidden cost (1)

TooMuchToDo (882796) | more than 4 years ago | (#29285527)

If you need the support, go pay the premium. Those of us with the appropriate technical background welcome the cheaper implementations.

Cool. (1, Interesting)

SatanicPuppy (611928) | more than 4 years ago | (#29285049)

Nominally a Slashvertisement, but the detailed specs for their "pods" (watch out guys, Apple's gonna SUE YOU) are pretty damn cool. 45 drives on two consumer-grade power supplies gives me the heebie jeebies though (powering up in stages sounds like it would take a lot of manual cycling, if you were rebooting a whole rack, for instance), and I'd be interested to know why they chose JFS (perfectly valid choice) over some other alternative...There are plenty of petabyte-capable filesystems out there.

Very interesting though. I tried to push a much less ambitious version of this for work, and got slapped down because it wasn't made by (insert proprietary vendor here). Of course, we're still having storage issues because we can't afford the proprietary solution, but at least there is no non-branded hardware in our server room.

Re:Cool. (1)

XorNand (517466) | more than 4 years ago | (#29285203)

It's not all that interesting, IMHO. If you read the description, all network I/O is done using HTTPS. The comparison to Amazon's S3 is fair, but it's ridiculous to compare this to NetApp or any of the other SANs they have listed; no iSCSI, no fiber channel.

Re:Cool. (1)

SatanicPuppy (611928) | more than 4 years ago | (#29285549)

67 terabytes for under 8000 dollars isn't interesting? Ooookay...

I don't give a damn about iSCSI; this isn't a database server, it's just a flat data file server...Most datacenters are limited by their network bandwidth anyway, not their internal bandwidth, and https isn't any worse than sftp. Paying Amazon a thousand times more, and I'd still be limited by MY bandwidth, not their internal bandwidth.

If they can deliver more storage for less price, then more power to 'em.

Re:Cool. (1)

TooMuchToDo (882796) | more than 4 years ago | (#29285593)

Really? Fiber channel tops out at what? 4Gb/sec? 8Gb/sec? Distribute your data chunks across enough chunk servers, and you can easily compete against that much cheaper.

Disclaimer: I'm currently doing HPC work at a US accelerator lab as part of one of the LHC experiments. I know how to move data around *fast*.
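Rough throughput math behind that claim (assuming a single gigabit Ethernet link per pod, which is an assumption about the pod, not something stated here):

    # Aggregate GigE across pods vs. a single Fibre Channel link.
    fc_gbps = 8            # high-end FC figure quoted above
    pod_nic_gbps = 1       # assumed: one GigE port per pod

    pods_to_match = fc_gbps / pod_nic_gbps
    print(f"{pods_to_match:.0f} pods' worth of GigE roughly matches an {fc_gbps} Gb/s FC link")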

It's all clear now. (4, Funny)

grub (11606) | more than 4 years ago | (#29285057)


AHhh, this is why the EMC guy committed suicide. It wasn't because he was dying of cancer.

My math is a bit rusty... (1, Funny)

Anonymous Coward | more than 4 years ago | (#29285067)

...but that doesn't add up. $7,867 / 67 petabytes = $117.42/petabyte, not $117,000/petabyte.

Perhaps they were using the 'new' math.

Re:My math is a bit rusty... (5, Informative)

Desler (1608317) | more than 4 years ago | (#29285095)

It's not your math that's rusty, it's your reading skills.

Linux-based server using commodity parts that contains 67 terabytes of storage at a material cost of $7,867.

Re:My math is a bit rusty... (2, Informative)

ShadowRangerRIT (1301549) | more than 4 years ago | (#29285139)

You misread. It's $7,867 per 67 terabytes. So at the hard disk standard for a petabyte (base 10, not base 2), 1000 TB == 1 PB:
(1000 TB / 67 TB) * $7,867 = $117417.91

No ECC? (0)

Anonymous Coward | more than 4 years ago | (#29285071)

No ECC? Good luck....

My plan comes to fruition! (5, Informative)

elrous0 (869638) | more than 4 years ago | (#29285081)

Soon I shall have a single media server with every episode of "General Hospital" ever made stored at a high bitrate. WHO'S LAUGHING NOW, ALL YOU WHO DOUBTED ME!!!!

And how big is a petabyte you ask? There have been about 12,000 episodes of General Hospital aired since 1963. If you encoded 45 minute episodes at DVD quality mpeg2 bitrate, you could fit over 550,000 episodes of America's finest television show on a 1 petabyte server, enough to archive every episode of this remarkable show from its auspicious debut in 1963 until the year 4078.

Re:My plan comes to fruition! (3, Funny)

ShadowRangerRIT (1301549) | more than 4 years ago | (#29285175)

But what about storing the new episodes in HD? Clearly a masterpiece of TV such as this should not be stored at mere SD quality!

Re:My plan comes to fruition! (4, Funny)

RMH101 (636144) | more than 4 years ago | (#29285213)

I think we have a new metric unit of storage, to rival the (now deprecated) Library Of Congress SI unit.

Re:My plan comes to fruition! (1)

snspdaarf (1314399) | more than 4 years ago | (#29285231)

I wouldn't watch Genital Hospital with a gun to my head! Give me All My Children, or give me Death!

Well, maybe Tea and Cake instead of Death, but you get the idea.

Re:My plan comes to fruition! (1)

elrous0 (869638) | more than 4 years ago | (#29285493)

I think you need to show more respect for a show that gave both Rick Springfield and John Stamos their acting debuts. These episodes also have incredible historic value. Years from now, when historians are needing footage of Demi Moore before plastic surgery, you'll thank me!

Re:My plan comes to fruition! (5, Funny)

ari_j (90255) | more than 4 years ago | (#29285263)

Soon I shall have a single media server with every episode of "General Hospital" ever made stored at a high bitrate. WHO'S LAUGHING NOW, ALL YOU WHO DOUBTED ME!!!!

And how big is a petabyte you ask? There have been about 12,000 episodes of General Hospital aired since 1963. If you encoded 45 minute episodes at DVD quality mpeg2 bitrate, you could fit over 550,000 episodes of America's finest television show on a 1 petabyte server, enough to archive every episode of this remarkable show from its auspicious debut in 1963 until the year 4078.

Of all the computer systems out there, yours is the one for which becoming self-aware terrifies me the most.

Re:My plan comes to fruition! (1)

maxume (22995) | more than 4 years ago | (#29285383)

What's intimidating about a self-absorbed, over-acting computer?

Re:My plan comes to fruition! (1, Funny)

Anonymous Coward | more than 4 years ago | (#29285477)

But we already have William Shatner.

Re:My plan comes to fruition! (2, Interesting)

maxume (22995) | more than 4 years ago | (#29285529)

William Shatner has continued to be awesome well into his 70s. He even went on Conan and mocked Sarah Palin (while gently ribbing himself).

Of the personalities in Hollywood, he is one I like quite a bit.

Re:My plan comes to fruition! (1)

WMD_88 (843388) | more than 4 years ago | (#29285431)

General Hospital was only 30 minutes originally; it didn't become 60 until the late 70s. And even then, the number of commercials per hour has surely changed over time. So, your estimate is quite off. I prefer One Life to Live anyway ;D

Re:My plan comes to fruition! (0)

Anonymous Coward | more than 4 years ago | (#29285575)

So about half my porn collection then?

Re:My plan comes to fruition! (0)

Anonymous Coward | more than 4 years ago | (#29285637)

You'd need less space if you applied "deduplication"...

Disk replacement? (3, Insightful)

jonpublic (676412) | more than 4 years ago | (#29285089)

How do you replace disks in the chassis? We've got 1,000 spinning disks and we've got a few failures a month. With 45 disks in each unit you are going to have to replace a few consumer grade drives.

Re:Disk replacement? (2, Informative)

markringen (1501853) | more than 4 years ago | (#29285121)

Slide it out on a rail, and drop in a new one. And there is no such thing as consumer grade anymore; they are often of much higher quality, stability-wise, than server-specific drives these days.

Re:Disk replacement? (0)

Anonymous Coward | more than 4 years ago | (#29285331)

slide it out on a rail, and drop in a new one.
and there is no such thing as consumer grade anymore, they are often of much higher quality stability wise than server specific drives these days.

More like write down which drive failed so you can identify it later, turn off the machine, slide out the server, take off the top, replace the drive and hope you replaced the correct one. Then put the cover back on, slide it back in, turn on PSU1, wait, and turn on PSU2. Finally log in and make sure everything is running correctly.

On a NetApp, it's look for the drive with the red light and replace it.

Re:Disk replacement? (1)

TheGratefulNet (143330) | more than 4 years ago | (#29285287)

yeah, the lack of ANY kind of hot swap on those chassis is laughable.

Totally the wrong way to go. This guy is hell-bent on density, but he let that override common sense!

Re:Disk replacement? (2, Informative)

maxume (22995) | more than 4 years ago | (#29285595)

It sounds like they just soft-swap a whole chassis once enough of the drives in it have failed.

If their requirements are a mix of cheap, redundant and huge (with not so much focus on performance), cheap disposable systems may fit the bill.

Re:Disk replacement? (1)

TheGratefulNet (143330) | more than 4 years ago | (#29285657)

that's a LOT of drives to take offline if just 1 fails.

really ugly design. very amateurish.

there are bezels and frames that allow FRONT mount and hot swap.

And BTW, all the drives I see in commercial storage are notebook-style (2.5") SAS drives. I could not believe it (why not 3.5"??) but it's a fact; small form factor SAS drives are taking over. There must be a good reason for it or Sun (et al.) wouldn't be using those 'small drives'.

Re:Disk replacement? (1)

LordKazan (558383) | more than 4 years ago | (#29285449)

Be like Google: hardware redundancy and software handling the failover.

Take down the node with a bad drive, swap the drive, rebuild that pod's RAID (preferably I would RAID6 them, as it has better error recovery than RAID5 at the expense of usable capacity being [drive size]*[number of drives - 2] instead of RAID5's [drive size]*[number of drives - 1]). When it comes back up it syncs to its other copy.
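Plugging the chassis numbers into those formulas (treating all 45 drives as one array purely for illustration; the real pods may be partitioned differently):

    # Usable capacity under the RAID5 vs. RAID6 formulas above.
    drives = 45
    drive_tb = 1.5

    raid5_tb = drive_tb * (drives - 1)   # 66.0 TB, one drive's worth of parity
    raid6_tb = drive_tb * (drives - 2)   # 64.5 TB, two drives' worth of parity
    print(raid5_tb, raid6_tb)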

I would also get drives with LARGE write caches, and any databases would be running with LARGE RAM buffers for performance.

for the same price as you'd shell out for "professional grade hardware" you can get 5x the "consumer grade hardware" and that's more than enough to facilitate hot data redundancy and failover.

your IT guy might even have something to do other than play World of Warcraft.

wtf? (5, Insightful)

pak9rabid (1011935) | more than 4 years ago | (#29285123)

FTA...

But when we priced various off-the-shelf solutions, the cost was 10 times as much (or more) than the raw hard drives.

Um... and what do you plan on running these disks with? HDs don't magically store and retrieve data on their own. The HDs are cheap compared to the other parts that create a storage system. That's like saying a Ferrari is a ripoff because you can buy an engine for $3,000.

Re:wtf? (1)

ShadowRangerRIT (1301549) | more than 4 years ago | (#29285269)

RTFA. That $117,000 figure includes the whole rack, not just the raw HDs (which come to $81,000 according to their chart). They priced out everything in what they refer to as a "storage pod" in detail, so you can see for yourself. My primary concern is the fact that the boot disk (priced separately) doesn't appear to have a drop-in backup. If one of the 45 storage HDs goes down, you can replace it (presumably it supports hot swapping), but if the boot drive goes you've got downtime.

Re:wtf? (1)

corsec67 (627446) | more than 4 years ago | (#29285429)

Looking at the case, where they have a vibration reducing layer of foam under the lid screwed down onto the drives, and with the pods stacked in the frame like they are, you have to pull a whole unit out anyways to replace a drive.

So, no hot-swap of anything anyways. PSUs fail pretty commonly in my experience, and not only do they not have redundant PSUs, they have 2 non-redundant power supplies. (RAID 0 for PSUs... what happens when the 12V rail gets a huge surge that fries the boards on all of the drives?) They might have been better off using RAID 0 in the pod, and mirroring stuff between pods, so that when they take a pod down for maintenance (or it goes *poof*), it has less of an impact.

Also the design doesn't have any "Replace THIS DRIVE --->" indicators when they want to replace a drive, so they would have to hope the monkey gets it right in replacing drives/power supplies.

Re:wtf? (1)

SatanicPuppy (611928) | more than 4 years ago | (#29285641)

It's a little odd they didn't just choose to netboot, or boot off a cd or something. Having a boot drive at all seems like an unnecessary point of failure.

Re:wtf? (0)

Anonymous Coward | more than 4 years ago | (#29285275)

FTA...

Um... and what do you plan on running these disks with? HDs don't magically store and retrieve data on their own. The HDs are cheap compared to the other parts that create a storage system.

The article answered that... and they are running software RAID with JFS and Tomcat to retrieve the data.

And EMC is a rip-off, their feature set isn't justification for the mark-up.

This is from someone who has to maintain these things, my Clariion is slower and harder to maintain than my Linux storage server. FC vs SATA, both over iSCSI. Ingenuity and innovation for the win.

Re:wtf? (1)

pak9rabid (1011935) | more than 4 years ago | (#29285481)

This is from someone who has to maintain these things, my Clariion is slower and harder to maintain than my Linux storage server. FC vs SATA, both over iSCSI. Ingenuity and innovation for the win.

Inquiring minds want to know...why would yall spend the money on FC drives only to be run over iSCSI? Why not just use SATA drives in your Clariion? I'm sure they would have been cheaper.

Re:wtf? (0)

Anonymous Coward | more than 4 years ago | (#29285307)

Aha! relevancy FTW!

but... does it run... Lunix?

Re:wtf? (0)

Anonymous Coward | more than 4 years ago | (#29285327)

"That's like saying a Ferrari is a ripoff because you can buy an engine for $3,000"

$3000? Pffft. The Trailer Park Boys could get you an engine for a mere 2 buckets of chicken and a drive to the liquor store [youtube.com] .

Re:wtf? (0)

Anonymous Coward | more than 4 years ago | (#29285365)

FTA:

On top of that is the JFS file system, and the only access we then allow to this totally self-contained storage building block is through HTTPS running custom Backblaze application layer logic in Apache Tomcat 5.5. After taking all this into account, the formatted (useable) space is 87 percent of the raw hard drive totals. One of the most important concepts here is that to store or retrieve data with a Backblaze Storage Pod, it is always through HTTPS. There is no iSCSI, no NFS, no SQL, no Fibre Channel. None of those technologies scales as cheaply, reliably, goes as big, nor can be managed as easily as stand-alone pods with their own IP address waiting for requests on HTTPS.

they are missing hardware mgmt (5, Interesting)

TheGratefulNet (143330) | more than 4 years ago | (#29285253)

Where's the extensive stuff that Sun (I work at Sun, BTW; related to storage) and others have for management? Voltages, fan flow, temperature points at various places inside the chassis, an 'OK to remove' LED and button for the drives, redundant power supplies that hot-swap, and drives that truly hot-swap (including presence sensors in drive bays). None of that is here. And these days, SAS is the preferred drive tech for mission-critical apps. Very few customers use SATA for anything 'real' (it seems, even though I personally like SATA).

This is not enterprise quality, no matter what this guy says.

There's a reason you pay a lot more for enterprise vendor solutions.

Personally, I have a Linux box at home running JFS and RAID5 with hot-swap drive trays. But I don't fool myself into thinking it's BETTER than Sun, HP, IBM and so on.

Re:they are missing hardware mgmt (4, Insightful)

N1ck0 (803359) | more than 4 years ago | (#29285673)

It's better at what they need it for. Based on the services and software they describe on their site, it looks like they store data in the classic redundant chunks distributed over multiple 'disposable' storage systems. In this situation most of the added redundancy that vendors put in their products doesn't add much value to their storage application. Thus having racks and racks of basic RAIDs on cheap disks and paying a few on-site monkeys to replace parts is more cost-effective than going to a more stable/tested enterprise storage vendor.

You can get 2TB drives now (1)

cibyr (898667) | more than 4 years ago | (#29285281)

Since you can now get 2TB drives you should be able to fit 90TB in one of these boxes :)

And I thought I was doing well with a few terabytes in my home server (but hey, ZFS should save me from silent data corruption when the drives inevitably start to fail).

Sooooo, (0)

Anonymous Coward | more than 4 years ago | (#29285295)

When are they going to sell the Backblaze kit, everything but the hard drives?

Everything looks rather standard except the case and the HD panels inside it.

I am sure there are companies who would really like to buy one or two.

Re:Sooooo, (1)

Christophotron (812632) | more than 4 years ago | (#29285653)

Hell, *I* would like to buy one, for my own personal use! $8000 seems very cheap for 67 terabytes of storage in a neat little package. My 4TB RAID was quite expensive compared to this (on a $ per TB basis) and it's almost full now. I can definitely see something like this in my future. Running ZFS for error detection, of course. And probably 2 redundant PSUs instead of standard consumer-grade ones. Wouldn't want one of those to go out and take half of my drives with it!

Online storage is way too expensive and internet connection speeds here in the USA will suck too badly for too long to even consider it..

No comparison (0)

Anonymous Coward | more than 4 years ago | (#29285311)

Those "outrageously overpriced" models have multiple controllers that have battery backed up caches that mirror their data, SAS or FC instead of SATA, hot swappable components (power supplies, fans, drives, controllers, cache modules, etc), 99.999% uptime, testing/certification for EMI, shock, vibration, thermal, GUI/phone home management, and 24/7 on-site support. They are designed for high performance, mission critical situations. The blog is from a company that's doing backups. They did a good job, but it's apples and oranges. They don't have the performance, uptime, or support requirements. They're doing their own support and aren't selling the HW, so they don't have the certification. Their top loading trays are going to make it fun to replace a drive at the top of the cabinet.

Or wait 5 years and buy it at newegg for $280 (3, Funny)

dicobalt (1536225) | more than 4 years ago | (#29285369)

and save $2,799,720.

No Linux support? (0)

Anonymous Coward | more than 4 years ago | (#29285397)

So there is a small blog write-up that demonstrates you use a fairly unreliable hardware setup... love those rubber bands!... but your $5-a-month service only supports Windows and newer Mac computers? ... meh

board standoffs under the HDDs = HDDs will fry. (0)

Anonymous Coward | more than 4 years ago | (#29285421)

Air will take the path of least resistance, so all those fans will be moving the coolest air (under the hard drives) and pushing it out the back. They really, really are going to hate replacing ~30 drives per enclosure after a few weeks.

Better to plug them in from the bottom of the chassis, and put the standoffs on the "top" so the hot air will at least rise off the disks and can be pushed/pulled out by that godawful fan system.

Oh yeah - TWO 760W power supplies? 1500 watts per 45 drives? That's pretty horrible by enterprise standards. They will spend 2x on powering these over an HP EVA4400.

Liability insurance (1)

scsirob (246572) | more than 4 years ago | (#29285443)

If you build a petabyte stack using 1.5TB disks you need about 800 drives including RAID overhead. With an MTBF for consumer drives of 500,000 hours, a drive will fail roughly every 10-15 days, if your design is good and you create no hotspots/vibration issues.

Rebuild times on large RAID sets are such that it is only a matter of time before they run into a double drive failure and lose their customers' data. The money they saved by going cheap will be spent on lawyers when the liability claims come in.
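Two back-of-envelope ways to get a failure interval for ~800 drives: the quoted 500,000-hour MTBF, and the field failure rates from the Google study cited earlier in the thread (roughly 2-8% per year depending on drive age):

    # Time between drive failures for an ~800-drive petabyte, two estimates.
    drives = 800
    mtbf_hours = 500_000

    mtbf_days_between_failures = mtbf_hours / drives / 24   # ~26 days
    print(f"MTBF-based: one failure every {mtbf_days_between_failures:.0f} days")

    for afr in (0.02, 0.08):                                # field annualized failure rates
        print(f"AFR {afr:.0%}: one failure every {365 / (afr * drives):.0f} days")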

Cool for home pr0n collection, but business? (1)

filesiteguy (695431) | more than 4 years ago | (#29285447)

Though I don't run a datacenter, I do rely heavily on one. My co-manager is in charge of keeping my 80 TB of data online 24/7 using redundant HP StorageWorks 8000 EVA units. [hp.com]

These cost a bit and have drives which fail at a fairly infrequent rate. It doesn't hurt that the data center is kept at 64 degrees by two (redundant) chillers and has 450 kVA redundant power conditioners keeping the electricity on at all times. (We do shut off the power to the building once a month to check these and the diesel generator housed on the premises as well.)

Now - paying $x,xxx per year for maintenance on these units is cheap insurance in my mind. If something goes wrong, HP is available 24/7 to be onsite with replacement parts. This has - in fact happened - during the past few years. A controller on the array went bad, causing disk read failures. We instantly called HP, had a tech onsite, and had the controller replaced within a few hours of the problem being detected.

OTOH - for someone's 4 petabyte home pr0n collection, this might be a good idea! :P

Let's try to be a bit more supportive here! (4, Insightful)

fake_name (245088) | more than 4 years ago | (#29285559)

If an article went up describing how a major vendor released a petabyte array for $2M, the comments would be full of people saying "I could make an array with that much storage far cheaper!"

Now someone has gone and done exactly that (they even used Linux to do it) and suddenly everyone complains that it lacks support from a major vendor.

This may not be perfect for everyone's needs, but it's nice to see this sort of innovation taking place instead of blindly following the same path everyone else takes for storage.

What's all the hate? (5, Insightful)

xrayspx (13127) | more than 4 years ago | (#29285583)

These guys build their own hardware, think it might be able to be improved on or help the community, and they release the specs, for free, on the Internet. They then get jumped on by people saying "bbbb-but support!". They're not pretending to offer support. If you want support, pay the 2MM for EMC; if you can handle your own support in-house, maybe you can get away with building these out.

It's like looking at KDE and saying "But we pay Apple and Microsoft so we get support" (even though, no you don't). The company is just releasing specs, if it fits in your environment, great, if not, bummer. If you can make improvements and send them back up-stream, everyone wins. Just like software.

I seem to recall similar threads whenever anyone mentions open routers from the Cisco folks.

Battery Backup? (0)

Anonymous Coward | more than 4 years ago | (#29285599)

I'd like to see support for power supply redundancy or, at least, battery backup for the raid cards before I consider this as a viable solution - even for homebrew.

Components (1)

HogGeek (456673) | more than 4 years ago | (#29285611)

Not too shabby.

I had recently built a "storage pod" for my media @ home (6T using 4 1.5T drives), and had a hell of a time finding "good" components. So, I looked this over, and while it's made up of "consumer components" a couple of the components seem impossible to find for this as well.

Case: Custom built
HD backplane: Custom made by a Chinese manufacturer.

So good luck building a "one-off" for your small business/home, as I'll also bet these prices are for "quantity" (quality notwithstanding).

FC / iSCSI / 10GBe / Cache / Snapshot etc (0)

Anonymous Coward | more than 4 years ago | (#29285651)

Lots missing here. How do I access the box? FC? FCoE? NFS only? 1GbE or 10GbE? What about cache, and caching algorithms/logic? Snapshotting? Mirroring? Clones? Remote mirroring?

Yes, all of these can be done at limited levels with Linux (Openfiler, for instance), but this implementation loses a lot in the ports and cache.
