
Build Your Own 135TB RAID6 Storage Pod For $7,384

CmdrTaco posted more than 2 years ago | from the for-all-your...-movies-yeah-thats-it dept.


An anonymous reader writes "Backblaze, the cloud-based backup provider, has revealed how it continues to undercut its competitors: by building its own 135TB Storage Pods, which cost just $7,384 in parts. Backblaze has provided almost all of the information you need to build your own Storage Pod, including the parts list (45 3TB hard drives, three PCIe SATA II cards, and nine port-multiplier backplanes), but without Backblaze's proprietary management software you'll probably have to use FreeNAS, or cobble together your own software solution... A couple of years ago they showed how to make their first-generation, 67TB Storage Pods."

239 comments

My God... (1)

AngryDeuce (2205124) | more than 2 years ago | (#36834264)

It's full of stars!!

Re:My God... (2)

ByOhTek (1181381) | more than 2 years ago | (#36834288)

It's full of slashvertisements!!

Re:My God... (2)

x6060 (672364) | more than 2 years ago | (#36834648)

It's not a slashvertisement if they tell you how to build it yourself.....

Re:My God... (2)

TheRaven64 (641858) | more than 2 years ago | (#36834804)

Except for the bit about how it would be even better if you paid for their proprietary management software...

Re:My God... (4, Insightful)

x6060 (672364) | more than 2 years ago | (#36835308)

Did you notice how they even gave you the alternatives to their software? Essentially they are saying "We developed this for our own internal use and if you would LIKE to pay for it, that's cool. If you don't, then there are these other free alternatives." But then again, just because some company is mentioned in the article it MUST be a slashvertisement.

Re:My God... (0)

ByOhTek (1181381) | more than 2 years ago | (#36835132)

So, you are suggesting they aren't promoting their proprietary software, or their service (while criticizing the competitors) on that page?

Odd, I must be on some pretty interesting, and highly specifically targeted, hallucinogens.

Re:My God... (2)

x6060 (672364) | more than 2 years ago | (#36835342)

You must be. They even GIVE you the free alternatives to their software. But in the same page they give you everything you need to do it yourself. You just have to add some hardware.

Re:My God... (5, Insightful)

Dillon2112 (197474) | more than 2 years ago | (#36835096)

My problem with Backblaze is their marketing is very misleading...they pit these storage pods up against cloud storage and assert that they are "cheaper", as though a storage pod is anything like cloud storage. It isn't. Sure, there's the management software issue that's already been mentioned, but they do no analysis on redundancy, power usage, security, bandwidth usage, cooling, drive replacement due to failure, administrative costs, etc. It's insulting to anyone who can tell the difference, but there are suits out there who read their marketing pitch and decide that current cloud storage providers like Google and Amazon are a rip off because "Backblaze can do the same thing for a twentieth the price!" It's nuts.

You can see this yourself in their pricing chart at the bottom of their blog post. They assert that Backblaze can store a petabyte for three years for either $56k or $94k (if you include "space and power"). And then they compare that to S3 costing roughly $2.5 million. In their old graphs, they left out the "space and power" part, and I'm sure people complained about the inaccuracies. But they're making the same mistake again this time: they're implicitly assuming the cost of replicating, say, S3, is dominated by the cost of the initial hardware. It isn't. They still haven't included the cost of geographically distributing the data across data centers, the cost of drive replacement to account for drive failure over 3 years, the cost of the bandwidth to access that data, and it is totally unclear if their cost for "power" includes cooling. And what about maintaining the data center's security? Is that included in "space"?

On a side note, I'd be interested to see their analysis on mean time between data loss using their system as it is priced in their post.

You could say Backblaze is serving a different need, so it doesn't need to incur all those additional costs, and you might be right, but then why are they comparing it to S3 in the first place? It's just marketing fluff, and it is in an article people are lauding for its technical accuracy. Meh.
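For rough scale, here is the per-gigabyte-month arithmetic implied by the figures above, taking the $94,000 "space and power" number and the $2,466,000 three-year S3 figure quoted from the Backblaze post elsewhere in the thread, and ignoring every one of the hidden costs listed above:

<ecode>
# Back-of-envelope only: one decimal petabyte stored for 36 months, using the
# totals quoted in the thread. None of the omitted costs (replication,
# bandwidth, drive churn, admin) are included.
PETABYTE_GB = 1_000_000
MONTHS = 36

for name, total_usd in [("Backblaze pod (space + power)", 94_000),
                        ("Amazon S3 (as quoted)", 2_466_000)]:
    print(f"{name}: ${total_usd / (PETABYTE_GB * MONTHS):.4f} per GB-month")

# Backblaze pod (space + power): $0.0026 per GB-month
# Amazon S3 (as quoted): $0.0685 per GB-month
</ecode>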

Already approaching Petabytes? (1)

Hermanas (1665329) | more than 2 years ago | (#36834284)

Wow, are we already approaching Petabyte clusters? I'm still getting used to Terabyte!

Re:Already approaching Petabytes? (0)

Anonymous Coward | more than 2 years ago | (#36834392)

Well, that's at least a Petabit cluster...

Re:Already approaching Petabytes? (1)

gman003 (1693318) | more than 2 years ago | (#36834592)

According to TFA's TFA, the company has a total capacity of 16 petabytes, using only 201 pods (many being the old 1.0 pods with 67TB storage).

Re:Already approaching Petabytes? (1)

Walker1337 (2400896) | more than 2 years ago | (#36835046)

Most of the high end storage providers have petabyte arrays now. I work for NetApp and I have personally installed a single cluster with 624 2TB SATA drives in it.

Again? (0)

DeHackEd (159723) | more than 2 years ago | (#36834286)

Re:Again? (1)

DeHackEd (159723) | more than 2 years ago | (#36834298)

Ugh, replying to myself. I missed the link in the post.

But nothing's changed, right? It's the same chassis, same diagrams from Backblaze. Only ~2 years of bigger drives is new.

Re:Again? (1)

Amouth (879122) | more than 2 years ago | (#36834798)

and a different hardware/RAID/multiplier/power-harness setup..

basically the same, just updated - and worth a note.. I wish they (or someone) sold the setup sans drives (or just the bare case) - it looks fun to mess with, but I don't have a lot of free time nowadays.

Re:Again? (0)

Anonymous Coward | more than 2 years ago | (#36834976)

If you read the blog post, the parts that changed besides the hard drives are: the motherboard, CPU, RAM amount, an additional Gigabit Ethernet port, a switch from Debian 4 to 5, a file system change from JFS to ext4, and a different PCIe SATA card that eliminates the fourth one that used to run on a regular PCI slot, amongst other things. In other words, RTFA.

Re:Again? (0)

Anonymous Coward | more than 2 years ago | (#36835170)

There is more updated in there than just the drives. From the Backblaze blog entry:

We’ve made several improvements to the design that have doubled the performance of the storage pod. Most of the improvements were straightforward and helped by Moore’s Law. We bumped the CPU up from the Intel dual core CPU to the Intel i3 540 and upgraded the motherboard from one Gigabit Ethernet port to a Supermicro motherboard with two Gigabit Ethernet ports. RAM dropped in price, so we doubled it to 8 GB in the new pod. More RAM enables our custom Backblaze software layer to create larger disk caches that can really speed up certain types of disk I/O.

In the first generation storage pod, we ran out of the faster PCIe slots and had to use one slower PCI slot, creating a bottleneck. Justin Stottlemyer from Shutterfly found a better PCIe SATA card, which enabled us to reduce the SATA cards from four to three. Our upgraded motherboard has three PCIe slots, completely eliminating the slower PCI bottleneck from the system.

OLD OLD NEWS (0)

Anonymous Coward | more than 2 years ago | (#36834290)

They have had a blog post on this topic for almost a year at least.

Re:OLD OLD NEWS (1)

kalalau_kane (1621021) | more than 2 years ago | (#36834692)

Sun has been selling this same design for several years -- the Sun x4500, released October 2006: 6 SATA controllers, 48 top-loading SATA drives, 2 x86 CPUs.

Not enough (2)

bryan1945 (301828) | more than 2 years ago | (#36834306)

For a true porn collector yet.

Re:Not enough (0)

Anonymous Coward | more than 2 years ago | (#36834362)

It's nowhere near enough if you want your porn in 3D and 8k definition, as it should be.

Re:Not enough (1)

rbrausse (1319883) | more than 2 years ago | (#36835194)

fun fact: the porn industry has problems with high definition [nytimes.com]

The high-definition format is accentuating imperfections in the actors — from a little extra cellulite on a leg to wrinkles around the eyes. [..] "The biggest problem is razor burn," said Stormy Daniels, an actress, writer and director. "I'm not 100 percent sure why anyone would want to see their porn in HD."

Re:Not enough (0)

Anonymous Coward | more than 2 years ago | (#36835356)

"Perfection" is overrated. Where some see flaws, others might see features ;).

The "Girl Next Door" and MILF types might not be "perfect" but they're still popular.

Feelin' HOT HOT HOT (0)

GameboyRMH (1153867) | more than 2 years ago | (#36834336)

Something about all those drives being packed in there like hot metal sardines gives me a bad feeling...

Re:Feelin' HOT HOT HOT (1)

L4t3r4lu5 (1216702) | more than 2 years ago | (#36834566)

I wouldn't be surprised if the top of the case fit flush with the hard drive cases and was used as a heatsink. Alu top case, finned, with a bank of fans in push/pull configuration, and a hot/cold arrangement of ducting along the racks.

That's how I'd do it, anyway.

Re:Feelin' HOT HOT HOT (1)

cmiller173 (641510) | more than 2 years ago | (#36835066)

You would be surprised - there's a piece of foam between the top of the case and the drives, if you RTFA!

Re:Feelin' HOT HOT HOT (1)

Anrego (830717) | more than 2 years ago | (#36834574)

The multipliers make me more nervous!

Seriously... my experience with SATA multipliers has been that they should be avoided at all costs.

Re:Feelin' HOT HOT HOT (0)

Anonymous Coward | more than 2 years ago | (#36834850)

Their very specific selection of a SYBA-branded SATA card is because that card works best with the multipliers. They have this figured out already, or else how on earth would they have hundreds of these pods working well in a datacenter?

Re:Feelin' HOT HOT HOT (3, Informative)

hjf (703092) | more than 2 years ago | (#36834662)

This is nothing new. You've never been in a datacenter before, kid. You can ask a grownup one day and he can take you there and you will feel the heat. And NOISE. No offense, but I think you're one of those gamer kids who builds rigs for max FPS, with esoteric water cooling and silent fans everywhere.

Yeah, no, you don't need to pamper your hardware that much. Even laptop drives work way hot (60C+) for years with no issue.

Most servers are built that way too. The Sun x4500 is extremely densely packed. And there are hundreds running just fine.

Re:Feelin' HOT HOT HOT (1)

Lorien_the_first_one (1178397) | more than 2 years ago | (#36834766)

Thank you for pointing that out about laptop drives. I have one at home burning it up at over 50C.

Re:Feelin' HOT HOT HOT (1)

houghi (78078) | more than 2 years ago | (#36834946)

I have one running as a server. The fan inside is broken so there's no cooling at all. It has been running at around 100C for several months now.

Re:Feelin' HOT HOT HOT (1)

Kjella (173770) | more than 2 years ago | (#36834972)

Well, that noise is the massive fans that keep the temperature of the equipment fairly close to ambient. If you quiet down the fans, the room temperature won't change much but power-hungry components will suddenly be way, way above room temperature. I had a really crappy cabinet crammed with back-to-back disks, didn't think much of it until they started dying... checked the SMART data, oh, 75C for the top drive... that's 50C or so above the ambient temperature in the room. Better cabinet with more space, more and bigger fans, now it's down to 40-45C. It's not to "pamper" the hardware that they do it, it's to do it quietly. If you don't care that your gaming machine sounds like a jet engine taking off, there's no problem.

Re:Feelin' HOT HOT HOT (0)

Anonymous Coward | more than 2 years ago | (#36834824)

Something about all those drives being packed in there like hot metal sardines gives me a bad feeling...

apparently it is not an issue as their blogpost [backblaze.com] says:

We monitor the temperature of every drive in our datacenter through the standard SMART interface, and we’ve observed in the past three years that: 1) hard drives in pods in the top of racks run three degrees warmer on average than pods in the lower shelves; 2) drives in the center of the pod run five degrees warmer than those on the perimeter; 3) pods do not need all six fans—the drives maintain the recommended operating temperature with as few as two fans; and 4) heat doesn’t correlate with drive failure (at least in the ranges seen in storage pods).

Re:Feelin' HOT HOT HOT (1)

gweihir (88907) | more than 2 years ago | (#36835160)

Thermal design is highly non-intuitive. So you experiment, measure, and have monitoring and automated emergency shutdown in place. You do not even need fan monitoring with this setup. Just very simple disk-temperature monitoring will tell you when a fan is down. My guess would be that they can tolerate one fan failure for some time and do a forced shutdown if two go down.

This is for experienced engineers. I have done things like this before, and I think I could design both hardware and software for these boxes. It is not magic, just solid engineering with a solid understanding of the problems involved.

This is a huge step forward (3, Funny)

mugurel (1424497) | more than 2 years ago | (#36834338)

for both internet security and privacy: each of us can now store his own local copy of the internet and surf offline!

Re:This is a huge step forward (1)

AmberBlackCat (829689) | more than 2 years ago | (#36835028)

That would actually be nice. If every site I ever went to was cached locally. Like having a browser cache with unlimited size. It would be miles better than archive.org, if you remember a site from years ago and wish you could go back. Even better if it prefetched links you never clicked on.

Re:This is a huge step forward (1)

demonbug (309515) | more than 2 years ago | (#36835086)

for both internet security and privacy: each of us can now store his own local copy of the internet and surf offline!

Of course, with my 150GB/month bandwidth cap it is going to take ~70 years to fill it up...

But you cant use it without getting too hot? (1)

drolli (522659) | more than 2 years ago | (#36834380)

Or can somebody tell me if the cooling of the HDs is ok if they are stacked like in the picture?

Re:But you cant use it without getting too hot? (1)

gweihir (88907) | more than 2 years ago | (#36835122)

First, it depends on airflow. That is pretty close to optimal in the design. Second, you can monitor disk temperature and even have an emergency slowdown or shut-off if they overheat. Monitoring and shut-down is easy to script, maybe half a day if you know what you are doing.
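A minimal sketch of the kind of disk-temperature watchdog described here, assuming Linux with smartmontools installed; the 55C cutoff, the device glob and the shutdown action are placeholders that would need tuning for a real pod:

<ecode>
import glob
import subprocess
import time

SHUTDOWN_AT_C = 55      # hypothetical cutoff; pick from the drive's spec sheet
POLL_SECONDS = 300

def drive_temp_c(device):
    """Return the drive temperature from `smartctl -A`, or None if not reported."""
    out = subprocess.run(["smartctl", "-A", device],
                         capture_output=True, text=True).stdout
    for line in out.splitlines():
        # Attribute 194 on most drives; some report 190 Airflow_Temperature_Cel.
        if "Temperature_Celsius" in line or "Airflow_Temperature_Cel" in line:
            return int(line.split()[9])     # raw-value column of the SMART table
    return None

while True:
    temps = {dev: drive_temp_c(dev) for dev in sorted(glob.glob("/dev/sd?"))}
    hot = {dev: t for dev, t in temps.items() if t is not None and t >= SHUTDOWN_AT_C}
    if hot:
        # A dead fan shows up here as several drives creeping up together.
        print("overheating:", hot)
        subprocess.run(["shutdown", "-h", "now"])
    time.sleep(POLL_SECONDS)
</ecode>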

Re:But you cant use it without getting too hot? (3, Interesting)

demonbug (309515) | more than 2 years ago | (#36835272)

Or can somebody tell me if the cooling of the HDs is ok if they are stacked like in the picture?

According to their blog post about it, they see a variation of ~5 degrees within a unit (middle drives to outside drives) and about 2 degrees from the lowest unit in a rack to the highest. They also indicate that the drives stay within the spec operating temperature range with only two of the six fans in each chassis running.

Keep in mind these are 5400 RPM drives, not the 10K+ drives you would expect in an application where performance is critical. These are designed for one thing - lots of storage, cheap. No real worries about access times, IOPS, or a lot of the other performance measures that a more flexible storage solution would need to be concerned with. These are for backup only - nice large chunks of data written and (hopefully) never looked at again.

Can't actually store 135TB of data (4, Interesting)

gman003 (1693318) | more than 2 years ago | (#36834390)

The article says it uses RAID 6 - 45 hard drives are in the pod, grouped into arrays of 15 that use RAID 6 (the groups being combined by logical volumes), which gives you an actual data capacity of 39TB per group (3TB * (15 - 2) = 39TB), which then becomes 117TB usable space (39TB * 3 = 117TB). The 135TB figure is what it would be if you used RAID 1, or just used them as normal drives (45 * 3TB = 135TB).

And these are all "manufacturer's terabytes", which is probably 1,024,000,000,000 bytes per terabyte instead of 1,099,511,627,776 (2^40) bytes per terabyte like it should be. So it's a mere 108 terabytes, assuming you use the standard power-of-two terabyte ("tebibyte", if you prefer that stupid-sounding term).
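For reference, the arithmetic above works out like this, using the decimal (10^12-byte) terabyte that the drive makers actually advertise (see the corrections further down the thread):

<ecode>
DRIVES = 45
DRIVE_TB = 3              # advertised capacity, decimal terabytes (10**12 bytes)
GROUPS = 3                # three RAID 6 groups of 15 drives each
PARITY_PER_GROUP = 2      # RAID 6 spends two drives per group on parity

raw_tb = DRIVES * DRIVE_TB                                             # 135 "marketing" TB
usable_tb = GROUPS * (DRIVES // GROUPS - PARITY_PER_GROUP) * DRIVE_TB  # 117 TB after parity
usable_tib = usable_tb * 10**12 / 2**40                                # ~106.4 TiB (binary)

print(raw_tb, usable_tb, round(usable_tib, 1))                         # 135 117 106.4
</ecode>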

Re:Can't actually store 135TB of data (2, Informative)

GameboyRMH (1153867) | more than 2 years ago | (#36834418)

A manufacturer's terabyte would be 1,000,000,000,000 bytes.

Re:Can't actually store 135TB of data (-1)

gman003 (1693318) | more than 2 years ago | (#36834514)

No. Most manufacturers define the terms as 1024 bytes per kilobyte, 1000 kilobytes per megabyte, 1000 megabytes per gigabyte, and 1000 gigabytes per terabyte. Which gets really confusing sometimes - they can't even stay consistent within their own system.

I haven't checked how Hitachi does it, but that's how Seagate and Western Digital do it. I would assume Hitachi marks them the same way.

Re:Can't actually store 135TB of data (1)

Wildclaw (15718) | more than 2 years ago | (#36834704)

I haven't checked how Hitachi does it, but that's how Seagate and Western Digital do it.

Bullshit. Neither of my new 2TB Western Digital disks comes with 2048*10^9 bytes of storage space.

Mod parent -1, Idiot (0)

Anonymous Coward | more than 2 years ago | (#36834786)

No. most manufacturers define the terms as 1024 bytes per kilobyte, 1000 kilobytes per megabyte, 1000 megabytes per gigabyte, and 1000 gigabytes per terabyte. Which gets really confusing sometimes - they can't even stay consistent within their own system.

I haven't checked how Hitachi does it, but that's how Seagate and Western Digital do it. I would assume Hitachi marks them the same way.

No, actually, you're completely wrong.

Hitachi [hitachigst.com] (click Specifications):

Capacity - One GB is equal to one billion bytes and one TB equals 1,000GB (one trillion bytes) when referring to hard drive capacity.

Seagate [seagate.com]:

When referring to hard drive capacity, one gigabyte, or GB, equals one billion bytes and one terabyte, or TB, equals one trillion bytes.

Western Digital [wdc.com] (click Specifications):

As used for storage capacity, one megabyte (MB) = one million bytes, one gigabyte (GB) = one billion bytes, and one terabyte (TB) = one trillion bytes.

Some floppies use hybrid measurements, but hard drives have been entirely powers of ten for ages.

Re:Can't actually store 135TB of data (4, Informative)

Kjella (173770) | more than 2 years ago | (#36834794)

Hitachi:
"Capacity - One GB is equal to one billion bytes and one TB equals 1,000GB (one trillion bytes) when referring to hard drive capacity."

Western Digital:
"As used for storage capacity, one megabyte (MB) = one million bytes, one gigabyte (GB) = one billion bytes, and one terabyte (TB) = one trillion bytes."

Seagate (PDF product sheets):
"When referring to hard drive capacity, one gigabyte, or GB, equals one billion bytes and one terabyte, or TB, equals one trillion bytes."

So no, no and more no. Sometimes there really should be a "-1, Wrong" moderation...

Re:Can't actually store 135TB of data (-1, Troll)

OverlordQ (264228) | more than 2 years ago | (#36834538)

And these are all "manufacturer's terabytes", which is probably 1,024,000,000,000 bytes per terabyte instead of 1,099,511,627,776 (2^40) bytes per terabyte like it should be.

No it shouldn't be. Stop bastardizing the SI prefixes. Terra is the prefix for 10^12. 135 TerraBytes would therefore be 135 000 000 000 000 bytes

Re:Can't actually store 135TB of data (3, Informative)

gman003 (1693318) | more than 2 years ago | (#36834658)

Common usage for the past 50 years has been that, in the context of computer memory capacity, "tera-" is to be interpreted as 2^40 (with "giga-" being 2^30, and so on). You'll note that I included a sidenote on "tebibytes" to appease revisionists like you.

PS: It's rather ironic that someone accusing me of bastardizing SI prefixes can't even spell 'terabytes" properly. Unless you're somehow referring to Earth Bytes or something.

Re:Can't actually store 135TB of data (0)

OverlordQ (264228) | more than 2 years ago | (#36834834)

Just because it's been used that way in the past shouldn't be justification for continuing to bastardize it.

Re:Can't actually store 135TB of data (0)

Anonymous Coward | more than 2 years ago | (#36835082)

I'm going to bypass the whole debate and use Earth Bytes from now on.

Re:Can't actually store 135TB of data (0)

Anonymous Coward | more than 2 years ago | (#36835310)

The metric system is way too clean as it is now, Americans need to bastardize it a little so they don't look too silly for not using it. To further help the cause make sure you act as if SI is case insensitive, and write km, Km and KM interchangeably.

Re:Can't actually store 135TB of data (0)

Anonymous Coward | more than 2 years ago | (#36835062)

Yeah, and from your Internet provider, is a megabit 1,048,576 bits? (Answer: no, it is not.)

Re:Can't actually store 135TB of data (0)

Anonymous Coward | more than 2 years ago | (#36835400)

Since your argument is based on "common usage in the past 50 years" instead of arguing directly that overloading these prefixes is a good idea, I've been thinking... what is the minimum x such that "common usage in the past x years" is enough to justify doing something that is not directly justified?

Re:Can't actually store 135TB of data (2)

Just Some Guy (3352) | more than 2 years ago | (#36835010)

Stop bastardizing the SI prefixes. Terra is the prefix

The irony: it is strong with this one.

Re:Can't actually store 135TB of data (1)

Inda (580031) | more than 2 years ago | (#36834746)

Tell me about it!!!

£4,561.68 still sounds like a steal. In fact, I might just steal one and save even more!

I actually spend more than that on food for the family per year. I wonder...

Re:Can't actually store 135TB of data (1)

FreeBSDbigot (162899) | more than 2 years ago | (#36834822)

The 135TB figure is what it would be if you used RAID 1

Actually, RAID 1 (mirroring) would cut the usable space in half; RAID 0 (striping) would keep it at 135TB.

Re:Can't actually store 135TB of data (1)

complete loony (663508) | more than 2 years ago | (#36834860)

Data is also duplicated across different pods so you can lose one due to power supply issues and not care for a while. RAID across local groups of disks does seem a bit pointless when you already have a layer of redundancy across the whole rack.

Re:Can't actually store 135TB of data (1)

ari_j (90255) | more than 2 years ago | (#36835154)

Just a small quibble: RAID level 1 would give you a capacity of 3TB with an absurd amount of redundancy. Level 0 is the one that would give you 135TB striped across all 45 disks.

Re:Can't actually store 135TB of data (0)

Anonymous Coward | more than 2 years ago | (#36835304)

RAID 1 is mirroring; that would result in 3*45/2 = 67.5 TB.

You probably meant RAID 0, striping, which would use ALL the space for user data and speed up transfers significantly compared to any other RAID level.

Deja Vooooooo.... (0)

Anonymous Coward | more than 2 years ago | (#36834408)

Didn't we cover this story a couple of years ago with smaller drives?

Re:Deja Vooooooo.... (1)

cmiller173 (641510) | more than 2 years ago | (#36835196)

Didn't the summary say so, and provide a link to the previous story? Of course, in addition to the drives getting bigger they changed a couple other things (MB, memory, CPU, SATA cards, SATA multipliers, wiring), but it is the same case so sure, it's the same.

The price is too high.. (1)

adamjcoon (1583361) | more than 2 years ago | (#36834416)

You can buy 68 internal drives (2TB each) for the low price of $5439.32 http://www.newegg.com/Product/Product.aspx?Item=N82E16822152245 [newegg.com] I'm not a hardware expert, but I imagine you could connect them somehow for less than $1944.68.. ($7384 - $5439.32)

Re:The price is too high.. (1)

h4rr4r (612664) | more than 2 years ago | (#36834464)

$2000 to connect 68 drives seems crazy cheap. A good raid controller can cost more than that.

Re:The price is too high.. (2)

tomz16 (992375) | more than 2 years ago | (#36834516)

Nope, not at all... $2,000 is actually really cheap IMHO. Try to find a way to connect 68 drives cheaply (RAID cards and SATA multiplier backplanes are both pretty expensive). Don't forget that you also need a custom case, motherboard, ram, cpu, PS, and cooling for everything.

Re:The price is too high.. (0)

Anonymous Coward | more than 2 years ago | (#36834540)

"I'm not a hardware expert"

If you hadn't told me, I'd have never known.

Re:The price is too high.. (1)

gman003 (1693318) | more than 2 years ago | (#36834556)

Yes, 2TB drives are more cost-effective (price per terabyte) than the 3TB drives. But one of the major costs for Backblaze is power and space. They pay about $2,000 per month per rack in space rental, power and bandwidth, regardless of whether that rack is using 3TB drives or 300GB drives. So the difference in hardware costs is paid back by the increased density.

Re:The price is too high.. (1)

hjf (703092) | more than 2 years ago | (#36834588)

The price also includes custom made cases, fans, the power supplies, and custom-made port multiplier SATA backplanes. The custom parts make it pretty expensive, I guess.

Re:The price is too high.. (1)

b0bby (201198) | more than 2 years ago | (#36834610)

The 3TB drives are $6300 for 45 at newegg, and you'll need less cases/space/power for them - it's probably a wash in the end.

Re:The price is too high.. (0)

Anonymous Coward | more than 2 years ago | (#36834620)

Cost it all up and add the additional PSUs and controllers for 68 drives. We'll wait... let's all tell people how to do something they are already doing.

Re:The price is too high.. (0)

Anonymous Coward | more than 2 years ago | (#36834644)

I'm not a hardware expert, but I imagine you could connect them somehow for less than $1944.68.. ($7384 - $5439.32)

Not that easily. Besides a strong power supply and a big case, you need enough ports.
Either you use one mainboard + processor + RAM + case for each set, which means 7 computers with 10 ports each, leaving $300 per machine.

Or you need a mainboard that can take e.g. 5 12-port SATA controllers (which are normally PCIe x8; good luck finding a board with that many slots).
Here you are strictly in server territory, and that means an immediate 10x price :-) compared to what you are used to.

Re:The price is too high.. (2)

kiwimate (458274) | more than 2 years ago | (#36834876)

You might want to read the actual blog [backblaze.com] where they explain what they use in a bit of detail. This isn't my area of expertise either, but I do know that running 10 servers is very different from running 100 servers, which is also different from running 1000 servers. There are many questions that crop up that you really don't have to consider when you're down in the smaller arenas. (E.g. patch management - manually patching 10 servers is feasible and more cost effective than having an OTS solution; manually patching 1000 servers, not so much.)

They do also state at the outset:

In this post, we'll share how to make a 2.0 storage pod, and you're welcome to use the design. We'll also share some of our secrets from the last three years of deploying more than 16 petabytes worth of Backblaze storage pods. As before, our hope is that others can benefit from this information and help us refine the pods.

My reading - they definitely know more about this than I do, and they're not too proud to admit there could be lessons they can learn from the community.

RAID-6 (1)

hjf (703092) | more than 2 years ago | (#36834506)

RAID-6, really?
After 5+ years working with ZFS, personally, I wouldn't touch md/extX/xfs/btrfs/whatever with a 10 foot pole. Solaris pretty much sucks (OpenSolaris is dead and the open source spinoffs are a joke), but for a storage backend it's years ahead of Linux/BSD.

Sure, you can run ZFS on Linux (I did) and FreeBSD (I do), but for huge amounts of serious data? No thanks.

Re:RAID-6 (1)

TheRaven64 (641858) | more than 2 years ago | (#36834866)

Sure, you can run ZFS on Linux (I did) and FreeBSD (I do), but for huge amounts of serious data? No thanks.

What do you count as a serious amount of data? And what makes the FreeBSD version inferior in your opinion (aside from being a slightly older version - I think -STABLE now has the latest OpenSolaris release)?

Genuinely curious: I'm thinking of building a FreeBSD/ZFS NAS and I'd like to know if there's anything in particular that I need to look out for. Performance isn't really important, because most of the time I'll be accessing it over WiFi anyway, which is likely to be far more of a bottleneck than anything else. I'm planning on using 3 2TB disks, for 4TB of storage space in a RAID-Z configuration.
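For what it's worth, the RAID-Z pool described here is essentially a one-liner once the disks are visible. A minimal sketch, assuming FreeBSD with ZFS support loaded; the pool name "tank" and the ada0-ada2 device names are placeholders:

<ecode>
import subprocess

DISKS = ["ada0", "ada1", "ada2"]   # placeholder device names; check `camcontrol devlist`

# One RAID-Z vdev across three 2TB disks: one disk's worth of parity,
# roughly 4TB usable, and any single disk can fail without data loss.
subprocess.run(["zpool", "create", "tank", "raidz", *DISKS], check=True)

# Confirm the layout and the usable space ZFS reports.
subprocess.run(["zpool", "status", "tank"], check=True)
subprocess.run(["zfs", "list", "tank"], check=True)
</ecode>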

maintain the earth's population at 500,000 (-1)

Anonymous Coward | more than 2 years ago | (#36834510)

from the georgia stone freemasons' bible. it could become difficult to be any one of us several billion unchosens? who gets to decide?

no gadgets required

should it not be considered that the domestic threats to all of us/our
freedoms be intervened on/removed, so we wouldn't be compelled to hide our
sentiments, &/or the truth, about ANYTHING, including the origins of the
hymenology council, & their sacred mission? with nothing left to hide,
there'd be room for so much more genuine quantifiable progress?

you call this 'weather'? much of our land masses are going under
water, or burning up, as we fail to consider anything at all that really
matters, as we've been instructed that we must maintain our silence (our
last valid right?), to continue our 'safety' from... mounting terror.

meanwhile, back at the raunch; there are exceptions? the unmentionable
sociopath weapons peddlers are thriving in these times of worldwide
sufferance? the royals? our self appointed murderous neogod rulers? all
better than ok, thank..... us. their stipends/egos/disguises are secure,
so we'll all be ok/not killed by mistaken changes in the MANufactured
'weather', or being one of the unchosen 'too many' of us, etc...?

truth telling & disarming are the only mathematically & spiritually
correct options. read the teepeeleaks etchings. see you there?

diaperleaks group worldwide. thanks for your increasing awareness?

Anything over 2TB should be ZFS... (2)

QuietLagoon (813062) | more than 2 years ago | (#36834544)

... if you really care about the data. ZFS has far more built-in data integrity checks, and more extensive ones [oracle.com], than vanilla RAID6 arrays.

Both FreeBSD [freebsd.org] and FreeNAS [freenas.org], in addition to OpenSolaris [opensolaris.org], support ZFS.

Yes. (0)

Anonymous Coward | more than 2 years ago | (#36834730)

...And actually useful snapshot capabilities. And utilities so easy to use, even your freakin' grandma can sling about storage pools.

7K for software raid? and why a low end cpu? (1)

Joe_Dragon (2206452) | more than 2 years ago | (#36834642)

Why not use a SAS card?
Why have three PCIe cards that are only x1 when an x4 or better card with more ports has more PCIe bandwidth, and some even have their own RAID CPU on them?

Why use a low-end i3 CPU in a $7K system? At least go to an i5, even more so with software RAID.

Re:7K for software raid? and why a low end cpu? (4, Informative)

drinkypoo (153816) | more than 2 years ago | (#36834722)

Hardware RAID controllers are stupid in this context. The only place they make sense is in a workstation, where you want your CPU for doing work, and if the controller dies you restore from backups or just reinstall. Using software RAID means never having to hunt for rebuilder software to convert the RAID from one format to another because the old controller isn't available any more, or because you can't get one when you really need one to get that project data out so you can ship and bill.

Re:7K for software raid? and why a low end cpu? (3, Insightful)

gman003 (1693318) | more than 2 years ago | (#36834756)

Because, for this project, raw storage capacity is much more important than performance. Besides, they claim their main bottleneck is the gigabit Ethernet interface - even the software RAID, the PCIe x1 cards, and the raw drive performance are less of a limiting factor.

Yeah, in a situation where you need high I/O performance, this design would be less than ideal. But they don't - they're providing backup storage. They don't need heavy write performance, they don't need heavy read performance. They just need to put a lot of data on a disk and not break anything.

PS: SAS doesn't really provide much better performance than SATA, and it's a lot more expensive. Same for hardware RAID - using those would easily octuple the cost of the entire system.

Re:7K for software raid? and why a low end cpu? (1)

gweihir (88907) | more than 2 years ago | (#36835090)

Very simple: best bang for the buck. Your approach just increases cost without any real benefit in the target usage scenario. For example, the i5 is just a waste of money and energy. Hardware RAID drives up cost, but the only "advantage" it has is that it is easier to use for clueless people.

Backblaze is speaking about scalability in SF (3, Informative)

Jim Ethanol (613572) | more than 2 years ago | (#36834664)

If you're in the SF Bay Area check out http://geeksessions.com/ [geeksessions.com] where Gleb Budman from Backblaze will be speaking about the Storage Pod and their approach to Network & Infrastructure scalability along with engineers from Zynga, Yahoo!, and Boundary. This event will also have a live stream on geeksessions.com.

Full Disclosure: This is my event.

50% discount to the event (about $8 and free beer) for the Slashdot crowd here: http://gs22.eventbrite.com/?discount=slashdot [eventbrite.com]

Re:Backblaze is speaking about scalability in SF (2)

gpuk (712102) | more than 2 years ago | (#36835032)

Hi Jim

I'm quite a few timezones East of you, meaning the live stream will start at 0300 local on Wednesday for me. I'm willing to tough it out and stay up to watch it if necessary but it would be much more civilised if I could watch a playback. Will it be available for download later or is it live only?

It sucks I've only just learnt about geeksessions :( Some of your earlier events look awesome

Re:Backblaze is speaking about scalability in SF (0)

Anonymous Coward | more than 2 years ago | (#36835340)

Are we allowed to throw tomatoes, eggs, or dead babies at the Zynga reps? You'd get a lot more attendees, I guarantee it.

Original blog post (5, Informative)

Baloroth (2370816) | more than 2 years ago | (#36834708)

Here [backblaze.com] is a link to Backblaze's actual blog entry for the new 135TB pods, and here [backblaze.com] is the one for the original 67TB pods. The blog article is actually quite fascinating. Apparently they are employee owned, use entirely off-the-shelf parts (except for the case, it looks like), and recommend Hitachi drives (Deskstar 5K3000 HDS5C3030ALA630) as having the lowest failure rate of any manufacturer (less than 1%, they say).

I found it kinda amusing that ext4's 16TB volume limit was an "issue" for them. Not because it's surprising, but because... well, it's 16TB. The whole blog post is actually recommended reading for anyone looking to build their own data pods like this. It really does a good job showing their personal experience in the field and the problems/non-problems they have. For instance: apparently heat isn't an issue, as 2 fans are able to keep an entire pod within the recommended temperature (although they actually use 6). It'll be interesting to see what happens as some of their pods get older, as I suspect that their failure rate will get pretty high fairly soon (their oldest drives are currently 4 years old; I expect when they hit 5-6 years failures will start becoming much more common). All in all, pretty cool. Oh, and it shows how much Amazon/Dell price gouge, but that shouldn't really shock anyone. Except the amount. A petabyte for three years is $94,000 with Backblaze, and $2,466,000 with Amazon.

P.S. I suspect they use ext4 over ZFS because ZFS, despite the built in data checks, isn't mature enough for them yet. They mention they used to use JFS before switching to ext4, so I suspect they have done some pretty extensive checking on this.

Re:Original blog post (1)

cgfsd (1238866) | more than 2 years ago | (#36835054)

This reminds me of the Sun 4500, which held 48 drives, based on an AMD processor running Solaris x86 and ZFS.

The overall concept is great, but in practice replacing bad drives was a pain.

When I asked the Sun rep about replacing the drives, he said about once a year or when you get about half a dozen drives failed, power down the system, pull it out of the rack and replace the failed drives.

Would I store critical data on something like this? Hell no. You get what you pay for.

Re:Original blog post (1)

Baloroth (2370816) | more than 2 years ago | (#36835384)

They mention that they have one guy dedicated to building new pods and replacing old drives. Out of ~9000 drives and ~200 pods, they replace ~10 drives per week, and with the RAID6 data redundancy the chance of losing data is absolutely minimal. RAID6 uses 2 drives for data parity, so I believe you would need 3 drives out of 45 to fail within a week to actually lose data. I suspect they would shut a pod down if 2 drives in it failed at the same time. Since the failure rate, including infant mortality, is only ~5 percent per year per drive, the chances of even that happening are pretty tiny. I'm not sure what brand of drives the Sun 4500 uses, but 6 a year sounds like a lot. I'm guessing this is considerably more reliable. All in all, because they have a person dedicated to maintaining the system on a weekly basis, this seems like it wouldn't even be all that bad for critical data. I wouldn't make it your only copy (fires/storms do happen) but it definitely seems reliable as an offsite backup.
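That "pretty tiny" is easy to sanity-check with a rough binomial estimate, assuming the ~5%/year failure figure, independent failures, and a one-week window before a failed drive is replaced and rebuilt (all simplifications; real failures cluster):

<ecode>
from math import comb

ANNUAL_FAIL = 0.05
WEEKS_PER_YEAR = 52
p = ANNUAL_FAIL / WEEKS_PER_YEAR   # chance a given drive dies in a given week

# Fleet-wide: ~9000 drives * 5%/year is ~450 failures/year, i.e. 8-9 per week,
# which matches the "~10 drives per week" replacement rate quoted above.
print(9000 * ANNUAL_FAIL / WEEKS_PER_YEAR)

# Chance that 3 or more drives in one 15-drive RAID 6 group die in the same
# week (enough to lose that group's data before a rebuild finishes).
n = 15
p_group_loss = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(3, n + 1))
print(p_group_loss)                # on the order of 4e-7 per group per week
</ecode>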

... but you can't use it (2)

savanik (1090193) | more than 2 years ago | (#36834810)

With the latest bandwidth caps I'm seeing on my provider (AT&T U-verse), I can download data at a rate of 250 GB per month. So it'll take me 45 YEARS to fill up that 135 TB array. Something tells me they'll have better storage solutions by then.

In the meantime, I'm just waiting for Google to roll out the high-speed internet in my locale next year - maybe then I'll have a chance at filling up my current file server.
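The 45-year figure checks out, assuming decimal units and the whole cap going to filling the array:

<ecode>
ARRAY_TB = 135
CAP_GB_PER_MONTH = 250

months = ARRAY_TB * 1000 / CAP_GB_PER_MONTH   # 540 months
print(months / 12)                            # 45.0 years
# (The 150 GB/month cap mentioned earlier in the thread works out to 75 years.)
</ecode>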

Re:... but you can't use it (1)

GodfatherofSoul (174979) | more than 2 years ago | (#36834990)

Crazy enough, you can actually *buy* content instead of downloading it from Pirate Bay.

Re:... but you can't use it (1)

Anonymous Coward | more than 2 years ago | (#36835162)

And, crazily enough, that gets counted against your download cap too!

Unless you're talking about buying the physical media and ripping it yourself, in which case... congratulations! That's completely irrelevant to his complaint!

Re:... but you can't use it (0)

Anonymous Coward | more than 2 years ago | (#36835248)

Or even create it! What a concept! Start a business as a wedding photographer or videographer. If you download a few hours of raw HD DV for post production work (say, in Cinelerra), you're going to need a TON of storage.

Meh (0)

Anonymous Coward | more than 2 years ago | (#36834992)

Not really that useful for any data that needs to be accessible 100% of the time. The drives do not look to be hot-swappable and there is no redundancy anywhere in the design.
Even with all those RAID groups, with a single processor read/write times are going to be hideous. Also, not knowing about the software, your volumes/aggregates may be limited to a single RAID group, which limits the usefulness.

Yeah, it's a cheap solution, but its usefulness in a production or backup environment is limited. There are storage providers out there with systems at price points not much higher than this that aren't as unreliable.

Engineering competence does give an edge (1)

gweihir (88907) | more than 2 years ago | (#36835068)

I did something a bit similar on a smaller scale about 9 years ago (Linux software RAID, 12 disks in a cheap server). The trick is to make sure that you pay something like 70% of the total hardware cost for the disks. It is possible, and it can be done reliably, but you have to know what you are doing. If you are not a competent and enterprising engineer, forget it (or become one). But the largest cost driver in storage is that people want to buy storage pre-configured and in a box that they do not need to understand. This is not only very expensive (when I researched this 9 years ago, the disks' share of the total price was sometimes as low as 15%!), but gives you lower performance and lower reliability. And also less flexibility.

Re:Engineering competence does give an edge (2)

Walker1337 (2400896) | more than 2 years ago | (#36835130)

But the largest cost driver in storage is that people want to buy storage pre-configured and in a box that they do not need to understand. This is not only very expensive, (when I researched this 9 years ago, disk part of total price was sometimes as low as 15%!), but gives you lower performance and lower reliability. And also less flexibility.

You ain't kidding. I have installed systems for people that cost hundreds of thousands of dollars and they can't even give me basic information in order to complete the install. How many disks to each head? No idea. How big do you want your RAID groups? No idea. Excuse me sir, this IP and gateway are in different subnets, can I have another? That last one has actually happened more than once.

ME WANT!! (0)

Anonymous Coward | more than 2 years ago | (#36835266)

I can't imagine who has a need for such a ridiculous amount of storage, but nevertheless...

ME WANT!

After all, "640K ought to be enough for anybody"...okay, he was talking about memory, but still...

*Sigh* (goes back to tinkering with 3 TB RAID array/server)
