
Inside Facebook's Infrastructure

CmdrTaco posted more than 4 years ago | from the when-it-crashes-it-burns dept.

Social Networks 77

miller60 writes "Facebook served up 690 billion page views to its 540 million users in August, according to data from Google's DoubleClick. How does it manage that massive amount of traffic? Data Center Knowledge has put together a guide to the infrastructure powering Facebook, with details on the size and location of its data centers, its use of open source software, and its dispute with Greenpeace over energy sourcing for its newest server farm. There are also links to technical presentations by Facebook staff, including a 2009 technical presentation on memcached by CEO Mark Zuckerberg."

funny technical presentation (0)

Anonymous Coward | more than 4 years ago | (#33745808)

you got me there. i shall seek revenge.

Re:funny technical presentation (1)

jra (5600) | more than 4 years ago | (#33746300)

Amusingly, I read that as "a technical presentation on mendacity by Mark Zuckerberg".

Environmentalist (3, Interesting)

AnonymousClown (1788472) | more than 4 years ago | (#33745812)

I support environmental causes (Sierra Club and others), but I for one will not support Greenpeace, and I don't think they are credible. They use violence to get their message out, and their founder is now a corporate consultant who shows companies how to get around environmental laws and pollute.

That's all.

Re:Environmentalist (2, Funny)

bsDaemon (87307) | more than 4 years ago | (#33745856)

Yeah, but if they're against Facebook, they can't be all bad. Sort of like the Mafia vs Castro, right?

Re:Environmentalist (1)

Rogerborg (306625) | more than 4 years ago | (#33746320)

Funny, I was just thinking that if Greenpeace are against Facebook, that means Facebook can't be all bad.

Then I remembered that we're not in Bushville Kiddy Matinee Land, and that it could easily be all-bad vs all-bad. Best result is that they attack facebook's data centre with clockwork chainsaws, and it falls on them and crushes them into Soylent Green.

Re:Environmentalist (1)

orient (535927) | more than 4 years ago | (#33749700)

Sort of like the Mafia vs Castro, right?

Right, Castro seems to be the good guy here.

Re:Environmentalist (1)

tehcyder (746570) | more than 4 years ago | (#33748132)

They use violence to get their message out

Examples please.

Sorry, I mean [citation needed]

Facebook ID (3, Funny)

Thanshin (1188877) | more than 4 years ago | (#33745828)

It's time to invent the Facebook Identity card.

You can't remember your passport number? No worries, your Facebook Identity card will say who you are. And how many friends you've got. And the name of your pet. And whether you went to the bathroom at your usual time that morning. And what kind of men you find attractive.

Semper Facebook Identity!

Re:Facebook ID (1)

jon42689 (1098973) | more than 4 years ago | (#33745872)

I threw up in my mouth a little bit upon reading this. God save us.

Re:Facebook ID (1)

PFactor (135319) | more than 4 years ago | (#33746010)

As a former Marine I'm afraid I'm going to have to "liberate" you for your perversion of "Semper Fi".

Re:Facebook ID (1)

tcopeland (32225) | more than 4 years ago | (#33746148)

> As a former Marine I'm afraid I'm going to have to "liberate" you

I think the nomenclature these days is that he's a target that needs to be "serviced".

Re:Facebook ID (1)

Mongoose Disciple (722373) | more than 4 years ago | (#33746450)

When you put it that way, it sounds like something that would run afoul of DADT.

Re:Facebook ID (0)

Anonymous Coward | more than 4 years ago | (#33746382)

Semper my shorts.

Re:Facebook ID (0)

Anonymous Coward | more than 4 years ago | (#33746670)

Semper my shorts.

Be very careful what you wish for.

Re:Facebook ID (0)

Anonymous Coward | more than 4 years ago | (#33746882)

Semper, Latin for 'always'.

Always shorts are less than desirable?

Re:Facebook ID (0)

Anonymous Coward | more than 4 years ago | (#33746586)

As a former Marine I'm afraid I'm going to have to "liberate" you for your perversion of "Semper Fi".

Well, as a not-former Marine (nor a Marine, full stop), I'm afraid you must learn that US Marines did not invent, nor do they have exclusive rights to, the Latin word "semper", which predates them by some millennia.

Re:Facebook ID (0)

Anonymous Coward | more than 4 years ago | (#33748202)

Yeah, just ask Crazy Joe Davola.

Sic Semper Tyrannis!

Re:Facebook ID (1)

alphax45 (675119) | more than 4 years ago | (#33746012)

Facebook only knows what YOU decide to tell them. They can't (yet?) read your mind. As long as you’re smart about what data you decide to give them it is a great tool to keep in touch with friends and in a lot of cases family. There are privacy settings (although not always easy to find/use) that allow you to control who can see your data. I just set mine to “Friends Only” (one button now) and I only friend people I know/trust. I don't see why people always say Facebook knows everything about everyone. They know what people decide to tell them and nothing more.

Re:Facebook ID (0)

Anonymous Coward | more than 4 years ago | (#33746052)

There are privacy settings that allow you to control who can see your data

Cool, can I use them to hide my private info from facebook (the company)?

Re:Facebook ID (3, Interesting)

rtaylor (70602) | more than 4 years ago | (#33746104)

Facebook knows anything about you that 3rd parties (friends, family, etc.) might tell them, too.

I didn't create an account or provide any information to facebook; yet there are bits and pieces of information on it about me.

Re:Facebook ID (2, Interesting)

tophermeyer (1573841) | more than 4 years ago | (#33747048)

One of the reasons keeping me from deleting my facebook account is that having it active allows me to untag myself from all the pictures that I wish my friends would stop making public. If I didn't have an account they could link to, my name would just sit on the picture for anyone to see.

Re:Facebook ID (1)

mlts (1038732) | more than 4 years ago | (#33746412)

I follow four rules for FB:

First, I assume anything I put on FB will end up in my worst enemy's hands, with the paranoiac fantasy #include option turned on. This means not listing when I'm out on vacation because someone who may be interested in a burglary might be reading, not listing where I work, not listing the exact model of vehicle I have, and so on.

Second, I set permissions so all my stuff is only visible to one group of friends, which I add people to manually. This way, should someone I don't intend accidentally get friended, or should I have to friend someone for political reasons, they will have friend access to essentially nothing.

Third, every 4-6 months, I go through my FB profile and delete stuff. No, it isn't deleted completely, as I am sure that FB keeps snapshots as well as changelogs, but it doesn't hurt to prune information from there.

Fourth, it is good to check the permissions on FB privacy settings and applications periodically. A lot of people end up with a rogue app on their profile spamming friends with malware-ridden links.

Re:Facebook ID (1)

Cylix (55374) | more than 4 years ago | (#33747798)

I think the only safe rule with applications is to not use them due to their sloppy policies with information.

At least that is my rule anyway.

Re:Facebook ID (1)

Archangel Michael (180766) | more than 4 years ago | (#33747660)

Gee thanks, now I gotta pee.

Re:Facebook ID (0)

Anonymous Coward | more than 4 years ago | (#33750786)

Right... How many "friends" you have, if "friend" means "nameless person whom you've never actually met and will never meet".

Slashdotted (2, Informative)

devjoe (88696) | more than 4 years ago | (#33745882)

Maybe Data Center Knowledge should put some of that knowledge to work, as the article is slashdotted after only 5 comments.

Re:Slashdotted (0)

Anonymous Coward | more than 4 years ago | (#33746016)

Agreed :(

Re:Slashdotted (1)

L4t3r4lu5 (1216702) | more than 4 years ago | (#33746272)

I tried to Coral Cache it, but their stupid redirect from the "human friendly" URL just resulted in a "Resource not found" error.

More fool them.

Slashdotted (1)

daffmeister (602502) | more than 4 years ago | (#33745894)

Looks like Data Center Knowledge could use some of that infrastructure.

First page of Article (1)

ihatejobs (1765190) | more than 4 years ago | (#33745960)

I managed to load the first page of the article before it got slashdotted:

With more than 500 million active users, Facebook is the busiest site on the Internet and has built an extensive infrastructure to support this rapid growth. The social networking site was launched in February 2004, initially out of Facebook founder Mark Zuckerberg’s dorm room at Harvard University and using a single server. The company’s web servers and storage units are now housed in data centers around the country.

Each data center houses thousands of computer servers, which are networked together and linked to the outside world through fiber optic cables. Every time you share information on Facebook, the servers in these data centers receive the information and distribute it to your network of friends.

We’ve written a lot about Facebook’s infrastructure, and have compiled this information into a series of Frequently Asked Questions. Here’s the Facebook Data Center FAQ (or “Everything You Ever Wanted to Know About Facebook’s Data Centers”).

How Big is Facebook’s Internet Infrastructure?
Facebook is currently the world’s most popular web site, with more than 690 billion page views each month, according to metrics from Google’s DoubleClick service. Facebook currently accounts for about 9.5 percent of all Internet traffic, slightly more than Google, according to HitWise.

Facebook requires massive storage infrastructure to house its enormous stockpile of photos, which grows steadily as users add 100 million new photos every day. People share more than 30 billion pieces of content on Facebook each month. In addition, the company’s infrastructure must support platform services for more than 1 million web sites and 550,000 applications using the Facebook Connect platform.

To support that huge activity, Facebook operates at least nine data centers on both coasts of the United States, and is in the process of building its first company-built data center in Oregon. Although more than 70 percent of Facebook’s audience is in other countries, none of the company’s data centers are located outside the United States.

For most of its history, Facebook has managed its infrastructure by leasing “wholesale” data center space from third-party landlords. Wholesale providers build the data center, including the raised-floor technical space and the power and cooling infrastructure, and then lease the completed facility. In the wholesale model, users can occupy their data center space in about five months, rather than the 12 months needed to build a major data center. This has allowed Facebook to scale rapidly to keep pace with the growth of its audience.

In January 2010 Facebook announced plans to build its own data centers, beginning with a facility in Prineville, Oregon. This typically requires a larger up-front investment in construction and equipment, but allows greater customization of power and cooling infrastructure.

Where are Facebook’s Data Centers Located?

Facebook currently leases space in about six different data centers in Silicon Valley, located in Santa Clara and San Jose, and at least one in San Francisco. The company has also leased space in three wholesale data center facilities in Ashburn, Virginia. Both Santa Clara and Ashburn are key data center hubs, where hundreds of fiber networks meet and connect, making them ideal for companies whose content is widely distributed.

Facebook’s first company-built data center is nearing completion in Prineville, Oregon. If Facebook’s growth continues at the current rate, it will likely require a larger network of company-built data centers, as seen with Google, Microsoft, Yahoo and eBay.

How Big Are Facebook’s Server Farms?

A rendering of an aerial view of the Facebook data center in Princeville, Oregon.

As Facebook grows, its data center requirements are growing along with it. The new data center in Oregon is a reflection of this trend.

In the data centers where it currently operates, Facebook typically leases between 2.25 megawatts and 6 megawatts of power capacity, or between 10,000 and 35,000 square feet of space. Due to the importance of power for data centers, most landlords now price deals using power as a yardstick, with megawatts replacing square feet as the primary benchmark for real estate deals.

Facebook’s new data center in Oregon will be much, much larger. The facility was announced as being 147,000 square feet. But as construction got rolling, the company announced plans to add a second phase to the project, which will add another 160,000 square feet. That brings the total size of the Prineville facility to 307,000 square feet of space – larger than two Wal-Mart stores.

Re:First page of Article (1)

ihatejobs (1765190) | more than 4 years ago | (#33745976)

Perhaps they got it up and running again, but here's page two in case it dies:

This chart provides a dramatic visualization of Facebook’s infrastructure growth. It documents the number of servers used to power Facebook’s operations.

“When Facebook first began with a small group of people using it and no photos or videos to display, the entire service could run on a single server,” said Jonathan Heiliger, Facebook’s vice president of technical operations.

Not so anymore. Technical presentations by Facebook staff suggest that as of June 2010 the company was running at least 60,000 servers in its data centers, up from 30,000 in 2009 and 10,000 back in April 2008.

There are companies with more servers (see Who Has the Most Web Servers? for details). But the growth curve shown on the chart doesn’t even include any of the servers that will populate the Oregon data center – which may be the first of multiple data centers Facebook builds to support its growth.

What kind of servers does Facebook use?
Facebook doesn’t often discuss which server vendors it uses. In 2007 it was buying a lot of servers from Rackable (now SGI), and is also known to have purchased servers from Dell, which customizes servers for its largest cloud computing customers.

Facebook VP of Technical Operations Jonathan Heiliger has sometimes been critical of major server vendors’ ability to adapt their products to the needs of huge infrastructures like those at Facebook, which don’t need many of the features designed for complex enterprise computing requirements. “Internet scale” companies can achieve better economics with bare bones servers that are customized for specific workloads.

In a conference earlier this year, Heiliger identified multi-core server vendors Tilera and SeaMicro as “companies to watch” for their potential to provide increased computing horsepower in a compact energy footprint.

But reports that Facebook planned to begin using low-power processors from ARM - which power the iPhone and many other mobile devices - proved to be untrue. “Facebook continuously evaluates and helps develop new technologies we believe will improve the performance, efficiency or reliability of our infrastructure,” Heiliger said. “However, we have no plans to deploy ARM servers in our Prineville, Oregon data center.”

A look at the fully-packed server racks inside a Facebook data center facility.

What kind of software does Facebook Use?
Facebook was developed from the ground up using open source software. The site is written primarily in the PHP programming language and uses a MySQL database infrastructure. To accelerate the site, the Facebook Engineering team developed a program called HipHop to transform PHP source code into C++ and gain performance benefits.

Facebook has one of the largest MySQL database clusters anywhere, and is the world’s largest user of memcached, an open source caching system. Memcached was an important enough part of Facebook’s infrastructure that CEO Mark Zuckerberg gave a tech talk on its usage in 2009.
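
To make the memcached role concrete, here is a minimal sketch of the cache-aside pattern such deployments rely on: check the cache first, fall back to the database on a miss, then repopulate the cache. It uses the third-party pymemcache client and a made-up load_user_from_mysql() helper; it illustrates the pattern only and is not Facebook's actual code.

    import json
    from pymemcache.client.base import Client  # third-party memcached client (assumed here)

    cache = Client(("127.0.0.1", 11211))

    def load_user_from_mysql(user_id):
        # Hypothetical stand-in for a real MySQL query.
        return {"id": user_id, "name": "example"}

    def get_user(user_id, ttl=300):
        key = f"user:{user_id}"
        cached = cache.get(key)
        if cached is not None:
            return json.loads(cached)                 # cache hit: no database round trip
        user = load_user_from_mysql(user_id)          # cache miss: query MySQL
        cache.set(key, json.dumps(user), expire=ttl)  # repopulate so later reads stay cheap
        return user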

Facebook has built a framework that uses RPC (remote procedure calls) to tie together infrastructure services written in any language, running on any platform. Services used in Facebook’s infrastructure include Apache Hadoop, Apache Cassandra, Apache Hive, FlashCache, Scribe, Tornado, Cfengine and Varnish.
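
The cross-language RPC framework Facebook built and open-sourced for this is Thrift (now Apache Thrift), which generates client and server stubs from an interface definition so services written in different languages can call each other. The sketch below is not Thrift; it uses Python's built-in xmlrpc purely to illustrate the idea of exposing a small service over RPC and calling it as if it were a local function (the profile service and its data are invented).

    import threading
    from xmlrpc.server import SimpleXMLRPCServer
    from xmlrpc.client import ServerProxy

    # Hypothetical backend service; any language with an XML-RPC client could call it.
    profiles = {42: {"name": "Alice", "friends": [7, 9]}}

    server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False, allow_none=True)
    server.register_function(lambda uid: profiles.get(uid), "get_profile")
    threading.Thread(target=server.serve_forever, daemon=True).start()

    # Web-tier side: the remote service looks like a local call.
    proxy = ServerProxy("http://localhost:8000", allow_none=True)
    print(proxy.get_profile(42))  # {'name': 'Alice', 'friends': [7, 9]}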

How much Does Facebook Spend on Its Data Centers?
An analysis of Facebook’s spending with data center developers indicates that the company is now paying about $50 million a year to lease data center space, compared to about $20 million when we first analyzed its leases in May 2009.

The $50 million a year figure covers leases only; it doesn’t include the cost of the Prineville project, which has been estimated at between $180 million and $215 million, or Facebook’s substantial investments in server and storage hardware.

Facebook currently leases most of its data center space from four companies: Digital Realty Trust, DuPont Fabros Technology, Fortune Data Centers and CoreSite Realty.

Here’s what we know about Facebook’s spending on its major data center commitments:

Facebook is paying $18.1 million a year for 135,000 square feet of space in data center space it leases from Digital Realty Trust (DLR) in Silicon Valley and Virginia, according to data from the landlord’s June 30 quarterly report to investors.
The social network is also leasing data center space in Ashburn, Virginia from DuPont Fabros Technology (DFT). Although the landlord has not published the details of Facebook’s leases, data on the company’s largest tenants reveals that Facebook represents about 15 percent of DFT’s annualized base rent, which works out to about $21.8 million per year.
Facebook has reportedly leased 5 megawatts of critical load – about 25,000 square feet of raised-floor space – at a Fortune Data Centers facility in San Jose.
In March, Facebook agreed to lease an entire 50,000 square foot data center that was recently completed by CoreSite Realty in Santa Clara.
Facebook also hosts equipment in a Santa Clara, Calif. data center operated by Terremark Worldwide (TMRK), a Palo Alto, Calif. facility operated by Equinix (EQIX) and at least one European data center operated by Telecity Group. These are believed to be substantially smaller footprints than the company’s leases with Digital Realty and DuPont Fabros.

That adds up to an estimated $40 million for the leases with Digital Realty and DuPont Fabros. When you add in the cost of space for housing equipment at Fortune, CoreSite, Terremark, Switch and Data, Telecity and other peering arrangements to distribute content, we arrive at an estimate of at least $50 million in annual data center costs for Facebook.
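
The arithmetic behind that estimate, using only the figures quoted above (the allowance for the smaller facilities and peering is a rough plug, not a disclosed number):

    # Annual lease figures quoted above, in millions of USD.
    digital_realty = 18.1
    dupont_fabros = 21.8                           # ~15% of DFT's annualized base rent
    named_leases = digital_realty + dupont_fabros  # ~39.9 -> the "~$40 million" figure
    smaller_sites_and_peering = 10.0               # Fortune, CoreSite, Terremark, Telecity, etc. (rough plug)
    print(f"~${named_leases + smaller_sites_and_peering:.0f}M per year")  # ~$50M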

Facebook’s costs remain substantially less than what some other large cloud builders are paying for their data center infrastructure. Google spent $2.3 billion on its custom data center infrastructure in 2008, while Microsoft invests $500 million in each of its new data centers. Those numbers include the facilities and servers.

How Many People Are Needed to Run Facebook’s Data Centers?
As is the case with most large-scale data centers, Facebook’s facilities are highly automated and can be operated with a modest staff, usually no more than 20 to 50 employees on site. Facebook has historically maintained a ratio of 1 engineer for every 1 million users, although recent efficiencies have boosted that ratio to 1 engineer for every 1.2 million users.

Facebook’s construction project in Prineville is expected to create more than 200 jobs during its 12-month construction phase, and the facility will employ at least 35 full-time workers and dozens more part-time and contract employees.

Re:First page of Article (1)

ihatejobs (1765190) | more than 4 years ago | (#33745988)

Last page.

Facebook says the Prineville data center will be designed to a Gold-level standard under the LEED (Leadership in Energy and Environmental Design) program, a voluntary rating system for energy efficient buildings overseen by the US Green Building Council. The Prineville facility is expected to have a Power Usage Effectiveness (PUE) rating of 1.15. The PUE metric (PDF) compares a facility’s total power usage to the amount of power used by the IT equipment, revealing how much is lost in distribution and conversion. An average PUE of 2.0 indicates that the IT equipment uses about 50 percent of the power delivered to the building.
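
A quick check of the PUE arithmetic above (the formula is the standard one; the 2.0 and 1.15 values are the figures quoted above, the 1.5 is just an extra reference point):

    def overhead_fraction(pue):
        """PUE = total facility power / IT power, so this is the share lost to
        cooling, power conversion and other overhead."""
        return 1 - 1 / pue

    for pue in (2.0, 1.5, 1.15):
        print(f"PUE {pue}: {overhead_fraction(pue):.0%} of facility power never reaches the IT gear")
    # PUE 2.0  -> 50% overhead (the "about 50 percent" case above)
    # PUE 1.15 -> ~13% overhead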

The cool climate in Prineville will allow Facebook to operate without chillers, which are used to refrigerate water for data center cooling systems, but require a large amount of electricity to operate. With the growing focus on power costs, many operators are designing chiller-less data centers that use cool fresh air instead of air conditioning. On hot days, the Prineville data center will use evaporative cooling instead of a chiller system.

“This process is highly energy efficient and minimizes water consumption by using outside air,” said Heiliger. Water conservation is also a growing focus for major data center projects, which in some cases can create capacity challenges for local water utilities.

A key function of data centers is providing an uninterruptible power supply (UPS) so that servers never lose power. This is another area where Facebook is gaining energy savings. The Prineville data center will use a new, patent-pending UPS system that reduces electricity usage by as much as 12 percent. The new design forgoes traditional uninterruptible power supplies (UPS) and power distribution units (PDUs) and instead adds a 12 volt battery to each server power supply.

The use of custom servers with on-board batteries was pioneered by Google, which last year revealed a custom server that integrates a 12 volt battery, a design the company cited as a key factor in the exceptional energy efficiency of its data centers.

Here’s how this approach saves power: Most data centers use AC power distribution in which a UPS system stands between the utility power grid and the data center equipment. If utility power is lost, the UPS system can tap a large bank of batteries (or in some cases, a flywheel) for “ride-through” power until the generator can be started. This approach requires that AC power from the grid be converted into DC power to charge the batteries, and then converted back to AC for the equipment. Each of those conversions includes a power loss, reducing the amount of electricity that reaches the servers.
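
A rough illustration of why the double conversion matters. The per-stage efficiencies below are assumptions chosen for the example, not published figures; the point is that losses multiply across each conversion stage, which is what the battery-per-server design avoids:

    def delivered_fraction(*stage_efficiencies):
        frac = 1.0
        for eff in stage_efficiencies:
            frac *= eff
        return frac

    legacy = delivered_fraction(0.95, 0.95, 0.92)  # AC->DC, DC->AC, server PSU (assumed values)
    direct = delivered_fraction(0.92)              # single PSU stage (assumed value)
    print(f"legacy UPS chain delivers ~{legacy:.0%} of grid power")  # ~83%
    print(f"per-server battery design delivers ~{direct:.0%}")       # ~92%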

Finally, the Facebook Prineville facility will also re-use excess heat expelled by servers, which will help heat office space in the building, a strategy also being implemented by Telehouse and IBM.

How Does Facebook Choose the Sites for Its Data Centers?
Data center site selection is a complex process. A typical site search must consider the availability and cost of power, the cost and availability of land, fiber connectivity to the site, vulnerability to natural disasters, the capacity of local water and sewage systems, the local business environment, incentives from state and local governments, and many other variables.

Connectivity between data centers is also a consideration. Facebook has created two data center clusters, one on each coast: one in Silicon Valley and another in northern Virginia. When the company decided to build its own data center, it conducted an in-depth site search in several Western states.

“After a rigorous review process of sites across the West Coast, Facebook concluded that Prineville offered the best package of resources – including a suitable climate for environmental cooling, renewable power resources, available land, talented regional workforce and supportive business environment,” said Tom Furlong, Director of Site Operations for Facebook.

Why is Greenpeace Criticizing Facebook?

Facebook’s new Oregon data center, which has been designed to be highly energy-efficient, is located in a town where the local utility uses coal to generate the majority of its power. This fact was soon highlighted by environmental blogs and even a Facebook group.

In mid-February the environmental group Greenpeace International called on Facebook to rethink plans for its Oregon data center and find a way to run the facility entirely on renewable energy.

“Given the massive amounts of electricity that even energy-efficient data centers consume to run computers, backup power units, and power related cooling equipment, the last thing we need to be doing is building them in places where they are increasing demand for dirty coal-fired power,” Greenpeace said in a statement, which was published on its web site. “Facebook and the cloud should be run on clean renewable energy. Facebook could and should be championing clean energy solutions, and not relying on the dirty fuel sources of the past to power their new data center.”

Facebook, which has touted the energy efficiency of the Prineville facility, has responded at length to the issue, both on Data Center Knowledge and directly to Greenpeace.

“It’s true that the local utility for the region we chose, Pacific Power, has an energy mix that is weighted slightly more toward coal than the national average (58% vs. about 50%),” Facebook’s Barry Schnitt said. “However, the efficiency we are able to achieve because of the climate of the region minimizes our overall carbon footprint. Said differently, if we located the data center most other places, we would need mechanical chillers, use more energy, and be responsible for an overall larger environmental impact—even if that location was fueled by more renewable energy.”

The Greenpeace critiques have continued, growing sharper and targeting Facebook CEO Mark Zuckerberg with a letter and video. Here’s Data Center Knowledge’s full coverage of the story:

Facebook’s Green Data Center, Powered by Coal?
Facebook Responds on Coal Power in Data Center
Facebook Responds to Greenpeace
Greenpeace, Facebook & The Media Megaphone
Facebook Responds to Greenpeace Letter
Greenpeace vs. Facebook, Continued

Is Facebook Secretive About its Data Centers?

Lots of companies simply DO NOT talk about their data centers, or even acknowledge their existence. But that’s starting to change, as some companies are pursuing a more open approach and a deeper level of engagement with the communities where these facilities are located. Facebook has been in the forefront of this movement with its project in Prineville.

Facebook held a groundbreaking announcement with local officials, and has used the Prineville Data Center Facebook page to engage with the community, sharing a detailed list of nearly 60 contractors working on the project, along with regular updates on the ways the company is contributing to the community. An example: Facebook and its construction contractors, DPR/Fortis, are sponsors of the Crook County Fair and local Picnic in the Park events in Prineville.

Freaking SEOs... (2, Insightful)

netsharc (195805) | more than 4 years ago | (#33745966)

Facebook is... Facebook has... fucking SEO monkeys must be at work making sure the company isn't referred to as "it", because that ruins the google-ability of the article, and they'd rather have SEO ratings than text that reads like it's been written by a fucking 3rd grader.

SEO-experts... even worse than lawyers.

Re:Freaking SEOs... (1)

netsharc (195805) | more than 4 years ago | (#33746000)

"that's not been written by"...

Whargbl!

Re:Freaking SEOs... (1)

ari_j (90255) | more than 4 years ago | (#33747760)

I actually took your original wording to mean that third-graders would have done a better job.

Mark Zuckerberg's presentation link is wrong (2, Interesting)

francium de neobie (590783) | more than 4 years ago | (#33745972)

It links to Facebook's "wrong browser" page. The real link may be here: http://www.facebook.com/video/video.php?v=631826881803 [facebook.com]

Re:Mark Zuckerberg's presentation link is wrong (1)

L4t3r4lu5 (1216702) | more than 4 years ago | (#33746282)

It's Facebook. It knows which browser you used. It changed the link just for you based upon that information.

Everyone else sees what you posted.

Tinfoil hats mandatory from here on in.

Re:Mark Zuckerberg's presentation link is wrong (0)

Anonymous Coward | more than 4 years ago | (#33747810)

The same joke appeared on

http://tech.slashdot.org/story/10/09/25/1526234/Facebook-Unveils-Details-of-Downtime

Why are you ruining a perfectly good meme?

Wow facebook (0)

Anonymous Coward | more than 4 years ago | (#33745994)

You are using an incompatible web browser.

Sorry, we're not cool enough to support your browser. Please keep it real with one of the following browsers:

        * Mozilla Firefox
        * Safari
        * Microsoft Internet Explorer

-------------

Using Firefox 3.6.10.

I'm sticking with antisocial networking (2, Funny)

Average_Joe_Sixpack (534373) | more than 4 years ago | (#33745998)

USENET and /. (RIP Digg)

Cache (2, Informative)

minus9 (106327) | more than 4 years ago | (#33746030)

Re:Cache (1)

mariushm (1022195) | more than 4 years ago | (#33746160)

Or just paste the URL of each page into Google; the first result will be a link to the page, with a "Cached" link at the bottom right.

WTF facebook (0)

Anonymous Coward | more than 4 years ago | (#33746066)

All I get is a message saying they are not cool enough to support my browser and I should go install a "real" browser. I am using Firefox, so wtf? Amateur.

You are using an incompatible web browser.

Sorry, we're not cool enough to support your browser. Please keep it real with one of the following browsers:

        * Mozilla Firefox
        * Safari
        * Microsoft Internet Explorer

Yawn.. move along (2, Informative)

uncledrax (112438) | more than 4 years ago | (#33746068)

The article isn't worth reading IMO, not unless you're curious as to how much electricity some of the FB datacenters use. Otherwise it's light on the tech details.

Re:Yawn.. move along (2, Informative)

drsmithy (35869) | more than 4 years ago | (#33746818)

The article isn't worth reading IMO, not unless you're curious as to how much electricity some of the FB datacenters use. Otherwise it's light on the tech details.

Indeed. "All you wanted to know about FaceBook's infrastructure" and little more than a passing mention about their storage ? That's vastly more interesting information than where their datacenters might physically be.

Re:Yawn.. move along (1)

kaizokuace (1082079) | more than 4 years ago | (#33748306)

woah woah woah there buddy. Who said anything about reading articles!

Re:Yawn.. move along (1)

stiller (451878) | more than 4 years ago | (#33749326)

If you want tech details, check out this excellent talk by Tom Cook of FB from Velocity last June: http://www.youtube.com/watch?v=T-Xr_PJdNmQ [youtube.com]

Princeville, Oregon??? (0)

Anonymous Coward | more than 4 years ago | (#33746238)

Started reading and got to the aerial rendering of Facebook's new data center in Princeville, Oregon. From the rendering it looked more like western Oregon (not that I have ever been), so I decided to Google Map it. Searching Google Maps and regular Google results, there doesn't seem to be a Princeville, Oregon. Where the heck is Priceville, Oregon?

Re:Princeville, Oregon??? (1)

afaik_ianal (918433) | more than 4 years ago | (#33746690)

It's Prineville (without a "c").

Re:Princeville, Oregon??? (1)

kiwimate (458274) | more than 4 years ago | (#33746714)

Helps if you spell it correctly - it's Prineville without a "c" and with an "n". From Bing maps, it's about 150 miles south-east of Portland.

Call me dense, but... (4, Interesting)

mlts (1038732) | more than 4 years ago | (#33746868)

Call me dense, but with all the racks of 1U x86 equipment FB uses, wouldn't they be far better served by machines built from the ground up to handle the TPM and I/O needs?

Instead of trying to get so many x86 machines working, why not go with upper end Oracle or IBM hardware like a pSeries 795 or even zSeries hardware? FB's needs are exactly what mainframes are built to accomplish (random database access, high I/O levels) and do the task 24/7/365 with five 9s uptime.

To boot, the latest EMC, Oracle and IBM product lines are good at energy saving. The EMC SANs will automatically move data and spin down drives not in use to save power. The CPUs on top-of-the-line equipment not only power down the parts that are not in use; wise use of LPARs or LDoms would also help with energy costs just by having fewer machines.

Re:Call me dense, but... (5, Insightful)

njko (586450) | more than 4 years ago | (#33747676)

The purpose of server farms with commodity hardware is to avoid vendor lock-in. If you have a good business but you are tied to a vendor, the vendor has a better business than you: they can charge you whatever they want.

Re:Call me dense, but... (3, Insightful)

mlts (1038732) | more than 4 years ago | (#33747892)

That is a good point, but to use a car analogy, isn't it like strapping a ton of motorcycles together with duct tape and having people on staff to keep them all maintained so the contrivance can pull an 18-wheeler load? Why not just buy an 18-wheeler, which is designed and built from the ground up for this exact task?

Yes, you have to use the 18-wheeler's shipping crates (to continue the analogy), but even with the vendor lock-in, it might be a lot better to do this as opposed to cobbling together a suboptimal solution that does work, but takes a lot more man-hours, electricity, and hardware to maintain than something built at the factory for the task at hand.

Plus, zSeries machines and pSeries boxes happily run Linux LPARs. That is as open as you can get. It isn't like it would be moving the backend to CICS.

Re:Call me dense, but... (1)

njko (586450) | more than 4 years ago | (#33748140)

Well, tactically you should always use the right tool for the problem, but strategically you should be aligned with long-term goals, the vision of the CTO/CIO, and other non-technical factors. Vendor independence may be one of the necessary key items to achieve that vision.

Re:Call me dense, but... (3, Interesting)

RajivSLK (398494) | more than 4 years ago | (#33751920)

Well we do the same thing as facebook but on a much smaller scale... Our "commodity hardware" (mostly supermicro motherboards with generic cases, memory etc) has pretty much the same uptime and performance as vendor servers. For example we have a Quad CPU database server that has been up for 3 years. If I remember correctly it cost about 1/2 as much as a server with equivalent specs from a vendor.

The system basically works like this. Buy 5 or so (or 500 if you are facebook) servers at once with identical specs and hardware. If a server fails (not very often) there are basically 4 common reasons:

1) Power supply or fan failure -- very easy to identify.
    Solution: Leave server down until maintenance day (or whenever you have a chance), then swap in a new power supply (total time 15 min [less time than calling the vendor tech support]).

2) Hard drive failure -- usually easy to identify
    Solution: Leave server down until maintenance day (or whenever you have a chance), then swap in a new hard drive (total time 15 min [less time than calling the vendor tech support]). When the server reboots it will automatically be set up by various autoconfig methods (BOOTP, whatever). I suspect that facebook doesn't even have HDs in most servers.

3) RAM failure -- can be hard to identify
    Solution: Leave server down until maintenance day (or whenever you have a chance), then swap in new RAM (total time 15 min [less time than calling the vendor tech support]).

4) Motherboard failure (almost never happens) -- can be hard to identify
    Solution: Replace entire server -- keep old server for spare parts (RAM, power supply, whatever)

I don't really see what a vendor adds besides inefficiency. If you have to call a telephone agent who then has to call a tech guy from the vendor who then has to drive across town at a moment's notice to spend 10 minutes swapping out your RAM, it's going to cost you. At a place like facebook why not just hire your own guy?

Re:Call me dense, but... (0)

Anonymous Coward | more than 4 years ago | (#33747868)

Given the shelf life of servers at facebook, I would guess that overall it's far cheaper to buy x86 and pay the extra electricity and space costs. Mainframes aren't exactly cheap.

Re:Call me dense, but... (2, Interesting)

Cylix (55374) | more than 4 years ago | (#33748008)

The latest x86 architecture lines are moving far more in the direction of mainframe-type units in terms of density and bandwidth. This is a hardware type from several years back and would not really compare to the denser offerings being explored today. However, the reasoning behind commodity hardware is not just the ability to switch from one platform to another, but rather that it keeps costs down through vendor competition. One design can be produced by multiple vendors with the goal of earning the lowest bid. There are several other advantages as well with a commodity or generic based design.

With commodity hardware that is not designed for five nines, there is an expectation that the application can fail away. The need for the application to fail away gracefully is actually more fundamental than the server level. When considering application resiliency you want to target the datacenter level so that you are not locked to a specific region. To build something as large as facebook they are no longer load balancing at the router, but at the datacenter level itself. With this concept the datacenter becomes a bucket entity with the ability to service X traffic, and if it should fail you simply move services away. With a sufficiently advanced version of this very generic and very hardware-abstracted model it is now possible to distribute load to third-party farms via cloud infrastructures.

Still, the world is not black and white, and even within these models there will be small clusters of special-purpose hardware for things like data warehousing and reporting. Far more typically I find the larger systems in industries where there can be no possible downtime or where the loss of data cannot occur.

Re:Call me dense, but... (1)

mlts (1038732) | more than 4 years ago | (#33749312)

Very true. However, by moving the redundancy to the top of the application stack, doesn't that become inefficient after a while?

For example, I have an application that sits on a mainframe on a big LPAR with the database server on another LPAR on the same machine. Because the redundancy is handled by everything below it, the application can be a lot simpler, with fewer bugs. It does not have to worry about being consistent among a number of boxes; it can just run and let the rest of the stack below it do the work.

On a more traditional non-mainframe platform such as Solaris or AIX, the whole stack shares in the high availability. The machines have a heartbeat monitor, the database is bifurcated and knows when to let the secondary machine take over, the OS has multiple I/O paths, and the application either doesn't care, or it is able to know when to spin up on the secondary hardware and when to spin back down, letting the primary machine take the lead. (This is assuming an active/passive failover scenario, of course.)

With only the application handling the redundancy, it is up to the custom application writers to handle consistency issues and deal with dead servers by failing away and failing back to the hardware when it comes up. Of course, this means that more hardware has to be thrown at the application. There is a point of diminishing returns with parallelism, but this depends on what the app is doing. It also means that a lot of additional functionality has to be coded in at the app layer, such as when to consider a machine or a data center failed and to shunt to another, when to fail back, and so on.

This doesn't mean that Big Iron is the only solution out there by any means. However, it means that if anything fails in the application, there will be a world of hurt, as opposed to a failure in a piece of the stack of a more traditional system being able to be worked around by other means.
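
For what it's worth, the active/passive heartbeat idea described above can be sketched in a few lines: the standby polls the primary's health endpoint and promotes itself after a few consecutive misses. The hostname, health URL and the promotion step are hypothetical, and real HA stacks (Pacemaker, keepalived and the like) handle far more, such as fencing and split-brain protection.

    import time
    import urllib.request

    PRIMARY_HEALTH_URL = "http://primary.example.internal:8080/health"  # hypothetical endpoint
    MISS_LIMIT = 3

    def primary_alive(timeout=2):
        try:
            with urllib.request.urlopen(PRIMARY_HEALTH_URL, timeout=timeout) as resp:
                return resp.status == 200
        except OSError:
            return False

    def watch(poll_seconds=5):
        misses = 0
        while True:
            misses = 0 if primary_alive() else misses + 1
            if misses >= MISS_LIMIT:
                # Stub: a real cluster would claim the service IP / promote the replica here.
                print("primary unreachable -- standby taking over")
                break
            time.sleep(poll_seconds)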

Re:Call me dense, but... (1)

Cylix (55374) | about 4 years ago | (#33754830)

The application doesn't necessarily have to be redundancy-aware. It really depends on several factors, most importantly how session data is handled. In most cases, the actual load is handled by specialized load balancers that manage the distribution. However, depending on exactly how much money you want to throw at the scenario, there are various ways to scale up the application.

I'm not sure how much crack you are smoking, but you just said parallelism has worse diminishing returns than a monolithic platform. That is basically comparing the concept of megahertz to multiple cores.

At this juncture the massive app farm is generally composed of systems which handle traffic via multiple datacenters. Again, this concept scales fairly well and does so on commodity hardware. The ability to lose an individual component, system or datacenter is what makes the farm resilient.

You are really not thinking large enough to fit the model I'm discussing. In most instances where I have seen systems designed to scale very large for non-computational purposes, they have achieved this through the use of multiple VMs. This setup entails the same structure as the application farm and doesn't result in a great deal of gain, in my opinion.

Just fyi, I'm not actually speaking in theory regarding any of this. Now, I would like to see some examples of very large infrastructures which handle massive amounts of traffic on a monolithic structure.

Re:Call me dense, but... (1)

Danathar (267989) | more than 4 years ago | (#33749912)

Do you have an IBM business card or phone number they can call you at for a Sales pitch?

Re:Call me dense, but... (2, Interesting)

mlts (1038732) | more than 4 years ago | (#33750944)

Actually, neither. It's just that, to an observer like me, FB is trying to reinvent the wheel on a problem that has already been solved.

Obviously, IBM is not cheap. Nor is Oracle/Sun hardware. However, the time and money spent developing a large-scale framework at the application layer is not a trivial expense either. It might be that the time FB puts into deploying something uncharted like this will cost them more in the long run.

Re:Call me dense, but... (0)

Anonymous Coward | more than 4 years ago | (#33750094)

Call me dense, but with all the racks of 1U x86 equipment FB uses, wouldn't they be far better served by machines built from the ground up to handle the TPM and I/O needs?

Rents (economic profits) go to the owner of the scarce resource. If Facebook depends on specialized hardware, instead of commodity hardware, then who owns the scarce resource? Once locked-in, Facebook would be vulnerable to hold-up.

See, an MBA is not useless.

Re:Call me dense, but... (1)

saleenS281 (859657) | about 4 years ago | (#33754696)

EMC won't automatically move anything. Stop listening to marketing droids. "FASTv2" on anything you can connect to a mainframe is nothing but a marketing slide at this point, and likely will continue to be for the next 6 months.

Re:Call me dense, but... (2, Interesting)

TheSunborn (68004) | about 4 years ago | (#33754932)

The problem is that for any specific budget* the x86-64 solution will give you more aggregate I/O and more processor hardware than the mainframe. The argument for the mainframe is then that the software might be easier to write, but there doesn't exist any mainframe which can serve even 1/10 of Facebook, so you need to cluster them anyway. And if you need special cluster magic you might as well have x86-64.

And IBM will not promise you 99.999% uptime if you buy a single mainframe. If you need that kind of uptime you need to buy multiple mainframes and cluster them.

*Counting in either rackspace used or money paid for hardware.

How many times a day do people check Facebook? (2, Interesting)

Comboman (895500) | more than 4 years ago | (#33747358)

"690 billion page views to its 540 million users in August"? Good lord, that's 1278 page views PER USER in just one month! That's (on average) 41 page views per user, per day, every single day! The mind boggles.

Re:How many times a day do people check Facebook? (1)

kaizokuace (1082079) | more than 4 years ago | (#33748374)

yea? so? it's not like an addiction or something! </fiending>

Re:How many times a day do people check Facebook? (0)

Anonymous Coward | more than 4 years ago | (#33748952)

It's all the scripts auto-playing Zynga click "games".

Re:How many times a day do people check Facebook? (0)

Anonymous Coward | more than 4 years ago | (#33749698)

I could be wrong, but I’m willing to bet the hit counts are inflated because of the way a lot of the facebook games are written. In many cases they reload the whole page every time a user clicks something and each of those counts as a page view. A single Mafia Wars fan can rack up a lot of page reloads by hitting “Do Mission” over and over.

I’ve seen some facebook games with embedded advertising, and those games are DESIGNED to inflate the number of page reloads.

Re:How many times a day do people check Facebook? (1)

Revvy (617529) | more than 4 years ago | (#33749802)

I'm wondering if they're counting every one of the automated page update checks as a page view. I'd be really curious to see exactly what they count as a page view.

Re:How many times a day do people check Facebook? (1)

carlosap (1068042) | more than 4 years ago | (#33750530)

41 pages per user or more, and with more smartphones that number is going to grow. Personally I visit 40 pages or more; that's a crazy number. I don't use traditional email any more, I just check the comments of my friends and that's it.

Re:How many times a day do people check Facebook? (2, Interesting)

Overzeetop (214511) | more than 4 years ago | (#33751410)

Have you seen how often Facebook crashes/has problems? You have to constantly reload the thing to get anything done. Thank goodness Google Calendar doesn't have that problem or I'd probably have a thousand hits a day to my calendar page alone.

Also, FB pages tend to be pretty content-sparse. It's not uncommon for me to hit a dozen pages in 2-3 minutes if I check facebook.

Re:How many times a day do people check Facebook? (1)

gangien (151940) | more than 4 years ago | (#33752964)

go to fb, go to friends page, go back, go to another friends page, go back, go to farmville, go back

7 page views, for doing very little, right? how many times a day?

that's ignoring things like FarmVille gifts, which require I think 3 page views for accepting/sending/returning each gift.

and of course there's the teen girl that does lord knows how many things.

infrastructure secrecy versus openness (2, Interesting)

peter303 (12292) | more than 4 years ago | (#33747914)

It's interesting how FB is open about their data server infrastructure while some places like Google and Microsoft are very secretive. It is competitive for Google to shave every tenth of a second off a search through clever software and hardware. They are an "on ramp" to the Information Superhighway, not a destination like FB. And because Google is one of the largest data servers on the planet, even small efficiency increases translate into mega-million-dollar savings.

Re:infrastructure secrecy versus openness (0)

Anonymous Coward | more than 4 years ago | (#33748162)

Huh? What rock have you been hiding under? Google talks in-depth about techniques that it uses to be efficient. They also publish papers about it.

Re:infrastructure secrecy versus openness (0)

Anonymous Coward | more than 4 years ago | (#33749348)

If you think the details in this article make FB open, then you're really confused. Google is many times more open; they even give back source occasionally.

data servers = industrial engines of 21st century (2, Interesting)

peter303 (12292) | more than 4 years ago | (#33748090)

When these data centers start showing up as measurable consumers of the national power grid and components of the GDP, you might consider them, metaphorically, the power plants of the information industry.

In his book on the modern energy industry, "The Bottomless Well", author Peter Huber places commodity computing near the top of his "energy pyramid". Huber's thesis is that modern technology has transformed energy into ever more sophisticated and useful forms. He calls this "energy refining". At the base of his pyramid are relatively raw forms of energy like biomass and coal. Then come electricity, computing, optics, etc. I think it's interesting to view computing as a refined form of energy.

drainbead (0)

Anonymous Coward | more than 4 years ago | (#33748472)

lol: "using coal to power our data center in a COOL climate area gives
less carbon footprint then locating it in a warmer region but with renewable
energy".
i don't care how much RENEWABLE energy you are using. use trillions. it's freaking RENEWABLE!
