How Facebook Runs Its LAMP Stack

Soulskill posted about 5 years ago | from the very-carefully-and-sometimes-not dept.

Linux Business 111

prostoalex writes "At QCon San Francisco, Aditya Agarwal of Facebook described how his employer runs its software stack (video and slides). Facebook runs a typical LAMP setup where P stands for PHP with certain customizations, and back-end services that are written in C++ and Java. Facebook has released some of the infrastructure components into the open source community, including the Thrift RPC framework and Scribe distributed logging server."

111 comments

Open source (5, Funny)

Norsefire (1494323) | about 5 years ago | (#27541485)

As I recall, some of their code was made open source [techcrunch.com] in 2007, although not deliberately.

Re:Open source (0)

Anonymous Coward | about 5 years ago | (#27541545)

"The reprinting of this code violates several laws and we ask that people not distribute it further."

Your definition of Open Source is a bit weird.

Re:Open source (1, Funny)

Anonymous Coward | about 5 years ago | (#27541651)

*whoosh*

Mark Zuckerberg is a GEGAWNTIC DOUCHE w peach fuzz (2, Interesting)

Anonymous Coward | about 5 years ago | (#27541845)

Ah, the sweet ironies (and hypocrisies) in life. There's something beautifully creepy about a person fighting so hard against the same thing they fought so hard to create. In today's case, the culprit is Mark Zuckerberg, the young man more responsible than perhaps any other for his generation's obsession with displaying itself publicly on the internet. The New York Times has reported that a judge turned down Facebook's request to have "unflattering documents" about Zuckerberg removed from the website of Harvard magazine 02138.

At the center of the issue is an article in 02138 about Facebook's evolution and the subsequent lawsuit from classmates asserting Zuckerberg stole the idea and computer source code to begin his own project. The New York Times calls the article "sympathetic to the plaintiffs's account and questions the validity of Mr. Zuckerberg's claims."

The 02138 article also contains Zuckerberg's handwritten application to Harvard, and a journal that "contains biting comments about himself and others."

Perhaps Gawker summarized it best, saying, "This is the same dude who made billions from a website that allows you to let everyone in your friend network know when you are peeing."

And now he's mad that a private persona he would like to keep that way has entered the public domain. Yes, the sweet ironies and hypocrisies in life: why do we love them so much?

whatever (0)

speedtux (1307149) | about 5 years ago | (#27541507)

Whatever they're doing, it's not working too well. Sure, they manage to serve the pages, but the user experience is confusing and it seems to take them forever to roll out new and improved versions.

Re:whatever (1)

rarel (697734) | about 5 years ago | (#27541643)

It sheds a light on the infrastructure at least. Their implementation may not be very bright, but that's quite enlightening nonetheless. Many others would just keep us in the dark.

Re:whatever (4, Funny)

AHuxley (892839) | about 5 years ago | (#27541771)

To quote a joke on Slashdot:
"Is there anything Java cannot make slow?"

Re:whatever (4, Funny)

Tubal-Cain (1289912) | about 5 years ago | (#27542411)

To paraphrase the answer:
"Sun's stock price plummet."

Re:whatever (1)

Yfrwlf (998822) | about 5 years ago | (#27544661)

Meh, I don't have an opinion on the capabilities of the Java system itself, but anything which needs extra library Y that you aren't using yet is going to slow down while loading it. I know that Java is even slower due to it being a virtual machine and not just a library, of course. Most programs nowadays have ports for all three major OSes anyway, and at the same time computers keep getting faster, making running things like Java less annoying, so I can't decide whether Java use will increase or not.

Re:whatever (1)

thsths (31372) | about 5 years ago | (#27546149)

>at the same time computers keep getting faster making running things like Java less annoying

Luckily Sun has an answer for that, too: the next Java version is always more than compensating for faster computing :-(. I remember that back in the day I timed JRE 1.0 or 1.1 to start within 3 seconds on a 486 (!). It takes longer to start JRE 1.6 on my state of the art dual core now.

Re:whatever (0)

Anonymous Coward | about 5 years ago | (#27544267)

Java cannot make my increasing impatience for Java slower.

Re:whatever (1)

Antique Geekmeister (740220) | about 5 years ago | (#27541777)

Well, they have the now-classic problems of needing to add new features to draw more users, to keep users from getting bored, and especially to convince investors and advertisers that there's any growth market for them. This means constantly using their infrastructure in new ways: new database manipulations, new tables, new display utilities, and retaining compatibility across a broad range of clients and older applications. This inevitably slows down new releases.

Re:whatever (2, Insightful)

theillien (984847) | about 5 years ago | (#27541997)

>Whatever they're doing, it's not working too well. Sure, they manage to serve the pages, but the user experience is confusing and it seems to take them forever to roll out new and improved versions.

That has little to do with the infrastructure and more to do with the site design. Please don't blame the sys engineers/admins for the poor interface design.

the blame is with management (0, Flamebait)

speedtux (1307149) | about 5 years ago | (#27542429)

>That has little to do with the infrastructure and more to do with the site design. Please don't blame the sys engineers/admins for the poor interface design.

Well, the fact that they gave a talk about their LAMP stack tells you that they consider engineering more important than site design. Furthermore, a poor choice of infrastructure makes doing good site design hard.

And that's my point: Facebook is evidently driven by system stuff and programmers, while it should be driven by site design.

Re:the blame is with management (5, Insightful)

Anonymous Coward | about 5 years ago | (#27542635)

>>That has little to do with the infrastructure and more to do with the site design. Please don't blame the sys engineers/admins for the poor interface design.

>Well, the fact that they gave a talk about their LAMP stack tells you that they consider engineering more important than site design. Furthermore, a poor choice of infrastructure makes doing good site design hard.

>And that's my point: Facebook is evidently driven by system stuff and programmers, while it should be driven by site design.

Clearly, $MY_SPECIALTY should drive the entire system! They made a big mistake by allowing $OTHER_SPECIALTY to take precedence. Everyone knows that only $MY_SPECIALTY should dominate all design plans. Duh.

Re:the blame is with management (1)

rinoid (451982) | about 5 years ago | (#27542821)

>Clearly, $MY_SPECIALTY should drive the entire system! They made a big mistake by allowing $OTHER_SPECIALTY to take precedence. Everyone knows that only $MY_SPECIALTY should dominate all design plans. Duh.

Awesome.

---

The user experience and flow at FB improved IMHO with the recent changes.

Re:the blame is with management (4, Insightful)

Firehed (942385) | about 5 years ago | (#27542649)

If your site infrastructure is influencing how you design, you've made some sort of monolithic error along the way. Good code completely separates the content from the design. It's not like they've just hacked up a Wordpress install (which seems to go out of its way to tie content and design together) - Facebook employs hundreds if not now thousands of programmers; it's pretty safe to assume there's at least one UI/UX specialist on board as well.

All things considered, I'd actually say that Facebook's design is pretty decent, but that's of course a matter of opinion. A lot of the code that went into that design sucks, but that's what happens when you have to support IE6. Regardless, I think it's great that they're sharing knowledge about how they've managed to use and customize an infrastructure to support 200,000,000 users, especially with the amount of traffic they have to deal with. That's well beyond the scale that many governments have to worry about!

Re:the blame is with management (1)

WillKemp (1338605) | about 5 years ago | (#27543825)

>[......] a Wordpress install (which seems to go out of its way to tie content and design together) [......]

Huh??? How do you make that out? Wordpress is just a content management system that spits out almost plain text wherever you put a relevant PHP command. There's not much of what it outputs that contains any serious formatting - apart from a few situations where there's no real alternative.

Re:the blame is with management (1)

speedtux (1307149) | about 5 years ago | (#27546005)

>If your site infrastructure is influencing how you design, you've made some sort of monolithic error along the way

I haven't, but Facebook evidently has.

>All things considered, I'd actually say that Facebook's design is pretty decent

I find it confusing as hell, and so do most people I know.

>I think it's great that they're sharing knowledge about how they've managed to use and customize an infrastructure to support 200,000,000 users,

Come on, scalability is off-the-shelf stuff these days.

Re:the blame is with management (0)

Anonymous Coward | about 5 years ago | (#27542653)

>Well, the fact that they gave a talk about their LAMP stack tells you that they consider engineering more important than site design.

Imagine that. An architecture presentation rather than a site design presentation at a conference which is geared towards architects and whose primary speakers have been infrastructure tool developers. Big surprise.

Re:the blame is with management (1)

Frools (1326479) | about 5 years ago | (#27542655)

It was a talk by the Director of Engineering to "an independent online community focused on change and innovation in enterprise software development".
Of course it was about their LAMP stack and backend infrastructure and not their site design...

Re:the blame is with management (1)

kv9 (697238) | about 5 years ago | (#27542913)

>Furthermore, a poor choice of infrastructure makes doing good site design hard.

yeah, they obviously should have used RoR. that certainly scales better. oh, wait...

Re:the blame is with management (1)

ZachPruckowski (918562) | about 5 years ago | (#27545173)

>Well, the fact that they gave a talk about their LAMP stack tells you that they consider engineering more important than site design. Furthermore, a poor choice of infrastructure makes doing good site design hard. And that's my point: Facebook is evidently driven by system stuff and programmers, while it should be driven by site design.

Aditya Agarwal is Facebook's Director of Engineering. Infoq is a site about software engineering. This is a talk by a software engineer, hosted at a software engineering site. Of course it's about software engineering. That tells us nothing about Facebook's priorities vis a vis engineering and design.

Jeff Kaplan (Blizzard's Lead Game Designer) gave a talk about WoW quests at GDC. That doesn't mean that Blizzard thinks that PvE content design is more important than PvP class balance or graphics, it just means that the game designer is talking about his area of expertise in front of a crowd that's interested in his area of expertise. If I wanted to hear a Blizzard employee talk about graphics, I'd go to Nvision or something, and I'd listen to whoever Blizzard's graphics engine guy is, and if I wanted to hear about Blizzard's PvP class balance ideas, I'd look for talks by Greg Street (Ghostcrawler, their lead class balance guy)

Not very well (1)

kbrasee (1379057) | about 5 years ago | (#27541517)

Every few days I run into whole sections of core Facebook functionality that are just plain broken for hours. Earlier this week, my main page wouldn't load for most of the day. And every couple of weeks I'm greeted with a "Sorry, you can't log in right now." message.

Re:Not very well (1)

firmamentalfalcon (1187583) | about 5 years ago | (#27541605)

There is much more to Facebook than its LAMP stack, so you shouldn't just blame the LAMP stack. And even if the LAMP stack does not work perfectly, there are still insights that developers can glean from it.

Re:Not very well (0)

julesh (229690) | about 5 years ago | (#27541935)

>There is much more to Facebook than its LAMP stack, so you shouldn't just blame the LAMP stack. And even if the LAMP stack does not work perfectly, there are still insights that developers can glean from it.

I'd say you're right not to blame the LAMP stack. I blame PHP for most of the problems.

According to the presentation, Facebook's traffic is around 20,000 page requests per second (!). They have somewhere in the region of 5,000 servers in their main datacenter and (I believe) others scattered around the world, but restricting it to just that main center, that means each server is handling around 4 requests per second. Probably three to four times that at peak times, I'd guess. If the system were written in a compiled language, that wouldn't be even slightly taxing. They could probably reduce their infrastructure to around half its current size. But large chunks of it are written in interpreted languages, with the worst offender being the huge volume of PHP that constitutes their front end. PHP is, as I've profiled it, horribly inefficient. If you've got low traffic, it's a great solution for creating a quick, simple web site, but it doesn't easily scale to the kind of traffic Facebook manages. They're spending a fortune on hardware and datacenter running costs because they've chosen an inefficient architecture. They've had to invest significant time, if you listen to the presentation, modifying PHP to make it less inefficient to even get to where they are now.
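
The per-server figure in this comment is plain division. A minimal sketch of the arithmetic, using only the numbers quoted above; the 3x to 4x peak multiplier is the poster's guess rather than a measured value, and the function name is made up for illustration:

```python
# Figures quoted in the comment; the peak multipliers are the poster's guess.
requests_per_second = 20_000
servers = 5_000


def per_server_rate(total_rps: float, server_count: int) -> float:
    """Average requests per second per server if load is spread evenly."""
    return total_rps / server_count


average = per_server_rate(requests_per_second, servers)
peak_low, peak_high = 3 * average, 4 * average

print(f"average: {average:.1f} req/s per server")
print(f"guessed peak: {peak_low:.0f} to {peak_high:.0f} req/s per server")
```

The division assumes that every one of the 5,000 machines serves page requests, which is the simplification the comment itself makes.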

Re:Not very well (1, Troll)

drinkypoo (153816) | about 5 years ago | (#27541957)

>PHP is, as I've profiled it, horribly inefficient. If you've got low traffic, it's a great solution for creating a quick, simple web site, but it doesn't easily scale to the kind of traffic Facebook manages.

FUD-o-riffic. Facebook can cache most of their content which dramatically reduces the overhead of using a scripting language. They're not generating the whole page every time, unless they're big fucking idiots.

Re:Not very well (1, Interesting)

Anonymous Coward | about 5 years ago | (#27543197)

I know someone who works at Facebook.

>FUD-o-riffic.

According to my contact, PHP is a serious problem there. It scales poorly, requiring Facebook to throw more hardware at it.

>They're not generating the whole page every time, unless they're big fucking idiots.

On the contrary, they hire lots of smart people. But they have legacy code that was not well planned for the size of operation they have now, and it has been painful to try to clean things up after the fact.

Re:Not very well (0)

Anonymous Coward | about 5 years ago | (#27543717)

>According to my contact, PHP is a serious problem there. It scales poorly, requiring Facebook to throw more hardware at it.

Huh? That's what scalability is. PHP is shared-nothing and scales fine. I did see the source of their leaked index page. If that's any indicator of engineering quality, it's no fucking wonder they're having problems.

Re:Not very well (3, Interesting)

coryking (104614) | about 5 years ago | (#27543243)

>Facebook can cache most of their content which dramatically reduces the overhead of using a scripting language

True. But writing cache code is not easy and makes your code more brittle. It increases the likelihood that a user will interact with the website and do something, say "update my profile", only to find when they click "save" that their profile hasn't updated yet because your cache sucks. Then you have to plaster your site with bullshit messages about "please allow 30 seconds to see the change".

But what is far, far, far worse is you are allocating programming resources to non-features. Caching is a non-feature that adds zero value to your website. Your users don't interact with your cache. They interact with your website--and I bet if you are like any moderately complex site, you've got all kinds of bugs that annoy the hell out of them. So rather than allocate your developer time to fixing those annoying bugs (thus adding value) or adding new features (thus adding value), you are stuck pissing away time optimizing bullshit your users never see.

So yeah. You can cache the fuck-all out of your website. But only by stealing developer time away from working on features that make your users happy. Of course if you wrote the thing in C instead of PHP, you'd have a different set of development problems of which I could only have nightmares about.

In other words, engineering is always a tradeoff. Use PHP (and MySQL) and piss away developer time on caching the fuck around their weakness. Use a compiled language like C and piss away developer time doing fuck-if-I-know because you didn't free mallocs or had to write a template language from scratch or some insane shit like that. Pick your poison!

Re:Not very well (4, Interesting)

drinkypoo (153816) | about 5 years ago | (#27543411)

As you say, there is a tradeoff. It doesn't matter if you're fighting the need to cache intelligently in PHP, or the need to get everything right because you're developing a complete solution in C (or whatever) or the need to interface to someone else's system for serving pages if you're using something in between. It also doesn't matter if you're using a servlet technology, or you're punching bits out on a paper tape and feeding it into a machine which converts it into EBCDIC and... you get the idea: don't fuck up.

In any case the whole argument is fucking stupid because: PHP is not implemented in PHP. And Facebook is not implemented in pure PHP. See summary: Facebook runs a typical LAMP setup where P stands for PHP with certain customizations. At some point you have to ask yourself how many wheels you want to reinvent. If you extend PHP you can reinvent fewer wheels. I'm not sure it's the right answer, but I'm sure it's not a horribly wrong one. I'm also absolutely certain that barring some massive development in processing the future is only going to involve more parallelism and more clustering, and that if you expect PHP to scale on a single machine you're a bozo.

What I have personally noticed about using PHP is that a single page load can consume an absolutely insane amount of memory. This problem, too, is mitigated or eliminated by aggressive use of caching. In order to cache properly you need to do something intelligent with your data store, which I think is where most people fall down. Looking into the mishmash that most CMSes produce in the DB is enough to make you weep. I long for an elegant object-oriented CMS based on practically anything, but the simple truth is that PHP is by far the easiest thing to get going without spending any money and that has probably done more than anything else to propel it to the head of the FOSS class, at least in terms of popularity. A staggering number of quite excellent websites seem to be built with it as well.

In summary, I reject the notion that PHP is a serious limiting factor for the majority of websites, and I think that most of those for whom it is have failed to understand PHP. (Not that I'm any PHP guru.) It's true that a clustered web application is significantly more complex than something which is not clustered. However, it's also [potentially] far more scalable. At some point you simply run out of machine. When you can't get anything better from Sun (AFAICT they make the single machines which can handle the most threads today) you're going to have to cluster, even if it's only to two machines. At that point you'll have far more complexity invested in having a single system image to work with and the pain of moving to a cluster will be magnified that much more as well. If you accept the notion that clustering is today and for the foreseeable future the best way to handle scalability (which I admit is at this point not a proven notion, but is at least a well-supported theory) then the idea that PHP is a major limiting factor is just plain silly. Sun is circling the drain, and everyone else is concentrating on clustering. Your call...

Re:Not very well (1)

encoderer (1060616) | about 5 years ago | (#27546761)

FWIW, clustering doesn't have to be hard. Vertical webserver sharding can get complex. (eg, these servers load accounts, these servers handle signups, etc)

But you can simply scale out many PHP/Python/Ruby apps by storing Session data in the DB or on a NAS share. Then do a simple round-robin in the balancer.

Storing session data that way does give you a bit of a hit, so you won't see true O(N) scaling. Probably something more like 0.9N, e.g. 2 servers would give you 1.8x the capacity of 1 that stores session data on a local physical disk.
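
A rough sketch of the setup described in this comment: session state lives in one shared store, so a plain round-robin balancer can send any request to any web server. Everything here is illustrative. The dict stands in for the DB table or NAS share, the class and function names are invented, and the 0.9 efficiency factor is the poster's estimate, not a benchmark.

```python
from itertools import cycle

# Shared session store: stands in for the DB table or NAS share.
shared_sessions = {}


class WebServer:
    def __init__(self, name: str) -> None:
        self.name = name

    def handle(self, session_id: str) -> str:
        # Any server can serve any user because session state is not local.
        session = shared_sessions.setdefault(session_id, {"hits": 0})
        session["hits"] += 1
        return f"{self.name} served hit {session['hits']} for {session_id}"


def effective_capacity(server_count: int, efficiency: float = 0.9) -> float:
    """Capacity in units of one standalone server, assuming each clustered
    server delivers `efficiency` of that because of shared-session overhead."""
    if server_count < 1:
        raise ValueError("need at least one server")
    return server_count * efficiency


servers = [WebServer("web1"), WebServer("web2")]
balancer = cycle(servers)  # simple round-robin over the pool

responses = [next(balancer).handle("alice") for _ in range(4)]
for line in responses:
    print(line)
print("2 servers ->", effective_capacity(2), "x one standalone server")
```

The hit counter keeps increasing even though consecutive requests land on different servers, which is the property that makes the simple balancer safe.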

Re:Not very well (1)

encoderer (1060616) | about 5 years ago | (#27546719)

This is basically rubbish. HTML Rendering is not the bottleneck. It's the DB. Tweaking the MySQL Query Cache and using memcache is the solution here.

In both cases, you'd update your cache as part of the DB update.

Any cron-driven pre-caching is just the PB to your caching jelly.

The query cache requires no additional code to implement and can produce huge efficiencies. Simple sharding will ensure any given MySQL server has enough cache resources to remain performant.

And memcache has swell libs that sit in front of an entire cluster of mc servers and make your life easy. Cache server sharding and striping are done for you.

It becomes as simple as:


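function load_profile($id)
{
    // memcached_instance() and profile_loader() stand in for the poster's
    // placeholder accessors; they are illustrative, not real library calls.
    $memcache = memcached_instance();
    $key = 'profile_' . $id;

    // Memcached::get() returns false on a miss.
    $profile = $memcache->get($key);
    if ($profile !== false)
        return $profile;

    // Cache miss: load from the database and repopulate the cache.
    $load = profile_loader();
    $profile = $load($id);
    if ($profile) {
        $memcache->set($key, $profile);
        return $profile;
    }

    throw new Exception('not found');
}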
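
The rule stated in this comment, that the cache is updated as part of the DB update, is commonly called a write-through cache. A minimal Python sketch of the idea, assuming plain dicts in place of MySQL and memcached; the class and method names are invented for illustration and do not come from any real client library:

```python
class ProfileStore:
    """Toy write-through cache: plain dicts stand in for MySQL and memcached."""

    def __init__(self) -> None:
        self.db = {}
        self.cache = {}

    def save(self, profile_id: int, profile: dict) -> None:
        # Write the database first, then refresh the cache in the same code
        # path, so in this single-process toy a read after a save never
        # returns the old cached value.
        self.db[profile_id] = dict(profile)
        self.cache[f"profile_{profile_id}"] = dict(profile)

    def load(self, profile_id: int) -> dict:
        key = f"profile_{profile_id}"
        if key in self.cache:
            return self.cache[key]
        if profile_id in self.db:
            # Cache miss (for example after eviction): repopulate from the DB.
            self.cache[key] = dict(self.db[profile_id])
            return self.cache[key]
        raise KeyError("not found")


store = ProfileStore()
store.save(7, {"name": "old"})
store.save(7, {"name": "new"})
print(store.load(7))
```

A real deployment also has to decide what happens when the cache write fails after the database write succeeds; this sketch ignores that case.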

Re:Not very well (0)

Anonymous Coward | about 5 years ago | (#27542243)

>I'd guess. If the system were written in a compiled language, that wouldn't be even slightly taxing.

This is meaningless, even if we assume that you're talking about compiling to native code. For starters, PHP is commonly run with an opcode cache, delivering reasonable performance. When I benchmarked, an opcode cache performed about the same as roadsend (PHP to Scheme to C to native object code).

This is the same argument as C vs asm, where the economic reality of the situation is ignored. "If you're not hand optimizing assembler you're writing inefficient code". The team coding in assembly can impress us all in 5 years when they launch their basic site. Meanwhile, market and mindshare are dominated by more business minded players.

>They've had to invest significant time, if you listen to the presentation, modifying PHP to make it less inefficient to even get to where they are now.

It doesn't matter what language you're using, if there's a valid and pressing business case for optimizing you do it.

Yeah, Blame the Language (2)

aoheno (645574) | about 5 years ago | (#27542603)

PHP is the most popular language on the planet for a good reason - transaction rate.

If code is written in any language such that the app cannot handle more than 12 transactions per second, it's time to find another profession instead of blaming the language.

Depending on the application, PHP can handle several hundred transactions per second, on *one* machine. It is common knowledge that Java requires far more resources to achieve a typical transaction rate, than PHP.

Both Java and PHP Are Interpreted (1)

aoheno (645574) | about 5 years ago | (#27542789)

Both Java and PHP are interpreted languages because this is how you create a cross-platform language.

Each gets compiled to bytecode which gets executed in an OS-specific VM.

Java bytecode is compiled manually. PHP bytecode is compiled automatically using an encoder. In both cases code is compiled once and reused.

We chose PHP for our website because of its efficiency in terms of development (e.g. no class generation step for programmers) and execution overhead. You don't need as much memory or CPU to run a typical PHP server.

Frankly, most websites do not need an app server. Wikipedia uses PHP, not Java. It is not a 'simple' website that you say PHP is suited for.

Re:Both Java and PHP Are Interpreted (3, Informative)

julesh (229690) | about 5 years ago | (#27547093)

>Both Java and PHP are interpreted languages because this is how you create a cross-platform language.

>Each gets compiled to bytecode which gets executed in an OS-specific VM.

Java is JIT compiled to native code, whereas PHP is bytecode interpreted. The difference is more than an order of magnitude. In fact, judging by this comparison [debian.org], in many cases Java is about 100 times faster than PHP.

>Frankly, most websites do not need an app server. Wikipedia uses PHP, not Java. It is not a 'simple' website that you say PHP is suited for.

Wikipedia is presenting uncustomised content to most users. It runs a huge squid cache in front of its PHP servers. If it tried to run PHP for each user it would crawl. I run MediaWiki locally on an AMD Athlon64 2200+. It takes ~0.2 seconds of 100% CPU time to process a simple request. There is simply no way Wikipedia could run without content caching.

This is not to say that the task of serving that content is cheap. But they're doing a lot better than Facebook; they're serving 30,000 requests/sec with only 350 servers. The difference, I suspect, is mostly down to the amount of caching they perform.

Facebook is much less able to cache content. It doesn't have a squid front end because relatively few users see the same exact content, unlike for Wikipedia; most users are logged in most of the time and see pages customised for themselves.
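
The contrast drawn here, shared pages that a front cache can absorb versus per-user pages that it cannot, can be sketched with a toy cache. This is an illustration under simple assumptions, not a model of squid: the class, the `render` function, and the URLs are all hypothetical.

```python
class FrontCache:
    """Toy shared page cache in front of a slow renderer (the role squid plays)."""

    def __init__(self, render) -> None:
        self.render = render
        self.pages = {}
        self.backend_hits = 0

    def get(self, url: str, user: str = None) -> str:
        if user is not None:
            # Personalised page: it cannot be shared, so always hit the backend.
            self.backend_hits += 1
            return self.render(url, user)
        if url not in self.pages:
            # First anonymous request for this URL: render once and keep it.
            self.backend_hits += 1
            self.pages[url] = self.render(url, None)
        return self.pages[url]


def render(url, user):
    who = user or "anonymous"
    return f"<html>{url} for {who}</html>"


cache = FrontCache(render)

for _ in range(100):
    cache.get("/wiki/LAMP")
anonymous_backend_hits = cache.backend_hits

for i in range(100):
    cache.get("/home", user=f"user{i}")
personalised_backend_hits = cache.backend_hits - anonymous_backend_hits

print("backend renders for 100 anonymous views:", anonymous_backend_hits)
print("backend renders for 100 logged-in views:", personalised_backend_hits)
```

In the toy, the anonymous traffic costs a single backend render while every logged-in view costs one, which is the asymmetry the comment attributes to the two sites.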

Re:Not very well (2, Insightful)

FishWithAHammer (957772) | about 5 years ago | (#27542823)

You do realize that you can do massive improvements to PHP perf in the space of 5-10 minutes without a recompile right...? The idea that PHP is "slow" is FUDtastic. Of course it's slow if all you're doing is letting it interpret every time, but with APC or another caching mechanism it's interpret once, run-the-bytecode every other time. Massive speed improvements.

Re:Not very well (1)

kv9 (697238) | about 5 years ago | (#27543015)

>but with APC or another caching mechanism it's interpret once, run-the-bytecode every other time. Massive speed improvements.

that, plus lots of content caching (think Wikipedia style, with gobs of squids in front of your apaches) makes it even faster.

note to mods, use less crack. parent is not flamebait.

Re:Not very well (1)

FishWithAHammer (957772) | about 5 years ago | (#27544949)

I've had some dipshit going through and modding a lot of my posts flamebait over the last couple days. It's actually kind of funny.

I've been thinking about trying to ditch apache on some mid-volume sites of mine and seeing how lighttpd handles it, see if there's any improvement. I don't use the majority of apache's functionality, so it could be a definite plus.

Re:Not very well (4, Interesting)

Firehed (942385) | about 5 years ago | (#27542893)

PHP, as a language, is more than capable of handling four requests per second (which can be said of pretty much anything other than punch cards).

Writing bad code in PHP, however, will of course slow things way down. Just like not having indexes on your databases, or doing stupid/unnecessary JOINs. Or not caching properly (see: Wordpress). Writing fast and efficient code in any language is easy enough provided you're a skilled programmer. Facebook, unfortunately, started off as Zuckerberg paying a friend with some web skills to build out a system, and it grew so quickly that replacing the code (or, rather, the DB schema) with something that doesn't suck probably became near-impossible. If you write code with scalability in mind, it's not a tremendous problem.

Of course, nothing is going to cope well with the sheer volume that Facebook deals with. There's plenty you can do along the way to help yourself out, which Facebook may or may not have done. You can bet that nobody thought the site would ever have 200MM users when the first lines of code were written; they probably never expected 1% of that. Writing intelligent code is the most important part of scalability - writing smart DB queries and minimizing the number required probably being the biggest part of that. Having your MySQL servers, instead of PHP, do some calculations in queries (hashes, query-related math, etc.) usually doesn't hurt, since you're generally offloading CPU-intensive operations to a disk-bound machine (i.e., one that has spare cycles).

There are all sorts of tricks and optimizations. Some are language-specific, and some aren't. But making bad decisions early on is a lot harder to fix than an inefficient foreach loop.

Re:Not very well (1)

julesh (229690) | about 5 years ago | (#27545183)

>PHP, as a language, is more than capable of handling four requests per second (which can be said of pretty much anything other than punch cards).

Not, I think, if your application is running into the 1 million lines of code range, as suggested by the presentation. Remember that the PHP interpreter has to reparse each of those lines for each request.

Re:Not very well (1, Interesting)

Anonymous Coward | about 5 years ago | (#27545965)

You might note that the presentation covers this to some extent. They mention some customizations they've made to PHP in the area of caching the bytecode from the PHP source files. They mention that PHP, by default, will stat the file system every 2 minutes to see if files have changed. From the sound of the presentation, they've probably customized it to check for updated files only when explicitly instructed to since they don't change the code that often.

Don't get me wrong...I still think PHP is completely unsuited to a site the size of Facebook, but it's not reparsing the PHP source file for every request.
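
The mechanism described in this comment, caching compiled bytecode and only occasionally checking the file system for changes, can be imitated in a few lines of Python. This is an analogy rather than a description of how APC or Facebook's PHP changes work: Python's built-in `compile()` stands in for PHP's compile-to-opcodes step, the class is invented, and the 120-second default mirrors the "every 2 minutes" figure quoted above.

```python
import os
import tempfile
import time
from typing import Optional


class CodeCache:
    """Toy analogue of an opcode cache: compile once, reuse the code object."""

    def __init__(self, recheck_seconds: float = 120.0) -> None:
        self.recheck_seconds = recheck_seconds  # how often to stat the file
        self.entries = {}  # path -> (code, mtime_ns, last_checked)
        self.compiles = 0

    def load(self, path: str, now: Optional[float] = None):
        now = time.monotonic() if now is None else now
        entry = self.entries.get(path)
        if entry is not None:
            code, mtime, last_checked = entry
            if now - last_checked < self.recheck_seconds:
                return code  # no stat, no parse
            if os.stat(path).st_mtime_ns == mtime:
                self.entries[path] = (code, mtime, now)
                return code  # stat only, no parse
        with open(path, encoding="utf-8") as handle:
            source = handle.read()
        code = compile(source, path, "exec")  # the expensive parse step
        self.compiles += 1
        self.entries[path] = (code, os.stat(path).st_mtime_ns, now)
        return code


with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "page.py")
    with open(script, "w", encoding="utf-8") as handle:
        handle.write("result = 6 * 7\n")

    cache = CodeCache()
    namespace = {}
    for _ in range(1000):  # 1000 "requests" for the same script
        exec(cache.load(script), namespace)

    compiles_after_1000_requests = cache.compiles
    answer = namespace["result"]

print("compiles:", compiles_after_1000_requests, "result:", answer)
```

Setting `recheck_seconds` to a very large value approximates the "only check when explicitly instructed" behaviour the comment guesses at, at the cost of needing a manual cache flush on deploy.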

Re:Not very well (0)

Anonymous Coward | about 5 years ago | (#27546035)

Not if your application uses bytecode caching (APC, xcache). If you use mod_php then it doesn't have to re-load the libraries on each request either.

Re:Not very well (1)

encoderer (1060616) | about 5 years ago | (#27546673)

Unless you're, you know, not braindead and you implement opcode caching... Not to mention, those 1MM lines are no doubt spread across thousands of files, classes and functions. Only a fraction would be interpreted for each request.

Re:Not very well (1)

Frools (1326479) | about 5 years ago | (#27545481)

>Having your MySQL servers instead of PHP do some calculations in queries (hashes, query-related math, etc) usually doesn't hurt since you're generally offloading CPU-intensive operations to a disk-bound machine.

Interestingly, in the presentation he said they actually do the opposite of that: things like md5 hashes are done in the application rather than on the DB, because it's much easier to scale up the number of web servers.
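
A small sketch of the choice described here: computing a hash on the web tier and handing the database a ready-made value. The schema, the `email_hash` helper, and the sample address are hypothetical, and an in-memory SQLite database stands in for a MySQL server purely so the example runs on its own.

```python
import hashlib
import sqlite3


def email_hash(email: str) -> str:
    """Hash on the web tier so the database only stores and compares the result."""
    return hashlib.md5(email.strip().lower().encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")  # stand-in for a MySQL server
conn.execute("CREATE TABLE users (name TEXT, email_md5 TEXT)")
conn.execute(
    "INSERT INTO users VALUES (?, ?)",
    ("alice", email_hash("Alice@example.com")),
)

# The alternative the parent comment describes would look roughly like
#   SELECT name FROM users WHERE email_md5 = MD5(?)
# in MySQL, which spends database CPU on the hash instead.
row = conn.execute(
    "SELECT name FROM users WHERE email_md5 = ?",
    (email_hash("alice@example.com"),),
).fetchone()

print(row)
```

The reasoning given in the comment is operational rather than about raw speed: adding web servers is easier than adding database servers, so CPU work is pushed toward the tier that scales out most cheaply.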

Re:Not very well (4, Interesting)

Fweeky (41046) | about 5 years ago | (#27543049)

They have somewhere in the region of 5,000 servers in their main datacenter and (I believe) others scattered around the world, but restricting it to just that main center, that means each server is handling around 4 requests per second

I somewhat doubt every single one of them is a dynamically driven webserver. Probably at least half are databases, search servers, caching servers, backend appservers, file servers, CDN type stuff, backup servers, hot spares, admin servers, staging machines, etc.

For example: Newzbin has 5 webservers in main rotation; it also has 7 search servers (plus one development machine with similar specs), 6 database machines, 2 backend systems running most of our cronjobs, 2 admin servers, 1 web development server, and 2 systems for building and deploying OS's from. As far as load is concerned, the backend stuff is far more important than the frontend. Sure, we could rewrite the main site in Java or Scala or C++ and get away with 3 webservers and still be N+2, but trust me, those extra two or three webservers are not a significant cost next to that of development.

I can either spend £5k on extra equipment (plus occasionally boosting our space and bandwidth costs, but those are dominated by other systems already), or I can spend £70k a year on another developer, who *still* won't allow us to match our development speed with PHP, and then rewrite tens of thousands of lines of code, likely into much more.

Much of our backend is written in C. That's where the big payoffs for efficient languages are, not a bit of database-limited HTML rendering. Judging by how many big sites are still running PHP, Python and Ruby for their frontends, this would seem to be the case elsewhere, too.

Re:Not very well (1)

chromatic (9471) | about 5 years ago | (#27544917)

If the system were written in a compiled language, that wouldn't be even slightly taxing.

What makes you think the system is CPU bound?

Re:Not very well (1)

Sebilrazen (870600) | about 5 years ago | (#27541731)

Every few days I run into whole sections of core Facebook functionality that are just plain broken for hours. Earlier this week, my main page wouldn't load for most of the day. And every couple of weeks I'm greeted with a "Sorry, you can't log in right now." message.

Are you kidding me? That's not broken, those may be value added features. I'd give a kidney if I didn't need to see the results every time a "friend" - using the term loosely - took a goddamn quiz to find out they were Pablo Picasso in a previous life, yet they were born in 1972 and Pablo died in 1973.

Re:Not very well (1)

rs79 (71822) | about 5 years ago | (#27541841)

You do realize you can customize FB to get as much or as little as you want in terms of those notifications don't you? Look up skillfoo.

The problem I have with FB is not that (any more); it's the fact that their pages (as of the last rev) are so computationally complex in JS now that rendering my FB home page causes my (fairly contemporary IBM/XP) laptop to just freeze for about 7 seconds while it's doing its JS and ajax goodness.

Previously, only Slashdot's new look did this, and not for as long.

I can get past the idea that almost nobody in the world understands how the interface works and that it changes too quickly for any book to be of use, but this slowness. Wow, I dunno.

Re:Not very well (1)

WillKemp (1338605) | about 5 years ago | (#27543915)

[......] rendering my FB home page causes my (fairly contemporary IBM/XP) laptop to just freeze for about 7 seconds while it's doing its JS and ajax goodness.

That's probably XP then, not FB - cos it doesn't do it on my (fairly contemporary Lenovo/Linux) laptop.

core broken functionality (1)

viralMeme (1461143) | about 5 years ago | (#27541797)

"Every few days I run into whole sections of core Facebook functionality that are just plain broken for hours"

What response did you get when you reported it to the Bug Reporting site [facebook.com]?

Re:core broken functionality (1)

kbrasee (1379057) | about 5 years ago | (#27542321)

Didn't submit a bug report, I never knew they had such a page. They certainly don't display it in a place where it's easily noticed.

Re:core broken functionality (1)

viralMeme (1461143) | about 5 years ago | (#27542803)

"Didn't submit a bug report, I never knew they had such a page. They certainly don't display it in a place where it's easily noticed"

I Googled 'facebook bug report', which led to an article that mentioned it.

A hodge podge mess (1, Insightful)

thammoud (193905) | about 5 years ago | (#27541587)

As an architect, I decided to view the presentation so that I can learn new things about scalability and architecture. This presentation came across as very amateurish and lacks any serious technical depth. Facebook seems to be stitched together as a set of "solution de jour" technologies without any real architecture behind it. Too many languages, frameworks and other gems. These guys took the notion of the right language for the task to an extreme. I have to believe that code releases into production is a big challenge for these folks.

Re:A hodge podge mess (0)

Anonymous Coward | about 5 years ago | (#27541795)

No it's a 'mash-up'. They're just shifting paradigms for the new Web 2.0 generation.

Haven't you heard of innovating?

Re:A hodge podge mess (1)

julesh (229690) | about 5 years ago | (#27541815)

These guys took the notion of the right language for the task to an extreme. I have to believe that code releases into production is a big challenge for these folks.

It explains a lot of the complaints you hear about them regularly (I'm not a user, but almost everyone I know is...), specifically that they're not rolling out new features as often as they did when they were small, and that large parts of the site are often unavailable for lengthy periods.

a hodge podge customized solution (1, Troll)

viralMeme (1461143) | about 5 years ago | (#27541875)

"Facebook seems to be stitched together as a set of "solution de jour" technologies without any real architecture behind it. Too many languages, frameworks and other gems. These guys took the notion of the right language for the task to an extreme. I have to believe that code releases into production is a big challenge for these folks"

What's 'hodge podge' about a highly customized solution? It is precisely what LAMP is all about. It does seem to work for them and with Facebook supporting 200 million active users, it is a good example of an Open Source success, so they must be doing something right.

Re:A hodge podge mess (1)

RedK (112790) | about 5 years ago | (#27542033)

You mean like technical depth in architecture documents ? The only people where I work that can ramble on for hours without ever saying 1 clear thing are IT architects.

Re:A hodge podge mess (1)

Gothmolly (148874) | about 5 years ago | (#27542109)

Declaring yourself an 'architect' doesn't mean a) you are one or b) you do anything OTHER than vague, powerpointy kinds of things. Given that Facebook is bigger and more complex than anything you have built, you should consider reserving judgment.

Re:A hodge podge mess (0)

Anonymous Coward | about 5 years ago | (#27542925)

Given that the federal government is bigger and more complex than anything I've ever built, should I also be silent on the coherence or propriety of its design?

The quality of the facebook architecture is measured in its ability to satisfy its quality attributes and less so in its ability to keep ambling along relatively effectively.

How maintainable is it? How operable is it? How secure is it?

All of these things are independent of size or complexity.

Re:A hodge podge mess (1)

thammoud (193905) | about 5 years ago | (#27543275)

In other news, GM is about to declare bankruptcy, yet at one point in time, they produced the most cars in the world. I have news for you. THEY WERE SHITTY CARS and eventually that caught up with them. I have no formal mechanical or design engineering background. I bet you have no opinion on GM since you never produced cars.

Re:A hodge podge mess (1)

kv9 (697238) | about 5 years ago | (#27543037)

This presentation came across as very amateurish and lacks any serious technical depth.

much like your post.

Re:A hodge podge mess (3, Insightful)

mfnickster (182520) | about 5 years ago | (#27543519)

Facebook seems to be stitched together as a set of "solution de jour" technologies without any real architecture behind it. Too many languages, frameworks and other gems.

I was thinking the opposite - they have developed an architecture that is modular enough to allow them to develop different pieces using different technologies, yet they all work together pretty seamlessly. I'd say that's quite an accomplishment!

Re:A hodge podge mess (1)

rinoid (451982) | about 5 years ago | (#27545109)

Blah blah blah blah.

"Facebook seems to be stitched together as a set of "solution de jour" technologies without any real architecture behind it."

What's "de jour" about PHP or LAMP? Christ man they have been around for over a decade.

So if this was written in Cobol would that make you happy?

Have you developed anything with the reach, user count, view count of Facebook?

FB doesn't seem to be buckling under the pressure. More photos are uploaded to FB than any other site man:
http://www.techcrunch.com/2009/02/22/facebook-photos-pulls-away-from-the-pack/ [techcrunch.com]

What's so hodge-podge about that implementation?! Because it's not JSP serving?

One question: (4, Interesting)

bogaboga (793279) | about 5 years ago | (#27541661)

About how much has Facebook saved by using Open Source Software? I ask because I am not familiar with licensing costs from competing solutions. Thanks!

Re:One question: (5, Informative)

julesh (229690) | about 5 years ago | (#27541779)

About how much has Facebook saved by using Open Source Software? I ask because I am not familiar with licensing costs from competing solutions. Thanks!

I haven't watched the presentation so don't know if this is answered there, but it's hard to pin down any numbers on precisely how many servers facebook operates. That said, an estimate of their expected power usage in their recently acquired second datacenter [datacenterknowledge.com] is 6 megawatts, placed at twice the usage in their current datacenter. Realistically, this probably equates to a cluster of around 5,000 machines in the current datacenter.

Costs per machine are likely to be restricted to Windows Server Web Edition; other software would not be needed on all machines (depending on cluster architecture, of course) so would be a trivial cost in comparison. Retail for the web edition is $399; I think we could expect such a high profile user to qualify for a 50% discount. This would put their software costs at about $1M. Considering that they're believed to have spent over 100 times this on hardware and support costs over the last year, I doubt this would be a particular concern. Price of purchase is not a factor in why facebook does not run on proprietary software.
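The back-of-the-envelope estimate above works out as follows. All inputs are the figures quoted in this post (5,000 machines, $399 retail, an assumed 50% discount, hardware and support at "over 100 times" the licence bill, and a current datacenter drawing half of the new one's 6 megawatts); none of them are confirmed numbers, and the variable names are only for illustration.

```python
# Inputs taken from the estimate above; every one is a guess, not a confirmed figure.
machines = 5000               # estimated servers in the current datacenter
retail_price_usd = 399        # Windows Server Web Edition, retail
discount = 0.5                # assumed volume discount

license_cost_usd = machines * retail_price_usd * (1 - discount)
print(f"licences: ${license_cost_usd:,.0f}")          # just under $1M

# "Over 100 times this" on hardware and support in the last year.
hardware_floor_usd = 100 * license_cost_usd
print(f"hardware and support: over ${hardware_floor_usd:,.0f}")

# Sanity check on the machine count: 6 MW is twice the current usage.
current_power_w = 6_000_000 / 2
print(f"implied draw per machine: {current_power_w / machines:.0f} W")
```

On these numbers the licence bill is about 1% of the hardware and support spend, which is why purchase price is unlikely to be the deciding factor.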

Re:One question: (0)

Anonymous Coward | about 5 years ago | (#27541843)

Facebook isn't a pure LAMP stack. They have a heavy Oracle database installation because MySQL can't perform for all workloads.

Re:One question: (2, Funny)

David Gerard (12369) | about 5 years ago | (#27542081)

Microsoft has announced the infrastructure for its cloud computing service Azure, formerly (and presently) Windows Vapor.

"We want all open source innovation to happen on Windows. In practice, Windows is too slow, and just putting Linux underneath the same software stack triples performance. So we're running the Windows versions of the software on Linux using Wine."

The new Microsoft Amazingly Open And Genuine Public License allows you complete freedom to use, modify and redistribute the software provided that every copy comes with a DVD of Windows Vista Ultimate, you acknowledge that Microsoft's FAT patent protects a remarkable and valuable innovation in computer science and all documentation is in OOXML.

(work in progress, not yet on notnews [today.com])

MOD PARENT UP (1)

da_matta (854422) | about 5 years ago | (#27543479)

This is both informative (a believable in-the-ball-park analysis of Facebook operational costs) & insightful (price is not enough to be the best solution)

Re:One question: (0)

Anonymous Coward | about 5 years ago | (#27541811)

considering who facebook is owned by, saving a few thousand by using unsupported opensource isnt that big of a deal

Re:One question: (0)

Anonymous Coward | about 5 years ago | (#27541977)

considering who facebook is owned by

Mark Zuckerberg? Facebook is a privately-owned company.

Re:One question: (1)

rinoid (451982) | about 5 years ago | (#27545141)

>...by using unsupported opensource isnt that big of a deal

Uh-oh the scary phrase "unsupported opensource" that all the FUD spreaders use when someone suggests using OSS on a project.

Guess what? Open source software is supported. It is supported by your engineers actually doing something with their knowledge other than sitting on the other end of a phone tree waiting for support to ask you if you've rebooted. It is supported by the many supporters to the code tree and by the open, expansive user forums.

Re:One question: (-1, Troll)

scientus (1357317) | about 5 years ago | (#27541863)

open source software doesn't have licenses. It is free. Well, at least free software [gnu.org] is, "open source" is kind of a wash word as its been stretched to mean anything.

Re:One question: (0)

Anonymous Coward | about 5 years ago | (#27542611)

open source software doesn't have licenses. It is free. Well, at least free software [gnu.org] is, "open source" is kind of a wash word as its been stretched to mean anything.

Please expand the acronym "GPL" for me.

Re:One question: (1)

ceejayoz (567949) | about 5 years ago | (#27542619)

Yeah, that's why they call it the GNU General Public License [gnu.org], eh?

Re:One question: (2, Insightful)

scientus (1357317) | about 5 years ago | (#27543455)

yeah sorry, wrote without thinking.

It doesn't have licences in the same way as commercial apps. Also agreeing to the licence is not mandatory to simply use the software, unlike the presumptions made by proprietary licences. In that way its licence is very different, but I did use the wrong words.

Re:One question: (1)

ducomputergeek (595742) | about 5 years ago | (#27541873)

That's always hard to calculate. It's always licensing costs vs. man hour costs. I've been involved in projects where people were trying to get MySQL to play nicely with EC2 without paying for MySQL enterprise. The amount of man hours it cost them would have made DB2 a much better solution. Especially with the fact that DB2 Express-C is free and already had AMI's ready to go.

Re:One question: (1, Informative)

Anonymous Coward | about 5 years ago | (#27541933)

I think the real question is, if they run with LAMP so much, how come they have, and advertise for, so many Oracle developers?

Palo Alto, CA

Description
Facebook is seeking an Oracle Applications Database Administrator to join the IT team and help build and maintain the IT application footprint. This is a full-time position based in our main office in downtown Palo Alto and will report to the manager of IT Development.

Re:One question: (0)

BitZtream (692029) | about 5 years ago | (#27542729)

Because if you get someone who actually knows how to deal with Oracle, then you get someone who has a clue.

This is in contrast to the fact that half of the high school kids in America think they are 'MySQL DBAs' because they installed it once.

An Oracle developer will have no problem working with MySQL, though they may find themselves banging their heads against solid objects often when they are forced to deal with a hack that claims to be a database server. You won't find a high percentage of people claiming to be 'MySQL developers' that actually have any clue about databases in general, let alone a system the size of Facebook.

Re:One question: (0, Flamebait)

FishWithAHammer (957772) | about 5 years ago | (#27542863)

Not quite. As I understand it, Oracle is used pretty heavily in the back end. It propagates out to faster-but-looser MySQL systems for web display.

Not that I disagree with you about the tardery of most MySQL "admins". ;-)

Related /. article (1)

rwa2 (4391) | about 5 years ago | (#27541925)

I know this should be the job of tags, but to help put this in context, remember the recent uptime [slashdot.org] comparison that showed Facebook with pretty decent availability compared to other social networking sites. I'd say it takes the admins a fair amount of discipline and perseverance to attain those kinds of numbers. (of course, it probably has nothing to do with the uptime of their various sundry and mostly useless modules, but I'd guess that's a different set of admins than the ones that care for the core LAMP platform)

This almost makes Facebook geek "cool" in my book, but I guess all the non-geek "cool" kids who use it already think so.

Re:Related /. article (2, Interesting)

BitZtream (692029) | about 5 years ago | (#27542677)

It takes pretty much 0 work to make LAMP continue to function. It's, for all practical purposes, set it up once (properly) and forget it.

It takes work to make the applications on top of it function continually, as that's where the change occurs. LAMP isn't going down on its own; it'll appear to 'go down' because the 'mostly useless modules' that work along with it fail, not because LAMP does.

I would expect the admin(s) that care for 'the core LAMP platform' spend most of their time doing other stuff. In reality, there's probably more than one only to avoid any single person holding too much knowledge and to maintain coverage while that person isn't at work. I just can't imagine they do a whole lot of work 'keeping it running', with the exception of handling database growth and performance, which is more likely handled by the people who design and work with the applications that use that database.

Re:Related /. article (1)

mR.bRiGhTsId3 (1196765) | about 5 years ago | (#27543155)

I am not a DBA (but I did stay at a Holiday Inn Express last night). But wouldn't it make more sense that the DBAs handle db performance and tuning, since they are the ones intricately familiar with how it is set up? Isn't the whole point that the developers working on the db apps should just know what data they want, not how the db is configured?

Come on guys (0)

Anonymous Coward | about 5 years ago | (#27542129)

We at Facebook know how to program, unlike the programmers for slashdot.org. I wrote the code that deploys the system and believe me, it's great and not fraught with problems as some here suggest. LAMP is a great tool, and we leverage it for what it is.

Re:Come on guys (2, Interesting)

BitZtream (692029) | about 5 years ago | (#27542617)

While I think Facebook is nothing more than one big popularity contest, I have to agree.

At least most of the stuff on Facebook's website works.

With slashdot, half the time clicking on a comment to expand it doesn't work unless I refresh several times or copy and paste the link into a new browser.

The right hand sidebar will say 'freshmeat' and show stuff from linux.com and vice versa.

At first I thought this was because I still used IE and that was the problem, being that slashdot doesn't cater to IE users, fine. So after I switched to Chrome I figured it wouldn't be an issue, yet it's not any different.

I still can't expect expanding a comment to work, I still get crap listed as fossfor.us showing freshmeat entries, 'get more comments' doesn't do shit half the time.

As I've said countless times, programming in PHP and using MySQL 99% of the time means you don't know what you are doing. There are, however, those few large sites that use it that can actually justify its usage because it fits, but only if you actually know what you're doing.

I have websites powered by PHP, ASP.NET, ASP, Java, and C. Some of those are good fits for what they do, some of them aren't and I've learned that the hard way. I've also learned that in most cases things are written because a developer 'knows' a specific language. My personal opinion is, if you only 'know' one language, you aren't a programmer. A real programmer can use just about any language given a good reference manual, and can be proficient in that language rather quickly after starting to work with it.

Unfortunately, most people who call themselves programmers, aren't. They just happen to be able to get by with a language they've been spoon fed in the past long enough to hack out some POS that barely manages to get the job done and will drive any sane programmer absolutely mad when they get stuck taking over after the original devs are found to be incompetent.

Makes you wonder how many online services have failed because of arrogance and ignorance of the developers.

Re:Come on guys (1, Insightful)

kv9 (697238) | about 5 years ago | (#27543107)

With slashdot, half the time clicking on a comment to expand it doesn't work unless I refresh several times or copy and paste the link into a new browser. [...] At first I thought this was because I still used IE and that was the problem, being that slashdot doesn't cater to IE users, fine. So after I switched to Chrome I figured it wouldn't be an issue, yet it's not any different.

if you're using the index2.pl (beta) in anything other than Firefox you're clearly an idiot. they state the support for that browser and the fact that it's beta should be enough of a hint to stop whining

if you're using the classic index and are having any of these problems you're lying through your teeth because they dont exist.

either way, you're an idiot.

Re:Programming FAIL (0)

Anonymous Coward | about 5 years ago | (#27545163)

Big Media and Cable are exactly like this too. I had the gross misfortune to work for a "media search" subsidiary of a massive nation-wide cable company. There were a handful of sharp programmers, don't get me wrong, but the rest of them were flat out disgustingly bad. Apparently nobody knew the first thing about software engineering, algorithms, let alone database design. I had the distinction of being in charge of all internal systems, run by a manager who was the most incompetent I have ever run into in 15+ years.

One choice example of sheer stupidity was this: in order to read thru a search result set, they read 100 rows. then to get the next 100 they REread the 1st 100, threw them away and continued with the 2nd set of 100. Repeat for the 300th, 400th records.
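To put a number on that pattern, here is a small sketch that counts how many rows get read when each page is fetched by re-reading from the first row, compared with resuming where the previous page stopped. The function names are made up for the illustration, and a plain Python list stands in for the result set.

```python
def pages_rereading(rows, page_size):
    """Fetch each page by scanning from row 0 and discarding earlier rows."""
    reads = 0
    pages = []
    offset = 0
    while offset < len(rows):
        scanned = rows[: offset + page_size]  # starts over from the first row
        reads += len(scanned)
        pages.append(scanned[offset:])        # keep only the new page
        offset += page_size
    return pages, reads


def pages_resuming(rows, page_size):
    """Fetch each page by continuing from where the last one ended."""
    reads = 0
    pages = []
    for offset in range(0, len(rows), page_size):
        page = rows[offset : offset + page_size]
        reads += len(page)
        pages.append(page)
    return pages, reads


rows = list(range(400))
same_pages = pages_rereading(rows, 100)[0] == pages_resuming(rows, 100)[0]
print(same_pages, pages_rereading(rows, 100)[1], pages_resuming(rows, 100)[1])
```

Both approaches return the same pages, but the re-reading version touches 100 + 200 + 300 + 400 rows for a 400-row result, and the gap grows quadratically with the number of pages.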

They created these massive Berkeley DB's but completely trashed it so that you couldn't actually use the hashes for anything. So it was basically a linear scan. The underlying technology was JBoss-based and they didn't have so much as the first clue how to even do that right.

I was shown the door after I wouldn't shut up about how fscking inexcusably incompetent the lot of them were (incl. their non-existent management). I'm not the least bit unhappy about leaving them. I hope they fail and do so spectacularly. It probably doesn't have too much bearing on it but nearly all of the development team were foreigners. I don't know how many of them were H1B but even if they weren't, it was a profound example of how Russians, Indians, and other Slavic programmers can be every bit as shit as their American counterparts.

Misread (1)

Cyphertube (62291) | about 5 years ago | (#27542367)

When I first saw the post, I thought it said how Facebook RUINS its LAMP stack.

I think that has to do with my experience with the apps and how often things timeout in that regard. It's a little frustrating and I'm sure it has nothing to do with the guys at Facebook, but it is interesting to find how that third-party experience affects my subconscious.

Why Spend Millions When You Can Spend Billions? (1)

aoheno (645574) | about 5 years ago | (#27542513)

Well, I'll be - the first LAMPCJ Stack - make it too big to fail.

For all of you fellow architect bods out there, this is how you do it:

PHP - California, Texas and France
C++ - New Jersey and Tibet
Java - California, India, and Somalia

Now, what does this variable name in Somali represent?

Who gives a shit? (2, Insightful)

PingXao (153057) | about 5 years ago | (#27544163)

I guess a story on /. with only 75 comments after 7 hours pretty much answers that question, eh?
