Great answers to this week's interview questions. Mick Morgan, of the UK's CCTA [Central Computer and Telecommunications Agency] has turned this Q&A session into a truly detailed primer on how to choose the hardware and operating system behind a high-profile Web site - and has dispelled quite a few myths in the process. You'll want to read this interview even if you're not into server mechanics. It contains enough personal insight and wit to be of interest even to Slashdot's least-technical readers. (Click below to see what we mean!)
Seems like a simple question, but why Linux? It seems like all the other high powered sites are using BSD of one variant or another.
Raul Acevedo asks:
In the original Sunday Times article, you are quoted as saying:
"... you can't beat them [Linux on Intel] in the bangs for your buck department. It blows Sun out of the water..."
Could you elaborate on how Linux compares to Solaris? Did you mean that Linux blows Sun out of the water in terms of price/performance (which is obvious since Linux is free), or just in general for your particular needs?
I'd be curious to hear your thoughts on Linux vs. Solaris, not just in terms of price, but overall performance, reliability, maintainability, and ease of use. As a developer, I'm seeing Linux considered as an alternative to Solaris in many places, but there's little factual (or even anecdotal) information comparing the two.
I'll take these two together since the answers overlap.
In retrospect, I wish I /had/ chosen OpenBSD ;-)
And I would certainly choose OpenBSD over GNU/Linux if I were building a firewall, or an intrusion detection system (based on say, Marcus Ranum's NFR) where packet capture at wire speed was important. (No - that tells you nothing about CCTA's network architecture....)
The choice of GNU/Linux seems to have caused all sorts of interest (witness this interview itself) when a *BSD may not have been so "controversial". Frankly I'm a little surprised at the reaction the choice seems to have generated. After all, we are just talking about web servers here. Many ISPs choose GNU/Linux on Intel for exactly the same reasons I have done - best value for money for the task in hand.
Let's put this into perspective first though - and dispel a few myths which seem to have cropped up in the press. I have emphatically /not/ ditched Solaris in favour of GNU/Linux. I still have 14 operational Solaris boxes running on the network. I have GNU/Linux running on 5 Dell PowerEdge 2300s (with half a gig of RAM each - the Times article suffered from poor editing). I also run GNU/Linux on my desktop in the office, on my laptop and desktop machines at home, and on a couple of internal servers handling DNS and proxy services for CCTA.
The GNU/Linux choice came about for two reasons:
- I had operational experience of GNU/Linux on a day-to-day basis.
- I was faced with replacing life-expired Sun hardware (including a SPARC 1000E and a couple of SPARC 20s) as part of the normal process of hardware maintenance/upgrade.
On the second point: when the usual business planning round came up and I had to make decisions about hardware replacement for some of the older servers, it was obvious that GNU/Linux on Intel could be a much cheaper option than simple replacement of the Sun hardware. Consider: a dual 450MHz Pentium II, with 27 gig of disk, internal DDS3 and CDROM and half a gig of RAM, costs less than £5000; a dual 300MHz UltraSPARC 2 with a similar configuration costs around three times that. Question: do I need to spend that kind of money simply to run a Web server? So I ran some tests and concluded that no, I didn't need to spend that kind of money (taxpayers' money, I should add) and plumped for the GNU/Linux on Intel combination on the purely pragmatic grounds of best value for money for the job in hand.
For the purpose of testing I took as a benchmark the maximum real-life hit rate I had ever seen on one of the Solaris servers - around 1.5-2 million hits in a day. (By hit, I mean an HTTP GET or POST request.) Then I doubled that as a working assumption of a realistic maximum load in my environment.
For testing I took a fairly standard, but reasonably specced, PC (a single 450MHz Pentium processor, 256Mb ECC SDRAM, a single 18Gb LVD 10,000 RPM SCSI disk) and loaded Red Hat Linux 5.2 running Apache 1.3.3 (because that was what I had to hand). Apart from the Web server, I turned off all other daemons. I then loaded that server with a complete copy of my main www.open.gov.uk web.
In order to simulate a real-life load, I had to find some way of grabbing a randomised list of URLs from the server which reflected the real world as closely as possible. After some testing with a variety of home-spun scripts and command-line web testers (such as webgrab) it quickly became clear that I would bog down the clients long before I made any real demands on the server. Some searching around and questions to colleagues led me to http://alumni.caltech.edu/~dank/fixing-overloaded-web-server.html which is a useful site pointing to benchmarks and tools. This pointed me to http_load at http://www.acme.com/software/http_load/ which turned out to be pretty nifty since it runs in a single process. And of course, being OSS, I could tweak the code slightly to match my requirements. Thus armed, I built some lists of URLs which were deliberately chosen to represent small text/HTML files, medium-sized gif/jpeg files and large PDFs, since this is the real-life mix on the public web servers. In load testing the server I then fired up just three client machines (one SPARC 5 running Solaris and two low-end Pentiums running GNU/Linux, since that was all I had to hand).
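The weighted URL-list building described above can be sketched in a few lines of Python. This is a hypothetical reconstruction, not Morgan's original script: the file paths, hostname and 60/30/10 weighting are all assumptions for illustration. The output format (one URL per line) is what http_load reads.

```python
import random

# Hypothetical pools of URLs by content type (assumed paths, not the real site's)
SMALL_HTML = ["/index.htm", "/news/today.htm", "/about.htm"]
MEDIUM_IMGS = ["/images/crest.gif", "/images/building.jpg"]
LARGE_PDFS = ["/docs/annual-report.pdf"]

def build_url_list(n, base="http://testserver"):
    """Draw n URLs, weighted to mimic the traffic mix described in the text:
    mostly small HTML, some medium images, a few large PDFs."""
    pools = [(SMALL_HTML, 0.6), (MEDIUM_IMGS, 0.3), (LARGE_PDFS, 0.1)]
    urls = []
    for _ in range(n):
        r = random.random()
        cumulative = 0.0
        for pool, weight in pools:
            cumulative += weight
            if r < cumulative:
                urls.append(base + random.choice(pool))
                break
    return urls

# Write the list in the one-URL-per-line format http_load expects
with open("urls.txt", "w") as f:
    f.write("\n".join(build_url_list(1000)) + "\n")
```

http_load can then be pointed at the file, e.g. `http_load -parallel 20 -seconds 300 urls.txt` to hold 20 concurrent fetches open for five minutes (parameters chosen here for illustration, not taken from the interview).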
In peak load testing over a sustained 4-hour period I managed to get the server to deliver over 13,000 Mbytes in just under 500,000 HTTP transfers. During that period, CPU utilisation never went above 10%, and was usually around the 5% mark. Disk utilisation was minimal. The network connection rate was much higher than anything I'd seen in real life on the existing external servers (some 500 established connections in snapshots taken during the load testing period). Also during the test, Apache complained that it had reached the MaxClients setting (then 150), with no adverse effects.
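To put those figures in perspective, the sustained rates work out as follows. This is simple arithmetic on the numbers quoted in the interview (4 hours, ~500,000 transfers, ~13,000 Mbytes, and the doubled target of 4 million hits per day):

```python
# Figures quoted for the 4-hour peak load test
seconds = 4 * 3600          # 14,400 s
transfers = 500_000         # HTTP transfers completed
megabytes = 13_000          # data delivered

req_per_sec = transfers / seconds    # ~34.7 requests/s sustained
mb_per_sec = megabytes / seconds     # ~0.9 MB/s of payload

# The stated target: double the worst observed day, i.e. 4 million hits/day
target_req_per_sec = 4_000_000 / 86_400   # ~46 requests/s as a daily average

print(round(req_per_sec, 1), round(mb_per_sec, 2), round(target_req_per_sec, 1))
```

So the sustained test rate was roughly three-quarters of the 4-million-hits-per-day target expressed as an average, while the CPU sat below 10% - which is the headroom argument being made in the next paragraph.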
Given that such a reasonably low end server handled most of what I could throw at it in my test environment, I concluded that GNU/Linux on only slightly beefier hardware made eminent sense.
Do you get many cracker/script kiddie attacks on the various web sites you run?
Any high profile site is going to attract unwelcome visitors. My job is made harder, and more stressful, by such attention - but that is what I am paid for. My friends know that I have nightmares about waking up to find graffiti (which is all it is) on one of my customers' sites.
Like any other conscientious sysadmin I take a personal interest in the security of my servers. Naturally I will use all the tools at my disposal to minimise the vulnerabilities. But of course I get unwelcome attention.
A plea to the community if I may. And here I can do no better than quote from Fyodor's article in Phrack Volume 8 issue 54 where he discusses remote OS fingerprinting:
"A worse possibility is someone scanning 500,000 hosts in advance to see what OS is running and what ports are open. Then when someone posts (say) a root hole in Sun's comsat daemon, our little cracker could grep his list for 'UDP/512' and 'Solaris 2.6' and he immediately has pages and pages of rootable boxes. It should be noted that this is SCRIPT KIDDIE behavior. You have demonstrated no skill and nobody is even remotely impressed that you were able to find some vulnerable .edu that had not patched the hole in time. Also, people will be even _less_ impressed if you use your newfound access to deface the department's web site with a self-aggrandizing rant about how damn good you are and how stupid the sysadmins must be."

Sysadmins are not stupid. They are simply usually overworked and have to balance the need to provide services to their customer base with the need to minimise the risks to those services. Attacking public servers (whoever owns them) merely serves to irritate sysadmins, and usually nobody else.
I was not overjoyed to notice comments on /. of the form "whoo, so the Royal Web site has moved to Linux. I've got a rootkit with your name on it" (you know who you are). Consider. I have just moved some high profile web sites to the OS of choice for you readers. You want to see that OS taken seriously. Scribbling graffiti all over such a web site would have all sorts of negative impacts on the perceptions of people who matter.
Besides, you'd upset me.
If you could add or change three things about Linux to make your job easier or more enjoyable, what would they be?
1. The ability to read BUGTRAQ, evaluate the threat, consider vulnerability to that threat and auto patch or upgrade accordingly. It should then email me saying "I'm OK now, you can go back to reading /.".
2. An artificial intelligence based real time log watcher and network daemon which could learn network connect patterns and modify either the stack or the services running accordingly. The system should be capable of real-time blocking (a la portsentry) of "hostile" connects, co-operation with external IDS systems and firewalls, real-time reconfiguration of external security components, real-time alerts to other hosts on the lines of "hey guys, I'm being hit by X, watch it." It should then email me saying "I'm OK now, you can go back to reading /." :-)
3. An ASCII character based version of rogue. I miss it.
What kind of redundancy do you build into the server system for such a large and important site, i.e. round-robin style servers or large, beefy superboxes, etc.?
You can see from the answer above that I do not use "large, beefy superboxes". Frankly, you don't need them to run a Web server. Nor do I use round-robin DNS or other load balancing such as Cisco LocalDirector. In my experience of the sites I run, I don't need to do so. None of the sites gets hit hard enough to warrant the additional complexity of mirrored, load-balanced servers. Our most popular site by far is the Royal Household site. That takes around 2-2.5 million hits per week (though I expect that to go up slightly now). The highest consistent hit rate I have seen is around 1.5-2 million hits per day. Any of the servers I have could cope with that. The redundancy we build in is in having backup hardware ready to run.
To what extent is the Royal Family involved with the site (e.g. content creation)?
The Royal Family take an active interest in both of the royal web sites (one of which, www.royalinsight.gov.uk, is hosted by the Press Association). This interest covers both the current content of the sites and future developments. The Queen herself launched royal.gov.uk in March 1997.
What's the official reaction to these sites running Linux? Assuming the British Government, and Her Majesty, are aware that their public image on the Internet is being presented via software that is non-traditional and non-commercial, what do they think of it all?
The priority for the heavily visited royal web site is accessibility, balanced of course by reliability and security. These are the important issues, rather than the nature of the server operating system.
What is your background? Are you a techie, an admin person, or an other? Do you use Linux personally? If so, did you come from a Unix, Windows or other background?
I am a techie (though some of my friends and colleagues are a little less complimentary than that). My background is in Unix sysadmin and network management. I joined CCTA in 1993 from the UK Treasury, where I was responsible for their Unix-based OA system. Prior to that I was responsible for IT security in the Treasury. I have done some small systems development work in the past on MS-DOS machines (well before Windows really took off) and CP/M micros. Most of my early career was in specialist support areas such as statistics, though I did a short stint in policy in the mid-to-late 80s - didn't like it much.
Yes, I use GNU/Linux personally. It is my preferred platform for home use.
Dicky also asks:
And a related question: What is the primary system around your department?
Depends what you mean by my department. In my area of responsibility the main systems are all *nix based. But the corporate desktop is NT4.
Brian Knotts asks:
The obvious question: Does the Queen read Slashdot? :-/
No. The Queen's interest in Internet matters is non-technical, although she sees on her visits to a wide variety of organisations the increasingly imaginative uses for the Internet.
Simon Brooke asks: I've been very pleased lately to see Open.Gov's clear policy statement on the use of open standards. I'm personally involved in working with some large UK companies on their own Web standards policies, and having this to point to has been extremely useful to me. How difficult was it to get buy-in to these standards from all the people who 'own' different Government sites, and how difficult is it to enforce?
I notice, for example, that the Scottish Parliament's web site, and my local Council's Web site, do not yet conform. Without wishing to point fingers at specific organisations, is it your intention to cajole all sites within .gov.uk into conforming to these standards? Is it appropriate for members of the public to draw these sites' administrators' attention to these standards?
CCTA has long been a standards-based organisation. My colleague Neil Pawley is CCTA's representative to W3C. Neil is also lead designer on the open.gov.uk site. Since CCTA is a member of W3C it is entirely appropriate that we should take a lead in using standards set by that organisation. Using HTML 4, CSS 2 and XHTML 1, for example, on a real-life server gives us valuable information on usability issues such as browser compatibility. Much of the feedback we have received has been very positive. On occasion we have had to deviate slightly from the standards where their use causes our public difficulty because of some incompatibility with a particular client setup. That experience itself is very helpful, since it allows us to feed back into the standards-making process.
CCTA has an advisory role on best practice in the use of IS/IT in the UK Public sector. We have no authority to mandate particular standards, nor would we seek to do so. If the use of standards is to be effective in any way, it is because the standards themselves make sense in the real world (witness the growth in the use of the TCP/IP protocol set at the expense of the OSI standards).
Simon Brooke adds... Oh, and, by the way, keep up the good work!
We intend to.
Thanks for your interest. It has been educational for me.
-- Mick Morgan
-- end --
Next week: John Vranesevich of AntiOnline.