
Ask Slashdot: Best Linux Distro For Computational Cluster?

timothy posted more than 3 years ago | from the what-will-you-be-computationalizing? dept.


DrKnark writes "I am not an IT professional; even so, I am one of the more knowledgeable in such matters at my department. We are now planning to build a new cluster (smallish, ~128 cores). The old cluster (built before my time) used Redhat Fedora, and this is also used in the larger centralized clusters around here. As such, most people here have some experience using that. My question is, are there better choices? Why are they better? What would be recommended if we need it to be fairly user-friendly? It has to have an X-windows server since we use that remotely from our Windows (yeah, yeah, I know) workstations."


None of them (-1)

Anonymous Coward | more than 3 years ago | (#36254848)

None. Use Solaris or AIX. Fuck that Lunix bullcrap. Even Winders is better.

Re:None of them (0)

el_jake (22335) | more than 3 years ago | (#36254882)

Just imagine a Beowulf cluster of bullcrap!

Re:None of them (2)

blair1q (305137) | more than 3 years ago | (#36254992)

Imagine a Beowulf cluster of BeOS, beotch.

Re:None of them (0)

Anonymous Coward | more than 3 years ago | (#36254886)

Your in-depth analysis is invaluable to the discussion.

Re:None of them (2)

MikeDirnt69 (1105185) | more than 3 years ago | (#36255658)

Wrong too. Use the distro you work better with.

YDL (1)

metalmaster (1005171) | more than 3 years ago | (#36254864)

Yellow Dog Linux ftw!

RHEL (2)

morcego (260031) | more than 3 years ago | (#36254872)

Redhat Enterprise Linux.

If you need something cheaper (no licenses), you can always go CentOS. Or you can mix both, having some RHEL and some CentOS machines.

Re:RHEL (5, Insightful)

pavon (30274) | more than 3 years ago | (#36255046)

If you need something cheaper (no licenses), you can always go CentOS.

If you want something compatible with Red Hat but cheaper, you should go with Scientific Linux, which is the same sort of idea as CentOS, but has more timely releases, and is used by other major clusters, like the ones at Fermilab and CERN.

Re:RHEL (0)

Anonymous Coward | more than 3 years ago | (#36255054)

Why bother recommending a distro without stating why it's useful for a research cluster? RH is for business; its package handling is a joke even today. The main reason people use RHES is that it has associated costs, and that gives the PHBs a fuzzy glow.

Re:RHEL (-1)

Dan667 (564390) | more than 3 years ago | (#36255098)

so true. I hate using rhel.

Re:RHEL (0)

Anonymous Coward | more than 3 years ago | (#36255110)

RHEL is fine, CentOS is just awful, and anytime someone offers up CentOS as a substitute for RHEL, I wonder if they've ever used CentOS. Watch for circular dependencies and lots of unavailable packages. You'll be installing most things from dag.wieers.com, and I feel bad for the guy paying for so much bandwidth for hosting rpms for this tripe of a Linux distribution. If you don't want to pay for support (and I mean RHEL, because YaST's finicky handling of configuration files makes SLES un-sysadmin-friendly IMO) then go with Ubuntu server. Everything you could need is an apt-get away, rather than google-the-wget away with CentOS and dag. I know my situation isn't a cluster, but we're running 20 Ubuntu servers in 15 colos currently, and our experience has been by far the best with Ubuntu.

Re:RHEL (2)

morgan_greywolf (835522) | more than 3 years ago | (#36255546)

RHEL is fine, CentOS is just awful, and anytime someone offers up CentOS as a substitute for RHEL, I wonder if they've ever used CentOS. Watch for circular dependencies and lots of unavailable packages.

I've never seen that problem with CentOS.

Everything you could need is an apt-get away, rather than google-the-wget away with CentOS and dag. I know my situation isn't a cluster, but we're running 20 Ubuntu servers in 15 colos currently, and our experience has been by far the best with Ubuntu.

The problem with Ubuntu for scientific computing is that many commercial scientific computing packages have runtime dependencies on old, outdated libraries found in Red Hat-based distros, but aren't available on Ubuntu without compiling from source. I used to admin 2 large compute clusters for a Fortune 100 NASA contractor, so I actually know what I'm talking about.

Re:RHEL (1)

slmdmd (769525) | more than 3 years ago | (#36255174)

If performance is the number one priority then I would say you should compile your own kernel. The standard distros have a lot of fat in the kernel. I have not tried many distros, so let other slashdotters pick a distro. Take the base distro and then begin kernel compilation, cutting out all the drivers for hardware you don't have: for example, the CD writer driver, tape drives, and every network card driver except the one you use, and much more. It will take a few iterations to get the desired kernel. Enterprise versions are outdated by years; for example, compare the RHEL kernel with the Fedora 15 kernel version. In my experience, "enterprise" means a crippled distro that is 10+ times more expensive.
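
To make that concrete, the trim-and-rebuild loop usually looks something like the sketch below (a rough outline only; source paths, config targets, and the install step vary by distro and kernel version):

    cd /usr/src/linux                       # unpacked kernel source tree
    cp /boot/config-"$(uname -r)" .config   # start from the running kernel's config
    make localmodconfig                     # drop modules that aren't currently loaded
    make menuconfig                         # hand-prune drivers you know you don't need
    make -j"$(nproc)" bzImage modules
    sudo make modules_install install       # most distros regenerate the initrd and bootloader entry here

Repeat the menuconfig/build steps until the kernel still boots and nothing you need is missing.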

Custom kernel doesn't gain you a thing (1)

Anonymous Coward | more than 3 years ago | (#36255452)

Doesn't gain you a thing. Drivers are loaded on demand as needed for local hardware. Unused drivers are not loaded at all, and do not impact performance or memory usage.

At most, a custom kernel can reduce the size of the initrd file used during boot.

The initrd is a compressed cpio file containing the contents of a memory resident root filesystem used during hardware initialization. Once hardware is identified, then required drivers (disk/video/keyboard/mouse) are loaded and the real root filesystem is used (using the driver from the initrd).

Once the real root is mounted additional drivers (if any) may be loaded as directed by configuration.

The only gain in a custom kernel is reducing the time to compile a kernel...
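
If you want to check this on your own machines, the initrd is easy to inspect; it is just a compressed cpio archive, as described above (a quick sketch; filenames and compression vary by distro):

    # Debian-style naming, gzip-compressed:
    zcat /boot/initrd.img-"$(uname -r)" | cpio -itv | less
    # RHEL/Fedora systems using dracut ship lsinitrd for the same job:
    lsinitrd /boot/initramfs-"$(uname -r)".img | less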

Re:RHEL (5, Informative)

b30w0lf (256235) | more than 3 years ago | (#36255670)

Agreed.

A primary component of my job is the design and maintenance of high performance compute clusters, previously in computational physics, presently in biomedical computing. Over the last few years I have had the privilege of working with multiple Top500 clusters. Almost every cluster I have ever touched has run some RHEL-like platform, and every cluster I deploy does as well (usually CentOS).

Why? Unfortunately, the real reasons are not terribly exciting. While it's entirely true that many distros will give you a lot more up-to-date software with many more bells and whistles, at the end of the day what you really want is a stable system that works. Now, I'm not going to jump into a holy war by claiming RedHat is more stable than much of anything, but what it is is tried and true in the HPC sector. The vast majority of compute clusters in existence run some RHEL variant. Chances are, if any distro is going to have hit and resolved a bug that surfaces when you have thousands of compute cores talking to each other, or manipulating large amounts of data, or running CPU/RAM intensive jobs, or making zillions of NFS (or whatever you choose) network filesystem calls at once, or using that latest QDR InfiniBand fabric with OpenMPI version 1.5.whatever, it's going to be RHEL. That kind of exposure tends to pay off.

Additionally, you're probably going to be running some software on this cluster, and there's a good chance that software is going to be supplied by someone else. That kind of software tends to fall into one of two camps: 1) commercial (and commercially supported) software, and 2) open source, small community research software. Both of these benefit from the prevalence of RHEL (though #1 more than #2). If you're going to be running a lot of #1, you probably just don't have an option. There's a very good chance that the vendor is just not going to support anything other than RHEL, and when it comes down to it, if your analysis isn't getting run and you call the vendor for support, the last thing you want to hear is "sorry, we don't support that platform." If you run a lot of #2, you'll generally benefit from the fact that there's a very high probability that the systems the open community software has primarily been tested on are RHEL-like systems.

Finally, since so many compute clusters have been deployed with RHEL-like distros, there are oodles of documentation out there on how to do it. This can be a pretty big help, especially if you're not used to the process. Chances are your deployment will be complicated enough without trying to reinvent the wheel.

PelicanHPC GNU Linux (formerly ParallelKnoppix) (0)

Anonymous Coward | more than 3 years ago | (#36254878)

http://pareto.uab.es/mcreel/PelicanHPC/

Slackware (-1)

Anonymous Coward | more than 3 years ago | (#36254880)

Slackware is trashy shit.

Use debian

Scientific Linux (5, Informative)

stox (131684) | more than 3 years ago | (#36254884)

Built for that very purpose.

Re:Scientific Linux (1)

Browzer (17971) | more than 3 years ago | (#36255964)

care to provide a link to that "informative" claim, and please don't say OpenAFS.

thanks

Re:Scientific Linux (5, Informative)

boristhespider (1678416) | more than 3 years ago | (#36256004)

Being in academia and spending time in a lot of departments I can at least confirm that a large number of departments are running Scientific. I've worked in Britain, the USA, Canada, Norway and Germany and while Germany (predictably enough) has a hankering for SuSE, the others have a tendency to run Scientific.

I did type in a long and boring anecdote about my experiences administering things running SGI Irix and Solaris back in the day, but wiped it when it began to look a bit incriminating and for all I know my ex-boss reads Slashdot. So I'll summarise as "don't administer SGI Irix or Solaris if you can avoid it". I'm no computer scientist, so maybe people who are better at it have no problems, but as a vaguely-competent scientist with an interest in computers but little more (like the original poster) I didn't get on with either of them. Red Hat was fine, and we hung Fedora machines off our central network and that was OK even though it was Fedora Core 1 with all its teething problems. And Scientific is very widely used in academia on big networks.

What do your sysadmins know? (0)

Anonymous Coward | more than 3 years ago | (#36254888)

If it was my choice, I'd go with Debian. I used RedHat for a few years in the '90s, then went to Debian after getting tired of dependency hell, and have not gone back since. Have worked in CentOS recently - not as bad as it used to be, but I still prefer Debian. Debian and its derivatives (like Ubuntu) were reported as being the "most important" Linux distro a couple months ago.

Re:What do your sysadmins know? (-1)

Anonymous Coward | more than 3 years ago | (#36254980)

And how does this information answer the question? How is Debian better than Fedora or CentOS for this purpose? It may be, I don't know, but how about explaining why it would be better for a computational cluster.

You are a loser.

Look at Rocks (1)

Anonymous Coward | more than 3 years ago | (#36254890)

It isn't about the OS, it is about the tools to manage it. Rocks is based on Centos, and helps you run the cluster.

http://www.rocksclusters.org/

NPACI Rocks (5, Informative)

rmassa (529444) | more than 3 years ago | (#36254894)

NPACI Rocks is probably your best bet. http://rocksclusters.org/ [rocksclusters.org]

Re:NPACI Rocks (1)

FromageTheDog (775349) | more than 3 years ago | (#36255334)

This. Rocks makes it so ridiculously easy to set up a cluster that administration literally can be a single-person job.

Re:NPACI Rocks (1)

esten (1024885) | more than 3 years ago | (#36255632)

Yep. Currently, as a grad student, I am managing a 128-proc cluster with 20 nodes which is shared between several groups. I knew a little bit of Linux when I started and nothing about clusters, and I really have had very few problems with the cluster. Rocks makes it really simple.

Re:NPACI Rocks (5, Interesting)

daemonc (145175) | more than 3 years ago | (#36255830)

Seconded. I used Rocks to build clusters for the university for which I worked, and it made my life much, much easier.

If you are already familiar with Redhat administration, you'll be happy to know Rocks can use either Redhat or CentOS as its base OS.

It uses meta-packages called "rolls", which completely automate the installation and configuration of your computing nodes. There are rolls that include most of the commonly used commercial and Open Source HPC software out there, or you can "roll" your own. Basically you just configure your head node, and then adding a compute node is as simple as setting the BIOS to boot over PXE, plugging it in, and you're done.
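
For those who haven't seen it, adding a node really is about two commands on the frontend (a sketch from memory; menu labels and subcommands can differ between Rocks releases):

    insert-ethers      # choose "Compute" from the appliance menu, then PXE-boot the new node
    rocks list host    # once the node has kickstarted itself, it shows up here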

Rocks, well, rocks.

Scientific Linux (5, Informative)

Skapare (16644) | more than 3 years ago | (#36254914)

How about Scientific Linux [wikipedia.org] ?

Ubuntu 10.04 LTS (0)

GoNINzo (32266) | more than 3 years ago | (#36254918)

Ubuntu 10.04 LTS, accept no substitute. Maybe use a Ubuntu 10.10.2 desktop to manage them, it's easy to use. (11.04 is still unstable, IMO.)

It actually all depends on what packages you plan on running. Then cross-reference that against what your options are; I think you'll run out of options quickly, TBH.

And you just need PuTTY on the Windows side. But if you have to, generally all of them can run X these days. I just find Ubuntu to be the easiest, plus their packaging system is the best, being Debian under the covers.

Please note, these are opinions, and I'm entitled to my informed opinion.

Ubuntu is for NOOBS! (-1)

Anonymous Coward | more than 3 years ago | (#36255008)

Slashdot groupthink update:

Apple: Perfect
Ubuntu: Shit
Windows: Indifferent
Sony: Haha
Google: Privacy threat
Facebook: Rapists. More plz.
Hipster culture: In
Hacker culture: out

Re:Ubuntu 10.04 LTS (1)

tokul (682258) | more than 3 years ago | (#36255148)

Ubuntu 10.04 LTS, accept no substitute.

So what should a user do? Use a substitute for Debian, or accept no substitute?

Re:Ubuntu 10.04 LTS (0)

Anonymous Coward | more than 3 years ago | (#36255290)

I don't think you know what the word "substitute" means.

Heh . . . Captcha: "Purity"

Re:Ubuntu 10.04 LTS - Why? (1)

Andy Dodd (701) | more than 3 years ago | (#36255920)

Simple question. The OP asked WHY you feel that is a solution for large-cluster HPC.

It looks like so far your only reason is "i liek it!" - I personally have no opinion or experience with HPC clusters, but so far nearly all of those who do are recommending something that is either RHEL or RHEL-based (Rocks or Scientific Linux), if only because it allows you to leverage commonality with the big cluster operators with installations in the Top500.

Disclaimer: I'm an Ubuntu user, and I greatly enjoy it, but I have not seen many examples of actual scientific clusters running it.

Re:Ubuntu 10.04 LTS (3, Insightful)

Hatta (162192) | more than 3 years ago | (#36255986)

Why? I know Ubuntu is the standard recommendation for grandma these days, but what makes you think it's particularly appropriate for a computational cluster? For instance, do you really need GNOME on a high performance cluster?

Fragmented Linux (-1)

Anonymous Coward | more than 3 years ago | (#36254920)

That's a good one.

CentOS+RocksClusters.org (0)

Anonymous Coward | more than 3 years ago | (#36254922)

I run a cluster built using the RocksClusters.org distribution, which is based on CentOS. The previous admin had us running OSCAR, but I found its management style too clunky (that was pre version 6); with Rocks we've never looked back.

I haven't used it, but Rocks also has a visualization roll that should probably include X Windows stuff. Also check out VNC or NoMachine, as X forwarding is a chatty protocol and gets a lot of lag once you are off the local network (i.e. from home or abroad).
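
For comparison, the two approaches look roughly like this (hostnames, display numbers, and geometry are placeholders):

    # Plain X forwarding: fine on the LAN, laggy over the WAN
    ssh -X user@cluster.example.edu xterm
    # VNC: the X session stays on the login node, only screen updates cross the network
    vncserver :1 -geometry 1280x1024        # run on the login node
    vncviewer cluster.example.edu:1         # run on your desktop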

what-will-you-be-computationalizing? (0)

Anonymous Coward | more than 3 years ago | (#36254926)

"From the 'what-will-you-be-computationalizing?' dept"

That's a good question.

Scientific Linux (2)

Ether (4235) | more than 3 years ago | (#36254940)

Scientific Linux. http://www.scientificlinux.org/ [scientificlinux.org] Has the benefit of RHEL: a stable OS environment without some of the headaches of CentOS. If you have money (you probably don't) RHEL is good.

Fedora (2)

tanawts (786512) | more than 3 years ago | (#36254944)

Fedora has components to help manage large deployments. https://fedorahosted.org/spacewalk/ [fedorahosted.org] It also has FreeIPA to help with a secure and scalable means of managing authentication/authorization/resources within the cluster. http://freeipa.org/page/Main_Page [freeipa.org]

Re:Fedora (1)

Fjandr (66656) | more than 3 years ago | (#36255472)

Fedora goes to the other extreme from CentOS. The update cycle is too short, which means you have increased worry about instability. Stuff just breaks sometimes, even though it's a good distro on the whole for many purposes. I'd assume stability is a top priority for someone putting together a cluster.

Requirements (1, Insightful)

hawkbat05 (1952326) | more than 3 years ago | (#36254948)

I think an important question here is why was Red Hat chosen for the other clusters? Your requirements aren't very specific; there are hundreds of distros that could meet your criteria.

Not an X Window Server (0)

Anonymous Coward | more than 3 years ago | (#36254954)

In the X Window model, the computer with the display is the server. Client programs connect to the server in order to display on its screen.

Mac! (1)

EonsWrath (1888134) | more than 3 years ago | (#36254966)

MAC! wait, what?

Which editor should he use? (1, Funny)

Albanach (527650) | more than 3 years ago | (#36254968)

Now we have such a clear winner on the choice of distro, perhaps we can discuss which would be the best editor on the cluster?

Re:Which editor should he use? (1)

Kamiza Ikioi (893310) | more than 3 years ago | (#36255096)

vi

Re:Which editor should he use? (0)

Anonymous Coward | more than 3 years ago | (#36255262)

Pfft!

If you can't do it in Emacs, it ain't worth doin'

Re:Which editor should he use? (0)

Anonymous Coward | more than 3 years ago | (#36255142)

nano. hands down.

Re:Which editor should he use? (0)

Anonymous Coward | more than 3 years ago | (#36255160)

emacs

Re:Which editor should he use? (1)

gilleain (1310105) | more than 3 years ago | (#36255166)

Now we have such a clear winner on the choice of distro, perhaps we can discuss which would be the best editor on the cluster?

Sounds good - and finish up with a reasoned, polite exchange of views on which programming languages to use on the new cluster?

Re:Which editor should he use? (1)

Ender Wiggin 77 (865636) | more than 3 years ago | (#36255296)

Editor should be SlickEdit. Programming language should be JavaScript execution of Java code converted with GWT. That's a science experiment right there.

Re:Which editor should he use? (0)

Anonymous Coward | more than 3 years ago | (#36255330)

C# and x86 assembly.

Re:Which editor should he use? (1)

Anonymous Coward | more than 3 years ago | (#36255186)

Nothing that belongs to the 80's.

Re:Which editor should he use? (1)

betterunixthanunix (980855) | more than 3 years ago | (#36255618)

The POSIX standard editor of course.

Rocks Cluster uses a modified Centos (2)

w3rdna (253598) | more than 3 years ago | (#36254970)

Centos is modified to be the base OS for the ROCKS Cluster.
http://www.rocksclusters.org/wordpress/

Re:Rocks Cluster uses a modified Centos (2)

erikscott (1360245) | more than 3 years ago | (#36255068)

The Rocks approach is nice for quickly regenerating a failed node. And it's Centos under the covers, as noted, so it's RHEL in disguise. If you're running 16 boxes with dual quad-cores, you'll lose the occasional disk drive. If you run 64 cheap desktops with single-socket dual-cores, you'll lose a disk drive every week or two.

Re:Rocks Cluster uses a modified Centos (-1)

Billly Gates (198444) | more than 3 years ago | (#36255128)

CentOS uses a circa 2006 kernel which has a fraction of the throughput of a modern Linux Kernel. 5.6 is very old and no longer maintained with security fixes.

If this cluster is utilized for five years then it will still be using a 2006 kernel in 2016! I lost faith in them when they refused to make CentOS 6 as RedHat 6.1 is now out.

Re:Rocks Cluster uses a modified Centos (2)

0racle (667029) | more than 3 years ago | (#36255436)

Ok, none of that is true. Even as a troll, that's pretty pathetic.

Choose wisely (0)

Anonymous Coward | more than 3 years ago | (#36254982)

1) You don't need an X server. You only need SSH. And if you forward windows via ssh to the Windows (or another Linux) workstation, you don't need an X server running on the server side.

For stability use Slackware (a pain to set up as the package management is weak, but it stays incorruptible for a lifetime). If you seek good package management, no-bullshit straightforward configuration and a custom layout (e.g. X server, lightweight or no window manager, network and monitoring daemons), use Arch Linux. Just avoid consumer-oriented distros; they usually come with a fancy and gigantic desktop environment and confusing graphical configuration wizards.

Distro isn't the biggie, it's the scheduler (5, Interesting)

javanree (962432) | more than 3 years ago | (#36254990)

I've worked with various clusters over the past year.
The distro doesn't really matter; mostly it's what you feel most comfortable with. I'd slightly favor RedHat Enterprise or a respin of it, since it's easiest in terms of drivers for commercial cluster hardware and commercial software support, but Debian would be just as fine. I would choose a 'stable' distro though, so no Fedora, no Ubuntu (even their LTS isn't exactly enterprise grade compared to RedHat / Suse or even Debian stable). You don't want to have to update every week, since this usually requires quite some work (making new images and rebooting all nodes).

What I have found matters a lot more is the scheduler you will use: Sun Grid Engine, PBS, Torque or Slurm, to name a few. Every scheduler comes with its strong and weak points; be sure to look at what matters most to you.
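
To give a flavor of what the scheduler choice looks like day to day, here is a minimal Slurm submission script (job name, module, and solver binary are placeholders; Torque/PBS works much the same way with #PBS directives and qsub):

    #!/bin/bash
    #SBATCH --job-name=test_run
    #SBATCH --ntasks=32              # MPI ranks
    #SBATCH --time=02:00:00
    #SBATCH --output=test_%j.log

    module load openmpi              # assumes environment modules are set up
    srun ./my_solver input.dat       # srun launches the MPI job under Slurm

Submit it with "sbatch job.sh" and check the queue with "squeue".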

If you are unfamiliar with all of these things, pick a complete bundle like Rocks (it's based on RedHat Enterprise Linux), which makes setting up a cluster quite easy and still allows you to choose which components you want. That'll greatly improve your chance of success. But be warned: it's still a steep learning curve building and especially configuring a cluster. The most time is spent tuning queuing parameters to maximize the performance of your cluster.

In Houston... (0)

Anonymous Coward | more than 3 years ago | (#36254994)

Most of the seismic clusters I work with use Centos or Fedora.

This Question (2)

SleazyRidr (1563649) | more than 3 years ago | (#36254996)

My comprehension of this question is roughly 'please have a flamewar about the different flavours of Linux.'

Re:This Question (1)

blair1q (305137) | more than 3 years ago | (#36255116)

..."and whoever is left standing and doesn't have too much shit on him in the end shall be king!"

Pretty much how all reviews go, minus the fun for the spectators.

X window (1)

turbidostato (878842) | more than 3 years ago | (#36255010)

"It has to have an X-windows server since we use that remotely from our Windows (yeah, yeah, I know) workstations."

So what? On one hand, in order to run Linux graphical apps on Windows you need an X Window server... on the Windows machine, not the Linux one. On the other hand, how is it that you *must* use GUI-based apps? Are there *really* no operational alternatives? (I've been administering Linux and Unix systems for almost two decades and I never needed - as in "must" - GUI-based apps for that.)

Re:X window (1)

grimsweep (578372) | more than 3 years ago | (#36255494)

Good chance that the GUI request deals primarily with user-friendly aspects of using the cluster. There are always alternatives to GUI-based apps, but there are plenty of times where using one will save you time and effort. Have you ever tried substituting Gimp with Image Magick? You can't beat the latter for batch image processing, but I wouldn't ask anyone to design a logo with it.

Re:X window (1)

bugi (8479) | more than 3 years ago | (#36255674)

He was asking for X libraries, not an X server. He's covered. No non-embedded distro will ship without such.

In X-land, the server is what talks to your display device on behalf of your other programs. The server manages the scarce resource. The clients bribe the server for access.

Familiarity matters (0)

Anonymous Coward | more than 3 years ago | (#36255018)

"The one you're (or your team/support staff is) most familiar with."

If none qualify under that criterion -- you probably want to go for popularity... because that leads to ease of finding support from people familiar with it. In that case, Ubuntu or possibly Fedora or CentOS.

SUSE Linux or SUSE Linux Enterprise (2)

hotfireball (948064) | more than 3 years ago | (#36255026)

If you are OK going with RHEL, you can also look at SLES: SUSE Linux Enterprise Server. They also have SUSE Studio, where you can make your own appliances. If you are a large enterprise, they will even give you a SUSE Studio appliance to be hosted in-house in your company for your own needs. They also have SUSE Manager — the same as Spacewalk, but with more features in it (and it is backward compatible with Spacewalk).

That depends on what you are trying to do... (0)

Anonymous Coward | more than 3 years ago | (#36255080)

It's like asking "which screwdriver bit is the best to use for driving screws"... You probably have an idea what software you intend to run on the cluster. If any of those are commercial packages, they probably have their support and installation processes set up for particular distributions. It'd be nice if they'd support anything, but in practice, they don't. Consult the vendors of any commercial product you plan to use with the cluster and bias yourself towards whatever they say they will support. It's easier that way.

Second, if that's not an overriding concern, then pick something that's reasonably popular or at least sticks to mainstream layouts and package availability. The reason is mostly that whatever has good mind-share is easier to find documentation and support for, not to mention contractors with experience.

You are lucky in that, at its core, Linux provides a stable platform with a lot of features that are largely consistent between platforms. There aren't going to be wild inconsistencies in performance or hardware support (there's some, but not much) and software is rarely distribution-dependent (though some may require some configuration/modification if the developer makes too many assumptions about the environment).

Red Hat for support (3, Insightful)

guruevi (827432) | more than 3 years ago | (#36255082)

RH support is phenomenal and that's why a lot of businesses use it. If you want it on the cheap, go with what you're comfortable with and what has your specific calculation packages built in (Debian if you like apt and open source packages, RPM if you use a lot of commercial packages). If you're looking for performance and specific hardware enhancements, go Gentoo or one of its brethren. Go with something that you can easily re-image if you're expecting lots of changes in software lineups or conflicts.

Scientific Linux 6.0 or RedHat Enterprise 6.1 (2)

Billly Gates (198444) | more than 3 years ago | (#36255084)

Scientific Linux 6.0 is built on Redhat Enterprise Linux 6, which is highly tested and tuned for server throughput, power management, and stability compared to a stock vanilla kernel. The performance will be much better than a stock Debian stable kernel or Ubuntu, for example. Redhat has a bunch of hackers. Scientific Linux includes apps used by scientists, which may be your target market if you are a university too. If your old cluster has scripts and tools optimized for Redhat and RPMs, then it makes sense to use a Redhat-based distribution.

If the scientific apps in Scientific Linux are not being utilized, then just buy a license for RedHat Enterprise Linux 6.1. The licensing fees are affordable if you have the budget for a large cluster and switches. With RedHat Enterprise Linux you have support too if something goes down.

Remember, saving a few bucks by going free is silly in an expensive project like this.

Re:Scientific Linux 6.0 or RedHat Enterprise 6.1 (1)

blair1q (305137) | more than 3 years ago | (#36255138)

Scientific Linux 6.0 is built on Redhat Enterprise Linux 6

So is Scientific Linux 6.0 free?

Re:Scientific Linux 6.0 or RedHat Enterprise 6.1 (1)

blair1q (305137) | more than 3 years ago | (#36255154)

eh, nemmind. my brain wandered while my eyes worked over your other two paragraphs.

NPACI Rocks (2)

jfp51 (64421) | more than 3 years ago | (#36255102)

NPACI Rocks, without a doubt. It's Red Hat centric and you need to put in some work to understand how it ticks, but once you do and have set up your cluster properly, it is very solid and reliable.

RPM based distro (0)

Anonymous Coward | more than 3 years ago | (#36255146)

If a theme hasn't come up yet: go with an RPM or RedHat based distro. I would stay away from Fedora because of how fast development is on it, and go with either Scientific or CentOS. The skill sets that people around you have for Fedora will translate over just fine since they are all based on RPM. If you want to pay money and get support, then go with RedHat; otherwise CentOS is just RedHat EL but without the RedHat name.

Check your ISV and HW vendor (0)

Anonymous Coward | more than 3 years ago | (#36255164)

I work in the industry, and know from personal experience that Linux distro selection is often almost religious.

But what you should be doing is checking with your ISVs what they support, and doing the same with your HW vendor. Try to make a matrix and you may find that the only option is one of the Enterprise distributions.

If you're in the rare situation that you have source code for everything, I suggest you look at Scientific Linux or CentOS.

Heresy, I know, but... (0)

Anonymous Coward | more than 3 years ago | (#36255198)

Microsoft Windows Server has an HPC (High-Performance-Computing) Edition. Several universities and research companies use this product, and it can now use Windows Azure for on-demand worker nodes. Essentially you could set up only one "head" node and then just pay for computation when needed.
http://blogs.msdn.com/b/ignitionshowcase/archive/2011/04/20/windows-high-performance-computing-bursting-into-windows-azure.aspx?wa=wsignin1.0

If you're interested specifically in Linux, I recommend a Beowulf Cluster, which is what we used when I worked at the Space Center in Florida several years ago.
http://beowulf.org/

Funny how 128 cores used to seem like a lot (1)

Quila (201335) | more than 3 years ago | (#36255210)

I was just pricing 2U database servers that had 32 cores each. A 128 core cluster is now just four small off-the-shelf servers in a rack for less than a hundred grand.

Red Hat Enterprise Server (1)

jschmitz (607083) | more than 3 years ago | (#36255222)

Period.

x-window server? (2)

tokul (682258) | more than 3 years ago | (#36255230)

It has to have an X-windows server since we use that remotely from our Windows (yeah, yeah, I know) workstations.

Are you sure that you know? You run a local X window server on your Windows machine when you use X window programs.

Go ask someone who knows what she's doing. (0)

Anonymous Coward | more than 3 years ago | (#36255236)

Seriously, the distribution is not the issue. You need a good concept (LDAP/SSO, automated installation, patch management, network/clustered filesystems, ...) to be able to manage that beast in the long term. Have someone capable do it for you initially.

Back to topic: I'm working with all the enterprise crap money can buy. Don't let anyone bullshit you into buying enterprise distributions because of certifications. You don't need that. Debian is a far better foundation both technologically and financially.

Redhat or fedora (0)

Anonymous Coward | more than 3 years ago | (#36255248)

Redhat or fedora
good forum support

baremetal (1)

Anonymous Coward | more than 3 years ago | (#36255250)

http://www.returninfinity.com/baremetal.html

CentOS, Scientific Linux, Ubuntu, Debian (4, Informative)

MetricT (128876) | more than 3 years ago | (#36255254)

I've got 10+ years experience managing a large (2000 core, 1+ PB storage) compute cluster. If you're using one of those annoying commercial apps that assume Linux = Red Hat Linux (Matlab, Oracle, GPFS, etc.), then CentOS or Scientific Linux are the way to go.

If you don't have that constraint, consider Ubuntu or Debian. apt-get is my single favorite feature in the history of Unix-dom. Plus, there are often pre-built packages for several common cluster programs (Torque, Globus, Atlas, Lapack, FFTW, etc.) which can get you up and running a lot faster than if you had to build them yourselves.
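
For what it's worth, most of the libraries mentioned above really are a one-liner on Debian/Ubuntu (package names from memory; they can differ between releases, so check with apt-cache search first):

    sudo apt-get install libfftw3-dev liblapack-dev libatlas-base-dev \
                         libopenmpi-dev openmpi-bin torque-server torque-mom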

Re:CentOS, Scientific Linux, Ubuntu, Debian (2, Informative)

Anonymous Coward | more than 3 years ago | (#36255894)

I run matlab instances here on my debian vms - no problems. All in all, we have about 800 machines here over several clusters, and everything runs on debian.

Debian (1)

Alex Belits (437) | more than 3 years ago | (#36255256)

Debian -- easy to manage, easy to create new packages for, least amount of nonstandard, distribution-specific stuff (except configuration files management, but that is a result of having to keep individual packages' configuration tied to packages).

Response & questions (1)

multimediavt (965608) | more than 3 years ago | (#36255268)

1. What types of computation is the cluster going to be used for? MD, CFD, ???

2. What software will be used on the nodes? CHARMM, GAMESS, LAMMPS, NWChem, etc.

3. Do you have a preference for a Linux distro? If not, it really doesn't matter that much if you are rolling your own cluster and software stack. It will just determine what things are used for package management and what services in the distro you might want to turn off in order to get the most memory for apps and not the base OS.

4. You should be using SSH as the main interface for the actual compute nodes and maybe (big maybe) have an X server on the login/compile head nodes, but NOT the compute nodes. You want the compute nodes to be as bare as possible to conserve as much RAM and scratch disk space for apps as possible.

Having said all that, CentOS, Fedora, SuSE and RHEL are probably the most popular on distributed memory clusters today. You will also want to make sure that whatever compilers you are using are compatible with the Linux distro you want to use, unless you are relying completely on gcc or binary applications. I have built many clusters from scratch and can be a point of contact should you have additional questions.
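
On the point about keeping compute nodes as bare as possible (point 4 above), on a RHEL/CentOS-style node the usual first pass is just to list and disable services you don't need so the RAM goes to jobs instead (service names below are only examples):

    chkconfig --list | grep ':on'    # see what starts at boot
    chkconfig cups off               # no printing on a compute node
    chkconfig bluetooth off
    service cups stop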

debian squeeze (2)

dermond (33903) | more than 3 years ago | (#36255280)

We run our 320-core cluster on Debian squeeze. InfiniBand support out of the box. Grid Engine is a matter of apt-get install. It comes with tons of scientific software.
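
For reference, the "matter of apt-get install" part looks roughly like this on squeeze (package names as I remember them; verify with apt-cache search gridengine):

    sudo apt-get install gridengine-master gridengine-qmon    # head node
    sudo apt-get install gridengine-exec gridengine-client    # compute nodes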

Swing the license for RHEL (2)

Zemplar (764598) | more than 3 years ago | (#36255284)

Scientific Linux is totally awesome, but a project of this size, especially with the IT knowledge on hand, needs the support and first-rate product which RedHat provides.

RHEL5 or Ubuntu10.04 (1)

digitaldebris (2038896) | more than 3 years ago | (#36255300)

If your cluster is going to be closed circuit (no internet access) I would recommend RHEL5, as finding and installing RPMs is generally easier when you're not able to use the default distro package utility. If your cluster will have access to the internet (so you'll be able to use the distro package app) I'd recommend Ubuntu 10.04, as the distro repository is up to date and constantly growing.
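
Even on a closed-circuit cluster you can keep yum working by carrying RPMs in on a local repository; a rough sketch (all paths and the final package name are illustrative):

    mkdir -p /srv/localrepo && cp *.rpm /srv/localrepo/
    createrepo /srv/localrepo                 # generates the repodata/ metadata
    cat > /etc/yum.repos.d/local.repo <<'EOF'
    [local]
    name=Local cluster repo
    baseurl=file:///srv/localrepo
    enabled=1
    gpgcheck=0
    EOF
    yum install some-package                  # now resolves against the local repo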

Rocks built on CentOS or Scientific Linux (1)

Anonymous Coward | more than 3 years ago | (#36255358)

For building and maintaining a small cluster, especially to anyone whose main job is not going to be maintaining the cluster, you should take a look at Rocks. It actually builds on top of a regular Linux distro, although only certain distros work. Redhat Enterprise, CentOS, and Scientific Linux are mentioned in the documentation as being compatible.

What Rocks does is add a bunch of cluster-specific tools to the underlying distro. It helps take care of networking and setting up the compute nodes for easy maintenance and configuration. You basically configure your front end, and then it is extremely simple to manage the compute nodes (including installation; installation by default is done over the network between the compute node and the head node). I have also found the Rocks mailing list to be extremely helpful, even to folks who are new to building clusters.

RHEL or SLES will prevent ulcers (0)

Anonymous Coward | more than 3 years ago | (#36255380)

I work for an HPC vendor, thus the anonymous posting.

If you’re going to be running commercial codes/solvers/etc., you need to stick with one of the RPM-based distros. It’s all they test against. If you’re going to be running a fancy interconnect (Infiniband, etc.), that may further restrict your choices.

If you’re running homegrown code, go with your gut.

Building Clusters (5, Informative)

Nite_Hawk (1304) | more than 3 years ago | (#36255412)

Hi,

I work at a Supercomputing Institute. You can run many different OSes and be successful with any of them. We run SLES on most of our systems, but CentOS and Redhat are fine, and I'm using Ubuntu successfully for an Openstack cloud. Rocks is popular, though it ties you to certain ways of doing things which may or may not be your cup of tea. Certainly it offers you a lot of common cluster software prepackaged, which may be what you are looking for.

More important than the OS are the things that surround it. What does your network look like? How are you going to install nodes, and how are you going to manage software? Personally, I'm a fan of using dhcp3 and tftpboot along with kickstart to network boot the nodes and launch installs, then network boot with a pass-through to the local disk when they run. Once the initial install is done I use Puppet to take over the rest of the configuration management for the node, based on a pre-configured template for whatever job that node will serve (for clusters it's pretty easy since you are mostly dealing with compute nodes). It becomes extremely easy to replace nodes by just registering their MAC address and booting them into an install. This is just one way of doing it though. You could use cobbler to tie everything together, or use FAI. xCAT is popular on big systems, or you could use SystemImager, or replace Puppet with Chef or cfengine... Next you have to decide how you want to schedule jobs. You could use Torque and Maui, or Sun Grid Engine, or SLURM...
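
The dhcp/tftp/kickstart plumbing mentioned above boils down to surprisingly little configuration; a sketch with made-up addresses and paths (file locations differ between distros):

    # dhcpd.conf: point PXE clients at the TFTP server on the head node
    cat >> /etc/dhcp/dhcpd.conf <<'EOF'
    subnet 10.1.0.0 netmask 255.255.0.0 {
      range 10.1.1.1 10.1.1.254;
      next-server 10.1.0.1;
      filename "pxelinux.0";
    }
    EOF

    # pxelinux default entry: boot the installer and hand it a kickstart file
    cat > /var/lib/tftpboot/pxelinux.cfg/default <<'EOF'
    default install
    label install
      kernel vmlinuz
      append initrd=initrd.img ks=http://10.1.0.1/ks/compute.cfg
    EOF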

Or if you are only talking about like 8-16 nodes, you could just manually install Ubuntu on the nodes, pdsh apt-get update, and make people schedule their jobs on Google Calendar. ;) For the size of cluster you are talking about, and what I assume is probably a very limited administration budget, that might be the best way to go. Even with something like Rocks you are going to need to know what's going on when things break, and it can get really complicated really fast.
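
The pdsh part of that really is as simple as it sounds (hostnames are placeholders; dshbak ships with pdsh):

    pdsh -w node[01-16] 'sudo apt-get -y update && sudo apt-get -y upgrade'
    pdsh -w node[01-16] uptime | dshbak -c    # collate identical output per node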

Re:Building Clusters (2)

clutch110 (528473) | more than 3 years ago | (#36255698)

This post is full of good information. I have been managing HPC for seismic companies for the past 8 years now. I regularly use xCAT as I find that after a few nodes automation is the way to go.

You will find that most clusters run RedHat or a variant of the OS. Most places run CentOS on the nodes and have a machine with RedHat stashed around somewhere in case a problem occurs and they need to reproduce it on a "supported" OS.

Why is there a requirement for a full-blown X install? Are these machines desktop boxes or are they racked? Typically you have thin client software installed at the cluster gateway. We use both NX and ThinAnywhere today.

Debian (1)

wirelesslayers (2014486) | more than 3 years ago | (#36255504)

I built one with Debian Lenny, plus I developed a scheduling system in Perl using kernel containers + cgroups. That way the research team thinks they own a "real" Linux box and I can share resources in a better way. I also used Perl CGI to make a container designer, so no one touches my real OS. The research leader just uses my container design to create and deploy new containers; a new container is booted on a testing machine so the person can install whatever he needs, and it is cloned later to deploy on the cluster (400 cores + 1.8 TB RAM).
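
A very stripped-down version of that kind of cgroup fencing, using the stock libcgroup tools on a cgroup-v1 system (group name, limits, and the job binary are all examples):

    sudo cgcreate -g cpu,memory:/research_team
    sudo cgset -r cpu.shares=512 research_team
    sudo cgset -r memory.limit_in_bytes=64G research_team
    sudo cgexec -g cpu,memory:research_team ./long_running_job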

Ask HPC people (0)

Anonymous Coward | more than 3 years ago | (#36255528)

First, you need to ask this question in a place where folks are familiar with HPC, not a general forum where anyone can tote the flag for the Linux Flavor of the Week(tm) - since the term 'cluster' means failover to most. Second, you need to consider which distributions are supported by your solvers and schedulers. Third, why do you need to run X at all? You should probably look into an X server on your Windows desktops instead and save the computational cycles for, well, computation.

you have no idea what you are talking about!!! (1)

Anonymous Coward | more than 3 years ago | (#36255544)

but that is ok! rather than asking quickie questions and expecting quickie answers, you can start by learning the difference between a system administrator and IT professionals.

"user friendliness" is proportional to sysadmin's abilities or proportional to $$$ for commercial tech support

1. A (stale) link to get you going on hpc clusters: http://www.hpccommunity.org/section/kusu-45/ [hpccommunity.org]

2. http://www.platform.com/ [platform.com] - Dell's/Redhat official hpc cluster (at least a couple of years ago) which was based on kusu (see previous link). In other words, RH was(is?) using a third party for their RH HPC - correction needed if things have changed. - a great yo-yo system (DellRedHatPlatform) in case you have issues.

3. http://www.caoslinux.org/ [caoslinux.org]

Scientific Linux (2)

scheme (19778) | more than 3 years ago | (#36255628)

A lot of this depends on what you're doing with your cluster and what apps you're running. However, Scientific Linux is used by quite a few large clusters, and all of the US ATLAS and CMS clusters run on it. As others have mentioned, you probably want to be more interested in how the cluster is managed and how nodes are set up and kept up to date. I'd recommend something like cobbler and puppet or some other change management system, so that you can set up profiles and have them propagated to the various nodes automatically. This is preferable and easier than going through and making the same configuration changes on 5-10 machines.
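
The smallest possible taste of the puppet side of that is a one-off resource applied locally (package and service names are just examples; a real setup would use a puppetmaster and per-node manifests):

    sudo puppet apply -e "package { 'openmpi-bin': ensure => installed }"
    sudo puppet apply -e "service { 'ntpd': ensure => running, enable => true }"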

Debian/Ubuntu (1)

dogmatixpsych (786818) | more than 3 years ago | (#36255726)

I'd have to agree with the Debian/Ubuntu route if you want user friendliness. I've always found Debianesque systems much more manageable than other distros. If I have to provide most of the IT myself, I prefer Debian/Ubuntu. There are some science Debian distros as well (and repositories).

Scientific Linux would likely be faster overall for computationally heavy tasks but it really depends on what you are planning on doing. Debian wouldn't be slow, just not quite as fast as Scientific Linux; but again, that might not matter very much in the big picture.

Debian/Gridengine (1)

jbazik (56295) | more than 3 years ago | (#36255762)

I run a 730-core cluster on debian/gridengine. We're a debian shop, and keeping the cluster platform the same as our desktops is an advantage. Configuring gridengine takes some effort, but so far we're pleased with the result.

Scientific Linux is your choice (0)

Anonymous Coward | more than 3 years ago | (#36255896)

I work in a national supercomputing centre and I can only recommend that you use Scientific Linux and Quattor to manage configuration and installation. There's nothing as powerful as Quattor, and it's used by many computing centres (IN2P3, CERN, etc.).

As scheduler and batch system I recommend you use Slurm or the combination of Torque/Moab. Run away from Maui! And LSF is so damn expensive!

We run Cray's SLES and Scientific Linux with both Slurm and Torque/Maui, and we're very happy with our clusters.

In addition, I recommend Lustre as a shared FS. It's relatively easy to use and install.

Good luck with your setup!!!

M
