Multi-Server Microkernel OS Genode 12.11 Can Build Itself 102
An anonymous reader wrote in with a story on OS News about the latest release of the Genode Microkernel OS Framework. Brought to you by the research labs at TU Dresden, Genode is based on the L4 microkernel and aims to provide a framework for writing multi-server operating systems (think the Hurd, but with even device drivers as userspace tasks). Until recently, the primary use of L4 seems to have been as a glorified Hypervisor for Linux, but now that's changing: the Genode example OS can build itself on itself: "Even though there is a large track record of individual programs and libraries ported to the environment, those programs used to be self-sustaining applications that require only little interaction with other programs. In contrast, the build system relies on many utilities working together using mechanisms such as files, pipes, output redirection, and execve. The Genode base system does not come with any of those mechanisms let alone the subtle semantics of the POSIX interface as expected by those utilities. Being true to microkernel principles, Genode's API has a far lower abstraction level and is much more rigid in scope." The detailed changelog has information on the huge architectural overhaul of this release. One thing this release features that Hurd still doesn't have: working sound support. For those unfamiliar with multi-server systems, the project has a brief conceptual overview document.
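The summary's point about pipes, output redirection, and execve is easy to make concrete. As a rough illustration (Python standing in for the C a real shell would use — this is not Genode's API, just the POSIX plumbing a build system takes for granted), here is what a shell does for a pipeline like `echo hello | tr a-z A-Z`:

```python
import os
import subprocess

def run_pipeline():
    # What a POSIX shell does for `echo hello | tr a-z A-Z`:
    # create a pipe, point one child's stdout and the other child's stdin
    # at it, then exec both programs. Build tools lean on exactly this.
    read_end, write_end = os.pipe()
    producer = subprocess.Popen(["echo", "hello"], stdout=write_end)
    os.close(write_end)  # drop the parent's copy so the reader sees EOF
    consumer = subprocess.Popen(["tr", "a-z", "A-Z"],
                                stdin=read_end, stdout=subprocess.PIPE)
    os.close(read_end)
    output, _ = consumer.communicate()
    producer.wait()
    return output.strip()

print(run_pipeline())  # → b'HELLO'
```

A base system that lacks pipes, fork/exec semantics, and fd inheritance has to provide all of this before a stock build system can run at all — which is what makes self-hosting a milestone.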
No plans for LLVM (Score:5, Informative)
For anybody wondering [osnews.com]:
Re: (Score:2, Interesting)
In particular, because it is very rigid in the tools it needs to work with, making it more complicated to have a full working toolchain on exotic platforms.
clang/llvm can cross-compile to several different architectures with the same binary. With GCC, that would require building and installing a separate cross-compiler for each target.
Re: (Score:2)
A bad copy/paste happened here, sorry.
Re: (Score:3, Insightful)
I'd rather concentrate on getting server code running natively no matter the toolchain used.
"We have a microkernel that can compile with LLVM" is not as cool as "run your apache pg and php/java/whatever in a microkernel built with security and accountability in mind".
Re: (Score:3, Interesting)
Microkernels are long on the "security and accountability" hype and somewhat short on reality. Sure, the services provided by the microkernel are less likely to have bugs or holes than a monolithic kernel -- but that's because the microkernel doesn't provide most of the monolithic kernel's functionality. Once you roll in all the device drivers, network stack, and the rest, the microkernel-based system is generally at least as bloated and typically less performant.
Re: (Score:3, Interesting)
Come back when you get the point. Kernel space is shared memory; a kernel-mode component can crash the system and leave no trace of what did it. Like pre-OS X Mac OS, or DOS.
And never say or type 'performant' again. It makes you look like a douche. 'less performant' == 'slower'.
Everybody knows microkernels are slower. They are more stable. Misbehaving drivers are identified quickly. They usually have fewer issues, and the issues they have don't take the whole system down.
That said, count the context switc
Re:No plans for LLVM (Score:4, Interesting)
I would say that you're the one who needs to get the point. Major components that crash will still generally leave the system in a state that is difficult or impractical to diagnose or recover from. If your disk driver or filesystem daemon crashes, you don't have many ways to log in or start a replacement instance. If your network card driver or TCP/IP stack crashes, you still need a remote management console to fix that web server. In the meantime, people with modern kernels have figured out how to make those monolithic kernels still fairly usable in spite of panics or other corruption. The only reason that microkernels look better on the metrics you claim is that they support less hardware and use less of the hardware's complex (high-performance) features.
Re:No plans for LLVM (Score:4, Insightful)
... and Linux, NT, and the Mac OS X kernel (XNU).
NT and the Mac OS X kernels are interesting cases: they started as microkernels, but soon moved on to "hybrid" approaches that keep a lot of drivers inside kernel space.
That sounds great in theory, but if a disk or network driver crashes on a production server, how much do you care that the rest of the system is still working? These things must not crash, period -- if they do crash, the state of the rest of the system is usually irrelevant.
Re:No plans for LLVM (Score:4, Interesting)
if a disk or network driver crashes on a production server, how much do you care that the rest of the system is still working? These things must not crash, period -- if they do crash, the state of the rest of the system is usually irrelevant.
That's not really true. The storage driver can ask the disk driver which blocks (or whatever you call them) have been successfully written, and not retire them from the cache until they have been recorded. And hopefully one day we will get MRAM, and then we'll have recoverable ramdisks even better than the ones we had on the Amiga -- where they could persist through a warm boot, simply getting mapped again. So you could load your OS from floppy into RAM, but you'd only have to do it once per cold boot, which is nice because the Amiga would crash a lot because it had no memory protection...
This conversation is especially interesting because the Amiga was a microkernel-based system with user-mode drivers, which is a big part of how they solved hardware autoconfiguration: you could include a config ROM and the OS would load (in fact, run) your driver process from it. This was enough at least for booting, and then you could load any updated drivers, which could kick the old driver out of memory. And now we have reached the limits of what I know about it :)
If the network card driver crashes, the same thing is true. The network server knows which packets have been ACKed and which ones haven't, and it knows the sequence number of the last packet it received. The driver is restarted, some retransmits are requested, and everything proceeds as normal. The only case in which the user even has to notice is when the driver is crashing so fast that it can't do any useful work before it does so.
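The restart-and-retransmit story above boils down to a supervisor pattern: keep the durable protocol state outside the driver, and when the driver process dies, just start a fresh one. A minimal sketch of that pattern — ordinary Unix processes standing in for user-space drivers; none of this is a real microkernel API:

```python
import subprocess
import sys

def supervise(cmd, max_restarts=3):
    """Run a (hypothetical) user-space driver; restart it if it crashes.

    Because the durable state (ACKed packets, confirmed blocks) lives in
    the server that talks to the driver, a restart costs clients nothing
    beyond a few retransmits.
    """
    restarts = 0
    while True:
        proc = subprocess.Popen(cmd)
        proc.wait()
        if proc.returncode == 0:
            return True          # clean exit, nothing to do
        restarts += 1            # crashed: spin up a fresh instance
        if restarts > max_restarts:
            return False         # crash-looping; give up and alert someone

# A "driver" that exits cleanly is left alone:
print(supervise([sys.executable, "-c", "pass"]))  # → True
# One that always crashes exhausts its restart budget:
print(supervise([sys.executable, "-c", "raise SystemExit(1)"]))  # → False
```

The last branch is the "crashing so fast that it can't do any useful work" case the parent mentions — the one place where the user has to notice.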
Re: (Score:3)
It's undeniable that microkernels open very cool possibilities, like the ones you mentioned.
But my first point was that, every time someone makes a microkernel that has to compete with the kernels we have today, they end up doing all kinds of compromises ("hybrid" approaches) which end up with all sorts of drivers (network, disk, graphics) in kernel space. Anything else just slows things down too much, to the point where very few people would want to use those kernels.
And, to be honest, while the kinds of t
Re: (Score:2)
LLVM optimizes better than GCC in quite a few cases, in particular since 4.0
Re: (Score:1)
Also, I'm pretty sure GCC is the only compiler that requires the better part of an afternoon on modern hardware to build itself.
With parallel make, it only takes me about 20 minutes on a midrange 6-core system. Look at the -j or --jobs option. I usually use 1.5 times the number of cores for the number of jobs.
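For what it's worth, the 1.5x rule of thumb is easy to compute instead of hard-coding a `-j` value (a trivial sketch; whether 1.5x is actually optimal depends on the build and on how I/O-bound it is):

```python
import os

# ~1.5x the logical core count, as the parent suggests, floored at 1.
cores = os.cpu_count() or 1
jobs = max(1, cores * 3 // 2)
print(f"make -j{jobs}")
```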
Re: (Score:3, Insightful)
There are no mid-range 6-core systems.
Mid-range is dual core with hyperthreading.
Re:No plans for LLVM (Score:4, Informative)
That's just not correct.
A Phenom II X6, especially at the lower clock speeds, is certainly not high end any more.
A dual core system is now certainly low end, given even netbooks have dual core processors.
Plenty of ultrabooks come with quad core processors these days, and they are not especially high speed machines, trading speed for power consumption and size.
Re: (Score:1)
AMD does not make any good processors anymore.
Intel i3 is the low end, i5 the middle end, and i7 the high end.
i5 are usually dual core processors, while i7 are usually quad core.
Re: (Score:2)
For the record, I just built my home computer with 8 cores and 32GB of RAM for around $450-500. By buying AMD I also get AES acceleration, ECC support, turbo clocking, all of the virtualization features, and a number of other features that simply aren't available on Intel till you hit the i5/i7 level.
If you can show me how I could get 8 cores or the equivalent for heavily nested virtualization labs (ESXi / HyperV on top of Workstation) on the intel platform, I would be interested; however everything I saw
Re: (Score:2)
That's a nice amount of RAM. I am maybe $700 into my PC, but it's on its second processor (went from a Phenom II X3 to an X6) and it's got a HDD and a SSD and two optical drives, and I started it back when a Phenom II X3 720 was a pretty slick processor. And I have a whopping 8GB.
I went to an X6 because single-thread performance wasn't really my limiting factor. Maybe that's because I run Linux and I don't play the latest greatest masturbatest games, and I only have a 1680x1050 display. But really, I haven't noticed a decr
Re: (Score:2)
AMD does not make any good processors anymore.
Intel i3 is the low end, i5 the middle end, and i7 the high end.
My Phenom x2 server cluster >> your tautology.
Phenom x2 6-core is currently one maximum in the price/power/performance 3-space. All eight corners of the 'cube' have valid use cases.
Re: (Score:2)
Server cluster? I thought we were talking about average middle end desktop/workstation computers.
Re: (Score:2)
Server cluster? I thought we were talking about average middle end desktop/workstation computers.
Yeah, a decent node is under $500 by using "desktop" hardware. The beauty of a redundant architecture is that "server-quality" hardware isn't that important anymore. I know how to spend 10x that on a really fast server, but most workloads don't justify the added expense.
Re: (Score:2)
There's nothing wrong with artificial market segmentation. However, it IS the reason I went with AMD, since there's no reason to burn $300 for processor features that every AMD processor comes with.
Re: (Score:2)
Re: (Score:2)
given the post you made (the GP to the post I'm replying to). You just said there are low end 6-core systems.
So... are you saying 6 core is low end, but dual core + HT is mid range?
AMD reaches into the mid range, and they usually have a low to mid range CPU that is worth the money. Where AMD falls short is single-core performance, though they still usually scale better than Intel.
Re: (Score:2)
Where did I say that? Because I didn't.
Re: (Score:1)
Re: (Score:3)
Because GCC doesn't have a static analyzer (you do analyze your code, right?). LLVM's analyzer (Clang's scan-build) is very good. Visual C++'s analyzer was crap a few releases ago, but even it is getting better. I like GCC, but it has a lot of catching up to do in this regard. And no, "-Wall" isn't nearly the same.
Re: (Score:2)
...is a really good dynamic analyzer. Again, not nearly the same.
user space drivers (Score:3)
Re: (Score:2)
Linux lets you write drivers in userspace if you want to. A lot of scanner drivers are written in userspace. So if you're willing to take the performance hit, there is no reason not to do so, even in Linux.
Perhaps the difference here is that Linux lets you put them in userspace, but this system (like the GEC 4000 series from the '70s) has them all like that?
Why does putting a driver in user space require a performance hit?
Re: (Score:2)
Why does putting a driver in user space require a performance hit?
It has in every microkernel attempt so far, or do you have a way to do it that no one else has thought of?
Re: (Score:2)
Why does putting a driver in user space require a performance hit?
It has in every microkernel attempt so far, or do you have a way to do it that no one else has thought of?
I meant the question to be taken literally - that is, not as an assertion that it doesn't or shouldn't, but as a request for an explanation of why it does.
Re:user space drivers (Score:4, Interesting)
Re: (Score:2)
I believe it's because you need to verify a lot of things that come from user space into kernel space. This makes things like DMA and port communication somewhat more difficult.
Right, though to be fair implementing a microkernel on hardware that doesn't do anything to make microkernels efficient tends to be inefficient. Surprising, of course.
I wonder what people are doing with vPro and microkernels these days (they must be doing something, but I admit to having stopped paying attention to microkernels a decade ago).
Re: (Score:2)
Re: (Score:2)
Does it matter much if you put the slowdown in hardware or software? You're still going to have to deal with context switching.
Apparently so - I hear from Xen and VMware folks that vPro-enabled resource sharing is much faster than doing it in the hypervisor.
Re: (Score:2)
Re: (Score:2)
It's still reducing the time overhead (and probably heat overhead, since it's a less generic mechanism). It's still there as opposed to... not there. There's just less of it.
Re: (Score:2)
Wikipedia [wikipedia.org] has a pretty decent overview. It's actually kind of interesting and not too technical. Basically, it involves more system calls. Think of it as having more middle men involved in the process. Early microkernels implemented rather inefficient designs, leading people to believe that the concept itself was inefficient. Newer evidence reveals that it isn't quite that bad, and that it's possible to be very competitive with monolithic kernels.
My own understanding of the whole thing is rather shallo
Very Simple (Score:4, Informative)
All interrupts in processors are handled in a single context, the 'ring 0' or 'kernel state'. Device drivers (actual drivers, that is) handle interrupts; that's their PURPOSE. When the user types a keystroke, the keyboard controller raises a hardware interrupt which FORCES a CPU context switch to kernel state and the context established for handling interrupts (the exact details depend on the CPU and possibly other parts of the specific architecture; in some systems there is just a general interrupt-handling context and software does a bunch of the work, in others the hardware will set up the context and vector directly to the handler).
So, just HAVING an interrupt means you've had one context switch. In a monolithic kernel that could be the only one, the interrupt is handled and normal processing resumes with a switch back to the previous context or something similar. In a microkernel the initial dispatching mechanism has to determine what user space context will handle things and do ANOTHER context switch into that user state, doubling the number of switches required. Not only that but in many cases something like I/O will also require access to other services or drivers. For instance a USB bus will have a USB driver, but layered on top of that are HID drivers, disk drivers, etc, sometimes 2-3 levels deep (IE a USB storage subsystem will emulate SCSI, so there is an abstract SCSI driver on top of the USB driver and then logical disk storage subsystems on top of them). In a microkernel it is QUITE likely that as data and commands move up and down through these layers each one will force a context switch, and they may well also force some data to be moved from one address space to another, etc.
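The cost being argued about here can be felt from user space. A crude sketch — POSIX pipes rather than real microkernel IPC, so the numbers are only indicative: each one-byte ping-pong between two processes forces at least two context switches, roughly the toll a microkernel pays per extra hop through a layered driver stack.

```python
import os
import time

def roundtrip_cost(rounds=20000):
    """Time one-byte ping-pongs between two processes over a pair of pipes.

    Each round trip forces at least two context switches (plus four
    syscalls), so the per-round figure is a loose upper bound on the
    cost of one hop between isolated components.
    """
    p2c_read, p2c_write = os.pipe()   # parent -> child
    c2p_read, c2p_write = os.pipe()   # child -> parent
    pid = os.fork()
    if pid == 0:                      # child: echo every byte back
        os.close(p2c_write)
        os.close(c2p_read)
        for _ in range(rounds):
            os.read(p2c_read, 1)
            os.write(c2p_write, b"x")
        os._exit(0)
    os.close(p2c_read)
    os.close(c2p_write)
    start = time.perf_counter()
    for _ in range(rounds):
        os.write(p2c_write, b"x")
        os.read(c2p_read, 1)
    elapsed = time.perf_counter() - start
    os.waitpid(pid, 0)
    return elapsed / rounds           # seconds per round trip

print(f"{roundtrip_cost() * 1e6:.2f} us per round trip")
```

Multiply that per-hop figure by the 2-3 layers of a USB storage stack and the argument above becomes concrete.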
Microkernels will always be a tempting concept; they have a certain architectural level of elegance. OTOH in practical terms they're simply inefficient, and most of the benefits remain largely theoretical. While it is true that dependencies and couplings COULD be reduced and security and stability COULD improve, the added complexity generally results in less reliability and less provable security. Interactions between the various subsystems remain, they just become harder to trace. So far at least monolithic kernels have proven to be more practical in most applications. Some people of course maintain that the structure of OSes running on systems with large numbers of (homogeneous or heterogeneous) cores will more closely resemble microkernels than standard monolithic ones. Of course work on this sort of software is still in its infancy, so it is hard to say if this may turn out to be true or not.
Re:Very Simple (Score:4, Informative)
Most operating systems these days don't run device driver interrupt handling code directly in the interrupt handler --- it's considered bad practice, as not only do you not know what state the OS is in (because it's just been interrupted!), which means you have an incredibly limited set of functionality available to you, but also while the interrupt handler's running some, if not all, of your interrupts are disabled.
So instead what happens is that you get out of the interrupt handler as quickly as possible and delegate the actual work to a lightweight thread of some description. This will usually run in user mode, although it's part of the kernel and still not considered a user process. This thread is then allowed to do things like wait on mutexes, allocate memory, etc. The exact details all vary according to operating system, of course.
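That split — a minimal interrupt handler that just queues work for a lightweight thread — can be sketched in miniature (ordinary Python threads standing in for the kernel's machinery; this is the shape of the idea, not any real kernel's API):

```python
import queue
import threading

events = queue.Queue()
handled = []

def interrupt_handler(irq):
    # "Top half": do the bare minimum and get out -- never block here.
    events.put(irq)

def bottom_half():
    # "Bottom half": a lightweight thread that may sleep, take locks,
    # allocate memory, etc., draining queued events at its leisure.
    while True:
        irq = events.get()
        if irq is None:
            break                 # shutdown sentinel
        handled.append(irq)       # stand-in for the real driver work

worker = threading.Thread(target=bottom_half)
worker.start()
for n in (3, 7, 9):               # three "interrupts" arrive
    interrupt_handler(n)
events.put(None)
worker.join()
print(handled)  # → [3, 7, 9]
```

The point of the sketch: once the real work already runs in its own thread, moving that thread out of the kernel entirely (as a microkernel does) adds comparatively little.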
This means that you nearly always have an extra couple of context switches anyway. The extra overhead in a well designed microkernel is negligible. Note that most microkernels are not well designed.
L4 is well designed. It is frigging awesome. One of its key design goals was to reduce context switch time --- we're talking something like 1/30th of Linux's context-switch cost here. I've seen reports that Linux running on top of L4 is actually faster than Linux running on bare metal! L4 is a totally different beast from microkernels like Mach or Minix, and a lot of microkernel folklore simply doesn't apply to L4.
L4 is ubiquitous in the mobile phone world; most featurephones have it, and at least some smartphones have it (e.g. the radio processor on the G1 runs an L4-based operating system). But they're mostly using it because it's small (the kernel is ~32kB), and because it provides excellent task and memory management abstractions. A common setup for featurephones is to run the UI stack in one task and the real-time radio stack in another, with the UI stack's code dynamically paged from a cheap compressed NAND flash setup --- L4 can do this pretty much trivially.
This is particularly exciting because it looks like the first genuinely practical L4-based desktop operating system around. There have been research OSes using this kind of security architecture for decades, but this is the first one I've seen that actually looks useful. If you haven't watched the LiveCD demo video [youtube.com], do so --- and bear in mind that this is from a couple of years ago. It looks like they're approaching the holy grail of desktop operating systems, which is to be able to run any arbitrary untrusted machine code safely. (And bear in mind that Genode can be run on top of Linux as well as on bare metal. I don't know if you still get the security features without L4 in the background, though.)
This is, basically, the most interesting operating system development I have seen in years.
Re: (Score:2)
Crap, it may be a holy grail for x86 but only because x86 virtualization sucks so bad. Go run your stuff on a 360/Z/P series architecture and you've been able to do this stuff since the 1960s because you have 100% airtight virtualization.
Of course ANY such setup, regardless of hardware, is only as good as the hypervisor. It is still not really clear what is actually gained. Truthfully no degree of isolation is bullet proof because whatever encloses it can look at it and there will ALWAYS be some set of inpu
Re: (Score:2)
Why does putting a driver in user space require a performance hit?
A context switch between processes in the same privilege level happens relatively quickly, but a context switch across privilege levels (e.g. calling user code from the kernel or vice versa) is much slower due to the mechanism involved.
Re: (Score:2)
ALL context switches are expensive. The primary effect of a context switch is that each context has its own memory address layout. When you switch from one to another your TLB (translation lookaside buffer) is invalidated. This creates a LOT of extra work for the CPU as it cannot rely on cached data (addresses are different, the same data may not be in the same location in a new context) and consequent cache invalidation, etc. It really doesn't matter if it is 'user' or 'kernel' level context, the mechanics
Re: (Score:2)
Luckily, virtualization requirements have led to tagged TLBs becoming available on at least x86. I think the number of processes that can share the TLB currently is fairly limited, but it's a start.
Re: (Score:2)
Yeah, this is true. I think if you were to start at zero and design a CPU architecture with a microkernel specifically in mind some clever things would come out of that and help even the playing field. Of course the question is still whether it is worth it at all. Until microkernels show some sort of qualitative superiority there's just no real incentive.
Re: (Score:2)
The worst part is that, until the mid 90s, there were architectures that made things convenient for garbage collection, heavy multithreading, type checking, etc. And then the C machine took over and ... oops, now we need to speed up all of those things, but are stuck with architectures that make it difficult!
Re: (Score:2)
Well, I gotta say, there is less diversity out there. OTOH you really had to be doing some niche stuff even in the old days to be writing code for Novix chips, transputers, Swann systems, and such.
Re: (Score:2)
ALL context switches are expensive. The primary effect of a context switch is that each context has its own memory address layout.
No, that's not correct. Context switches between threads within the same process (or between one kernel thread and another), or context switches due to system calls, do not alter the page tables and do not flush the TLB. The vast majority of context switches are due to system calls, not scheduling. In a system call, the overhead is primarily due to switching in and out of super
Re: (Score:2)
Really depends on the CPU architecture. You can't generalize a lot about that kind of thing. TLB is invalidated in x86. I'm a little sketchy on the ARM situation, but 68k and PPC architectures have a rather different setup than x86.
Context switches between threads generally aren't as expensive, yes, because the whole point with threads is shared address space, which is primarily for this very reason. However, there are still issues with locality, instruction scheduling, etc. There ARE also often changes in
Re: (Score:2)
Does microkernel architecture necessarily require context switches? Write the userspace components in Java or other managed language and run them in kernel threads at Ring 0. You might get a small penalty in code execution time, but get rid of the context switches while still keeping the processes separate.
Re: (Score:1)
It's usually because actually talking to the hardware requires a context change from userspace to kernel space on x86-based systems (I suspect the other major archs have similar issues but don't know for certain). This is because userspace is normally protected from touching hardware so that it can't cause side effects to other processes without the kernel knowing about it. A good microkernel should be able to give that access directly to userspace but I don't believe most CPUs play nicely
nerd parlor game proposal (Score:1)
Every time the word "Genode" appears in their documentation, misread it as "Genocide".
Re: (Score:2)
Re: (Score:2)
Hurd device drivers aren't in user space? (Score:2)
I thought I read somewhere (and part of why I remember) that Hurd device drivers are also in user space.
Is that wrong?
Re: (Score:2)
Where did you see that was not true?
Re: (Score:2)
It was implied in the summary.
think the Hurd, but with even device drivers as userspace tasks
Re: (Score:2)
Re: (Score:3)
Uhhhhhhh, wait a minute. I was an avid Amiga programmer back in the day. AmigaOS wasn't in any particular sense a microkernel. Such distinctions would in fact be largely meaningless because AmigaOS was written to run on the MC68k processor, a chip which had no MMU nor any facilities for address translation at all (though in theory you could implement storage-backed virtual memory, it wasn't terribly practical). Every Amiga program was address independent, it could load and run at any address, and all softwar
Re: (Score:2)
Thinking about this: I so much wish that there was an effort to write a new sane and consistent OS based on modern C++ (seeing the error handling code in Linux makes me cry). But I know that in my lifetime we will not see such a thing going mainstream. :(
It seems that Linus has said:
- the whole C++ exception handling thing is fundamentally broken. It's _especially_ broken for kernels.
Care to elaborate on how you think C++ error handling would be superior for a modern kernel?
Re:Hurd device drivers aren't in user space? (Score:4, Interesting)
It depends. Hurd itself is an implementation of the unix api as servers running on top of a microkernel. Drivers are not its concern.
The way drivers are handled on a Hurd system depends on the choice of microkernel. Mach includes drivers, so they run in kernel space. L4 doesn't have drivers, so they will have to be written separately and run in user space.
Hurd, in one line (try this at home kids!) (Score:2)
20+ years in development, still no sound support.
Re: (Score:2)
Re: (Score:2)
Re: (Score:3)
The system goes on-line August 4th, 1997. Human decisions are removed from strategic defense. Skynet begins to learn at a geometric rate. It becomes self-aware at 2:14 a.m. Eastern time, August 29th. In a panic, they try to pull the plug.
Skynet responds by posting millions of cat pictures to Facebook. Six billion Internet users collectively go "awww!" and hit Share. First Facebook, then Twitter, then the entire wireless broadband infrastructure collapses under the strain. Without access to GPS, dazed urbanites are unable to find their way to espresso sources and enter simultaneous caffeine and microblogging withdrawal. Riots begin in urban metropolitan areas within the hour. Thirty-six hours later, all major metropolitan areas are a smoking
multi-server? (Score:1)
Why does this article use the term "multi-server microkernel OS"? I don't see anything in the article or anything else about Genode referring to multiple servers. Sounds like they're just trying to redefine the term "microkernel".
osFree (Score:2)
TU Dresden and their Informatics faculty (Score:1)