
Code Quality In Open and Closed Source Kernels

kdawson posted more than 5 years ago | from the tale-of-four-kernels dept.

Programming 252

Diomidis Spinellis writes "Earlier today I presented at the 30th International Conference on Software Engineering a research paper comparing the code quality of Linux, Windows (its research kernel distribution), OpenSolaris, and FreeBSD. For the comparison I parsed multiple configurations of these systems (more than ten million lines) and stored the results in four databases, where I could run SQL queries on them. This amounted to 8GB of data, 160 million records. (I've made the databases and the SQL queries available online.) The areas I examined were file organization, code structure, code style, preprocessing, and data organization. To my surprise there was no clear winner or loser, but there were interesting differences in specific areas. As the summary concludes: '..the structure and internal quality attributes of a working, non-trivial software artifact will represent first and foremost the engineering requirements of its construction, with the influence of process being marginal, if any.'"

252 comments

Is it just me? (4, Interesting)

Abreu (173023) | more than 5 years ago | (#23434796)

Or is the summary completely incomprehensible?

Of course, I could try to RTFA, but hey, this is Slashdot, after all...

Re:Is it just me? (5, Insightful)

Anonymous Coward | more than 5 years ago | (#23434836)

That is if you can figure out which of the 12 links are the actual FA and which are supporting material.

Re:Is it just me? (0)

Anonymous Coward | more than 5 years ago | (#23434968)

I think it is the part: '..the structure and internal quality attributes of a working, non-trivial software artifact will represent first and foremost the engineering requirements of its construction, with the influence of process being marginal, if any.'"

Makes no sense anyway. (0, Troll)

twitter (104583) | more than 5 years ago | (#23435086)

There's obviously a problem with a study that takes 8GB of data and concludes that there's no difference in quality between kernels with legendary uptimes and those that can't manage memory well enough to stay up more than a few weeks. This kind of study sounds interesting but it's not practical. Practical results come from real operations.

Re:Makes no sense anyway. (-1, Offtopic)

Anonymous Coward | more than 5 years ago | (#23436426)

Two people just spent two mod points to bury a post that started out at -1 to begin with. That should give you an idea of how popular you and your sockpuppets have become around here.

Re:Is it just me? (3, Interesting)

Bazman (4849) | more than 5 years ago | (#23435076)

Well, it's not just you, but probably millions like you. Plenty of the summary is comprehensible, but I get the fear that it's really just a slashvertisement for his book (third link in summary).

Re:Is it just me? (2, Informative)

stavrosg (893274) | more than 5 years ago | (#23435360)

TFA is the second link, but yes, the summary does not do much to help you figure it out.

Re:Is it just me? (4, Insightful)

raddan (519638) | more than 5 years ago | (#23435184)

It's not a very good summary, but the paper is well-written, which is interesting considering that the author is the one who submitted the summary to Slashdot. I suspect that he assumes we have more familiarity with the subject than we actually do.

Re:Is it just me? (5, Interesting)

Diomidis Spinellis (661697) | more than 5 years ago | (#23436068)

It's not a very good summary, but the paper is well-written, which is interesting considering that the author is the one who submitted the summary to Slashdot. I suspect that he assumes we have more familiarity with the subject than we actually do.
In my submission I did not include the last sentence with the "summary", which, I agree, is completely incomprehensible in the form it appears.

Re:Is it just me? (3, Interesting)

tcopeland (32225) | more than 5 years ago | (#23436072)

> the paper is well-written

Yup, and the author of the paper is Diomidis Spinellis, who wrote the excellent book Code Reading [spinellis.gr]. This is a great study of code analysis and familiarization techniques. He also wrote a fine article on C preprocessors... in Dr. Dobb's Journal, I think.

Re:Is it just me? (1)

DustCollector (903185) | more than 5 years ago | (#23435660)

No, it's not just you. Also,

>> To my surprise there was no clear winner or loser, ...

Not sure why it is a *surprise* there was no clear winner or loser. That's what I would expect.

Re:Is it just me? (4, Insightful)

Diomidis Spinellis (661697) | more than 5 years ago | (#23435794)

I didn't write the last part when I submitted the story, and, yes, the summary given here is incomprehensible, because it appears out of context. What the sentence '..the structure and internal quality attributes of a working, non-trivial software artifact will represent first and foremost the engineering requirements of its construction, with the influence of process being marginal, if any.' means is that when you build something complex and demanding, say a dam or an operating system kernel, the end result will have a specific level of quality, no matter how you build it. For this reason, the differences between the software built with a tightly controlled proprietary software process and that built using an open-source process are not that big.

Re:Is it just me? (5, Insightful)

legutierr (1199887) | more than 5 years ago | (#23436570)

How useful is it to write something about computers that needs to be translated for the slashdot audience? Jargon is a great way to provide specialized information to insiders quickly and efficiently, but this is slashdot. If slashdot readers need you to restate your description of a problem or observation related to the Linux kernel (even if that description is taken out of context), could the paper have been written in a more open manner? The quote you provided from your paper seems to speak to a narrow audience; how narrow must your audience be, however, if it excludes a good portion of slashdot's readers?

If I seem overly critical, I do not mean to be; it is only that I hate to see good, useful research made less accessible to non-academics by the use of academic language.

Re:Is it just me? (0)

Anonymous Coward | more than 5 years ago | (#23436018)

You're just used to reading media reports of research instead of actual research. This is why researchers "post" their stuff at conferences, so it can be digested by people who can understand it and actually care.

In other words, "go play with the other kids, adults are talking."

That said, I have no idea why people post their research papers here, it's just silly.

Re:Is it just me? (0)

Anonymous Coward | more than 5 years ago | (#23436088)

No the problem is it's accurate and doesn't say that windoze sux0r5! Hence the confusion about why it's on slashdot.

(.)(.) warning! (-1, Offtopic)

Anonymous Coward | more than 5 years ago | (#23434852)

I just shit my pants.

windows is teh suck blah blah blah (0, Insightful)

Anonymous Coward | more than 5 years ago | (#23434874)

I can't wait to see what kind of ignorant anti-Windows screeds the 13-year-olds who post on Slashdot during recess will come up with in response to this article! I'm betting five minutes, tops, until someone posts a comment that spells "Microsoft" with a "$". Don't let me down, Slashbots!

Re:windows is teh suck blah blah blah (0, Flamebait)

LBArrettAnderson (655246) | more than 5 years ago | (#23435012)

The fact that he said that he couldn't draw any conclusions probably means that the windows code had by far the best quality out of any of them, and that shouldn't surprise anyone.

Re:windows is teh suck blah blah blah (1)

Uncle Focker (1277658) | more than 5 years ago | (#23435284)

The fact that he said that he couldn't draw any conclusions probably means that the windows code had by far the best quality out of any of them, and that shouldn't surprise anyone.
Not quite.

The two systems with a commercial pedigree (Solaris and WRK) have slightly more positive than negative marks. However, WRK also has the largest number of negative marks, while Solaris has the second lowest number of positive marks.

Win2k Leak (0)

Anonymous Coward | more than 5 years ago | (#23435474)

Have you ever seen the code that was leaked from Windows 2000? It was exactly the same quality as the code in Microsoft's SDKs (Software Development Kits) for Visual C++, and some of their source code examples for MSDN subscriptions are pretty much a mirror image of that quality. Absolutely fucking horrific. No wonder they take forever (if they can even bother to try) to fix bugs and security holes. It's just a god damn nightmare from hell.

Re:windows is teh suck blah blah blah (0)

Anonymous Coward | more than 5 years ago | (#23435054)

Shh.

Slashbots are teh 13 year olds blah blah blah (5, Funny)

spun (1352) | more than 5 years ago | (#23435142)

Well, you lose your bet; it's been over five minutes and no anti-Microsoft screeds, let alone spelling it with a $.

Just so everyone understands, the tactic used here is known as "Poisoning the well." [wikipedia.org] The idea is to discredit an argument's source before the argument is presented. Here, our AC friend is trying to ward off criticism of Microsoft by insinuating that anyone who does so is a 13 year old "Slashbot."

The fallacy is in the fact that even someone who is 13 and often goes along with the Slashdot zeitgeist may still have legitimate criticisms of Microsoft, such as the fact that Microsoft sucks giant hairy donkey balls.

Re:Slashbots are teh 13 year olds blah blah blah (0)

Anonymous Coward | more than 5 years ago | (#23435258)

...such as the fact that Microsoft sucks giant hairy donkey balls.

Actually, donkeys' balls aren't hairy, though the scrotum is.

Re:windows is teh suck blah blah blah (-1, Troll)

Anonymous Coward | more than 5 years ago | (#23435204)

Linuck$ $uck$

It's obvious (5, Funny)

Number6.2 (71553) | more than 5 years ago | (#23434922)

That you have neither capitalized on your shared synergies, nor have you recovered your cherished paradigms.

Oh. Wait. This is about propeller-head stuff rather than management stuff. Lemme get my "Handbook of postmodern buzz words"...

So.... (0)

Anonymous Coward | more than 5 years ago | (#23434932)

Results inconclusive. How does he define what is trivial and what is not? Also, this is all based on x86... he should compare the code bases of stuff that runs on Alpha and PowerPC as well.

Re:So.... (1)

ac666 (535743) | more than 5 years ago | (#23435050)

Um, he already did plenty. I'm sure there's room for a follow-up paper along those lines. Quite possibly by someone else - I'd guess you're up to the task, considering how dismissive you were of his efforts?

Re:So.... (0)

Anonymous Coward | more than 5 years ago | (#23435970)

What I took away:

Huge SQL database: non-trivial
His use of said database: trivial

Closed Source? (1, Informative)

Anonymous Coward | more than 5 years ago | (#23434948)

Which of the samples were closed source? And........ how did he get a hold of it/them? Open source software is still open even if it came from microsoft.

Re:Closed Source? (4, Interesting)

zeromorph (1009305) | more than 5 years ago | (#23435470)

The WRK is under the Microsoft Windows Research Kernel Source Code License [microsoft.com]. I'm not sure that this license conforms with anyone's definition of open source, but it's reasonably free for research.

But the parent post addresses a crucial point: if something really is closed source, there is no reviewable way to compare and present this code. So if the WRK were total crap, they could always say: yes, that's only the WRK, not the real kernel.

Only statements about open source code are directly verifiable/falsifiable. That is one of the reasons why the FOSS approach is superior from a scientific as well as a technical point of view.

Re:Closed Source? (1)

Em Adespoton (792954) | more than 5 years ago | (#23435554)

There are a number of companies that provide the source for you to READ, but you have no permission to MODIFY or DISTRIBUTE. I think by "open" he means "free" as in speech.

However, I was wondering how legal his database is; it might skate awfully close to the edge of the licenses he had to sign to get access to some of that code.

Re:Closed Source? (1)

twizmer (1206952) | more than 5 years ago | (#23436434)

Glancing at the schema, it's clear that the database, while highly interesting from a structural point of view, doesn't actually contain any of the code or anything like that. I'll bet he was smart enough to get it OK'd, and I'll bet Microsoft was fine with it (after all, the paper does conclude that Windows is comparable to the open source kernels).

Not that surprising (5, Interesting)

abigor (540274) | more than 5 years ago | (#23434958)

Final line in the paper: "Therefore, the most we can read from the overall balance of marks is that open source development approaches do not produce software of markedly higher quality than proprietary software development."

Interesting, but not shocking for those who have worked with disciplined commercial teams. I wonder what the results would be in less critical areas than the kernel, say certain types of applications.

Re:Not that surprising (3, Insightful)

ivan256 (17499) | more than 5 years ago | (#23435320)

It's obvious what the results would be.

Half completed, unpolished commercial software usually stays unreleased and safe from this sort of scrutiny. However many of the same types of projects get left out in the open and easily visible to everybody when developed as open source. The low code quality of these projects would drag down the average for open source projects as a whole.

On the lighter side, you could say that you'd only consider software that was "out of beta" or version 1.0 or greater, but that would leave out most open source projects and commercial "Web 2.0" products....

Re:Not that surprising (2, Insightful)

indifferent children (842621) | more than 5 years ago | (#23435516)

So open source software is not of 'markedly' higher quality. If it is of even 'slightly' higher quality, or 'exactly the same quality' as closed source software, then the fact that it costs less, and gives users freedoms that they don't have with closed source software, means that closed source is doomed.

Re:Not that surprising (2, Interesting)

abigor (540274) | more than 5 years ago | (#23435752)

Well, not necessarily. Perhaps for certain types of commodity applications, like office suites, but even then, it's tough to say. That's why I was interested in the comparison. Your assertion is certainly not true for games, for example.

Generally speaking, commercial desktop apps are still way ahead of their open counterparts, with the exception of code development tools and anything that directly implements a standard (browsers, mail clients, etc.)

One reason for this is that code quality as measured in this study may not directly relate to application quality as measured by the typical user. Photoshop is "good" not least because of its well-understood interface and the fact that everyone uses it, regardless of how admirable the code is.

Re:Not that surprising (3, Insightful)

FishWithAHammer (957772) | more than 5 years ago | (#23436528)

Generally speaking, commercial desktop apps are still way ahead of their open counterparts, with the exception of code development tools and anything that directly implements a standard (browsers, mail clients, etc.)
Code development tools? VS says hi. (And somebody is now going to leap in and say that that monstrosity Eclipse is somehow "better" than VS...this will be amusing.)

Re:Not that surprising (1)

jgarra23 (1109651) | more than 5 years ago | (#23435816)


So open source software is not of 'markedly' higher quality. If it is of even 'slightly' higher quality, or 'exactly the same quality' as closed source software, then the fact that it costs less, and gives users freedoms that they don't have with closed source software, means that closed source is doomed.


Please explain how you come to this conclusion. I may not be understanding your argument. I would like to know how factors such as marketing, market penetration, user acceptance, interface quality, support quality (in terms of user-friendliness and customer service), methods of distribution and target audiences apply to both argument and conclusion. I could be wrong, but I'm pretty sure those are all of equal weight when determining the overall success or failure of an open or closed source model of software development.

Re:Not that surprising (1)

indifferent children (842621) | more than 5 years ago | (#23436298)

To butcher a phrase from MLK: The arc of history is long, but it bends toward quality.

Putting a fresh coat of paint on a pig will only work for so long (and producing a fatter, less attractive, less useful, more expensive pig after 5 years of effort to produce something other than a pig, is not a win). Marketing and support can only compensate for high cost of low quality for so long; every day, more people realize that software that doesn't crash is better than software that has a 1-800 number that you can call when it crashes.

Re:Not that surprising (0)

Anonymous Coward | more than 5 years ago | (#23435788)

Final line in the paper: "Therefore, the most we can read from the overall balance of marks is that open source development approaches do not produce software of markedly higher quality than proprietary software development."

Interesting, but not shocking for those who have worked with disciplined commercial teams. I wonder what the results would be in less critical areas than the kernel, say certain types of applications.
I wonder if the commercial software made available for scrutiny was cleaned up before it was released.

Re:Not that surprising (1)

fitten (521191) | more than 5 years ago | (#23435930)

Very hard to say. If it were a specific project (the release of the code) for a specific goal, then possibly. However, once a project is done, rarely will you find the money/time to revisit past code just to clean it up or merge source trees or whatever.

For example, if there was a business decision to release the code to the public but it had to look beautiful, then it might warrant being a 'real' project and be funded... but it would most likely have to have some gain associated with it (advertising, good-will, whatever) for the company... basically, some way for the company to profit from it.

If you're just wanting to do a general cleanup, you probably won't have time/money allocated for that... particularly when you may impact the stability of the codebase and/or regress or introduce new bugs (which will cost more money to fix).

So yeah, most of the time you have to write it, and you can change it when you have to fix a bug or add new features or something; otherwise, what's in the source doesn't change. However, where I work now, we are encouraged to refactor and such when we see a good opportunity to do so. We also have some ATPs that we run to detect regression and bug introduction, to keep those issues at a minimum when possible.

Why would that be a surprise? (1)

HangingChad (677530) | more than 5 years ago | (#23435820)

Interesting, but not shocking

Considering that few open source developers are strictly open source, that's hardly a surprise. I'd be willing to bet many open source developers are also part of disciplined commercial teams.

The flip side of that coin is just as intriguing. Open source development models don't produce software of notably inferior quality either. That should send a shivey through Castle Redmondore.

Re:Not that surprising (1)

hey! (33014) | more than 5 years ago | (#23436138)

Well, for a certain definition of "quality".

Maybe a better way of saying this is that open source programmers aren't better programmers than closed source ones.

But nobody ever said open source programmers are better. The argument is that open source software gets continually better from a user's perspective. If it doesn't for enough users, somebody else gives them what they want. If you aren't happy with SUSE's direction, you can go to RHEL and vice versa without creating a lot of fuss. Chances are somebody is taking the same basic building blocks and putting them together more like you want them.

In any case, I'm not impressed by comparing the Windows kernel to Unix kernels. It's not very useful to compare a microkernel to a monolithic kernel without including enough of the modules that go around it to implement the same functionality. They should compare it to Mach instead.

Re:Not that surprising (1)

quangdog (1002624) | more than 5 years ago | (#23436236)

By the same token, based on these results could the last sentence read: "Therefore, the most we can read from the overall balance of marks is that closed source development approaches do not produce software of markedly higher quality than open source software development."

In other words, this study reveals nothing about the relative level of quality of either approach.

Flashbacks to my undergrad work when I had to read Zen...

-- Kimball
www.kimballlarsen.com

No-one has ever claimed (4, Insightful)

wellingtonsteve (892855) | more than 5 years ago | (#23434962)

..that Open Source code is of higher quality, but at least the point of things like the GPL is that you have the power to change that, and improve that code..

Re:No-one has ever claimed (0, Flamebait)

Angostura (703910) | more than 5 years ago | (#23435124)

...and the findings suggest that, actually - no-one does.

Re:No-one has ever claimed (1)

jedidiah (1196) | more than 5 years ago | (#23435368)

I have personally done it myself when the need was present.

With a large userbase, even a large kernel development team is going to represent "no one". This is just the nature of the numbers.

The fact that anyone can be "no one" and that these "no ones" can then benefit everyone one else, is the whole point of Free Software.

The libre kernels undoubtedly have much more diverse and spread out development teams.

They represent more than just one corporate culture or more than just one approach to software.

Re:No-one has ever claimed (0)

Anonymous Coward | more than 5 years ago | (#23435148)

i don't need the gpl. i just publish. i'm no lemming to a license.

keep on slaving for the man, boys.

this entire article is a troll (-1)

Anonymous Coward | more than 5 years ago | (#23435026)

The tool who submitted this article is just trying to get clickbacks to his homepage so that all you zombies will buy his shitty book. Attention Slashdot: YHBT. YHL. HAND.

Oh, and I'm reporting him to Homeland Security.

CScout Compilation (5, Insightful)

allenw (33234) | more than 5 years ago | (#23435028)

"The OpenSolaris kernel was a welcomed surprise: it was the only body of source code that did not require any extensions to CScout in order to compile."

Given that the Solaris kernel has been compiled by two very different compilers (Sun Studio, of course, and gcc), it isn't that surprising. Because of the compiler issues, it is likely the most ANSI compliant of the bunch.
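For anyone wondering what sort of construct forces parser extensions, here is a minimal, hypothetical example of a GCC statement expression (of the kind the Linux kernel uses in its min/max macros), which a strictly ANSI C parser would reject:

/* max_of.c -- hypothetical example; the ({ ... }) statement expression is a
 * GCC extension, not ANSI C, so a strictly standard parser will not accept it. */
#include <stdio.h>

#define max_of(a, b) ({ int _a = (a); int _b = (b); _a > _b ? _a : _b; })

int main(void)
{
    printf("max_of(3, 7) = %d\n", max_of(3, 7));
    return 0;
}

Code kept lint-clean for multiple compilers tends to avoid such extensions, which is consistent with the parent's point about OpenSolaris.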

Re:CScout Compilation (0)

Anonymous Coward | more than 5 years ago | (#23435268)

No, it was the insistence that the code be lint clean. I never compiled Solaris with gcc -- I even used a specific version of the compiler. But the code had to be clean to be accepted.

statistical wash-out? (4, Insightful)

davejenkins (99111) | more than 5 years ago | (#23435034)

If I am understanding correctly, you were looking for 'winners' and 'losers' (weasel words in and of themselves, but anyway...) in terms of 'quality' (another semi-subjective term that could make someone go crazy and drive motorcycles across the country for the rest of their lives).

You found that '..the structure and internal quality attributes of a working, non-trivial software artifact will represent first and foremost the engineering requirements of its construction, with the influence of process being marginal, if any.' -- or in plain English: "the app specs had a much bigger influence when compared to internal efficiencies".

I would wonder if you're just seeing a statistical wash-out. Are you dealing with data sets (tens of millions of lines and thousands of functions) that are so large, that patterns simply get washed out in the analysis?

Oh dear, my post is no more clear than the summary...

Re:statistical wash-out? (2, Interesting)

Junior J. Junior III (192702) | more than 5 years ago | (#23435250)

in plain English: "the app specs had a much bigger influence when compared to internal efficiencies".

It sounds more like they're saying "If someone built it, and someone else is using it, and it's important, then the code quality is going to be pretty good. If it matters, it's going to get attention and be improved."

Of course, I can think of a bunch of counter-examples in Windows where something was important *to me* and mattered *to me* and no one at Microsoft saw fit to do anything about it for decades.

Re:statistical wash-out? (1)

geonik (1003109) | more than 5 years ago | (#23435484)

Oh dear, my post is no more clear than the summary...
Have you ever thought of applying to become a slashdot editor?

Re:statistical wash-out? (0)

Anonymous Coward | more than 5 years ago | (#23435726)

Are you dealing with data sets (tens of millions of lines and thousands of functions) that are so large, that patterns simply get washed out in the analysis?
Er, what? Is it just me or does this concept seem to strike against basic statistics?

Re:statistical wash-out? (1)

Diomidis Spinellis (661697) | more than 5 years ago | (#23435898)

I would wonder if you're just seeing a statistical wash-out. Are you dealing with data sets (tens of millions of lines and thousands of functions) that are so large, that patterns simply get washed out in the analysis?
I don't think so. I can't do an analysis for a counterexample right now, but I am pretty sure that if I ran the same metrics on, say, the bottom 20% of SourceForge projects in terms of downloads, I would get very different results.

Re:statistical wash-out? (3, Insightful)

raddan (519638) | more than 5 years ago | (#23436568)

With regard to the guy who went crazy and drove his motorcycle across the country-- I think the point of the book was to demonstrate that "subjective" and "objective" are specious terms. Science gets all hot and bothered when words like "good" and "bad" are used, but not when words like "point" are used. So if we can make allowances for axiomatic terms, why not so-called "qualitative" terms? After all, the word "axiom" means, according to Wikipedia:

The word "axiom" comes from the Greek word axioma a verbal noun from the verb axioein, meaning "to deem worthy", but also "to require", which in turn comes from axios, meaning "being in balance", and hence "having (the same) value (as)", "worthy", "proper". Among the ancient Greek philosophers an axiom was a claim which could be seen to be true without any need for proof.
Indeed, if you look at many of our "quantitative" measures, they are, at their heart, a formalization of "goodness" and "badness". If you're a mathematician, you might argue that this is not true (since there are loads of mathematical constructs whose only requirement is simply self-consistency and not some conformance to any external phenomenon), but if you're an engineer, your whole career balances on the fine points of "goodness" and "badness". It is an essential concept!

My personal opinion is that if statistics are a wash-out in general, then the researcher is asking the wrong questions. I know that the author pre-defined his metrics in order to avoid bias, but that's not necessarily good science. Scientific questions should be directed toward answering specific questions, and the investigatory process must allow the scientist to ask new questions based on new data.

There is clear non-anecdotal evidence that these operating systems behave differently (and, additionally, we assign a qualitative meaning to this behavior), so the question as I understand it is: is this a result of the development style of the OS programmers? The author should seek to answer that question as unambiguously as possible. If the answer to that question is "it is unclear", then the author should have gone back and asked more questions before he published his paper, because all he has shown is that the investigatory techniques he used are ill-suited to answering the question he posed.

Really? (3, Insightful)

jastus (996055) | more than 5 years ago | (#23435200)

I'm sorry, but if this is what passes for serious academic computer-science work, close the schools. This all appears to boil down to: quality code (definition left to the reader) is produced by good programmers (can't define them, but I know one when I see his/her code) who are given the time to produce quality code. Rushed projects by teams of average-to-crappy programmers result in low-quality code. All the tools and management theories in the world have little impact on this basic fact of life. My PhD, please?

Re:Really? (2, Insightful)

jjohnson (62583) | more than 5 years ago | (#23436162)

If you'd RTFA, you'd know that there's a lot more to what the author said than that. He says nothing about a relationship between the quality of programmers and the quality of code; he says nothing about the time taken to develop code, and makes no conclusions about its effect on code quality.

What he says is that a cluster of metrics that collectively say something general about code quality (e.g., better code tends to have smaller files with fewer LOC; worse code has more global functions and namespace pollution) show little difference between four kernels with diverse parentage.

He speculates (and says he is speculating) that obvious differences in process might account for small variances in where each kernel scores well or badly.

The 99% Solution (4, Interesting)

SuperKendall (25149) | more than 5 years ago | (#23435232)

So while looking at the data collected, I had to wonder if some of the conclusions reached were not something of a matter of weighting - I saw some things pretty troubling about the WRK. Among the top of my list was a 99.8% global function count!!!

This would explain some things like lower LOC count - after all, if you just have a bunch of global functions there's no need for a lot of API wrapping, you just call away.
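For readers who don't live in C linkage rules every day, here's a small hypothetical sketch of what "global function" means here: a function defined without static is visible to every other file in the build, while static confines it to its own file.

/* util.c -- hypothetical example, not taken from any of the studied kernels */

/* Internal linkage: only code in this file can call it, so it does not
 * add to the kernel-wide namespace. */
static int clamp_to_byte(int value)
{
    return value > 255 ? 255 : value;
}

/* External linkage ("global"): any other .c file in the build can call it,
 * and it is what a figure like the 99.8% above counts. */
int checksum_buffer(const unsigned char *buf, int len)
{
    int sum = 0;
    for (int i = 0; i < len; i++)
        sum = clamp_to_byte(sum + buf[i]);
    return sum;
}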

I do hate to lean on LOC as any kind of metric, but even besides that, the far lower count of Windows made me wonder how much there is there. Is the Windows kernel so much tighter, or is it just doing less? That one metric would seem to make further conclusions hard to reach, since it's such a different style.

Also, on a side note I would say another conclusion you could reach is that open source would tend to be more readable, with the WRK having a 33.30% adherence to code style and the others being 77-83%. That meshes with my experience working on corporate code, where over time coding styles change on more of a whim whereas in an open source project, it's more important to keep a common look to the code for maintainability. (That's important for corporate code too - it's just that there's usually no-one assigned to care about that).

Re:The 99% Solution (2, Insightful)

Yosi (139306) | more than 5 years ago | (#23435754)

The piece of Windows they had did not include drivers. It says:

Excluded from the kernel code are the device drivers, and the plug-and-play, power management, and virtual DOS subsystems. The missing parts explain the large size difference between the WRK and the other three kernels.


Much of the code in Linux, for instance, is drivers.

Re:The 99% Solution (0)

Anonymous Coward | more than 5 years ago | (#23435774)

The WRK doesn't include the enormous number of in-tree drivers of linux or even the moderately large number of in-tree drivers of solaris.

Re:The 99% Solution (3, Informative)

Diomidis Spinellis (661697) | more than 5 years ago | (#23436024)

So while looking at the data collected, I had to wonder if some of the conclusions reached were not something of a matter of weighting - I saw some things pretty troubling about the WRK. Among the top of my list was a 99.8% global function count!!!
I guess Microsoft uses a non-C linker-specific mechanism to isolate their functions, for instance by linking their code into modules. But yes, this is a troubling number.

This would explain some things like lower LOC count - after all, if you just have a bunch of global functions there's no need for a lot of API wrapping, you just call away.
The lower LOC comes from the fact that the WRK is a subset of Windows. It does not include the device drivers, or the plug-and-play, power management, and virtual DOS subsystems.

Also, on a side note I would say another conclusion you could reach is that open source would tend to be more readable, with the WRK having a 33.30% adherence to code style and the others being 77-83%. That meshes with my experience working on corporate code, where over time coding styles change on more of a whim whereas in an open source project, it's more important to keep a common look to the code for maintainability. (That's important for corporate code too - it's just that there's usually no-one assigned to care about that).
About 15 years ago I chanced upon code in a device driver that Microsoft distributed with something like a DDK that had comments written in Spanish. The situation in WRK is markedly better, but keep in mind that Microsoft distributes WRK for research and teaching.

Crap metrics != quality measure (0)

Anonymous Coward | more than 5 years ago | (#23435254)

It is easy to explain why the results are not conclusive: those metrics do not measure the actual quality of the code.

KLOCs? (4, Insightful)

Baavgai (598847) | more than 5 years ago | (#23435380)

If good code and bad code were a simple automated analysis away, don't you think everyone would be doing it? What methodology could possibly give a quantitative weighting for "quality"?

"To my surprise there was no clear winner or loser..." Not really a surprise at all, actually.

Re:KLOCs? (2, Funny)

getto man d (619850) | more than 5 years ago | (#23435646)

Exactly. Automation excludes creativity. If it were that easy I could just do the following:

writeGoodCode(int numberOfLines, float ouncesOfCoffeeConsumed)

Re:KLOCs? (0)

Anonymous Coward | more than 5 years ago | (#23436368)

If good code and bad code were a simple automated analysis away, don't you think everyone would be doing it? What methodolgy could possibly give a quantitative weighting for "quality"?
There are massive differences in style and quality that are obvious even at first glance.

FreeBSD: clever and correct, but ugly and complex.
Linux: tidy and refined, but not very intellectual/theoretical
Solaris: brilliant, but chaotic and a huge mess
Windows: correct and industrial, what can one say but 'mainframe'.
OS X: nice drivers and straightforward, but made from frankenstein parts

Seriously the study is nothing to do with quality at all. Take one example, number of files in a folder where they say 'lower is better'. Bullshit, it all depends on what files are in the folders and how the code is distributed among the files (and these are subjective measures). For instance, FreeBSD has a file 'net/zlib.c' implementing the compress algorithm whereas linux has this under 'lib' -- linux wins for organizing that file, but this study fails it for 'too many files in a folder'. I could go on... and many of their metrics are biased toward a phd's preconceived assumptions on what makes quality code.

In terms of Sum('WTF?!' + '/sigh') when reading the code, I would rank the kernels as:

Linux > Solaris > OS X > FreeBSD >> Windows (>> Subversion) ... but that's just my subjective take on it. From somebody who has written production code for linux, os x, and freebsd kernels and has reviewed windows and solaris in depth.

Re:KLOCs? (0)

Anonymous Coward | more than 5 years ago | (#23436516)

Debian OpenSSL maintainers seem to think bug-free code is just a matter of making Valgrind not complain. ;)

I ran scripts on 8GB of data (0, Troll)

ameboy (1211832) | more than 5 years ago | (#23435396)

Then I realized it is rather pointless. That did not prevent me from presenting it anyway, after all most of the academia works this way. When I get out of the university, I will read and understand the code, maybe maintain it for a year, before judging its quality.

Puppet (0)

Anonymous Coward | more than 5 years ago | (#23435464)

"Therefore, the most we can read from the overall balance of marks is that open source development approaches do not produce software of markedly higher quality than proprietary software development."

That phrase should be reversed.

The winner is still open source (3, Insightful)

abolitiontheory (1138999) | more than 5 years ago | (#23435518)

Does anybody see that these results are still in favor of open source? The fact is, it's actually a beautiful thing that the difference in quality is marginal. This equality then becomes the rubric by which to judge other elements of the design process, and choices about whether to develop and deploy programs as open source or closed source.

People make claims about the need for closed source all the time, usually revolving around the need for a predictable level of quality, or some other factor. The fact is, this result proves that it's a wash whether you choose open or closed--so why not choose open?

There's a deep significance here I'm failing to capture completely. Someone else word it better if they can. But there didn't need to be some blow-out victory of open source over closed source for this to be a victory. All open source needed to do was compare--which it did, clearly--with closed source, in terms of value, to secure its worth.

Re:The winner is still open source (1)

bugs2squash (1132591) | more than 5 years ago | (#23435780)

I read the article, I understood most of it (I think). I agree, as I read it I was constantly thinking...

1) Closed source does not seem to justify it's price tag based on better quality

2) Windows seemed to be damned by faint praise from time to time (the comments were great -- they were spell checked...)

How can I rid myself of the gut feeling that the author had an agenda?

For all I know this is a scholarly and unbiased report, but it's an article of faith rather than any conviction I got from reading TFA.

What I most liked about it were some of the pointers indicating what to look for in good coding. I hope he never runs this report on any of my code !

So.... (2, Interesting)

jellomizer (103300) | more than 5 years ago | (#23435574)

The way you choose to license your software doesn't correlate with software quality... Seems logical to me, as how you license your software has very little to do with the code inside the OS.

Closed Source Developer: I will try to do the best job I possibly can so I can keep my job and make money, because that is what I value.

Open Source Developer: I will try to do the best job I possibly can so I can help the community and feel better about myself/get noticed in the community/have something cool to put on my resume... because that is what I value.

Whether people choose to license their software open source vs. closed source says nothing about their programming ability. There are a bunch of really crappy GNU projects out there as well as a bunch of crappy closed source projects... Yes, there is the argument of millions of eyes fixing problems, but really, when you get millions of people looking at the same thing you will get good and bad ideas; the more good ideas you get, the more bad ideas you get, and the more people involved, the harder it gets to weed the good ones from the bad ones. Closed source is often affected by a narrow level of control where bad ideas can be mandated.... All in all, everything really balances out and the effects of the license are negligible.

Re:So.... (1)

jgarra23 (1109651) | more than 5 years ago | (#23436324)


People who choose to license their software OpenSource vs. Closed Source says nothing about their programming ability. There are a bunch of really crappy GNU projects out there as well as a bunch of crappy closed source projects... Yea there is the argument of millions of eyes fixing problems but


You hit the nail on the head, there are millions of eyes looking at the code but how many of those eyes are in the heads of idiots? Probably more than anyone thinks- how many people still buy Britney Spears albums (or ever have)? There are plenty of idiots in the world and every open source project has the same proportion of them... not saying that closed source is any better but this failed argument of millions of eyes is such propaganda... it does nothing to help OSS by perpetuating that myth :)

An interesting point.. (5, Interesting)

SixDimensionalArray (604334) | more than 5 years ago | (#23435680)

I haven't seen anybody else comment on it, but the statement that the quality of the code had more to do with the engineering than with the process through which the code was developed is quite interesting.

From my personal experiences, it typically seems code is written to solve a specific need. Said another way, in the pursuit of solving a given problem, whatever engineering is required to solve the problem must be accomplished - if existing solutions to problems can be recognized, they can be used (for example, Gang of Four/GOF patterns), otherwise, the problem must have a new solution engineered.

Seeing as how there are teams successfully developing projects (with both good, and bad code quality) using traditional OO/UML modeling, the software development life-cycle, capability maturity model, scrum, agile, XP/pair programming, and a myriad of other methods, it would seem to be that what the author is saying is, it didn't necessarily matter which method was used, it was how the solution was actually built (the.. robustness of the engineering) that mattered.

Further clarification on the difference between engineering and "process" would strengthen this paper.

I went to a Microsoft user group event some time ago - and the presenter described what they believed the process of development of code quality looked like. They suggested the progression of code quality was something like:
crap -> slightly less crappy -> decent quality -> elegant code.

Sometimes, your first solution at a given problem is elegant.. sometimes, it's just crap.

Anyways, just my two cents. Maybe two cents too many.. ;)

SixD

What I took from it was: (2, Interesting)

jabjoe (1042100) | more than 5 years ago | (#23435762)

"Linux excels in various code structure metrics, but lags in code style. This could be attributed to the work of brilliant motivated programmers who aren't however efficiently managed to pay attention to the details of style. In contrast, the high marks of WRK in code style and low marks in code structure could be attributed to the opposite effect: programmers who are efficiently micro-managed to care about the details of style, but are not given sufficient creative freedom to structure their code in an appropriate manner. "

However, I was left wondering how it was possible to compare fairly. He already stated:

"Excluded from the kernel code are the device drivers, and the plug-and-play, power management, and virtual DOS subsystems. The missing parts explain the large size difference between the WRK and the other three kernels."

and reading I see even more of the drivers aren't there:

"The NT Hardware Abstraction Layer, file systems, network stacks, and device drivers are implemented separately from NTOS and loaded into kernel mode as dynamic libraries. Sources for these dynamic components are not included in the WRK. "

http://www.microsoft.com/resources/sharedsource/licensing/researchkernel.mspx [microsoft.com]

So it's not like for like. Maybe you would draw different conclusions if it was, maybe the Linux style issue is because of all the drivers the WRK lacks. So even though I think his conclusion sounds probable, I don't feel I can state it as so with any confidence.

some functions missing, (0)

Anonymous Coward | more than 5 years ago | (#23435772)

such as:

define how_good
return crash count
end

now that would give you a winner,

Stupid metrics (3, Interesting)

Animats (122034) | more than 5 years ago | (#23435880)

The metrics used in this paper are lame. They're things like "number of #define statements outside header files" and such.
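As a hypothetical illustration of what that particular metric counts (file and constant names made up):

/* scheduler.c -- hypothetical example */
#include <stdio.h>

/* A #define living in a .c file rather than a shared header is exactly what
 * the "#define statements outside header files" count picks up; if another
 * file needs the same constant, it risks a second, diverging definition. */
#define MAX_TASKS 64

int main(void)
{
    printf("at most %d tasks\n", MAX_TASKS);
    return 0;
}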

Modern code quality evaluation involves running code through something like Purify, which actually has some understanding of C and its bugs. There are many such tools. [wikipedia.org] This paper is way behind current analysis technology.

Re:Stupid metrics (3, Insightful)

Diomidis Spinellis (661697) | more than 5 years ago | (#23436214)

It took me about two months of work to collect these metrics. Yes, running the code of the four kernels through a static analysis tool in addition would have been even better, but this would have been considerably more work: you need to adjust each tool to the peculiarities of the code, add annotations in the code, weed out false positives, and then again you only get one aspect of quality, the one related to bugs like deadlocks and null pointer dereferences.

Using one of the tools you propose, you would still not obtain results regarding the analysability, changeability, or readability of the code.

"Code quality" is bunk (5, Interesting)

mlwmohawk (801821) | more than 5 years ago | (#23435968)

Sorry, I've been in the business for over 25 years and have had to hear one pinhead after another spout about code quality or productivity. It's all subjective at best.

The worst looking piece of spaghetti code could have fewer bugs, be more efficient, and be easier to maintain than the most modular object oriented code.

What is the "real" measure of quality or productivity? Is it LOC? No. Is it overall structure? no. Is it the number of "globals?" maybe not.

The only real measure of code is the pure and simple Darwinian test of survival. If it lasts and works, it's good code. If it is constantly being rewritten or is tossed, it is bad code.

I currently HATE (with a passion) the current interpretation of the bridge design pattern so popular these days. Yea, it means well, but it fails in implementation by making implementation harder and increasing the LOC benchmark. The core idea is correct, but it has been taken to absurd levels.

I have code that is over 15 years old, almost untouched, and still being used in programs today. Is it pretty? Not always. Is it "object oriented"? Conceptually, yes, but not necessarily. Think of the "fopen"/"fread" file operations: conceptually, the FILE pointer is an object, but it is a pure C convention.
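A minimal sketch of that point, using nothing but the standard C library: the FILE handle is opaque, and fopen/fread/fclose behave roughly like a constructor, a method, and a destructor.

#include <stdio.h>

int main(void)
{
    unsigned char buf[16];

    FILE *fp = fopen("data.bin", "rb");       /* "constructor": hands back an opaque object */
    if (fp == NULL)
        return 1;

    size_t n = fread(buf, 1, sizeof buf, fp); /* "method": operates on that object */
    printf("read %zu bytes\n", n);

    fclose(fp);                               /* "destructor": releases the resource */
    return 0;
}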

In summation:
Code that works -- good.
Code that does not -- bad.

Re:"Code quality" is bunk (3, Insightful)

Llywelyn (531070) | more than 5 years ago | (#23436278)

There is a company that has, at the heart of its business, a 6000-line SQL statement that no one understands, no one can modify, and that occasionally doesn't work without anyone knowing why, though a restart of the program seems to take care of it.

It has lasted that way for a very very long time.

Is it good code simply as a function of its survival and (sort of) working?

I tend to think of good code like good engineering or good architecture. Surely you wouldn't define good architecture as "a building that remains standing," would you? The layout of the rooms, how well that space is used, how well it fits the needs of the users, how difficult it is to make modifications, etc all factor in to "good design" and have nothing to do with whether the building "works."

I am not sure you can put a metric to it any more than I could put a metric on the quality of abstract expressionism or how well a circuit is laid out--there may be metrics to aid in the process, but in the end one can't necessarily assign a numerical rating to the final outcome.

That doesn't mean that there isn't such a thing as good quality and bad quality code.

Re:"Code quality" is bunk (2, Interesting)

mlwmohawk (801821) | more than 5 years ago | (#23436514)

Is it good code simply as function of its survival and (sort of) working?

"sort" of working is not "working."

exists a 6000 line SQL statement that no one understands

This is "bad" code because it needs to be fixed and no one can do it.

Surely you wouldn't define good architecture as "a building that remains standing,"

I'm pretty sure that is one of the prime criterion for a good building.

Your post ignores the "works" aspect of the rule. "Works" is subtly different than "functions." "Works" implies more than merely functioning.

Re:"Code quality" is bunk (3, Interesting)

Diomidis Spinellis (661697) | more than 5 years ago | (#23436462)

Coding to achieve some code quality metrics is dangerous, but so is saying that code that works is good. Let me give you two examples of code I wrote a long time ago that still survive on the web.

This example [ioccc.org] is code that works and also has some nice quality attributes: 96% of the program lines (631 out of the 658) are comment text rendering the program readable and understandable. With the exception of the two include file names (needed for a warning-free compile) the program passes the standard Unix spell checker without any errors.

This example [ioccc.org] is also code that works, and is quite compact for what it achieves.

I don't consider either of the two examples quality code. And sprucing up bad code with object orientation, design patterns, and a layered architecture will not magically increase its quality. On the other hand, you can often (but not always) recognize bad quality code by looking at figures one can obtain automatically. If the code is full of global variables, gotos, huge functions, copy-pasted elements, meaningless identifier names, and automatically generated template comments, you can be pretty sure that its quality is abysmal.

Metrics (0)

Anonymous Coward | more than 5 years ago | (#23436204)

A couple of issues with your metrics:

1. "Figure 6: Common coupling at file and global scope."

Shouldn't the "coupling" be weighted by the actual "scope" size? For example:

- Having a small file with lots of file global symbols shouldn't increase complexity much.
- Having 1 global symbol in all files is far better than 10 global symbols in 1/10 of the files.

2. "Strictly structured functions are those following the rules of structured programming: a single point of exit and no goto statements."

Having lots of gotos/labels doesn't increase the code complexity per se. Having nested gotos, a mix of gotos jumping forwards and backwards, and intermixed gotos and labels does. For example, this is simple:

goto a;
...
goto b;
...
goto c;

a: ...
b: ...
c: ...

while this is unreadable (but uses the same number of labels and goto statements):

b: ...
goto c;
...
a: ...
goto b;
...
c: ...
goto a;

Rui

Additionally, GOTOs in Linux (0)

Anonymous Coward | more than 5 years ago | (#23436554)

are, as far as modules are concerned, the BEST way to handle partial setup during module loading, when the module doesn't register fully and must unload itself.

That's going to up the GOTO count in Linux, especially since drivers were left out of Windows.
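A rough sketch of that pattern, with made-up register_foo/register_bar names standing in for real setup steps; each failure jumps to a label that unwinds only what has already succeeded:

#include <stdio.h>

/* Hypothetical stand-ins for setup steps that can fail. */
static int register_foo(void)    { return 0; }
static void unregister_foo(void) { }
static int register_bar(void)    { return 0; }

/* Module-init style error handling: gotos used for partial-setup unwinding. */
static int example_init(void)
{
    int err;

    err = register_foo();
    if (err)
        goto out;                   /* nothing registered yet, just bail */

    err = register_bar();
    if (err)
        goto out_unregister_foo;    /* undo only the step that succeeded */

    return 0;

out_unregister_foo:
    unregister_foo();
out:
    return err;
}

int main(void)
{
    printf("example_init returned %d\n", example_init());
    return 0;
}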

This is virtually baseless (2, Insightful)

malevolentjelly (1057140) | more than 5 years ago | (#23436276)

It's a well known fact that code will always resemble the institution that produced it, to some extent. To describe the Microsoft code as "poorly structured" is likely a bit out of touch.

The absolutely best kernel code is generally extremely beautiful and descriptive when dealing with the system's abstracts (with nice, long descriptive names for each function) and then unbelievably hellish and ugly in the sections that deal with hardware. Kernels represent an intersection between the idealistic system code and the hideously complex and inhuman machine interaction code. For this reason, we gauge the value of the systems based on how cleanly they compile into assembly, their performance, and ideally how well they do what they were written to do.

Kernel code fills such a complex role in the computer science paradigm that it is likely impossible to gauge the value or quality of any of them through any sort of automated means. What we have here is a mess of a research paper that comes to no obvious conclusions because they didn't really discover anything. If it were of any value, its final summary and conclusions wouldn't be so obfuscated. The researcher may or may not have mastered the art of understanding the zeitgeist of kernels but he certainly hasn't mastered the research paper.

Analogy (1)

TheLink (130905) | more than 5 years ago | (#23436482)

By the time you get a car/plane that's accepted by the "market", the insides are going to be of a certain quality, no matter what the process used.

Did I get that right? :)

So now the question is how long it took to get that level of quality. And maybe there is a difference in quality, but the measurement used is not sensitive enough or not appropriate, or his conclusion isn't quite correct - he measured a difference, it just didn't show up in his conclusion ;).