Serious Bug In 2.4.15/2.5.0

timothy posted more than 12 years ago | from the aim-the-bugspray dept.

Linux 498

John Ineson writes: "There is a bug in the latest kernel releases, that causes fs corruption on umount. A lot of people have already been hit by this, so for now I suggest you hold fire on booting those new kernels. More dead-duck than greased-turkey. Two possible fixes are being discussed on linux-kernel." Colin Bayer adds links to a story at the Register and Al Viro's fix. Update: 11/25 00:39 GMT by T : Tarkie writes "Linux 2.4.16-pre1 is out, as detailed at NewsForge. If you've been having the filesystem corruptions, might be worth a try so that 2.4.16 can be out ASAP!"

Filesystems (2)

fishebulb (257214) | more than 12 years ago | (#2607035)

From the looks of the post, this bug occurs regardless of filesystem. Is that accurate, or would certain filesystems be unaffected? I'm guessing that it doesn't matter. Anyone care to clarify?

Re:Filesystems (5, Informative)

MShook (526815) | more than 12 years ago | (#2607047)

You're correct, it is regardless of filesystem. If you happen to be running 2.4.15 or 2.5.0, just remember to force an fsck on the next reboot (shutdown -F); that's the only way to clean the fs, because it will be marked clean even if it's not. Right now, the developers don't know how reiserfs would deal with this bug...

Re:Filesystems (5, Interesting)

Colin Bayer (313849) | more than 12 years ago | (#2607054)

It afflicts every filesystem. However, rebooting with the file /forcefsck extant forces it to run an fsck (and fix the corruption) on boot.

Also of help might be the Alt+SysRq keys; if you sync the drives and unmount them in single user mode before reboot, you should reduce or eliminate the corruption.
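For anyone unsure what "the file /forcefsck extant" means in practice, here is a minimal sketch. It assumes your distribution's boot scripts check for that flag file, which is worth verifying; the function name and the optional root-prefix argument are made up for illustration so the snippet can be tried against a scratch directory first.

```shell
#!/bin/sh
# Sketch: ask for a forced fsck on the next boot by creating the flag
# file /forcefsck. The optional first argument is a hypothetical root
# prefix (empty means the real root), handy for a harmless trial run.
request_forced_fsck() {
    root="${1:-}"
    # Assumption: the init scripts look for this file at boot time.
    touch "$root/forcefsck" && echo "created $root/forcefsck"
}
```

Run as root with no argument to flag the real root filesystem. On sysvinit systems, "shutdown -F -r now" is meant to have the same effect.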

Alan Cox (1, Funny)

linux_warp (187395) | more than 12 years ago | (#2607036)

Good thing we have alan cox who tries to keep his tree somewhat more stable. Anyone know if his kernels were affected?

Re:Alan Cox (4, Interesting)

linux_warp (187395) | more than 12 years ago | (#2607049)

Also, straight out of alans diary:

September 29th - Much kernel patching going on. The -ac kernel tree seems to be turning into the stable tree as Linus merges odder, weirder and more alarming things. I just hope he knows what he is doing.
---

Sounds like confidence to me :)

Re:Alan Cox (1)

Rohan427 (521859) | more than 12 years ago | (#2607190)

Note: This is not supposed to be a stable release.

- Rohan

Stick with 2.4.15-pre8 (2, Informative)

ShawnX (260531) | more than 12 years ago | (#2607037)

No problems with this kernel pre release :)

quality assurance (1, Insightful)

xah (448501) | more than 12 years ago | (#2607038)

When are we going to start giving kernels to a QA team before releasing them?

"QA" (2, Insightful)

Anonymous Coward | more than 12 years ago | (#2607051)

The users are the QA (why do you think Linus moved to 2.4 so early? To get more testers). If you don't like being a guinea pig, then wait about a week before moving to the newest kernel. Seriously, 7 days isn't that long, and all show-stoppers will have shown up long before then.

Re:"QA" (1, Informative)

Anonymous Coward | more than 12 years ago | (#2607161)

You cannot compare Linux and FreeBSD that way. FreeBSD is a complete OS, not just the kernel.

I've NEVER seen filesystem corruption caused by my distribution (Red Hat) kernel.

Compare FreeBSD with the whole Red Hat Linux (probably the same for Debian, don't know about the others), and you'll see neither have this sort of problems.

Re:quality assurance (0, Interesting)

Anonymous Coward | more than 12 years ago | (#2607065)

As soon as you get a professional group of developers who are compensated for their work. You can't rely on the garbage that's being churned out by the long-haired communist hippies that are in charge now.

Re:quality assurance (1)

Sj0 (472011) | more than 12 years ago | (#2607219)

As soon as you get a professional group of developers who are compensated for their work. You can't rely on the garbage that's being churned out by the long-haired communist hippies that are in charge now.

Let he who has never had Windows corrupt a part of a FAT partition bitch about how it's because they don't get paid. The rest of us will deal with it, because a part of trying new things (innovating, if you will) is making mistakes and fixing them. Anyone who runs out and installs the latest patch the second it's posted is asking for trouble. If it ain't broke, don't fix it.

this is not flamebait (0)

xah (448501) | more than 12 years ago | (#2607092)

My post is not flamebait. It's an honest, legitimate question.

Re:this is not flamebait (0, Offtopic)

pjbass (144318) | more than 12 years ago | (#2607112)

They shall be meta-moderated accordingly.

Re:this is not flamebait (-1, Offtopic)

Anonymous Coward | more than 12 years ago | (#2607173)

Sometimes, you have to wonder about the moderation system, the moderators themselves and how anyone gets labelled (sp?) a "troll" and whatnot.

So many times I have seen very valid posts moderated at 0 (zero) while I've seen idiotic ones "promoted" to 1 or even 2, it kind of leaves me with a bad after-taste...

There is a need for moderation (to avoid what some little creeps started doing a few years back, when they discovered /. and started posting downright obscene (sp?) and vulgar msgs), but the end-results can be sometimes quite puzzling. And no, this is no flame-bait. I'm just pointing a fact that makes me wonder, like I said earlier.

i agree (0)

Anonymous Coward | more than 12 years ago | (#2607188)

moderators arent needed anymore. (no, im not trolling)

thats the problem (-1, Troll)

ArchieBunker (132337) | more than 12 years ago | (#2607203)

You're the QA team when running Linux. Say you upgrade the kernel on a production server because of a security hole, and it turns out this new kernel also corrupts your data. The truth is you're fucked. How many times has this occurred with any other OS? None. That's why I stick with commercial Unix and the *BSDs.

Re:quality assurance (0)

Anonymous Coward | more than 12 years ago | (#2607239)

The Linux kernel is Free and it's open source. The users are the QA testers. If you're a user, but you don't want to be a QA tester then do what the man says and stay at least a week or so behind the latest releases. If you're running important services on your Linux box you would be wise in many cases to still be on 2.2.

The freedom of open source which we value so much necessarily includes enough freedom to shoot yourself in the foot with bleeding edge untested software.

Does anyone know... (4, Insightful)

Griim (8798) | more than 12 years ago | (#2607039)

...how something like this could have crept in and been missed? Was it a last-minute change that just didn't have time for testing, or was it (bad)luck-of-the-draw that no one noticed it?

Re:Does anyone know... (5, Informative)

Colin Bayer (313849) | more than 12 years ago | (#2607069)

This bug was introduced when the kernel coders were trying to fix a bug that existed earlier (but, AFAIK, didn't cause fs corruption). It was introduced in pre9, but the final kernel was released within a few hours, so I guess nobody caught it in time.

Re:Does anyone know... (0)

Anonymous Coward | more than 12 years ago | (#2607075)

It was a last minute change. In fact, it was a patch to fix a very similar problem, but the fix didn't quite solve it. So the problem was known, but the fix was untested...

Re:Does anyone know... (0)

Anonymous Coward | more than 12 years ago | (#2607187)

a last-minute change that just didn't have time for testing,


I'm disturbed by the acceptance of such a situation, implicit in this comment. Note:
  1. This is a kernel, not a piece of user-land software.
  2. This kernel is in the stable release branch, supposedly good for enterprise computing, mission-critical work--stuff where human lives are on the line.

I'm absolutely shocked that changes to the file system were not thoroughly checked for basic performance bugs. While I don't expect such testing would reveal all bugs, it should have uncovered this.

Remember back to the 2.1.x kernel days, when a new release of 2.1.x was out and we all took our sacrifice box and ran it? We understood and accepted that the odd-numbered branch was unstable, and used it at our own risk.

I wonder if we can't return to those days.

Linux is not mission critical software. (0)

glrotate (300695) | more than 12 years ago | (#2607200)

stuff where human lives are on the line.


You're joking right?

If you are already running it... (3, Funny)

krogoth (134320) | more than 12 years ago | (#2607041)

I recommend turning your computer off with the power switch or by unplugging it, after you've made sure you can boot an older kernel. Since umounting is done when you shut down cleanly, you don't want to do that.

Re:If you are already running it... (1)

xah (448501) | more than 12 years ago | (#2607068)

But first make sure you can boot into a different kernel next time, either from hard drive or from removable media such as floppy.

to clarify (0)

Anonymous Coward | more than 12 years ago | (#2607071)

Turn it off with the power switch *when you're going to turn it off*. There's no need to panic and start shutting it down for no reason :). I think I'll keep my box up until 2.4.16 is released and just run 2.4.15 until then. FWIW, I've already rebooted a couple times with 2.4.15, so it obviously doesn't show up for everyone (or at least not every time).

Re:to clarify (2)

krogoth (134320) | more than 12 years ago | (#2607113)

You would still have to be careful until then - people who regularly mount and unmount read/write might want to be careful. I wonder if mounting read-only would help, or if the bug is below that level (from the discussion, it doesn't sound like it)?

this post is not a troll (1)

xah (448501) | more than 12 years ago | (#2607110)

The author has very valid advice. Yet, some moderator marked this post a "troll." That's like saying Click and Clack on NPR are opponents of the automobile industry.

Make sure you have removed ext3 option, too (2, Informative)

willamowius (193393) | more than 12 years ago | (#2607198)

For those who have tried ext3 in 2.4.15:
Make sure you have reset the journaling flag on your filesystems, because your older kernel will not mount an unclean ext3 volume.

Do a "tune2fs -O ^has_journal /dev/whatever".

what? (-1)

Fucky the troll (528068) | more than 12 years ago | (#2607042)

Will somebody please buy me a rock organ for my birthday. I wanna play me some cheese.

Bad start (1)

C0vardeAn0nim0 (232451) | more than 12 years ago | (#2607046)

for the Brazilian guy, hum ?

Well, this is not the first, and probably won't be the last, dangerous bug in the 2.4 series. IIRC (too lazy today to check) 2.4.11 is marked as "do not use" in the kernel mirrors.

When (-1, Flamebait)

Genghis Troll (158585) | more than 12 years ago | (#2607052)

has microsoft ever released a piece of software as bug-ridden as the 2.4.x linux kernel series?

Patch available from AA (0)

Anonymous Coward | more than 12 years ago | (#2607055)

There is a patch available from Andrea Arcangeli that fixes the problem. The link is http://marc.theaimsgroup.com/?l=linux-kernel&m=100658821112994&w=2 -kurt

FS corruption? (2, Interesting)

be-fan (61476) | more than 12 years ago | (#2607057)

Dude. I hate to say this, but Windows 2000, while it may crash more, doesn't hose your filesystem nearly as often as Linux seems to these days. At what point do we get to start making the LinSux jokes?

PS> Don't flame me please. I just wiped Win2K off my harddrive this morning. Luckily, I downloaded the 2.4.15 tree but have been too lazy to compile it yet.

Re:FS corruption? (1)

fishebulb (257214) | more than 12 years ago | (#2607077)

Not a flame, but I've had more corruption on Win2k. Our server at work had a corrupt file that hosed backups. MS's response for getting rid of this file: format the drive. That may not count as a corrupt filesystem, but it was quite a serious problem.

Personally, I've never had a corrupt filesystem (that wasn't fsck-fixable) in Linux using ext2, reiser, or xfs.

Re:FS corruption? (2)

be-fan (61476) | more than 12 years ago | (#2607084)

Neither have I (and I track the -pre kernels), I was just being facetious. It gives MS more ammo anyway, though.

Re:FS corruption? (1)

vrmlknight (309019) | more than 12 years ago | (#2607160)

this release isnt exactly something i'd want to put on a production server im sure some of the alpha and/or internal releases of win2k or nt had problems too

beats me (0)

Anonymous Coward | more than 12 years ago | (#2607083)

I've been running Linux since the 2.0 days (and I even went through part of 2.1 and 2.3), and I'm running 2.4.15 right now (and umounted a few times with it). And I have *never* had *any* FS corruption with Linux. Although to be fair, since I've never run Win2k, it hasn't hosed my FS yet either :). So I guess the score is Linux 0 - Windows 0 at this point (unless you count the Win98 CD Install as "Windows", in which case it's like Linux 0 - Windows 3, heh). Maybe I'm just a statistical anomaly *shrug*

Re:FS corruption? (2, Insightful)

A_Non_Moose (413034) | more than 12 years ago | (#2607096)

Not going to flame you, just trying to amuse.

I thought the reason for installing *nix's was so you'd never have to shut down? Therefore this should not be a problem.

Now does this occur during *any* unmounting operation? Manually vs Shutdown?

Oh, and be-fan, don't install XP and use Ext3 (hey, that rhymes) because if XP uses your Ext3 as swap space and 2.4.15 corrupts itself...woah, double whammy.

Hey, any chance of getting iTunes 2.0 on Linux and Windows? Wanna play Russian Roulette...with an Uzi?

Whip me, beat me, make me write bad checks (or install windows...same same)

Re:FS corruption? (5, Insightful)

Jagasian (129329) | more than 12 years ago | (#2607097)

That's funny. I have been running Debian (stable) for a long time now, and I haven't had any filesystem corruption. In fact, I haven't had the OS crash either.

It's better to compare Windows 2000 to another complete operating system, NOT a bleeding edge kernel. Compare Windows 2000 to Debian (stable), and Windows 2000 will look like a house of cards.

2.2.20 anyone? (0)

Anonymous Coward | more than 12 years ago | (#2607064)

just another reason why 2.2.20 is still the kernel of choice

Really... (2, Insightful)

J.C.B. (141141) | more than 12 years ago | (#2607067)

Isn't the 2.4 branch supposed to be stable? You know, the one that doesn't eat your disk. I think that this kernel should have gotten a little more testing for bugs of the catastrophic nature before it was deemed fit for general consumption.

NO! (2, Informative)

Anonymous Coward | more than 12 years ago | (#2607098)

This is a common misconception! 2.4 is *not* "stable"! It is "testing"! Well, now that it's split in two I suppose it can officially be called "stable" (what a bad start!), but I don't consider it stable (though I'm just a lowly AC). AFAIC, 2.2 = "stable" and 2.4 = "testing". In a month or so, things will change and we'll have 2.4 = "stable" and 2.5 = "experimental". Until 2.5 turns into 2.6/3.0, at which point it will be "testing", and the cycle continues :)

Re:Really... (2, Interesting)

Colin Bayer (313849) | more than 12 years ago | (#2607115)

Just like so many things with Linux, you'd think that this would be true, but it isn't. ;) There are two trees that are in development at all times:

The "stable" tree (which has an even minor version number), and the "development" tree (which has an odd minor).

When kernel 2.(2n).0 (n being a non-negative integer, in this case, 2) is released, development stops on 2.(2n-1).x, and all newly-submitted-and-approved-by-Linus patches are applied to the 2.(2n).x tree (because 2.(2n-1).x is out of date). While 2.(2n).x is still called the stable tree, it becomes the development tree (because some of the newly-patched stuff is untested, and there's no "development" tree to put it in). The "stable" role falls back to 2.(2n-2).x, in this case, the 2.2.x tree.

As far as this goes, it was a stroke of bad luck and hurried timing that this bug wasn't ironed out in 2.4.15-pre9 before it went final (and somewhat of a stroke of stupidity on the parts of the early adopters, myself included).

When 2.(2n+1).0 is released, 2.(2n).x continues development, making it the stable tree. Any fixes to bugs found in the 2.(2n+1).x tree are back-merged to the current stable tree so that end-users can enjoy a stable, debugged kernel without riding the bleeding edge.

The problem with the Linux kernel numbering system is that the "stable" tree is only stable when there's a "development" tree to test the uncharted waters for it... if there isn't one, it's best to stay back a few revisions unless you like running fsck. ;)
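To make the even/odd naming rule concrete, here is a small sketch that labels a version string by the parity of its minor number. The function name is hypothetical, and the label only reflects the naming convention described above. As the post argues, it says nothing about whether a given "stable" release is actually safe to run.

```shell
#!/bin/sh
# Classify a 2.x-era kernel version by the parity of its minor number:
# even minor -> "stable", odd minor -> "development".
kernel_series() {
    # Drop any suffix such as -pre9 or -ac8, then take the second field.
    base=${1%%-*}
    minor=$(printf '%s\n' "$base" | cut -d. -f2)
    case "$minor" in
        ''|*[!0-9]*) echo "unknown"; return 1 ;;
    esac
    if [ $((minor % 2)) -eq 0 ]; then
        echo "stable"
    else
        echo "development"
    fi
}
```

For example, "kernel_series 2.4.15" prints "stable" and "kernel_series 2.5.0" prints "development", even though both carry the same bug.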

Re:Really... (5, Insightful)

Anonymous Coward | more than 12 years ago | (#2607155)

(Inven: r-r, * to see, ESC) Wear/Wield which item? r
You are wielding a Rant Stick (1d2) (+0,+0) (*slay* kernel developer)(a).

It's not so much that it wasn't stable enough when it was released, but rather that they keep messing with 2.4 instead of making a 2.5. I think maybe Linus had this idea (at the end of 2.3) that the developers could focus on fixing bugs and make 2.4 really great. Unfortunately, they're volunteer developers, so they're working on things that excite them, which means insane stuff like VM rewrites and other "hey, let's try this" changes.

This is why I still use 2.2 and will until there has been a 2.5 for a while (so the developers have a place to try their unstable new ideas) and 2.4 has gone into "bug-fix" mode (like 2.2 is now). It's really annoying, because I want some of the new features of 2.4 (the ones introduced back in 2.3), but can't afford to have the thing crashing on me, and don't want to spend a long time looking for a stable 2.4.X.

Maybe next time, Linus won't wait so long to introduce a development version, or will at least refuse anything but bugfixes in so-called "stable" branches. Still, despite my complaining, I am happy that people have gone through all the trouble to write the Linux kernel, and will try to remember that. :)

I know I'm going to get modded down (-1, Flamebait)

Anonymous Coward | more than 12 years ago | (#2607076)

but it has to be said: this would never happen with windows since MS have proper testing procedures.
Until Linux goes commercial, it'll never succeed.

Re:I know I'm going to get modded down (1)

tollieman (243634) | more than 12 years ago | (#2607085)

Well a least it is more secure...

Re:I know I'm going to get modded down (0)

Anonymous Coward | more than 12 years ago | (#2607090)

No distro has used this buggy kernel yet. If Red Hat had already released its latest edition using 2.4.15 I could see your point. But at this point the only people who have upgraded to 2.4.15 are the power users who like living on the edge; there is no comparison to the closed-source world except possibly to Windows developers who constantly install the latest betas.

Don't throw stones in Glass Houses (1, Insightful)

Anonymous Coward | more than 12 years ago | (#2607078)

Hmm..wow this is a serious bug alright.

Yet there's no snide commentary from the editors whenever something like this happens with Microsoft (M$ to all the haters) software.

Maybe you zealots will realize that nobody is perfect, and open-source is not necessarily better than closed-source.

This also makes a case for not announcing new kernels on slashdot (aka not freshmeat). Most people here are linux newbie wannabes so they're not the most qualified people to be running the latest and greatest kernels.

Re:Don't throw stones in Glass Houses (0)

Anonymous Coward | more than 12 years ago | (#2607108)

The newbies won't know how to recompile their kernels anyway. Anyone who upgrades to a kernel within a week of its release knows that this kind of thing sometimes happens.

Re:Don't throw stones in Glass Houses (2)

toast0 (63707) | more than 12 years ago | (#2607151)

how hard is it really to compile a kernel?

download the source, read the kernel-howto, go through the menu (or x config), make bzImage, etc

then repeat as necessary to get it to boot properly (ide root drive, load ide driver as module is always a good combo :)

Re:Don't throw stones in Glass Houses (3, Interesting)

fishebulb (257214) | more than 12 years ago | (#2607117)

Yes, this is quite a serious bug, but two things set this apart from MS. It will be fixed within 24-48 hours. The frequency of these bugs is a bit smaller than MS's bug of the day (which very often are large holes).

actually... (0)

Anonymous Coward | more than 12 years ago | (#2607137)

...24-48 *minutes* is more like it ;). There are already a few patches out for it.

A Workaround (4, Informative)

kanelephant (142254) | more than 12 years ago | (#2607081)

Al Viro gave this comment and workaround on lkml.
Breakage happens when you umount filesystem (_any_ local filesystem, be it ext2, reiserfs, whatever) that still has dirty inodes.

As a workaround - sync before umount (and don't boot unpatched 2.4.15/2.4.15-pre9 again, obviously).

IOW, if you are running 2.4.15 - build a patched kernel, install it and do the following:
* switch to single-user
* sync
* umount everything non-busy
* remount the rest read-only
* turn the thing off
* boot with patched kernel or with anything before 2.4.15-pre9

The filesystem corruption can be fixed by a forced fsck. (The fsck must be forced since the filesystem is marked clean.)
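The steps above can be sketched as a dry-run script. This is not from Al Viro's mail: the helper names are made up, and the remount line assumes that / is the only filesystem still busy after the umount. By default it only prints the commands, so it is safe to read through before doing anything for real.

```shell
#!/bin/sh
# Dry-run sketch of the shutdown sequence quoted above. With DRY_RUN unset
# or set to 1, each command is only printed; DRY_RUN=0 executes it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

safe_shutdown_steps() {
    # telinit 1 usually ends the current session, so in real use the
    # remaining steps are typed by hand at the single-user prompt.
    run telinit 1                 # switch to single-user mode
    run sync                      # flush dirty inodes before any umount
    run umount -a                 # unmount everything that is not busy
    run mount -o remount,ro /     # assumption: only / is still busy
    echo "# now power off, then boot a patched kernel or one older than 2.4.15-pre9"
}
```

The point of the ordering is that sync comes before any umount, since the breakage is triggered by unmounting a filesystem that still has dirty inodes.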

Re:A Workaround (1)

Peyna (14792) | more than 12 years ago | (#2607123)

fsck on a 30 gig drive for each unmount isn't my idea of fun.

Re:A Workaround (1)

toast0 (63707) | more than 12 years ago | (#2607163)

yes, but if you compile a patched kernel, you'll only have to do it once.

when was the last time you fscked your drive? maybe its prime time for you to do so (all sorts of things can cause fs corruption without marking the drive unclean... happens more often with poor/overstressed components, but it can happen anyhow)

Re:A Workaround (3, Informative)

kanelephant (142254) | more than 12 years ago | (#2607128)

sorry I didnt make that clear. If you follow the above advice you should not get any filesystem corruption. The last line is what to do if you have already got a corrupt filesystem!

Re:A Workaround (3, Interesting)

Anonymous Coward | more than 12 years ago | (#2607159)

The strange thing is, out of habit (years ago, you always had to remember to sync on Unix, and due to a bug, you always had to sync more than once), I always sync, sync, sync, umount...

Re:A Workaround (0)

Anonymous Coward | more than 12 years ago | (#2607180)

I thought I was the only one.

the patch from the kernel list (4, Informative)

MentlFlos (7345) | more than 12 years ago | (#2607088)

I hope /. doesn't mangle this up too bad, but if it does:
http://marc.theaimsgroup.com/?l=linux-kernel&m=100 658174003122&w=2

List: linux-kernel
Subject: Re: 2.4.15-pre9 breakage (inode.c)
From: Linus Torvalds
Date: 2001-11-24 5:55:42
[Download message RAW]

On Sat, 24 Nov 2001, Andrea Arcangeli wrote:
>
> --- 2.4.15pre9aa1/fs/inode.c.~1~ Thu Nov 22 20:48:23 2001
> +++ 2.4.15pre9aa1/fs/inode.c Sat Nov 24 06:30:20 2001
> @@ -1071,7 +1071,7 @@
> if (inode->i_state != I_CLEAR)
> BUG();
> } else {
> - if (!list_empty(&inode->i_hash) && sb && sb->s_root) {
> + if (!list_empty(&inode->i_hash)) {
> if (!(inode->i_state & (I_DIRTY|I_LOCK))) {
> list_del(&inode->i_list);
> list_add(&inode->i_list, &inode_unused);

I have to say that I like this patch better myself - the added tests are
not sensible, and just removing them seems to be the right thing.

Linus

Re:the patch from the kernel list (2)

MentlFlos (7345) | more than 12 years ago | (#2607167)

of course something would go wrong.... that link is whacked out.

I'm not sure where its going (vs where I was).

I'm about to try and apply that patch and see what happens.

The discussion isn't over (4, Informative)

Carnage4Life (106069) | more than 12 years ago | (#2607191)

The last post in that thread is this one by Andrea Arcangeli [theaimsgroup.com] sometime this morning and from the looks of things (if you read the entire thread) there is conflict between Alexander Viro and Andrea on which is the better solution.

Linus saying he prefers a patch on an initial viewing isn't the end of the situation for now. I'd suggest waiting a week and revisiting the thread to find out what the final word was.

Re:The discussion isn't over (1)

Lord Omlette (124579) | more than 12 years ago | (#2607227)

No room for credit in your sig?

HA HA! (-1)

Guns n' Roses Troll (207208) | more than 12 years ago | (#2607091)

I'd never trust my systems to use Linux. Windows 2000 for me, thanks.

Possible Fixes? (-1, Flamebait)

Anonymous Coward | more than 12 years ago | (#2607100)

How long has this bug been known about? More than 24 hours? If so, all the /.-ers need to apologize for the raking over the coals given to Apple for its MINOR install bug with iTunes. Sounds like if you boot Linux you lose all your Natalie Portman pictures and hot grits recipes.

Let's see if the open source community can solve a MAJOR file system bug in 24 hours. The same amount of time Apple typically corrects its problems. Boy, I feel sorry for anyone that lost their data to an open source OS. Makes the proprietary Mac OS X look real nice right now. It has never had a file system bug of this magnitude. Now that I think about it neither has Microsoft. Man it must suck to be a 2.4-2.5 kernel user right now. So much for the superiority of Open Source. Just think, Red Hat's offer to install Linux in every classroom could have changed the standard excuse, "The dog ate my homework" to "The Linux corrupted my homework files".

Oops! I said something negative about GNU/Linux, guess this will be modded down. Got to love those file trashing open source operating systems and the holier than thou users (Is Stallman Pope of the movement yet?).

It's already been fixed. (0)

Anonymous Coward | more than 12 years ago | (#2607126)

There's already a fix for this problem.
You have either not read the message correctly, or you are an anti-open source zealot.

Re:It's already been fixed. (0)

Anonymous Coward | more than 12 years ago | (#2607147)

Are these official fixes incorporated into the kernel tree or just user hacks and are you foolish enough to trust anyone that posts kernel code to a web page without peer review and QA testing? Maybe that makes you a blind and stupid open source zealot

Re:Possible Fixes? (0)

bonch (38532) | more than 12 years ago | (#2607202)

I'll bite. I'm bored.

>Let's see if the open source community can solve a MAJOR file system bug in 24 hours. The same amount of time Apple typically corrects its problems.

There's already a patch. Oops!

>Boy, I feel sorry for anyone that lost their data to an open source OS.

Nobody did. Hell, a forced fsck fixes a corrupted fs right up, and the patch is already out. Problem solved.

>Makes the proprietary Mac OS X look real nice right now. It has never had a file system bug of this magnitude. Now that I think about it neither has Microsoft.

The FAT filesystem design itself is a design flaw. Micros~1?

>Man it must suck to be a 2.4-2.5 kernel user right now.

No, it doesn't, but thanks for inquiring.

>So much for the superiority of Open Source.

When has Microsoft announced a major flaw and fixed it in so little time? Have you forgotten how they've kept secret MAJOR bugs until other companies finally revealed them? Oops!

>Just think, Red Hat's offer to install Linux in every classroom could have changed the standard excuse, "The dog ate my homework" to "The Linux corrupted my homework files".

Heh. That was actually kind of funny.

>Oops! I said something negative about GNU/Linux, guess this will be modded down.

Nobody has a problem with constructive criticism. But you're obviously just trying to push some buttons because you need the attention for whatever reasons.

- bonch
stay animated

Strange (5, Insightful)

imrdkl (302224) | more than 12 years ago | (#2607101)

that a successful reboot of the system running the kernel is not in the regression suite. Does this error occur on every architecture?

Re:Strange (2, Informative)

Colin Bayer (313849) | more than 12 years ago | (#2607124)

Does this error occur on every architecture?

Yep... since the affected files are in fs/, not arch/*, it's an architecture-independent problem. Good thing I have the Magic SysRq enabled. ;)

This is why I use FreeBSD (4, Insightful)

cperciva (102828) | more than 12 years ago | (#2607118)

Come on guys, nobody is going to take linux seriously as long as problems like this -- or the VM saga -- keep popping up in supposedly stable kernels. FreeBSD has no trouble keeping separate -CURRENT and -STABLE trees; why can't linux do the same?

Re:This is why I use FreeBSD (1)

dmelomed (148666) | more than 12 years ago | (#2607139)

Yes, Linux development needs better QA, akin to the BSDs'.

Re:This is why I use FreeBSD (2)

Colin Bayer (313849) | more than 12 years ago | (#2607157)

Come on guys, nobody is going to take linux seriously as long as problems like this -- or the VM saga -- keep popping up in supposedly stable kernels.

Read my earlier post on the subject. This is not a stable kernel, as there is no development tree to iron out all the bugs. If you ask me, anyone who upgraded to something as bleeding-edge and untested as this (myself included) deserves to get burned a little bit as a warning that you don't really need the newest kernel. ;)

Re:This is why I use FreeBSD (1)

Colin Bayer (313849) | more than 12 years ago | (#2607165)

Eep... didn't mean to give myself +1. D'oh!

Re:This is why I use FreeBSD (1)

drsoran (979) | more than 12 years ago | (#2607208)

Because this was a merge between a development and a stable tree. For some reason someone had the bright idea to make them equivalent. 2.4.14 however has been fine for me. 2.4.15 should have just been renamed 2.5.0 and left it at that. Then if it caused corruption, so be it.. it is development.

Real fix (-1, Flamebait)

Anonymous Coward | more than 12 years ago | (#2607119)

real fix is to install windows. Linux is a piece of shit.

Ok so Apple isn't the only one to screw up (1, Insightful)

geek (5680) | more than 12 years ago | (#2607131)

In your face! I sat here and read all the flames to apple about the iTunes screw up, and here we are with one just as big and glaring from the kernel developers themselves.

Hypocrites!!!!!!!

Mod me down all you like (0, Offtopic)

geek (5680) | more than 12 years ago | (#2607143)

You assholes just keep showing your true colors

Re:Ok so Apple isn't the only one to screw up (1)

Colin Bayer (313849) | more than 12 years ago | (#2607144)

The iTunes bug was a broken shell script; this is a bug in some obscure, not-very-often-modified C code in the core of the Linux kernel... there's a big difference.

actually no there isn't (1)

geek (5680) | more than 12 years ago | (#2607153)

FS corruption is FS corruption, you can't justify it at all, especially since this is supposed to be a stable kernel. It can't be all that obscure if it's changed enough since the last revision to do damage now could it?

IMPOSSIBLE!!! (-1, Flamebait)

Anonymous Coward | more than 12 years ago | (#2607135)

Linux *never* has bugs, only Microsoft products have those! Right?

fucking Linux jackasses!

patch barfs on pristine 2.4.15 source (0)

Anonymous Coward | more than 12 years ago | (#2607138)

patching file fs/inode.c
patch: **** malformed patch at line 11: {

I'll just wait for 2.4.16 and 2.5.1

Patch download here (4, Informative)

DeeKayWon (155842) | more than 12 years ago | (#2607243)

The mailing list converted tabs into spaces, causing patch to choke. Get the patch here [usask.ca] .
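For anyone who wants to check a patch copied out of a mail archive before feeding it to patch, here is a minimal sketch in Python. It is not how patch itself parses input; it only assumes a unified diff and flags lines inside a hunk that have lost their leading ' ', '+' or '-' marker, which is the kind of damage behind the "malformed patch" error above. The function name and the sample diff are made up for illustration and are not the real fs/inode.c fix.

```python
import re

# Matches unified diff hunk headers such as "@@ -1,3 +1,4 @@".
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@")


def find_malformed_lines(patch_text):
    """Return (line_number, line) pairs inside hunks that lack a diff prefix."""
    bad = []
    old_left = new_left = 0  # lines still expected in the current hunk
    for number, line in enumerate(patch_text.splitlines(), start=1):
        match = HUNK_HEADER.match(line)
        if match:
            # A missing count in the header means a count of 1.
            old_left = int(match.group(1) or 1)
            new_left = int(match.group(2) or 1)
            continue
        if old_left <= 0 and new_left <= 0:
            continue  # outside any hunk: file headers, mail text, etc.
        if line.startswith(" ") or line == "":
            # Context line (mailers often strip the space on empty ones).
            old_left -= 1
            new_left -= 1
        elif line.startswith("-"):
            old_left -= 1
        elif line.startswith("+"):
            new_left -= 1
        elif line.startswith("\\"):
            pass  # "\ No newline at end of file" marker
        else:
            bad.append((number, line))
            # Count it as context so the rest of the hunk is still checked.
            old_left -= 1
            new_left -= 1
    return bad


if __name__ == "__main__":
    # Hypothetical mangled diff: the "{" context line lost its leading space.
    sample = (
        "--- a/fs/inode.c\n"
        "+++ b/fs/inode.c\n"
        "@@ -1,3 +1,4 @@\n"
        " void f(void)\n"
        "{\n"
        "+\tint x;\n"
        " }\n"
    )
    for number, line in find_malformed_lines(sample):
        print(f"line {number}: {line!r}")
```

If it reports anything, the copy is probably damaged and it is safer to fetch the patch as a raw file than to fix it by hand.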

responsibility (0)

Anonymous Coward | more than 12 years ago | (#2607145)

I may be wrong about this, but isn't that Brazilian chap Marcelo in charge of this tree now? Not a good start if it is him, but everybody is entitled to a cock up once in a while...

stable vs. unstable (0)

Anonymous Coward | more than 12 years ago | (#2607146)

2.4.15 (stable): corrupted
2.5.0 (dev): corrupted
2.4.15-pre9: corrupted
2.4.13-ac8: corrupted
...
Where's Alan Cox when we need him?

Yeah (1)

Evro (18923) | more than 12 years ago | (#2607177)

Welcome to "why not to grab every new piece of software as soon as it's released". Other examples include Apple's iTunes 2 and MS Windows 98 (first edition). Also works for hardware, and heck, even cars (first year of a new model is usually riddled with problems).

Censorship (-1, Troll)

Anonymous Coward | more than 12 years ago | (#2607179)

Slashdot censors posts! Why? Hmm...it will be nice to see all you slashdot fags lose your jobs. You know it's coming.

And if this were Windows... (0)

Anonymous Coward | more than 12 years ago | (#2607182)

people would be pointing and screaming and wringing their hands and making all kinds of horror-noises about the Gates Borg Collective.

Hypocritical.

How can this be avoided in the future? (3, Insightful)

imrdkl (302224) | more than 12 years ago | (#2607186)

Can someone give a joe-user guide to helping test new kernels?

Sad (1, Troll)

Platinum Dragon (34829) | more than 12 years ago | (#2607193)

I have to wonder, with all the bizarre bugs that have been creeping into "stable" kernels... are they even being tested before release, or is Linus just slapping on some patches and putting out a new kernel as 2.4. instead of 2.4.-prewhatever?

The latter would seem to indicate frustration and burnout on his part.

No they are not tested. (0)

glrotate (300695) | more than 12 years ago | (#2607214)

If you read the posts it clearly states that they didn't have time to test.

I'm sorry but I don't remember a single fs corruption bug as serious as this from MS. Linus needs to get his priorities refocused.

What I don't get ... (3, Insightful)

pauljlucas (529435) | more than 12 years ago | (#2607195)

... is why there seems to exist this rampant tendency among Linux-folk to upgrade one's kernel constantly. Unless a new kernel solves a problem you have, there is no reason to upgrade.

I told you so... (-1, Flamebait)

Anonymous Coward | more than 12 years ago | (#2607206)

Let's have a close look at the costs involved when running a Linux system.

An important factor in Linux' cost is its maintenance. Linux requires a *lot* of maintenance, work doable only by the relatively few high-paid Linux administrators that put themselves - of course willingly - at a great place in the market. Linux seems to be needing maintenance continuously, to keep it from breaking down.

Add to this the cost of loss of data. Linux' native file system, EXT2FS, is known to lose data like a firehose spouts water when the file system isn't unmounted properly. Other unix file systems are much more tolerant towards unexpected crashes. An example is the FreeBSD file system, which with soft updates enabled, performance-wise blows EXT2FS out of the water, and doesn't have the negative drawback of extreme data loss in case of a system breakdown.

According to Linux advocates, an alternative to EXT2FS would be ReiserFS. Unfortunately, ReiserFS is still in beta stage. This means it is not intended for production use (although according to many Linux advocates this shouldn't be a problem, which makes me wonder how (little) valuable they find your data).

The other proposed 'solution', EXT3FS, is nothing more than an ugly hack to put journaling into the file system. All the drawbacks of the ancient EXT2FS file system remain in EXT3FS, for the sake of 'forward- and backward compatibility'. This is interesting, considering that the DOS heritage in the Windows 9x/ME series was considered a very bad thing by the Linux community, even though it provided what could be called one of the best examples of compatibility, ever. When it's about Linux, compatibility constraints don't seem to be that much of a problem for Linux advocates.

Back to Linux' cost. Factor in also the fact that crashes happen much more often on Linux than on other unices. On other unices, crashes usually are caused by external sources like power outages. Crashes in Linux are a regular thing, and nobody seems to know what causes them, internally. Linux advocates try to hide this fact by denying crashes ever happen. Instead, they have frequent "hardware problems".

The steep learning curve compared to about any other operating system out there is a major factor in Linux' cost. The system is a mix of features from all kinds of unices, but not one of them is implemented right. A Linux user has to live with badly coded tools which have low performance, mangle data seemingly at random and are not in line with their specification. On top of that a lot of them spit out the most childish and unprofessional messages, indicating that they were created by 14-year olds with too much time, no talent and a bad attitude.

I could go on and on and on, but the conclusion is clear. Linux is not an option for any one who seeks a professional OS with high performance, scalability, stability, adherence to standards, etc.

How Extensive Is This? (2, Interesting)

grahamkg (5290) | more than 12 years ago | (#2607216)

I had an fs corruption with RH 7.2, using the kernel that came with the distro. It trashed the geometry of an entire drive. I was using a combo of ext2 and ext3 on the drive. I didn't lose anything, as I back up my system regularly.

I've since migrated to Mandrake 8.1, which is much more solid than RH 7.2. Yet, it too runs a 2.4 kernel variant. This distro on one boot failed to recognize the ext3 partitions. I migrated all of the ext3 partitions back to ext2.

I'd be very interested in learning if this is a problem that extends far back into the kernel tree.

What do I do now? (1)

bluelarva (185170) | more than 12 years ago | (#2607220)

I compiled and ran 2.4.15 for a few hours and now I'm back to 2.4.14. As far as I can tell my file system is intact. At least I don't think I lost anything. How do I know for sure my files won't disappear on me? What kind of error messages would I see if my file system is corrupt? How do I correct it?

Quality testing (1)

Cyclone66 (217347) | more than 12 years ago | (#2607233)

Ok, now if this is still in "testing" forgive me.

Shouldn't this darn thing be tested before it's released as "final"? I mean just a few weeks ago you guys were bashing Apple for their iTunes install that wrecked the hard drive, and now you're just coming up with solutions. How bout complaints? How about "This code should never have been released with such a serious bug". Again if this is test code then fine, it comes with the territory. Even if it's "Implied" test code, that's not good enough.

"We" are the QA team! (0)

Anonymous Coward | more than 12 years ago | (#2607242)

I say do kernel development in stable/testing/unstable branches.

Agree?

Things are working right not wrong: (5, Insightful)

amccall (24406) | more than 12 years ago | (#2607246)

I've already seen 2 posts referring to "QA" and keeping the kernel stable, etc... If you are going to try the latest version of each package that comes out, you are going to get burned.

This is one reason why distributions are so important. They do the QA, they make sure packages are stable, they apply the patches. If you want to download and run the latest edition of every package out, including the kernel, then you should expect some bumps in the road, because you are beta testing - even on a "stable" kernel series. Remember: release early, release often. You will have to do the QA, you will have to apply the patches, you will be burned. Some people like doing this to stay on the bleeding edge, others are a bit more cautious.

If you want stable, solid kernels that are heavily QA'd, wait for packages to come out. Otherwise, post a bug report, and quit whining.

Another kind of trojan horse (0)

Anonymous Coward | more than 12 years ago | (#2607248)

Do you want to create a trojan horse? Just "contribute" malicious code to the Linux kernel. It's a sure bet!