Is ext4 Stable For Production Systems?

Soulskill posted more than 4 years ago | from the work-in-progress dept.

Operating Systems 289

dr_dracula writes "Earlier this year, the ext4 filesystem was accepted into the Linux kernel. Shortly thereafter, it was discovered that some applications, such as KDE, were at risk of losing files when used on top of ext4. This was diagnosed as a rift between the design of the ext4 filesystem and the design of applications running on top of ext4. The crux of the problem was that applications were relying on ext3-specific behavior for flushing data to disk, which ext4 was not following. Recent kernel releases include patches to address these issues. My questions to the early adopters of ext4 are about whether the patches have performed as expected. What is your overall feeling about ext4? Do you think it is solid enough for most users to trust it with their data? Did you find any significant performance improvements compared to ext3? Is there any incentive to move to ext4, other than sheer curiosity?"

289 comments

Risk Vs Benefits Analysis (5, Insightful)

eldavojohn (898314) | more than 4 years ago | (#28150149)

Is ext4 Stable For Production Systems?

Probably.

Is there any incentive to move to ext4, other than sheer curiosity?

Ok so I'm guessing production = income = your ass? Let me turn your question back to you by asking, "What is driving this need to move to ext4?" Because so far, all you've told me is that you are considering risking your ass for sheer curiosity.

I may be grossly misinformed but that is how the question sounds to me. And by "your ass" I don't mean oh-no-we-had-a-service-outage-for-five-minutes ... no, we could have a customer on the phone saying, "You mean to tell me that the modifications being made to my site for the past 24 hours are gone?!"

If it ain't broke, don't fix it!

I don't know about you but I'm too busy dealing with shit like this [youtube.com] to ponder new potential problems I can put into play.

Look through this page [wikipedia.org] for a rough comparison of ext4 with other file systems. There's a better list of features for ext4 here [wikipedia.org] that will tell you why you might need to switch to it. It is backward compatible with ext3 and ext2 so moving to it may be trivial. If you're dealing with more than 32000 subdirectories or need to partition some major petabytes/exabytes then you might not have a choice. Some of these benefits are probably not worth risking your ass for, but if there's a business need that cannot be overcome any easier way then back your shit up and do rigorous testing before you go live with it. If you're using Slashdot to feel out if the majority of users scream OMGNOES so you don't waste your time doing that, then that's fine. Just don't do this if you don't have to.

I tell you what, there's a $288 desktop computer at Dell today [hot-deals.org] that you can buy, put ext4 on and your OS of choice and your application(s) and whipping boy it into next century without risking anything. Where I work we have two servers in addition to our production servers. I don't think this is an uncommon scheme so if you have a development server, throw it on there and poke it with a stick. Then move it to the testing server and let your testers grape it [youtube.com] for two weeks. Then you'll know.

Re:Risk Vs Benefits Analysis (4, Insightful)

Joce640k (829181) | more than 4 years ago | (#28150233)

> If it ain't broke, don't fix it!

This.

Re:Risk Vs Benefits Analysis (0)

Anonymous Coward | more than 4 years ago | (#28150783)

If it ain't broke, don't fix it!

This.

Fixed that for you.

Re:Risk Vs Benefits Analysis (3, Insightful)

BrokenHalo (565198) | more than 4 years ago | (#28150649)

A shorter approach to the question:

What do I gain by running with ext4?
Is that gain worth the time spent changing what I've got?

If the answer to the first question is that ext4 is cool and shiny, and the answer to the second is unknown, the OP has his answer.

Filesystems are one thing we need to be VERY conservative about. We need to be certain that it works reliably, because we do not need to find our work disappearing out the end of our backup cycle after having discovered problems too late. (Yes, I know, what is this "backup" of which I speak?)

I still have drives running ReiserFS, and I still use ext2 for boot partitions mounted readonly. I pretty much trust those systems, but even so, I still take backups and test them when I can.

Re:Risk Vs Benefits Analysis (1)

Jurily (900488) | more than 4 years ago | (#28150819)

What do I gain by running with ext4?

And also, "What do I lose?". Ext4 is nowhere near trustworthy in my eyes. I'll probably switch about the same time I abandon KDE 3.5.

EXT4 is not broken? (2, Insightful)

DJRumpy (1345787) | more than 4 years ago | (#28151099)

Why does everyone keep speaking about EXT4 as if it's broken? It's working exactly as designed. It's the applications that need fixing, no?

Re:Risk Vs Benefits Analysis (1)

stinerman (812158) | more than 4 years ago | (#28150651)

It is backward compatible with ext3

Not if you decide to use extents, which is a major reason why you'd want to use ext4. Per your link:

The ext3 file system is partially forward compatible with ext4, that is, an ext4 filesystem can be mounted as an ext3 partition (using "ext3" as the filesystem type when mounting). However, if the ext4 partition uses extents (a major new feature of ext4), then the ability to mount the file system as ext3 is lost.

But then again, if you're looking at ext4 just for extents, there have been other file systems [wikipedia.org] that have used extents for a while.

Re:Risk Vs Benefits Analysis (1)

identity0 (77976) | more than 4 years ago | (#28150661)

>I may be grossly misinformed but that is how the question sounds to me.

You are. The question is clearly asking about normal users, which is NOT uber-leet production $$$$ systems.

> My questions to the early adopters of ext4 are about whether the patches have performed as expected. What is your overall feeling about ext4? Do you think it is solid enough for most users to trust it with their data? Did you find any significant performance improvements compared to ext3? Is there any incentive to move to ext4, other than sheer curiosity?

I see no problem with migrating a desktop to a different FS out of sheer curiosity, as long as one backs up one's personal data beforehand.

Don't let the title fool you, the body of the text makes no reference to 'production systems' and it is likely something inserted by the editors.

But yeah, whoo smartass. All it takes is a smartass attitude to get +5 Insightful these days.

Re:Risk Vs Benefits Analysis (0)

Anonymous Coward | more than 4 years ago | (#28150943)

Two words: ad hominem.
Another two: fuck off (see I also can do this!)

BTW: I'm not an original poster.

That YouTube video kicks ass! (0)

Anonymous Coward | more than 4 years ago | (#28150919)

Funniest thing I've seen in weeks!

Ye (5, Funny)

identity0 (77976) | more than 4 years ago | (#28150165)

I've been running ext4 on my system and everything's fi

Re:Ye (4, Interesting)

dov_0 (1438253) | more than 4 years ago | (#28150409)

I've been running ext4 for / , but left ext3 for /home where any KDE apps I run could fudge writes. No problems at all.

Re:Ye (4, Insightful)

TCM (130219) | more than 4 years ago | (#28150817)

So you used the "riskier" fs for / where you don't actually need the features it provides and used the "more stable" fs where features could actually be useful because app/fs developers couldn't agree on semantics?

Only on Linux...

"*^%£*(!^&*T"49! (1)

RiotingPacifist (1228016) | more than 4 years ago | (#28150663)

I'm running ext4 too, but as you can see the content of my posts is fine!

Re:"*^%£*(!^&*T"49! (1)

dword (735428) | more than 4 years ago | (#28150823)

Warning: ext4 may break your joke detector.

Re:"*^%£*(!^&*T"49! (1)

RiotingPacifist (1228016) | more than 4 years ago | (#28150891)

Warning: ext4 may break your joke detector.

Ok, so it wasn't a particularly funny joke!
But I was definitely going for a joke about how the files are fine but the metadata (in this case the post title) is what gets messed up.
Yeah, obviously that kind of joke bombs when I do stand-up tho!

Wrong question (5, Insightful)

AmiMoJo (196126) | more than 4 years ago | (#28150167)

You are asking the wrong question. Ext4 does not need fixing, the apps do.

Are your apps patched yet?

Re:Wrong question (5, Interesting)

QuoteMstr (55051) | more than 4 years ago | (#28150261)

Face it: your side lost. "fsync everywhere" is an infeasible, untenable, and useless position to take.

fsync-on-rename creates a much better environment for application developers and users alike. The Right Thing happens by default, and I maintain that nobody actually wants the unsafe rename behavior. Allowing an application "choice" in this respect is a red herring.

The only improvement I'd make is to flush the file involved on every rename, not just renames that happen to overwrite an existing file. Under the current scheme, an application doing the write-close-rename to replace a file will still be put in a bind if the file to write doesn't exist yet. (i.e., you can still end up with a zero-length file where no such file ever existed on a running system)

Re:Wrong question (5, Insightful)

k8to (9046) | more than 4 years ago | (#28150279)

There was no single loser here.

Ext4 should handle the case gracefully, but the apps will fail on other filesystems, and they *will* be run on those filesystems, so they should fix the bugs.

Re:Wrong question (5, Funny)

Jane Q. Public (1010737) | more than 4 years ago | (#28150589)

Huh? Buddy, this is Slashdot. There are lots of single losers here.

Re:Wrong question (-1, Flamebait)

Anonymous Coward | more than 4 years ago | (#28150775)

Huh? Buddy, this is Slashdot. There are lots of single losers here.

You're a fat chick, aren't you? Because that sure does look like fat-girl attitude. Even if they lose the weight, they can't hide the fat-girl attitude!

Re:Wrong question (3, Informative)

RiotingPacifist (1228016) | more than 4 years ago | (#28150603)

How should the apps behave? Write-then-rename is the best way to do what they want. If you can't trust the filesystem to rename a file (and not just fail to rename it, but leave its metadata wrong so that neither the new nor the original is in the correct place), then what sort of program are you going to be able to run?

Re:Wrong question (4, Interesting)

icebike (68054) | more than 4 years ago | (#28150531)

Face it: your side lost. "fsync everywhere" is an infeasible, untenable, and useless position to take.

And had it been enforced, as soon as all developers went through and added the fsync calls everywhere it would have become necessary for file system maintainers to no-op fsync calls in order to regain any approximation of prior performance.

Flushing "one file" is not always sufficient. Calling fsync() does not necessarily ensure that the entry in the directory containing the file has also reached disk. For that an explicit fsync() on a file descriptor for the directory is also needed. And perhaps the higher level directory as well.
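The point about directory entries can be sketched in a few lines. This is a minimal illustration, assuming a Linux/POSIX system where a directory can be opened read-only and passed to fsync; the helper name `fsync_file_and_dir` is made up for this example and is not taken from any application discussed in the thread.

```python
import os


def fsync_file_and_dir(path):
    """Flush a file, then flush the directory that holds its name."""
    # Step 1: flush the file's own data and inode metadata.
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)

    # Step 2: flush the containing directory so that the directory entry
    # pointing at the file is also on disk (assumes a POSIX system).
    parent = os.path.dirname(os.path.abspath(path))
    dir_fd = os.open(parent, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Whether the parent's parent also needs flushing depends on whether the directory itself was newly created; the sketch only covers the single-level case.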

Re:Wrong question (1)

TheSunborn (68004) | more than 4 years ago | (#28150533)

But even then you might end up with a zero byte file, if your system crashes between the close and rename call. (Or between write and close, or during write, or well anytime after open).

But I don't really think that a zero-size file possibly being left behind when the system crashes is such a problem.

But what we really need is a flag to close (or open) called FLUSH_ON_CLOSE that flushes a file when it's closed. There are so few situations where you would not want to do that, so maybe it should be the default, and we could add a DO_NOT_FLUSH_ON_CLOSE.

Bonus points for anyone who can give a realistic use case for DO_NOT_FLUSH_ON_CLOSE

I don't think it's more effective to delay the flush, because you are not going to write anything in/behind the last flushed data block.
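No FLUSH_ON_CLOSE flag exists in POSIX or Linux; it is this comment's proposal. What it asks for can be approximated in user space, as in the rough sketch below. The wrapper name `open_flush_on_close` is invented for the illustration.

```python
import os
from contextlib import contextmanager


@contextmanager
def open_flush_on_close(path, mode="w"):
    """Open a file and force its contents to disk before it is closed."""
    f = open(path, mode)
    try:
        yield f
        # Only reached if the body did not raise:
        f.flush()             # push Python's userspace buffer to the kernel
        os.fsync(f.fileno())  # ask the kernel to push the data to the device
    finally:
        f.close()
```

Usage would look like `with open_flush_on_close("settings", "w") as f: f.write(...)`. Every close then pays for a synchronous flush, which is exactly the cost the replies to this comment object to.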

Re:Wrong question (5, Insightful)

QuoteMstr (55051) | more than 4 years ago | (#28150717)

But even then you might end up with a zero byte file, if your system crashes between the close and rename call. (Or between write and close, or during write, or well anytime after open).

This statement is incorrect. Suppose you want to atomically replace the contents of file "foo". Your application will write a file "foo.tmp", then call rename("foo.tmp", "foo"). At no time on a running system does any process observe a file called "foo" that does not have either the new or the old contents, and this invariant holds true whether or not "foo", "foo.tmp", or any other file has been flushed to the disk.

On the filesystem level, the kernel can actually write the contents of foo.tmp to disk whenever is convenient. The only constraint is that the on-disk name record for "foo" must be updated to point to the new data blocks from foo.tmp only after these data blocks have themselves been written to disk. That's the issue here: without that ordering guarantee, the kernel can write a file's name record before its data blocks. If the system crashes after the name record is written but before the data blocks are, what's observed on the recovered system is a zero-length file.

That's the problem here: the kernel is conjuring out of thin air a zero-length file that never actually existed on a running system.

Forcing applications to call fsync is not only an onerous burden on application developers, but it also reduces performance because it gives the filesystem less freedom than the much looser constraint on rename above.
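For reference, the write-then-rename idiom under discussion looks roughly like this. It is a sketch, not KDE's actual code: the `.tmp` suffix and the function name `atomic_replace` are assumptions for the example. The `durable` switch marks the explicit fsync that the "fix the apps" side wants added; with `durable=False` the code relies on the filesystem writing data blocks before the rename, which is the behavior argued for above.

```python
import os


def atomic_replace(path, data, durable=True):
    """Replace the contents of `path` with `data` via a temp file and rename."""
    tmp = path + ".tmp"  # hypothetical naming; must be on the same filesystem
    with open(tmp, "wb") as f:
        f.write(data)
        if durable:
            f.flush()
            os.fsync(f.fileno())  # data blocks reach disk before the rename
    # os.replace maps to rename(2) on POSIX: readers on a running system see
    # either the old contents or the new ones, never a partial file.
    os.replace(tmp, path)
```

The atomicity here only describes what running processes observe. What survives a crash depends on the ordering the filesystem applies, which is the whole dispute.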

Bonus points for anyone who can give a realistic use case for DO_NOT_FLUSH_ON_CLOSE

  1. Application configuration files. You don't care that they hit the disk immediately, but only that when they do hit the disk, they're not corrupt
  2. /etc/mtab

Flushing on close is the wrong thing: it far exceeds the minimum requirements that most applications actually need, which will substantially reduce performance.

Re:Wrong question (3, Insightful)

eldavojohn (898314) | more than 4 years ago | (#28150329)

You are asking the wrong question. Ext4 does not need fixing, the apps do.

Are your apps patched yet?

At the risk of revealing just how incredibly inept I am about file systems ... shouldn't your "apps" (and by apps I am guessing you mean applications) be calling the operating system to do anything to the file system? I mean, isn't the point of operating systems to create or contain APIs and the like that allow you to interface with any file system type that the OS supports?

I guess what I'm asking is just the technicality that only his operating system need be patched and tested for it?

Again, I don't really do this type of coding and in all the C programming I've done, I've never seen a need or way even to get down and dirty with the file system. I can dream up cases (like Google's bigtable) where that may be desirable with benefits if well planned but I would imagine most of the time it would be unwise and unsafe and put you dependent on a type of file system.

Re:Wrong question (5, Informative)

blueg3 (192743) | more than 4 years ago | (#28150487)

The problem is that some applications assume a behavior that is not supported by the POSIX definitions (the guarantees provided by the OS functions they're calling). However, it happens to be the behavior on existing filesystems and happens to be convenient. Now a new filesystem comes along and sticks to the POSIX definitions but does not follow this behavior. Application breaks, people complain.

As a simplified example, imagine you create file B, then delete file A. Existing filesystems happen to do this in order, so you always have at least one of A or B. (If the system crashed partway through, you might have both A and B.) Your application fails if neither A nor B is present. POSIX doesn't require that the operations be performed in order. New filesystem comes along and sometimes does them in the reverse order, so if the system crashes at the wrong time, neither A nor B is left on the filesystem.
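The A/B example can be played out with a toy model. This is a simulation of the reasoning only, not of any real filesystem: it lists which sets of files a crash could leave behind, depending on whether operations reach the disk in program order. All names are invented for the illustration.

```python
from itertools import permutations


def crash_states(initial, ops, preserve_order):
    """Return every set of files a crash could leave on disk.

    `ops` is a list of ("create" | "delete", name) pairs in program order.
    A crash may occur after any number of operations have reached the disk.
    """
    orders = [tuple(ops)] if preserve_order else list(permutations(ops))
    states = set()
    for order in orders:
        files = set(initial)
        states.add(frozenset(files))      # crash before anything is written
        for kind, name in order:
            if kind == "create":
                files.add(name)
            else:
                files.discard(name)
            states.add(frozenset(files))  # crash right after this operation
    return states


ops = [("create", "B"), ("delete", "A")]
in_order = crash_states({"A"}, ops, preserve_order=True)
reordered = crash_states({"A"}, ops, preserve_order=False)

print(frozenset() in in_order)   # False: A or B always survives
print(frozenset() in reordered)  # True: a crash can leave neither file
```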

Re:Wrong question (1)

Hognoxious (631665) | more than 4 years ago | (#28150827)

POSIX doesn't require that the operations be performed in order.

[eldavojohn mode]I guess[/] it doesn't forbid it either. So what's the reason, other than pure pedantry, to do them in random order?

Re:Wrong question (0)

Anonymous Coward | more than 4 years ago | (#28150885)

POSIX doesn't require that the operations be performed in order.

[eldavojohn mode]I guess[/] it doesn't forbid it either. So what's the reason, other than pure pedantry, to do them in random order?

Ohhhh, I think I've got a fan! :-)

Steve Stephenson [slashdot.org] is that you?

Re:Wrong question (2, Insightful)

GryMor (88799) | more than 4 years ago | (#28150945)

Performance optimization. You can get much better write rates if you can reorder the writes to be sequential on disk, starting with whichever one the disk head can get to first.

Re:Wrong question (4, Insightful)

QuoteMstr (55051) | more than 4 years ago | (#28150847)

The problem is that some applications assume a behavior that is not supported by the POSIX definitions

POSIX is a red herring here. It covers the behavior of a running system, and makes no guarantees about atomicity or durability following a crash. After a crash and as far as POSIX goes, it's perfectly legitimate to overwrite the entire disk with hentai. Every crash recovery technique goes beyond POSIX because POSIX says nothing about crashes.

POSIX doesn't require that the operations be performed in order

It most certainly does! On a running system, if you rename B over A, at no point does any process on the system observe a file called "A" that does not have either the contents of the old A or the contents of B. THIS ATOMICITY IS A FUNDAMENTAL POSIX GUARANTEE.

Filesystems should do their best to honor this guarantee (which always applies on a running system, remember) even when the system crashes. Filesystems don't have to do that according to POSIX. Instead, they should do it because it's a sane thing to do, and doesn't violate anything POSIX guarantees. POSIX is not the arbiter of what a good system should be. It's perfectly reasonable to make guarantees that go beyond POSIX, and every real-world operating system does precisely that. POSIX guarantees are necessary but insufficient for a reasonable system in 2009.

Re:Wrong question (4, Interesting)

nwanua (70972) | more than 4 years ago | (#28150457)

Wha....? Are you seriously suggesting that applications/utilities need to be patched to deal with faulty (yes, faulty) filesystem semantics? For _every_ single filesystem they might encounter? The whole point behind a filesystem layer is to present a unified view of files to the user layer regardless of physical media or driver quirks.

The point is really that ext4 is/was broken, and IMO, any filesystem requiring patches to applications in order not to lose data is no filesystem at all. It's unbelievable (despite the technical benefits of ext4) that this would even be up for consideration.

Re:Wrong question (0)

Anonymous Coward | more than 4 years ago | (#28150521)

If my understanding of this problem is correct, it occurs only if the system crashes shortly after the KDE apps have updated their configuration files. If so many people think this is a big problem, I'm more worried about the great number of constantly crashing linux machines.

Missing the point, IMO (1)

Weaselmancer (533834) | more than 4 years ago | (#28150751)

If my understanding of this problem is correct, it occurs only if the system crashes shortly after the KDE apps have updated their configuration files. If so many people think this is a big problem, I'm more worried about the great number of constantly crashing linux machines.

It's not that a ton of Linux boxes are crashing. It's just that it's a computer, and sometimes they crash. ANY computer. Any machine, any OS. They're made by people, and nothing we make is ever perfect.

In the light of that though - what we're striving for is to provide the best performance and the best results under any circumstances, even the rare ones. Worrying about these <1% corner cases is what makes a superior product. A problem handled gracefully is a problem the user hopefully never sees. And that goes a long way towards creating a satisfactory experience. People say things like, "I don't know - I just plugged the scanner in and it died." They never say "I've plugged this scanner in over 1000 times and it's never died!" People remember the negatives, so it always pays to minimize those, however rare they may already be.

Re:Missing the point, IMO (2, Informative)

Repossessed (1117929) | more than 4 years ago | (#28150933)

They never say "I've plugged this scanner in over 1000 times and it's never died!"

Speaking as a help desk tech, they say that a lot. In fact, "it's always worked before" is probably the single most common form of whining the callers do.

It's particularly amusing when someone is complaining they've never had to replace a battery/toner cartridge before.

Re:Wrong question (1, Insightful)

Anonymous Coward | more than 4 years ago | (#28150685)

By your logic, web standards should be changed to match the behavior of Microsoft IE. Since IE is the most popular browser, it should not be forced to conform to the incompatible ("faulty") web standards.

This is exactly why we need precise interface specifications, along with powerful tools for checking against those specs. Otherwise, application developers will find some idiom that appears to work without regard to whether they are assuming more than the spec guarantees. As a result, their code is broken. The current OS code might not expose the error, but a future one will. The OS code should not have to include hacks for every possible interface error that could be present in application code.

Re:Wrong question (5, Insightful)

Anonymous Coward | more than 4 years ago | (#28150551)

Only on Linux is it the user's fault that apps have data loss because the Linux kernel people changed filesystem semantics. At least Microsoft takes some responsibility for their mistakes :-/

I did follow the ext4 debate. Here's my quick synopsis.

  • Linux kernel hacker discovers he can make a certain microbenchmark run 50% faster if he allows reordering of filesystem metadata writes ahead of filesystem data writes. Said hacker checks in code with a "now 50% faster!!!" message.
  • A few months later, users start discovering data corruption of KDE files. Specifically, a copy of A to A', ftruncate(A'), write(A'), rename(A' to A), host crash, causes the resulting file to contain A data and not A' data despite the well-known atomic "rename" that serves as a barrier.
  • Linux kernel hacker ignored problem as not-a-bug, since the apps didn't make use of fdatasync() / fsync() correctly, which (using Posix semantics) would have prevented data corruption. The detail to note here is that Posix doesn't actually say that rename is a write barrier for data and metadata, even though everyone would assume that it is a write barrier and ALL other filesystems have treated it as a write barrier. (And in my opinion as a professional systems programmer, this is an oversight in the Posix standard and not a desired behavior). So the linux kernel hacker is technically correct but has introduced a behavior that goes against all previous implementations.
  • Linux kernel hacker (and some Slashdot posters) attack KDE developers for being incompetent because they didn't read a sub-sub-sub clause of the Posix spec that (1) isn't mentioned in the man pages, (2) only gets read by kernel programmers anyway, and (3) is about two orders of magnitude more arcane than any documentation the average desktop app developer will ever read.
  • 90% of users and 80% of programmers wonder what the hell fdatasync() and fsync() and the difference between data and metadata write barriers are, and why the default behavior is to corrupt data.
  • Linux kernel hacker promises to commit a few patches to fix the problem, so as not to break software that has worked perfectly fine for the past 10 years.
  • Those of us with experience realize that since said kernel hacker didn't believe this was a problem in the first place, the patches are as likely to be half-hearted band-aids as to actually increase data integrity guarantees. Programming has a long and proud history of making a quick fix to satisfy "management" (in this case, the Linux community) that makes one symptom go away and doesn't actually fix the underlying problem.
  • We get an Ask Slashdot asking if the problem actually got fixed, because 99% of us do not have the technical expertise to understand patches to the Linux filesystem to figure out if this actually got fixed.

I do have a moral to this story. Filesystems have one cardinal, inviolable rule. DO NOT CORRUPT THE USER'S DATA. The guarantee is that if a user makes a read, the user will get back either good data OR an error (or explicit indication of no data). Google likes filesystems that lose data - but they don't ever give back corrupt search results. Ext3 can reorder writes - but defaults to a safe 5-second flush rate to keep the window of unexpected corruptions small. Ext4 ignored this rule and allows silent data corruption so that this filesystem can be the best at certain microbenchmarks, and instead of accepting responsibility, the kernel hacker in question blames everybody else.

The greatest danger to Linux's success is not Microsoft. It's the hubris of many Linux developers, users, and advocates, who are too busy disavowing responsibility and blaming everybody else to fix real users' problems. (And yes, I'm a follower of the Raymond Chen philosophy)

Re:Wrong question (0)

Anonymous Coward | more than 4 years ago | (#28150843)

I love you more than a person ought to love an anonymous poster on a web site.

The whole point of computers is to make them do things that the user wants them to do. Ext4 as delivered has been a clear violation of that standard.

Re:Wrong question (1)

Requiem18th (742389) | more than 4 years ago | (#28150905)

The greatest danger to Linux's success is not Microsoft. It's the hubris of many Linux developers, users, and advocates, who are too busy disavowing responsibility and blaming everybody else to fix real users' problems

Unlike Microsoft who takes all responsibility from any malfunction in its softw--Oh that's right the EULA crowd never does.

Come on, no Ubuntu LTS uses ext4 by default, nor Debian stable, nor OpenBSD AFAIK.

When you are dealing with the bleeding edge it's normal for things to break. This is not disavowing responsibility, it's fixing the problem where the problem is.

Re:Wrong question (1)

TCM (130219) | more than 4 years ago | (#28150963)

Come on, no Ubuntu LTS uses ext4 by default, nor Debian stable, nor OpenBSD AFAIK.

What has OpenBSD got to do with anything in this discussion?

Re:Wrong question (1)

Kjella (173770) | more than 4 years ago | (#28151043)

A few months later, users start discovering data corruption of KDE files. Specifically, a copy of A to A', ftruncate(A'), write(A'), rename(A' to A), host crash, causes the resulting file to contain A data and not A' data despite the well-known atomic "rename" that serves as a barrier.

No, it's more fucked than that. The rename has pointed A to A', but the data for A' has not been written so you have NO data, only a zero byte file. From a "high-level" perspective, and by high level I mean I want to atomically replace file A with A', this is clearly a major WTF, but apparently not for the ext4 developers. That means there's bigger chances of ice skating contests in hell than me installing ext4 on a production server.

Re:Wrong question (4, Interesting)

RiotingPacifist (1228016) | more than 4 years ago | (#28150627)

Hmm, I think most of them are, but I'm still having problems with mv. Seriously, can we stop this bullshit? ext4 was clearly not working!
If you can't rename a fucking file without risking total corruption of the file (at no point in renaming "settings-new" to "settings" should the file "settings" become unusable), what the fuck CAN kde4 do?

Re:Wrong question (1, Insightful)

Anonymous Coward | more than 4 years ago | (#28151013)

Unfortunately, "fixing" apps to work around ext4's brokenness means you have to fsync the new version of a file before renaming it over the old one. So instead of having KDE's 500 config files being lazily flushed to disk in a single 10-millisecond disk write, each one gets written synchronously, hanging your system for 5 whole seconds. Brilliant.

Or, I could just use ext3, which gives sane behavior (preserving either the old or new version of a file, don't care which) and doesn't require apps to be written in a way that makes you feel like you're running DOS on floppy disks.
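The cost being complained about can be measured with a rough sketch like the one below. It writes a batch of small files through temp-file-plus-rename, once with an fsync per file and once without, and reports the elapsed time. The file names and count are arbitrary, and the numbers depend heavily on the disk and filesystem (on tmpfs or an SSD the gap may be small), so no particular result is claimed.

```python
import os
import tempfile
import time


def write_configs(directory, count, sync_each):
    """Write `count` small files via temp file + rename; return elapsed seconds."""
    start = time.perf_counter()
    for i in range(count):
        final = os.path.join(directory, f"app{i}.conf")  # hypothetical names
        tmp = final + ".tmp"
        with open(tmp, "w") as f:
            f.write(f"setting={i}\n")
            if sync_each:
                f.flush()
                os.fsync(f.fileno())  # one synchronous flush per file
        os.replace(tmp, final)
    return time.perf_counter() - start


with tempfile.TemporaryDirectory() as d:
    lazy = write_configs(d, 50, sync_each=False)
    synced = write_configs(d, 50, sync_each=True)
    print(f"lazy: {lazy:.4f}s  fsync-each: {synced:.4f}s")
```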

Probably not (yet) (2)

jamesmorlock (1565937) | more than 4 years ago | (#28150181)

I would just wait until it becomes mainstream and all the issues are worked out; until then I'll stick with ext3.

I think it's "safe enough" (3, Interesting)

buttfscking (1515709) | more than 4 years ago | (#28150193)

I moved to ext4 as soon as it became available. I haven't had any problems thus far (no data loss, etc), and the increased speed is noticeable. So - in the opinion of a very casual Linux user - I would say that yes, it's "okay." I'm not sure I'd trust it with anything super serious, though. I could be the only one without any problems, after all. As always, you should tip-toe around anything bleeding-edge.

Re:I think it's "safe enough" (5, Funny)

eldavojohn (898314) | more than 4 years ago | (#28150221)

I moved to ext4 as soon as it became available. I haven't had any problems thus far (no data loss, etc), and the increased speed is noticeable. So - in the opinion of a very casual Linux user - I would say that yes, it's "okay." I'm not sure I'd trust it with anything super serious, though. I could be the only one without any problems, after all. As always, you should tip-toe around anything bleeding-edge.

Yeah, man, it's ok go ahead and flip your entire corporation's servers to ext4 over this weekend. A Slashdot user named buttfscking just said it is "safe enough."

Re:I think it's "safe enough" (1, Funny)

Anonymous Coward | more than 4 years ago | (#28150313)

Buttfscking is my real name you insensitive clod!

Sincerely,

Ray J. Buttfscking

Re:I think it's "safe enough" (-1, Offtopic)

Anonymous Coward | more than 4 years ago | (#28150337)

At least he's on the giving end of his buttfsking. You're a shameless fellator of Slashdot editors.

Re:I think it's "safe enough" (1)

drinkypoo (153816) | more than 4 years ago | (#28150683)

Speaking of users with funny names, I converted to ext4 (the hard way — create a bootable backup, then repartition) as soon as Jaunty went final. So far system stability seems to be about the same as ext3. I've hung it with a couple of effective fork bombs (shell scripts accidentally spawning themselves because I am too stupid to enter a complete path) and had to force-power-cycle with no data loss or indeed problems of any kind.

I wouldn't have done this, however, if I didn't have a full system backup. So I'd say if you have to ask, the answer is no.

Re:I think it's "safe enough" (2, Insightful)

Hognoxious (631665) | more than 4 years ago | (#28150787)

Well he said not to, but don't let the facts interfere with a choleric rant.

Re:I think it's "safe enough" (0)

Anonymous Coward | more than 4 years ago | (#28150839)

Well he said not to, but don't let the facts interfere with a choleric rant.

Uh what? None of your post makes any sense, especially since Buttfscking said it was "okay" and "safe enough."

The last part of your post is downright trollish.

Choleric [wiktionary.org] :

1. Easily becoming angry.

2. Showing or expressing anger.

Rant [wiktionary.org] :

1. A criticism done by ranting (To speak or shout at length in an uncontrollable anger).

2. A wild, incoherent, emotional articulation.

Re:I think it's "safe enough" (4, Informative)

BrokenHalo (565198) | more than 4 years ago | (#28150699)

I haven't had any problems thusfar (no data loss, etc)

How do you know? Do you do md5sums on every file? Most admins I've come across don't seem to, and it could be months or years before you find out, in which case any loss might easily end up outside your backup cycle.
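A rough sketch of what checking every file could look like: record a digest per file once, then compare against a later run. This assumes SHA-256 rather than the md5sum mentioned above, and the function names are made up for illustration. It only tells you that contents changed, not why, so files you edited on purpose will show up too.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each regular file under root (by relative path) to its digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks and special files; only hash real file contents.
            if os.path.isfile(full) and not os.path.islink(full):
                manifest[os.path.relpath(full, root)] = file_digest(full)
    return manifest


def compare_manifests(old, new):
    """Return (changed, missing) relative paths between two manifests."""
    changed = sorted(p for p in old if p in new and old[p] != new[p])
    missing = sorted(p for p in old if p not in new)
    return changed, missing
```

Saving the manifest somewhere off the filesystem being checked (for example alongside the backups) is what makes a later comparison meaningful.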

Yes (0)

Anonymous Coward | more than 4 years ago | (#28150209)

If you're producing file undelete software.

Speed improvements? (1)

wild_berry (448019) | more than 4 years ago | (#28150215)

Did you find any significant performance improvements compared to ext3?

The extents mean that a large contiguous read is faster and files are more likely to be written in contiguous chunks, giving a bit of a boost to the filesystem. That's the explanation I have for my system and its 5400-rpm laptop disks seeming quicker (note that the appearance of greater performance isn't greater performance).

maybe (0)

wizardforce (1005805) | more than 4 years ago | (#28150247)

It depends on why you are switching from an older filesystem to ext4. It's a relatively new filesystem, so you should probably expect it to be a bit more buggy when combined with software not designed for it. From my limited experience with a combination of KDE and ext4 recently, I'd wait on upgrading for a while. Ext4 looks like it could be very interesting as software matures around it; however, as it currently stands, KDE seemed to me at least a bit less stable on ext4 than it was on ext3. That said, I didn't stay with the filesystem as long as I should have, so take this with a bit of salt...

Re:maybe (1)

arth1 (260657) | more than 4 years ago | (#28150659)

Not only compared to older file systems. XFS is such an older file system, and it still outperforms ext4 for quite a few operations. For deletes, ext4 is far faster, but otherwise, XFS tends to win.

Then again, it all depends on what you're going to use the system for. Some file systems are very good for running databases on top of, and others are good for small and fast create/destroy operations, and yet others favour appends.

In an informal test I just did with large Subversion repositories, ext4 didn't score too well, being rather slower than both JFS and XFS, and slightly slower than ext3. An rsync repository gave somewhat similar results, although here it outperformed ext3. On the other hand, for a build environment ext4 fared much better, and outperformed the others.

But anyhow, this is all irrelevant if you don't consider ext4 mature enough for production use yet. I'm not sure that I do -- there may still be critical bugs that haven't been found yet, and I'd give it a little more time before I shift it from "bleeding edge" to "cutting edge".
It'll get there, I'm sure, and possibly even without any major bugs fixed. But I don't want my production systems to be the test case for verifying that. If ext3 and xfs work well for me, I don't see a need to change just yet.

Um, yes, it's called fsck. (5, Informative)

dandaman32 (1056054) | more than 4 years ago | (#28150257)

I'm using ext4 on an encrypted partition on my tiny X41 tablet. The hard disk is 5400RPM IIRC, so when Ubuntu decides to run fsck due to a scheduled run or an unclean shutdown after a certain bug manifests itself, I don't have to sit there for 10 minutes or more waiting for fsck to run. That for me and many other casual users is probably the biggest advantage of ext4.

Does a laptop count as production? In the eyes of an everyday user, yes. My laptop is very much "production" IMHO, and I trust ext4 enough to not magically make all my school assignments disappear.

Digressing a bit, I haven't seen any of the data loss either, though I use GNOME and not KDE. I do think that if an application relies on specific undocumented behavior, the application should change, not the filesystem driver. It's acceptable that the kernel developers are doing their best to get temporary workarounds into place, but the permanent solution is to fix the applications so they don't depend on undocumented behavior.

Re:Um, yes, it's called fsck. (1)

RiotingPacifist (1228016) | more than 4 years ago | (#28150553)

reiserfs: I've been using it for years for fast fsck, and it can handle a file rename gracefully too :O
It's not undocumented. The problem is KDE was using write-then-rename to make sure there was an atomic operation and guarantee the integrity of the file. Nobody expects a rename to fail (and then ext4 came along and zeroes metadata at bad times to improve performance)!

Re:Um, yes, it's called fsck. (2, Informative)

hackstraw (262471) | more than 4 years ago | (#28150653)

Maybe I'm clueless, and I'll be corrected shortly, but a) didn't ext3 bring this functionality back in 2000 or so? b) don't most distributions format their partitions with the options to not do fscks periodically based on mount count or time?

<insert paragraph break about here>

I know that on every system where I have to create a filesystem manually, I remove the counts to prevent that quick reboot from being a slow reboot and a trip to the data center to babysit the thing through a fsck.
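For anyone wondering what "removing the counts" amounts to, here is a small sketch. It assumes the usual tune2fs flags (-c 0 for the mount-count check, -i 0 for the time interval check); the device path and the function names are placeholders, and it defaults to a dry run since the real command needs root. Turning these checks off also means nothing will catch slow corruption for you.

```python
import subprocess


def disable_periodic_fsck_cmd(device):
    """Build the tune2fs command that turns off count- and time-based checks."""
    # -c 0: no forced check after N mounts; -i 0: no forced check after N days.
    return ["tune2fs", "-c", "0", "-i", "0", device]


def disable_periodic_fsck(device, dry_run=True):
    """Return the command; only execute it when dry_run is False."""
    cmd = disable_periodic_fsck_cmd(device)
    if not dry_run:
        # Requires root and an ext2/3/4 filesystem on the given device.
        subprocess.run(cmd, check=True)
    return cmd
```

Calling disable_periodic_fsck("/dev/sdX1") just returns the argument list, so you can look at it before running anything.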

Production? (1)

msimm (580077) | more than 4 years ago | (#28150777)

A laptop? No, that doesn't count unless you run your production system on another laptop of the same build and make. At least where I work production is business critical systems on real kit, then we have our development environment, testing environment, and after that (in terms of importance) we have the business/office network and individual workstations (and your laptop would be somewhere after that).

Nothing a developer did on a home system would be considered production ready without, you know, doing lots of actual testing.

Re:Production? (1)

colinrichardday (768814) | more than 4 years ago | (#28150975)

So someone could actually use a laptop for hosting a crucial web site, but you wouldn't "consider" it to be production ready without actual testing. Hmm . . .

Maybe.... (1)

jonnycando (1551609) | more than 4 years ago | (#28150281)

...a moot point for me....I have been using xfs for several years, and so haven't tried, nor do I think I need, the latest iteration of ext. But as was opined already, it's not ext4 but the apps that need fixing. So it seems at least.

ext4 is buggy (4, Interesting)

hamanu (23005) | more than 4 years ago | (#28150291)

Well, the fsck times are really fast compared to ext3, and thank god, because EVERY time I reboot it requires an fsck, complaining about group descriptor checksums. Even if I unmount my ext4 filesystem and remount it without rebooting it gets all fscked up. I have a 3TB ext4 fs on LVM on RAID, that was NOT converted from ext3, but built on brand new drives. My similar ext3 filesystem has had no such problems.

ext4 takes about 7 minutes to fsck, ext3 took hours. I hope they fix this soon.

Re:ext4 is buggy (4, Informative)

msuarezalvarez (667058) | more than 4 years ago | (#28150439)

Maybe you should do something about whatever the cause of the constant fsck'ing is. You do realize it is quite abnormal for a system to have errors at each remount, don't you?

Re:ext4 is buggy (5, Insightful)

TCM (130219) | more than 4 years ago | (#28150921)

But he uses R-A-I-D! R-A-I-D magically makes data bulletproof and immune to disaster as we all know.

Seriously, running a 3TB RAID with a buggy fs and applauding faster fsck times instead of wondering why the fs gets fucked up constantly must be the peak of idiocy.

Re:ext4 is buggy (1)

StarHeart (27290) | more than 4 years ago | (#28151009)

This sounds like a problem I have had. It isn't every time I reboot, and it has gotten better with newer kernel versions. Mine is a 4TB ext4 filesystem on Linux software RAID5.

No (2, Insightful)

ducomputergeek (595742) | more than 4 years ago | (#28150295)

We avoid anything that has less than 24 months of wide deployment unless there is some absolute pressing need to move to an unstable/untested product.

We have test and development systems where we run latest and greatest, but generally they are used in sync with the existing system. We don't switch over until we're damn sure there aren't any unforeseen consequences. That typically means 12 months without any major hiccups and 3 months without minor ones.

Re:No (3, Funny)

icebike (68054) | more than 4 years ago | (#28150655)

We avoid anything that has less than 24 months of wide deployment unless there is some absolute pressing need to

Good Idea. Let's all follow this sage advice.

It's a good file system. (4, Interesting)

3vi1 (544505) | more than 4 years ago | (#28150325)

I was one of the people that spoke loudly when Ext4 caused 0-byte file corruption.

While I don't entirely agree that it's just "an application issue", because apps that work fine on every other filesystem should not need to be re-written specifically for Ext4, I am pleased at the work the devs have done to work around the problems. The kernel patches have eradicated the issues I had with corruption, and the performance is still great.

I never did official benchmarking to determine the extent, but my perception is that there's a noticeable performance increase when using Ext4 instead of Ext3.

If I were building a production server, I may think twice and just go with Ext3... unless the app would *greatly* benefit from Ext4. However, for a desktop system, I think Ext4 is a very good choice and ready for primetime.

Re:It's a good file system. (3, Interesting)

Flammon (4726) | more than 4 years ago | (#28150897)

... because apps that work fine on every other filesystem should not need to be re-written specifically for Ext4

Not quite. I believe XFS and JFS behave the same way as Ext4. Here's a good article and thread on the subject. http://lwn.net/Articles/322823/ [lwn.net]

Re:It's a good file system. (3, Interesting)

QuoteMstr (55051) | more than 4 years ago | (#28150929)

Not quite. I believe XFS and JFS behave the same way as Ext4.

When XFS was first released, there was quite a buzz surrounding it before people realized they'd lose data. XFS, not ext3, would have been the de facto Linux standard had the developers not stubbornly refused to fix its data loss bugs. By the time they finally got around to it (for some cases), there'd already been irreparable damage to XFS's reputation.

Anonymous Coward (0)

Anonymous Coward | more than 4 years ago | (#28150343)

After the famous filesystem corruption due to delayed allocation I lost confidence in ext4. I've been using xfs on some partitions and it works great.

Re:Anonymous Coward (2, Informative)

grumbel (592662) | more than 4 years ago | (#28150421)

If you worry about file corruption, I wouldn't touch XFS, that thing shredded files for me on every single unclean shutdown.

Re:Anonymous Coward (1)

larry bagina (561269) | more than 4 years ago | (#28151125)

Seconded. Earlier this year, I set up a NFS raid box for storing videos, music, and other large files, so I went with XFS. Within 10 minutes (only copying 4-5 videos over), XFS had corrupted itself to the point it couldn't be recovered. EXT3 may be a tad slower, but it can manage to read and write files.

Connection Interrupted errors loading slashdot? (0)

Anonymous Coward | more than 4 years ago | (#28150387)

Is anyone else getting a lot of these "connection interrupted" errors when clicking on stories?
It's been going on for a week now and is making Slashdot annoying and almost unreadable.

Re:Connection Interrupted errors loading slashdot? (1)

icebike (68054) | more than 4 years ago | (#28150669)

No. But this seems the wrong place to hide this question.

Not for me... (1)

petrus4 (213815) | more than 4 years ago | (#28150443)

I've never used anything other than Reiser3 with Linux. Might not be the most reliable or fast, but it has other advantages.

- Undeletion.
- Partition resizing.
- Readable from within Windows via YaReG [akucom.de] .

Re:Not for me... (0)

Anonymous Coward | more than 4 years ago | (#28150595)

ext2/3 can be resized offline, ext4 may have online resize too. i can also read ext2 partitions from windows (see http://www.fs-driver.org/ [fs-driver.org] )

and undeletion should never be needed :)

Re:Not for me... (1)

The MAZZTer (911996) | more than 4 years ago | (#28150697)

As my sibling post said, http://www.fs-driver.org/ [fs-driver.org] is a Windows file system driver for ext2, and thanks to forward compatibility (as I understand it), ext3 works too. http://sourceforge.net/projects/ext2fsd [sourceforge.net] is another alternative.

You should be warned that whenever I've used the first tool to write to the partition, I've ended up with Ubuntu fscking it on boot. But I've never noticed any problems like data corruption from using it. The second one also seems OK, although when browsing the disk from the Command Prompt it shows entries for . and .. in the root, which confuses dir.

Theodore Ts'o: Don't fear the fsync! (5, Informative)

sirdude (578412) | more than 4 years ago | (#28150515)

After reading the comments on my earlier post, Delayed allocation and the zero-length file problem as well as some of the comments on the Slashdot story as well as the Ubuntu bug, it's become very clear to me that there are a lot of myths and misplaced concerns about fsync() and how best to use it. I thought it would be appropriate to correct as many of these misunderstandings about fsync() in one comprehensive blog posting.

http://thunk.org/tytso/blog/2009/03/15/dont-fear-the-fsync/ [thunk.org]

FYI, Ts'o is the ext4 maintainer.

Not reassuring (3, Insightful)

Junta (36770) | more than 4 years ago | (#28150769)

He presents three common cases for 'quickie' file modifications:
-Modify-in-place. Yes, this logically cannot be expected to leave the content intact in an unexpected interruption. You ask the OS to blow away data, then send it new data, there is a logical indeterminate state in the middle where doing things in the order you specified leaves you exposed.
-Write new file, use rename, using fsync to ensure a low exposure of data. This forces data to disk so it's coherent.
-Write new file and then use rename without fsync:
*This* he claims should easily be expected to corrupt the contents. I take issue with this. The fact that this occurs is because ext4 commits the rename out-of-order ahead of the data commit. I don't understand why the rename operation cannot also be delayed until after the data has been written out. I've seen several people ask 'I don't care that the change happens *now*, but I want the changes to occur in the order I specified', and thus far have seen Ts'o miss that point (intentionally or unintentionally). I have not read any explanation of why changing hardlinks should logically be an operation to jump ahead of pending data writeout. I could be missing something, but I'm not the only one with these questions.
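For reference, the second pattern in the list (write a new file, fsync, then rename) might look roughly like this. It is a sketch assuming a POSIX system; atomic_write is a made-up name, not anything from KDE or from Ts'o's post, and the temp file is created with mkstemp's default 0600 permissions, which a real application would probably want to adjust.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace path with data so a crash leaves either the old or the new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory (same filesystem) as the
    # target, otherwise the rename is not atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # Force the new contents to disk before the rename can be committed.
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Also sync the directory so the rename itself is durable (POSIX only).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Dropping the two os.fsync calls gives the third pattern, which is the one the argument above is about.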

fsync gives a relatively expensive guarantee above and beyond what people require to behave sanely. He says it's inexpensive 'now' relative to the past. However, 'now' in this context only applies to ext4 users, so the operation degrades performance on other filesystems, and fsync remains an expensive operation relative to not doing it at all.

In terms of the general attitude of filesystems shrugging off data consistency so long as their indexes are intact, I find myself agreeing with Torvalds' comments on the debacle:
http://thread.gmane.org/gmane.linux.kernel/811167/focus=811700 [gmane.org]

You're Asking Slashdot? (2, Insightful)

welshbyte (839992) | more than 4 years ago | (#28150541)

You should be asking this question in a more authoritative forum. The majority of Slashdot readers are likely to just regurgitate their perceived status of ext4 from the last time ext4 was mentioned on Slashdot and I know for certain that ext4 has had more testing and development since then. Try asking the ext4 development team; they're very nice, helpful people in my experience. I refer you to the #ext4 channel on irc.oftc.net and the linux-ext4 mailing list.

wait until at least 2.6.30+ (3, Insightful)

xenoterracide (880092) | more than 4 years ago | (#28150561)

last I checked some patches for the delalloc empty file problem were being merged in 2.6.30. if you want to avoid it but want some other advantages like faster fscks you could go with data=journal on your filesystems which is a bit slower but also disables delalloc, while still having extents, barriers, and other ext4 benefits. I've been using data=journal on my /home partition without a single problem.
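if you want to see which data= mode your ext4 filesystems are actually mounted with, something like this sketch works on text in /proc/mounts format. the function name is made up, and a result of None only means no data= option was listed for that mount (so whatever the kernel default is applies), not that journaling is off.

```python
def ext4_data_modes(mounts_text):
    """Return {mount_point: data mode or None} for ext4 lines in /proc/mounts format."""
    modes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected fields: device, mount point, fs type, options, dump, pass.
        if len(fields) < 4 or fields[2] != "ext4":
            continue
        mode = None
        for opt in fields[3].split(","):
            if opt.startswith("data="):
                mode = opt[len("data="):]
        modes[fields[1]] = mode
    return modes
```

on a live system you would feed it open("/proc/mounts").read().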

it also depends a lot on what you have in 'production'. a web server that's mostly doing reads should be fine. a heavy email server... well.. can you afford to lose email on a crash? I think it might be alright for a server that just does mta but not as the fs for the actual mailboxes (with delalloc anyways). database server should be fine, because the database's job is to make sure data hits the disk, among other things. dns servers are very read heavy so again I would think it'd be fine. so basically you need to watch anything that's write heavy and not going to a database, and even then only with delalloc.

still as I'm sure others have said, it's a good idea to wait on new tech like this. some tools don't yet recognize that ext4 is not ext3.

moD down (-1, Flamebait)

Anonymous Coward | more than 4 years ago | (#28150617)

Will r3call 7hat it

Regretting using it.. (1)

Junta (36770) | more than 4 years ago | (#28150639)

I have installed a system and have been getting resize inode invalid and group descriptors corrupted issues on clean reboots. fsck has yet to fail me, and IO stress tests have demonstrated no general io corruption other than ext4 errors.

On the flipside, for my applications I haven't really gained much.

It's good for general use (1)

revjtanton (1179893) | more than 4 years ago | (#28150681)

I've used it for the past few months on my netbook, however I've only recently tried the Fedora 11 build on ext4 on my desktop. I was impressed w/ the boot speed on my netbook, and for the netbook that's all that really matters. I had absolutely no problems w/ the netbook using Jaunty UNR and ext4.

I did have some problems w/Fedora on my desktop though. It booted nice, and did all the easy stuff well (web browsing, office stuff, etc.) but it got all screwy w/Wine and Eclipse. It might be the GPU I have, or the lack of driver support for Fedora w/it, but Wine ran extremely badly with Steam and Left 4 Dead as compared to a Fedora 10 build w/ ext3. Also I had the typically reported problems w/ext4 and data loss when I was doing some Android dev in Eclipse.

Like anything new in the open source community there are bugs to be hashed out. If nobody uses it and nobody reports the bugs then it won't get better. The boot speed gives it value. Windows 7 RC is booting only slightly slower than ext4 (at least on my system) so if Linux is going to make its stand it's got to do certain things to make it distinct. I believe simple things like boot time and broad dev support are the areas where Linux can shine, and to that end it appeals to the right clientele to take it in the right direction as a community. :)

No (1)

_LORAX_ (4790) | more than 4 years ago | (#28150691)

It still needs more time. I have played under both ubuntu and rhel 5.3 and run into strange behavior that makes me uncomfortable.

1) Bonnie++ throws errors even on server class hardware that something is wrong when creating and deleting a large number of random files. This is with no errors in the filesystem and everything operating normally. https://www.redhat.com/archives/ext4-beta-list/2009-February/msg00000.html [redhat.com]

2) A crash of ubuntu ended up removing *ALL* group and other permission on a laptop drive. Not just those altered within 2 minutes of the crash, but of every single file in the system leading to a system that non-root users could not log into.

Neither of those are acceptable. For now it's still ext3 only until ext4 has had some more time to mature.

Data loss (1)

WillKemp (1338605) | more than 4 years ago | (#28150779)

A couple of months ago i installed Ubuntu 9.01, which used ext4 by default. Running it, i experienced data loss for the first time since i moved from ext2 to ext3 quite a few years ago now. I've just changed back to ext3 - which has been rock solid for me since it first appeared in Redhat or whatever distro it was i was using back then.

Yes...I have experienced problems with ext4 (1)

Kamphor (609888) | more than 4 years ago | (#28150869)

I nearly lost my whole filesystem. It's a good thing I had a backup core system on reiserfs to boot from and run fsck. From what I understand, it's a problem with the ext4 journaling system and metadata. This link has info on the journal problem, which may have already been patched in the current kernels: http://lwn.net/Articles/284037/ [lwn.net]. The wiki page for ext4 has a fix for the problem at the bottom: http://wiki.archlinux.org/index.php/Ext4 [archlinux.org]. Essentially, mounting an ext4 filesystem with the option "data=ordered" helped my system out. Since I enabled this mount option, my filesystem has been stable even after hard reboots or power failures. Hope this helps people as it did me! -Kamphor

No (1)

427_ci_505 (1009677) | more than 4 years ago | (#28150887)

As someone who recently had the latest ubuntu trash every inode on my ext3 partitions, I'd have to say no. Not because my case is related to ext4 in any way, but because if a kernel (2.6.28) can get ext3 wrong, I shudder to think what happens with ext4.

Not touching it for at least 12 months (1)

wdef (1050680) | more than 4 years ago | (#28151115)

Filesystems are mission critical for everything. Stability is the thing here. Personally, I see no reason to risk this until they iron out all the wrinkles.