
The State of Linux IO Scheduling For the Desktop?

timothy posted more than 3 years ago | from the in-and-out-and-in-and-out dept.


pinkeen writes "I've used Linux as my work & play OS for 5+ years. The one thing that constantly drives me mad is its IO scheduling. When I'm copying a large amount of data in the background, everything else slows to a crawl while CPU utilization stays at 1-2%. The process doing the actual copying is highly prioritized in terms of I/O. This is completely unacceptable for a desktop OS. I've heard about the efforts of Con Kolivas and his Brainfuck Scheduler, but it's unsupported now and probably incompatible with the latest kernels. Is there any way to fix this? How do you deal with it? I have a feeling that if this issue were fixed, the whole desktop would become way snappier, even when you're not doing any heavy IO in the background." Update: 10/23 22:06 GMT by T : As reader ehntoo points out in the discussion below, contrary to the submitter's impression, "Con Kolivas is still actively working on BFS, it's not unsupported. He's even got a patch for 2.6.36, which was only released on the 20th. He's also got a patchset out that I use on all my desktops which includes a bunch of tweaks for desktop use." Thanks to ehntoo, and hat tip to Bill Huey.


easy solution: (-1, Troll)

Anonymous Coward | more than 3 years ago | (#33997900)

use a real OS!

Re:easy solution: (-1)

Anonymous Coward | more than 3 years ago | (#33997970)

You must be a stupid Windows or Mac fanboy. Otherwise you would understand how awesome Linux and FreeBSD are. Back on topic: I have never had an issue with this when copying large files. I have used Fedora, Gentoo, and Ubuntu, and none of them has ever had an issue like that. Is it when you are copying to a different spot on the same drive? I copy large amounts of data (50+ GB) from internal to external drives and never have any issues with my Ubuntu box. Even same-drive, different-directory copies are snappy for me. Maybe it's your distro...

Re:easy solution: (1, Troll)

0123456 (636235) | more than 3 years ago | (#33998136)

You mean an OS like Windows which will swap out the web browser you're using when you copy a big file from one disk to another even though it's far too large for the entire file to fit in the disk cache?

Re:easy solution: (1, Interesting)

TheTrueScotsman (1191887) | more than 3 years ago | (#33998286)

You can disable swapping in Windows if you have sufficient RAM. The poster raises a very good point, but it's actually more important in servers than clients (isn't Linux anyway dead on the desktop...?).

This is actually one of the very reasons (the other being multithreaded performance) why many of us use Windows Server 2003/2008 sometimes in preference to Linux.

Re:easy solution: (1)

0123456 (636235) | more than 3 years ago | (#33998316)

You can disable swapping in Windows if you have sufficient RAM.

I tried that once on XP and several programs barfed. For example I seem to remember that Premiere simply wouldn't run if you didn't have a swap file, because it did some wacky things with virtual memory allocations; perhaps the newer versions aren't so braindead.

Re:easy solution: (1)

TheTrueScotsman (1191887) | more than 3 years ago | (#33998378)

Sorry, I can only comment on server solutions that we wrote. I'm sure that some flaky desktop programs have problems.

It sucks I agree (4, Interesting)

Anonymous Coward | more than 3 years ago | (#33997908)

This issue got so bad for me I switched to FreeBSD.

Re:It sucks I agree (4, Insightful)

Anonymous Coward | more than 3 years ago | (#33998128)

We switched our dedicated web servers from Linux to FreeBSD and OpenSolaris. When we uploaded videos (usually 10GB or larger) over our 100Mbps internet connection to the server, or a client was downloading them, those accessing the web server complained it took seconds to serve each page. The videos were on magnetic hard drives; the OS and web server were on SSDs (mirrored in RAM). Server logs were fine, CPU utilisation was low, and the servers have a 1Gbps connection. We put it down to I/O scheduling. Switching the OS solved the problem.

Re:It sucks I agree (1)

dotgain (630123) | more than 3 years ago | (#33998184)

I'm just glad we're finally talking about this. For years I've wondered if it was just me; everyone I'd asked naturally denied any problems, when all I had to do was delete a 1GB file and I could kiss goodbye to my system for 20 seconds or so.

Back then I ran Gentoo, so fine-tuning your kernel was the norm, and I genuinely believed the terrible performance was due to bad choices I made selecting kernel options. About six months ago I finally abandoned Gentoo and the entire practice of fiddling with the internals, and switched to Ubuntu, accepting defaults all the way. No I/O improvements noticed. WinXP on the same machine kicks ass.

Again, I'm glad this is no longer the 'elephant in the room' and we can finally discuss and hopefully fix this. Whodathunkit - Linux had its own Reality Distortion Field.

Re:It sucks I agree (0)

Anonymous Coward | more than 3 years ago | (#33998244)

Deleting one file, however large it may be, takes a minimal amount of disk I/O, which will probably sit around in RAM for a few seconds before being flushed to the disk.

Re:It sucks I agree (4, Interesting)

ObsessiveMathsFreak (773371) | more than 3 years ago | (#33998198)

This is the number one problem with all Linux installations I have ever used. The problem is most noticeable in Ubuntu where, any time one of the frequent update/tracker programs runs, the entire system will become all but unusable for several minutes.

I don't know if it's all that related, but swap slowdown is an appalling issue as well. If a single program spikes in RAM usage, I often have to reboot the whole system as it hangs indefinitely. As I work with Octave a lot, often a script will gobble up a few hundred megs of memory and push the system into swap. Once that happens, it's often too late to do anything about it as programs simply will not respond.

Re:It sucks I agree (3, Interesting)

Lord Byron II (671689) | more than 3 years ago | (#33998266)

That's exactly why I stopped using swap a couple of years ago. On my main machine I have 3 GB and I feel that if I reach the limit on that, then whatever program is running is probably a lost cause anyway. The next malloc/new causes the program to crash, saving the system.

Re:It sucks I agree (3, Informative)

lsllll (830002) | more than 3 years ago | (#33998362)

Such drastic change! I have seen this happen on numerous systems and I just change the elevator to "deadline" and poof! The problem is gone. See this discussion [launchpad.net] for some details. The CFQ scheduler is great for a Linux server running a database, but it completely sucks for desktop or any server used to write large files to.
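For reference, the switch lsllll describes can be done at runtime through sysfs. A sketch, assuming the drive is sda (adjust the device name for your system):

```shell
# List the available I/O schedulers; the active one is shown in brackets
cat /sys/block/sda/queue/scheduler

# Switch the sda queue to the deadline elevator until next boot
echo deadline | sudo tee /sys/block/sda/queue/scheduler

# To make it the default on every boot, add this to the kernel command line:
#   elevator=deadline
```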

what about servers? (3, Insightful)

StripedCow (776465) | more than 3 years ago | (#33997922)

Isn't this also relevant when using Linux on a server? I mean, if one process or thread is copying a large file, you don't want your server to come to a crawl.

It doesn't sound like just a "desktop" issue to me.

Re:what about servers? (1)

e065c8515d206cb0e190 (1785896) | more than 3 years ago | (#33997948)

I use Linux for desktops and servers. Never noticed the issue the OP is referring to... maybe the OP could be a little more specific about what he's doing and on which hardware?

Re:what about servers? (2, Interesting)

Lord Byron II (671689) | more than 3 years ago | (#33998274)

It's been a big issue for me. Go to a directory with a couple of large files (say a dvd rip) and do a "cat * > newfile". Watch your system come to a crawl.
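A rough way to reproduce what the parent describes (file names are placeholders; this needs a few GB free, and a rotational disk shows the effect most clearly):

```shell
# Create two large files, then concatenate them in the background
dd if=/dev/zero of=big1 bs=1M count=2048
dd if=/dev/zero of=big2 bs=1M count=2048
cat big1 big2 > newfile &

# While the copy runs, time something interactive; on an affected
# system even a simple directory listing can stall for seconds
time ls -l /usr/bin > /dev/null
```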

Re:what about servers? (5, Informative)

joaosantos (1519241) | more than 3 years ago | (#33998350)

I just did it and didn't notice any slowdown.

Re:what about servers? (1)

hedwards (940851) | more than 3 years ago | (#33998390)

It's not Linux specific, although he's complaining about it on Linux. I've seen similar behavior on Windows and FreeBSD as well as Linux. If I'm not mistaken, MS added IO scheduling of some sort to Vista and 7 likely has it as well. And FreeBSD has its own effort for that as well.

It's not something which one notices all the time, but I have noticed it and it is annoying. Not personally sure what exactly causes it except in the case where multiple programs want to access different portions of the hard disk at the same time.

Re:what about servers? (2, Interesting)

Anonymous Coward | more than 3 years ago | (#33997974)

On an IO-intensive server this is also a real issue: 20-30% of the processors and cores stuck at 99% iowait for hours while the rest try to cope, and total CPU load never goes above 20%. No solution yet after months of study and experimenting. Linux really does seem bad at IO scheduling in general.

Now imagine that situation combined with a heavy database system. Simply not workable.

What am I doing wrong? (1)

poptones (653660) | more than 3 years ago | (#33998400)

Sometimes I see my system get bogged down doing copies and even lock up for a few seconds. And when I see this I always become very nervous because usually it means I have a hard disk failure of some sort (sometimes it can be just a bad connector, but still a hard failure).

I just copied about 5GB from my 5-disk RAID5 to my main partition WHILE copying about the same amount of data back TO that same RAID, WHILE watching a video from that RAID, and didn't see much of an issue. I saw one little "stutter" for about 100 ms and that was it.

I am using LVM. And the braindead way ubuntu configures LVM to put swap in the same LVM partition as your home and system directories DOES cause all sorts of nastiness. This used to drive me nuts before I fixed it simply by disabling swap. I have 8GB of RAM, why do I need swap?
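For reference, a sketch of the swap-disabling fix mentioned above, plus a milder alternative, assuming a typical Ubuntu install:

```shell
# Turn off all swap for this session
sudo swapoff -a

# Keep it off across reboots by commenting out the swap line in /etc/fstab

# Less drastic: tell the VM to avoid swapping unless memory is really tight
sudo sysctl vm.swappiness=10
```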

Edit: this "new and improved" page formatting SUCKS. Now the buttons that were right there are hidden behind DUMBASS popup menus. Fucking "engineers" thinking they know how to improve shit. How do I turn this bullshit off?

Re:what about servers? (0)

Anonymous Coward | more than 3 years ago | (#33998002)

What if copying that large file quickly is very important and your server is only used for non-latency-sensitive stuff? In that case you might prefer your server to come to a crawl.

Re:what about servers? (5, Informative)

Anonymous Coward | more than 3 years ago | (#33998030)

There are some interactive-response fixes queued up for 2.6.37 that may help (a lot!) with this stuff.
Start reading here: http://www.phoronix.com/scan.php?page=news_item&px=ODU0OQ

Re:what about servers? (2, Interesting)

man_of_mr_e (217855) | more than 3 years ago | (#33998122)

How does this happen? Every year it seems I read about how this problem has been fixed in the latest kernel, and then it's like those fixes mysteriously vanish.

Re:what about servers? (2, Interesting)

fishbowl (7759) | more than 3 years ago | (#33998132)

This problem is highly visible in VMs. When you have one VM doing write-heavy disk IO, the other VMs suffer.

I don't think it's a Linux problem as much as a general problem of the compromises that must be made by any scheduling algorithm.

What about you Linux mainframe guys? You have unbeatable IO subsystems. Do you see the same problems?

Re:what about servers? (1)

inode_buddha (576844) | more than 3 years ago | (#33998406)

I'm not the kinda guy you're looking for, with little or zero mainframe experience. But with that said, I honestly think you'd be better off asking that on the kernel mailing list. Chances are that questions like that would actually do everyone some good.

have you tried ionice? (5, Informative)

larry bagina (561269) | more than 3 years ago | (#33997928)

have you tried ionice?

Re:have you tried ionice? (0)

Anonymous Coward | more than 3 years ago | (#33997978)

They should put that in the file-transfer GUI, maybe in a pull-down with three priority levels.

Re:have you tried ionice? (5, Informative)

atrimtab (247656) | more than 3 years ago | (#33998080)

ionice works great in a terminal window, but isn't integrated into any of the Desktop GUIs.

I suppose you could prefix the various file transfer commands used by the GUI with an added "ionice -c 3", but I haven't bothered to look.

Using ionice to lower the i/o priority of various portions of MythTV like mythcommflag, mythtranscode, etc. can make it quite snappy.
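As an example of the above, a bulk copy can be demoted to the idle I/O class so it only touches the disk when nothing else wants it (paths and PID here are placeholders):

```shell
# Run the copy in the idle I/O scheduling class (only honored by CFQ)
ionice -c 3 cp -a /media/source/ /media/backup/

# Or lower the priority of an already-running process by PID
# (class 2 = best-effort, levels 0 highest to 7 lowest)
ionice -c 2 -n 7 -p 12345

# Check which class a process is currently in
ionice -p 12345
```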

Re:have you tried ionice? (-1, Troll)

Anonymous Coward | more than 3 years ago | (#33998236)

Yeah, I need to use ionice on my Unix-based OSX machine all the time to make I/O performance not suck.

Oh wait, I don't, and nor do I need to clutter up the GUI with options that'd confuse any non-hacker types.

Re:have you tried ionice? (1)

jedidiah (1196) | more than 3 years ago | (#33998290)

Perhaps the appalling lack of storage space on most Macs is why you don't ever see this sort of problem.

Perhaps you should.. (3, Insightful)

Anonymous Coward | more than 3 years ago | (#33997934)

..download and compile the 2.6.36 kernel. A feature of the changes can be found at http://www.h-online.com/open/features/What-s-new-in-Linux-2-6-36-1103009.html

A very very easy to follow guide can be found at http://kernel.net/articles/how-to-compile-linux-kernel.html
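The build itself boils down to a few commands. A rough sketch (version and mirror path assumed), not a substitute for the guide linked above:

```shell
# Fetch and unpack the vanilla source
wget http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.36.tar.bz2
tar xjf linux-2.6.36.tar.bz2
cd linux-2.6.36

# Start from the running kernel's config, answering only the new questions
cp /boot/config-$(uname -r) .config
make oldconfig

# Build and install (adjust -j to your core count)
make -j4
sudo make modules_install install
```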

Sidenote - What is up with not being able to paste links? That's annoying.

Re:Perhaps you should.. (0, Troll)

biryokumaru (822262) | more than 3 years ago | (#33998014)

Actually, I much prefer Slashdot's HTML style comment system. Maybe you should just not be lazy.

Re:Perhaps you should.. (3, Interesting)

Qzukk (229616) | more than 3 years ago | (#33998060)

There's a bug in Chrome that usually makes it unable to paste into Slashdot's comment box once you've placed a '<' character in the box. (Slashdot specifically; it does fine on all sorts of other sites with even fancier ajaxy textareas, like the Stack Overflow sites.)

Re:Perhaps you should.. (1)

biryokumaru (822262) | more than 3 years ago | (#33998284)

That's odd. I use Chrome and I've never had that issue. I'm on Windows 7, what's your OS?

Re:Perhaps you should.. (1)

No. 24601 (657888) | more than 3 years ago | (#33998314)

That's odd. I use Chrome and I've never had that issue. I'm on Windows 7, what's your OS?

I have observed this problem on Chrome for Mac and Chrome for Linux (Ubuntu). Chrome 6 & 7 (the latest).

Re:Perhaps you should.. (1)

SirThe (1927532) | more than 3 years ago | (#33998382)

I use Chromium on Ubuntu 10.04, but I didn't use any <, so not sure what happened. Also, HTML comment style? Why would that prevent me from pasting a link?

BFS Isn't Unsupported (5, Informative)

ehntoo (1692256) | more than 3 years ago | (#33997938)

Con Kolivas is still actively working on BFS, it's not unsupported. He's even got a patch for 2.6.36, which was only released on the 20th. http://ck.kolivas.org/patches/bfs/ [kolivas.org] He's also got a patchset out that I use on all my desktops which includes a bunch of tweaks for desktop use. http://www.kernel.org/pub/linux/kernel/people/ck/patches/2.6/ [kernel.org]
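Applying the BFS patch to a vanilla tree is straightforward. A sketch, with the patch filename pattern assumed from the naming used on that site:

```shell
# From inside an unpacked linux-2.6.36 source tree
cd linux-2.6.36

# Dry-run first to confirm the patch applies cleanly
patch -p1 --dry-run < ../2.6.36-sched-bfs-*.patch

# Apply for real, then rebuild the kernel as usual
patch -p1 < ../2.6.36-sched-bfs-*.patch
```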

Perhaps if Con Kolivas named his scheduler .. (1, Insightful)

Anonymous Coward | more than 3 years ago | (#33997942)

Perhaps if Con Kolivas named his scheduler something else, it might gain more traction ...

Re:Perhaps if Con Kolivas named his scheduler .. (1)

ScrewMaster (602015) | more than 3 years ago | (#33998000)

Perhaps if Con Kolivas named his scheduler something else, it might gain more traction ...

Maybe it's like the BFG9000 in the original DOOM.

Re:Perhaps if Con Kolivas named his scheduler .. (3, Informative)

m50d (797211) | more than 3 years ago | (#33998376)

He tried that before. I think he's given up on getting his scheduler (though perhaps not a suspiciously similar one written by Ingo) into the kernel after what happened with CFS.

I fixed it! (-1, Troll)

Anonymous Coward | more than 3 years ago | (#33997956)

After a long time of being frustrated with this on my Ubuntu laptop, I figured out a great solution: Installing Windows.

Re:I fixed it! (0, Flamebait)

fishbowl (7759) | more than 3 years ago | (#33998154)

Yes, because everybody knows that kernel scheduling algorithms are far more tunable on Windows than on Linux.
 

But that's a strawman (0)

Anonymous Coward | more than 3 years ago | (#33998276)

The OP didn't state that he wants a finely tunable kernel scheduling algorithm. He stated a problem and is looking for a solution. So if the problem has already been fixed in another system (where you don't need to tune the kernel scheduling algorithm; it just works), it is irrelevant whether you could tune the kernel scheduling algorithm if you wanted to.

Not saying that the GP wasn't a troll or flamebait; he obviously was. Just noting that your answer didn't really refute his post in any way.

Re:But that's a strawman (1)

0123456 (636235) | more than 3 years ago | (#33998344)

So, if the problem has already been fixed in another system (So you don't need to tune the kernel scheduling algorithm there... It just works), it is irrelevant whether you actually could tune the kernel scheduling algorithm if you wanted to.

And which OS has magically managed to predict which I/O operations should be given a higher priority over other I/O operations without any user intervention? I don't know about Windows 7, but XP had a choice between 'server mode' and 'desktop mode' and both were equally useless if you copied a large file while using the system.

Re:I fixed it! (0)

Anonymous Coward | more than 3 years ago | (#33998360)

If it works there's no real need to tune it.

Linux I/O scheduling (5, Insightful)

Animats (122034) | more than 3 years ago | (#33997980)

If the CPU utilization is that low, it's an I/O scheduling problem. See Linux I/O scheduling [linuxjournal.com] .

The CFQ scheduler is supposed to be a fair queuing system across processes, so you shouldn't have a starvation problem. Are you thrashing the virtual memory system? How much I/O is going into swapping? (Really, today you shouldn't have any swapping; RAM is too cheap and disk is too slow.)
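A quick way to answer that question, i.e. whether swap traffic is part of the slowdown, is to watch the VM counters while the big copy runs:

```shell
# Watch the swap-in (si) and swap-out (so) columns once per second;
# nonzero values during the copy mean the VM really is paging
vmstat 1 5

# Snapshot of how much swap is actually in use right now
free -m
```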

Re:Linux I/O scheduling (1)

Khyber (864651) | more than 3 years ago | (#33998206)

"RAM is too cheap and disk is too slow"

Your disks might be too slow, but OUM-based MLC flash drives are so fast most current SSDs would look like 80's tech.

Re:Linux I/O scheduling (1)

pinkeen (1804300) | more than 3 years ago | (#33998280)

It doesn't swap; in fact I don't even have a swap partition. The issue was present on all of the boxes I've had in the past few years, so I don't think it's hardware-specific.

Re:Linux I/O scheduling (1)

Compaqt (1758360) | more than 3 years ago | (#33998348)

Just for curiosity's sake, you can't hibernate without a swap partition, can you?

Re:Linux I/O scheduling (0)

Anonymous Coward | more than 3 years ago | (#33998336)

UNIX going back to 1983, at least in my experience, ALWAYS hugely jacks up the priority of any process doing a LOT of I/O. When I had a computing-lab assignment and was logged into a VAX 750 running BSD along with 20 others, up against a deadline, I used to issue a shell command to duplicate a 1MB file, and I would see my priority jump way up. Then I could run my program quickly. After a few moments my priority would be back to normal and my program would take much longer to run.

When I have asked about the Unix and Linux schedulers acting this way, most people tell me it's a feature, not a bug. Try it! You'll see that a huge copy gets its priority increased hugely, along with its parent process and any further children.

Re:Linux I/O scheduling (0)

Anonymous Coward | more than 3 years ago | (#33998352)

Your process can get an utterly fair allocation of block reads and still perform horribly if other IO is thrashing the buffer cache. Generally when I get complaints about IO concurrency, it's not because an individual process is "hogging" the block device, but because concurrency is up, cache hits are down, and hence the total number of requests hitting the block device shoots up.

On an idle system with 1G of free memory, a random walk through a couple of 100M files is going to be memory-mapped in no time and you won't be limited by the disk at all. Do the same thing on a busy system with other processes pushing your files out of the buffer cache, though, and suddenly you *need* the disk.

The fundamental issue isn't a scheduling-algorithm problem. The fundamental issue is that modern applications run *horribly* without massive memory buffers between the app and the disk.

Not had the slightest problems with this (1)

Giant Electronic Bra (1229876) | more than 3 years ago | (#33997984)

Using Mandriva 2010.0 (or any earlier builds, for that matter). Not sure if their stock kernel uses scheduling patches, but the only time I've ever seen slowdowns on my wimpy P4 machine is with really serious oversubscription of memory, which obviously will turn it into a dog. IO seems to have little to no effect, however.

So maybe you just need a better desktop distribution? A newer one, perhaps? Don't expect that if you slap just any old distro on a machine and call it a workstation you'll get something beyond garbage. I'd expect SUSE and/or Fedora to work equally well. Ubuntu is probably doing OK, but I wouldn't know. Most of the smaller/less mainstream distros are quite random, and running something like CentOS on a desktop is just asking for a crappy desktop.

Re:Not had the slightest problems with this (2, Informative)

frisket (149522) | more than 3 years ago | (#33998200)

I'm using Ubuntu 10.04 on an old Dell, and big copies don't seem to slow it down any more than I'd expect on an old machine, either when copying to an external USB backup (with rsync) or over the net to my office systems (via scp). Serious slowdown would seem to indicate something deeper is wrong.

Background? (-1, Troll)

Anonymous Coward | more than 3 years ago | (#33997986)

How is Linux supposed to know that this copying should be done in the background? Maybe I need the copying done before I can do anything else.

There is an ionice command you know. But telling the system what you want would be stupid, everything must be done by magical guesswork instead.

Re:Background? (1)

amn108 (1231606) | more than 3 years ago | (#33998194)

It seems to me you don't really know what you're talking about. As much as one wants one's file copying to finish sooner, it should never EVER impair the responsiveness of a workstation. That's what multitasking operating systems with GUIs are all about. It has nothing to do with Linux or its kernel knowing where its priorities should be.

Have you ever heard of the BitTorrent client Transmission? Whenever the thing has anything to do (downloading/uploading) I have an almost useless desktop. Now, I don't really care how fast it writes to disk as long as it keeps a decent pace, but I do care about being able to be productive while it's doing its job; otherwise I could resort to MS-DOS or something.

You are obviously confusing throughput with latency. Latency has been shown time and again to be THE decisive factor for desktop users. It's all on the Internet; just do some googling and you'll find case studies showing what matters more to users: whether their 700MB DivX copy finishes in 3 minutes instead of 5, or whether their computer stays as responsive as they are used to during those minutes. Users always prefer latency, which is why Con Kolivas' work is appreciated, regardless of what you may have heard (in particular from Linus). His whole argument with the opponents of a pluggable scheduler revolved around the fact that what works for servers doesn't always work for desktops.

Perhaps you should... (1)

SirThe (1927532) | more than 3 years ago | (#33997990)

..download and compile the 2.6.36 kernel. A feature of the changes can be found at http://www.h-online.com/open/features/What-s-new-in-Linux-2-6-36-1103009.html [h-online.com] A very very easy to follow guide can be found at http://kernel.net/articles/how-to-compile-linux-kernel.html [kernel.net] Sidenote - What is up with this comment not showing up when I wasn't registered. That's stupid and annoying.

Re:Perhaps you should... (2, Funny)

biryokumaru (822262) | more than 3 years ago | (#33998046)

Sidenote - What is up with this comment not showing up when I wasn't registered. That's stupid and annoying.

It did. [slashdot.org] Now who's stupid and annoying? I mean, besides me.

Re:Perhaps you should... (1)

SirThe (1927532) | more than 3 years ago | (#33998084)

Heh, it wasn't there when I first refreshed, nor after I registered and came back. Not sure why it was hidden, my bad.

May be Fixed (1)

IQgryn (1081397) | more than 3 years ago | (#33998028)

The 2.6.36 kernel supposedly has a fix for this issue. I haven't been able to test it yet myself, but it sounds like they finally tracked it down. See here [kernelnewbies.org] for more information.

Is it really only a matter of scheduling? (4, Interesting)

mrjb (547783) | more than 3 years ago | (#33998042)

I've wondered on occasion if this problem is really only due to scheduling. After all, most of us still write our file-access code more or less as follows: x = fopen('somefilename'); while (!eof(x)) { print readln(x, 1024); /* ---- */ } fclose(x); The point being, there's nothing that tells the marked line that the process should gracefully go to sleep while the drive is doing its thing, and there's no callback vector defined either; nothing indicates we're dealing with non-blocking I/O. I'd like to think our compilers have silently been improved to hide those implementation details from us, but I have no proof that this is the case. Unless the system functions use some dirty stack-manipulation voodoo to extract the return address of the function and use it as a callback vector?

Re:Is it really only a matter of scheduling? (5, Informative)

Anonymous Coward | more than 3 years ago | (#33998120)

The kernel will preempt the process calling "readln", in other words putting it to sleep.
The kernel will make sure the I/O happens, allowing other processes to work in the meantime.
You only need non-blocking code if your own process needs to do other things at the same time.

Re:Is it really only a matter of scheduling? (4, Informative)

Anonymous Coward | more than 3 years ago | (#33998140)

The process will go to sleep inside the read() system call (inside readln() somewhere presumably). Other processes will be able to run in the meantime. It works by interrupting into kernel code, and the kernel changes the stack pointer (and program counter, and lots of other registers) to that of another process. When the data comes back from the disk, the kernel will consult its tables and see that your process is runnable again, and when the scheduler decides it's its turn, in a timer interrupt, the stack pointer will be switched back to your stack. (So yes, dirty stack manipulation voodoo.) Every modern OS works this way.
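This sleeping-inside-read() behavior can be observed directly with strace; the file path below is a placeholder:

```shell
# -T appends the time spent inside each syscall; on a cold cache,
# reads that actually hit the disk show multi-millisecond times,
# which is exactly the interval the process spends asleep
strace -T -e trace=read cat /path/to/large/file > /dev/null
```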

Re:Is it really only a matter of scheduling? (1)

SuricouRaven (1897204) | more than 3 years ago | (#33998212)

Depending on the drive, sometimes application behavior can be an issue. A program that calls fwrite 1024 times with a kilobyte buffer will act quite differently from one that calls it just once with a megabyte buffer. The former promotes fragmentation and often causes the drive to thrash around (unless it's an SSD) if there is other IO going on simultaneously, which really kills performance. A single large write completes faster than a thousand small ones.

Re:Is it really only a matter of scheduling? (0)

Anonymous Coward | more than 3 years ago | (#33998230)

That really isn't anything that you're supposed to write code to do.

The kernel itself is what is doing the scheduling, you see. Programs that aren't I/O intensive should not have to come to a stand-still when programs that are I/O intensive run.

Re:Is it really only a matter of scheduling? (1)

onefriedrice (1171917) | more than 3 years ago | (#33998250)

... nothing that indicates we're dealing with non-blocking I/O ...

Because it typically isn't non-blocking IO. The process can obviously request non-blocking, but the default (and most used) is blocking. So the process actually will sleep as it is supposed to if it's written correctly. It would not be the kernel's fault if a process requests a non-blocking fd and enters into a tight loop which spends more time looping than writing.

Re:Is it really only a matter of scheduling? (1)

dotgain (630123) | more than 3 years ago | (#33998260)

Blocking I/O is hardly a new phenomenon. The first thing that happens to any process that needs a read from disk is to get knocked out to sleep. In terms of CPU clock cycles, a hard drive seek is like an entire season.

What's more - processes that are put to sleep in this manner usually wake with a higher priority than usual, on the assumption they're more I/O bound than CPU bound.

Re:Is it really only a matter of scheduling? (0, Offtopic)

seebs (15766) | more than 3 years ago | (#33998272)

I sincerely hope that most of us don't write our code like that, because EOF is a past-tense check, not a future-tense check. The usual idiom is:

while (fgets(buf, sizeof buf, stream)) /* do something */

because feof(f) tells you whether the LAST read ALREADY FAILED due to EOF.

Re:Is it really only a matter of scheduling? (1)

WetCat (558132) | more than 3 years ago | (#33998374)

Hm, I'm afraid you are not correct.
Where in this part of the code is the drive "doing its thing"?
Only in readln(), and in readln() the process yields control to the kernel, which handles the scheduling and the I/O.
So it's up to the kernel what to do in this case.

CK patches for the kernel are always updated... (1, Informative)

Anonymous Coward | more than 3 years ago | (#33998050)

"I've heard about the efforts of Con Kolivas and his Brainfuck Scheduler, but it's unsupported now and probably incompatible with latest kernels."

I don't know what you're talking about: http://users.on.net/~ckolivas/kernel/
It's updated for the latest kernel which came out just yesterday.

It is been worked on (0)

Anonymous Coward | more than 3 years ago | (#33998056)

It is been worked on: http://kernelnewbies.org/Linux_2_6_36#head-738bffb3415051b478ecdfd2eabb0294e35146a9 and http://lkml.org/lkml/2010/10/19/123

Con and his trolls (0)

Anonymous Coward | more than 3 years ago | (#33998082)

Did Con unleash some of his trolls on Slashdot?

Yeah, I think he just did ...

This is fixed or being worked on (1)

Yfrwlf (998822) | more than 3 years ago | (#33998086)

Supposedly the 2.6.36 kernel addresses this issue. I don't know if the problem has been completely fixed, or mostly fixed, or what, since I haven't tried that kernel yet (too bad there isn't an easy way to install kernels in a cross-distro fashion!).

Read the bullet points here, particularly the ones in the middle; multiple things have been done in this kernel to improve performance:
http://www.h-online.com/open/features/What-s-new-in-Linux-2-6-36-1103009.html?page=6 [h-online.com]

Is Desktop Linux [still] relevant? (0, Troll)

bogaboga (793279) | more than 3 years ago | (#33998102)

I ask this question with utmost sincerity. Folks over here [pcworld.com] believe it is indeed dead. I am afraid I agree with them. I hear so little about desktop Linux these days. It's all about iOS, Android and RIM. The future does not appear to be on track to change anytime soon. Now tell me I am wrong and why.

Re:Is Desktop Linux [still] relevant? (4, Informative)

bieber (998013) | more than 3 years ago | (#33998162)

That was a joke, right? You don't really think that all the millions of desktop Linux users just up and vanished because some idiot at PCWorld wanted a catchy headline?

Re:Is Desktop Linux [still] relevant? (1)

32771 (906153) | more than 3 years ago | (#33998186)

You are working for the wrong newspaper. Oh, oops, that was supposed to mean 'reading' instead of 'working' but hey, it's close enough.

Re:Is Desktop Linux [still] relevant? (1)

m6ack (922653) | more than 3 years ago | (#33998234)

As for me and my laptop...
$ uname -a
Linux mmm 2.6.35-22-generic #35-Ubuntu SMP Sat Oct 16 20:45:36 UTC 2010 x86_64 GNU/Linux

Re:Is Desktop Linux [still] relevant? (0)

Anonymous Coward | more than 3 years ago | (#33998262)

Your premise is wrong. It doesn't matter whether an operating system has 99% market share or 1% market share, so long as the OS is good and the community is good.

The Linux desktop OS has never been better than it is today, when compared with Windows.

The community? I don't know, but I've always been able to get help one way or another.

if i have many gigs of data to copy over somewhere (1)

FudRucker (866063) | more than 3 years ago | (#33998114)

I just run it and let it own the computer for whatever time it takes (anywhere from 10 to 30 minutes) and just walk off, maybe go get a fresh cup of coffee or a cold beer, depending on where I am and what time of day it is. One thing I don't want is a borked copy because I was too impatient to let it do its job.

Re:if i have many gigs of data to copy over somewh (1)

lsllll (830002) | more than 3 years ago | (#33998246)

Hey, I'm all for grabbing a beer any time of the day, but surely you don't think watching a YouTube video, sending emails, playing chess, or shopping online on your machine while it is copying a file in the background will "bork" the copy. I would toss any O/S that would do such a thing.

It has always been like that (2, Informative)

guacamole (24270) | more than 3 years ago | (#33998142)

I can remember seeing this issue with Linux as far back as 1999. This is bad not only for the desktop, but also for the server. I also have experience with Solaris workstations and servers, and they usually don't behave this way.

Switch to Deadline (1, Interesting)

Anonymous Coward | more than 3 years ago | (#33998146)

I ran into the same problems and ended up switching to the "deadline" scheduler. Haven't had a single problem since. I changed it via the "elevator=deadline" on the kernel boot prompt, but you can change it on the fly for individual devices. See Configuring and Optimizing Your I/O Scheduler [devshed.com] to see how.
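For reference, the runtime switch the parent mentions lives in sysfs. A minimal sketch (device names are illustrative; the actual write needs root):

```shell
# List each block device with its available IO schedulers; the active one
# is shown in brackets, e.g. "noop deadline [cfq]".
for f in /sys/block/*/queue/scheduler; do
  [ -r "$f" ] || continue
  printf '%s: %s\n' "${f%/queue/scheduler}" "$(cat "$f")"
done

# To switch one device at runtime (needs root; sda is illustrative):
#   echo deadline > /sys/block/sda/queue/scheduler
# Or make deadline the default at boot with elevator=deadline.
```

The change takes effect immediately and only affects the device you write to, so it's easy to compare schedulers under your own workload.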

Not a Thread Scheduling Issue (1)

SwashbucklingCowboy (727629) | more than 3 years ago | (#33998148)

This is not a thread scheduling issue, it's a disk scheduling issue. If CPU utilization is only 1-2% and things aren't snappy, then the problem is that the foreground process's I/Os aren't given higher (high enough?) priority. Easy enough to believe, too: a whole lot of writes get cached and then queued up. With an elevator algorithm, they'll likely all get performed before any reads required by the foreground process.

Re:Not a Thread Scheduling Issue (-1, Troll)

Anonymous Coward | more than 3 years ago | (#33998278)

This issue is also semi-obsolete. When I last used Windows, it used to almost physically lock up under IO strain, to the point that the mouse stopped working and entering text in a text editor was bloody slow. I'm not even talking about the IO of those processes, just other interrupt processing. I think that was Windows 2000 or XP. Linux is light-years ahead of that mess.

If Linux wants to be even more desktop friendly than the Windows 2000 kernel, it will need some sort of traffic shaping, but for IO: something where the desktop manager can poke certain processes to get higher IO priority than the default.

Now, the issue is semi-obsolete because SSDs don't suffer from seek-time overhead, and that is where IO contention causes most lag.

Switch to "noop". (0)

Anonymous Coward | more than 3 years ago | (#33998152)

"noop scheduler: just service next request in the queue without any algorithm to prefer this or that request."

OS/2 (2, Interesting)

picross (196866) | more than 3 years ago | (#33998158)

I remember using OS/2 (IBM's desktop OS) and I was always amazed that you could format a floppy and do other tasks as if nothing else was going on. I never did understand why that never seemed to make it into the mainstream.

I'll tell you how to fix it (-1, Flamebait)

Anonymous Coward | more than 3 years ago | (#33998176)

Use a desktop OS, such as Windows 7, not a server OS, such as Linux.

Wrong Question (2, Interesting)

donscarletti (569232) | more than 3 years ago | (#33998188)

This is not a case of Linux IO schedulers being unsuitable for the desktop, but more a case of desktop applications being written in a horrendous way in terms of data access. The general pattern is to open a file object, load in a few hundred kilobytes, process this, then ask the operating system for more. This is a small inefficiency when the resource is doing nothing, but if the disk is actually busy, then it will probably be doing something else by the time you ask it to read a little bit more. Not to mention the habit of reading through a few hundred resource files one at a time in seemingly random order, blocking on every read, because the application programmer is too lazy to think about what resources the app is using.

Linux has such a nice implementation of mmap, which works by letting Linux actually know ahead of time what files you are interested in and managing them itself, without the application programmer worrying his pretty little head over it. Other options are running multiple non-blocking reads at the same time and loading the right amount of data and the right files to begin with.

The best thing about a simple CSCAN algorithm is that it gives applications what they asked for and if the application doesn't know what it wants, well, that's hardly a system issue.

Probably not the IO scheduler (5, Informative)

crlf (131465) | more than 3 years ago | (#33998192)

This is almost certainly not the IO scheduler's problem. IO scheduling priorities are orthogonal to CPU scheduling priorities.

What you are likely running into is the dirty_ratio limits. In Linux, there is a memory threshold for "dirty memory" (memory that is destined to be written out to disk), that once crossed, will cause symptoms like you've described. The dirty_ratio values can be tuned via /proc, but beware that the kernel will internally add its own heuristics to the values you've plugged in.

When the threshold is crossed, in an attempt to "slow down the dirtiers", the Linux kernel will penalize (in rate-limited fashion) any and every task on the system that tries to allocate a page. This allocation may be in response to userland needing a new page, but it can also occur if the kernel is allocating memory for internal data structures in response to a system call the process made. When this happens, the kernel will force that allocating thread (again, rate-limited) to take part in the flushing process, under the (misguided) assumption that whoever is allocating a lot of memory is the same thread that is dirtying a lot of memory.

There are a couple of ways to work around this problem (which is very typical when copying large amounts of data). For one, the copying process can be fixed to rate-limit itself and to synchronously flush data at some reasonable interval. Another way a system administrator can manage this sort of task (if automated, of course) is to use Linux's support for memory controllers, which essentially isolates memory-subsystem performance between tasks. Unfortunately, its support is still incomplete and I don't know of any popular distributions that automate this cgroup subsystem's use.

Either way, it is very unlikely to be the IO scheduler.
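The dirty-memory thresholds described above can be inspected and tuned from userland. A minimal sketch (the 10/5 values are illustrative, not a recommendation; writing needs root):

```shell
# Current thresholds, as a percentage of reclaimable memory:
cat /proc/sys/vm/dirty_ratio             # hard limit: dirtiers get throttled here
cat /proc/sys/vm/dirty_background_ratio  # background writeback kicks in here

# Lowering them makes writeback start earlier and caps the backlog that a
# big copy can build up before everything stalls (needs root):
#   sysctl vm.dirty_ratio=10 vm.dirty_background_ratio=5
```

Smaller values trade some bulk-copy throughput for shorter stalls, which is usually the right trade on a desktop.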

Re:Probably not the IO scheduler (2, Insightful)

0123456 (636235) | more than 3 years ago | (#33998302)

Then there are programs like Firefox, which continually write to sqlite databases, which causes multiple fsync() calls, which will flush the disk cache each time if you're running on an ext3 filesystem. All because NTFS used to eat your bookmarks file if Windows crashed.

Learn to use "nice" (1, Redundant)

NReitzel (77941) | more than 3 years ago | (#33998202)

Gee, most of us *nix people - what did that guy call us, something about smoking roosters over small pieces of wood - know that when you need to copy a few gigabytes in background, you use "nice" and crank the priority way down. This has been around since something like 1975 or so.
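Concretely, that looks like the sketch below (the paths and file size are made up for illustration; as replies note, nice only deprioritizes the CPU side, not the IO):

```shell
# Make some illustrative data, then copy it at the lowest CPU priority
# so foreground apps win any contest for the processor.
rm -rf /tmp/nice-demo
mkdir -p /tmp/nice-demo/src
dd if=/dev/zero of=/tmp/nice-demo/src/big.bin bs=1M count=8 status=none
nice -n 19 cp -r /tmp/nice-demo/src /tmp/nice-demo/dst
```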

"nice" is for CPU bound tasks (1)

LukeCrawford (918758) | more than 3 years ago | (#33998408)

It doesn't do much for I/O. ionice, a much newer program, does for I/O-intensive tasks something similar to what 'nice' does for the CPU. It's pretty good; not as good as nice is for CPU-bound tasks, but eh.

Even worse on SSDs (1)

blacklint (985235) | more than 3 years ago | (#33998254)

There's another massive problem with I/O scheduling on Linux: all of the schedulers are designed for physical disks. With solid state drives, as opposed to physical spinning platters, a ladder algorithm is useless and only serves to reduce performance. With solid state drives, the best scheduler is currently noop, which doesn't implement priorities. I prototyped a lottery-based scheduler for a class that would allow ionice to be used in a sensible way on solid state drives, but never got it into a state where it didn't crash the kernel. The whole system does seem massively out of date.
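Until something smarter exists, a common workaround is to pick the scheduler per device from the kernel's rotational flag. A read-only sketch (the echo that would actually apply the choice needs root, so it's only printed here):

```shell
# queue/rotational is 1 for spinning platters, 0 for SSDs and other
# non-rotational devices.
for dev in /sys/block/*; do
  [ -r "$dev/queue/rotational" ] || continue
  if [ "$(cat "$dev/queue/rotational")" = 0 ]; then
    sched=noop   # no seek penalty: skip the elevator entirely
  else
    sched=cfq    # spinning disk: keep the elevator/fairness logic
  fi
  echo "$dev: would run 'echo $sched > $dev/queue/scheduler' (as root)"
done
```

The same test can be wired into a udev rule so hotplugged SSDs get noop automatically.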

How much does the filesystem matter? (0)

Anonymous Coward | more than 3 years ago | (#33998304)

On my current openSUSE 11.3 install I've only observed severe slowdown when I read/write large amounts of data from/to NTFS partitions. Similar operations that only involve ext4 largely go unnoticed. My best guess would be that the NTFS-3G driver was written around a spec that was, for one thing, closed and, perhaps more importantly, not designed with the Linux kernel in mind.

ionice helps some (1)

LukeCrawford (918758) | more than 3 years ago | (#33998312)

If you are doing something non-interactive that uses a lot of I/O, use ionice. Experiment, but I find

  ionice -p [pid] -c 2 -n 7

to produce reasonable results.

Maybe it's the filesystem? (1)

flerchin (179012) | more than 3 years ago | (#33998332)

My absolutely puny hardware (all 5+ years old, or netbooks) does not experience this problem at all running different releases of Ubuntu. I did notice that Transmission sometimes chewed up too much processor when I had 10+ torrents going, but my bulk drive was NTFS. After I formatted it to ext4, even that went away. I routinely copy multiple GB files intra-drive, inter-drive, and intranetwork while browsing, youtubing, etc.

Maybe you're using an NTFS filesystem that isn't as efficient?

Again, my hardware is majorly obsolete. My only "multicore" setup is on a hyperthreading Atom.

IO scheduler != CPU scheduler (5, Insightful)

Ingo Molnar (206899) | more than 3 years ago | (#33998342)

FYI, the IO scheduler and the CPU scheduler are two completely different beasts.

The IO scheduler lives in block/cfq-iosched.c and is maintained by Jens Axboe, while the CPU scheduler lives in kernel/sched*.c and is maintained by Peter Zijlstra and myself.

The CPU scheduler decides the order of how application code is executed on CPUs (and because a CPU can run only one app at a time the scheduler switches between apps back and forth quickly, giving the grand illusion of all apps running at once) - while the IO scheduler decides how IO requests (issued by apps) reading from (or writing to) disks are ordered.

The two schedulers are very different in nature, but both can cause similar-looking bad symptoms on the desktop - which is one of the reasons why people keep mixing them up.

If you see problems while copying big files then there's a fair chance that it's an IO scheduler problem (ionice might help you there, or block cgroups).
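A hedged sketch of the two IO-side routes mentioned above, for a cgroup-v1 kernel with the blkio controller; the weight value and paths are illustrative, and everything except the ionice line needs root:

```shell
# Route 1: drop a background job to the idle IO class (CFQ honours this).
# "true" stands in for the real copy command; falls through if ionice is absent.
ionice -c 3 true 2>/dev/null || echo "ionice unavailable here"

# Route 2: give background work a small proportional blkio weight
# (cgroup v1; range 100-1000, lower = smaller share of disk time):
#   mkdir /sys/fs/cgroup/blkio/background
#   echo 100 > /sys/fs/cgroup/blkio/background/blkio.weight
#   echo "$copy_pid" > /sys/fs/cgroup/blkio/background/tasks
```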

I'd like to note for the sake of completeness that the two kinds of symptoms are not always totally separate: sometimes problems during IO workloads were caused by the CPU scheduler. It's relatively rare though.

Analysing (and fixing ;-) such problems is generally a difficult task. You should mail your bug description to linux-kernel@vger.kernel.org and you will probably be asked there to perform a trace so that we can see where the delays are coming from.

On a related note i think one could make a fairly strong argument that there should be more coupling between the IO scheduler and the CPU scheduler, to help common desktop usecases.

Incidentally there is a fairly recent feature submission by Mike Galbraith that extends the (CPU) scheduler with a new feature which adds the ability to group tasks more intelligently: see Mike's auto-group scheduler patch [kernel.org]

This feature uses cgroups for block IO requests as well.

You might want to give it a try, it might improve your large-copy workload latencies significantly. Please mail bug (or success) reports to Mike, Peter or me.

You need to apply the above patch on top of Linus's very latest tree, or on top of the scheduler development tree (which includes Linus's latest), which can be found in the -tip tree [redhat.com]

(Continuing this discussion over email is probably more efficient.)

Thanks,

Ingo

Actually, you're wrong (1)

inode_buddha (576844) | more than 3 years ago | (#33998370)

Con Kolivas released a patch set against the 2.6.36 kernel just a few days ago. Check lkml.org.

umm.. not a general issue (0)

Anonymous Coward | more than 3 years ago | (#33998394)

i have never seen such a problem... i play music off of a mounted drive, read/write to another mounted partition all the time... move files from here to there... absolutely no noticeable slowdowns coz of that...

but hands down, windows 7 beats the pants off any linux distro (or even mac for that matter)... love it completely! very professional work...
