
Faster Chips Are Leaving Programmers in Their Dust

CmdrTaco posted more than 6 years ago | from the or-maybe-they've-already-wrapped-around-to-zero dept.

Programming 573

mlimber writes "The New York Times is running a story about multicore computing and the efforts of Microsoft et al. to try to switch to the new paradigm: "The challenges [of parallel programming] have not dented the enthusiasm for the potential of the new parallel chips at Microsoft, where executives are betting that the arrival of manycore chips — processors with more than eight cores, possible as soon as 2010 — will transform the world of personal computing.... Engineers and computer scientists acknowledge that despite advances in recent decades, the computer industry is still lagging in its ability to write parallel programs." It mirrors what C++ guru and now Microsoft architect Herb Sutter has been saying in articles such as his "The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software." Sutter is part of the C++ standards committee that is working hard to make multithreading standard in C++."

573 comments

2005 Called (5, Funny)

brunes69 (86786) | more than 6 years ago | (#21727168)

....it wants it's article back.

Seriously - any developer writing modern desktop or server applications who doesn't know how to do multi-threaded programming effectively deserves to be on EI anyway. It is not that difficult.

Re:2005 Called (5, Insightful)

CastrTroy (595695) | more than 6 years ago | (#21727302)

It's not just making your app multithreaded, it's completely changing your algorithms so that they take advantage of multiple processors. I took a parallel programming course in university, so I'm by no means an expert, but I'll give what insight I have. You can't just take a standard sort algorithm and run it multithreaded. You have to change the entire algorithm. In the end, you end up with something that sorts faster than n log (n). However, doing this type of programming where you break up the dataset, sort each set, and then gather the results can be very difficult. Many debuggers don't deal well with multiple threads, so that adds an extra layer of difficulty to the whole problem. Granted, I don't think that we really need this level of multithreadedness, but I think that's what the article is referring to. I think that 10+ core CPUs will only really help those of us who like to do multiple things at the same time. I think it would even be beneficial to keep most apps tied to a single CPU so that a runaway app wouldn't take over the entire computer.
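A minimal sketch of the break-up-and-gather idea the parent describes, using what later became std::thread in C++11 (the function name and structure are the editor's illustration, not code from the thread): sort the two halves on separate threads, then merge sequentially.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Sketch: sort each half of the data on its own thread, then merge.
// The merge step stays sequential, which is one reason a fixed number
// of cores gives a bounded speedup rather than beating O(n log n).
std::vector<int> parallel_sort(std::vector<int> v) {
    int* data = v.data();
    const std::size_t mid = v.size() / 2;
    std::thread left([=] { std::sort(data, data + mid); });  // left half, worker thread
    std::sort(data + mid, data + v.size());                  // right half, this thread
    left.join();
    std::inplace_merge(v.begin(), v.begin() + mid, v.end());
    return v;
}
```

Even this toy version shows the overheads the parent mentions: thread start-up, the sequential merge at the end, and the fact that debugging two threads racing on one buffer is harder than debugging one.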

Re:2005 Called (4, Interesting)

gazbo (517111) | more than 6 years ago | (#21727406)

In the end, you end up with something that sorts faster than n log (n).

Not without an infinite number of processors you don't.

Re:2005 Called (1)

ByOhTek (1181381) | more than 6 years ago | (#21727418)

Sometimes it is as simple as adding multithreading without changing the logic.

It depends on where you are splitting your logic.

Let's take a binary search example:
Your bank accidentally left a back door in their database, and now the hackers/crackers want to grab their enemies' credit and account information. Which approach will let them get it faster?

The database is sorted:
1) Perform a binary search on the data with each thread doing 1/Nth the data, where N is the number of threads per search
2) Perform a binary search on the data, with each thread searching over all of the data, and one thread per search.

The second works best if there are more searches than CPUs, otherwise the first works best. The second also doesn't really require changing any major algorithms.
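The second strategy can be sketched in a few lines of C++ (names invented for illustration); since the threads only read the shared sorted data, no locking or algorithm change is needed:

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// One thread per query, each running an ordinary binary search over the
// whole sorted dataset. Results go to distinct slots, so no locks needed.
// (int rather than vector<bool> is deliberate: vector<bool> packs bits,
// so concurrent writes to "different" elements would race.)
std::vector<int> search_all(const std::vector<int>& sorted,
                            const std::vector<int>& queries) {
    std::vector<int> found(queries.size(), 0);
    std::vector<std::thread> pool;
    for (std::size_t i = 0; i < queries.size(); ++i)
        pool.emplace_back([&, i] {
            found[i] = std::binary_search(sorted.begin(), sorted.end(),
                                          queries[i]);
        });
    for (auto& t : pool) t.join();
    return found;
}
```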

Re:2005 Called (0)

Anonymous Coward | more than 6 years ago | (#21727910)

Neither. The bottleneck is the disk system.

Re:2005 Called (5, Funny)

ZeroFactorial (1025676) | more than 6 years ago | (#21727724)

This sounds to me like a great example of passing the buck.

EE Guy #1: We can't seem to build faster chips.
EE Guy #2: No problem. We'll just put tons of processor cores in instead.
EE Guy #1: But people have spent the past 30 years creating algorithms for single core machines. Almost none of the programmers have any experience writing multi-core algorithms!
EE Guy #2: Exactly! We'll be able to blame the programmers for being lazy and not wanting to learn new complicated algorithms that require an additional 4 years of university.
EE Guy #1: Brilliant! We should come up with a catchy headline like "The Free Lunch is Over" or something like that.
EE Guy #2: Yeah, and we could get Slashdot to post a link to the article. Slashdot users are sure to sympathize with our devious plans...

Threads Are Not the Answer (0, Troll)

MOBE2001 (263700) | more than 6 years ago | (#21727758)

Threads are the second-worst thing to have happened to computing, in my opinion. They make the problem worse. Ask Intel and Microsoft. They've been trying to make threads work for years and they've spent a lot of money on it. They have nothing interesting to show for their effort. What's amazing to me is that we've had the answer to parallel programming with us all along. We are just blind to it, for whatever psycho-social reason. We've been using it to parallelize processes in such applications as cellular automata, simulations, and neural networks for decades. And without using threads, mind you. We just need to apply the same principle at the instruction level and design development tools and special multicore CPUs to support the model. Read Half a Century of Crappy Computing [blogspot.com] to find out more.

Re:Threads Are Not the Answer (5, Interesting)

caerwyn (38056) | more than 6 years ago | (#21727880)

This is very, very wrong. Data-set partitioning is certainly one way of achieving parallelism in programming, but it is hardly the only way- nor is it applicable to all domains, as many problems have solutions with too many inter-cell data dependencies. In addition, threads provide a wealth of benefits to application developers by allowing multiple unrelated tasks to be performed simultaneously.

There is, and will always be, overhead associated with parallelization. It may sound great to say "oh, we can farm out parts of this data set to other cores!", but that requires a lot of start-up and tear-down synchronization. It's not at all uncommon for overall performance to be improved by doing something *unrelated* at the same time, requiring less synchronization overhead.

Are threads perfect for everything? No. But calling them the second-worst thing to happen to computing is, at best, disingenuous.

Re:2005 Called (0)

TheLazySci-FiAuthor (1089561) | more than 6 years ago | (#21727340)

It's not just about multiple threads, it's about coordination between those threads in a real-time manner.

As you know, multiple threads in a program do not actually execute concurrently - processing is still serial, it's just so fast that threads can appear to execute simultaneously - and it's not just about queuing execution either.

With multiple cores we have the ability to multiply the processing potential, but there is still a coordination issue - what good is it to process x amount of data if you must still wait for the Y data to finish processing before you can use it?

Here's an interesting article [techreport.com] about Valve software's difficulties as they begin to experiment with the multi-core architecture paradigm. Some of their solutions are most interesting, including this little tidbit which has stuck with me: "In addition to failing to scale well, coarse threading also introduced an element of latency. Valve had to enable the networking component of the engine to keep the client and server systems synchronized, even with the single-player game."

It's interesting that they already had a 'solution' in place, but that this solution was created for a different reason yet with a useful outcome.

Re:2005 Called (4, Informative)

caerwyn (38056) | more than 6 years ago | (#21727762)

As you know, multiple threads in a program do not actually execute concurrently - processing is still serial, it's just so fast that threads can appear to execute simultaneously - and it's not just about queuing execution either.

That holds only for multithreaded programming on a single core. As soon as there are multiple cores available, processing does, in fact, happen simultaneously.

Re:2005 Called (1)

mycroft822 (822167) | more than 6 years ago | (#21727496)

....it wants it's article back.
Oh yea? Well the jerk store called, and they want you back!

//just to be clear it's not an insult, just a joke about a Seinfeld episode...

Re:2005 Called (0)

Anonymous Coward | more than 6 years ago | (#21727768)

You mean: "The jerk store called and their running out of you".

Re:2005 Called (1)

chaboud (231590) | more than 6 years ago | (#21728000)

I think he was trying to combine the two.

It would also be "they're" rather than "their."

I'm just sayin'...

Missing the point (0)

Anonymous Coward | more than 6 years ago | (#21727550)

Sure, adding a few threads to a program is typically easy. But making a multi-threaded program which not only maximizes the concurrency available to its design, AND does so without adding any new serious bugs, AND is scalable across a large number of cores is not at all easy. Which is why it's very uncommon in production software today.

What Microsoft, Intel, AMD, and others are trying to do is to make parallelization not merely a language feature as it is today but to make scalable many-core programming something more akin to a compiler flag. In other words, to make parallelization automatic. This is actually a pretty difficult problem and a pretty revolutionary approach, but if it works out it could have enormous benefits.

Re:2005 Called (-1, Offtopic)

Anonymous Coward | more than 6 years ago | (#21727560)

its*

His, hers, its. No apostrophes.

Re:2005 Called (2, Insightful)

MrSteveSD (801820) | more than 6 years ago | (#21727572)

A lot of multi-threading up until now has been about keeping applications responsive, rather than breaking up tasks. That makes sense, since multi-core chips haven't been in most people's homes for long. Another issue is that once you have more than one processor, two threads really can run at the same time, which can expose all kinds of bugs you would never notice on a single-core system. The main problem I can see is with testing for errors. With multiple threads it's up to the OS how it juggles them around, and that juggling may be different for every test run. So you could run the same test a hundred times and then, suddenly, get a failure. Multi-threading throws a certain random aspect into the software which never used to be there.

Re:2005 Called (0)

Anonymous Coward | more than 6 years ago | (#21727618)

It's also not about threads in single apps either, it's about an OS that can swap threads around itself. I was skeptical about the usefulness of multiple cores having heard the "no apps can take advantage of it anyway" mantra repeated time and time again, until I picked up a Mac Pro from a graphic house that was going out of business. I paid $1k thinking I'd get a good dual-dual core, but it was a 3GHz 8 core. Let me tell you when all 8 cores are enabled, there is NOTHING that touches this machine, it can do hundreds of things at once, and the load across the cores gets spread for the most part evenly. Nothing I've done has made it feel unresponsive, or even slower in use, and that includes half a dozen virtual machines, DVD ripping, video re-encoding and 3D rendering all at the same time.

Even running by itself, 12 minute DVD rips are worth it.

The next machine I buy, even if it's full price, will have as many CPUs/Cores as I can afford.

Re:2005 Called (3, Informative)

chaboud (231590) | more than 6 years ago | (#21727936)

Well, 2005 called...

it wants its reply back.

The parent is exactly how I would have replied a couple of years ago. I was doing lots of threading work, and I found it easy to the point of being frustrated with other programmers who weren't thinking about threading all of the time.

I was wrong in two ways:

1. It's not that easy to do threading in the most efficient way possible. There's almost always room for improvement in real-world software.

2. There are plenty of programmers who don't write thread-safe/parallel code well (or at all) that are still quite useful in a product development context. Some haven't bothered to learn and some just don't have the head for it. Both types are still useful for getting your work finished, and, if you're responsible for the architecture, you need to think about presenting threading to them in a way that makes it obvious while protecting the ability to reach in and mess with the internals.

The first point is probably the most important. There are several things that programmers will go through on their way to being decent at parallelization. This is in no strict order and this is definitely not a complete list:

- OpenMP: "Okay, I've put a loop in OpenMP, and it's faster. I'm using multiple processors!!! Oh.. wait, there's more?"
Now, to be fair, OpenMP is enough to catch the low-hanging fruit in a lot of software. It's also really easy to try out on your code (and can be controlled at run-time).

- OpenMP 2: "Wait... why isn't it any faster? Wait.. is it slower?"
Are you locking on some object? Did you kill an in-loop stateful optimization to break out into multiple threads? Are you memory bound? Blowing cache? It's time to crack out VTune/CodeAnalyst.

- Traditional threading constructs (mutices, semaphores): "Hey, sweet. I just lock around this important data and we're threadsafe."
This is also often enough in current software. A critical section (or mutex) protecting some critical data solves the crashing problem, but it injects the lock-contention problem. It can also add the cost of round-tripping to the kernel, thus making some code slower.

- Transactional data structures: "Awesome. I've cracked the concurrency problem completely."
Transactional mechanisms are great, and they solve the larger data problem with the skill and cleanliness of an interlocked pointer exchange. Still, there are some issues. Does the naive approach cleanly handle overlapping threads stomping on each-others' write-changes? If so, does it do it without making life hell for the code changing the data? Does the copy/allocation/write strategy save you enough time through parallelism to make back its overhead?

Should you just go back to a critical section for this code? Should you just go back to OpenMP? Should you just go back to single-threading for this section of code? (not a joke)

Perhaps as processors get faster by core-scaling instead of clock-scaling this will become less of a dilemma, but to say that "[to do multi-threaded programming effectively] is not that difficult" is akin to writing your first ray-tracer and saying that 3D is "not that difficult." Sometimes it is. At least at this point there are places where threading effectively is a delicate dance that not every developer need think about for a team to produce solid multi-threaded software.

That doesn't mean that I object to threading being a more tightly-integrated part of the language, of course.
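The "interlocked pointer exchange" stage above can be sketched with std::atomic (a C++11 facility; all names here are invented for illustration, and old copies are deliberately leaked, because safe reclamation is exactly the hard part the parent alludes to):

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Writers build a private copy of the shared data and publish it with a
// single compare-and-swap; readers just load the current pointer. The
// retry loop handles overlapping writers without losing updates.
struct Config { int value; };

std::atomic<Config*> g_config{new Config{0}};

void update(int delta) {
    Config* expected = g_config.load();
    Config* desired = new Config{expected->value + delta};  // private copy
    while (!g_config.compare_exchange_weak(expected, desired))
        desired->value = expected->value + delta;  // another writer beat us; redo
    // NOTE: the replaced Config is leaked on purpose -- reclaiming it safely
    // while readers may still hold the old pointer is the genuinely hard part.
}

// Hammer the structure from several threads; returns the final value.
int stress(int threads, int iters) {
    std::vector<std::thread> ts;
    for (int i = 0; i < threads; ++i)
        ts.emplace_back([=] { for (int j = 0; j < iters; ++j) update(1); });
    for (auto& t : ts) t.join();
    return g_config.load()->value;
}
```

The copy-then-CAS pattern answers the parent's question about overlapping writers: a losing writer observes the new pointer and rebuilds its change, so no update is silently dropped.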

M$ programmers should be already capable (5, Funny)

scafuz (985517) | more than 6 years ago | (#21727176)

just start a multithread process: 1 core for the program itself, the remaining 7 for the bugs...

Microsoft & 8+ x cores (1)

DrYak (748999) | more than 6 years ago | (#21727964)

For the very first time, CPUs with 8 or more cores will enable Windows users to run an entire botnet spitting out spam... all of it on a single multicore CPU.

Thank you, Microsoft!

hhooppee tthheeyy ffiixx tthhiiss ssoooonn (5, Funny)

Chordonblue (585047) | more than 6 years ago | (#21727190)

II hhaavvee aann XX22 pprrocceessssoor? Ii ccaann ggooeess TTWWIICCEE aass ffaasstt nnooww?

Re:hhooppee tthheeyy ffiixx tthhiiss ssoooonn (3, Interesting)

Nova1313 (630547) | more than 6 years ago | (#21727690)

When the first AMD X2 chips came out, the Linux kernel had issues with the clock on those chips. The clock would run several times (presumably 2 times?) faster than it should; the cores' clocks were not synchronized for some reason, or the kernel would lose track... When you typed a letter it would repeat multiple times, as you described. :)

Re:hhooppee tthheeyy ffiixx tthhiiss ssoooonn (1)

CastrTroy (595695) | more than 6 years ago | (#21727930)

I've had the same problem on single processor machines running Linux for years. I don't know if it's a problem of the keyboard repeat rate being set too low, or something else to that effect, but I notice that a lot of the time on my Linux machines it seems to double/triple type a lot of letters.

Concurrency (0)

Anonymous Coward | more than 6 years ago | (#21727196)

this is my attempt to get a frost post on two forums simultaneously

I'll take parallel computing for $500, Alex

Wow, this is a great idea! (0)

Anonymous Coward | more than 6 years ago | (#21727208)

Now we can build operating systems and applications with 50 million lines of code and not feel bad about it, since we have parallel processing! Great!

What is the law that states that application speed will slow down to the inverse of Moore's Law?

Re:Wow, this is a great idea! (1)

mjorkerina (1158683) | more than 6 years ago | (#21727286)

Wirth's law, even though he's not the one who came up with it.

Re:Wow, this is a great idea! (1)

somersault (912633) | more than 6 years ago | (#21727870)

I've always thought that, didn't realise there was a law for it. People used to optimise everything way back when, but now I suspect that most people just let the faster processor take care of things rather than trying to squeeze every nanosecond of performance out of their apps :( At least graphics are still getting faster just because they're adding more parallel processors to the chips..

OS/2? (4, Interesting)

SCHecklerX (229973) | more than 6 years ago | (#21727218)

I remember learning to write software for OS/2 back in the early 90's. Multi-threaded programming was *the* model there, and had it been more popular, it would be pretty much standard practice today, making scaling to multiple cores pretty effortless, I'd think. It's a shame that the single-threaded model became so ingrained in everything, including linux. For an example that comes to mind, why do I need to wait for my mail program to download all headers from the IMAP server before I can compose a new message on initial startup? Same with a lot of things in firefox.

Does anybody remember DeScribe?

BeOS (0)

Anonymous Coward | more than 6 years ago | (#21727350)

What about pervasive multithreading in the 90s with BeOS? I *actually* learned to write multithreaded applications on an SMP box with BeOS, the real thing :P Too bad nowadays I rarely do multithreading on Windows or Java. Sad times. :-/

Re:OS/2? (0)

Anonymous Coward | more than 6 years ago | (#21727420)

For an example that comes to mind, why do I need to wait for my mail program to download all headers from the IMAP server before I can compose a new message on initial startup?
Apple's Mail doesn't have such problems.

Re:OS/2? (2, Interesting)

shoor (33382) | more than 6 years ago | (#21727686)

I was working at a very small software shop when OS/2 came out. We would get a customer who wanted something to work on an Apollo workstation, another who wanted it for Xenix, a third for Unix BSD 4.2 (my favorite), or Unix System V (ugh!), or DOS. So we got a project to port something to OS/2 version 1.0, and I got it to work, and it used multi-threading, which I thought was pretty cute, and I was proud of myself for figuring it all out just from the manuals. Then the new revision of OS/2 came out and everything I had done was broken. My boss was so mad he swore off OS/2 forever after that.

Thank god (4, Funny)

Fizzl (209397) | more than 6 years ago | (#21727228)

Thank god that Java, C# and other piles of shit I hate do this quite intuitively and easily.
Guess I had it coming.
/me closes his eyes and embraces C++ for the last time before the inevitable doom

Re:Thank god (0)

Anonymous Coward | more than 6 years ago | (#21727334)

You bloaty C++ programmers wasting all that memory and such, you should be writing machine code you wasteful bastard!

Re:Thank god (5, Informative)

zifn4b (1040588) | more than 6 years ago | (#21727570)

The only significant thing that managed languages make easier with regard to multithreading other than a more intuitive API is garbage collection so that you don't have to worry about using reference counting when passing pointers between multiple threads.

All of the same challenges that exist in C/C++ such as deadly embrace and dining philosophers still exist in managed languages and require the developer to be trained in multi-threaded programming.

Some things can be more difficult to implement like semaphores. You also have to be careful about what asynchronous methods and events you invoke because those get queued up on the thread pool and it has a max count.

I would say managed languages are "easier" to use but to be used effectively you still have to understand the fundamental concepts of multithreaded programming and what's going on underneath the hood of your runtime environment.

How many languages have multithread support? (1)

cyfer2000 (548592) | more than 6 years ago | (#21727268)

How many languages have multithread support already?

Java, C#(?), Fortran(?)...

I haven't been programming in those languages for some time, so just curious, and my current major language (Igor pro) will use all the cores automatically, and how many languages do multithread this way? Matlab(?), Octave(?).

Re:How many languages have multithread support? (2, Interesting)

ILongForDarkness (1134931) | more than 6 years ago | (#21727616)

Matlab isn't that smart; you still have to tell it that the for loop is parallelizable, for example. I might be wrong, but I don't think Java or C# do either. Their frameworks/VMs supply APIs to do multi-threading; you simply call into them for the support that you need. C has had pthreads for a long time (since it was standardized?); for some reason the C++ committee never agreed on an implementation.

There is a great talk by Bjarne Stroustrup (http://csclub.uwaterloo.ca/media/C++0x%20-%20An%20Overview.html [uwaterloo.ca]) about the new version of C++ coming out and some of the difficulties getting things added. Essentially, if a new feature will only help 100,000 developers, it isn't important enough to be implemented. With such a huge developer community, all the "little" things get left for non-standard API implementations; only big features that almost everyone will find useful get added. That is probably why this version or the next of C++ will probably get a standard thread library: almost everyone has access to a multicore system. Oh yeah, and it sucks that anyone with a few thousand dollars to waste can get added to the committee, but most people don't care enough to go get their feature implemented for that much money (you also have the travel/time off to attend the meetings), except big business. So guess who runs the show (I don't expect anyone to be surprised).
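For what it's worth, that standardization effort did eventually land: C++11 shipped std::thread. A minimal taste of splitting work across two threads with it (the helper name is the editor's invention):

```cpp
#include <numeric>
#include <thread>
#include <vector>

// Sum the two halves of a vector on two threads, then combine.
// Portable C++ threading with no pthreads or vendor API in sight.
long sum_two_threads(const std::vector<int>& v) {
    const auto mid = v.begin() + v.size() / 2;
    long left = 0;
    std::thread t([&] { left = std::accumulate(v.begin(), mid, 0L); });
    long right = std::accumulate(mid, v.end(), 0L);  // this thread does the rest
    t.join();  // left is safe to read only after the join
    return left + right;
}
```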

Re:How many languages have multithread support? (0)

Anonymous Coward | more than 6 years ago | (#21727864)

Java, C#(?), Fortran(?)...

And C/C++

The dork that wrote this is a sensationalist. He ought to pick up a basic book on pthreads or RPC, or just do a man on mutex. I was writing code like that 10 years ago.

C/C++ have had these for years. He just needs to discover them. And if written in C/C++, you know it will run faster than Java/C#. O'Reilly has a good book on these two.

The basic problem (5, Insightful)

ucblockhead (63650) | more than 6 years ago | (#21727292)

Some algorithms are inherently not amenable to parallelization. If you have eight cores instead of one, then the performance boost you can get can be anywhere from eight times faster to none at all.

So far, multiple cores have boosted performance mostly because the typical user has multiple applications running at a time. But as the number of cores increases, the beneficial effects diminish dramatically.

In addition, most applications these days are not CPU bound. Having eight cores doesn't help you much when three are waiting on socket calls, four are waiting on disk access calls and the last is waiting for the graphics card.
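The "eight times faster to none at all" range is what Amdahl's law quantifies: if a fraction p of the work can be parallelized across n cores, the overall speedup is bounded by 1 / ((1 - p) + p / n). A one-liner to play with:

```cpp
// Amdahl's law: upper bound on speedup when only a fraction p of a
// program's work can run in parallel on n cores.
double amdahl_speedup(double p, int n) {
    return 1.0 / ((1.0 - p) + p / n);
}
```

With p = 0.5 and n = 8 the bound is only about 1.78x, and even with infinitely many cores it never passes 1 / (1 - p) = 2x, which is exactly the parent's point.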

Re:The basic problem (1)

Arakageeta (671142) | more than 6 years ago | (#21727486)

You can actually get more than a 8x speed up by smartly exploiting a shared cache between cores.

Re:The basic problem (1)

ahabswhale (1189519) | more than 6 years ago | (#21727684)

You can actually get more than a 8x speed up by smartly exploiting a shared cache between cores.
Your post is so vague that it is pointless. You might want to be more specific, because otherwise I could easily just say "no you can't" and I'd be just as right.

Re:The basic problem (0)

Anonymous Coward | more than 6 years ago | (#21727794)

if you knew anything about parallel programming, you'd know what he's talking about. more cores doesn't just mean more CPU cycles per time, it means a larger amount of total cache memory in the system. computation is a lot faster when you don't have to wait on memory accesses. so if you're working with a 2 MB dataset on a processor with a 1 MB cache, then you move to 2 processors with 1 MB caches, you can see more than a 2x speedup because now all of the work can be done in cache.

Re:The basic problem (1)

phasm42 (588479) | more than 6 years ago | (#21727838)

Your post is so vague that it is pointless. You might want to be more specific, because otherwise I could easily just say "no you can't" and I'd be just as right.
Grandparent post was specific: shared cache

You don't have to be an ass if you want specifics, you could just ask. In this case, I think GPP is referring to the upfront cost of memory access. Once one core pays that price (think of it as a fixed cost), the other cores will be able to access the memory from the shared cache without having to pay that cost. Thus, theoretically resulting in a speedup greater than the number of cores.

Re:The basic problem (1)

$RANDOMLUSER (804576) | more than 6 years ago | (#21727490)

Some algorithms are inherently not amenable to parallelization.
Are you sure about that? If you put 9 women on the task of making a baby it only takes a month...

Re:The basic problem (0)

Anonymous Coward | more than 6 years ago | (#21727658)

Yeah but task 9 women with getting pregnant and you'll probably have a result faster than if you tasked 1. What's more, if someone takes care of the maintenance payments, I volunteer to err... 'seed' such an experiment.

Re:The basic problem (1)

Armozel (1203632) | more than 6 years ago | (#21727524)

Yes, that's what I've learned in brief with regard to multi-threading in classes I've taken this year (yes, I'm still an undergrad). One way around this may be to look toward OSes handling the CPU resources as one logical processor, but that may not work out so well in the end (in my opinion). In the end, it's just best to figure out which kinds of applications best use multiple processors, and which algorithms can execute in parallel and which cannot. But all this thinking that every algorithm can be made to magically execute in parallel doesn't seem to fit reality, at least for me and my programs.

Re:The basic problem (2, Interesting)

Anonymous Coward | more than 6 years ago | (#21727566)

In addition, most applications these days are not CPU bound. Having eight cores doesn't help you much when three are waiting on socket calls, four are waiting on disk access calls and the last is waiting for the graphics card.
Processors don't "wait" on blocked IO calls. Your program waits while the processor switches to another task. When the processor switches back to your program, it checks to see if the blocked IO call has completed. If it has, it continues executing your program again. If not, your program continues to wait while the processor again switches to other tasks.
So it is you (as the programmer) that determines if your program just sits and waits for blocked IO to complete. Or you could spawn a thread for blocked IO calls so your main program thread continues executing (if it is viable to your situation).

With more processors, your program and its blocked IO calls will be checked more frequently. So even blocked IO calls will see a performance increase.
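The "spawn a thread for blocked IO calls" suggestion looks roughly like this in modern C++ (std::async is a C++11 facility; slow_read is a stand-in invented for illustration):

```cpp
#include <chrono>
#include <future>
#include <thread>

// Pretend blocking call, e.g. a disk or socket read.
int slow_read() {
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
    return 42;
}

// Hand the blocking call to another thread and keep computing meanwhile,
// only blocking at the point where the result is actually needed.
int overlap_io_and_compute() {
    std::future<int> io = std::async(std::launch::async, slow_read);
    int computed = 0;
    for (int i = 1; i <= 1000; ++i) computed += i;  // useful work: sums to 500500
    return io.get() + computed;
}
```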

Re:The basic problem (1)

$RANDOMLUSER (804576) | more than 6 years ago | (#21727700)

With more processors, your program and its blocked IO calls will be checked more frequently. So even blocked IO calls will see a performance increase.
You fail [wikipedia.org] it.

Re:The basic problem (1)

Chris Mattern (191822) | more than 6 years ago | (#21727968)

Processors don't "wait" on blocked IO calls. Your program waits while the processor switches to another task.


That's his *point*, nimrod. Processors don't wait on blocked I/O calls, but processes do. Therefore, having umpteen processors doesn't do you much good if there are no processes ready to run because they're all waiting on something.

Chris Mattern

Re:The basic problem (0)

Anonymous Coward | more than 6 years ago | (#21727596)

"In addition, most applications these days are not CPU bound."

So that's why things bog down at only 55% utilization. Damn, I thought my CPU was too slow.

While that's true (1)

Sycraft-fu (314770) | more than 6 years ago | (#21727884)

The bigger problem is that which the article mentioned: That programmers don't know how to take advantage of the parallel cores. There are two major parts to this:

1) Just because a given algorithm can't be implemented multi-threaded, doesn't mean there isn't another algorithm that does the same thing that can. So part of it is learning new ways of doing old things, or inventing new ways of doing things (we haven't discovered every possible algorithm).

2) Rethinking program design so that even though a given algorithm may be a single thread, many of them can run in parallel. As a simple example say you have a program that processes audio. Rather than having it process one track completely, then move on to the next, then mix them all when it is done you have it process each track in a different thread at the same time, then hand off the mixing to yet another thread (most DAWs work this way).
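That per-track pattern can be sketched as follows (a toy illustration with an invented "gain" effect, not real DAW code):

```cpp
#include <cstddef>
#include <thread>
#include <vector>

using Track = std::vector<float>;

// Process every track on its own thread, then mix only after all workers
// have joined -- the hand-off point described above.
Track process_and_mix(std::vector<Track> tracks) {
    std::vector<std::thread> workers;
    for (auto& t : tracks)
        workers.emplace_back([&t] {
            for (float& s : t) s *= 0.5f;  // toy per-track effect: halve the gain
        });
    for (auto& w : workers) w.join();      // barrier before mixing
    Track mix(tracks.empty() ? 0 : tracks[0].size(), 0.0f);
    for (const auto& t : tracks)
        for (std::size_t i = 0; i < mix.size(); ++i) mix[i] += t[i];
    return mix;
}
```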

Nobody is saying it is easy (or at least nobody who understands it), but that also doesn't mean it is impossible. I fully agree, there are things where each step is dependent on the previous step and there is simply no way to do two steps in parallel. However, I bet those are much rarer than you might first think, especially in the scheme of a whole program and not just a single algorithm in a program.

Thus programmers face the task of learning how to deal with this, both in terms of program design, new algorithms, and hopefully better compilers to help. It seems as though multi-core is the way of the future at least for a while, so just saying "Well we can't make this parallel," may not be an option.

concurrency - the developer's responsibility? (-1, Troll)

decuser (451726) | more than 6 years ago | (#21727322)

What a ridiculous idea. The application developer's free lunch is over, now she needs to think concurrently? Ha, she probably has difficulty with a single thread of thought...

But seriously, isn't the OS responsible for the heavy lifting with regards to task scheduling and concurrency? Oh, wait, this is Microsoft, right? Perhaps this is similar to their take on Security being somebody else's problem.

Re:concurrency - the developer's responsibility? (2, Insightful)

LWATCDR (28044) | more than 6 years ago | (#21727520)

"But seriously, isn't the OS responsible for the heavy lifting with regards to task scheduling and concurrency? Oh, wait, this is Microsoft, right? Perhaps this is similar to their take on Security being somebody else's problem."
Huhhh?
My guess is that you never wrote any code.
Linux doesn't do any more heavy lifting for you here than Windows does, and I doubt OS X does either.
So what are you talking about?
An OS will never figure out which parts of your program need to be in which thread. A compiler MAY someday do it, but compilers are only now doing a good job with vectors.

Re:concurrency - the developer's responsibility? (0)

Anonymous Coward | more than 6 years ago | (#21727542)

Are you trolling or are you really that fucking retarded? Do you think there is a magic concurrency fairy that makes code parallel?

It's up to the programmers (and to a lesser extent, vectorizing compilers) to make their programs take advantage of multiple threads and SIMD architectures. There are plenty of frameworks such as MPI and OpenMP, but at the end of the day parallel code is still an order of magnitude harder to write and debug than its single-threaded cousin. As for C++, there are plenty of cross-platform threading libraries, but the fundamental problems are still there.

Re:concurrency - the developer's responsibility? (1)

meatpan (931043) | more than 6 years ago | (#21727600)

But seriously, isn't the OS responsible for the heavy lifting with regards to task scheduling and concurrency?

Only a surprisingly small group of programmers will be impacted by expanded multi-core architectures. In particular, the impacted devs are authors of the relatively 'low-level' code within compilers, kernels, and interpreters.

Re:concurrency - the developer's responsibility? (1)

s20451 (410424) | more than 6 years ago | (#21727736)

What a ridiculous idea. The application developer's free lunch is over, now she needs to think concurrently? Ha, she probably has difficulty with a single thread of thought...

I think this sentence makes most sense if you imagine it being read in Comic Book Guy voice.

Companies do not want good tools (1)

JackMeyhoff (1070484) | more than 6 years ago | (#21727324)

MSFT had C Omega, what did they do, crapify it into a .Net library to use via C# which is not the best library to use in a parallel world. C Omega was nice and abstracted a lot of the parallelism but no they canned that project. Look at LINQ, they crapified C# with VARIANT types and EXTENSIONS. Totally Crapping on OOP.

I assume this is about client or user software? (0, Troll)

FatSean (18753) | more than 6 years ago | (#21727326)

Because Java application servers have been multi-threaded for a long, long time.

Obviously, I haven't read the article.

Oh, wow (-1, Troll)

nagora (177841) | more than 6 years ago | (#21727354)

A guy who's on the C++ standards committee AND works for Microsoft. He's really going to know what he's talking about, then.

TWW

Re:Oh, wow (5, Insightful)

bladesjester (774793) | more than 6 years ago | (#21727786)

A guy who's on the C++ standards committee AND works for Microsoft.

Actually, according to the latest Dr Dobbs, Herb is the *chair* of the ISO C++ Standards committee. (He had an article on lock hierarchies being used to avoid deadlock)

He's really going to know what he's talking about, then.

As chair of the committee, I'd say there's a pretty fair chance that he *does*.

I really love people who bash things just because Microsoft is involved. Contrary to what seems to be a popular belief here, they have some incredibly intelligent people who are very good at what they do there.

melt in your mouth not in your mobo (0)

Anonymous Coward | more than 6 years ago | (#21727368)

The best quote in the article is the one about how the thermal issue with making single cores faster was the risk of melting. I don't recall that being the case. Anyone care to comment, or to make fun of the journalist for throwing in that line?

Re:melt in your mouth not in your mobo (2, Informative)

$RANDOMLUSER (804576) | more than 6 years ago | (#21727576)

Well then you're not remembering very well. There was some crazy statistic floating around that a Prescott at ~25GHz would put out as much heat per cm^2 as the surface of the sun.

Multiple Applications. (1)

headkase (533448) | more than 6 years ago | (#21727380)

For now the biggest advantage of multiple cores is the ability to run multiple applications, each at full speed. Within a single application the problems get a lot more complex: using current algorithms, many tasks are not easily subdivided. With data that is inherently parallelizable it's pretty easy - each pixel on your display is relatively independent of the others while drawing on a common dataset. But most other areas are not so easy. In general, how do you take an algorithm and divide it so that step A is separate from step B, especially if the input of step B depends on the output of step A? Now that multicores are becoming common, more research will go into fundamentally new approaches to algorithms themselves, but two cores absolutely does not mean a two-times speed improvement - some algorithms simply cannot be divided at our current level of understanding.

Clue by 4? (1)

Nomen Publicus (1150725) | more than 6 years ago | (#21727398)

If your operating system can multi-program, you can have 8 programs running at the same time. No threads needed.

If you think your laptop doesn't need to run 8 programs at the same time you really should look under the hood more frequently :-)

Re:Clue by 4? (1, Informative)

Anonymous Coward | more than 6 years ago | (#21727692)

The problem is, most of those programs running "under the hood" on your laptop are small enough that they would not put even moderate strain on a single CPU.

The user wants the application that he is currently running to perform as well as possible. If that application is single-threaded, it may not be able to perform as well as it needs to appear fast and responsive, and the net result will be that the user perceives the system as slow.

There's only so much an OS can do to "hide" the need for concurrent design and programming.

Is speed still the issue? (1)

heroine (1220) | more than 6 years ago | (#21727412)

Most of the jobs being created are not about achieving maximum speed but about standards compliance. Companies want software that is easy to maintain and portable, but not necessarily the fastest. If it were still 1997 there would probably be ubiquitous implementations using SMP and vectorized assembly language, but that's not the focus anymore.

It's the Curse of the Algorithm (1, Troll)

MOBE2001 (263700) | more than 6 years ago | (#21727430)

The reason that parallel programming is so hard is that we're still using the same computing model that English mathematician Charles Babbage pioneered 150 years ago. It's time to change. To understand the problem, read, Parallel Programming, Math, and the Curse of the Algorithm [blogspot.com].

Re:It's the Curse of the Algorithm (1)

tchuladdiass (174342) | more than 6 years ago | (#21727708)

I've read that, and the ideas it links to. What you are proposing is that everything be converted to a form of multi-branch pipeline programming, correct? That is, think of a standard Unix pipe. Then imagine a process having multiple inputs and outputs, where each of those outputs can be connected to an input on a different process. So once the base modules are done, programming would be a matter of connecting various inputs and outputs, like designing an electronic circuit.

I can see how this would help with multiple cores, but can every construct in computing be represented by these "circuit" modules? Also, is this similar to Hartmann pipelines?

C++? (1)

K. S. Kyosuke (729550) | more than 6 years ago | (#21727434)

...while all the clever folks have already started writing their scalable applications in something reasonable, like Erlang? No offense to anybody using C++, but I think C++ would profit first from some serious weight-reduction dieting before they start trying to develop better concurrency concepts.

(Not that this concerns me too much; I'm going to stay with Common Lisp... yeah, I know, it might suffer from the same issues (sometimes even vague semantics) in some places, but I probably just have that strange, incurable parenthetical personality disorder that suddenly broke out at the beginning of the '80s. ;-))

Re:C++? (1, Informative)

Anonymous Coward | more than 6 years ago | (#21727514)

Almost all the desktop software that matters is written in C++, so obviously the clever minds are not where you think they are. Almost anything that has to do with graphics, like the software from Adobe, is written in C++; almost anything that has to do with video is written in C++; almost anything that has to do with audio and music is written in C++. The creative minds are using C++, while the folks banging their heads at yet another web 2.0 or financial app are using Java, Lisp, or scripting languages.

Re:C++? (1)

K. S. Kyosuke (729550) | more than 6 years ago | (#21727986)

Yes, and almost all of these applications run only on Windows. Therefore, Windows is beyond any doubt the best and most creative OS in the world. :-)

Programmers are at fault (0)

Anonymous Coward | more than 6 years ago | (#21727438)

Programmers can barely tap the performance of "today's" chips, let alone the next generation of dual/quad core processors. There is too much focus on the trivial side of programming (easter eggs, useless 'functions', etc), and not enough on the engineering behind the program itself.

We can hardly blame the programmer, though; there isn't much teaching of the engineering discipline going on in schools these days. It's just about metrics, spewing out code, and whatever toy programming language (Java) happens to be every marketer's favorite buzzword.

All programmers do is "leave the optimizations to the compiler", or "let java take care of it". There is no 'algorithm analysis' or 'parallelism' thought given to the program.

Perhaps a better 'batch' of programmers or software engineers will address this problem.

Re:Programmers are at fault (1)

jedidiah (1196) | more than 6 years ago | (#21727680)

Why should programmers be knocked for only using the tools they have?

It's a lot easier to develop for something if you can actually get your hands on it. When this "nifty but underutilized" sort of hardware gets out there where everyone can use it, perhaps the problem will sort itself out. Not everyone has the resources to test their ideas in this area.

If you're going to knock anyone, knock the professors for ignoring this area of research and not capturing the attention of their students who are now "substandard practitioners".

I'm sure no one will ever read this, but (0)

Anonymous Coward | more than 6 years ago | (#21727476)

Can't MS actually incorporate support for parallel programming in their OS? (I don't mean an application framework, as we have now.) I mean, isn't it the OS's job to interface with the hardware, leaving the application developer free from most of these concerns? Isn't it possible to incorporate something into the OS that will automatically make the best use of multiple CPUs/cores while allowing applications to remain unchanged or almost unchanged?

Wait for the new C++ standard before you switch... (1)

DamnStupidElf (649844) | more than 6 years ago | (#21727480)

There is currently no working concurrency model for standard C++. You want to make an atomic access to an object? Hope and pray that you have bug free system libraries and a compiler that doesn't optimize away your locking wrappers and do inappropriate speculative stores. Apparently the next C++ standard will address it, but it seems rather foolish to start a transition to massively multithreaded code without an actual standard.

Personal computing? (5, Interesting)

Dan East (318230) | more than 6 years ago | (#21727536)

"processors with more than eight cores, possible as soon as 2010 -- will transform the world of personal computing"

Exactly what areas of "personal computing" are requiring this horsepower? The only two that come to mind are games and encoding video. The video encoding part is already covered - that scales nicely to multiple threads, and even free encoders will use the extra cores to their full potential. That leaves gaming, which is basically proprietary. The game engine must be designed so that AI, physics, and other CPU-bound algorithms can be executed in parallel. This has already been addressed.

So this begs the question, exactly how will the average consumer benefit from an OS and software that can make optimum use of multiple cores, when the performance issues users complain about are not even CPU-bound in the first place?

Dan East

speech recognition (1)

schwaang (667808) | more than 6 years ago | (#21727760)

Not that it takes massive (by today's PC standards) compute power to do decent speech recognition, but it's definitely worth dedicating a core or two.

And then with Vista, you might need one or two cores dedicated to handling UAC events ("The user tried to breathe again: Cancel or Allow?").

Re:Personal computing? (2, Insightful)

bogie (31020) | more than 6 years ago | (#21727848)

"So this begs the question, exactly how will average consumer benefit from an OS and software that can make optimum use of multiple cores"

AOL 10.0 will say "You got mail!" .25ms faster.

Re:Personal computing? (1)

Thelasko (1196535) | more than 6 years ago | (#21727924)

Exactly what areas of "personal computing" are requiring this horsepower?
Windows Vista! Sorry, if I didn't say it someone else would have.

Re:Personal computing? (1, Informative)

Anonymous Coward | more than 6 years ago | (#21727994)

So this begs the question, exactly how will average consumer benefit from an OS and software that can make optimum use of multiple cores, when the performance issues users complain about are not even CPU-bound in the first place?

Unfortunately it doesn't beg anything. Begging the question [wikipedia.org] is something completely different.

Diaspora (1)

dino213b (949816) | more than 6 years ago | (#21727648)

I agree with some of the previous posters who have faulted programmers for "the state of today." My feeling is that the divide between knowledge of hardware and knowledge of software is far too wide. In my experience, I have witnessed many programmers who spent more time organizing the readability of their code than analyzing its actual effectiveness: i.e., whitespace use vs. algorithm optimization (be it instruction-level improvement or I/O improvement). The end result: bloaty-pooh.

I feel that by making threading a C++ standard, or at least making the threading model a predominant one, the overall "state of today" will improve in the near future simply because more programmers will be aware of it. Parallel processing really does require some training -- it cannot be adapted to every task.

Take for example a simple thumbnail-generating program (a form of which is used by everyday users). If the program is written in the traditional linear model, it will not take advantage of multiple processors or cores (unless you run multiple instances of it or otherwise manipulate it in an unplanned fashion). However, if the program uses threading, it becomes scalable without requiring any intervention. Knowledge of the hardware - and not simply relying on the compiler to optimize your code - just might help.

HPC (2, Interesting)

ShakaUVM (157947) | more than 6 years ago | (#21727650)

As someone who earned a master's degree in computer science with a focus on high-performance computing and parallel processing, and who has taught the subject: *yes*, it takes some work to wrap one's mind around parallel processing and to write concurrent code correctly. But *no*, it's not really that hard. Once you get used to the idea of alternating computation and communication cycles over a processor geometry, writing parallel code becomes little more difficult than writing serial code.

It's kind of like when people see recursive functions for the first time. If they don't understand the base condition and the inductive step, they can easily fall into infinite loops or write bugs. Parallel code is the same way... just a bit trickier.

Shameless Plug: Qt 4.4 (5, Informative)

scorp1us (235526) | more than 6 years ago | (#21727696)

Full disclosure: I am a Qt developer (user); I do not work for TrollTech.

The new Qt4.4 (due 1Q2008) has QtConcurrent [trolltech.com], a set of classes that make multi-core processing trivial.

From the docs:

The QtConcurrent namespace provides high-level APIs that make it possible to write multi-threaded programs without using low-level threading primitives such as mutexes, read-write locks, wait conditions, or semaphores. Programs written with QtConcurrent automatically adjust the number of threads used according to the number of processor cores available. This means that applications written today will continue to scale when deployed on multi-core systems in the future.

QtConcurrent includes functional programming style APIs for parallel list processing, including a MapReduce and FilterReduce implementation for shared-memory (non-distributed) systems, and classes for managing asynchronous computations in GUI applications:

        * QtConcurrent::map() applies a function to every item in a container, modifying the items in-place.
        * QtConcurrent::mapped() is like map(), except that it returns a new container with the modifications.
        * QtConcurrent::mappedReduced() is like mapped(), except that the modified results are reduced or folded into a single result.
        * QtConcurrent::filter() removes all items from a container based on the result of a filter function.
        * QtConcurrent::filtered() is like filter(), except that it returns a new container with the filtered results.
        * QtConcurrent::filteredReduced() is like filtered(), except that the filtered results are reduced or folded into a single result.
        * QtConcurrent::run() runs a function in another thread.
        * QFuture represents the result of an asynchronous computation.
        * QFutureIterator allows iterating through results available via QFuture.
        * QFutureWatcher allows monitoring a QFuture using signals-and-slots.
        * QFutureSynchronizer is a convenience class that automatically synchronizes several QFutures.
        * QRunnable is an abstract class representing a runnable object.
        * QThreadPool manages a pool of threads that run QRunnable objects.

This makes multi-core programming almost a no-brainer.

Just Maybe... (0)

Anonymous Coward | more than 6 years ago | (#21727720)

Windows will now be able to list the contents of a folder in less than 30 seconds.

Evolution that halted at 4 ghz.... (2, Interesting)

MindPrison (864299) | more than 6 years ago | (#21727754)

It's not easy... especially since things sort of halted at 4 ghz, what on earth am I typing about? Well...picture this...limitations...yes they do exist..and sometimes it's important to think beyond what lies just straight ahead (such as the next cycle speed)...and think into a second...maybe even a 3rd dimmension to expand your communication speed. I have for over 6 years been thinking..of a 3d-dimmension processor that cross communicates over a diagonal matrix instead of the traditional serial and parallel communication model. Imagine this folks...if your code could "walk" across a matrix of 10 x 10 x 10 instead of just 8 x 8 or 64 x 64 if you want...get the picture, no? Imagine that your data could communicate on a 3 dimmensional axis - imagine that you had 10 stacks of cores on top of each other - and instead of just connecting they communication bus to a parallel or a serial model...they could in fact communicate on a diagonel basis... this would make it possible to send commands...data..etc....in a 3d-space rather than just a "queue". This of course...would demand a different "mindset" of coding... everything would have to be written from scratch....though...but the benefits would be tremendeous .....you could 10 fold existing computational speed by increasing the communication across processor-cores...maybe even more! Even by todays technology standards. Ok..ok...sounds far fetched for you doesnt it? Well..get this...this was my invention 6 years ago (maybe even 9 years ago...I am getting older so I dont really care...I do care for freedom of information and sharing...Not so much wealth so listen on)...The theory of what I just wrote here on Slashdot (which has more implication on your life in the future than you will ever be capable of comprehending...yes...I am full of myself aint i....Who cares? You dont know me) .. point is... 
There was once a missing brick to the idea of diagonal cross matrix computing....with yesteryears technology it just would not be feasible to do it... but ...if you have ANY understanding of what I write here (yes...I am not kidding...this may change history as we know it...and I am drunk right now...and I dont want to keep a lid on it anymore)...here we go... Please think about what I just wrote - and - look up frances hellman's lecture upon magnetic materials in semiconductors...and you WILL have your 4-th link in the 3-B-E-C (base, Emitter, Collector) construction...to make the Cross Matrix Processor possible....just understand this....JoOngle invented this...Frances made it possible - YOU read it from a drunk nobody of Slashdot.org....) now...go make it real!

Re:Evolution that halted at 4 ghz.... (4, Informative)

Animats (122034) | more than 6 years ago | (#21727932)

I have for over 6 years been thinking..of a 3d-dimmension processor that cross communicates over a diagonal matrix instead of the traditional serial and parallel communication model.

Six years, and you haven't discovered all the machines built to try that? This was a hot idea in the 1980s. Hypercubes, connection machines, and even perfect shuffle machines work something like that. There's a long history of multidimensional interconnect schemes. Some of them even work.

And so it goes...... (5, Insightful)

Nonillion (266505) | more than 6 years ago | (#21727770)

processors with more than eight cores, possible as soon as 2010 -- will transform the world of personal computing....

Translation:

Code will get even more inefficient / bloated and require faster hardware to do the same thing you are doing now. While I'm all for better / faster computer hardware, most if not all Jane and Joe Sixpack users never need Super Computer power to surf the net, read e-mail and watch videos.

Erlang (4, Informative)

Niten (201835) | more than 6 years ago | (#21727778)

Oddly enough, I just watched a presentation about this very topic, with an emphasis on Erlang [erlang.org]'s model for concurrency. The slides are available here:

http://www.algorithm.com.au/downloads/talks/Concurrency-and-Erlang-LCA2007-andrep.pdf [algorithm.com.au]

The presentation itself (OGG Theora video available here [linux.org.au]) included an interesting quote from Tim Sweeney, creator of the Unreal Engine: "Shared state concurrency is hopelessly intractable."

The point expounded upon in the presentation is that when you have thousands of mutable objects, say in a video game, that are updated many times per second, and each of which touches 5-10 other objects, manual synchronization is hopelessly useless. And if Tim Sweeney thinks it's an intractable problem, what hope is there for us mere mortals?

The rest of this presentation served as an introduction to the Erlang model of concurrency, wherein lightweight threads have no shared state between them. Rather, thread communication is performed by an asynchronous, nothing-shared message passing system. Erlang was created by Ericsson and has been used to create a variety of highly scalable industrial applications, as well as more familiar programs such as the ejabberd Jabber daemon.

This type of concurrency really looks to be the way forward to efficient utilization of multi-core systems, and I encourage everyone to at least play with Erlang a little to gain some perspective on this style of programming.

For a stylish introduction to the language from our Swedish friends, be sure to check out Erlang: The Movie [google.com].

Threads considered harmful (4, Interesting)

richieb (3277) | more than 6 years ago | (#21727798)

Check out this article [oreilly.com] on O'Reilly's site. Threads are actually very low-level constructs (like pointers and manual memory management). Accordingly, the future belongs to languages that eliminate threads as the basis for concurrency. See Erlang and Haskell.

Re:Threads considered harmful (0)

Anonymous Coward | more than 6 years ago | (#21727976)

You don't need yet another language to do that. You can WRAP around the threads to build a lib of higher level constructs.
Just because C++ has manual memory management and pointers doesn't mean you HAVE to use them yourself.
There's plenty of libs providing Reference Counting or classic garbage collectors.
Just build a collection of templates to abstract the threads.

Erlang is slow at anything small-scale, btw. With three or four CPUs running an Erlang program you are barely getting what C++ can do with only one, seriously. Erlang is good at what it was designed for (software that can work around failures better) but not at getting speed from the multi-core CPUs in our desktop towers.

Haskell is fast but is a goddamned pain in the ass to do anything that has to do with GUI and IO because of its functional purity.

I may not like the implications for my code... (1)

AndyCR (1091663) | more than 6 years ago | (#21727834)

But I sure like my newfound ability to compile multiple source files at once and finish a 5-files-changed compile in a few seconds.

This has been coming for a while. (4, Interesting)

jskline (301574) | more than 6 years ago | (#21727940)

The fact is that programming by and large has gotten lazy, shiftless, and sloppy over time, not better or faster. Programmers really did rely on processing and memory architectures getting faster to overcome their coding bottlenecks. The words "optimized code" have little or no significance in today's programming shops because of budgets. Because of the push to get stuff out the door as quickly as possible, corners are cut all over the place.

There once was a time when debugging was part of your job. Now someone else does that, and at most the better coders do some unit testing to ensure their code snippet does what it is supposed to. There generally isn't any "standard" with regard to process, except in some houses that follow *recommended coding guidelines*, and those are few and far between. Old-school coders had a process in mind for a project as a whole and could see the end running program. Many times now, you are asked to code an algorithm with no concept of how it might be used. A lot of strange stuff is going on out there in the business world with this!

If there is a fundamental change in the basis of C++ et al., this could have a detrimental effect on the employment market, as many who cannot conceptualize multi-threading methodologies - much less model some existing process in this paradigm - will leave the market.

I left the programming markets because of the clash of bean counters vs quality, and maybe this will have a telling change in that curve. I always did enjoy some coding over the years and maybe this would make an interesting re-introduction. I have personally not coded in a multi-threading project but have the concepts down. Might be fun!