Smarter Thread Scheduling Improves AMD Bulldozer Performance

Follow Slashdot stories on Twitter

Smarter Thread Scheduling Improves AMD Bulldozer Performance 196

Posted by Soulskill on Friday October 28, 2011 @01:12PM from the almost-up-to-par dept.

crookedvulture writes "The initial reviews of the first Bulldozer-based FX processors have revealed the chips to be notably slower than their Intel counterparts. Part of the reason is the module-based nature of AMD's new architecture, which requires more intelligent thread scheduling to extract optimum performance. This article takes a closer look at how tweaking Windows 7's thread scheduling can improve Bulldozer's performance by 10-20%. As with Intel's Hyper-Threading tech, Bulldozer performs better when resource sharing is kept to a minimum and workloads are spread across multiple modules rather than the multiple cores within them."

This discussion has been archived. No new comments can be posted.

Smarter Thread Scheduling Improves AMD Bulldozer Performance

Load All Comments

Search 196 Comments Log In/Create an Account

Comments Filter:

no one got fired buying intel (Score:2)

by alen ( 225700 ) writes:

that's the truth. unless i can buy an AMD server for a lot cheaper i'm not going to try and take on the risk of performance issues
- Re: (Score:2)
  
  by h4rr4r ( 612664 ) writes:
  
  Depends on what you mean by a lot cheaper. If you need lots of cores but don't need them fast, like for a VM host then AMD servers can be quite a bit cheaper once we are talking about getting 128GB+ of RAM.
  Risk of performance issues makes no sense if you don't know what app you want to run.
- Re:no one got fired buying intel (Score:4, Informative)
  
  by Antisyzygy ( 1495469 ) writes: on Friday October 28, 2011 @01:37PM (#37871088)
  
  AMD servers are way cheaper, and there are no performance issues most admins can't handle. What do you mean by performance? If you mean slower, then yes, but if you mean reliability than they are about the same. Why else do Universities almost exclusively use AMD processors in their clusters for cutting edge research? I can see your point if you are only buying 1-3 servers but you start saving shitloads of money when its a server farm.
  
  Parent Share
  twitter facebook
  - Re: (Score:2)
    
    by KhazadDum ( 790345 ) writes:
    
    Agreed. To further expound upon parent's point, unless you really know your performance needs and requirements, where the initial extra cost of Intel chips is lower than the revenue that is gained with that extra couple percent of performance, then go Intel. Otherwise, it's usually a cost versus preference piss fest. And last I checked in a down economy, cost is king.
  - Re:no one got fired buying intel (Score:5, Interesting)
    
    by Kjella ( 173770 ) writes: on Friday October 28, 2011 @02:38PM (#37871960) Homepage
    
    Well, it doesn't seem to apply when you get up to supercomputing levels at least. I checked the TOP500 list [top500.org] and it's 76% Intel, 13% AMD. As for Bulldozer, it has serious performance/watt issues even though the performance/price ratio isn't all that bad for a server. On the desktop, Intel hasn't even bothered to make a response except to quietly add a 2700K to their pricing table, with the 2600K left untouched. On the business side (where after all margins fund future R&D) then Sandy Bridge's 216mm2 is much smaller than Bulldozer's 315mm2. Intel can produce almost 50% more in the same die area, in practice the yields probably favor Intel more because the risk of critical defects go up with size. Honestly, I don't think Intel has felt less challenged since the AMD K5 days...
    
    Parent Share
    twitter facebook
    - Re: (Score:2)
      
      by Antisyzygy ( 1495469 ) writes:
      
      When you can save 8000 per server then invest it in something else it becomes a different issue. I am not trying to say AMD processors are superior, I am just saying factoring in all costs including power and and the lifespan of the unit, AMD wins a lot of the time. Every computer cluster at every University I have ever had access to used AMD processors (with the exception of some NVidia units), and this was for their CS departments. I suspect part of the issue is its easier to justify power budgets and not
      - Re: (Score:2)
        
        by Antisyzygy ( 1495469 ) writes:
        
        by cost-effective computer I meant cluster! Also, I would like to add I had high hopes for the bulldozer, so it was disappointing it was all marketing hype.
      - Re: (Score:2)
        
        by bill_mcgonigle ( 4333 ) * writes:
        
        Fast memory bus, nothing special needed to use ECC RAM, good work/watt, and low prices all help win AMD for most clusters.
        If you're aiming for a Top-500 slot and you have server money but not real estate money, then Intel is the logical choice.
  - Re: (Score:2)
    
    by blair1q ( 305137 ) writes:
    
    >Why else do Universities almost exclusively use AMD processors in their clusters
    Because when your budget is fixed and N is the number of nodes you can afford and M is the performance per node, and N1*M1 > N2*M2, you buy P1 over P2 even if M1 > N2 in this case because proprietor 1 has a lot of trouble selling its units to individuals and turns to massively discounting its products when sold in bulk to HPC OEMs.
    - Re: (Score:2)
      
      by Antisyzygy ( 1495469 ) writes:
      
      Yes, as I said, its cheaper. I did not mention performance per node but I did in a later post.
  - Re: (Score:2)
    
    by Idbar ( 1034346 ) writes:
    
    Why else do Universities almost exclusively use AMD processors in their clusters for cutting edge research
    [citation needed]
    
    Not that I question your argument, but I want to see you backing up your claims. Last time I checked, that was not the case.
    - Re: (Score:2)
      
      by Antisyzygy ( 1495469 ) writes:
      
      Thats a tough one to find a citation for. Essentially, at the three universities I have worked and two of them I have collaborated with they had between 6/10 and 8/10 clusters running AMD processors. Some had NVidia clusters as well. Some had ones running Intel but they were older systems or were in use by departments other than Math/CS. This evidence may be anecdotal, but two of the universities are larger ones with large research budgets. Exclusive was a bit of a exaggeration in hind sight.
      - Re: (Score:2)
        
        by Idbar ( 1034346 ) writes:
        
        As you said, it's probably a market strategy. Every vendor focuses on certain companies/universities to sell their products. The one I worked for (what I guess is a 2nd-tier ranked university) received a lot of discounts depending on the vendors, and used to get lots of Intel based processor CPUs. Perhaps AMD is targeting more aggressively Tier 1 universities, while others take a wider range. That's why I asked.
        
        Of course many are interested in seeing their products advertised at top universities, while ot
  - Re: (Score:3)
    
    by yuhong ( 1378501 ) writes:
    
    And slower will be I think solved with Interlagos, and Intel will have only Westmere-EX (Xeon E7) to compete since Sandy Bridge-EP is not even released yet. Now compare the already-released pricing [softpedia.com] of Opteron 6200 CPUs with Intel's current Xeon 7500/E7 pricing, and guess what will happen.
- Re:no one got fired buying intel (Score:4, Informative)
  
  by QuantumRiff ( 120817 ) writes: on Friday October 28, 2011 @01:55PM (#37871308)
  
  A dell R815 with 2 twelve-core AMD processors (although they were not bulldozer ones) 256GB of ram, and a pair of hard drives was $8k cheaper than a similarly configured Dell R810 with 2 10-core Intel Processors when we ordered a few weeks ago. That difference in price is enough to buy a nice Fusion-IO drive, which will make much, much more of a performance impact than a small percentage higher CPU speed
  
  Parent Share
  twitter facebook
  - Re: (Score:2)
    
    by 0123456 ( 636235 ) writes:
    
    Clearly AMD should be charging $4k more for their CPUs if they're leaving that big a gap between their price and Intel's.
    - Re: (Score:2)
      
      by Surt ( 22457 ) writes:
      
      They're fighting reputation. If it was $4k more, they would probably lose too many sales to make up the price difference.
      - Re: (Score:2)
        
        by Zorpheus ( 857617 ) writes:
        
        Maybe their reputation would be better if their processors would cost the same.
        Some people just think that something must be worse when it is cheaper.
        
        Re: (Score:2)
        
        by billcopc ( 196330 ) writes:
        
        Those kinds of people are very vulnerable to an optimistic young techie destroying their rep as a purchaser, or so my last two years of sales would suggest. I displaced someone who would only buy "the best", which in his view meant something 5x more expensive, and where every tech dispatch was accompanied by a sales guy, to work the purchaser while techie was busy installing the goods.
        If AMD can deliver better performance per $ and per watt in the server room, I'll consider them, and so will my clients if
    - Re: (Score:2)
      
      by yuhong ( 1378501 ) writes:
      
      And AMD has also be trotting the death of the 4P tax on their blogs in the Opteron 6100 era, and there is no indication they will going to change that with Opteron 6200 anyway [softpedia.com].
  - Re: (Score:2)
    
    by nabsltd ( 1313397 ) writes:
    
    A dell R815 with 2 twelve-core AMD processors (although they were not bulldozer ones) 256GB of ram, and a pair of hard drives was $8k cheaper than a similarly configured Dell R810 with 2 10-core Intel Processors when we ordered a few weeks ago.
    The Westmere-EX CPUs on the Dell R810 are recently released, and as such are very pricey. They are also much, much faster than any other Intel or AMD chip on a per-clock basis. Because the E7-88xx Xeons have nearly twice the cache (30MB "smart" vs. 24MB total L2 plus L3), are hyper-threaded, and run faster clock-for-clock, a heavily parallel task will likely finish faster on a single CPU Westmere-EX than on a dual CPU Magny-Cours.
    Because of this, the R810 is a much, much more powerful system than the R815
    - Re: (Score:2)
      
      by yuhong ( 1378501 ) writes:
      
      Yes, but I have wondered for a while what will happen to the quad-socket market if AMD sticks to the same pricing policy with Interlagos. Remember that Intel is one generation behind with Westmere-EX, and Sandy Bridge-EP is not even released yet right now.
    - Re: (Score:2)
      
      by yuhong ( 1378501 ) writes:
      
      And remember that Interlagos will be drop-in replacement for Magny-Cours.
  - Re: (Score:2)
    
    by afidel ( 530433 ) writes:
    
    Apples to apples they cost difference between an R810 and R815 should be on the order of $200, not $8,000.
  - Re: (Score:2)
    
    by the linux geek ( 799780 ) writes:
    
    A 10-core Westmere-EX vs a 12-core Magny-Cours is much more than a "small percentage higher" - probably 30 or 40%, potentially higher depending on workload.
  - Re: (Score:2)
    
    by blair1q ( 305137 ) writes:
    
    I highly doubt the price difference was because of the processors. More likely it was because Dell is having trouble moving those boxes because they're slower.
Weird (Score:2)

by 0123456 ( 636235 ) writes:

Perhaps I'm remembering incorrectly, but I thought part of the Bulldozer hype was that it had two 'real' cores and not hyperthreading, with only a few resources shared? Yet now it turns out that you have to treat it like a hyperthreading CPU or performance sucks.
I still don't understand why AMD didn't just set the hyperthreading bit in the CPU flags, so Windows would presumably just treat it like a hyperthreading CPU in the first place.
- Re: (Score:3)
  
  by Sloppy ( 14984 ) writes:
  
  I thought part of the Bulldozer hype was that it had two 'real' cores and not hyperthreading,
  
  No, the hype is that it blurs the distinction between cores and hyperthreading. It's both and neither.
- Re: (Score:3)
  
  by laffer1 ( 701823 ) writes:
  
  It's not like hyper threading. For integer operations, the AMD chips are much better. What AMD doesn't have is two floating point units so that's what gets bogged down. There are two instruction decoders and two units to handle integer math, but one floating point unit per component.
  AMD's approach is faster for some workloads. The problem is that they didn't design it around how most people currently write software.
  I would have preferred AMD to implement hyper threading as it would have greatly simplifi
  - Re: (Score:3)
    
    by 0123456 ( 636235 ) writes:
    
    It's not like hyper threading. For integer operations, the AMD chips are much better. What AMD doesn't have is two floating point units so that's what gets bogged down. There are two instruction decoders and two units to handle integer math, but one floating point unit per component.
    Ah, so this benchmark is floating point and that's why it's faster across multiple cores?
    I can't really see AMD convincing Microsoft to invest a lot of effort into dynamically tracking which threads use floating point and which don't and reassigning them appropriately. Maybe a flag on the thread to say whether it's using floating point or not at creation time would be viable, but then app developers won't bother to set it.
  - Re: (Score:2)
    
    by fuzzyfuzzyfungus ( 1223518 ) writes:
    
    To me, Bulldozer's shared-FPU design looks rather like they wanted some of the specialized-workload advantage of the UltraSPARC T-series CPUs; but with somewhat less extreme trade-offs(The T1 had a single FPU shared between 8 physical cores, which proved to be a little too extreme and was beefed up in the T2). There are a fair number of server tasks that are FPU light; but have lots of threads, often do well with a lot of RAM, and are fairly cost sensitive.
    
    Not at all a good recipe for a workstation or sc
    - Re: (Score:3)
      
      by DamonHD ( 794830 ) writes:
      
      A T1 is still working well for me: at most about 1 thread on my entire Web server system is doing any FP at all, and in places I switched to some light-weight integer fixed-point calcs instead. That now serves me well with the came code running on a soft-float (ie no FP h/w) ARMv5.
      So, for applications where integer performance and threading is far more important than FP, maybe AMD (and Sun) made the right decision...
      Rgds
      Damon
  - Re: (Score:2)
    
    by washu_k ( 1628007 ) writes:
    
    It's not like hyper threading. For integer operations, the AMD chips are much better. What AMD doesn't have is two floating point units so that's what gets bogged down. There are two instruction decoders and two units to handle integer math, but one floating point unit per component.
    It's a lot closer to hyper-threading than you think. The BD chips do *NOT* have two instruction decoders per module, just one. The only duplicated parts are the integer execution units and the L1 Data caches. The Instruction fetch+decode, L1 Instruction Cache, Branch prediction, FPU, L2 Cache and Bus interface are all shared.
    
    It's pretty obvious how limited each BD "core" really is given these benchmarks. AMD should have presented the CPU as having hyper-threading to the OS.
    - Re: (Score:2)
      
      by laffer1 ( 701823 ) writes:
      
      The earlier article i read must have been way off then.
      Here's a set of graphics displaying the actual architecture.
      http://www.anandtech.com/Gallery/Album/754#7 [anandtech.com]
  - Re: (Score:2)
    
    by Chris Burke ( 6130 ) writes:
    
    It's not like hyper threading. For integer operations, the AMD chips are much better. What AMD doesn't have is two floating point units so that's what gets bogged down. There are two instruction decoders and two units to handle integer math, but one floating point unit per component.
    The decoders are a shared resource in the Bulldozer core. That can be a significant bottleneck that affects integer code. Also, those integer sub-cores are still sharing a single interface to the L2 and higher up the memory hierarchy. So it's not all roses for integer apps.
    Speaking of memory hierarchy, the FX parts are, like FX parts of the past, just server chips slapped into a consumer package. So the cores being studied here all have pretty substantial L3s. One of the claimed benefits of putting rel
  - stave me (Score:2)
    
    by epine ( 68316 ) writes:
    
    I would have preferred AMD to implement hyper threading as it would have greatly simplified things for OS developers. It's getting to a point where kernels have to know about CPU families in order to get the performance they need. They also have to know the workload.
    This an architecture designed for a ten year run, much like the original P6, which underwhelmed everyone with (at most) half a brain.
    Just how long do you think the OS can remain task agnostic as we head down the road to eight and sixteen core pr
    - Re: (Score:2)
      
      by laffer1 ( 701823 ) writes:
      
      And how would the OS know the workload ahead of time? It's not like there are hints in the binary that it's going to be doing floating point work or that it's going to be CPU bound.
      Remember that the more complex we make scheduling, the slower it is. Schedulers have to be fast. There's only so much the OS can do to help out. As a programmer, we're taught that the hardware is a black box. We're supposed to assume it works correctly most of the time. There's a big difference between seeing a hyper thread
- - Re: (Score:2)
    
    by 0123456 ( 636235 ) writes:
    
    Lying to the OS for short term gain means long term pain.
    Shipping hardware whose performance sucks on real workloads and expecting the OS developers to fix your problem causes short-term pain that leads to long-term pain as your sales drop through the floor.
"10% to 20%" boost is just overclocking processor (Score:2)

by Skarecrow77 ( 1714214 ) writes:

The article basically says "if your schedule threads to use less modules, dynamic turbo will clock those modules up, giving you a performance boost.
so... anybody who is already clocking their entire cpu at top stable clock speed isn't going to get a boost out of thread scheduler modifications.
- Re: (Score:2)
  
  by Skarecrow77 ( 1714214 ) writes:
  
  I take it back. apparently that's what page 1 says. There is a page 2 and it says something else entirely.
But does it actually make a difference? (Score:3)

by robot256 ( 1635039 ) writes: on Friday October 28, 2011 @01:34PM (#37871058)

Sure, the scheduling change improves performance by 10-20% for certain tasks, but that still makes it 30-50% slower than an i7, and with more power consumption.
I can't fault AMD for not having full third-party support for their custom features, since Intel had a head-start with hyperthreading, but if it will still be an inferior product even after support is added then I'm not going to buy it.

Share
twitter facebook
- Re: (Score:2)
  
  by h4rr4r ( 612664 ) writes:
  
  30% slower at what percentage of the cost?
  If it costs 50% as much as an i7 that might then be fine.
  - Re: (Score:2)
    
    by AdamJS ( 2466928 ) writes:
    
    They generally cost between 8% less and 20% MORE than their closest performance equivalents (hard to use that word since the gap is still pretty noticeable). That's sort of part of the problem.
    - Re: (Score:2)
      
      by HarrySquatter ( 1698416 ) writes:
      
      That and the fact that they are power hogs compared to even the higher end sandy bridge and phenom ii processors
  - Re: (Score:3)
    
    by HarrySquatter ( 1698416 ) writes:
    
    An i2600k is only 15% more expensive has a 25% lower tdp and blows away the fx-8150 in most of the benchmarks. Even with this tweak it'll still barely compete and the 2600k has half as many real cores and a lower clock speed.
It's a Windows limitation (Score:4, Informative)

by Animats ( 122034 ) writes: on Friday October 28, 2011 @02:00PM (#37871408) Homepage

This is really more of an OS-level problem. CPU scheduling on multiprocessors needs some awareness of the costs of an interprocessor context switch. In general, it's faster to restart a thread on the same processor it previously ran on, because the caches will have the data that thread needs. If the thread has lost control for a while, though, it doesn't matter. This is a standard topic in operating system courses. An informal discussion of how Windows 7 does it [blogspot.com] is useful.
Windows 7 generally prefers to run a thread on the same CPU it previously ran on. But if you have a lot of threads that are frequently blocking, you may get excessive inter-CPU switching.
On top of this, the Bulldozer CPU adjusts the CPU clock rate to control power consumption and heat dissipation. If some cores can be stopped, the others can go slightly faster. This improves performance for sequential programs, but complicates scheduling.
Manually setting processor affinity is a workaround, not a fix.

Share
twitter facebook
Windows? (Score:2)

by turgid ( 580780 ) writes:

Windows is not exactly known for its multi-processor (multi-core) scalability.
Repeat the test with a real OS (Linux, Solaris...) and I'll be interested, especially Solaris x86 since it is known to be the best at scaling on parallel hardware.
It was already beating all intel in highly threade (Score:5, Interesting)

by unity100 ( 970058 ) writes: on Friday October 28, 2011 @02:10PM (#37871612) Homepage Journal

applications, like photosop cs5 or truecrypt, including some more :

http://www.overclock.net/amd-cpus/1141562-practical-bulldozer-apps.html [overclock.net]

also, if you set your cpuid to genuineintel in some of the benchmark programs, you will get suprising results :
try changing cpuid=genuineintel for +47% INCREASE IN SCORES.

changing cpuid to GenuineIntel nets 47.4% increase in performance:
[url]http://www.osnews.com/story/22683/Intel_Forced_to_Remove_quot_Cripple_AMD_quot_Function_from_Compiler_[/url]

PCMark/Futuremark rigged bentmark to favor intel:
[url]http://www.amdzone.com/phpbb3/viewtopic.php?f=52&t=135382#p139712[/url] [url]http://arstechnica.com/hardware/reviews/2008/07/atom-nano-review.ars/6[/url]

intel cheating at 3DMark vantage via driver: [url]http://techreport.com/articles.x/17732/2[/url]

relying on bentmarks to "measure performance" is a fool's errand. dont go there.

Share
twitter facebook
- Re: (Score:3)
  
  by yuhong ( 1378501 ) writes:
  
  It is time for some reverse engineering of the benchmark programs I think to see what exactly is happening.
  - No need, everyone knows... (Score:2, Informative)
    
    by Anonymous Coward writes:
    
    Here's Agner Fog's page about this issue. [agner.org]
    The Intel compiler (for many years and many versions) has generated multiple code paths for different instruction sets. Using the lame excuse that they don't trust other vendors to implement the instruction set correctly, the generated executables detect the "GenuineIntel" CPU vendor string and deliberately cripple your program's performance by not running the fastest codepaths unless your CPU was made by Intel. So e.g. if you have an SSE4-capable AMD CPU, it will
    - Re: (Score:2)
      
      by yuhong ( 1378501 ) writes:
      
      As a note, I remember seeing one of Intel's libraries used in MSHTML.DLL in MS's own IE9 when I was disassembling it with IDA.
- Re: (Score:2)
  
  by blair1q ( 305137 ) writes:
  
  >relying on bentmarks to "measure performance" is a fool's errand. dont go there.
  And yet, that's what you're doing.
  The correct phrase is: Relying on benchmarks that are not relevant to your application is a fool's errand.
- Re: (Score:2)
  
  by makomk ( 752139 ) writes:
  
  One fun side note: notice how that link says "it will fail to recognize future Intel processors with a family number different from 6". Intel have conspicuously kept the family number reported by CPUID at 6 on their new processors in order not to trigger a fallback to the non-Intel pathway that AMD processors get to use, presumably because they know how much that'll harm them in benchmarks and how bad the reviews will look.
- Re: (Score:2)
  
  by ackthpt ( 218170 ) writes:
  
  So basically they suck. I shouldn't need to tweak my os thread scheduler just so a cpu can suck less. AMD needs to fix their shit instead of lame excuses.
  It's good for a low end multi core, but after a lot of research I've decided to go with the proven Phenom II processor.
- Re: (Score:3)
  
  by h4rr4r ( 612664 ) writes:
  
  So then SSDs suck because you have to tweak the IO scheduler(elevator)?
  - Re: (Score:3)
    
    by fuzzyfuzzyfungus ( 1223518 ) writes:
    
    So then SSDs suck because you have to tweak the IO scheduler(elevator)?
    How can you even Dream of trusting any drive that isn't good enough for solid, proven, CHS addressing?
  - Re: (Score:2)
    
    by billcopc ( 196330 ) writes:
    
    There is a difference between a CPU upgrade and an SSD, which is not a hard drive at all and thus exhibits completely different performance characteristics. SSDs are a radical departure from the norm. A multi-core CPU is not.
    I don't claim to know how CPU design works, but surely they must have ways to study or simulate real-world performance before the product is finalized and placed on shipping pallets. Windows' scheduler "sucks" ? Funny, it works fine with all the other Intel and AMD systems, even chu
  - - Re: (Score:2)
      
      by h4rr4r ( 612664 ) writes:
      
      Tuning is a normal part of setting up a machine. If you don't want to do any tuning Dell will be happy to do it for you.
      The Phenom 2 is probably what you should then buy.
      - Re: (Score:2)
        
        by HarrySquatter ( 1698416 ) writes:
        
        Tuning the thread scheduler is not normal for 99% of users. This is a lame excuse by amd for a cpu core that will be megafail. Ivy bridge will make it look even more pathetic.
        
        Re: (Score:2)
        
        by h4rr4r ( 612664 ) writes:
        
        Users don't buy CPUs, the system builder will do this for you.
        This is a pretty bad release out of AMD, lets hope they survive it.
    - Re:So basically... (Score:4, Funny)
      
      by Runaway1956 ( 1322357 ) writes: on Friday October 28, 2011 @02:19PM (#37871738) Homepage Journal
      
      "User". That summarizes half of the nonsense being posted here. This is a techie forum, isn't it? Techies tweak when no tweaking is needed. If you're a "user", then you're not even authorized to be in a server room. GTFO a STAY OUT!
      (listens for door slamming as the dweeb runs out)
      I just hate it when children blurt out their juvenile bullshit, interrupting the adults. Happens all the time . . .
      
      Parent Share
      twitter facebook
    - Re: (Score:3)
      
      by DeadCatX2 ( 950953 ) writes:
      
      Uh...what? Users don't have to do anything to the scheduler. That's the responsibility of the operating system. A Service Pack will be released and you won't have to do shit, so your argument is moot.
      Besides, if your argument is "We shouldn't have to optimize schedulers", then you're a little late, because schedulers are most definitely optimized for their associated hardware
    - - Re: (Score:3, Insightful)
        
        by turgid ( 580780 ) writes:
        
        Unfortunately, the Wintel world has thrived on this philosophy for 20 years.
- Re: (Score:2)
  
  by X0563511 ( 793323 ) writes:
  
  Because "Yea! Fuck progress!" - is that what I'm hearing?
  - Re: (Score:2)
    
    by HarrySquatter ( 1698416 ) writes:
    
    Slower performance and higher tdp equals progress?
    - - Re: (Score:3)
        
        by 0123456 ( 636235 ) writes:
        
        After 6 months to 1 year, the process will be significantly more mature and the Bulldozer chips will be serious contenders to Intel offerings.
        AMD just have to survive six months to a year of selling poorly-performing CPUs that have twice as many transistors as the competition.
      - Re: (Score:2)
        
        by billcopc ( 196330 ) writes:
        
        Funny, I don't see it that way at all.
        I think AMD enjoyed runaway success because of the P4, which was a very vulnerable platform for countless reasons. Poor IPC, awful thermals, and absurdly high prices. This gave AMD a giant gaping opportunity to dominate with their not-so-shitty AMD64. Then they released the dual-core, another great hit. They enjoyed nearly 4 years without any serious competition from Intel, but the moment Core 2 landed, it trounced AMD64 across the board, and came at a very reasonab
        
        Re: (Score:2)
        
        by makomk ( 752139 ) writes:
        
        This gave AMD a giant gaping opportunity to dominate with their not-so-shitty AMD64. Then they released the dual-core, another great hit. They enjoyed nearly 4 years without any serious competition from Intel
        Despite how much better AMD's desktop chips were than Intel's, sometimes they literally couldn't give them away. Intel were threatening to cut off OEM's supplies of laptop chips if they sold AMD processors on the desktop (AMD weren't so competitive for laptops), set up deals where buying less Intel chips would mean paying more money for them (AMD didn't have the capacity to provide all the big OEM's entire supply of processors - they were and are a lot smaller than Intel, and new fabs just took too long to
- Re: (Score:2)
  
  by Sloppy ( 14984 ) writes:
  
  So basically they suck. I shouldn't need to tweak my os thread scheduler just so a cpu can suck less.
  You must think the i3 and i7 suck too, then, since they have hyperthreading in addition to their multiple cores, and definitely benefit schedulers being HT-aware. Actually, you probably think all multicore CPUs and SMP motherboards suck, since before those were widely available, the kernels in use at the time didn't know how to use more than one CPU.
  AMD needs to fix their shit instead of lame excuses.
  Can't
  - Re: (Score:2)
    
    by HarrySquatter ( 1698416 ) writes:
    
    But does the end user have to do esoteric tweaks themselves for an Intel processor with hyperthreading? Nope.
    - Re: (Score:2)
      
      by h4rr4r ( 612664 ) writes:
      
      The system builder did when they first came out.
      The user, buys his machines off the shelf at Bestbuy.
      - Re: (Score:2)
        
        by afidel ( 530433 ) writes:
        
        Actually for first generation HT if you cared about performance you turned it off in the BIOS, it wasn't until Nehalem that HT actually added to performance in the majority of situations and that was mostly from a combination of better HT aware schedulers and actually better chip design.
      - Re: (Score:2)
        
        by h4rr4r ( 612664 ) writes:
        
        Those CPUs existed before Vista.
        
        Re: (Score:2)
        
        by 0123456 ( 636235 ) writes:
        
        The oriignal hyperthreading P4s were pretty much irrelevant because they were single core; the OS either scheduled one thread or two based on whether hyperhreading was enabled in the BIOS, and there was nothing more complex required than that.
        
        Re: (Score:2)
        
        by billcopc ( 196330 ) writes:
        
        You're right, and yet HT processors still offered repeatable performance gains in real-world usage, even under Windows XP. HT-aware scheduling improved the margin somewhat, and narrowed the worst-case losses, but by and large Prescott showed a measurable improvement from day one. HT takes existing code and finds idle "holes" to sneak in another thread's instructions, improving performance with existing software.
        Bulldozer just adds a bunch more physical cores, each one of them running slower than before, a
    - Re: (Score:2)
      
      by Surt ( 22457 ) writes:
      
      Because intel has the leverage to get those tweaks into windows.
      - Re:So basically... (Score:4, Informative)
        
        by washu_k ( 1628007 ) writes: on Friday October 28, 2011 @02:24PM (#37871790)
        
        No, It's because AMD is lying to the OS. The "8 core" BD is not really 8, core. It only has 4 cores with some duplicated integer resources. Basically a better version of hyper-threading, but not a proper 8 core design.
        
        The problem is that the BD says to Windows "I have 8 cores" and thus Windows schedules assuming that is true. If BD said "I have 4 cores with 8 threads" then Windows would schedule it just like it does with Intel CPUs and performance would improve just like in the FA.
        
        There shouldn't need to be any OS level tweaks because Windows already knows how to schedule for hyper-threading optimally. If BD reported it's true core count properly then no OS level changes would be needed.
        
        Parent Share
        twitter facebook
        
        Re:So basically... (Score:4, Interesting)
        
        by Kjella ( 173770 ) writes: on Friday October 28, 2011 @02:54PM (#37872148) Homepage
        
        There shouldn't need to be any OS level tweaks because Windows already knows how to schedule for hyper-threading optimally. If BD reported it's true core count properly then no OS level changes would be needed.
        Except that hyperthreading quite obviously has one fast thread and one slow thread filling the gaps. In AMDs solution both cores in a module are equal, but they share some resources. To use a car analogy the Intel solution is a one-lane road with pullouts where the hyperthread sneaks from one pullout to the other while there's no traffic while the AMD solution is a two-lane road with one lane chokepoints. Both sorta allow cars to travel simultaneously, but I don't think the optimization would be the same.
        
        Parent Share
        twitter facebook
        
        Re: (Score:3)
        
        by washu_k ( 1628007 ) writes:
        
        No, that is not correct. Hyper-threading gives each thread the same amount of resources, assuming they can use them equally. The only difference between hyper-threading and a BD module is that the BD module has a dedicated integer execution unit and L1 D cache for each thread. Everything else is shared just like in Intel cores. It is simply a better hyper-threading, not real cores.
        
        Re: (Score:3)
        
        by beelsebob ( 529313 ) writes:
        
        This is actually exactly what you wouldn't want in a design –when you're designing a threading model, whether at the application level, the OS level or the CPU level, you absolutely do not want thread starvation. Designing it in is just dumb, hence why intel didn't.
        
        Re: (Score:3)
        
        by makomk ( 752139 ) writes:
        
        Except that's not quite right either, because classic hyperthreading only gets about 10-20% improvements at most from using two threads rather than one, whereas Bulldozer appears to be closer to 80-90% even for stuff that makes heavy use of the shared resources.
    - Re: (Score:3, Interesting)
      
      by Anonymous Coward writes:
      
      You did when it was initially launched. Windows 2000's scheduler does not cope well with hyperthreading /at all/ by default. You saw similar things when dual core CPUs were launched. Now hyper threading and multicore are standard and OSs are aware of these cases.
      It's already been pointed out that windows 8's scheduler is bulldozer aware and performs much better than windows 7. I would not be surprised to see a patch from Microsoft that specifically addresses scheduler performance improvements for bulldozer
    - Re: (Score:2)
      
      by DeadCatX2 ( 950953 ) writes:
      
      Will the end user have to do esoteric tweaks after the next Service Pack for Windows? Nope.
      - Re: (Score:2)
        
        by Chris Burke ( 6130 ) writes:
        
        Will the end user have to do esoteric tweaks after the next Service Pack for Windows? Nope.
        Maybe they're saying that running Windows Update is an esoteric tweak?
        I guess they should pay the teenager next door to do it for them, and then clear off all the spyware they have from running an unpatched OS.
- Re: (Score:2)
  
  by dpilot ( 134227 ) writes:
  
  No, what it means is that the software hasn't caught up to the hardware, yet. Until compilers and kernels/schedulers have time to react to Booledozer, we won't see what it's truly capable of. Since you're not interested in tracking such stuff, buy something more mainstream.
  The interesting thing here is the lame excuses. Not that long ago, Intel managed to (nearly) simultaneously introduce both NetBurst and Itanium. AMD never would have survived such a debacle - there's serious question about whether the
- Re:So basically... (Score:4, Interesting)
  
  by fuzzyfuzzyfungus ( 1223518 ) writes: on Friday October 28, 2011 @02:05PM (#37871478) Journal
  
  So basically they suck. I shouldn't need to tweak my os thread scheduler just so a cpu can suck less. AMD needs to fix their shit instead of lame excuses.
  I've got some very bad news for you: While I have no particular knowledge of, or interest in, today's architecture pissing match, the days when the OS was allowed to ignore architectural details and expect things to just work optimally are good and over(if they ever existed in the first place).
  
  Dynamic processor clocks? Why should I have to deal with some performance governor shit when Intel can just make a CPU that either uses almost no power at 3GHz or runs like a bat out of hell at 800MHz? Oh, because they actually can't. Sorry. Multiple cores? WTF? Why do they expect me to program in parallel for 2 3GHz cores instead of just giving me a 6GHz core? Oh, because they actually can't. Sorry. NUMA? Memory access times already blow! Now you want to make them unpredictable? Well, we can either repeal the speed of light and restrict every system to a single memory controller or deal with nonuniform access times and cry into our 128GB of RAM... The list just goes on. Hyperthreading can provide anything from less than zero improvement, if it increases contention for resources that were already being fully used, to fairly substantial improvement, if the CPU was being starved at times under a single thread. Now the Bulldozer cores have implemented something between full multi-core(with 100% duplication of resources per core) and hyperthreading(with virtually zero additional resources for the HT 'core'). Shockingly, performance depends on whether the two semi-independent cores are stepping on one another's shared toes or not...
  
  Even if, in this specific instance, AMD happens to have fucked up and made the wrong architectural choice, that doesn't change the fact that you can't escape architectural oddities unless you are willing to stay quite far from the forefront of performance, or deal with some sort of hardware/firmware abstraction layer that ends up being at least as complex as the OS-level hackery would have been, but more likely to be vendor specific and have its cost spread across far fewer units. It certainly isn't the case that all architectural deviations are good, some are ghastly hacks best forgotten, some are perfectly OK ideas dragged down by products that overall aren't much good; but the path of progress has been liberally sprinkled with oddities that have to be accounted for somewhere in the overall stack.
  
  Parent Share
  twitter facebook
  - Re: (Score:2)
    
    by billcopc ( 196330 ) writes:
    
    I agree with about 75% of your post. AMD fucked up, because current operating systems already know how to handle hyper-threading, of which Bulldozer is basically a reimagining. If their CPU reported as 4 cores with 8 threads, chances are most schedulers would treat it like an Intel HT and not overburden it with uncooperative threads.
    AMD fucked up, because they either didn't know this would happen (unlikely), or pretended we wouldn't notice. Now that the reviews are out, their PR team is spinning the blam
- Re: (Score:2)
  
  by DarkOx ( 621550 ) writes:
  
  Yea, its not like its the operating systems job to abstract the hardware, and coordinate resource sharing.
- Re: (Score:2)
  
  by beelsebob ( 529313 ) writes:
  
  It's not only that – they tweaked the scheduler to make one very specific benchmark perform well. Now run a different benchmark, I bet this will degrade performance.
  Not only that, but I bet we could play the same trick on $intel_chip with enough fiddling with settings.
- - Re: (Score:2)
    
    by 0123456 ( 636235 ) writes:
    
    And I have to wonder if most of the performance gains will be made by essentially doing the -same things- (such as not putting two high loads on the same core when other cores are idle).
    From the article it would appear that in other cases you'll reduce performance because that will disable 'turbo' overclocking. But the whole thing just seems too complex to optimise for because of all the special cases (e.g. don't put two integer threads on different cores, don't put two floating point threads on the same core), so that may be the best compromise.
- Re: (Score:2)
  
  by laffer1 ( 701823 ) writes:
  
  You mean integer based instructions. Floating point is still not as good with the AMD chips (unless using the new instructions)
  - Re: (Score:2)
    
    by EdZ ( 755139 ) writes:
    
    Worse, what this shows is that AMD's idea that you only need one FPU for every two integer units (how Bulldozer is laid out) results in a 20% performance drop.
    - Re: (Score:3)
      
      by makomk ( 752139 ) writes:
      
      In theory it actually has the equivalent of an 128-bit wide FPU for every integer unit. Though I hear rumours that they may have not put as much effort into making the classic x87 FPU instructions run fast and that harmed them in some of the non-SSE-supporting benchmarks that a lot of the reviews used.
- Re: (Score:2)
  
  by beelsebob ( 529313 ) writes:
  
  The idiocy here is that they've not succeeded in making bulldozer faster, they've succeeded in making one very specific benchmark run faster with very specific scheduler settings for that exact one benchmark. Give it some different code to run and this'll degrate performance.
- Re: (Score:2)
  
  by HarrySquatter ( 1698416 ) writes:
  
  Oh goody! Now the tdp can be even worse than it already is!
  - Re: (Score:2)
    
    by h4rr4r ( 612664 ) writes:
    
    Why do you care about a few measly watts?
    Is another 50watts really going to break your budget?
    If that is the case you probably should not be buying a new computer.
    - Re: (Score:2)
      
      by 0123456 ( 636235 ) writes:
      
      Why do you care about a few measly watts?
      Oddly, the AMD fanboys were making the opposite argument back in the days when you could cook your breakfast on your Pentium-4 while checking your email.
      - Re: (Score:2)
        
        by h4rr4r ( 612664 ) writes:
        
        I own a Phenom 2 X4 and a Core 2 Quad. I buy what meets my needs at the price point I want when I want to buy it.
        I am not a fanboy of either, I just want to see AMD survive so I don't have to pay far out the ass for CPUs. I owned one of those P4s at the time, I bought an Athlon that put it too shame not much later.
      - Re: (Score:2)
        
        by Lithdren ( 605362 ) writes:
        
        Oddly, the AMD fanboys were making the opposite argument back in the days when you could cook your breakfast on your Pentium-4 while checking your email.
        This is why I was always sad nobody invented a teflon topped computer case.
        
        I'd like to read my morning email while cooking bacon, eggs, and pancakes.
        
        Bonus points if you can get a waffle iron in there somehow.
        ,
      - Re: (Score:2)
        
        by makomk ( 752139 ) writes:
        
        Pentium-4 managed to be hot, slow and expensive all at once... though by comparison to modern chips I don't think it's actually that power-hungry, scarily enough. Intel and AMD are heading well above the 100W level again.

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

no one got fired buying intel (Score:2)

Re: (Score:2)

Re:no one got fired buying intel (Score:4, Informative)

Re: (Score:2)

Re:no one got fired buying intel (Score:5, Interesting)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:3)

Re:no one got fired buying intel (Score:4, Informative)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Weird (Score:2)

Re: (Score:3)

Re: (Score:3)

Re: (Score:3)

Re: (Score:2)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

stave me (Score:2)

Re: (Score:2)

Re: (Score:2)

"10% to 20%" boost is just overclocking processor (Score:2)

Re: (Score:2)

But does it actually make a difference? (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:3)

It's a Windows limitation (Score:4, Informative)

Windows? (Score:2)

It was already beating all intel in highly threade (Score:5, Interesting)

Re: (Score:3)

No need, everyone knows... (Score:2, Informative)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:3)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re:So basically... (Score:4, Funny)

Re: (Score:3)

Re: (Score:3, Insightful)

Re: (Score:2)

Re: (Score:2)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re:So basically... (Score:4, Informative)

Re:So basically... (Score:4, Interesting)

Re: (Score:3)

Re: (Score:3)