Benchmarking the Benchmarks

Want to read Slashdot from your mobile device? Point it at m.slashdot.org and keep reading!

Benchmarking the Benchmarks 126

Posted by CmdrTaco on Monday February 11, 2008 @11:45AM from the blogging-the-bloggers dept.

apoppin writes "HardOCP put video card benchmarking on trial and comes back with some pretty incredible verdicts. They show one video returning benchmark scores much better than another compared to what you get when you actually play the game. Lies, damn lies, and benchmarks."

This discussion has been archived. No new comments can be posted.

Benchmarking the Benchmarks

Load All Comments

Search 126 Comments Log In/Create an Account

Comments Filter:

Erase Futuremark = instant win (Score:1, Insightful)

by majorme ( 515104 ) writes:

damn i hate benchmarks
OSS (Score:1, Funny)

by Anonymous Coward writes:

Its no wonder that most modern benchmarks are innacurrate, given that they tend to benchmark propietary, closed source software, running on propietary, closed source operating systems. Where they to run benchmarking software on Open Source operating systems, such as Ubuntu, then their results would not only be more accurate, but fairer. The fact that Open Source software would also have much higher scores then propietary, closed source software goes without saying.
- Re: (Score:2, Insightful)
  
  by joaommp ( 685612 ) writes:
  
  aren't you being just a little bit... oh, I dunno... offtopic?
  
  Either I misunderstood you, or I don't see how the license can be a metric of performance or accuracy.
  - Re: (Score:3, Funny)
    
    by edwdig ( 47888 ) writes:
    
    Either I misunderstood you, or I don't see how the license can be a metric of performance or accuracy.
    
    Clearly you haven't been drinking enough of your Kool Aid. Please contact the FSF and request more immediately.
  - - Re:OSS (Score:5, Funny)
      
      by snoyberg ( 787126 ) writes: <snoyberg@users.s ... t minus caffeine> on Monday February 11, 2008 @01:59PM (#22381088) Homepage
      
      PS yes...release your rage and mod me down.... just makes my post more Insightful.
      
      Translation: if you mod me down, I will become more insightful than you can possibly imagine.
      
      Parent Share
      twitter facebook
  - - Re: (Score:1)
      
      by joaommp ( 685612 ) writes:
      
      oh it seems like a joke to you? then who wrote it really must have a very refined sense of irony...
  - - Re: (Score:1)
      
      by joaommp ( 685612 ) writes:
      
      The fact that Open Source software would also have much higher scores then propietary, closed source software goes without saying.
      
      Did you read this?
      
      And even if the operating system or platform is opensource, that doesn't mean that the benchmark will be. He didn't mention any benchmark but actually referred to a platform. So how could you have any idea of how biased the benchmark software is/isn't?
back in my day... (Score:5, Funny)

by Aranykai ( 1053846 ) writes: <slgonser.gmail@com> on Monday February 11, 2008 @11:51AM (#22379664)

We used to benchmark a computer by *gasp* actually running things on it. If you wanted to find out how well it would perform running a game, you played the damn game and found out. Course, thats not good enough for these ubernoobs who think they are cool with their benchmark scores on their forum signatures...

Share
twitter facebook
- Re: (Score:3, Interesting)
  
  by Anonymous Coward writes:
  
  It's not the benchmark-scores that count. Sure, you need a specific minimum to enjoy the game, but it's the actual gameplay that makes the game fun, no matter the hardware.
  
  I'm pretty sure these benchmarks are invented by men.
  - Re: (Score:3, Insightful)
    
    by MobileTatsu-NJG ( 946591 ) writes:
    
    It's not the benchmark-scores that count. Sure, you need a specific minimum to enjoy the game, but it's the actual gameplay that makes the game fun, no matter the hardware.
    
    I'm pretty sure these benchmarks are invented by men.
    These benchmark scores are important when trying to determine a balance of cost vs. performance. So yes, these benchmarks were invented by men. This is because the old standard of picking the one whose color matches their shoes also resulted with the invention of the credit card.
  - Re:back in my day... (Score:5, Insightful)
    
    by donscarletti ( 569232 ) writes: on Monday February 11, 2008 @01:11PM (#22380564)
    
    There is indeed a bare minimum hardware performance required to play but sadly many new games, especially Crysis, that bare minimum is scarily close to the market's maximum. Benchmarks are supposed to be a way to isolate this and objectively measure it so that a good purchasing decision can be made by the consumer and when the game is played hopefully the subjective experience of enjoyment will follow. A framerate above human perception is needed for fun (as jerky frames lead to nausia and frustration), high detail is needed for the beauty of a game which is probably just as important (it's been the basis for visual art, music and poetry for millennia).
    
    The reason we've got so far and now can have computers, electricity, aeroplanes, cars, etc. is because of the willingness of scientifically inclined individuals to isolate, experiment and measure. Technology is one of the things in life that can be measured and I think it is a good idea to continue to do it, provided we can do it right. Experimentation and science is what got us out of caves no?
    
    As for Hardocp, what have they proven? Apparently traditional time demos run a fairly linear amount faster than realtime demos, even though it has been acknowledged that realtime demos render more including weapons, characters and effects that the canned demo does not. This would be interesting if the question was "how fast can Crysis run on different cards" but that's not what people want to know. What I'd want to know is which card should I buy to allow me to continue to play cutting edge games for as long as possible while enjoying their whole beauty but not getting a framerate low enough to make me uncomfortable. It just so happens that the card with the best timedemo benchmark has the best actual playthrough benchmark and by roughly the same factor. The only difference is that the traditional timedemo depends on only the graphics hardware whereas the playthrough benchmark depends on efficiency elsewhere in the engine (AI physics), where the player spent most time and if reviewing subjectively, the reviewers current mindset and biases.
    
    Somebody please think of the science!
    
    Parent Share
    twitter facebook
    - Re:back in my day... (Score:4, Insightful)
      
      by cHiphead ( 17854 ) writes: on Monday February 11, 2008 @02:11PM (#22381244)
      
      Some of us make purchasing decisions based on the piece of shit game we are thinking of buying. Crysis is a joke with such high requirements for a playable experience. I base my game purchases on what will run on my old pos single core p4 2.8ghz box. Any game that can't impress with such insanely fast hardware as we have these days even on the 'budget' boxes is not a game worth investing in.
      
      I must be getting old, I haven't upgraded my box in almost 2 years.
      
      Cheers.
      
      Parent Share
      twitter facebook
    - Re:back in my day... (Score:4, Interesting)
      
      by billcopc ( 196330 ) writes: <vrillco@yahoo.com> on Monday February 11, 2008 @02:23PM (#22381374) Homepage
      
      It's funny that you mention Crysis... people are freaking out over Crysis the same way they freaked out over Aero Glass a year ago. The reality is, Crysis runs fine on midrange gaming systems. It won't run in 1920x1200 with DX10 eyecandy on that crusty old Geforce 6200, but it certainly does not require a $2500 powerhouse to be enjoyable.
      
      In the end, benchmarks can be useful as long as you don't accept their results as the gospel truth. Some benchmarks favor ATI, some favor NVidia, and I'm sure there's gotta be one benchmark that favors Intel Extreme Graphics :P... the important thing is to find parallels that relate to your own needs and wants so you can put those numbers into perspective.
      
      Parent Share
      twitter facebook
      - Re: (Score:1)
        
        by i.of.the.storm ( 907783 ) writes:
        
        Yeah, I have to agree the whole thing with Crysis is overblown, the minimum requirements are actually really low and any (intended for gaming) card made in the last two years could probably run it.
      - Re:back in my day... we didnt make bad analogies (Score:2)
        
        by mjwx ( 966435 ) writes:
        
        people are freaking out over Crysis the same way they freaked out over Aero Glass a year ago
        The difference is that Aero Glass required far more system resources than its equivalent under Linux or available for XP (Stardock have something, I cant remember its name though). I have yet to see a game that can match the graphics on Crysis. Crysis runs at a reasonable frame rate on my Geforce 8800 GTS, I average about 20 FPS which halves around effects like waterfalls. This is on an Athlon 6000, 2 GB RAM, runni
      - Re: (Score:1)
        
        by NorQue ( 1000887 ) writes:
        
        It's funny that you mention Crysis... people are freaking out over Crysis the same way they freaked out over Aero Glass a year ago.
        And it hasn't not started with Aero Glass either. The first game I remember people complaining about hardware requirements was Wing Commander, and there must be earlier examples. People are stupid.
        
        Re: (Score:2)
        
        by Glonoinha ( 587375 ) writes:
        
        Wing Commander? I had friends complaining about hardware requirements as far back as Choplifter and JumpMan.
        
        What do you mean I need to buy a 1541 single sided floppy drive for my C=64 - I just bought this tape drive six months ago, paid $100 for it and at the time it was the fastest secondary storage known to man - I could type LOAD "*",1,1 and by the time I was done eating lunch my game was ready to run, and now you tell me I have to buy a new piece of hardware just to play a game?
- Re:back in my day... (Score:5, Funny)
  
  by SQLGuru ( 980662 ) writes: on Monday February 11, 2008 @12:10PM (#22379842) Homepage Journal
  
  And, on top of that, they are on your lawn....
  
  Layne
  
  Parent Share
  twitter facebook
- Re:back in my day... (Score:5, Informative)
  
  by Sancho ( 17056 ) writes: on Monday February 11, 2008 @12:40PM (#22380146) Homepage
  
  The problem is that it's hard to objectively score performance by "running things on it." Benchmarks are nice because they run the exact same tests every time. You can't just turn on FPS display and walk around in the game to measure performance--your actions may not be the same each time, and slight variations could cause drastically different results.
  
  Benchmarking provides potential customers with a metric to compare potential purchases.
  
  Parent Share
  twitter facebook
  - Re: (Score:1)
    
    by billcopc ( 196330 ) writes:
    
    Actually, I think the FPS display is a great measure of actual performance. The benchmarks will give you abstract numbers, but the FPS display is what you're actually getting out of the game.
    
    It doesn't matter if you don't follow the same path each time, what counts is the actual feel... some games can get away with lower framerates in the flashy areas (e.g. Crysis), while others would be totally unacceptable.
    
    I believe it's HardOCP that plots graphs of the minimum, maximum and average FPS. That's a step i
    - Re: (Score:3, Insightful)
      
      by Sancho ( 17056 ) writes:
      
      You're conflating benchmarking games vs. benchmarking graphics cards. If you're looking for raw power for an arbitrary amount of money, you'd want to get the graphics card which has the maximum frame rate at that price. If you're looking to play a specific game, you'd look for a graphics card which most people (quite subjectively, obviously) say plays the game well.
      
      The point is that you can't use a standard game (plus FPS meter) played by a human player to judge a graphics card's raw capabilities. To red
      - Re: (Score:2)
        
        by immcintosh ( 1089551 ) writes:
        
        The point is that you can't use a standard game (plus FPS meter) played by a human player to judge a graphics card's raw capabilities. To reduce subjectivity and error, you need a consistency in what is being rendered.
        What you're saying makes sense when you write it down, but after having read the article the OP is talking about, as well as some of the related articles, I think it's fair to say that they are reliably doing just that. Decide on a specific run to do through a specific section, practice it
  - Re: (Score:2)
    
    by Loopy ( 41728 ) writes:
    
    Which, it has been repeatedly shown, can be and are "faked" by competent video card manufacturers. Having a preset benchmark means you can tweak the solution to perform in the static environment. You can't fake what you see when actually playing the game.
    
    I'll give you a great example. In Crysis, my 8800GT system at home can be set up to 1280x1024 HIGH settings and still get 25FPS+ through the whole timedemo. Take those same graphics settings and try to play the last series on the aircraft carrier and it i
  - Re: (Score:2)
    
    by mwvdlee ( 775178 ) writes:
    
    Benchmarks would be nice if the hardware manufacturers didn't optimize specifically for those benchmarks. The problem isn't automated benchmarks, it's drivers cutting corners in the benchmarks at the expense of (or atleast not benefitial to) performance in normal use.
- Re: (Score:2)
  
  by MWoody ( 222806 ) writes:
  
  Admit it, you "benchmarked" with Windows Solitaire.
  - Re:back in my day... (Score:4, Informative)
    
    by PReDiToR ( 687141 ) writes: on Monday February 11, 2008 @01:00PM (#22380416) Homepage Journal
    
    Wolfenstein3D actually.
    That DX chip kicked the arse out of the SX models.
    
    Solitaire on "You just won. Watch the cards leap" was good for checking out the Windows performance, but Wolf told you how fast the PC was.
    
    Parent Share
    twitter facebook
    - Re: (Score:3, Funny)
      
      by IndustrialComplex ( 975015 ) writes:
      
      I do remember marveling at my friend's 486 and how fast those cards bounced off the screen.
    - Re: (Score:2)
      
      by YeeHaW_Jelte ( 451855 ) writes:
      
      Pfff, newfangled stuff.
      
      I use the dir command in dos to benchmark my new computers. and have been doing so since the 8088.
- Re: (Score:1)
  
  by KPexEA ( 1030982 ) writes:
  
  Since every game / program uses the hardware differently the ONLY way to compare hardware is to run the game/program or a subset of the game on the actual hardware. What would be really nice would be to have a slimmed down version of the game you want ( supplied by the game company, and preferable as small as possible so it can easily be put on a small USB drive ) that you can run on the machine in question and have it display the "score". That way, when my kid is looking for a new machine to run WoW on, I
- Those were rigged, too. (Score:2)
  
  by SanityInAnarchy ( 655584 ) writes:
  
  Video drivers from both ATI and nVidia would look for specific binaries known to be games used for benchmarking. Example: Quake3. You could rename your quake3 binary to quack3 and it'd perform somewhat worse.
  
  Apparently, it had something to do with trading correctness for speed.
FRAPS Overhead? (Score:1)

by roadkill_cr ( 1155149 ) writes:

Correct me if I'm wrong, but doesn't FRAPS have some sort of overhead while running? I certainly don't disagree with their findings, but it seems to be a factor they didn't account for between the traditional timedemo benchmarks and their FRAPS-ified benchmarks.
- Re: (Score:3, Informative)
  
  by compro01 ( 777531 ) writes:
  
  without using the screen-recording functionality, the overhead should be statistically irrelevant.
- - Re: (Score:1)
    
    by joeytmann ( 664434 ) writes:
    
    That is true if you start recording in FRAPS, and actually probably even less than half your framerate if your proc/mem/disk speeds suck. FRAPS will give you a decent FPS display with out too much overhead. Usually though, most games have the ability to display their frame rates in game with even less overhead. And with most game publishers giving out demo's...download the demo and try it out....see what your fps is. If it sucks, decide if you really want to see the game in all its FX glory and spend the $$
whatevermark (Score:3, Funny)

by Yath ( 6378 ) writes: on Monday February 11, 2008 @12:00PM (#22379760) Journal

Crysis, UT3, and COD4 are the three primary games we are using currently, with Crysis performance certainly being the new watermark in the industry.

I have no idea what this means, but it certainly sounds like Crysis has left its mark somewhere or other.

Share
twitter facebook
- Re: (Score:2)
  
  by peragrin ( 659227 ) writes:
  
  read it again Crysis left a watermark.
  
  don't ask why the water smells funny and is yellow in color.
- Re: (Score:2)
  
  by immcintosh ( 1089551 ) writes:
  
  My best guess is he meant "high water mark."
hmm (Score:2, Funny)

by nomadic ( 141991 ) writes:

Is your benchmark of the benchmarks accurate? We might have to benchmark it.
My old benchmark (Score:3, Funny)

by Anonymous Coward writes: on Monday February 11, 2008 @12:05PM (#22379798)

I used to do this benchmark:
10 PRINT TIME$
20 FOR I=1 TO 9999
30 NEXT I
40 PRINT TIME$

I then improved it to be:
10 A$=TIME$
20 IF A$=TIME$ THEN GOTO 20 !breaks out when the seconds change
30 I=1:A$=TIME$
40 I=I+1:IF A$=TIME$ THEN GOTO 40
50 PRINT I

Ahhh...the good old days... (1970s, early 1980s)

Share
twitter facebook
- Re: (Score:2)
  
  by CarpetShark ( 865376 ) writes:
  
  I used to do this benchmark:
  10 PRINT TIME$
  20 FOR I=1 TO 9999
  30 NEXT I
  
  I think I've spotted a bug. You'll need a much bigger upper limit on that loop, if you're busy-waiting for basic to be capable of something useful ;)
- Re:My old benchmark (Score:4, Funny)
  
  by sempernoctis ( 1229258 ) writes: on Monday February 11, 2008 @02:04PM (#22381128)
  
  My favorite benchmark for finding the size of the memory heap:
  
  void doit(int i) { printf("%i\n", i); doit(i + 1); }
  
  worked really well until I tried it in an environment where the call stack could get paged...then it turned into a hard drive benchmark
  
  Parent Share
  twitter facebook
  - Re: (Score:2)
    
    by bored ( 40072 ) writes:
    
    finding the size of the memory heap:
    
    void doit(int i) { printf("%i\n", i); doit(i + 1); }
    
    Oh god! What has become of this site? Poor spelling and grammar I can understand. Confusing the stack and the heap is a sign of the times!
Synthetics not entirely useless (Score:4, Informative)

by Anonymous Coward writes: on Monday February 11, 2008 @12:07PM (#22379810)

Benchmarking using actual games is, of course, important. But part of the reason a lot of us buy video cards and such isn't JUST about the performance on today's games, but for how they'll play the games coming out in the next few months. Synthetic benchmarks often implement advanced features not currently seen in today's games, but which will be implemented in just-over-the-horizon games. So while clearly one ought not judge a card purely on 3DMark or similar benchmarking suites, they do have their uses.

Share
twitter facebook
- Re: (Score:1)
  
  by dmsuperman ( 1033704 ) writes:
  
  Trust me, as long as there are games like Crysis to do more than 200% of what my system can handle we'll be alright.
- Re: (Score:1)
  
  by snarfies ( 115214 ) writes:
  
  I prefer the term "Artificial Hardware Test" myself.
  
  More cornbread?
- Re: (Score:2)
  
  by immcintosh ( 1089551 ) writes:
  
  Crysis sorta breaks your argument though. One thing everybody will agree on is that there currently is NO consumer hardware that will play smoothly it at its highest settings. I've heard that a two (three?) card SLI setup of nVidia's top of the line overclocked monsters can get it to pump out 30fps or so with its settings maxed out, but that's about it. The game of tomorrow--today!
We need international benchmarking standards! (Score:4, Funny)

by Thanshin ( 1188877 ) writes: on Monday February 11, 2008 @12:09PM (#22379826)

...And an international benchmarking committee.

To avoid concentrating all the data management in a single entity, we need a national benchmarking committee for each country and then international elections to get a chief of benchmarking interrelationships or CBI.

To avoid the possible corruption of the CBI, we would need an independent international supervision committee for the review of benchmarking standards.

The IISCRBS would review the actions of the CBI yearly and produce a thorough report.

That report (which would be called the IISCRBS-CBI report) would be the main reference to start any kind of productive debate about who has the leetest rack and who's a lame n00b.

Share
twitter facebook
Would like to see a real world comparison for EQ (Score:2)

by Maxo-Texas ( 864189 ) writes:

I have what was a "hot" card only eighteen months ago (7800) ago and now it is stuttering on some of the newer content when I'm raiding. The rest of the game is glass smooth. Suppose it could be the PC but it is a pretty good PC too.

Would love a site that showed "here is the game on the highest settings on these CPU/GFX combos".
- Re:Would like to see a real world comparison for E (Score:5, Funny)
  
  by Digital Vomit ( 891734 ) writes: on Monday February 11, 2008 @12:21PM (#22379964) Homepage Journal
  
  I have what was a "hot" card only eighteen months ago (7800) ago and now it is stuttering on some of the newer content when I'm raiding.
  
  Are you one of those software pirates?
  
  Parent Share
  twitter facebook
  - Re: (Score:2)
    
    by Maxo-Texas ( 864189 ) writes:
    
    hehe.
    
    Well you probably know what I meant and were making a funny but in case you didn't.
    
    In EQ, on a raid, you get 54 people close to you (so they can't be clipped based on distance), and 40-70 server side creatures (player pets, monsters, the big "bad") and your machine is trying to keep up and report on and render all that in real time. My frame rate is >60 (>100?) in some content but in the new content on a raid, it can go to 10 to 20 fps unless I turn off a lot of features. Kinda sucks.
  - Re: (Score:2)
    
    by andi75 ( 84413 ) writes:
    
    The true reason Blizzard switched from 40-man to 25-man raids in the Burning Crusade.
- Re:Would like to see a real world comparison for E (Score:2)
  
  by irc.goatse.cx troll ( 593289 ) writes:
  
  That would be nice, especially retouching on older ones and also cheaper combos you'd find in generic desktops.
  
  I'd also like to see a benchmark app you canr un from usb or dvd/cdrom booting. Something that gives you a clean slate to compare against running it in your existing install so you can see how much all the various apps and drivers are bogging your performance down.
- Re:Would like to see a real world comparison for E (Score:1)
  
  by Jeng ( 926980 ) writes:
  
  EQ is in many ways a very very bad example, or in some ways I guess a good example.
  
  Problem with EQ is that performance can vary greatly depending on the card, the drivers, and of course the settings.
  
  There are non-graphical settings within EQ that can slow down your computer in a raid environment that won't mess with it much in a non-raiding environment. Basically anything that logs information to your hard drive will really mess you up in a raid.
  
  But EQ has so many damn bugs in it that benchmarking would be
  - Re: (Score:2)
    
    by Maxo-Texas ( 864189 ) writes:
    
    A fix for the west bug has been found.
    It is posted somewhere on "therunes.net" boards. I linked it to my guild boards a couple months ago.
- Re: (Score:1)
  
  by __aaqvdr516 ( 975138 ) writes:
  
  I think your problem is SoE. I've also done raids on both EQ2 and SWG (back in the day). EQ's servers handle the load better than SWG's did back then. In SWG the lag got so bad around half of the people lost connection. So, in short, your end is not the problem.
Benchmarks (Score:5, Insightful)

by Anonymous Coward writes: on Monday February 11, 2008 @12:13PM (#22379878)

Duh, a benchmark is a controlled test performed "on a bench" - meaning, in a controlled environment with specific, well-described procedures.

You must perform the same exact test on all video cards, disclose any variables, and you must not "pick a subset of completed tests to publish". You must not compare tests performed using different procedures, no matter how slight the deviation of the procedures are.

One cannot draw conclusions about "real world" performance from a benchmark. The benchmark is merely an indicator. A "real world" test that uses the strong, formalized procedures of a benchmark IS a benchmark - and suddenly, the benchmark is not "real world" - because the "real world" doesn't have formal procedures for gameplay.

Haphazard "non-blind" gameplay on a random machine is NOT a benchmark, and it can not provide useful, comparable numbers.

A good benchmark is one where (1) most experts agree that it has validity, and (2) one where the tester cannot change the rules of the game.

The numbers of a benchmark are meaningless, except in terms of being compared to one another using the same exact procedure.

Share
twitter facebook
- Re: (Score:2)
  
  by Xzzy ( 111297 ) writes:
  
  The accusation that HardOCP is making is that it is not possible to perform the exact same tests for all video cards, because software vendors sneak in shortcuts and cheats (sorry, optimizations) that screw with the numbers.
  
  So they threw benchmarking out, for the most part, and instead tried to make a system for measuring how well a given video card delivers a positive experience. It's not ideal.. but at least it's immune to interference from the video card makers. Now you just have to worry about bias from
  - Re: (Score:2)
    
    by immcintosh ( 1089551 ) writes:
    
    Actually, I think it would be more accurate to say that their accusation is that, while you can perform those tests, your results will be totally useless and not even remotely indicative of real-world performance, as seems to be demonstrated by their Crysis benchmark. And by useless, they mean the benchmarks will lead you to believe one card is faster (ATI here), when in actuality the opposite is the case while actually playing the game.
Benchmarks != Reality (Score:2)

by Smidge204 ( 605297 ) writes:

Okay, so benchmarks don't adequately reflect real applications. Not much of a surprise there...

But does this impact their usefullness in comparing hardware at all?
=Smidge=
- Re: (Score:2)
  
  by jonnythan ( 79727 ) writes:
  
  Yes.
  
  RTFA. It clearly shows how the canned timedemo benchmarks most sites use can be horribly misleading and give totally wrong impressions.
  - Re: (Score:2)
    
    by Firehed ( 942385 ) writes:
    
    We've known this for years, which is why a lot of the better review sites moved away from timedemos a long while ago.
    
    However, they can still (sort of) be used to compare cards against each other. They don't do much to reflect playability of a game at given settings accurately, but in theory all of the numbers you get from a timedemo should be inflated by about the same percent.
    - Re: (Score:2)
      
      by jonnythan ( 79727 ) writes:
      
      The article attempts to show that the numbers you get from a timedemo *don't* correlate well to what you get in the real world. Some cards or drivers do better in the "timedemo -> real life" conversion than others.
      
      This difference is the entire point of the article.
Obligatory Portal Reference (Score:1)

by psychicsword ( 1036852 ) * writes:

Lies, damn lies
Just like the cake.
HardOCP benchmarks suck ass (Score:2)

by Clockwurk ( 577966 ) * writes:

They never use the same game configuration, so trying to figure out how much faster one thing is than another is impossible. Rather than have 1 variable (the hardware being benchmarked), they use 2 variables (the hardware, and the settings of the benchmarked software).
- Re:HardOCP benchmarks suck ass (Score:4, Insightful)
  
  by jonnythan ( 79727 ) writes: on Monday February 11, 2008 @12:33PM (#22380080)
  
  Um, they come up with what is probably the most useful data of all:
  
  The highest playable settings for given hardware.
  
  They then change the video card and find the highest playable settings for that hardware.
  
  I'd much rather compare the highest playable settings for two different cards than the timedemo benchmark numbers for two different cards.
  
  Parent Share
  twitter facebook
  - Re: (Score:3, Insightful)
    
    by Dracolytch ( 714699 ) writes:
    
    You know that's totally intractable, right?
    
    For example: 1620x1050 with no AA may be considered unplayable (jaggies) for some, but others it's perfectly fine...
    
    Or, maybe you can turn on the AA, but deactivate shadows, changing your whole "playable" demographic again.
    
    It's like asking someone to benchmark coffee at different resturants to grade whether it is palletable or not.
    
    ~D
    - Re: (Score:2)
      
      by sholden ( 12227 ) writes:
      
      You [blogspot.com] mean [seth.id.au] precisely [mclo.net] like [wordpress.com] people [coffeeshopcritic.com] do [typepad.com]?
      
      I've heard rumors that similar things are done for movies, books, games, tv shows, and even food.
      
      I believe the idea is to work out how closely you agree with the reviewer in question in order to determine if what they say is useful (and of course when you completely disagree they can be useful - if they love it you'll hate it sort of thing)...
      
      But, yes, if the point was meant to be that there is no one comparison function and hence each persons ordering will may be different
[H] raises more questions than it answers (Score:3, Informative)

by tayhimself ( 791184 ) writes: on Monday February 11, 2008 @12:22PM (#22379966)

Here are a few that I had :
- is triple-buffering on or vsync off? This will make a huge difference to real time versus sped up timedemos
- is sound on when playing back both types of timedemos?
- how does FRAPS affect your benchmark scores?

Finally, in relation to the Crysis real world gameplay versus the AT benchmark score, I thought it was common knowledge that the game would be slower when actually playing it because you likely have physics,AI,logic,sound calculations to do that you don't in timedemo mode. What is the big deal here?

Share
twitter facebook
- Re:[H] raises more questions than it answers (Score:4, Informative)
  
  by DeadChobi ( 740395 ) writes: <DeadChobi@gmIIIail.com minus threevowels> on Monday February 11, 2008 @12:38PM (#22380122)
  
  It's misleading because video card manufacturers tweak their drivers to perform better in timedemos versus real world gameplay so that hardware review sites will do reviews touting the game as playable on such-and-such a card at maximum settings even though real world gameplay never comes close to what the time demo is doing to the game. Wow, that was one sentence. Oh, and how can you say that card A outperforms card B without ever comparing them in gameplay? That would be like me going into a hardware store and swinging two different hammers to compare them, then buying one based on that test only to find out that its total crap at actually hammering.
  
  The root of the issue is that timedemos give the video card manufacturers something to tweak their drivers around besides gameplay. And there are also some arguments over how representative of your actual experience a timedemo will be. At least HardOCP gives a crap about their methodology, as opposed to other hardware sites which don't use any sort of statistical analysis.
  
  Parent Share
  twitter facebook
  - Re: (Score:2)
    
    by Hoi Polloi ( 522990 ) writes:
    
    Reminds me of how the EPA is changing how fuel efficiancy is determined for cars. The old standard was not realistic compared to how most people actually drive. Now they are putting a lot more stop & go driving in their testing and getting lower, but more realistic, numbers.
  - Re: (Score:2)
    
    by Mike Rubits ( 818811 ) writes:
    
    One of the under-appreciated things about the Q3 and D3 engines is that demos are essentially a recording of the network stream. So running imedemo on a demo will be extremely accurate for real world performance.
  - Re: (Score:2)
    
    by Sebastopol ( 189276 ) writes:
    
    any sort of statistical analysis.
    
    HardOCP didn't really do any sort of statistical analysis. They gave min/avg/max on a few cards. Anandtech and Toms Hardware have a sample population and a methodology that blows the doors of HardOCP statistically.
    
    HardOCP is just regurgitating age-old arguments that have been around since the dawn of benchmarks. I helped code 3DMark in 1996, we went through the same arguments then. Nothing has changed. Synthetic benchmarks serve a purpose: because playing the game and r
- Re: (Score:2)
  
  by jonnythan ( 79727 ) writes:
  
  It's misleading because sometimes one card will come out way in front of another during a canned benchmark due to tweaking, shortcuts, whatever.... but that same card will come out way behind the other card during actual, real-life gameplay.
  
  See the difference?
  
  HardOCP's testing is only concerned with real-life gameplay. Most of the time, their conclusions are pretty similar to other sites... card A is faster than card B, for instance. However, sometimes, their conclusions are opposite what other sites come u
- Re: (Score:1)
  
  by Warll ( 1211492 ) writes:
  
  No kidding! In game I'm sure physics could really slow you down. Your computer now needs to keep track of the guy you just shot (Rag doll) those few stray bullets which just hit that Jeep's gas tank (Explosion, motion blur, the five other jeeps parked right next to it...) and all this while its keeping track of your gunboat rolling in the waves.
- Re: (Score:2)
  
  by blueg3 ( 192743 ) writes:
  
  Ideally, the graphics card and all on-CPU calculations are running in parallel, so the influence of this extra work on graphics performance should be minimal. This is what they mean in TFA when they refer to situations that are not CPU-limited.
- Re: (Score:2)
  
  by Nebu ( 566313 ) writes:
  
  Finally, in relation to the Crysis real world gameplay versus the AT benchmark score, I thought it was common knowledge that the game would be slower when actually playing it because you likely have physics,AI,logic,sound calculations to do that you don't in timedemo mode. What is the big deal here?
  There's no reason you couldn't write a benchmark/demo which actually performs the physics/AI/logic/sound calculations, as opposed to pre-calculating that ahead of time. Even if your AI or physics code contains
Benchmarks are a marketing tool only (Score:2)

by Bullfish ( 858648 ) writes:

Give you an idea relative to other cards tested using the same benchmark. However, I have always found them misleading and somewhat gratuitous. Declaring a card superior over another just because it gives five more frames a second than another card is dumb. Especially when it is the difference between 110 and 115 frames per second.

As long as you don't run two 30 inch monitors, any name brand video card for about 200 bucks will give you great playable rates at 1680 x 1050.

A lot of benchmarks imply you need t
- Re: (Score:3, Insightful)
  
  by jonnythan ( 79727 ) writes:
  
  "As long as you don't run two 30 inch monitors, any name brand video card for about 200 bucks will give you great playable rates at 1680 x 1050."
  
  Not in Crysis, Call of Duty 4, UT3, etc.
  
  When I go to plunk down $200 - $300 on a video card, and one of them performs comfortably at my LCD's native resolution and the other one doesn't, that matters. Saying all cards in a given price range are roughly equivalent is saying that you are completely, 100% blind to the reality of video cards today.
  - Re: (Score:1)
    
    by i.of.the.storm ( 907783 ) writes:
    
    Radeon HD 3870 should have you covered for about $200, at least at 1680x1050.
    - Re: (Score:2)
      
      by jonnythan ( 79727 ) writes:
      
      It can't do Crysis at that resolution, and it is 5-10 fps (a significant number) slower in the likes of COD4 and similar at the same settings than a similarly-priced 8800GT.
      
      I can *just barely* enable AA and AF with the 8800GT. I would not be able to do this with a 30% slower card like the 3870.
      
      This is why reviews matter.
      - Re: (Score:1)
        
        by i.of.the.storm ( 907783 ) writes:
        
        I didn't realize that the 8800GTs had dropped down to the MSRP by now, so they are around the same price as the 3870. That 30% slower is definitely wrong though, it's only a little slower in some games. And if my 2900 Pro, which is slower than the 3870, can do Crysis at 1680x1050, medium/high settings at ~30fps (and that was the demo, not the final game with the 1.1 patch which supposedly improves performance considerably) then I have no doubt that the 3870 can run it fine.
- Re:Benchmarks are a marketing tool only (Score:4, Informative)
  
  by TheMeuge ( 645043 ) writes: on Monday February 11, 2008 @12:35PM (#22380100)
  
  As long as you don't run two 30 inch monitors, any name brand video card for about 200 bucks will give you great playable rates at 1680 x 1050.
  
  Evidently, you've never actually PLAYED Crysis. On an AMD64 Dual Core at 2.4GHz, 2GB of RAM, and Nvidia 8800GTS 640MB (>>$200), I needed to reduce my resolution to 1280x1024 and set everything to Medium, to have the framerate not drop into single digits or low teens, and stay at 20-30fps.
  
  Parent Share
  twitter facebook
  - Re: (Score:2)
    
    by PoderOmega ( 677170 ) writes:
    
    Seconded. I have an almost identical setup other than my 8800GTS is 320 megs and I had to play with everything set on medium to be playable.
  - Re: (Score:2)
    
    by Bullfish ( 858648 ) writes:
    
    actually, I have played (play) crysis... a mix of high and medium settings at 1680 x 1050... I use a HIS Ice 3850, 4 gigs of ram (yeah only 3 are used) and an E8400... I will say that I never said you could use a $200 card to run a game at high settings with great rate (and crysis is a pig for resources), just that you could get great frame rates, and you can by playing with the settings. And the games still look really good.
    
    The other guy who has trouble playing call of duty 4, that I don't get, I found it
  - Re: (Score:2)
    
    by mugnyte ( 203225 ) writes:
    
    I play using the quake raytracing engine and my benchmarks are sec/frame, not frame/sec.
  - Re: (Score:2)
    
    by witte ( 681163 ) writes:
    
    Well, my rig is much older and I run crysis on a nv6800, and a 1.8GHz cpu.
    So the gameplay sucks.
    The difference is that I spent a lot less money on hardware, ergo, I got a lot more sucky gameplay for my money.
    Or in other words, my suck per buck ratio is a lot higher.
    Yeah.
Benchmarking Benchmarks? (Score:1)

by Scubafish ( 1224972 ) writes:

So who's going to benchmark the benchmarks of the benchmarks?
Not the same card (Score:3, Insightful)

by jandrese ( 485 ) writes: <kensama@vt.edu> on Monday February 11, 2008 @12:47PM (#22380238) Homepage Journal

One thing that's bothering me is that HardOCP said "Anandtech benchmarked this card vs. an 8800GTS and said it came out faster, then we benchmarked it against an 8800GTX and it game out faster, then people complained that our results didn't match". Isn't that expected? The GTX is a faster card than the GTS last time I looked. Why is it such a shock that the ATI card came in between them in performance?

It is a bit of a shock that ATI's latest and greatest can't seem to consistently beat nVidia's over a year old GTX cards I guess.

Share
twitter facebook
It is about the "cheating" in benchmarks (Score:1)

by Iberian ( 533067 ) writes:

At least that is what I think he was trying to say. If ATI/NVIDIA knows that everyone will be benchmarking their respective cards using X benchmark why not write drivers that excel in that benchmark. Even further you can create hardware to much the same effect, though given the lead times for hardware design this will be harder.

What the best method for eliminating the discrepancies from those best able to code for a given benchmark is I am not sure but it seems he tries.
Suuure... (Score:2)

by JohnnyBigodes ( 609498 ) writes:

FLASH NEWS: [H]ardOCP throws such outdated concepts such as "controlled testing environment" and "repeatability" out the window and calls it revolutionary! Yay!
We prefer stopwatches (Score:1)

by neilticktin ( 660748 ) writes:

MacTech Labs [mactechlabs.com] (part of MacTech Magazine [mactech.com]) has done a number of benchmarks that were very mainstream in the past year -- including most recently Parallels vs. Boot Camp vs. VMware Fusion [mactech.com], and Office 2008 [mactech.com]. In designing each of these, we went out of our way to figure out how to make them "real world". In other words, not only to only test the things that most users would do ... but also to measure them in a way that users perceive. One way that we do that is to do the testing with stopwatches. Because, if it
- Re: (Score:1)
  
  by TheCycoONE ( 913189 ) writes:
  
  While stopwatches may work well for load time and busy waiting scenarios, you'd have to be particularly quick to measure frame rates with one.
Why DX10? (Score:1)

by InsaneProcessor ( 869563 ) writes:

How about benchmarking frame rates on the real platform. Friends don't let friends play games on Vista. All of the serious gamers I know avoid it like the plague because of crappy frame rates and poor performance.
- Re: (Score:1)
  
  by Silver Surfer 1 ( 193024 ) writes:
  
  No kidding,
  I personally dont know any gamers that use Vista other than what might have come on a new laptop and most of those have even removed it of the laptop.
Insufficient sample size (Score:2)

by Guspaz ( 556486 ) writes:

They've examined ONE SINGLE game and used this to (try to) invalidate the testing method for EVERY game. Sorry, doesn't work like that.

All they've proven is that there is something wrong with the timedemo system in Crysis.
- Re: (Score:3, Funny)
  
  by SQLGuru ( 980662 ) writes:
  
  Apparently you were using the wrong benchmark. You just thought you were fast.
  
  Layne
  - - Re: (Score:2)
      
      by somersault ( 912633 ) writes:
      
      The thing is that games are different each time you play them, so that isn't really a benchmark. The summary says that real games are slower than benchmarks.. I mean DUHHHH! Benchmarks are (or should be) on rails, with no user interaction to ensure that they're the same on each system. Over and above what the benchmarks do, games need to monitor user input and do AI for the enemies at the very least (probably some other obvious things that I'm missing out but those seem to be the main differences to me at t
- Re: (Score:1)
  
  by i.of.the.storm ( 907783 ) writes:
  
  If you really understand their methods, they're trying to give a more subjective rating because numbers aren't always that helpful and can be misleading/thrown off by various factors including video card company driver "cheating" to improve framerates at the cost of image quality.
- Re: (Score:2)
  
  by PitaBred ( 632671 ) writes:
  
  Windows won't use PAE very well in general, and will only turn on if you tell it to with a kernel switch. With Linux, you have to compile a kernel that's aware of it (set HIGHMEM64G=yes or something like), and it does lower the performance somewhat.
  
  But by default, Windows and Linux will boot and just ignore any extra memory they can't address. PAE shouldn't enter the picture for any serious gamers.
- Re: (Score:2)
  
  by Gromius ( 677157 ) writes:
  
  Ah but you can only address 4GB of ram in 32-bit windows. That includes your graphics cards memory, after all its got to be addressable. High end video cards have lots of ram. So one 8800GTX knocks your addressable ram to ~3.25Gb and two in sli knocks it to 2.5 GB. Suddenly see why alot of people have 2gb of RAM when running with high end graphic cards...
  - Re: (Score:2)
    
    by ShadowsHawk ( 916454 ) writes:
    
    I didn't realize that the video card ram counts against the 4Gb. I just built a new duo core with 3Gb and a 8600GTS 512Mb, but still good to know. Thanks!

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

Erase Futuremark = instant win (Score:1, Insightful)

OSS (Score:1, Funny)

Re: (Score:2, Insightful)

Re: (Score:3, Funny)

Re:OSS (Score:5, Funny)

Re: (Score:1)

Re: (Score:1)

back in my day... (Score:5, Funny)

Re: (Score:3, Interesting)

Re: (Score:3, Insightful)

Re:back in my day... (Score:5, Insightful)

Re:back in my day... (Score:4, Insightful)

Re:back in my day... (Score:4, Interesting)

Re: (Score:1)

Re:back in my day... we didnt make bad analogies (Score:2)

Re: (Score:1)

Re: (Score:2)

Re:back in my day... (Score:5, Funny)

Re:back in my day... (Score:5, Informative)

Re: (Score:1)

Re: (Score:3, Insightful)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re:back in my day... (Score:4, Informative)

Re: (Score:3, Funny)

Re: (Score:2)

Re: (Score:1)

Those were rigged, too. (Score:2)

FRAPS Overhead? (Score:1)

Re: (Score:3, Informative)

Re: (Score:1)

whatevermark (Score:3, Funny)

Re: (Score:2)

Re: (Score:2)

hmm (Score:2, Funny)

My old benchmark (Score:3, Funny)

Re: (Score:2)

Re:My old benchmark (Score:4, Funny)

Re: (Score:2)

Synthetics not entirely useless (Score:4, Informative)

Re: (Score:1)

Re: (Score:1)

Re: (Score:2)

We need international benchmarking standards! (Score:4, Funny)

Would like to see a real world comparison for EQ (Score:2)

Re:Would like to see a real world comparison for E (Score:5, Funny)

Re: (Score:2)

Re: (Score:2)

Re:Would like to see a real world comparison for E (Score:2)

Re:Would like to see a real world comparison for E (Score:1)

Re: (Score:2)

Re: (Score:1)

Benchmarks (Score:5, Insightful)

Re: (Score:2)

Re: (Score:2)

Benchmarks != Reality (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Obligatory Portal Reference (Score:1)

HardOCP benchmarks suck ass (Score:2)

Re:HardOCP benchmarks suck ass (Score:4, Insightful)

Re: (Score:3, Insightful)

Re: (Score:2)

[H] raises more questions than it answers (Score:3, Informative)

Re:[H] raises more questions than it answers (Score:4, Informative)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:1)

Re: (Score:2)

Re: (Score:2)

Benchmarks are a marketing tool only (Score:2)

Re: (Score:3, Insightful)

Re: (Score:1)

Re: (Score:2)

Re: (Score:1)