
Shared Video Memory and Memory Bandiwidth Issues?

Cliff posted more than 10 years ago | from the it's-all-about-the-performance-baby! dept.

Portables 37

klystron2 asks: "Does shared video memory consume a huge amount of memory bandwidth? We all seem to know that a notebook computer with shared video/main memory will have performance drawbacks.... But what exactly are they? It's easy to see that the amount of main memory decreases a little bit, but that shouldn't make a big difference if you have 1GB of RAM. Does the video card trace through memory every time the screen is refreshed? Therefore consuming a ton of memory bandwidth? If this is the case then the higher the resolution and the higher the refresh rate, the lower the performance of the system, right? I have searched the Internet for an explanation on shared memory and have come up empty. Can anyone explain this?"


37 comments


First Cement Mixer (-1, Offtopic)

Anonymous Coward | more than 10 years ago | (#7889966)

eeeee oaaarrrrrrrrrr

no issue. (0)

Anonymous Coward | more than 10 years ago | (#7889967)

bandwidth is not going to be an issue unless you're saturating your bus. How many people can tell the difference between PC133 and PC2100 RAM?

I can -- but then I'm enough of a professional not to buy a sorry-ass eMachine at my local Walmart.

Re:no issue. (-1, Troll)

Anonymous Coward | more than 10 years ago | (#7889984)

Also, turds are fine as long as you don't smell them.

Pro/Con (3, Interesting)

Tune (17738) | more than 10 years ago | (#7890001)

Besides the obvious penalty caused by the DAC, there should also be a minor improvement in applications where the CPU needs direct access to (physical) screen memory (older 2D games) or the GPU needs data stored in main memory (3D bitmaps). In the "standard setup" this would require data transfer from main memory to video memory (and vice versa), including the overhead of PCI/AGP synchronization.
With respect to performance, the benefits of a separate frame buffer outweigh those of shared memory, in my experience. I'm not sure if this is also true with respect to the performance/power-consumption ratio (use a suitable definition), however. Especially when the (DVI LCD/TFT) screen already has a frame buffer and VSync is only 20-40Hz. (Ditch the GPU altogether?)

Anyone with ideas, data?

--
As far as we know, our computer has never had an undetected error -- Weisert

I'd assume so, too (4, Interesting)

Scorchio (177053) | more than 10 years ago | (#7890020)

I can't say for current-day laptops, but early ARM-based Acorn computers had shared video and system memory. It did indeed suck bandwidth, which, it seems, was a problem when it came to I/O. If you set the resolution high enough, the screen would turn black while loading from disk, presumably so any incoming data from the drive wouldn't get lost while the bus was in use by the video output. Fortunately, at the time, the OS and software in general didn't constantly suck at the disk, as is common today.

I presume today's bus speeds, processor caches and other buffers are sufficiently fast and large to share the memory without too much of a noticeable effect...

'T'ain't nuthin' compared to a Sinclair ZX-81... (4, Informative)

leonbrooks (8043) | more than 10 years ago | (#7890156)

...which had the Z80 CPU generating the video directly, leaving only interframe gaps for computing.

Since the greeblie had no interrupts and they were too lazy to quantise the BASIC interpreter so that they could run it in the interframe and still generate reasonably consistent sync pulses, the screen went away completely while programs ran. A modern monitor would go postal, faced with a constantly appearing/vanishing sync pulse train but TVs are kind of used to dealing with cruddy signals.

I think the Sinclair was branded a Timex in the UK.

Re:'T'ain't nuthin' compared to a Sinclair ZX-81.. (2, Informative)

Steve Cox (207680) | more than 10 years ago | (#7890247)

> I think the Sinclair was branded a Timex in the UK.

No, Timex sold the Sinclair ZX81 in North America. Sinclair Research Ltd. sold the Sinclair ZX81 in the UK. The US variant was named the Timex Sinclair 1000.

The Timex 1000 was practically identical to the ZX81, except for a few changes on the circuit board and a whopping 2K of RAM instead of the 1K that the ZX81 had.

Steve

Re:'T'ain't nuthin' compared to a Sinclair ZX-81.. (1)

MrResistor (120588) | more than 10 years ago | (#7892758)

I remember the Sinclair 1000. My dad got one when I was young enough that I don't remember now how young I was. That was one suck-ass computer! It couldn't even run that stupid condor game at a speed that even remotely approached playability, and mind you, that was compared to machines which it was supposedly competing against.

It's sort of a fond memory, in the same sense that you might "fondly" remember the first time you got sick from drinking too much...

I stand corrected, ta! (1)

leonbrooks (8043) | more than 10 years ago | (#7902534)

Here in Oz, 't'was Sinclair.

Re:'T'ain't nuthin' compared to a Sinclair ZX-81.. (1)

steve.m (80410) | more than 10 years ago | (#7890559)

Bzzt. Incorrect!

The ZX-80 suffered from that, but the ZX-81 could display and execute.

It also had a fast mode, so you could ignore the display and use the whole 3.5MHz for your app.
As described here [old-computers.com]

I'm learning a lot today (-: (1)

leonbrooks (8043) | more than 10 years ago | (#7902553)

+1 Informative, that man!

Re:'T'ain't nuthin' compared to a Sinclair ZX-81.. (1)

advocate_one (662832) | more than 10 years ago | (#7891169)

'twas the Sinclair ZX80 that blanked the screen when computing... the Sinclair ZX81 could maintain a display while computing because, as you say, it did the BASIC interpretation during the flyback.

Re:'T'ain't nuthin' compared to a Sinclair ZX-81.. (1)

bdraschk (664148) | more than 10 years ago | (#7891508)

And think of the good old Amiga, which had that shared memory architecture. With high load on the chipset, the CPU was slowed considerably when accessing RAM. I remember printing a HiRes picture (640x400, 16 levels of grey), which was dead slow and went much faster after switching to a LoRes screen.

Later models came with so-called FastRAM, which wasn't affected by the slowdown, but caused all sorts of trouble, as programs couldn't deal with the fact that this RAM wasn't accessible by the graphics chips.

Those were the times, ... :-)

Re:'T'ain't nuthin' compared to a Sinclair ZX-81.. (1)

cybpunks3 (612218) | more than 10 years ago | (#7896287)

That must have been on an older version of the OS. The Amiga can prioritize memory allocation so that it uses FastRAM before ChipRAM without confusing applications. The only ChipRAM utilization on an Amiga would have to be for accesses to the chipset for gfx, built-in sound, and built-in I/O ports (serial, parallel, floppy, game ports). In such an environment ChipRAM can be thought of as dedicated video RAM as in a graphics card, but with the option of using it for regular program storage if you don't have enough (or any) fastRAM.
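A rough sketch of that allocation preference in Python (this is just the idea, not the actual AmigaOS allocator: prefer the fast pool and fall back to chip memory, unless the caller explicitly needs chipset-visible RAM):

    # Toy two-pool allocator illustrating "use FastRAM first, ChipRAM as fallback".
    class Pools:
        def __init__(self, fast_kb, chip_kb):
            self.free = {"fast": fast_kb, "chip": chip_kb}

        def alloc(self, kb, need_chip=False):
            order = ["chip"] if need_chip else ["fast", "chip"]
            for pool in order:
                if self.free[pool] >= kb:
                    self.free[pool] -= kb
                    return pool
            return None

    mem = Pools(fast_kb=4096, chip_kb=512)
    print(mem.alloc(256))                   # 'fast' -- ordinary program data
    print(mem.alloc(64, need_chip=True))    # 'chip' -- a bitmap the chipset must be able to read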

Re:'T'ain't nuthin' compared to a Sinclair ZX-81.. (1)

TheSunborn (68004) | more than 10 years ago | (#7903840)

He is probably talking about an Amiga 500, which doesn't normally have FastRAM. Even the 512K RAM expansion was not real FastRAM: it was called SlowRAM and had the performance of ChipRAM, but was not accessible to the chipset. The worst of both worlds.

Martin Tilsted

Re:'T'ain't nuthin' compared to a Sinclair ZX-81.. (1)

Squozen (301710) | more than 10 years ago | (#7911215)

That depended entirely on the chipset. The ECS chipset could access the entire 1MB. The AGA chipset could access 2MB. If my memory serves me correctly, that is. :)

Pixels are read from RAM everytime (4, Informative)

G4from128k (686170) | more than 10 years ago | (#7890066)

Does the video card trace through memory every time the screen is refreshed? Therefore consuming a ton of memory bandwidth? If this is the case then the higher the resolution and the higher the refresh rate, the lower the performance of the system, right?

Yes. The pixels on the screen are read out every single frame time (i.e., 60 to 75 times each second). The DAC (Digital to Analog Converter) must be fed the pixel data every time -- with video in main RAM, there is no other place to read this image data from, because main memory is the frame buffer. The product of the frame rate, resolution, and color depth tells you how much bandwidth is consumed.

The exact performance impact is not easy to predict though. Where it gets tricky is with CPUs that have large L1, L2, and L3 caches. It is possible for the CPU to be running at 100% while the video is being read if the CPU is finding all the data and instructions in the cache. But if the CPU must access main RAM, then there will be competition.
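To make the scale concrete, here is a back-of-the-envelope sketch in Python (the resolution and refresh rate are illustrative assumptions, not figures from the post):

    # Rough estimate of display-refresh traffic for a shared-memory frame buffer.
    width, height = 1024, 768       # pixels
    bytes_per_pixel = 4             # 32-bit colour
    refresh_hz = 60                 # times the whole frame buffer is read per second

    refresh_bytes_per_sec = width * height * bytes_per_pixel * refresh_hz
    print(refresh_bytes_per_sec / 1e6)   # ~189 MB/s pulled from main RAM,
                                         # even when nothing on screen changes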

Re:Pixels are read from RAM everytime (1)

tundog (445786) | more than 10 years ago | (#7896670)

You say that every frame must be read out from main memory. This is true if the shared memory system has 0 memory (caching issues aside), but don't shared memory systems have at least a single buffer to store at least the last frame? I mean, how much can 3 Meg of RAM cost these days (i.e. 1024 x 768 @ 32 bit)?

I would imagine that the shared memory part comes into play when you have a lot of clipping. But then again, if you're buying a shared memory system, I guess the minor advantage of 3 Meg of RAM isn't really important to you anyway, so why bother making the card more expensive.

Cost & space efficiency vs. performance (2, Informative)

G4from128k (686170) | more than 10 years ago | (#7897016)

You say that every frame must be read out from main memory. This is true if the shared memory system has 0 memory (caching issues aside), but don't shared memory systems have at least a single buffer to store at least the last frame? I mean, how much can 3 Meg of RAM cost these days (i.e. 1024 x 768 @ 32 bit)?

No, these systems have no separate frame buffer - main RAM is the buffer. Even when nothing is changing on the screen, the video subsystem is reading data at the full frame rate from RAM.

Although 3 MB of RAM chips might seem cheap, every component adds cost (most system designers try to minimize the total number of components). More importantly, space on the motherboard (especially in a laptop or miniATX) is a precious commodity. The most cost efficient and space efficient way to have 3 MB of video memory in a PC is to borrow it from the 256 MB DIMM that you will be putting in there anyway.

Borrowing from main RAM may incur a slight performance penalty, but the systems that use this approach are not sold for their performance. Low cost or extreme compactness drives the designer to avoid adding special video memory buffers. And with DDR RAM, the memory bandwidth is sufficiently high not to cause too much of a performance hit.

Band*I*Width? (5, Informative)

shyster (245228) | more than 10 years ago | (#7890119)

I know I always have band*i*width issues...

But seriously, you may want to take a look at this [tomshardware.com] Tom's Hardware article detailing the weaknesses of an integrated chip.

For those looking for the quick answer, I'll do my best to summarize. First off, since integrated graphics tend to be low cost solutions, transistor counts are nowhere near current add-in boards. From the article, Nvidia's FX5200 has 47 million transistors (FX5600=80 million and FX5900=130 million), while their onboard solution (equivalent to GeForce4 MX440) has only 27 million.

Then, there's the question of memory bandwidth. Dual channel DDR 400 has a peak of 6.4GB/s, which is shared, while an equivalent GeForce4 MX440 would have a dedicated 8GB/s.
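Those peak figures fall straight out of bus width times transfer rate; a rough sketch in Python (the card-side numbers are an assumption about a 128-bit bus at 500 MT/s, not taken from the article):

    # Peak theoretical bandwidth = channels * bus width in bytes * transfers per second.
    def peak_gb_per_s(channels, bus_bits, megatransfers):
        return channels * (bus_bits // 8) * megatransfers * 1e6 / 1e9

    print(peak_gb_per_s(2, 64, 400))    # dual-channel DDR400 -> 6.4 GB/s, shared with the CPU
    print(peak_gb_per_s(1, 128, 500))   # a 128-bit card at 500 MT/s -> 8.0 GB/s, dedicated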

Now, to your question. Does this consume a ton of bandwidth and affect performance? Well, that would all depend on what you're doing with it.

If you're running 3D games and the like, then both performance and bandwidth will be an issue and limit your framerates. Comparing the previous review and this [tomshardware.com] review of add-in boards shows about a 25% reduction in framerate (at 1024x768) between an add-in GeForce4 MX440 and an nForce2 integrated chipset in UT2003, and an almost 40% reduction in 3DMark 2001. Since the machines were not identical, don't take the numbers as gospel, but they were similar enough to make a meaningful comparison IMHO.

That being said, for normal 2D work, bandwidth utilization is negligible and shouldn't seriously impact performance as shown by this [tomshardware.com] SysMark 2002 test. AFAIK, this doesn't take into account extremely intensive RAM->CPU loads, but I wouldn't expect results to vary significantly, since memory requirements for 2D work are relatively low.

Be warned, though, that Tom's Hardware did note image quality issues with most of the integrated chips, which they theorized was the result of low-cost manufacturing, not a limit of the technology itself. This theory is bolstered by the fact that their low-cost add-in card (Radeon 9200) suffered the same problems.

Re:Band*I*Width? (1)

Shanep (68243) | more than 10 years ago | (#7903308)

Now, to your question. Does this consume a ton of bandwidth and affect performance?

I don't think he is concerned with 3D rendering performance. He is concerned with the impact on main memory bandwidth, since a part of main memory is being used as a frame buffer.

For a constant image to appear on screen, the frame buffer must be read for each frame displayed: 70 times per second at 70Hz. This can add up to hundreds of megabytes per second, depriving the CPU of that bandwidth.

These shared main memory/frame buffer memory systems are not going to be real 3D speed demons.

That being said, for normal 2D work, bandwidth utilization is negligible and shouldn't seriously impact performance ... I wouldn't expect results to vary significantly, since memory requirements for 2D work are relatively low.

Using a portion of main memory as a frame buffer, even to display a static 2D image, means that that portion of memory is read completely for each frame displayed. That might be a lot of bandwidth out of what's available.

If someone can afford a system with DDR main memory (where the lost bandwidth will be less noticeable), then they probably can also afford a cheapo video card with a real, dedicated frame buffer.

Re:Band*I*Width? (1)

enigmatichmachine (214829) | more than 10 years ago | (#7917408)

I currently have one laptop: 1.5GHz mobile Athlon XP, 712MB of DDR266 RAM with shared video memory, and some ATI integrated video, I want to say a Radeon 7000 or somewhere in that area. I'm running at 1024x768.

Luckily for you all, my desktop is a 1.5GHz Athlon XP with 712MB of DDR266 RAM and a GeForce 3 (top of the line when I bought it) video card, and I run it at 1024x768, 32bpp.
I ran SiSoft Sandra on both of these computers, run 3D Studio Max animations on both of them, and use them both daily.

The stats: my laptop, despite having a mobile processor and shared video memory, is about 2% slower overall than my desktop in the various tests I've run (not including the hard drive ones). Even with shared memory, the slightly newer 3D chip in the laptop lets me work with more polys in a scene in 3ds Max, and it feels slightly more responsive than the desktop system.

Oddly enough, my roommate has an SDR-RAM Athlon XP 1.5GHz box with a Radeon 7500 (on which my mobile chip is based) and it's about 15% slower than either of mine, simply because of the memory bandwidth that DDR gives you. He'll be upgrading to play Half-Life 2; I think I'm (hopefully) sitting pretty thanks to DDR.

Re:Band*I*Width? (1)

Shanep (68243) | more than 10 years ago | (#7934985)

about 2% slower overall

Did you measure the main memory speed? I would have thought the laptop would be about 10% slower than the desktop, considering the resolution/colour depth and the usage of DDR main memory.

I think i'm (hopefully) sitting pretty thanks to DDR.

Definitely. The quicker main memory becomes, the easier they can get away with profit maximizing techniques like this. Dedicated frame buffer memory of equal speed to main memory (all other things being equal), will always be faster. But if it's only 2%, then the cost might not be worth it.

Lets do some sums (3, Informative)

EnglishTim (9662) | more than 10 years ago | (#7891944)

Let's say you're running a game at 1280 x 1024 * 32bit @ 75Hz.

1280 x 1024 x 32 x 75 = 3145728000 bits/second just to display

That's 375 Mb/s.

If you've got DDR 2700 memory, that's a peak rate of around 2540 Mb/s.

Therefore, the screen refresh alone is taking up 15% of your memory bandwidth.

You've also got to be drawing the screen every frame; let's say it's doing this 25 times a second, that the game you're playing has an average overdraw per pixel of 1.5, and that it hits the z-buffer on average twice per pixel.

You've got 125Mb/s used up with the colour and 125Mb/s used up with z-buffer accesses (assuming a 16-bit buffer); that uses up another 10% of your maximum data rate.

Overall, then, a quarter of the maximum available bandwidth is being used by the video card.
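A sketch of the same sums in Python, following the parent's assumptions (the individual rendering figures come out a little different from the round numbers above, but the conclusion, roughly a quarter of peak bandwidth, holds; see the follow-ups below for the MB-vs-Mb labelling):

    # Display refresh plus rendering traffic as a fraction of PC2700 peak bandwidth.
    W, H, BPP, REFRESH = 1280, 1024, 4, 75           # 32-bit colour at 75 Hz
    FPS, OVERDRAW = 25, 1.5                          # frames drawn per second, overdraw per pixel
    Z_HITS, Z_BYTES = 2, 2                           # z-buffer touched twice per pixel, 16-bit

    MiB = 1024 * 1024
    refresh = W * H * BPP * REFRESH                  # scan-out: ~375 MiB/s
    colour  = W * H * BPP * OVERDRAW * FPS           # colour writes while rendering
    zbuffer = W * H * Z_BYTES * Z_HITS * FPS         # z-buffer reads/writes
    peak    = 2700 * 1000 * 1000                     # PC2700 peak, in bytes/s

    total = refresh + colour + zbuffer
    print(total / MiB, "MiB/s =", round(100 * total / peak), "% of peak")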

Re:Lets do some sums (1)

Wolfrider (856) | more than 10 years ago | (#7901831)

--I think your ' x32 ' may be a bit off. IIRC I heard somewhere that color only goes up to 24bpp, and the supposed 32-bit color is just a "salesman's math"-ing of it.

Re:Lets do some sums (1)

EnglishTim (9662) | more than 10 years ago | (#7901994)

You're sort of right - '32-bit colour' is actually 24-bit with eight spare bits. However, often on a graphics card the frame buffer will store 24-bit colour using 32 bits per pixel, and use the extra 8 bits for a stencil buffer. The reason for this is that it's normally a lot quicker to access 4-byte-aligned memory than unaligned memory.

For textures, the remaining eight bits are often used as the alpha (transparency) channel.
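A hypothetical illustration of that packing in Python (the XRGB channel order is just one common layout, not a claim about any particular card):

    # Pack 24-bit colour into one aligned 32-bit word; the spare byte can hold
    # alpha (for textures) or stencil/padding (for the frame buffer).
    def pack_xrgb8888(r, g, b, x=0):
        return (x << 24) | (r << 16) | (g << 8) | b

    print(hex(pack_xrgb8888(0x12, 0x34, 0x56)))   # 0x123456 -- one 4-byte-aligned access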

Re:Lets do some sums (0)

Anonymous Coward | more than 10 years ago | (#7912030)

How did you get 2540 Mbit/s? I thought PC2700 means 2700 MByte/s.... I'm confused....

Re:Lets do some sums (1)

EnglishTim (9662) | more than 10 years ago | (#7919615)

It's 2700 MBytes/s if you count a megabyte as 1,000,000 bytes. If you count it as 1,048,576 bytes (1024 * 1024), then it works out as 2540 MBytes/s.

I think I may have written Mb rather than MB, I always forget which one is bits and which one is bytes. My bad.

Re:Lets do some sums (1)

PReDiToR (687141) | more than 10 years ago | (#7923591)

8 bits in a Byte - take a Big Byte

Yes, and it ticks me off (0)

Anonymous Coward | more than 10 years ago | (#7891999)

I've got a couple of "junk" laptops that I like to play around with, but they keep getting less and less useful because I can't upgrade the memory.
They DO have the CPU capability and they run command-line Linux and NetBSD just fine, but throw X into the mix and they severely suck.
They still suck if I'm doing the X client on my main PIII desktop and only putting the X server on the laptop...it isn't a CPU issue. It's because of memory.

Tom's Hardware have an article about that. (2, Insightful)

chrestomanci (558400) | more than 10 years ago | (#7892108)

Funny, I was just reading an article [tomshardware.com] over on Tom's Hardware Guide [tomshardware.com] about that.

The article benchmarks three different boards with integrated graphics solutions (Intel i865G, nForce2, & SiS 651) using both the integrated graphics hardware and a $50 graphics card.

Unsurprisingly, in 3D applications all have quite poor performance [tomshardware.com]; only the nForce2 system has acceptable performance, and even then only with older games at low resolution.

More important to your question, they also ran comparative benchmarks using Windows office applications [tomshardware.com], with both the integrated graphics and the $50 card. The graphs clearly show that there is no effective difference in performance, and that the benchmark results are largely CPU-bound.

In concussion, I would not expect integrated graphics to hut general computing performance. Though I would of course check that the graphics performance is adequate, as it may not be possible to update in the future.

Re:Tom's Hardware have an article about that. (4, Funny)

0x1337 (659448) | more than 10 years ago | (#7892632)


...In concussion, I would not expect integrated graphics to hut general computing performance. Though I would of course check that the graphics performance is adequate, as it may not be possible to update in the future.

Lol... stop banging your head on the desk - then you'll stop getting concussions, and integrated graphics will cease and desist making a hut over computer performance.

Re:Tom's Hardware have an article about that. (1)

Shanep (68243) | more than 10 years ago | (#7903522)

Funny, I was just reading an article over on Tom's Hardware Guide about that

I wouldn't put too much emphasis on what you read at THG. Once upon a time, an article at Tom's Hardware tried to claim that AGP provided no gains over PCI by comparing current (at the time) PCI and AGP 3D cards. A stupid, stupid way to prove the point.

The AGP cards were new and working in glorified PCI mode. Not using advanced AGP features. What's more, the software being used to benchmark "PCI vs AGP" exploited fill rate limits and NOT PCI/AGP bus limits.

Bert McComas, I believe, was the "expert's" name.

The irony was that later, using a specially crafted Quake 2 level (by S3) which used huge textures, a slow AGP Matrox G200 could be seen running the benchmark faster than a "super-fast" PCI setup of dual Voodoo2s!

This benchmark took the emphasis off fill-rate limits and onto the PCI/AGP bus. Suddenly a G200 was faster than dual Voodoo2s?! And it all came down to PCI vs AGP.

THG was trying to prove that AGP sucks by saying, "look this PCI card is faster than this AGP card, therefore AGP sucks". They refused to back down too, when techs working in the design of this industry informed them, point by point, why they were wrong. Hell, there is data on THG that proves THG wrong.

THG has little credibility as far as I am concerned.

One point about memory (3, Informative)

wowbagger (69688) | more than 10 years ago | (#7893292)

What we call "RAM" (Random Access Memory) really isn't all that good at random access.

When you read a location of RAM, the RAM chips have to read the entire row that location lives in. For a memory that is 256 million locations (where a location could be a bit, or a byte, or even a dword, depending upon the memory's layout), to read a location means loading 16 thousand locations into the sense amps of the chip.

Now, once you've fetched the data into the sense amps, reading the rest of the row out can happen much faster than that initial access.

CPUs tend to access things more or less sequentially when it comes to code (modulo jumps, calls, interrupts, and context switches), but data isn't quite as nice.

Video, on the other hand, is great from the DRAM controller's point of view - it can grab an entire row of data and shove it into the display controller's shift register. And wonder of wonders, the next request from the video refresh system is going to be the very next row!

So while video refresh does take bandwidth, in many ways driving the video controller is "cheaper" than feeding the CPU.

(the details in this post GREATLY simplified for brevity)
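A toy model of that effect in Python (the row size follows the parent's 16-thousand-location example; the cycle counts are made up purely for illustration):

    import random

    # Open-row DRAM model: opening a new row is expensive, hitting the
    # already-open row is cheap.
    ROW_SIZE = 16 * 1024
    T_ACTIVATE, T_HIT = 40, 5        # hypothetical cycle costs

    def cycles(addresses):
        open_row, total = None, 0
        for a in addresses:
            row = a // ROW_SIZE
            total += T_HIT if row == open_row else T_ACTIVATE
            open_row = row
        return total

    video_like = range(64 * 1024)                          # sequential scan-out
    cpu_like = random.sample(range(10**7), 64 * 1024)      # scattered data accesses
    print(cycles(video_like), "vs", cycles(cpu_like))      # sequential wins by a wide margin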

Depends on the implementation... (4, Informative)

mercuryresearch (680293) | more than 10 years ago | (#7893339)

In general, yes, shared memory sucks bandwidth. As others pointed out, the calculations are pretty straightforward (X * Y * #bytes/pixel * refresh rate = Bandwidth).

However, in today's systems it's FAR more complicated than this.

First, some older implementations, particularly the Intel 810, used a 4MB display cache. The net of this is that the display refresh was generally served from a secondary memory and didn't interfere with main memory bandwidth. As well, Intel used some technology Chips & Tech developed that basically did run-length encoded compression on the display refresh data (look right at your screen now: there's a LOT of white space, and RLE will shrink that substantially).

Today most chip sets incorporate a small buffer for the graphics data and compression techniques to minimize the impact of display refresh on bandwidth.
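A minimal sketch of run-length encoding a scanline (the actual Chips & Tech scheme isn't described here; this just shows why a mostly-blank screen compresses well):

    # Naive run-length encoding of one scanline into (pixel, run length) pairs.
    def rle(scanline):
        runs, prev, count = [], scanline[0], 1
        for px in scanline[1:]:
            if px == prev:
                count += 1
            else:
                runs.append((prev, count))
                prev, count = px, 1
        runs.append((prev, count))
        return runs

    # A mostly-white line with a short run of black text pixels in the middle.
    line = [0xFFFFFF] * 400 + [0x000000] * 24 + [0xFFFFFF] * 600
    print(len(line), "pixels ->", len(rle(line)), "runs")   # 1024 pixels -> 3 runs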

But wait -- it gets even MORE complicated. With integrated graphics on the north bridge of the chip set, the memory controller in the chip set knows both what the CPU and what the graphics core want to access. So the chip set actually does creative scheduling of the memory accesses so that the CPU doesn't get blocked unless absolutely necessary. So most of the time the CPU is either getting its memory needs serviced by its own cache, or it's getting (apparently) unblocked access to memory. So the impact of graphics is much less than the simple equation above would suggest.

Finally... we now have dual-channel memory systems. Even more tricks to keep the graphics and CPU memory accesses separate come into play here.

So, the short answer is yes, there's an impact, but it used to be much worse. Innovative design techniques have greatly reduced the impact so that in non-degenerate cases it doesn't affect the system too much. In a degenerate case of your app never getting cached and doing nothing but pound on the memory system with accesses, however, then you'll see the impact in line with the bandwidth equation above.

hrm (0)

Zulu (78464) | more than 10 years ago | (#7895314)

bandiwidth? :P

Shared video memory... (0)

Uplore (706578) | more than 10 years ago | (#7956278)

with system memory is a cheap option for motherboard manufacturers and practically guarantees sub-standard performance since main memory is not nearly as fast as most, if not all, dedicated video memory.