
Exhaustive Data Compressor Comparison

kdawson posted about 7 years ago | from the pick-one-smaller-or-faster dept.


crazyeyes writes "This is easily the best article I've seen comparing data compression software. The author tests 11 compressors: 7-zip, ARJ32, bzip2, gzip, SBC Archiver, Squeez, StuffIt, WinAce, WinRAR, WinRK, and WinZip. All are tested using 8 filesets: audio (WAV and MP3), documents, e-books, movies (DivX and MPEG), and pictures (PSD and JPEG). He tests them at different settings and includes the aggregated results. Spoilers: WinRK gives the best compression but operates slowest; ARJ32 is fastest but compresses least."


duh (5, Funny)

Gen. Malaise (530798) | about 7 years ago | (#18836217)

Nothing to see. High compression = slow and low compression = fast. umm duh?

small = slow (5, Funny)

Anonymous Coward | about 7 years ago | (#18836249)

So that's why smaller computers are slower, right?

Re:small = slow. Tuning UPX (Ultimate Packer for eXecutables) (1, Informative)

Anonymous Coward | about 7 years ago | (#18836855)

Book: "Digital Compression for Multimedia". PRINCIPLES & STANDARDS. Morgan Kaufmann Publishers Inc.

Interesting algorithms: I suppose the patents have expired. Key items:
  • Tail-biting LZ77.
  • Lempel-Ziv-Yokoo LZY 1992, Kiyohara and Kawabata 1996.
  • LZ78SEP.
  • LZWEP.
  • LZYEP.
No War, Peace Again!

Re:duh (2, Insightful)

timeOday (582209) | about 7 years ago | (#18836259)

So you already knew WinRK gave the best compression? I didn't; never even heard of it. My money would have been on bzip2.

Re:duh (2, Funny)

dotgain (630123) | about 7 years ago | (#18836441)

So you already knew WinRK gave the best compression? I didn't; never even heard of it.
Well, thank heavens we have now! If there's one area of computing where I've always felt I wasn't getting enough variety, it's compression algorithms and the associated apps needed to operate with them.

If there's one thing that brightens my day, it's a client sending me a PDF compressed with "Hey-boss-I-fucked-your-wife-ZIP" right on deadline.

Linux is fading away? (-1, Offtopic)

Anonymous Coward | about 7 years ago | (#18836859)

Linux weenies, explain this [google.com].

Re:duh (4, Informative)

morcego (260031) | about 7 years ago | (#18836901)

So you alreay knew WinRK gave the best compression? I didn't; never even heard of it. My money would have been on bzip2.


I agree with you on the importance of this article but ... bzip2 ? C'mon.
Yes, I know it is better than gzip, and it is also supported everywhere. But it is much worse than the "modern" compression algorithms.

I have been using LZMA for some time now for things I need to store longer, and getting good results. It is not on the list, but should give results a little bit better than RAR. Too bad it is only fast when you have a lot of memory.

For short/medium time storage, I use bzip2. Online compression, gzip (zlib), of course.
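If you want to feel that trade-off yourself, here is a minimal sketch (not from the article) using Python's standard-library gzip, bz2, and lzma modules; lzma is the same algorithm family 7-zip uses, and "sample.tar" is just a placeholder input file:

    import bz2
    import gzip
    import lzma
    import time

    def benchmark(name, compress, data):
        """Compress `data` once and report space saved and wall-clock time."""
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        saved = 100.0 * (1 - len(packed) / len(data))
        print(f"{name:5s}: saved {saved:5.1f}% in {elapsed:6.2f}s")

    with open("sample.tar", "rb") as f:   # placeholder input file
        data = f.read()

    benchmark("gzip",  lambda d: gzip.compress(d, compresslevel=9), data)
    benchmark("bzip2", lambda d: bz2.compress(d, compresslevel=9), data)
    benchmark("lzma",  lambda d: lzma.compress(d, preset=9), data)

On typical text-heavy data the ratios climb and the times stretch in exactly the order listed, which matches the gzip-for-speed, bzip2-for-medium, LZMA-for-archival split described above.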

Re:duh (5, Funny)

setirw (854029) | about 7 years ago | (#18836263)

High compression = slow and low compression = fast

You compressed the article into that statement. How long did it take to write the comment?

Re:duh (3, Insightful)

kabeer (695592) | about 7 years ago | (#18836313)

Compressing the article into that statement would technically be classed as lossy compression, e.g. JPEG.

ATTN: SWITCHEURS! (-1, Troll)

Anonymous Coward | about 7 years ago | (#18836709)

The only thing more pathetic than a PC user is a PC user trying to be a Mac user. We have a name for you people: switcheurs.

There's a good reason for your vexation at the Mac's eschewal of longtime Mac vendor Aladdin System's proprietary compressed archive format, StuffIt, for the more universally compatible "zip," along with the fact that The Rest Of Us just plain don't give a crap about compression algorithms: You don't speak its language. Remember that the Mac was designed by artists [atspace.com], for artists [atspace.com], be they poets [atspace.com], musicians [atspace.com], or avant-garde mathematicians [atspace.com]. A shiny new Mac can introduce your frathouse hovel to a modicum of good taste, but it can't make Mac users out of dweebs [atspace.com] and squares [atspace.com] like you.

So don't force what doesn't come naturally. You'll be much happier if you stick to an OS that suits your personality. And you'll be doing the rest of us a favor, too; you leave Macs to Mac users, and we'll leave beige to you.

Re:ATTN: SWITCHEURS! (0)

Anonymous Coward | about 7 years ago | (#18836967)

The only thing more pathetic than a PC user is a PC user trying to be a Mac user. We have a name for you people: switcheurs.

We have a name for you people too. Unfortunately, it can't be repeated in mixed company.

Not really (3, Insightful)

Toe, The (545098) | about 7 years ago | (#18836265)

Not every piece of software achieves maximum efficiency. It is perfectly imaginable that a compressor could be both slow and bad. It is nice to see that these compressors did not suffer that fate.

Re:duh (3, Informative)

aarusso (1091183) | about 7 years ago | (#18836321)

Well, since the dawn of ages I've seen ZIP vs. ARJ, bzip2 vs. gzip.
What's the point? The same programs compressing the same data on a different computer.

I use gzip for big files (takes less time)
I use bzip2 for small files (compresses better)
I use zip to send data to Windows people
I really, really miss ARJ32. It was my favorite in my DOS days.

Re:duh (2, Insightful)

cbreaker (561297) | about 7 years ago | (#18836937)

Hell yea. Although ARJ had slightly better compression, it allowed for *gasp* two files in the archive to be named the same!

Nowadays it's all RAR for the Usenet and Torrents and such. RAR is really great, but it's piss slow compressing anything. It's just so easy to make multipart archives with it.

I really wish StuffIt would go away...

Re:duh (1)

Petrushka (815171) | about 7 years ago | (#18836421)

I take it you didn't look at the "Compression Efficiency" graph at the bottom of each page.

Of course they don't seem to reveal their methodology for calculating that graph, but even a glance at the other tables will show that, for example, StuffIt is almost always much faster and saves very nearly as much space as 7-Zip (sometimes more). That's why comparisons like this are interesting.

Re:duh (0)

Anonymous Coward | about 7 years ago | (#18836531)

I did such a test about a month ago, but not with the exact same apps.

My quick summary:
-there are some very fast apps (like anything using the zip format), but the compression is so low it's almost pointless to bother in the first place
-there are some higher-compression formats, but the last few percent you get from them often means doubling (or more) the compression time -- not worth it unless bandwidth/file size is absolutely critical. It usually also means using compressors most people haven't heard of, which annoys them (like .7z archives, which most people will complain they can't open with WinZip -- .ace and .arj were very popular back then, but not nowadays...)
-my winner? WinRAR. Why? The best speed-to-compression tradeoff. I challenge anyone to find something (preferably without using some obscure format) with a better compression-ratio-to-time-spent ratio -- I haven't found one. It's almost as good as the most extreme compressors that take forever, but it's still very fast. Also, WinRAR's GUI is much nicer and more intuitive than many of the others, it does multipart archives really well, it handles decompression of most formats, and it has decent shell integration. Not free, though.

And when I need more "extreme" compression I use 7-Zip with the ultra setting, but it's ~8x slower than WinRAR for something around 10% smaller most of the time.

Re:duh (5, Funny)

h2g2bob (948006) | about 7 years ago | (#18836669)

Oh, if only they'd compressed the article onto a single page!

Re:duh (2, Funny)

MillionthMonkey (240664) | about 7 years ago | (#18836955)

Server Error in '/' Application.
Server Too Busy
Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.
Exception Details: System.Web.HttpException: Server Too Busy
Source Error: An unhandled exception was generated during the execution of the current web request. Information regarding the origin and location of the exception can be identified using the exception stack trace below.
Stack Trace:
[HttpException (0x80004005): Server Too Busy]
      System.Web.HttpRuntime.RejectRequestInternal(HttpWorkerRequest wr) +148
Version Information: Microsoft .NET Framework Version:1.1.4322.2300; ASP.NET Version:1.1.4322.2300

Someone should write an article about how you should always replace your default error screens and remove information identifying your server software and version.

And slashdotting == no compression at all (1)

EmbeddedJanitor (597831) | about 7 years ago | (#18836991)

Server Too Busy

Description: An unhandled exception occurred during the execution of the current web request. Please review the stack trace for more information about the error and where it originated in the code.

Exception Details: System.Web.HttpException: Server Too Busy

One question... (0)

Anonymous Coward | about 7 years ago | (#18836229)

Which compression format are you going to send the article....

WOW! (5, Funny)

vertigoCiel (1070374) | about 7 years ago | (#18836239)

I never would have guessed that there was a tradeoff between the quality and speed of compression! No way! Next they'll be saying things like 1080p HD offers quality at the expense of computational power required!

Re:WOW! (1)

seanadams.com (463190) | about 7 years ago | (#18836335)

I never would have guessed that there was a tradeoff between the quality and speed of compression! No way! Next they'll be saying things like 1080p HD offers quality at the expense of computational power required!

If you really mean quality (as opposed to compression ratio) you've got it backwards. Lossless compression algorithms are generally simpler than lossy ones, especially on the encode side. Lossy algorithms have to do a lot of additional work converting signals to the frequency domain and applying complex perceptual models.
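For a concrete feel of that extra frequency-domain work, here is a toy sketch (illustrative only, not any real codec's pipeline) of a DCT followed by coarse quantization, using SciPy:

    import numpy as np
    from scipy.fft import dct, idct

    block = np.random.default_rng(0).normal(size=8)   # stand-in for an 8-sample signal block
    coeffs = dct(block, norm="ortho")                  # transform to the frequency domain
    quantized = np.round(coeffs / 4) * 4               # coarse quantization: this is where the loss happens
    restored = idct(quantized, norm="ortho")           # the decoder's inverse transform

    print("max reconstruction error:", np.abs(block - restored).max())

A lossless encoder skips the quantization step entirely (and usually the transform too), which is why its encode side tends to be cheaper.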

Re:WOW! (2)

renegadesx (977007) | about 7 years ago | (#18836917)

No, don't forget people are ridiculous in their claims; next they'll say that 1080p takes up more disk space. Next thing you know, Bill Gates will go on record saying you actually need more than 640K of RAM.

Screw speed, size reduction: gimme compatibility (5, Insightful)

xxxJonBoyxxx (565205) | about 7 years ago | (#18836243)

Screw speed and size reduction. All I want is compatibility with other OSes (i.e., the fewest things that have to be installed on a base OS to use it). For that, I'd have to say Zip and/or gzip wins.

Re:Screw speed, size reduction: gimme compatibilit (4, Insightful)

Nogami_Saeko (466595) | about 7 years ago | (#18836305)

Nice comparison, but there are really only two that matter (at least on PCs):

ZIP for cross-platform compatibility (and for simplicity for less technically-minded users).

RAR for everything else (at 3rd in their "efficiency" list, it's easy to see why it's so popular, not to mention ease of use for splitting archives, etc).

Re:Screw speed, size reduction: gimme compatibilit (2, Informative)

BluhDeBluh (805090) | about 7 years ago | (#18836775)

It's closed-source and proprietary, though. Someone needs to make an open-source RAR compressor - the problem is you can't use the official code to do that (it's specifically forbidden in the licence), but you could use unrarlib [unrarlib.org] as a basis...

Depends on the application (1)

Toe, The (545098) | about 7 years ago | (#18836329)

Some people are sending huge graphics files and paying for bandwidth and/or sending to people with slow connections, so they actually have a use for maximal compression.

I have to agree that for most people (myself included), compatibility is all that matters. I'm so glad Macs now can natively zip. But there are valid reasons to want compression over compatibility.

You might want an interface. (1)

twitter (104583) | about 7 years ago | (#18836413)

All I want is compatibility with other OSes (i.e., the fewest things that have to be installed on a base OS to use it). For that, I'd have to say Zip and/or gzip wins.

Sure, but there's also the issue of finding the files you really want to share, and there KDE has very nice front ends. There's a nice find in Konqueror, with switches for everything including click-and-drool regular expressions. Krename copies or links files with excellent renaming. Finally, Konqueror has an archive button. The slick interface does not preclude the use of command-line tools, because the rename and archive programs will take piped input. The GUI is nice for reviewing the output and for easy further processing.

Re:You might want an interface. (0)

Anonymous Coward | about 7 years ago | (#18836691)

Erm, sorry, but what's your point? This is completely off topic; we aren't discussing how bloated and unintuitive KDE's file manager is here. The article is about archive formats.

Re:Screw speed, size reduction: gimme compatibilit (2, Interesting)

NMerriam (15122) | about 7 years ago | (#18836511)

Screw speed and size reduction. All I want is compatibility with other OSes (i.e., the fewest things that have to be installed on a base OS to use it). For that, I'd have to say Zip and/or gzip wins.


I have to admit I switched over/back to ZIP about a year ago for everything, for exactly this reason. Yeah, it meant a lot of my old archives increased in size (sometimes by quite a bit), but knowing that anything anywhere can read the archive makes up for it. ZIP creation and decoding is supported natively by Mac and Windows and most Linux distros right from the GUI, so it's brain-dead simple to deal with.

Re:Screw speed, size reduction: gimme compatibilit (1)

Deliveranc3 (629997) | about 7 years ago | (#18836743)

With quantum computing perhaps we'll start to see really elegant compression, like 2D checksums with bitshifting. If you can make all the data relate to each other, then each bit of compressed file cuts the possibilities in half; get it down to maybe 1,000,000,000 possibilities, then tell it that it needs to be able to play in Winamp and... well, use a lot of processing power.

I keep it simple (5, Funny)

Anonymous Coward | about 7 years ago | (#18836251)

I fill an old station wagon with backup tapes, and then put it in the crusher.

Re:L-Zip (2, Funny)

Anonymous Coward | about 7 years ago | (#18836557)

The L-Zip project at http://lzip.sourceforge.net/ [sourceforge.net] seems to be down right now but it should be included in any file compression comparison. It could reduce files to 0% of their original size and it was quick too.

It was so good at what it did that I bet Microsoft bought them out and are going to incorporate the technology into Windows.

Skip the blogspam (5, Informative)

Anonymous Coward | about 7 years ago | (#18836271)


As it's slashdotted, skip it. This site:
http://www.maximumcompression.com/ [maximumcompression.com]
has been up for years and performs tests on all the compressors with various input sources -- much more comprehensive.

Re:Skip the blogspam (1)

Darkinspiration (901976) | about 7 years ago | (#18836369)

My god, this site lists more than 150 compression algos... never thought there were so many of them. You learn a new thing every day.

ARTICLE TEXT (Conclusions only) (1, Informative)

Anonymous Coward | about 7 years ago | (#18836897)

Save yourself 24 pages of crap, here's the punchline:

Aggregate Results

Overall, WinRK was the champion at compressing the filesets. It had an average compression rate of 23.2%. It was 9% better at overall compression than its closest rival, SBC Archiver, which had an average compression rate of 21.3%.

The poorest compressors overall, at default settings, were the trio of WinZip, gzip and ARJ32. They only had average compression rates of about 13%. ...

However, gzip was the undisputed speed champion. It took just over 121 seconds to completely process the complete fileset collection, which weighed in at over 1.6GB. It was over a third faster than the runners-up, ARJ32 and WinZip.

The other compressors were pretty slow at their normal compression settings. However, WinRK was extremely slow, compared to the others. It took almost 1.5 hours to compress the entire fileset collection. ...

The most efficient data compressor for the aggregated results was gzip. Its super-fast compression speed, coupled with its average compression rate allowed it to become the undisputed overall efficiency champion. ARJ32 and WinZip were also very efficient compressors. They were more than twice as efficient as their nearest rivals, StuffIt and bzip2.

The other compressors may have been good at certain files, but overall, they were pretty inefficient. The most inefficient compressor overall was WinRK, by a large margin. No matter how good it was at compressing files, its extremely slow compression speed totally killed its efficiency ratings.

Conclusion

WinRK was the best compressor in most filesets it encountered. So, it was not surprising that it was the overall compression champion. However, its performance was offset by its abysmally slow performance. Even with a really fast system, it still took ages to compress the filesets. On several occasions, it took more than 18 minutes to compress just 200MB of files. Thanks to this flaw, it had the dubious honour of being the most inefficient compressor as well.

SBC Archiver, which was just slightly poorer than WinRK at compression was much faster at the job. Although it was nowhere near the top of the speed rankings, its faster speed allowed it to attain a moderate efficiency ranking.

WinRAR, which is a favourite of many Internet users, displayed a surprisingly bland performance at default settings. Although it had a pretty good overall compression rate of just under 19%, it was very slow at its default settings. That made it the third most-inefficient compressor. Surprising, isn't it?

In contrast, another perennial favourite, WinZip which had a lower overall compression rate of 13% managed to attain a much higher efficiency rating because it was able to compress the filesets much faster than WinRAR. Quite surprising since many users have abandoned it for WinRAR in view of its rather dated compression algorithm.

StuffIt is a dark horse. It has a pretty good compression rate overall but an unimpressive compression speed. However, its amazing performance with JPEG files cannot be denied. JPEG files are undeniably StuffIt's forte. No other compressor even comes within a light year of it.

gzip and ARJ32 are both the fastest and the worst compressors of the lot. They have unimpressive overall compression rates but more than make up for it with their tremendous compression speeds. Therefore, it isn't surprising to see them garner the top two spots in compressor efficiency. However, we would still recommend GUI alternatives like WinZip. It is almost as efficient as gzip and ARJ32 and far more user-friendly.

Based on our results, we can only come to one conclusion. If you do not like to change the settings of your data compressors and want a good, fast and user-friendly data compressor, then WinZip is the best one for the job.

So there you have it - the results of the Normal Compression Test.

Re:Skip the blogspam (1)

xigxag (167441) | about 7 years ago | (#18836911)

maximumcompression.com is an excellent site but it just compares compression ratio, not speed. Hence for some people, it's of limited use.

And of course, there are other factors that these types of comparisons rarely mention or that are harder to quantify: memory footprint, compression speed while multitasking (both foreground and background, single and dual core), OS/GUI integration, cross-platform availability, availability of source code, cost (particularly for enterprise users), backup options (how quiet is quiet mode), processor load (to what extent will it interfere with the use of a multimedia app), spanning options, etc. Raw comparisons are fine, but once you've eliminated the ludicrously slow/inefficient programs, you need to actually try the remaining choices before committing to them.

Re:Skip the blogspam (2, Interesting)

_|()|\| (159991) | about 7 years ago | (#18836979)

After scanning MaximumCompression's results [maximumcompression.com] (sorted by compression time) the last time one of these data compression articles hit Slashdot, I gained a newfound appreciation for ZIP and gzip:
  • they compress significantly better than any of the faster (and relatively obscure) programs
  • the programs that compress significantly better take more than twice as long
  • they're at the front of the pack for decompression time
If you have a hard limit, like a single CD or DVD, then the extra time is worth it. Otherwise, look no further than the ubiquitous ZIP.

This is nothing new (1, Informative)

Anonymous Coward | about 7 years ago | (#18836277)

I remember people did MUCH more exhaustive (30+ programs) comparisons back in the BBS days. Yes... it was a much simpler time.

What about LHA, TAR (2, Insightful)

Anonymous Coward | about 7 years ago | (#18836291)

These two formats are still widely used out there, and why are we compressing MP3's?

Re:What about LHA, TAR (3, Funny)

SirSlud (67381) | about 7 years ago | (#18836455)

TAR for compression? I woulda thought you were trolling if you didn't have LHA up there. Too bad you're anonymous, you'll never get to find out how unqualified you are for participating in this discussion.

Interesting, needs better graphs (4, Informative)

MBCook (132727) | about 7 years ago | (#18836297)

I read this earlier today through the Firehose. It was interesting, but the graphs are what struck me. It seems to me all the graphs should have been XY plots instead of pairs of histograms. That way you could easily see the relationship between compression ratio and time taken. Their "metric" for showing this, basically multiplying the two numbers, is pretty bogus and isn't nearly as easy to compare. With the XY plot the four corners are all very meaningful: one is slow with no compression, one each is good compression or good time alone, and one is the sweet spot of good compression and good time. It's easy to tell the two opposing corners apart (good compression vs. good time), whereas with the article's metric they could look very similar.

Still, interesting to see. The popular formats are VERY well established at this point (ZIP on Windows and Mac (StuffIt seems to be fading fast), and gzip and bzip2 on Linux). They are so common (especially with ZIP support built into Windows since XP and also built into OS X) that I don't think we'll see them replaced any time soon. Of course, with CPU power getting cheaper and cheaper we are seeing formats that are more and more compressed (MP3, H.264, DivX, JPEG, etc.), so these utilities are becoming less and less necessary. I no longer need to stuff files onto floppies (I've got the net, DVD-Rs, and flash drives). Heck, if you look at some of the formats they "compressed" (at like 4% max), you almost might as well use TAR.
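For what it's worth, the XY plot suggested above is only a few lines of matplotlib; the numbers below are rough figures taken from the article's aggregate conclusions plus guesses for the rest, so treat them as placeholders:

    import matplotlib.pyplot as plt

    # (space saved %, total time in seconds) -- approximate/illustrative values only
    results = {
        "gzip":   (13.0, 121),
        "WinZip": (13.0, 180),
        "WinRAR": (19.0, 900),
        "WinRK":  (23.2, 5300),
    }

    for name, (saved, seconds) in results.items():
        plt.scatter(seconds, saved)
        plt.annotate(name, (seconds, saved))

    plt.xscale("log")                        # times span orders of magnitude
    plt.xlabel("Compression time (s, log scale)")
    plt.ylabel("Space saved (%)")
    plt.title("Compression ratio vs. time taken")
    plt.show()

The interesting corners (fast-and-small vs. slow-and-small) separate at a glance, which the article's multiplied "efficiency" number hides.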

Re:Interesting, needs better graphs (1)

TubeSteak (669689) | about 7 years ago | (#18836547)

Heck, if you look at some of the formats they "compressed" (at like 4% max) you almost might as well use TAR.
For high bandwidth websites, saving 4% means saving multiple GBs of traffic

And I still zip up multiple files for sending over the internets.

Re:Interesting, needs better graphs (1)

karnal (22275) | about 7 years ago | (#18836575)

Of course, with CPU power getting cheaper and cheaper we are seeing formats that are more and more compressed (MP3, H.264, DivX, JPEG, etc.), so these utilities are becoming less and less necessary.
You do realize that you're talking about two different things depending on whether you mean something like .zip or something like .mp3? The more and more compressed options you spoke of only work well because they're for specific applications - and they're lossy to boot; the typical compression tools are lossless and work on any data set.

I don't think common compression libraries/utilities will ever fade; where there's a data set, there's always a need to get it just a little smaller....

Re:Interesting, needs better graphs (0)

Anonymous Coward | about 7 years ago | (#18836713)

You do realize that you're talking about two different things depending on whether you mean something like .zip or something like .mp3? The more and more compressed options you spoke of only work well because they're for specific applications - and they're lossy to boot; the typical compression tools are lossless and work on any data set.

His point is that as the former become more widely used (because of CPUs that can handle them on the fly), the latter become less relevant. You don't need to compress MP3s like you did with WAVs.

Which ones of these run cross platform (1)

rminsk (831757) | about 7 years ago | (#18836299)

Which compressors on the list run on non windows platforms?

Re:Which ones of these run cross platform (1)

vertigoCiel (1070374) | about 7 years ago | (#18836349)

Only gzip, bzip2, and Stuffit run multi-platform, although other programs to uncompress most of the file types used are available on most platforms.

Re:Which ones of these run cross platform (1)

metamatic (202216) | about 7 years ago | (#18836977)

Only gzip, bzip2, and Stuffit run multi-platform, although other programs to uncompress most of the file types used are available on most platforms.

That's a bit misleading. For example, PKzip may not be multi-platform, but there are good native Zip compression and decompression programs available for every major platform.

no best compression results? (1)

Uksi (68751) | about 7 years ago | (#18836301)

You have gotta be kidding me. The article is posted and there are no best-compression test results! Lame!

Poor article. (5, Insightful)

FellowConspirator (882908) | about 7 years ago | (#18836307)

This is a poor article on several points. First, the entropy of the data in the files isn't quantified. Second, the strategy used for compression isn't described at all. If WinRK compresses so well on very high-entropy data, there must be some filetype-specific strategies in use.

Versions of the programs aren't given, nor the compile-time options (for the open source ones).

Finally, Windows Vista isn't a suitable platform for conducting the tests. Most of these tools target WinXP in their current versions and changes to Vista introduced systematic differences in very basic things like memory usage, file I/O properties, etc.

The idea of the article is fine, it's just that the analysis is half-baked.

Re:Poor article. (5, Insightful)

RedWizzard (192002) | about 7 years ago | (#18836615)

I've got some more issues with the article. They didn't test filesystem compression. This would have been interesting to me because often the choice I make is not between different archivers, but between using an archiver or just compressing the directory with NTFS' native compression.

They also focused on compression rate when I believe they should have focused on decompression rate. I'll probably only archive something once, but I may read from the archive dozens of times. What matters to me is the trade-off between space saved and extra time taken to read the data, not the one-off cost of compressing it.

Re:Poor article. (1)

cpaglee (665238) | about 7 years ago | (#18836783)

The website is riddled with annoying ads and there's no way to print the article. And it's Windows-only: there is zero information on which programs use compression algorithms supported on Linux.

The data is in a pretty useless format. The data should definitely be charted in compression-vs.-speed format with identical scales to measure the significance of the compression. Compression of 7% for video is really not that interesting to me. I don't want to be bothered with the time it takes to decompress. For different users, different levels of compression are significant. For me, if I can't compress by 20% then I don't bother.

It seems almost like this article was written to get Slashdotted. The article is a complete waste of time.

What's the point of compressing JPEG,MP3,DivX etc (5, Insightful)

mochan_s (536939) | about 7 years ago | (#18836315)

What's the point of compressing JPEG, MP3, DivX, etc. since they already do the compression? The streams are close to random (with maximum information density) and all you could compress would be the headers between blocks in movies or the ID3 tag in MP3s.
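One quick way to see why: hand a general-purpose compressor some near-random bytes (a stand-in for MP3/JPEG payload, since the real files aren't reproducible here) next to plain text. A minimal sketch with Python's gzip module:

    import gzip
    import os

    random_ish = os.urandom(1 << 20)                          # stand-in for already-compressed media
    text = b"the quick brown fox jumps over the lazy dog\n" * 20000

    for label, data in [("random-ish bytes", random_ish), ("plain text", text)]:
        packed = gzip.compress(data, compresslevel=9)
        print(f"{label}: {len(data)} -> {len(packed)} bytes")

The random-looking input comes out essentially the same size (or slightly larger, from the headers), while the text shrinks dramatically.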

Re:What's the point of compressing JPEG,MP3,DivX e (1)

Lehk228 (705449) | about 7 years ago | (#18836417)

because then they can use those graphs to pump their sponsor (WinRK)

Re:What's the point of compressing JPEG,MP3,DivX e (5, Interesting)

trytoguess (875793) | about 7 years ago | (#18836701)

Er... did ya check out the comparisons? As you can see here [techarp.com], JPEG at least can be compressed considerably with StuffIt. According to this [maximumcompression.com] the program can "(partially) decode the image back to the DCT coefficients and recompress them with a much better algorithm than default Huffman coding." I've no idea what that means, but it does seem to be more thorough and complex than what you wrote.

Re:What's the point of compressing JPEG,MP3,DivX e (1)

ampathee (682788) | about 7 years ago | (#18836721)

Mod parent up! I noticed that too, very interesting. I wonder whether a JPG compressed as efficiently as the JPEG standard allows could still be improved upon by StuffIt, or whether it just takes advantage of the inefficiency of most JPG compression code...

Re:What's the point of compressing JPEG,MP3,DivX e (1)

maxume (22995) | about 7 years ago | (#18836895)

Yes. JPEG includes lossless compression. First, it discards information (which is the part that you can tune), and then it losslessly compresses the result of that step. StuffIt backs out the standard lossless compression and uses some other, better algorithm. If you are worried about it, use JPEG 2000 or something similar; they are better at discarding information.

Re:What's the point of compressing JPEG,MP3,DivX e (2, Insightful)

slim (1652) | about 7 years ago | (#18836997)

"the program can "(partially) decode the image back to the DCT coefficients and recompress them with a much better algorithm then default Huffman coding."
Whew, that makes me feel a bit dirty: detecting a file format an applying special rules. It's a bit like firewalls stepping out of their network-layer remit to mess about with application-layer protocols (e.g. to make FTP work over NAT).

Still, in both cases, it works; who can argue with that.

Re:What's the point of compressing JPEG,MP3,DivX e (3, Insightful)

athakur999 (44340) | about 7 years ago | (#18836831)

Even if the amount of additional compression is insignificant, ZIP, RAR, etc. are still very useful as container formats for MP3, JPG, etc. files, since it's easier to distribute 1 or 2 .ZIP files than 1000 individual .JPG files. And if you're going to package up a bunch of files into a single file for distribution, why not use the opportunity to save a few kilobytes here and there if it doesn't require much more time?
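A minimal sketch of that container use with Python's standard zipfile module ("photos" is just a placeholder directory); ZIP_DEFLATED still shaves off headers and the odd compressible file, while ZIP_STORED would skip the mostly futile recompression entirely:

    import zipfile
    from pathlib import Path

    src = Path("photos")                                   # placeholder directory of JPEGs
    with zipfile.ZipFile("photos.zip", "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(src.rglob("*.jpg")):
            zf.write(path, arcname=path.relative_to(src))  # one archive instead of 1000 files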

Default options and stuffit (2, Informative)

Rosyna (80334) | about 7 years ago | (#18836939)

By default, StuffIt won't even bother to compress MP3 files. That's why it shows an increase in file size (for the archive headers) and why it has the fastest throughput (it's not trying to compress). If you change the option, the results will be different.

I imagine some other codecs also have similar options for specific file types.

Hmm... (1)

neonstz (79215) | about 7 years ago | (#18836325)

They didn't think their cunning plan to create more ad revenue by creating a shitload of pages all the way through...

english language is mostly fluff (4, Funny)

Blue Shifted (1078715) | about 7 years ago | (#18836327)

The most interesting thing about text compression is that there is only about 20% information in the English language (or less). Yes, that means that 4/5ths of it is meaningless filler. Filled up with repetitive patterns. As you can see, I really didn't need four sentences to tell you that, either.

I wonder how other languages compare, and if there is a way to communicate much more efficiently.
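One crude way to put a number on that claim is to see how small a general-purpose compressor can squeeze a large English text; Shannon's classic estimate of roughly one bit of information per character, against the eight bits stored, is where figures like "20% or less" come from. A minimal sketch, with the corpus file name as a placeholder:

    import bz2

    with open("book.txt", "rb") as f:       # placeholder: any large plain-text English corpus
        text = f.read()

    packed = bz2.compress(text, compresslevel=9)
    print(f"compressed size is {len(packed) / len(text):.0%} of the original")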

Re: english language is mostly fluff (0)

Anonymous Coward | about 7 years ago | (#18836445)

Y.

Re:english language is mostly fluff (1)

maxume (22995) | about 7 years ago | (#18836491)

Does efficiency derive directly from the size of the textual representation of the words? I would think it would have to include things like robustness, expressiveness, clarity, as all of those things have a significant effect on how completely a given message is transmitted, which seems to be the numerator in the efficiency calculation.

I use a single finger (0)

Anonymous Coward | about 7 years ago | (#18836521)

I use a single finger

Re:I use a single finger (0)

Anonymous Coward | about 7 years ago | (#18836867)

I use a single finger

I don't want to hear about your anal stimulation preferences.

Re:english language is mostly fluff (1)

demonlapin (527802) | about 7 years ago | (#18836659)

Yes, you can communicate much more efficiently. Much of the length of English words is related to defining part of speech - we use "-ing" for adjectival forms of verbs, "-ly" for adverbs, etc. It's just an attribute that is expressed by a clearly recognizable pattern. As such, it's easily comprehended by the reader who can identify parts of words rather than letter-by-letter reading. This is the essence of true speed-reading.

An anecdotal observation: my wife is better than I am at linguistic tasks, but not a lot better if the words are spoken. Reading is an entirely different matter. I'm a letter-by-letter reader, I still sound out words in my head, and I'm pretty good at it - I read pulp fiction at around 100 pages an hour. She is one of those people who can read a line - or sometimes a paragraph - at a time, and routinely reads pulp books at 250 pages an hour with total recall. She can identify the pieces of words by sight and digests them instantly.

If you can figure out what she's doing, you've vastly multiplied the efficiency of human communication.

Pizzachish: setting a new standard in languages (2, Interesting)

pizzach (1011925) | about 7 years ago | (#18836693)

I have been thinking about creating a new language with about 60 or so words. The idea is that you don't need a lot of words when you can figure out the meaning by context. Strong points are that the language would be very easy to pick up, and you would get that invigorating feeling of talking like a primitive cave man.

As an example of the concept, we have the words walk and run. They are a bit too similar to be worth wasting one of our precious 60 words. Effectively, one could be dropped, with the other taking on a broader meaning, without any real repercussions. The words sit and shit are also fairly similar. When you have a guest over, you can say something like, "Please, shit down." Because of context, it would be all okay. Just remember, there is a difference between shitting on the toilet and shitting in the toilet.

Re:Pizzachish: setting a new standard in languages (1)

Kandenshi (832555) | about 7 years ago | (#18836817)

If you're going to do that, I'd suggest considering the way some languages deal with modified versions of words like warm and hot.

"warm warm" = "hot"
"walk walk" = "walk fast/run"

It could greatly reduce the number of adjectives and verbs (and other stuff) you need in the language.

Re:Pizzachish: setting a new standard in languages (1)

maxume (22995) | about 7 years ago | (#18836951)

With sixty words, you would be lucky if 'shit body area' meant shitting on the toilet (where 'body area' is your stand-in for bathroom). I doubt you could express sitting on a toilet. If you disagree, consider that a poor vocabulary for a native English speaker is something like 20,000 words, the average is 50,000, and people who speak a lot of jargon commonly have 75,000 words.

7zip (4, Insightful)

Lehk228 (705449) | about 7 years ago | (#18836357)

7-Zip cribsheet:

Weak on retarded things to zip like WAV files (use FLAC), MP3s, JPEGs, and DivX movies.

7-Zip does quite well in documents (2nd) and e-books (2nd), is 3rd on MPEG video, and 2nd in PSD.

Also, I expect 7-Zip will improve at higher-end compression settings; when possible I give it hundreds of megs, and unlike commercial apps 7-Zip can be configured well into the "insane" range.

Doesn't really matter (2, Informative)

644bd346996 (1012333) | about 7 years ago | (#18836393)

These days, file compression is pretty much only used for large downloads. In those instances, you really have to use either gzip, pkzip, or bzip2 format, so that your users can extract the file.

Yes, having a good compression algorithm is nice, but unless you can get it to partially supplant zip, you'll never make much money off it. Also, most things these days don't need to be compressed. Video and audio are already encoded with lossy compression, web pages are so full of crap that compressing them is pointless, and hard drives are big enough. Although, I haven't seen any research lately about whether compression is useful for entire filesystems to reduce the bottleneck from hard drives. Still, I suspect that it is not worth the effort.

Backups (1)

Craig Ringer (302899) | about 7 years ago | (#18836781)

File compression is also very important for backups, both for capacity and backup/restore speed. But you know what? In backups, you want to ensure that the archives are going to be recognisable and readable by as wide a variety of software as possible, so your disaster-recovery options are open. Sure, you probably encrypt them, but there, portable and fairly standard tools are also a better idea than some compression-and-archival app's built-in half-baked password protection.

As for compressing whole file systems, it doesn't work well because data compresses by variable amounts. It's hard to get a layout that handles this well - when a program overwrites a few blocks of a file, those blocks might grow and force everything to move, or force fragmentation of the file. That sort of thing. You might say to compress the data but store it in the original block layout - which works and solves the above problem, but loses you your performance gains, because the drive will generally read a whole block if part of it is needed, so you have no net change. This doesn't mean that efficient read/write compressing file systems aren't possible, just that they are hard, and probably won't perform as well as you might initially expect. They'll also have very _different_ performance characteristics because of the changes required to make them work without insane levels of fragmentation or lots of block copying.

Compressing file systems are amazing for backups, though, where files are written, read, or truncated, but rarely appended to or partially overwritten. I'd LOVE a widely supported r/w compressing FS for our backups here, but have to make do with compressed archives at the moment. Tape drives compress, but I don't have the cash for an SDLT here and we need that kind of capacity.

Exhaustive? don't forget flac.. (0)

Anonymous Coward | about 7 years ago | (#18836461)

UM, yeah, the dataset includes WAV files. Try flac [sourceforge.net]. Then you will have exhausted a little more of the compression programs available.

Re:Exhaustive? don't forget flac.. (2, Informative)

moronoxyd (1000371) | about 7 years ago | (#18836681)

> UM, yeah, the dataset includes WAV files. Try flac [sourceforge.net].
> Then you will have exhausted a little more of the compression programs available.

You are aware that all the tools tested are general purpose compressors, and FLAC is not, aren't you?

Otherwise, you would also have to talk about WavPack, Monkey's Audio, Shorten, and others.
And those are only the lossless audio codecs. What about lossy codecs?

What about all those different formats for pictures? They compress data as well.
And what about the different video codecs? ...

Moo (0, Offtopic)

Chacham (981) | about 7 years ago | (#18836493)

This is the First Post compressed really well, so it took until after a few posts to show up.

Alternate Compressor Comparisons (0)

Anonymous Coward | about 7 years ago | (#18836501)

I read the article, got shocked at the time spent comparing the compression of MP3s and DiVX, and didn't read much further.

Google's top hit turns up this site which is chock full of data on every compressor you ever & never heard of:
http://www.maximumcompression.com/index.html [maximumcompression.com]

Wikipedia has nice charts to quickly see features and OS support for a handful of common compressors:
http://en.wikipedia.org/wiki/Comparison_of_file_archivers [wikipedia.org]

The newsgroup comp.compression has been around awhile, and is maintaining [google.com] an excellent FAQ:
http://datacompression.dogma.net/index.php?title=Comp.compression_FAQ [dogma.net]

I've got you all so beat (0, Flamebait)

turing_m (1030530) | about 7 years ago | (#18836587)

I use MS DOS 6 with doublespace doubling my hard drive space. I store stuff on both C and H drives, zipping, arjing and rarring all my jpgs and divx files. I figure with the amount of compression I'm using, I'll have roughly 20 times as much room as regular plebs. Suckers!

You should also see my 133t power strip setup. I don't need extra sockets, with daisy chaining I can fit as many devices as I want! LOL LOL Unfortunately my faulty circuit breaker keeps switching off at the most inconvenient times, I'll have to get that seen to.

Re:I've got you all so beat (0)

Anonymous Coward | about 7 years ago | (#18836705)

All that effort and you're not even using XTRATANK? Lame!

You can't even BEGIN to claim maximal ub3r-133t compressionality without the first, and best, disk space doubler.

Archive Comparison Test (4, Insightful)

Repton (60818) | about 7 years ago | (#18836609)

See also: the Archive Comparison Test [compression.ca]. Covers 162 different archivers over a bunch of different file types.

It hasn't been updated in a while (5 years), but have the algorithms in popular use changed much? I remember caring about compression algorithms when I was downloading stuff from BBSs at 2400 baud, or trading software with friends on 3.5" floppies. But in these days of broadband, cheap writable CDs, and USB storage, does anyone care about squeezing the last few bytes out of an archive? zip/gzip/bzip2 are good enough for most people for most uses.

Exhaustive?! (5, Informative)

jagilbertvt (447707) | about 7 years ago | (#18836625)

It seems odd that they didn't include executables/DLLs in the comparison (whereas maximumcompression.com does). I also find it odd that they are compressing items that normally don't compress very well with most data compression programs (DivX/MPEGs/JPEGs/etc.). I'm guessing this is why 7-Zip ranked a bit lower than most.

I did some comparisons last year, and found 7-Zip to do the best job for what I needed (great compression ratio without requiring days to complete). The article also doesn't take into account the network speed at which the file is going to be transmitted. I use 7-Zip for pushing application updates and such to remote offices (most over 384k/768k WAN links). Compressing with 7-Zip has saved users quite a bit of time compared to WinRAR or WinZip.

I would definitely recommend checking out maximumcompression.com (as others have, as well) over this article. It goes into a lot greater detail.

poor sample data choices (1, Redundant)

SideshowBob (82333) | about 7 years ago | (#18836813)

It's a waste of time using a general purpose compressor on data that's already been compressed by domain specific audio or video compressors.

Compress big files before -ALL- File-Transfers (0)

Anonymous Coward | about 7 years ago | (#18836839)

I am forever amazed that originating servers & mirrors of oft-released (minor releases of large) EXEs, ISOs, etc. do NOT - by default - compress their files, ie, before the first-requested transfer happens.

PROPOSAL (not likely to be so new, I suppose):

Whenever a requested file is NOT already compressed:

1. On the Server-Side:

- [FTP or HTTP] file-transfer programs/protocols should (by default) compress them (using the best compressor for that type of file), and

- save the now-compressed version of the big file on the server (in case of future requests for the same file), and

2. On the Client-Side:

- User can be asked (unless there's been a default reply saved) in which form the file should be saved (ie, compressed or decompressed), and

- the received file is saved in the form requested by User.

We, in Australia, need such compression, as we've recently had significant INCREASES in our Internet service DATA costs... either because ISPs are just beginning to need to invest in ADSL-2+ DSLAMS -or- we're now using data for VoIP applications (and ISPs figure they're entitled to some of the $'s we save) -or- due to greed?

Others may also have high data costs.

In any case, I'm sure no one would mind some server-side changes that would reduce the sheer quantity of data that needs to be transferred.
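In the spirit of the server-side half of that proposal, here is a minimal sketch (paths are placeholders) that pre-compresses large static files once, so a web server can hand out the cached .gz variant -- for example via nginx's gzip_static module -- instead of recompressing per request or sending the raw bytes:

    import gzip
    import shutil
    from pathlib import Path

    root = Path("/var/www/downloads")                 # placeholder document root
    for path in root.rglob("*.iso"):
        target = path.with_name(path.name + ".gz")
        if target.exists():                           # already compressed on a previous run
            continue
        with open(path, "rb") as src, gzip.open(target, "wb", compresslevel=6) as dst:
            shutil.copyfileobj(src, dst)              # stream in chunks so huge files stay out of RAM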

My response to this article... (0, Troll)

wbren (682133) | about 7 years ago | (#18836925)

PK....k..6]..Y..Q...zip.huSmk.A..~..&.K!..3...GYo. s../..w.^..3...rw.na.sT.9..,$z..Tf..K..os..r.i.saS ..a..O.7...*.._BP.8.W!.`9..*..k..R;.".0.^..;.'..*. o.~L_.7.. T(w.J...6t..i..X.]...u.+..W..?.r..K...Y.O..{.."}.. *,.;..Zp..WZ).YQ.0~2)xE..59C..m+.Vk..t