Digital Big Bang — 161 Exabytes In 2006

kdawson posted more than 7 years ago | from the had-we-but-world-enough-and-room-to-store-it dept.

Data Storage 176

An anonymous reader tips us to an AP story on a recent study of how much data we are producing. IDC estimates that in 2006 we created, captured, and replicated 161 exabytes of digital information. The last time anyone tried to estimate global information volume, in 2003, researchers at UC Berkeley came up with 5 exabytes. (The current study tries to account for duplicating data — on the same assumptions as the 2003 study it would have come out at 40 exabytes.) By 2010, according to IDC, we will be producing far more data than we will have room to store, closing in on a zettabyte.

XXX (5, Funny)

daddyrief (910385) | more than 7 years ago | (#18245188)

And half of that is porn...

Re:XXX (4, Funny)

the-amazing-blob (917722) | more than 7 years ago | (#18245196)

Way to clog the tubes up, guys. Seriously. :P

Re:XXX (2, Funny)

daddyrief (910385) | more than 7 years ago | (#18245212)

That's just what happens after a 'digital big bang.'

Re:XXX (1)

iamstretchypanda (939837) | more than 7 years ago | (#18245692)

I don't know about you, but my tubes aren't clogged at all. Seriously. :p

Re:XXX (1)

Coucho (1039182) | more than 7 years ago | (#18245878)

In other news, colleges around the country are requesting their male students to clog the tubes-- at home.

Re:XXX (1)

harp2812 (891875) | more than 7 years ago | (#18245204)

Only half? Hmm, the porn industry must be slowing down.

Re:XXX (4, Funny)

maxume (22995) | more than 7 years ago | (#18246082)

They upgraded to mpeg4.

Re:XXX (1)

LighterShadeOfBlack (1011407) | more than 7 years ago | (#18245210)

And half of that is porn...
Only half? Honestly?

Re:XXX (0, Redundant)

Quantam (870027) | more than 7 years ago | (#18245254)

And half of that is porn...

I'd imagine more in the 95%-99% range...

The number is way too low! (2, Insightful)

EmbeddedJanitor (597831) | more than 7 years ago | (#18245940)

OK, ok, I didn't RTFA and did not really RTFSummary (that's not the point of /.).

If we consider all digital data, not just the stuff that flows over the internet, then this is way too low. Consider the data in all the DTVs, GPS receivers etc.

A top-end GPS is grinding over 10^9 bits per second in its correlators (about 50 correlator channels x 20Mbps or so sampling rate). That ends up being approx 4x10^15 bytes per year per GPS... or 40,000-odd top-end GPSs would be grinding 1.61x10^20 bytes per year. There are far more than 40k high end GPSs in the world, so the budget is already blown...
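
For anyone who wants to check the arithmetic above, here is a minimal sketch. It assumes the post's figures (50 correlator channels at 20 Mbit/s each) and a 365.25-day year; the function names are made up for illustration. It comes out near 3.9x10^15 bytes per receiver per year, or roughly 40,800 receivers to reach 161 exabytes.

```python
# Back-of-envelope check of the GPS correlator estimate.
# Assumptions (from the post, not measured): 50 channels, 20 Mbit/s each.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def gps_bytes_per_year(channels=50, bits_per_second_per_channel=20e6):
    """Bytes pushed through one receiver's correlators in a year."""
    bits_per_second = channels * bits_per_second_per_channel
    return bits_per_second * SECONDS_PER_YEAR / 8


def receivers_to_match(total_bytes=161e18):
    """How many such receivers it takes to process total_bytes in a year."""
    return total_bytes / gps_bytes_per_year()


if __name__ == "__main__":
    print(f"{gps_bytes_per_year():.3e} bytes per receiver per year")
    print(f"{receivers_to_match():,.0f} receivers to reach 161 EB")
```

Note that this counts raw samples processed, which is the poster's framing and not necessarily what the study counted.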

Re:The number is way too low! (2, Informative)

Simon80 (874052) | more than 7 years ago | (#18246220)

RTFS - it's not about bandwidth, it's about unique data, knowledge, ideas, information.

Re:XXX (1)

Gunslinger47 (654093) | more than 7 years ago | (#18246204)

And half of that is porn...

...and in another five years, the other half will be Wikipedia.

Re:XXX (1)

sukotto (122876) | more than 7 years ago | (#18246292)

and most of the rest of it is crappy YouTube flicks

It was only 9 megs (5, Funny)

noewun (591275) | more than 7 years ago | (#18245198)

Without Slashdot dupes.

Re:It was only 9 megs (4, Insightful)

cmacb (547347) | more than 7 years ago | (#18245414)

HA!

But seriously, I wonder what percentage of this data is text. I'd guess it is a very very small amount. When I had a film camera, in twenty years I bet I took less than 100 rolls of film. With digital cameras I've taken thousands of pictures, sometimes taking a dozen or more of the same subject, just because the cost to me is practically zero. Now there are vendors that will let me upload large numbers of these amateurish photos for free, and let's pretend that there are enough people interested in seeing my pictures that these companies can pay for this storage with advertising. That's scary.

Excluding attachments I think it would be practically impossible for anyone to use up Google's 2 gig of storage, but I've heard of people using it up in little more than a week by mailing large attachments back and forth (oh yeah, I HAVE to have every single iteration of that Word document, sure I do!)

But what's scarier is that for some nominal fee (like $20 a year) they place no limit at all on my ability to hog a disk drive somewhere. I know people who are messed up in the head enough to want to test these claims. Give them 5 gig for photos and they've filled it up in a week, give them "unlimited" and they upload pure junk to see if they can break the thing.

Like any house of cards, this thing is gonna come down sooner or later. I just hope that people who are making sensible use of these online services don't lose everything along with the abusers.

Re:It was only 9 megs (1)

ET_Fleshy (829048) | more than 7 years ago | (#18246280)

There is an extension [mozilla.org] for Firefox that lets you use your gmail account as an online storage database. This makes it quite realistic to use up all of your space. Enjoy ;)

Re:It was only 9 megs (1)

neax (961176) | more than 7 years ago | (#18245492)

the earth is going to become one giant hard disk. perhaps we should outsource to the moon...or mars.

Finally, an excuse... (5, Funny)

bigforearms (1051976) | more than 7 years ago | (#18245216)

The furry porn gets deleted first.

Re:Finally, an excuse... (1, Funny)

Anonymous Coward | more than 7 years ago | (#18245574)

The furry porn gets deleted first.

I'd mod you down, but there's no -1, Fursecution option.

Damn you Slashdot! When will it ever stop!?!?!!~

Re:Finally, an excuse... (1)

PitaBred (632671) | more than 7 years ago | (#18246048)

You really don't need an excuse to delete that stuff. I mean it.

How many... (3, Funny)

Looce (1062620) | more than 7 years ago | (#18245222)

... times does the Library of Congress fit in that? Exabytes simply don't speak to me.

Alternatively, you can also answer in anime episodes, or mp3 files.

Re:How many... (5, Funny)

LighterShadeOfBlack (1011407) | more than 7 years ago | (#18245262)

That'd be 1,191,400 Libraries of Congress.

Honestly, I don't know why the /. editors allow these "scientific articles" that only provide data in these obscure and archaic "byte" measurements. Absurd!

Re:How many... (5, Funny)

asifyoucare (302582) | more than 7 years ago | (#18245690)

1,191,400 Libraries of Congress ought to be enough for anyone.

Re:How many... (1)

bluemonq (812827) | more than 7 years ago | (#18245420)

Assuming the "standard" fansub size of 233MB for an ED release, it would be about 760 billion episodes of anime. To put that in perspective, Naruto is currently 224 eps long (85 too many).
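
A quick sketch of the episode count above, for the curious. It assumes the 233MB fansub size from the post; reading both figures as binary units (exbibytes and mebibytes) is my assumption, made because that is the reading that reproduces roughly 760 billion. With decimal units the count comes out closer to 691 billion.

```python
# Episode count under two unit conventions.
# Assumption: 233 "MB" per episode, as stated in the post.
TOTAL_UNITS = 161    # exabytes (or exbibytes)
EPISODE_UNITS = 233  # megabytes (or mebibytes)

# Decimal (SI) reading: 1 EB = 10**18 bytes, 1 MB = 10**6 bytes.
episodes_decimal = (TOTAL_UNITS * 10**18) / (EPISODE_UNITS * 10**6)

# Binary (IEC) reading: 1 EiB = 2**60 bytes, 1 MiB = 2**20 bytes.
episodes_binary = (TOTAL_UNITS * 2**60) / (EPISODE_UNITS * 2**20)

if __name__ == "__main__":
    print(f"decimal units: {episodes_decimal / 1e9:.0f} billion episodes")
    print(f"binary units:  {episodes_binary / 1e9:.0f} billion episodes")
```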

Re:How many... (5, Funny)

Anonymous Coward | more than 7 years ago | (#18245538)

760 billion episodes of anime.

In other words, about half the length of a typical Dragonball Z fight scene.

Re:How many... (5, Informative)

franksands (938435) | more than 7 years ago | (#18245458)

Since you asked:

Oh, the equivalents! That's like 12 stacks of books that each reach from the Earth to the sun. Or you might think of it as 3 million times the information in all the books ever written, according to IDC. You'd need more than 2 billion of the most capacious iPods on the market to get 161 exabytes.

I don't have anime estimates, but I can make a Heroes [wikipedia.org] analogy. A hi-def episode is more or less 700MB. Considering the first season has 23 episodes, that would make 16.1GB. So 161 exabytes would be 10,000,000,000 (ten billion) seasons of Heroes. Since the earth currently has around 6.6 billion people, this would mean that you would have 1 episode for each person on the planet, and all the people of China, India and the US would have a second episode. That's how big it is.
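
The Heroes figures can be checked with a few lines. This is only a sketch under the post's own assumptions (700MB per episode, 23 episodes, 6.6 billion people, decimal units); the variable names are mine. The per-person ratio it computes is in seasons, about 1.5 each.

```python
# Seasons of Heroes in 161 exabytes, using the post's assumed sizes.
EPISODE_BYTES = 700 * 10**6       # 700 MB per hi-def episode (assumed)
EPISODES_PER_SEASON = 23
TOTAL_BYTES = 161 * 10**18        # 161 exabytes, decimal
WORLD_POPULATION = 6_600_000_000  # figure used in the post

season_bytes = EPISODE_BYTES * EPISODES_PER_SEASON  # 16.1 GB
seasons = TOTAL_BYTES // season_bytes               # whole seasons
seasons_per_person = seasons / WORLD_POPULATION

if __name__ == "__main__":
    print(f"one season: {season_bytes / 10**9:.1f} GB")
    print(f"seasons in 161 EB: {seasons:,}")
    print(f"seasons per person: {seasons_per_person:.2f}")
```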

Regarding the storage space, I call shenanigans. We already have HDD that stores terabytes. A couple years from now, MS office will require that space to be installed.

Re:How many... (1)

susano_otter (123650) | more than 7 years ago | (#18245638)

Um, I already have several Hero-episode equivalents stored here at home. If all this data adds up to everybody gets a DVD, and some people get two, it doesn't seem like something anybody would really even notice, amongst all the DVDs they have already.

Re:How many... (1)

franksands (938435) | more than 7 years ago | (#18245728)

Small mistake, where it reads "one episode for each person" it should read "one season for each person", and the same for the second sentence. Quite a lot, huh?

Re:How many... (1)

PitaBred (632671) | more than 7 years ago | (#18246080)

Yeah, well... I'll see your Heroes reference, and raise you a Firefly quote (that is vaguely, but not really at all, related):

Dr. Simon Tam: Uh, her... her medications are erratic. There's-there's not one that her system can't eventually break down, and...
Mal: When I want a lot of medical jargon, I'll talk to a doctor.
Dr. Simon Tam: You are talking to a doctor.

Sorry, my fault... (5, Funny)

slobber (685169) | more than 7 years ago | (#18245230)

I left cat /dev/urandom running

Re:Sorry, my fault... (1)

iPaul (559200) | more than 7 years ago | (#18245330)

You might as well. I imagine 99.999% of that is useless gibberish that people are retaining the way they retain random loose screws and bolts in coffee cans in their garage.

Re:Sorry, my fault... (0, Offtopic)

iggymanz (596061) | more than 7 years ago | (#18245742)

not to mention the random loose screw we're retaining in the white house that's full of gibberish

Re:Sorry, my fault... (4, Interesting)

product byproduct (628318) | more than 7 years ago | (#18245548)

Amazingly it would take 1,600,000 years for /dev/urandom to produce 161 exabytes (assuming 3.2 MB/s, YMMV)
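
The figure above is easy to reproduce. A minimal sketch, assuming the poster's 3.2 MB/s throughput (taken as 3.2x10^6 bytes per second) and a 365.25-day year; it lands at about 1.59 million years, which rounds to the number quoted.

```python
# How long one /dev/urandom stream needs to emit 161 exabytes.
# Assumption: 3.2 MB/s sustained, as given in the post (YMMV indeed).
TOTAL_BYTES = 161e18
BYTES_PER_SECOND = 3.2e6
SECONDS_PER_YEAR = 365.25 * 24 * 3600

years = TOTAL_BYTES / BYTES_PER_SECOND / SECONDS_PER_YEAR

if __name__ == "__main__":
    print(f"{years:,.0f} years")
```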

Re:Sorry, my fault... (1)

DarkAxi0m (928088) | more than 7 years ago | (#18245626)

not unless we use a beowulf cluster or some kinda boinc project...

Re:Sorry, my fault... (4, Funny)

Dirtside (91468) | more than 7 years ago | (#18246296)

Yes, but what he didn't say is that he left it running on every computer on earth.

And here I thought Malthus was dead (5, Insightful)

Anonymous Coward | more than 7 years ago | (#18245240)

We won't be running out of space just like we didn't run out of food. New technology will allow us to store ever more data.

Re:And here I thought Malthus was dead (3, Informative)

LighterShadeOfBlack (1011407) | more than 7 years ago | (#18245370)

As the article notes, the amount we produce is not the same as the amount we would actually want to store. Since that 161EB includes duplications such as broadcasting, phone calls, and all manner of temporary or real-time data it's not really relevant to compare that number with storage capabilities as the summary implies.

Re:And here I thought Malthus was dead (1, Insightful)

Anonymous Coward | more than 7 years ago | (#18245822)

The Earth has a finite mass, which at the very least means we can't have infinite humans on Earth. So there IS a carrying capacity, Malthus just guessed the wrong number.

Malthus has just gone down to the shops (2, Insightful)

roesti (531884) | more than 7 years ago | (#18246242)

We won't be running out of space just like we didn't run out of food. New technology will allow us to store ever more data.

I remember when software came on cassettes and when food came from close to where you live.

When floppy disks were too small, we made higher-density floppy disks, and we still needed a whole box of them.
When there wasn't enough of a particular food, we got it shipped from further away.

When CD-ROMs came out, we still ended up not only filling them but spreading things over multiple CDs.
When the imported food got too expensive, we started using chemical fertilisers to grow more of them closer to home and more cheaply.

We had to invent bigger CDs. DVD became HD-DVD and Blu-Ray. People are already complaining that they're not big enough.
We got bigger trucks and bigger boats to cover food with more preservatives and ship it here from further away, and the more of this we bought, the cheaper it got.

You got that bigger hard disk, so you could amass data and store it forever. Remember how you said you'd never fill it up? Then broadband happened, and P2P happened, and fill it up you did.
You didn't worry about it, though, the same way you didn't worry about not having enough food, either. Your supermarket is awash with thousands of varieties of food, from wherever it's cheap, and you can eat as much as you want of whatever you want.

Because everything is more available, more quickly and more easily, you now have more stuff than you could ever use. Nowadays, people don't think twice about Tivo-ing or downloading something that they're never even going to watch. As the technology gets better - as disks get bigger, and as networking gets faster - this is only going to become more prevalent.

But there is a physical limit to what can be done. Do you need a new hard drive, or a new router? What metals and chemicals are required to make them? How much energy is required? Where are they built, and how do they get to you? There's only a finite amount of this stuff in the ground, and none of this is immune to exponential growth. The people who think this can go on forever, or even for the rest of their natural lives, are kidding themselves.

Eventually, these materials will be harder to get, things will start to become more difficult to make and more expensive, and everyone will be complaining about how expensive their last computer was. Really, though, I don't even want to know these people. They've gotten their priorities all wrong.

The parent poster says we won't be running out of anything. All that's really happened is that we haven't run out yet. The planet simply can't sustain the 6.5 billion of us there are now, let alone the billions more to be born in the next few decades. The problem is that when there isn't enough to go around, some of us will be lining up for new video games and iPods, and some of us will be lining up for food, water and fuel.

I should warn you to choose wisely, but really, what do I care? Choose unwisely, and leave more for the rest of us.

Re:Malthus has just gone down to the shops (0)

Anonymous Coward | more than 7 years ago | (#18246426)

Mod parent down -1 Useless fluff

What's an exabyte? (2, Informative)

Anonymous Coward | more than 7 years ago | (#18245256)

Simply put, a lot [wikipedia.org]

10^18 bytes, or One million terabytes

Re:What's an exabyte? (2, Informative)

springbox (853816) | more than 7 years ago | (#18245528)

Did they measure in exabytes or exbibytes [wikipedia.org] (2^60 bytes)? The difference between 161 exabytes and 161 exbibytes is 24,620,362,241,702,363,136 bytes - about 21.35 exbibytes. Kind of important since the margin of error will only increase as the measured data grows. (Let's stop using the SI units when we don't actually mean it.)
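
The gap between the two units can be computed exactly with Python integers. A small sketch using only the standard definitions (1 EB = 10^18 bytes, 1 EiB = 2^60 bytes); the constant names are mine.

```python
# Exact difference between 161 exbibytes and 161 exabytes.
EXABYTE = 10**18   # SI prefix
EXBIBYTE = 2**60   # IEC binary prefix

difference_bytes = 161 * EXBIBYTE - 161 * EXABYTE
difference_in_eib = difference_bytes / EXBIBYTE

if __name__ == "__main__":
    print(f"{difference_bytes:,} bytes")   # 24,620,362,241,702,363,136 bytes
    print(f"{difference_in_eib:.2f} EiB")
```

That is a gap of roughly 15% relative to the decimal figure, and it widens with every larger prefix.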

Re:What's an exabyte? (0)

Anonymous Coward | more than 7 years ago | (#18245716)

It's not terribly important in this case because in the course of their calculation they multiplied by the pulled-out-of-their-ass single-significant-digit number "3". Also a quick look at TFA will reveal that they used "metric" exabytes.

Re:What's an exabyte? (1)

HappyEngineer (888000) | more than 7 years ago | (#18245556)

I've got a much more thorough page at: http://g42.org/tiki/tiki-index.php?page=BigNumbers [g42.org]

Yotta is the largest metric prefix and it's the next one after Zetta, so it looks like the standards people are going to have to get together to name some more prefixes.

Re:What's an exabyte? (1)

TranscendentalAnarch (1005937) | more than 7 years ago | (#18246092)

When you realize that the data created at a single personal computer in your home accounts for One 161-millionth of the entire world's data production, you gotta ask yourself: do I seriously have time to watch all this pron?

What if ISP's are forced to retain data? (4, Interesting)

cryfreedomlove (929828) | more than 7 years ago | (#18245260)

I imagine that a lot of this is web traffic logs. What if the US government really does force ISP's to keep records detailing the sites visited by their customers? Will my ISP rates increase to pay for all of that disk space?

Re:What if ISP's are forced to retain data? (1)

daeg (828071) | more than 7 years ago | (#18245268)

Just wait until the government hears that URLs change and they try to force ISPs to maintain a cache of pages along with the history.

Re:What if ISP's are forced to retain data? (2, Funny)

garcia (6573) | more than 7 years ago | (#18245362)

Will my ISP rates increase to pay for all of that disk space?

No, of course not. Any law or regulation that the government comes up with doesn't have any hidden costs.

Re:What if ISP's are forced to retain data? (2, Funny)

daeg (828071) | more than 7 years ago | (#18245388)

Costs be damned when you're The Decider and, much to the dismay of IT budgets everywhere, can change time itself on a whim!

Must be the space donuts (5, Funny)

Anonymous Coward | more than 7 years ago | (#18245278)

So the sum total of data has increased by a factor of more than 30 since 2003? I knew Brent Spiner was putting on weight, but damn.

The awesome information we retain (5, Insightful)

iPaul (559200) | more than 7 years ago | (#18245282)

Web server log files with the history of people clicking around. My address stored by everybody I ever bought anything on line from. It's more an information land-fill than an information warehouse.

And there used to be so little on-line data (5, Interesting)

Animats (122034) | more than 7 years ago | (#18245288)

What's really striking is how little data was available in machine-readable form well into the computer era. In the 1970s, the Stanford AI lab got a feed from the Associated Press wire, simply to get a source of machine-readable text for test purposes. There wasn't much out there.

In 1971, I visited Western Union's installation in Mahwah, NJ, which was mostly UNIVAC gear. (I worked at a UNIVAC site a few miles away, so I was over there to see how they did some things.) I was shown the primary Western Union international gateway, driven by a pair of real-time UNIVAC 494 computers. All Western Union message traffic between the US and Europe went through there. And the traffic volume was so small that the logging tape was just writing a block every few seconds. Of course, each message cost a few dollars to send; these were "international telegrams".

Sitting at a CRT terminal was a woman whose job it was to deal with mail bounces. About once a minute, a message would appear on her screen, and she'd correct the address if possible, using some directories she had handy, or return the message to the sender. Think about it. One person was manually handling all the e-mail bounces for all commercial US-Europe traffic. One person.

Re:And there used to be so little on-line data (0)

Anonymous Coward | more than 7 years ago | (#18246100)

One person was manually handling all the e-mail bounces for all commercial US-Europe traffic. One person.
Unbelievable! I'd pay good money to see footage of that!!

Perhaps it looks something like this [navy.mil] .

Re:And there used to be so little on-line data (0)

Anonymous Coward | more than 7 years ago | (#18246234)

Sitting at a CRT terminal was a woman whose job it was to deal with mail bounces. About once a minute, a message would appear on her screen, and she'd correct the address if possible, using some directories she had handy, or return the message to the sender. Think about it. One person was manually handling all the e-mail bounces for all commercial US-Europe traffic. One person.

That made my jaw drop.

Inquiring minds... (-1, Redundant)

Dunbal (464142) | more than 7 years ago | (#18245290)

161 Exabytes In 2006

Of course the burning question on everyone here is - exactly how much of this was slashdot dupes?

"closing in on a zettabyte" (3, Funny)

Supreme Dragon (1071194) | more than 7 years ago | (#18245294)

Is that the size of the next MS OS?

Re:"closing in on a zettabyte" (1)

josemayor1 (1070508) | more than 7 years ago | (#18245378)

What if the US government really does force ISP's to keep records detailing the sites visited by their customers?

Are you sure of this? http://www.dovoyeur.com/ [dovoyeur.com]

Re:"closing in on a zettabyte" (0)

Anonymous Coward | more than 7 years ago | (#18246218)

Shouldn't the oceans have boiled by then?

It doesn't seem like that much (1)

gelfling (6534) | more than 7 years ago | (#18245332)

I'm just one person and I have 20GB just of OS and applications code. Plus another 20GB of MP3's. 161 billion /40 is about 4 billion 'gelfling people units'. Doesn't seem like a lot.

Re:It doesn't seem like that much (1)

Umbrae (866097) | more than 7 years ago | (#18245428)

You forget that this removes duplicates.

Every OS file you have, application file you have, mp3 file you have, is only counted once. So 10000 gelflings is still only 40GB.

Re:It doesn't seem like that much (1)

gelfling (6534) | more than 7 years ago | (#18245654)

No it isn't. I have 4 other people in my house, 4 other computers and they have even more per machine. I work for a company that does outsourcing. I don't think there is a reasonable estimate for the number of physical servers we manage. It's easily in the hundred thousand plus. How much DASD? Who knows. Figure 100GB per server @ 40% utilization x 100,000 = 4,000,000GB. Double that for offline and nearline storage. That's 8,000,000GB, easily.

Re:It doesn't seem like that much (0)

Anonymous Coward | more than 7 years ago | (#18245926)

I'm just one person and I have 20GB just of OS and applications code. Plus another 20GB of MP3's.

How many music CDs do you own? Each CD has about 600-700MB. How many movie DVDs? 4.5-9GB each. VHS tapes? Maybe 1-2GB (wild guess).

Supply and demand (4, Insightful)

rufty_tufty (888596) | more than 7 years ago | (#18245340)

I'm sorry, how stupid is this?
"producing far more data than we will have room to store"

That's like saying, for the last 2 months, my profit has increased by 10%. If my profit keeps increasing at 10% per month, then pretty soon I'll own all the money in the world, and then I'll own more money than exists! Damn I must stop making money now before I destroy the world economy!!!

Who are these people who draw straight lines on growth curves? Why do people print the garbage they write and why weren't they the first against the wall after the dot com bust?
The only things that seem certain are death, taxes, entropy and stupid people...

Re:Supply and demand (4, Insightful)

Looce (1062620) | more than 7 years ago | (#18245404)

Actually, you're spending some of the money you earn, in investments. You are neither a sink nor a source of money.

Though with data, some people, or even companies, are merely sinks. They store huge amounts of data, mostly for auditing purposes. Access logs for webservers. Windows NT event logs. Setup logs for Windows Installer apps. For ISPs, a track record of people who got assigned an IP address, in case they get a subpoena. Change logs for DoD documents. Even CVS for developers, to keep track of umpteen old versions of software. Even the casual Web browsing session replicates information in your browser cache. Many more of these examples could be given.

We also need to produce more and more hardware to store these archived data, the most ubiquitous of which is the common hard drive. In the end, we'll need more metal and magnetic matter than the Earth can provide.

Martian space missions, anyone?

Re:Supply and demand (1)

Rodness (168429) | more than 7 years ago | (#18245846)

Forget Mars, we can just tow asteroids into orbit and mine them. Hell, there's already one on the way! :)

Re:Supply and demand (1)

ni1s (1065810) | more than 7 years ago | (#18245486)

You forgot to assume the summary was sensational.

"If everybody stored every digital bit, there wouldn't be enough room."
Well, Duh!

Re:Supply and demand (1)

maxume (22995) | more than 7 years ago | (#18245512)

Stupid is relative. If stupid is a certainty, it implies something else is.

We won't produce more data than can be stored. (4, Funny)

ProfessionalCookie (673314) | more than 7 years ago | (#18245354)

Data that cannot be stored will not be produced because all data that is produced must be stored. Data that is not stored (for however short a time) is not really produced.

Then again the past no longer exists anyway, the future doesn't exist yet and the present has no duration - so maybe the data never existed anyway. Maybe you don't exist?!?! Aw man maybe I *~/ disappears in a puff of logic*
----
Kudos to Augustine and Adams

Re:We won't produce more data than can be stored. (1)

xenophrak (457095) | more than 7 years ago | (#18245524)

This is of course, not true.

I routinely have to compile static versions of my company's web stores in order to archive them and they are about 1GB each of HTML once compiled.

Each store, however is about 100 megs of assets and then the data in the DB makes up another 50M or so. All of this is then generated dynamically and sent to client browsers that will just cache them temporarily. So, the data transmitted may be huge, but what people are storing would appear to be less.

Re:We won't produce more data than can be stored. (1)

Fordiman (689627) | more than 7 years ago | (#18245688)

May want to start compressing that shit. Use 7z; it's really good at noticing redundancies in logs and backups.

Of course we will (2, Interesting)

PIPBoy3000 (619296) | more than 7 years ago | (#18245702)

Think about scientific instruments that gather gigabytes of data per second. They hold on to that for as long as they have to, pulling out interesting data, summarizing it, and throwing out the rest. I track all the web hits for our corporate Intranet. The volume is so huge that the SQL administrators come and have a little heart-to-heart chat with me if I let it build up over a few months. I don't really care about the raw information past a month or so. Instead, I want to see running counts of which pages are being viewed, which people are big utilizers of our network, and so on.

A good analogy is the human brain. We gather in huge amounts of information per second via touch, sight, and so on, but throw out the vast majority of the information. The key is to have good filtering systems so that things that are interesting and relevant are held onto.

Re:We won't produce more data than can be stored. (4, Funny)

istartedi (132515) | more than 7 years ago | (#18245780)

disappears in a puff of logic

Great. Now we're all going to be inhaling second-hand logic. There ought to be a law...

2010 (1)

ni1s (1065810) | more than 7 years ago | (#18245430)

An anonymous reader tips us to an AP story on a recent study of how much data we are producing. IDC estimates that in 2010 we created, captured, and replicated close to a zettabyte of digital information. The last time anyone tried to estimate global information volume, in 2006, researchers at IDC came up with 161 exabytes. (The current study tries to account for duplicating data -- on the same assumptions as the 2006 study it would have come out at 250 exabytes.) By 2012, according to IDC, we will be producing far more data than we will have room to store, closing in on 6 zettabyte.

Internet | uniq (2, Insightful)

Duncan3 (10537) | more than 7 years ago | (#18245436)

The problem is, everything is duplicated, a LOT. All those copies need to be stored tho, so here we are swimming in data.

My work machine that I backed up a couple weeks ago, was a 30MB zip file, and 3/4 of that was my local CVS tree. So out of 30GB, less than 1/3000th was not OS, software, or just copied locally from a data store.

At home, I've saved every email, every picture, everything from my Windows, Linux, OSX and every other box I've ever had since ~1992, and that's barely a few GB uncompressed.

The amount of non-duplicate useful material is far far smaller than you would think.
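
The "| uniq" idea in the subject line can be sketched in a few lines: hash each blob of data and only count the ones not seen before. This is a toy illustration with made-up in-memory blobs, not how IDC or anyone else actually measured duplication; `unique_bytes` and the sample data are hypothetical.

```python
import hashlib


def unique_bytes(blobs):
    """Total size of the distinct blobs, identified by SHA-256 digest."""
    seen = set()
    total = 0
    for blob in blobs:
        digest = hashlib.sha256(blob).digest()
        if digest not in seen:
            seen.add(digest)
            total += len(blob)
    return total


# Five identical copies of an "OS file" plus one small personal file.
copies = [b"kernel image" * 1000] * 5 + [b"my own notes"]
raw_bytes = sum(len(blob) for blob in copies)

if __name__ == "__main__":
    print(f"raw: {raw_bytes} bytes, unique: {unique_bytes(copies)} bytes")
```

Whole-file hashing only catches exact copies; near-duplicates (a re-encoded MP3, a resaved Word document) still count as new data.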

Re:Internet | uniq (1)

Firehed (942385) | more than 7 years ago | (#18245686)

Only a few gigs? You clearly don't have a camera that shoots in RAW... I've burned through well over ten gigs of storage just from mine, and I've owned it for all of six weeks (averaging to just under 300MB per DAY of new content). Sure, email takes next to nothing and I have plenty of duplicate content, but I have over a terabyte of storage and after doing my best to trim out redundancy, I still have a very sizable chunk of it used. I suppose it's really down to usage habits, but with 10+MP cameras and HD camcorders being available at consumer-level pricing, the amount of non-duplicated original content is going to shoot through the roof.

Re:Internet | uniq (1)

maxume (22995) | more than 7 years ago | (#18245972)

Are you keeping everything or stuff that is 'worth it'? I have the keep-everything mindset, but I take steps to help myself notice how rarely I use a lot of stuff. I figure if I ever end up with a raw capable camera and get in the habit of actually shooting raw, I would at least try to render lots of it to 'good enough' jpegs and not worry about it and only keep the raw for the shots where it was interesting (and hope that my ideas of interesting didn't change too much).

SD cards are getting pretty cheap. For a consumer camera, treating 1 GB cards as write only media isn't all that crazy at this point, if you figure ~2000 photos for $12 (or even 1000). I sort of hope that 10 MP cameras don't make much progress, at least not as far out of lockstep as they are with the lenses and sensor size already.

Re:Internet | uniq (1)

Duncan3 (10537) | more than 7 years ago | (#18246116)

I am an anti-packrat, I purge all the junk every time I transfer to a new hard drive (they never last long do they) and so I keep it pretty trim, having "stuff" is generally just more of a headache. Deleting email attachments you don't need also goes a long way, those 10MB .doc files that say the same thing as the paragraph of ASCII.

I will admit the digital camera turned it from a CD into a DVD of backups, but I just need to get a really good 3-4MP camera, instead of a rather bad 8MP one. Also just deleting all the duds right away is a good 20x savings.

Re:dude (1)

maxume (22995) | more than 7 years ago | (#18246244)

Packrats are awesome:

http://en.wikipedia.org/wiki/Packrat_midden [wikipedia.org]
http://www.google.com/search?q=packrat+middens [google.com]

Watch what you say about them!

I try not to keep junk around; keyword is try. I set my 5MP camera to 3MP and don't feel like I am losing anything. Stuff I consider important fits on a DVD; stuff I will bother moving to a new drive is more like 120GB.

Re:Internet | uniq (0)

Anonymous Coward | more than 7 years ago | (#18245762)

Congratulations on not being addicted to pr0n.

2nd the motion of Firehed (1)

NotQuiteReal (608241) | more than 7 years ago | (#18245900)

I too am a programmer, and I have almost every scrap of code I ever wrote, including z80 assembly code to play "pong" on an analog oscilloscope. Why do I have it? I dunno, because I can. I don't even know where it is at this moment, but I THINK I have it on a cd-rom somewhere. And as long as that "archive bit" is set in my mind, it is ok (but if I couldn't find it, I'd just shrug and say, oh well...)

Text (code, misc letters) IS very small. Up until just a couple of years ago, all the "good stuff" would fit on a CD-R or two.

Now, I have several full DVD-Rs with copies of digital photos, and I just finished making 30+ more DVDs (of compressed data) that hold 60+ hours of old home video before the tapes rot.

By and large, there is a lot of crap that I personally don't feel a need to save (because I can always get it from somewhere else, if need be), but even "personal" stuff is adding up to hundreds of gigabytes.

Still, data is smaller than boxes of pictures and video tapes.

I didn't RTFA, but I don't see what the big deal is. From where I sit, I have computer power and data storage equivalent to what cost millions and millions of dollars at one time, in my own lifetime.

And it keeps getting cheaper...

I suspect, just like in the physical realm, "important" digital items will survive, through sheer duplication and media updates, far more often than "unimportant" items... like my family photos, but at least they have a shot.

Upload all your snapshots to a royalty free photo site and gain digital immortality via file hoarders. The only rub is that you can't let your pix be personal enough to trace back to you or the stalkers will get you.

Internet a product of biology? (2, Interesting)

blubadger (988507) | more than 7 years ago | (#18245450)

In River Out of Eden [wikipedia.org] Richard Dawkins traces the data explosion of the information age right back to the big bang.

"The genetic code is not a binary code as in computers, nor an eight-level code as in some telephone systems, but a quaternary code with four symbols. The machine code of the genes is uncannily computerlike."

Dumbest comment ever... (0)

Anonymous Coward | more than 7 years ago | (#18245506)

"By 2010, according to IDC, we will be producing far more data than we will have room to store, closing in on a zettabyte."

So I guess in the future you can't buy more hard drives or something...

stupid (0, Redundant)

Lord Ender (156273) | more than 7 years ago | (#18245616)

By 2010, according to IDC, we will be producing far more data than we will have room to store
Does anyone else find that statement to be utterly moronic?

Re:stupid (0)

Anonymous Coward | more than 7 years ago | (#18246132)

Thank god I'm not the only one.

How the hell can you produce data but be unable to store it? Sounds like a "contradictio in terminis". Is it like writing all the data to /dev/null?

How much is actually used? (4, Interesting)

basic0 (182925) | more than 7 years ago | (#18245646)

Ok, so we generate some staggering amount of computerized data every year. This is one of those stories where I can't remember hearing about it before, but it really doesn't feel like "news".

My question is how much of this data is actually being used? I'm horrible for constantly downloading e-books, movies, software, OSes, and other stuff that I'm *intending* to do something with, but often don't get around to. I end up with gigabytes of "stuff" just sucking up disc space or wasting CDs. I burned a DivX copy of Matt Stone and Trey Parker's popular pre-South Park indie film "Orgazmo" in about 2001. I've since seen the film 2 or 3 times on TV. I STILL haven't watched the DivX version I have, and now I can't find the CD I put it on. I know I'm not the only one who does this either, as many of my friends are using up loads of storage space on files they've just been too busy to have a look at.

Right now I'm on a project digitizing patient files for a neurologist. We're going up to 10 years deep with files for over 18,000 patients. Most of this is *just* for legal purposes and nobody is EVER going to open and read the majority of these files. The doctor does electronic clinics where he consults the patient and adds new pages to their file, which will probably sit there undisturbed until the Ethernet Disk fails someday.

I think a more interesting story (although probably MUCH more difficult to research) would be "How much computerized data is never used beyond its original creation on a given storage medium?"

Re:How much is actually used? (0)

Anonymous Coward | more than 7 years ago | (#18245818)

-I know I'm not the only one who does this either-

Actually, I'd bet you ARE the only one searching for a CD with a DivX rip of "Orgazmo".

Think tank bias (1)

RealGrouchy (943109) | more than 7 years ago | (#18245656)

I'd take this study's fearmongering with a grain of salt. It probably came from one of those deletionist [wikimedia.org] Think-Tanks.

- RG>

Suzumiya Haruhi (0, Offtopic)

alexjohnc3 (915701) | more than 7 years ago | (#18245710)

Make sure you watch out for giant crickets [concretebadger.net], especially if you visit Superior Japan [concretebadger.net].

Exabyte tapes (2, Funny)

Roger W Moore (538166) | more than 7 years ago | (#18245784)

So at this rate it won't be long before we will need real Exabyte tapes. I always thought the original ones should qualify for the award of world's most misleading name, since their capacity was 500 million times less than what their name suggested.

What are we supposed to do with it all??! (1)

AaronPSU777 (938553) | more than 7 years ago | (#18245786)

Seriously, forget about storing all this data, what exactly are we going to do with it?? How are we going to process and manage zettabytes' worth of data? What tools are we going to use to sift through that much data and get what we need? Should we even be keeping it?? Hell, 90% of it may well be porn. The more data we produce, the more urgent it will become to ask these sorts of questions and find the answers to them.

Google Says: (2, Interesting)

nbritton (823086) | more than 7 years ago | (#18245808)

(161 exabytes) / 6,525,170,264 people = 26.4931682 gigabytes per person.

Google Says: (2, Interesting)

nbritton (823086) | more than 7 years ago | (#18245884)

(161 exabytes) / 1,093,529,692 people[1] = 158.086639 gigabytes per person and 19.6380918 gigabytes per person if you don't count the duplicate data.

[1] Total est. of people on the Internet:
http://www.internetworldstats.com/stats.htm [internetworldstats.com]

Re:Google Says: (0)

Anonymous Coward | more than 7 years ago | (#18245946)

Unfortunately the Google calculator doesn't handle exabytes and gigabytes correctly, so you have to redo those calculations manually.
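For anyone who wants to redo the numbers by hand, here is a rough sketch in Python. It assumes the discrepancy comes from binary prefixes (exa = 2^60, giga = 2^30) versus decimal ones (10^18, 10^9); that reading is a guess that happens to reproduce the figures quoted above, not something the thread confirms. The function name `per_person` and the constant names are made up for illustration.

```python
# Per-person share of the 161 exabyte estimate under two prefix conventions.
# Assumption: "binary" means exa = 1024**6 and giga = 1024**3,
# "decimal" means exa = 1000**6 and giga = 1000**3.

TOTAL_EXABYTES = 161                # IDC estimate for 2006, from the story summary
WORLD_POPULATION = 6_525_170_264    # figure used in the first "Google Says" comment
INTERNET_USERS = 1_093_529_692      # figure used in the second "Google Says" comment


def per_person(exabytes, people, base):
    """Return gigabytes per person, with exa = base**6 and giga = base**3."""
    total_bytes = exabytes * base**6
    bytes_each = total_bytes / people
    return bytes_each / base**3


for label, people in (("world", WORLD_POPULATION), ("internet", INTERNET_USERS)):
    binary = per_person(TOTAL_EXABYTES, people, 1024)
    decimal = per_person(TOTAL_EXABYTES, people, 1000)
    print(f"{label}: {binary:.4f} GB (binary prefixes), {decimal:.4f} GB (decimal prefixes)")
```

With binary prefixes this gives about 26.49 and 158.09 GB per person, matching the comments above; with decimal prefixes it gives roughly 24.67 and 147.23 GB. The 19.64 GB "without duplicates" figure comes out of `per_person(20, INTERNET_USERS, 1024)`, so it appears to be based on 20 exabytes, though that input is inferred from the arithmetic and is not stated in the comment.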

Re:Google Says: (1)

DragonTHC (208439) | more than 7 years ago | (#18246122)

yes, yes, in the U.S. Government's citizen dossiers. 26 GB per person.

zetta--watt iss that? (1)

454_Casull (805113) | more than 7 years ago | (#18245826)

....and in the year of our lord 2012 our data became self aware.....

Low SNR (5, Insightful)

Jekler (626699) | more than 7 years ago | (#18246038)

As interesting as the sheer volume is, most of it is garbage. I'd rather have 50 terabytes of organized and accurate information than 500 exabytes of data that isn't organized, and even if it were, its accuracy is questionable at best. In essence, even if you manage to find what you want, the correctness of that information is likely to be very low.

I've long said we are not in the information age, we are in the data age. The information age will be when we've successfully organized all this crap we're storing/transmitting.

Re:Low SNR (0)

Anonymous Coward | more than 7 years ago | (#18246226)

Clearly you have not learned to properly misinterpret the text.

Re:Low SNR (1)

General Wesc (59919) | more than 7 years ago | (#18246258)

I'd rather have 50 terabytes of organized and accurate information than 500 exabytes of data that isn't organized
Everybody stand back! [xkcd.com]

The Singularity is coming! (1)

Ikoma Andy (41693) | more than 7 years ago | (#18246142)

Run!

Oblig Seinfeld? (1)

iminplaya (723125) | more than 7 years ago | (#18246272)

Yotta, yotta, yotta...