Researcher Warns of "Digital Dark Age"

Soulskill posted about 6 years ago

Data Storage 367

alphadogg writes "An assistant professor from the University of Illinois at Urbana-Champaign is sounding a warning that companies, the government and researchers need to come up with a plan for preserving our increasingly digitized data in light of shifting document management and other software platforms (think WordPerfect and floppy disks). Jerome P. McDonough, who teaches at the Graduate School of Library and Information Science at the University of Illinois at Urbana-Champaign, says there exists about 369 exabytes worth of data, and that includes some pretty hard to replace stuff, including tax files, email and photos. Open standards could play a key role in any preservation effort, he says. 'If we can't keep today's information alive for future generations, we will lose a lot of our culture,' McDonough said. 'Even over the course of 10 years, you can have a rapid enough evolution in the ways people store digital information and the programs they use to access it that file formats can fall out of date.'"

Anonymous Coward | about 6 years ago

Hahahahaha fuckers!


infonography (566403) | about 6 years ago

He consorted with the Devil so he could make that post.

Confess sinner or be condemned to eternal Dial-up speeds.

Oh, and I hear he weighs as much as a duck.

The Future (0, Insightful)

Anonymous Coward | about 6 years ago

In 100 years, I won't care.

Re:The Future (1, Troll)

CRCulver (715279) | about 6 years ago

In 100 years, I won't care.

Well, with technology progressing the way it is, maybe there's some small hope that we will soon see the possibilities that Ray Kurzweil foretells in his works like The Singularity is Near. One would think that the chance that the younger generations will have indefinitely longer lifespans would encourage more people to think of long-term consequences.

They won't care either (4, Insightful)

rtfa-troll (1340807) | about 6 years ago

Most of the garbage that we have now just isn't worth keeping. The biggest problem is filtering out the junk we have so that we know what is really valuable. That would be things like great music; writing; the origins of software freedom; works of history and biography etc. Then we could store that, but the problem is we mostly store SOX inspired lies for compliance audits. This garbage takes away from any effort to store serious stuff long term. Who could we trust to do the filtering? The govt? (no please don't answer that :-)

Best thing about digital dark age... (0)

Anonymous Coward | about 6 years ago

Re:Best thing about digital dark age... (2, Funny)

binarylarry (1338699) | about 6 years ago


Well at least that's all we got out of the Word files describing the beast.

I say (5, Funny)

speedingant (1121329) | about 6 years ago

Let's go back to using basic text editors and floppy disks. Would we REALLY miss the new "XYZ 5000-tron GUI" Microsoft Word?

And who needs to store pictures and movies on their computers anyway? In fact, I think the world would be a better place without them!

Now if you'll excuse me, I'm going back to watching Iron Man on my wrist watch.

Re:I say (4, Insightful)

Anonymous Coward | about 6 years ago

It's funny how when digital culture is under attack by the RIAA people say that "software is art and deserves all the same legal protection" but when we talk about preserving 1980s and 1990s computer culture in the same way that we preserve books there are comments of ridicule. People pick some shit software and cast all software with the same (shitty) brush.

And I'm not immune of course, there's a lot of shitty software out there and it's easy to trivialise the value of Custer's Revenge or Giana Sisters but remember that historically archivists want to know about tasteless/racist video games or tributes/Mario-ripoffs just like they want to know about 1980s comedy shows and magazines.

This article is saying that libraries and archivists had a blind-spot when it came to software. It took them decades to realise that people expressed themselves artistically in this medium. Archivists didn't know that they should preserve it like we do other media.

I know how easy it is to mock these efforts (Eg, the tag "!nothingofvaluewaslost") but please consider supporting and justifying this digital culture as part of a wider effort to justify software expression.

It's easy to pick out dumb software but closing

mrmeval (662166) | about 6 years ago

Re:I say (4, Funny)

geekoid (135745) | about 6 years ago
And more importantly, a song:
sung to the Spider-Man tune.

"Iron Man, Iron Man
Does whatever an iron can
Presses pants really fine
Keeps those pleats right in line
Look out! Here comes the Iron Man" - Marvel

Marketing and Management already know! (5, Funny)

CorporateSuit (1319461) | about 6 years ago

We can just store everything in the cloud! Problem solved!

Re:Marketing and Management already know! (4, Funny)

NoobixCube (1133473) | about 6 years ago

In the cloud? Oh my god! What happens when it rains?! The farmers will have all our data! We'll have to sue the farmers for their harvest, since their crops will contain all the data and applications!

Re:Marketing and Management already know! (1)

starfishsystems (834319) | about 6 years ago

Mod parent as "funny", not "informative"! Put data into the cloud, it's not even yours to manage any more. How is that any more future-proof?

Re:Marketing and Management already know! (3, Funny)

ArsonSmith (13997) | about 6 years ago

I'm sure the story tellers of old laughed in the same way as the cave painter said, "Ug draw this story on wall."

Re:Marketing and Management already know! (0)

Anonymous Coward | about 6 years ago

I'm sure the story tellers of old laughed in the same way as the cave painter said, "Ug draw this story on wall."

Ug's 'story on wall' is still there several thousand years later. I can't even access my 1980's War College papers any more as MacWrite no longer exists.

Re:Marketing and Management already know! (2, Insightful)

geekoid (135745) | about 6 years ago

Except you can explain what a painting is; no one can clearly define what the cloud is. Mostly because it's a marketing term looking for a technical design it can adhere to.

Re:Marketing and Management already know! (1)

drig (5119) | about 6 years ago

In theory, a cloud provider (like Amazon's S3) has a responsibility to back up the data. I lose a ton of data every time my drive crashes or I reinstall w/out backing up my home directory. In contrast, about a decade ago I put some MP3 files on Xythos' WebDAV server (now known as Xythos on Demand), and they're still there. The MP3s are no big deal, but the fact is that this cloud provider stored my data for a decade.

10 years isn't exactly 'future proof', but that's the oldest cloud provider I could think of. The question is whether we think Amazon's S3 is still going to be in service in 10 years, 25 years, 100 years. I *know* my hard drive won't.

Re:Marketing and Management already know! (0)

Anonymous Coward | about 6 years ago

One response. No.

Even better (0, Offtopic)

chebucto (992517) | about 6 years ago

Just mandate that all physical media has to contain an extra partition with schematics of a drive to read the media (the schematics themselves can be saved as a .DWG). Likewise, mandate that all files contain a binary blob which defines the file format's specifications. That way all media and all files have within them the key to read them.

Problem solved!

not to worry.... (1)

wherrera (235520) | about 6 years ago

I'm not so sure that every megabyte of those old data disks is worth preserving. What of the past centuries' romance of the lost maps that had told of hidden treasure? Let there still be space for legends in future generations. Let the sleeping floppies lie :).

Re:not to worry.... (4, Insightful)

CarpetShark (865376) | about 6 years ago

Historically, things that were very uninteresting at the time have been hugely valuable to researchers later on. We may not care about the countless people talking "crap" on bebo right now, but in a few hundred years it might be a different story. When people can easily analyse all those posts for meaningful psychological profiles that aren't currently understood, never mind modelled and easily detected, all of that could tell a lot about our society. Even rubbish tips from thousands of years ago are hugely valuable to archaeologists.

This goes more so, for important government records, etc. Peter Quinn did a great job of explaining that, with his Sovereignty talk.

Re:not to worry.... (1)

mr_mischief (456295) | about 6 years ago

We may not care about the countless people talking "crap" on Slashdot right now, but in a few hundred years it might be a different story.

Sorry, but I had to fix that for you... for comedic effect.

Re:not to worry.... (1)

geckipede (1261408) | about 6 years ago

You don't need to store everything to allow historians in the far future to understand us, nor even any deliberate attempt to store a representative sample. However tiny a fraction of the data we currently consider to be useless survives, it will still be a vast amount, easily enough to fill in gaps in what you might call the official record of stuff we recognise today as being valuable.

Re:not to worry.... (2, Interesting)

pilgrim23 (716938) | about 6 years ago

Recently at work we ran into a problem where a "knowledge management" package died. The company had gone belly up and there is no converter. We are printing and re-typing in thousands of pages because there is just no other way.
I collect antiquarian books. Funny that a collection of plays printed up in Latin in 1542 only requires the learning of a language, yet a knowledge base less than 10 years old is unreadable...

Floppies (1)

Riot.ATL (1365395) | about 6 years ago

I still have floppies of Windows 3.1 ...

Re:Floppies (0)

Anonymous Coward | about 6 years ago

And I still have a floppy drive to read them.

Re:Floppies (0)

Anonymous Coward | about 6 years ago

Yeah, too bad no one has a floppy drive :P

It's kind of like having the keys to a Ferrari but not the car.. sure you can flash the keys around, but that won't pull the chicks.

Re:Floppies (2, Funny)

Riot.ATL (1365395) | about 6 years ago

Windows 3.1 floppies totally draw all the girls.

Re:Floppies (2, Funny)

Anonymous Coward | about 6 years ago

In mspaint, yeah

Re:Floppies (1)

TheRealMindChild (743925) | about 6 years ago

Which in and of itself is not special, because there are oodles of people who had windows 3.1 floppies, so the inherent chances of said floppies existing somewhere are better than... say... Windows NT 4.0 Embedded. Try and find a copy of that. You won't. You can't. Because no one cared about it. No one bought it. No one remembers it. And now no one can.

Anal (2, Insightful)

Threni (635302) | about 6 years ago

It's only because people are so anal these days. Who gives a shit? It's not like anyone in the future's going to miss anything. Even today with items like the Rosetta stone it's not worth much more than a Trivial Pursuit question - we'd not be any more educated or intelligent if stuff from 2000 years ago hadn't gone missing. Sure, there's a certain entertainment value in it all, but the idea that in 2000 years time anyone's going to be remotely bothered about the loss of websites, games and so on from the late 20th century is just ridiculous.

Re:Anal (4, Informative)

CRCulver (715279) | about 6 years ago

Even today with items like the Rosetta stone it's not worth much more than a Trivial Pursuit question - we'd not be any more educated or intelligent if stuff from 2000 years ago hadn't gone missing.

There have been instances when the metallurgy of times past was remarkably superior in some respects to later arts. Think of Damascus steel or Chinese bell-casting. Though the general trend of technology is constant progress forward, in certain cases the ancients were able to teach us a thing or two.

Re:Anal (1)

DirtySouthAfrican (984664) | about 6 years ago

Though the general trend of technology is constant progress forward, in certain cases the ancients were able to teach us a thing or two.

Including hyperspace travel and kick-ass plot devices!

Re:Anal (5, Insightful)

rugatero (1292060) | about 6 years ago

I'm reminded of this story from a few years ago, where a 500-year-old Leonardo drawing inspired improvements in mitral valve heart surgery.

Re:Anal (1)

geekoid (135745) | about 6 years ago

No it isn't. Metallurgy is far superior today. We can design at the molecular level now.
Some things aren't needed so no one bothers. That's different than their metallurgy being 'superior'.

Stop spreading that tired old myth. Next some ignorant person is going to tell me about the 'lost' Japanese sword smith technique being superior, or some other load of rubbish.
"Can't seem to be replicated" does not equal superior. Hell, I can have a sword made that is far superior to anything seen in Japan without there being any metal in it at all.

Re:Anal (3, Insightful)

DirtySouthAfrican (984664) | about 6 years ago

Fortunately not everyone shares your view. The world we live in is the way it is (for better or for worse) because it has historical context. We don't live from one moment to the next wondering where our next meal is going to come from. We plan, we dream, we reflect.

Re:Anal (1)

Threni (635302) | about 6 years ago

Yeah, we take the best - what works - and go with it, improve it etc. We don't anally store every last fart any little spotty bedroom boy comes up with because it might help with swordcraft or medicine in 500 years. I think that's being a little precious.

Re:Anal (2, Insightful)

Neon Aardvark (967388) | about 6 years ago

Given the degree of effort historians and archaeologists today put into finding as much information as possible from times past, including minutiae about how ordinary people lived their lives, you're obviously flat out wrong.

Re:Anal (2, Insightful)

Archwyrm (670653) | about 6 years ago

I think you are gravely undervaluing the worth of things from antiquity. Though I have no evidence on hand, I would wager that to say "nothing from archeology has ever helped to advance current technology" would be a falsehood. Now, I do agree with you concerning things from the late 20th century. There is already a glut of this. So much in fact that no one in their right mind 2000 years from now would want to go through all of it. Not by hand anyway.

Besides, I would rather no one saw the website that I put up in '96 ever again.

Archive... (1)

isBandGeek() (1369017) | about 6 years ago

Major file formats and how to decode them. Problem solved?

Re:Archive... (5, Insightful)

Opportunist (166417) | about 6 years ago

OPEN file formats and OPEN hardware, well documented.

Even if no program exists anymore to read your data, as long as you have the specs you can rebuild it. And I mean hard- AND software. If you know how to build it, you can build it provided you have the means. And I'm pretty confident that our future cousins will be able to build a current computer with their future technology, as long as they know WHAT they should build.

Re:Archive... (1)

Antique Geekmeister (740220) | about 6 years ago

Documenting, via open hardware standards, how to make or read a paperback book does no good when the paperbacks are manufactured with acid-laden paper. Ask your local librarian how difficult it is to preserve popular paperback novels, and how many they have to destroy each year.

Now compare that to magnetic and today's optical media. Floppies do not last long without careful handling and temperature control. Magnetic tape is subject to serious problems of the tightly wrapped tape affecting the bits on the next layer, which is why we used to rewind old magtapes every year or so and keep them in circulation rather than relying on them sitting on a shelf. Even CDs and DVDs, with well documented formats and with potentially open source content, have lifespans far shorter than the 100 years they were being advertised with. So the idea of simply keeping the formats open is clearly doomed to failure.

Re:Archive... (1)

geekoid (135745) | about 6 years ago

"OPEN file formats and OPEN hardware, well documented."

Same diff.
The format doesn't matter if you have the specs.
It is irrelevant to your OS agenda.

Besides, does anyone really think they won't be able to crack them? We're not talking about stone tablets buried in the mud, we are talking about an ever changing and documented system.
Every change, every item, every document is talked about on the internet. Being able to access the data will be trivial.

Re:Archive... (1)

Neon Aardvark (967388) | about 6 years ago

Surely the major problem is lost/damaged storage media.

I don't think there are many important files that people can't now read because of lack of documentation.

But plenty of stuff has been physically lost, e.g. loads of usenet posts from the early 80s.

Re:Archive... (1)

Archwyrm (670653) | about 6 years ago

No kidding.

Even failing that, it is not too hard to reverse engineer file formats given tools, time, and enough interest. People are doing it all the time today even. With computer systems of the future (tens or even hundreds of years) it will be even easier to smash through encryption and encodings, analyze files for patterns which make up some sort of data, mine the data for something actually useful, and so on.

Re:Archive... (1, Insightful)

Anonymous Coward | about 6 years ago

Sort of.

Try loading up an image from Dr. Halo. That was a pretty popular paint program on the PC back in the day. Depending upon your perspective, it wasn't that long ago, 20 years. I think that the format has been published. Maybe it's skewed because there isn't that much really desirable data in the Dr. Halo format but it was a pretty popular toy.

Even if the format is published (which I believe it is to some extent) it's a bit of a chore to go write a decoder. Go back a few more years, say pull some PDP11 files or EBCDIC files, it's not impossible by any stretch to decode them but the benchmark goes up just a touch more. EBCDIC may never die simply because of the size and might of IBM, but PDP11? It doesn't seem that hard to imagine a world in a few more years where people don't really know what middle endian is (not that they'll forget exactly, but it just makes the chore that bit more complicated). How often do you go back and reverse engineer a video game for the C64 or original PC or Apple II? How come nobody is modding them? (Maybe there are a few folks doing it but it's not like it's a really popular hobby.) It's not a matter of possibility so much as the benchmark to accessing the data; it cuts off the common person who wants to casually look at history. Imagine if we make a fairly radical change in the way we process data in the next 50 years. This isn't a stretch: maybe organic computing or quantum computing, where some of the "fundamentals" as we know them change. Maybe in 100 years, binary data on drives will look like punch cards look today; how often have you captured data off of a punch card on a modern computer?

Open specs are part of it; coming up with some intelligent ways to develop more timeless document formats without over engineering the hell out of them is part of it too. Maybe the most diligent thing to do would be to contrive a format for that, and as the specification grows over time, migration is included in it. That's one thing no format really "takes care of" for you: if all your documents are in Word it's all well and good, but what you want is, every time a new version comes out, to freshen all your docs to the new format and maybe resave them without the legacy stuff. Try doing that to all your jpeg digital photos. The data either has to be kept alive as part of the specification or something more intelligent has to happen. Fast forward say 100 years: your great great great grandson finds a way to pull your ODT resume off of a DVD. What's the likelihood of him building an ODT viewer to crack open the data? Even with the specs, that's a somewhat involved task.
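To make the EBCDIC and middle-endian point above concrete, here is a minimal Python sketch of what "writing a decoder" means at the smallest scale. It assumes the text uses IBM code page 037 (one of several EBCDIC variants; the comment does not say which) and that the integer is a 32-bit value in PDP-11 word order. The function names are made up for illustration.

```python
import struct


def decode_ebcdic(raw: bytes, codec: str = "cp037") -> str:
    """Decode EBCDIC bytes to text.

    Assumption: code page 037. Other EBCDIC variants (e.g. cp500)
    map some punctuation differently, so the right codec has to be known.
    """
    return raw.decode(codec)


def pdp_endian_u32(raw: bytes) -> int:
    """Read a 32-bit unsigned integer stored in PDP-11 "middle endian" order.

    The value is split into two 16-bit words, most significant word first,
    and each word is stored little-endian.
    """
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    high_word, low_word = struct.unpack("<HH", raw)
    return (high_word << 16) | low_word


# "HELLO" in EBCDIC code page 037
print(decode_ebcdic(bytes([0xC8, 0xC5, 0xD3, 0xD3, 0xD6])))  # HELLO

# 0x0A0B0C0D as a PDP-11 would lay it out in memory
print(hex(pdp_endian_u32(bytes([0x0B, 0x0A, 0x0D, 0x0C]))))  # 0xa0b0c0d
```

Neither step is hard once the layout is known; the difficulty the comment describes is knowing which code page and which byte order a file uses when nobody wrote it down.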

Which is why OOXML is the devil (-1, Flamebait)

SpaceLifeForm (228190) | about 6 years ago

If you use OOXML, you can count on not being able to
read the document after 10 years (or less).

Re:Which is why OOXML is the devil (0, Troll)

Antique Geekmeister (740220) | about 6 years ago

Try '10 minutes'. That format is simply not stable.

Re:Which is why OOXML is the devil (1)

ArsonSmith (13997) | about 6 years ago

Yea, I have to reboot my OOXML like 4 times a day.

Re:Which is why OOXML is the devil (-1, Flamebait)

Anonymous Coward | about 6 years ago

you stupid fucking bitch it's an open standard. dont you fucking understand what the fuck that means yet? if not shut your fucking mouth and tell the rest of the faggots to shut their fucking mouths too. you're all too stupid to live. go get fisted by that faggot bitch.

Re:Which is why OOXML is the devil (2, Informative)

xmarkd400x (1120317) | about 6 years ago

Words: 50 Swears: 9 Facts: 1 (It's an open standard) Relevant Points: 0 You're a little light on the swears section for typical interweb poasts.

Information outlives technology (5, Insightful)

starfishsystems (834319) | about 6 years ago

"I often ask, 'Everyone in the audience who thinks they're going to be using the same word processor in ten years, raise your hand.' No hands go up. 'Everyone who has data around that's going to have value in ten years?' After a minute's thought, every hand goes up. The lesson is clear: information outlives technology."
- Tim Bray

Re:Information outlives technology (4, Insightful)

Sebastopol (189276) | about 6 years ago

Been using Excel, MS Word since 1990 and Quicken since 1992.

I can still open all my work from my thesis, and can search credit card purchases from 20 years ago.

No problem here.

Re:Information outlives technology (3, Insightful)

gnud (934243) | about 6 years ago

The story would perhaps have been different if you had used any other software packages?

Re:Information outlives technology (1)

chardash (801837) | about 6 years ago

but ten years from /now/? the choices available in terms of operating systems and software are a little different compared with ten years ago, and will only become more varied.

Re:Information outlives technology (3, Insightful)

ceoyoyo (59147) | about 6 years ago

Change it around. Everyone who's been using the same word processor for the last ten years, raise your hands. Every hand probably goes up. For the ones that don't go up, ask: can your current word processor read files written by your word processor ten years ago? The rest go up.

I've got a few archive CDs from over ten years ago. Every file on them is readable today. Even if I'd be a little inconvenienced to dig up a copy of Corel Draw, there are lots of modern drawing and layout programs that can read the files.

Re:Information outlives technology (1)

geekoid (135745) | about 6 years ago

And he might have a point if the stages in between weren't known. Since every upgrade and change is going to be known, getting that data will be easy.

There is no digital data that can't be cracked.

And 10 years? Sure. Is the data you have now going to be valuable in 100 years? Probably not.
As long as you bring it along for the ride, it won't matter.

Subtly different from all similar warnings (3, Insightful)

Anonymous Coward | about 6 years ago

The cultural loss isn't something that should be overlooked. Some can bemoan it, but the value of culture is that it exists, and that different ones existed in the past. Culture changes from moment to moment, but without some action the real meat of the early 21st century will be lost forever. That is the big thing here, and that is justification for working for truly readable digital archival methods. There is a project of making minuscule indentations, but that requires a lot of technology to see, much less decode. Continuous duplication, by transfer of all old data across all mediums as they rise and fall, by printing content and storing it in climate regulated warehouses, etc. We relish seeing things from thousands of years ago. This is humanity, that is our legacy. We need to leave a legacy for our grandchildren.

Migrate, migrate, migrate... (4, Insightful)

I.M.O.G. (811163) | about 6 years ago

The only motivation for a company to invent new ways to preserve data long term is to provide it as a service so they can profit from it. Other than that, a company's main goal is deleting everything it legally can. Anything that no longer exists can't result in a lawsuit.

Everything that is preserved is a potential liability. For items requiring indefinite retention because they are critical to the business... They will be stored, redundant, and backed up appropriately. As the systems that provide those qualities age, they will be replaced in regular maintenance and upgrade schedules as economics and timing come together in the right proportions. In that way, reliability and long-term survivability are maintained - nothing stays on ancient systems that are unmaintainable forever. When systems go out of support, everybody has already been looking to the next solution to migrate to.

So what's wrong with this approach? It's essentially what all "big" companies are currently doing. I don't believe in this proprietary format FUD either - if the proprietary format is no longer supported, you migrate. Potential of future cost to migrate is the only concern, not survivability.

Migration is today's solution to long term storage and I see no reason it should be ignored. Like security, data retention is an ongoing objective that requires maintenance - it's not some end-state. Dreaming of a solution that will just last forever seems archaic, no?
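The migrate-and-maintain routine described above usually comes with a check that nothing was corrupted along the way. What follows is a rough Python sketch of one such step, assuming a plain file copy between two storage locations; the names `sha256_of` and `migrate_with_fixity` are made up for illustration and do not come from any particular archiving product.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate_with_fixity(source: Path, destination: Path) -> str:
    """Copy source to destination and confirm the bytes arrived unchanged.

    Returns the checksum so it can be recorded and re-checked at the next
    migration. Raises OSError if the copy does not match the original.
    """
    before = sha256_of(source)
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, destination)  # copy2 also carries over timestamps
    after = sha256_of(destination)
    if before != after:
        raise OSError(f"checksum mismatch migrating {source} -> {destination}")
    return after
```

This only covers moving the bits intact. Converting an unsupported proprietary format into a current one is the part the comment calls the potential future cost, and a checksum cannot verify that.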

Re:Migrate, migrate, migrate... (0)

Anonymous Coward | about 6 years ago

It's the same thing we do. We live short, but humanity lives long. The solution: Build new copies, fill the data in the new copies. Die.

That's life.

Of course (0)

Anonymous Coward | about 6 years ago

Because the average photos of your dog and the penis enlargement spam in your inbox will be culturally important in two hundred years.

About as important as this very comment.

Re:Of course (4, Interesting)

Bragador (1036480) | about 6 years ago

If archeologists find knives and trash to be important in a search, I'd say the average pictures that we are taking today might actually be very interesting to future generations, for they represent normal life.

Doubtful... (2, Insightful)

johnlcallaway (165670) | about 6 years ago

Most of the text in most word processing documents is easily available to be parsed out even without the specs. The formatting would be lost, as would any embedded objects or images.

Open formats would improve it, but I would be more concerned about encrypted documents and media loss than not being able to recover data (text/images/video/music/etc) from available files. There are a lot of clever people that can do amazing things with deciphering proprietary formats.

Dark? (1)

gmuslera (3436) | about 6 years ago

With so much data I would say that the darkness starts when you get blinded by the light. Search engines do a great job, but still, the relevant, unbiased and accurate data could be very well hidden by the amount of the opposite kind of data that exists in big numbers.

Until... (1)

Chordonblue (585047) | about 6 years ago

Urbana-Champaign gets to inventing HAL, I'd say they should stop wasting their time with this sort of thing...

Typical human arrogance (0)

Phizzle (1109923) | about 6 years ago

Comparing today's Internet content to the Library of Alexandria or something... Come on, the VAST majority of the data out there is comprised of pirated movies, mp3s, and pr0n. The author's concern that in case of the sudden demise of our emails, tax files and robust pr0n collections the future generations would be somehow deprived of CULTURE is laughable. When I was growing up, CULTURE came from books, museums and other things that did not involve a floppy or a hard drive.

Re:Typical human arrogance (1)

Bragador (1036480) | about 6 years ago

How are movies, music and visual arts not culturally important?

Re:Typical human arrogance (1)

Faylone (880739) | about 6 years ago

A book on paper is just as culturally relevant as the same book stored on a floppy

This is anything but a new problem (1)

Opportunist (166417) | about 6 years ago

The problem has existed for a while. Can you read 8" discs? Do you know how to build a device to read those "data drums" IBM used to store data?

Create documented hardware and use documented formats to store your data. Dump everything proprietary because chances are good you don't get the whole information you need to recreate the formats or the hardware flawlessly. If you know how to build it, you can build it. If you can build it today, you sure as hell can build it in the future with better technology.

The only problem that remains after that is data deterioration due to "disc rot". I.e. the medium used to store the information failing due to age. And this is anything but a new problem. Ask your librarian what he thinks of post-1700 paper and the ink used in that age which contained some sort of acid IIRC.

Professional Write (3, Insightful)

Zombie Ryushu (803103) | about 6 years ago

Amazing as it sounds, I still have very VERY old data that goes as far back as 7th grade when I started using computers. I know of no converter for Professional Write that will convert Professional Write documents into ODF, or even MS Word 97/2000/2003.

The only hope I have is that I can use strings to extract the text elements of the data.
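For anyone without the Unix `strings` utility to hand, the same idea can be sketched in a few lines of Python: scan the file for runs of printable ASCII and keep those above a minimum length. This is only an approximation of what `strings` does by default, and the sample bytes below are invented; they are not a real Professional Write file.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of printable ASCII at least min_len bytes long.

    Everything else (formatting codes, binary headers) is skipped, so
    layout is lost and only the raw text survives.
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


# Invented sample: text fragments separated by binary control bytes
sample = b"\x00\x01HEADER\x00\x02\x03Dear Mr. Smith,\x00\xffab\x00"
print(extract_strings(sample))  # ['HEADER', 'Dear Mr. Smith,']
```

To run it on an actual document, read the file with `open(path, "rb").read()` and pass the result in. Text stored in a non-ASCII encoding, or compressed, will not show up this way.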

Books? (3, Insightful)

fatboyslack (634391) | about 6 years ago

From the article -
"If we can't keep today's information alive for future generations," McDonough said, "we will lose a lot of our culture."


Apparently none of our culture is stored in books anymore?

Sure if every piece of data was wiped out the world would lose a lot of information... but a lot of valuable and useful information is still put on paper. I don't think that is our biggest cause for concern.

However I do agree that the world really needs to agree on more open / non-proprietary ways of storing data. Sure, I can open a .wav of Blackadder talking about 'sticking a Christmas tree' somewhere from 1992, but I have a bit of trouble opening .ra (real audio) video files from a few years ago.

And working in government, everywhere I go the electronic file storage is just a discordant mess. Anything important we have to print and store hardcopies because our electronic systems are just unreliable.

On file formats and the future (3, Insightful)

4D6963 (933028) | about 6 years ago

Open standards could play a key role in any preservation effort, he says

The way I see it there are two approaches to the problem. The Quixotic fight consisting in changing the world and forcing in a dictatorship of openness regarding file formats, which doesn't solve the problem for the past 50 years of computer history.

Or let a few hundred people around the world worry about file format parsing or, in the worst case, even emulators to do whatever old computers did. In a hundred years from now, you'll have very complete emulators for our modern PCs. Considering that a 1994 PC is quite comparable to a 2008 PC (and presumably a 2015 PC) from an emulation point of view, you know that's a given, and even then, in case there was no such emulator, you know you could find a good such emulator for machines from the 2040s, which themselves would be well emulated by machines from the 2070s, and so on.. that's what we already do. There's hardly any program you used 20 or 30 years ago that you couldn't use today.

Been there, done that. (1)

qengho (54305) | about 6 years ago

Re:Been there, done that. (0)

Anonymous Coward | about 6 years ago

deadmedia was all talk has actually been PRESERVING data

eBay to the rescue (1)

chelsel (1140907) | about 6 years ago

I bought a copy of Ami Pro on eBay to resurrect some of my old documents so I've already experienced this firsthand.

Shut up, shut up, SHUT UP! (1)

db32 (862117) | about 6 years ago

Is this a problem we really want solved? Look at the ever growing piles of data governments and companies are collecting on people. Is this something that we really want preserved forever?

So what? (0, Flamebait)

frank_adrian314159 (469671) | about 6 years ago

How many records do we have from ancient Assyrians? From the Egyptians? Romans? British Empire?

Entropy and loss happens. Most of the data deserves to be lost. How much do I care that Asuk the Assyrian was assessed two goats in taxes? Not a hell of a lot. How much will someone five years from now care about this post? About as much.

We work and write and live for today. That anything travels down the road of time intact is a miracle. What gets carried along is random. This is part of life. Get over it.

How's that for timing? (1)

cafeman (46922) | about 6 years ago

How's that for timing? PALGN just interviewed Eric Kaltman, cataloger at the Stanford University library, about his role in cataloging game-related material and the challenges that DRM and MMOs present. Stanford's part of the "preserving virtual worlds" project, along with the University of Illinois mentioned in the article. He's also the guy who writes on the How They Got Game blog, where he documents his findings.

It's an interesting field. Far more challenging than I would have thought.

People are starting to take note (2, Interesting)

duffbeer703 (177751) | about 6 years ago

Government agencies and archivists are starting to wake up to the fact that this is an issue -- I think the Office 2007 file format change was a big factor that is getting it on the radar.

Minnesota, California, Massachusetts and New York definitely have people studying the issue. Unfortunately, there are no easy answers when it comes to these things.

In my opinion -- which is not necessarily the opinion of my employer -- one of the major problems is that there are far too many records being preserved.

If you looked at the archives of a government or corporate office 30 years ago, only official memorandums, some meeting minutes and policies were retained. Today, technology like email has improved communication somewhat, but has also encouraged sloppy office practices so that it is nearly impossible to figure out what is useful and what isn't.

To compound matters, the courts are now mandating document retention and email archiving which encourages the retention of even the most banal communication.

IMO, the period 1990-2020 will be a black hole in history.

The article mixes up 2 problems... (5, Interesting)

BUL2294 (1081735) | about 6 years ago

The article talks about two very distinct and different problems--hardware and file formats. The author has a point about the hardware--if the media goes bad or if there is no way to read the data, then the data is lost. However, the author is completely off-base when it comes to file formats...

The author specifically mentions WordPerfect files. Bad example! The default file format in WordPerfect X4 (released in April 2008) is the same as what was used in WordPerfect 6--which came out in 1993 (DOS and Windows). While I can't speak for OpenOffice or Google Docs, MS-Word can read those files (and WordPerfect 5.x files) with a simple File/Open. Excel opens Lotus 1-2-3 files as well. So, Word can open popular formats in use since 1988 (WP 5.0) and Excel can open some formats in use since 1983 (1-2-3 r1a). You can also buy programs like FileMerlin to convert old documents.

Frankly, when it comes to file formats, conversion apps will exist for a LONG time. For DOS apps, you could even go so far as to create a VM or use DOSBox, load up your obsolete word processor (I miss "Leading Edge Word Processor"!) and copy/paste the text into Word or Notepad...

Image files, sounds, & videos are no exception... GIF has been around since 1987, JPEG has been around since the early '90s (opening those on a 10 MHz 8088 was slow!), and MPEG/WMV/AVI/QuickTime videos are easily openable...

Finally, the more people that are affected by obsolete files, the more interest there is in some way to convert the data... But don't forget that a LOT of the data is junk--do you really care about your 7th grade paper you wrote on Hong Kong in 1989?
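
For anyone sorting through a pile of old files, a rough sketch of the first step, working out what a file is before picking a converter, might look like the following. It only checks the leading "magic" bytes of three formats named above (WordPerfect 5.x and later, GIF, JPEG). The function names `sniff_format` and `sniff_file` are made up for illustration and are not part of any product mentioned in this thread.

```python
# Minimal sketch: identify a few legacy formats by their leading bytes.
# The signature table is deliberately tiny; real tools know hundreds of formats.
SIGNATURES = [
    (b"\xffWPC", "WordPerfect document (5.x or later)"),
    (b"GIF87a", "GIF image (87a)"),
    (b"GIF89a", "GIF image (89a)"),
    (b"\xff\xd8\xff", "JPEG image"),
]


def sniff_format(data: bytes) -> str:
    """Return a format name if the data starts with a known signature."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return "unknown"


def sniff_file(path: str) -> str:
    """Read only the first few bytes of a file and identify it (hypothetical helper)."""
    with open(path, "rb") as fh:
        return sniff_format(fh.read(16))
```

A file that comes back as "unknown" is the kind that would need the emulator or DOSBox route instead.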

Simple: shorter copyright (2, Insightful)

thisissilly (676875) | about 6 years ago

Make copyright last 5 years. Then everything worthwhile will be backed up by someone who cares about it.

Licensing Formats (1)

Josuah (26407) | about 6 years ago

The thing that I've started to dislike is the requirement that you license formats in order to use them. I fully understand where this is coming from, but there was never a need to license IP to build a microfilm reader, CD player, or VHS player (I may be oversimplifying here).

But if you want to play a Blu-ray disc, or Dolby Digital TrueHD audio, suddenly you can't just buy a bunch of off the shelf parts and build something that'll read that data.

We need to do something about making formats become open. I have no idea what that something is, though.

Slashdot again misses the point (2, Insightful)

Anonymous Coward | about 6 years ago

Everyone here seems to be missing the point -- Businesses don't need help preserving data. Anything that's really valuable and needs to be preserved will eventually be put on a laptop and lost in an airport. But what about your wedding photos? What about that book you've worked on for three years and saved in Word .doc format?

The problem of data preservation is not one business needs to address -- there's a million geeks (hi slashdot) that will be eager to earn their pay coming up with washing-machine sized solutions for business, in black cases with a stylish logo on the front. But what about me -- the person who makes less than $30k a year, keeps all my files on a laptop and an external drive, and doesn't have a lot of cash?

Let's say I want to put it in a safety deposit box and forget about it for 10 years? 20? 50? What are my options for preserving photos, videos, and text cheaply? And by cheaply, let me say less than a grand, since "cheap" seems to be relative here.

The problem is real to museum conservators. (3, Interesting)

bornwaysouth (1138751) | about 6 years ago

My late father, who retired 20 years ago as a curator of a technology museum, was bothered by this back in the '80s, as were others in the field. He had seen microfiche come and go, quite apart from the *new* digital stuff that was already being junked. He was relying on high-quality long-term photographs in nitrogen canisters. It only worked because he was storing a visual medium such as sheets of paper. Only the important ones, but about a million of them existed.

As for WordPerfect and floppy disks: yep. That's a problem in our home. We are having to migrate WP files now and then. It is not sufficient to have old computers that run the programs. I had WP on my computer (but didn't use it). A series of glitches when upgrading to SP3 had as a side effect the corruption of WP on my computer. Whatever the problem was, I could not even re-install it. We are now down to one computer that can read it.

When I worked in IT, I migrated library data. Getting it into any sort of readable text form was a trial. We have even been sent old Macintosh computers in the hope that we could get stuff off them. Usually we could, but it wasn't done economically, and I cursed the Education system that had highly paid administrators who did not even dimly consider that a data storage system had a finite lifetime. Not even 20 years after my father retired on under half their salary.

The core solution is as the original article says - for all government software, mandate that data export to a widely used open standard be available within the package at no extra charge. I do not know of any impediment to this worth considering. Where there are privacy issues, it is simply exported encrypted and funds are established that allow a few facilities to decrypt and migrate the data. If you cannot sell to government, including any educators, then you are marginal. OK, so some games will be unavailable to future generations. That is inevitable. But then that will be a reason to collect and maintain the hardware if you are a hobbyist.

As for large corporations, it may be sufficient that the auditors require that data be accessible for forensic and liquidation purposes. That is, not readily, but if need be in extreme circumstance.

In short, the immediate solution is an administrative one. Software and hardware is the relatively easy bit.

My own prize example of a dead data format - the Windows .mic image format. I have a few files still of those on my computer. You can see what the picture is if you thumbnail it. But when you try to get a full sized image, Windows says it cannot recognize the file format. It is now a .mock format. Is there a term for operating systems no longer being able to recognize their own past? 'Osheimers' for example.

My vote is for data on stone tablets (0)

Anonymous Coward | about 6 years ago

The only proven lasting method for recording history.
Let's face it, nothing else comes close.

