E-Book Museum at Library of Congress?

timothy posted more than 10 years ago | from the bruce-sterling-has-a-head-start dept.

Books 91

David H. Rothman writes "E-books and other digital publications in the U.K. are about to go into a national archive, and in fact the Brits and others have even shown an interest in the e-book technology of yore. Goodness knows, as some have pointed out, we already have enough virtual e-book museums--unwittingly created by the march of technology. But how about an International Electronic Book Museum in the Real World, ideally the Library of Congress? Before Luddites and crypto-Luddites keel over at the thought, they should keep in mind that the technology is already several decades old and that it would be helpful to collect the artifacts in a systematic way before it's too late. More at TeleRead."

A museum? (0)

Anonymous Coward | more than 10 years ago | (#7399428)

Are E-Books already a thing of the past we can only see in museums now?

Re:A museum? (-1, Troll)

Anonymous Coward | more than 10 years ago | (#7400554)

I work for the British Museum and am therefore posting anonymously. While this was not done on purpose, it was by a sole librarian, and not a decision by the British Museum. That librarian has since been let go.

E-Book Museum still uncomfortable to read. (0, Offtopic)

pheared (446683) | more than 10 years ago | (#7399432)

Need electronic paper.

Environmentalists caused fires in California (-1, Offtopic)

Anonymous Coward | more than 10 years ago | (#7399446)

by blocking sensible legislation to clear cut undergrowth...may they rot in hell those liberal bastards!

Re:Environmentalists caused fires in California (-1, Offtopic)

Anonymous Coward | more than 10 years ago | (#7399857)

hehe, ironically, some of them did get to watch their houses burn up :)

Re:Environmentalists caused fires in California (0)

Anonymous Coward | more than 10 years ago | (#7399998)

At least the western tit mouse is safe!

Oh wait, they burned up too.

Quick! (0)

Anonymous Coward | more than 10 years ago | (#7399458)

Someone download them all and create your own library!

Hard Drives-The Movie. (0)

Anonymous Coward | more than 10 years ago | (#7399460)

"Goodness knows, as some have pointed out, we already have enough virtual e-book museums--unwittingly created by the march of technology."

P2P and my hard drives.

God ,there are st00p1d n166erZ every\/\/ay3r (-1)

SpongeScrodSpareCock (717608) | more than 10 years ago | (#7399466)

God ,there are st00p1d n166erZ every\/\/ay3r

God ,there are st00p1d n166erZ every\/\/ay3r

God ,there are st00p1d n166erZ every\/\/ay3r

God ,there are st00p1d n166erZ every\/\/ay3r

God ,there are st00p1d n166erZ every\/\/ay3r

my only complaint... (5, Insightful)

chmod_localhost (718125) | more than 10 years ago | (#7399469)

What happens when the software for reading these e-books is no longer supported? By using proprietary formats, it is inevitable that one day, the stuff in our nation's own library will be unreadable.

Only by creating an open standard, which anyone can choose to implement on the system of their choice (open source it, while you're at it!), can the information truly be timeless.

Re:my only complaint... (0)

Pedro Vigdny (651840) | more than 10 years ago | (#7399515)

What is wrong with plain text files or HTML? The only important thing about books is the text. If certain e-books have images, then use HTML.

Re:my only complaint... (1)

darkwing_bmf (178021) | more than 10 years ago | (#7399528)

For most texts ASCII or Unicode work just fine and are fairly efficient and easy to compress for archive purposes. Although, I'm not really sure what format to put graphics or pictures in.

Re:my only complaint... (-1, Troll)

Anonymous Coward | more than 10 years ago | (#7399569)

Stupid troll. When linux is dead in 6 months what will we do with all your open code? Hopefully Linus will still be around so we can shove it back up his ass.

Re:my only complaint... (1)

Fedaykin_Commando (592346) | more than 10 years ago | (#7399661)

Why not bundle the application to read the format with the book storage? Problem solved!

Re:my only complaint... (3, Interesting)

farnz (625056) | more than 10 years ago | (#7399717)

What about the hardware to run it on? What about the OS? Is an eBook that only runs on 48K ZX Spectrums with Microdrives now good enough? Can we even read the media?

The advantage of an open specification for the format (unencrypted PDF would work, for example) is that provided I can access the data, and provided I have a copy of the specification, I can read the books. If I don't have the specification in an alternative format, I'm screwed. If the reader requires (say) a PC without PCI to work, and I don't have a spec, I'm screwed.

The second is more likely than the first, so I'd rather have a format with an open spec.

Re:my only complaint... (1)

Hittite Creosote (535397) | more than 10 years ago | (#7399769)

Only by creating an open standard

Stating the bleeding obvious, there's this thing called HTML. This isn't just about 'e-books', indeed those are a small part of the UK proposed law. They'd be storing webpages and electronic journal publications (e.g. science journals online). Much of which is in HTML anyway, which I was under the impression was, despite the efforts of certain large companies, an open standard implementable on the system of your choice.

Re:my only complaint... (1)

kfg (145172) | more than 10 years ago | (#7399813)

That's why my personal ebook museum is all in ASCII. The text is recoverable if you can recover the 1s and 0s in any way whatsoever.

Some people can even read the stuff directly from the printed binary, but that's a bit much for me. I'd transliterate back into text.

No need to choose and implement any new standard, we've already got a beaut for English and Unicode is coming along.
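
As a rough illustration of the point above that ASCII text is recoverable from nothing but the 1s and 0s, here is a minimal Python sketch. It assumes the bits are written 8 per character, most significant bit first, possibly with spaces between groups; the function names are made up for this example.

```python
def bits_to_text(bits: str) -> str:
    """Decode a string of 0s and 1s (8 bits per character) into ASCII text."""
    # Drop whitespace so printed groups like "01001000 01101001" are accepted.
    cleaned = "".join(bits.split())
    if len(cleaned) % 8 != 0:
        raise ValueError("expected a multiple of 8 bits")
    # int(..., 2) raises ValueError if anything other than 0 or 1 slipped in.
    data = bytes(int(cleaned[i:i + 8], 2) for i in range(0, len(cleaned), 8))
    return data.decode("ascii")


def text_to_bits(text: str) -> str:
    """Encode ASCII text as space-separated 8-bit groups."""
    return " ".join(f"{byte:08b}" for byte in text.encode("ascii"))
```

No format specification beyond the ASCII table is needed to go in either direction, which is the whole argument for plain text as an archival format.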


Re:my only complaint... (1)

henrygb (668225) | more than 10 years ago | (#7402437)

HTML is ASCII, plus some ASCII tags.

Re:my only complaint... (1)

kfg (145172) | more than 10 years ago | (#7402710)

That is correct. That would be why all my HTML encoded ebooks would be in ASCII format. :)

Rinse and repeat for XML, TeX, et al.


Re:my only complaint... (1)

stratjakt (596332) | more than 10 years ago | (#7399954)

If you reencode them in a different format, you've altered them, and you are no longer archiving the original eBooks.

Keep em as they are. Our primitive 1024 bit encryption keys will be a joke to the quantum processing space men of the future, anyways.

It'd be like translating french works from folks like Voltaire or Hugo into english, and throwing out the original manuscripts, because it will be easier for future historians to grok.

The medium is the message.

Re:my only complaint... (1)

agrippa_cash (590103) | more than 10 years ago | (#7400818)

In the early days of film, the LOC would only copyright things on paper. This resulted in many old films being printed on paper solely for the purpose of copyright. In retrospect this seems absurd; however, a lesson learned may be that the LOC can (and maybe should) require that THEIR copy be manufactured using some archival method which may be unsuited for general distribution.

Let's just hope... (4, Insightful)

CountBrass (590228) | more than 10 years ago | (#7399482)

They don't make the same mistake as the BBC's Domesday Project, where they stored all the data on quickly obsoleted BBC Micro controlled laser discs using a proprietary format - whoops! A real pain for them to recover it only a decade later.

Re:Let's just hope... (1)

4of12 (97621) | more than 10 years ago | (#7399932)

I can see it now...

We've got an archive full of documents and emails sent from the PM about Dr David Kelly, they're right here.

Oops - anybody got a working Windows RMS hooked up?

Re:Let's just hope... (1)

_Laban_ (166315) | more than 10 years ago | (#7400013)

I really don't think there's a Windows version of RMS.

Re:Let's just hope... (0)

Anonymous Coward | more than 10 years ago | (#7400171)

I thought that was controlled by a BBC Master? Nitpick, nitpick...

Re:Let's just hope... (1)

ajs318 (655362) | more than 10 years ago | (#7401362)

The Domesday Project was rescued just in time, but should stand as an example to everyone of just why it is important to have copies made of works in a format that will be readable in the future - even if this means using a different medium and eschewing technological copy-prevention measures for archive copies. This is our duty: to preserve this material so it can enter the public domain when its copyright expires.

The format was not actually too hard to hack into, as the video discs were CAV analogue with modified analogue audio {one of the two channels was used for data, by means of custom high-speed modem; the other channel was fed to both speaker outputs} and the low-speed serial comms between the computer and the player were easy to work out.

It could so easily have been worse. What if a descendant of the format was still in use by a manufacturing cartel who refused to release its specs for fear of compromising their dirty secrets? That hardly bears thinking about .....

Re:Let's just hope... (0)

fuckfuck101 (699067) | more than 10 years ago | (#7402536)

Not really. You simply dig out some old hardware to go with it. Not hard. Hell you could go to a museum and pick one out. Technology doesn't just evaporate once something better comes along, it still exists, it just isn't used in the mainstream.

The hardware is a valued piece of a history just like anything that may run on it.

Slashdot hits a new low (-1, Offtopic)

Anonymous Coward | more than 10 years ago | (#7399495)

sorry, but it's the truth

Re:Slashdot hits a new low (-1, Offtopic)

Anonymous Coward | more than 10 years ago | (#7399891)

Don't worry. Next week will have an even more mundane topic that will make you yearn for ebook museums.

Books on Tape Archive? (3, Insightful)

sporkboy (22212) | more than 10 years ago | (#7399507)

How many eBooks have been released as eBook only, not counting prereleases of excerpts or first chapters with "special intros". Aren't most of them just existing publications in a different format? If the format dies then there is a reason, and if the work continues in some sort of archival medium then how is it a loss? Would the same lamentations be heard over cassette recordings of books on tape?

e-books are searchable (1)

balamw (552275) | more than 10 years ago | (#7399604)

I agree with you, but for the fact that unlike the dead tree or audio formats, the e-book has at least the potential to be full-text searchable. Which could be invaluable for the work in question.

If this flies we wouldn't need Distributed Proofreaders anymore. B

Re:e-books are searchable (1)

kfg (145172) | more than 10 years ago | (#7399901)

And if I may segue from an earlier post this is another reason to stick to ASCII/Unicode. grep is great. grep is good. grep (and his buddies sed, awk and Perl) moved text searching from the realm of the "potential" to the fully realized, lo these many years ago.

It's only the commercial interests that feel the need for new text format and new text tools for that format.

Fuck 'em. Don't let 'em do it. Only buy ebooks in the existing open standard, just like you wouldn't buy a dead tree book that required special patent glasses and a Capt. Midnight secret decoder ring.
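
For readers who have not used grep, the kind of full-text search described above can be sketched in a few lines of Python. This is only an illustration, not a replacement for grep: the `grep` function name and the assumption that the books are `.txt` files under one directory are choices made for this example.

```python
import re
from pathlib import Path
from typing import Iterator, Tuple


def grep(pattern: str, root: Path) -> Iterator[Tuple[Path, int, str]]:
    """Yield (file, line number, line) for every line matching the regex."""
    regex = re.compile(pattern)
    # Assumption: the archive is a directory tree of plain .txt files.
    for path in sorted(root.rglob("*.txt")):
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if regex.search(line):
                    yield path, number, line.rstrip("\n")
```

Usage would be along the lines of `for path, number, line in grep(r"Ishmael", Path("books")): print(path, number, line)`, where `books` is a hypothetical directory name.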


sounds good, but... (1)

pulse2600 (625694) | more than 10 years ago | (#7399508)

many Libraries of, oh ok nevermind...

Re:sounds good, but... (0)

Anonymous Coward | more than 10 years ago | (#7399817)

yeah, i mean, can u imagine a beowulf cluster of these things?!

Billius Casear (1)

Bendebecker (633126) | more than 10 years ago | (#7399514)

I can see it now: They go with a Microsoft database, and the actual books decay and are lost. Then one day, an M$ update that goes out of control causes the database to crash. Irreplaceable works by such authors as the Minnesota steel worker who penned "here I sit all brokenhearted" are lost to the sands of time.

Re:Billius Casear (1)

Shivantrill (654978) | more than 10 years ago | (#7399959)

Yes, or there is a security problem and Hackers go in and rewrite some of the books without anyone knowing it. This is a scary thought :) I also agree that an open source format would be best. Maybe the slashdot community can start working on that. It could be a community project. After all.... It takes a village, people!

Re:Billius Casear (0)

Anonymous Coward | more than 10 years ago | (#7400815)

Irreplaceable works by such authors as the Minnesota steel worker who penned "here I sit all brokenhearted" are lost to the sands of time.

Do you mean the graffitic poem commonly found in mens' rooms?

Here I sit all broken-hearted
Tried to shit but only farted
Later that day I took a chance
Tried to fart and shit my pants

How many LOCs will the LOC have? (1)

ewithrow (409712) | more than 10 years ago | (#7399527)

This is really going to screw up the Library of Congress data storage unit.

Now the Library of Congress will be holding many Libraries of Congress. It's a conundrum!


Re:How many LOCs will the LOC have? (0)

Anonymous Coward | more than 10 years ago | (#7399557)

How many LOCs will the LOC have?


former 'hackers' turned hobbyists dismayed at.. (-1, Offtopic)

Anonymous Coward | more than 10 years ago | (#7399529)

payper liesense corepirate nazi stock markup FraUD ?pr? ?firm? hypenosys bouNTy hunter program.

like a few script kiddIEs can do any damage comparable in scope to the whoreabull MiSdeeds of the greed/fear/ego based wall street of deceit/capitollist hill thieves, murderers, billyonerrors, etc....

we're no fans of vandals, but they're small spuds compared to the foibles of the smoke&mirrors peddlers from the felonious kingdumb et AL.

WTF? (0)

Anonymous Coward | more than 10 years ago | (#7399535)

e-journals? They want to archive blogs???? Any idiot can e-publish. Sheesh.


Anonymous Coward | more than 10 years ago | (#7399552)

Looks like VA Linux finally had it with this guy. No more inane communist windbag rants.

that's va softwar to you buddIE (-1, Offtopic)

Anonymous Coward | more than 10 years ago | (#7399596)

va lairIE/robbIE et Al, have long agoo shed their tux moniker, in order to become even more sucksassfull billyonerror wannabes themselves.

talk about forgetting who made you? many of you seem to have done that. lookout bullow.

Why the LoC? (2, Insightful)

azzy (86427) | more than 10 years ago | (#7399561)

> ideally the Library of Congress?

Why? What's so ideal about the Library of Congress to hold an international collection of e-books?

Re:Why the LoC? (3, Funny)

tommck (69750) | more than 10 years ago | (#7399594)

Yeah, I'm an American, but even I was going to say "Wait a minute!" to that one. The USA is not the whole world. Unfortunately, until we take over the planet, there isn't a single place that one can go with these things. NATO, the United Nations... They all only have some countries as members.

I guess we now have a good reason for world domination!

Re:Why the LoC? (0)

Anonymous Coward | more than 10 years ago | (#7399834)

This is an interesting implied political statement.

Maybe, that's why it got modded +5?

Re:Why the LoC? (1)

leandrod (17766) | more than 10 years ago | (#7400061)

NATO, the United Nations... They all only have some countries as members.

Which countries are missing from the UNO?

Countries not members of the UN. (1)

joto (134244) | more than 10 years ago | (#7400890)

Switzerland (currently joining, but earlier denied membership because of their neutrality), the Vatican City, Taiwan (China is a member though), East-Timor, Kiribati, Nauru, Tuvalu and Tonga, and maybe a few others.

Also, there are probably a few micronations that could be added to the list, e.g. Sealand.

Re:Why the LoC? (1)

RogerWilco (99615) | more than 10 years ago | (#7401998)

I think that actually you would find that about 99% of the world's population lives in a country that's a member of the UN. As another poster pointed out, Switzerland is joining, leaving Taiwan (0.3% of the world) as the largest country not a member.
And as I understand it, historically the seat occupied by China in the UN is considered to be Taiwanese by most/some Taiwanese.
(Taiwan is what is left of pre-communist China; they themselves and the rest of the world are still figuring out if they are a separate country or not, at least that's what I've heard here in Holland.)

Re:Why the LoC? (1)

tommck (69750) | more than 10 years ago | (#7405666)

"some" does not mean "all". "99%" does not mean "all".
So, who is going to represent East-Timor and Taiwan and the others in getting all their books in the UN library?

I guess it will have to be the UN library. That way we only have to conquer a bunch of small countries. Maybe we can wait until a Democrat is President and he can feel good about his own military victories for once...

Re:Why the LoC? (1)

RogerWilco (99615) | more than 10 years ago | (#7406151)

I am not a native speaker, but I was under the impression that in English "some" was related to "several" or "a few" more than to "most" or "a lot". But then I could be mistaken, and indeed it does not mean "all".
And I was under the impression that East-Timor is a member of the UN.

Leaving mainly Taiwan (ROC), and I was arguing about its uncertain status, as both in mainland China and on Taiwan a lot of (some?!?) people can be found who would argue that publications from Taiwan should be archived under the China label.
You could also talk about Tibet, Western Sahara, Palestine, Kurdistan, Kosovo, Scotland; those have an even more uncertain status, so let's not go there.


Re:Why the LoC? (1)

tommck (69750) | more than 10 years ago | (#7406236)

From some adj.
1) Being an unspecified number or quantity: Some people came into the room. Would you like some sugar?
2) Being a portion or an unspecified number or quantity of a whole or group: He likes some modern sculpture but not all.
3) Being a considerable number or quantity: She has been directing films for some years now.
4) Unknown or unspecified by name: Some man called.
5) Logic. Being part and perhaps all of a class.
6) Informal. Remarkable: She is some skier.

Being a geek, I tend to subscribe to #5 as my definition. In this case, however, I was thinking of definition #2, which is a portion. It just was not "all", thus the call for world domination... :-)

Re:Why the LoC? (1)

ddimas (629883) | more than 10 years ago | (#7403759)

True. We must rebuild the Library of Alexandria! And while you're at it kick all those Arab squatters out of Egypt.

Re:Why the LoC? (1)

stratjakt (596332) | more than 10 years ago | (#7399685)

The LOC is one of few institutions in a position to do so.

Because the LOC is located in a free country (blah blah slashdot rightwinger whining here) that will not censor the books, and will share the books with anyone who wants to see them. They also have the funding and resources available to make it happen.

It doesn't preclude, say, China from making their own archive, and no doubt they would. But their archive would only include government approved books, and fat chance ever getting access to it.

Re:Why the LoC? (1)

azzy (86427) | more than 10 years ago | (#7399835)

I'd regard my country (UK) as 'free' also, but looking at political decisions taken in the 'free' world can we really trust them to allow totally unbiased uncensored material to be stored? I recall recently the US govnt created a list of websites that were banned, related to terrorism, what's to stop them demanding the LoC not stock certain e-books?

Re:Why the LoC? (1)

stratjakt (596332) | more than 10 years ago | (#7399886)

It wasn't the US government, it was some doofy judge in Pennsylvania.

But, if your question is, will the LoC archive child pornography? No, they won't.

I'm sure the UK would do it, or France, or Germany. But to them, heading an "international" effort means spending US tax dollars. "International Space Station" = NASA money, 2/3rds of the UN operating budget = American money.

No matter who does it, as an American resident, I'm going to wind up bankrolling the motherfucker. Might as well keep it local.

Re:Why the LoC? (1)

azzy (86427) | more than 10 years ago | (#7401069)

*patpats* Keep taking the dried frog pills.

Re:Why the LoC? (1)

boringgit (721801) | more than 10 years ago | (#7399927)

As a Brit, the LOC git kinda got my juices flowing with the "Why should that be an american role?" attitude.

On reflection - It is down to the UK and other European countries to archive the content, often generated in the US, which is banned by the US, and down to the US to archive content which is banned (or at least restricted) in Europe.

I would suggest that several different international repositories are required. When at some point we wind up as a united Earth, we can then amalgamate the lot. (At which point the new Earth government will no doubt enter into some form of censorship of its own :( )

Re:Why the LoC? (1)

azzy (86427) | more than 10 years ago | (#7400842)

And then when Babylon 5 break away from Earth Gov, they can store the e-books.

Re:Why the LoC? (1)

Rotten168 (104565) | more than 10 years ago | (#7401267)

The first amendment for starters. You have the right to read terrorist literature; you don't have the right to donate money to terrorist organizations, however. That is what was "banned".

Re:Why the LoC? (1)

richardellisjr (584919) | more than 10 years ago | (#7400068)

Do you really think a book like the Big Book of Mischief, which explains how to build bombs and other things that the government doesn't want people to make, will be available there? I have no doubts that in our "free country" certain books will not be available. You could argue strongly that very few people should have access to detailed instructions on making nerve gas, but I doubt anyone could argue that not providing unlimited access is censorship.

Re:Why the LoC? (1)

OECD (639690) | more than 10 years ago | (#7400361)

What's so ideal about the Library of Congress to hold an international collection of e-books?

Probably because they could. Of course, it would make more sense to do it on linguistic/regional/national lines and have them point to each other when needed.

Re:Why the LoC? (1)

mjtg (173905) | more than 10 years ago | (#7404875)

>> ideally the Library of Congress?
> Why? What's so ideal about the Library of Congress to hold an international collection of e-books?

This is a valid point. Why does the LoC rate as the "default" international library? Why not, say, Library and Archives Canada? Or the Australian National Library? Or the National Library of Ireland? Or the National Library of Jamaica? Or ... any of these? Why the LoC in particular?

I'm not trying to sound anti-American, just offering a non-American perspective.

What exactly (1)

geeber (520231) | more than 10 years ago | (#7399707)

does the phrase, "Electronic Book Museum in the Real World" mean? Isn't an e-book museum, by its very nature, virtual? If not, aren't the e-books then just regular books, minus the "e"?

misconceptions about e-books (4, Insightful)

bcrowell (177657) | more than 10 years ago | (#7399780)

Above all, as no printed material is produced during the delivery of a 'book', the cost of publishing the book is significantly reduced and the whole process of publication is environmentally green.
That's not true. Here (pdf file) is some info on college textbooks, for example. Printing, paper, and binding (PPB) are almost never a significant percentage of the retail price of a book.

I would like to see the Library of Congress start accepting digital books for copyright registration, however -- it's a drag to have to send them hardcopies.

In the early 1990s, Adobe's Acrobat reader was released. Although it is not a software specifically for eBooks, its multi-platform file format (PDF file) is an attractive feature for eBook publications. The digitization of both texts and graphics into a compact file that can be recognized in every platform is an important concept in eBooks. However, we still do not have an eBook publishing standard at the moment, though work in that direction is being done.
Well, actually PDF is the de facto standard for digital books. It's just that none of the handheld devices use the standard; they all use their own nonstandard, proprietary formats instead.

There are standard subsets of PDF that have been defined that are appropriate for archiving books. For example, the subsets don't allow you to include video or programs.

Re:misconceptions about e-books (1)

iantri (687643) | more than 10 years ago | (#7400303)

PDF sucks for eBooks, as far as I am concerned.

One of the main problems is that, when you get down to it, the core functionality is putting images of a bunch of physical pages into one big file. This is fine when you can read it on a 1600x1200 screen, but when you need to view the image on a Palm, it doesn't work. (The text doesn't magically reflow to fit the Palm.)

Personally, I think simple HTML (i.e. HTML 3.2) would be perfect for e-books.. easily parsed by any device (Palm, PocketPC, Desktop Computer..) and it (with the exception of tables) easily reflows to fit any device.
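
The reflow point above (a fixed page image cannot adapt to the screen, while text stored as paragraphs can) can be sketched in Python with the standard `textwrap` module. This is a minimal sketch under an assumption that is not from the thread: the book is plain text with paragraphs separated by blank lines. The `reflow` name is made up for the example.

```python
import textwrap


def reflow(text: str, width: int) -> str:
    """Re-wrap each blank-line-separated paragraph to the given width."""
    # Collapse the original line breaks; only paragraph boundaries are kept.
    paragraphs = [" ".join(p.split()) for p in text.split("\n\n")]
    wrapped = [textwrap.fill(p, width=width) for p in paragraphs if p]
    return "\n\n".join(wrapped)
```

The same stored text can then be laid out at, say, 30 columns for a handheld and 100 for a desktop, which a page-image format cannot do.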

Re:misconceptions about e-books (1)

bcrowell (177657) | more than 10 years ago | (#7402531)

PDF sucks for eBooks, as far as I am concerned. [snip] Personally, I think simple HTML [...] would be perfect for e-books.
It completely depends on what you're trying to do: create an electronic archive of books, or read books on a handheld device. Actually, one of the reasons for the failure of so-called "e-books" in the marketplace (apart from the proprietary formats) is that very few people actually want to read a whole book off of a hand-held computer.

Re:misconceptions about e-books (1)

gidds (56397) | more than 10 years ago | (#7400472)

PDF is the de facto standard for digital books

And a very bad standard it is, too, IMO. PDF is great for one thing: producing an exact copy of a work on your screen or printer. Complete with the exact same font sizes, formatting, pagination, and so on.

There are situations that's wonderful -- sheet music is an example I've used recently. But it's a lousy aim for most ebooks. In most cases you don't want the same pagination and formatting - you want the text to be reformatted to match how you're looking at it, to fit your screen size, using your preferred colours and fonts, and so on. How can you read a PDF file on a small, narrow screen? You either need to reduce the font size to something unreadable, or to scroll from left to right with every line you read. Wonderful, eh? Not.

IME most ebooks work far better using logical, not physical, markup. If your machine knows where the chapter/scene/paragraph breaks are, which sections are indented or fixed format, where to use bold or underlined text, then it can make far better decisions about how to lay out the text than any predetermined form. And that, after all, is pretty much what HTML was invented for, however much people insist on using it to specify things pixel-by-pixel...

(While we're on the subject, there's one simple thing HTML can't do well: preserve double-spaces at the ends of sentences. You can faff around with &nbsp; codes, but as well as the hassle, they don't always work as you expect, especially around line endings, and SlashCode seems to strip them out anyway!)

But apart from that, HTML seems a much better format for ebooks than PDF.

However, even plain text can work very well. A few simple conventions for chapter headings, scene breaks, /italics/, *bold*, &c, and you have something which is easy to use and works with far more existing tools and readers than anything else out there. (The same applies to formats which are isomorphic to plain text, such as Palm DOC and TCR.)
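
To show how little machinery the plain-text conventions mentioned above need, here is a hedged Python sketch that turns /italics/ and *bold* into HTML. The exact rules are an assumption made for this example (markers must hug the marked text and must not touch a letter or digit on the outside, so fractions like 1/2 are left alone); chapter headings and scene breaks are not handled.

```python
import html
import re

# Assumed conventions: /italics/ and *bold*, with the markers touching the
# marked text and not adjacent to a word character on the outside.
ITALIC = re.compile(r"(?<![\w/])/(\S(?:.*?\S)?)/(?![\w/])")
BOLD = re.compile(r"(?<![\w*])\*(\S(?:.*?\S)?)\*(?![\w*])")


def paragraph_to_html(paragraph: str) -> str:
    """Convert one plain-text paragraph to an HTML <p> element."""
    escaped = html.escape(paragraph, quote=False)
    # Italics first: once tags exist, the "/" in "</b>" would confuse it.
    escaped = ITALIC.sub(r"<i>\1</i>", escaped)
    escaped = BOLD.sub(r"<b>\1</b>", escaped)
    return f"<p>{escaped}</p>"
```

Because the source stays readable as-is, the conversion is optional: any text viewer, or grep, still works on the original file.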

Re:misconceptions about e-books (1)

nautical9 (469723) | more than 10 years ago | (#7401455)

Unfortunately, for citations and other references, it's useful if not necessary to specify page, paragraph, and sometimes even line numbers and have them to point to the same place regardless of the "format".

Something like this could be kludged by going overboard with <p id="chpt.1,p.55,par.4"> type tags using HTML, but there's something to be said for preserving the exact formatting of the original text.

A lesser problem is properly preserving hyphenation, which can pose a problem with HTML as well. I'd love to be rid of format-induced hyphens forever (ie. the ones at the end of a line), but as most of these archival projects involve massive amounts of OCRing pages, and computers can have a tough time figuring out which hyphens are because of formatting and which are simply part of the word that just ended up at the end of a line, a human would have to manually fix these up.

I agree that PDFs are a horrible format, as I enjoy reading eBooks on my 160x160 Treo, but text and HTML have their own problems as well.

Re:misconceptions about e-books (1)

gidds (56397) | more than 10 years ago | (#7401707)

Citations may not be a big issue for many people. And if they are, maybe some formatting-independent measure can be used; something like 'Chapter M, paragraph N' might be awkward for printed books, but might be easily-automated and accurate enough for ebooks. And it wouldn't need any changes to the text itself.

Similarly, hyphenation is another artefact of the limitations of printing; surely ebooks shouldn't need to suffer from those limitations too? If text is stored in paragraphs, then it's up to the renderer to decide how to flow the lines, whether breaking at whole words (justifying or not), or using its own dictionary to decide good hyphenation points. Personally, I think that hyphenation's main purpose is to avoid wasting paper rather than making things any easier to read, and as that's simply not a consideration for ebooks, we're better off without hyphenation at all.

Presumably, we're not talking so much of converting existing editions, but of making new ones? If so, I expect most publishers have been keeping the pre-hyphenation text of their books for a while now, so OCR limitations may not be an issue for kosher ebooks.
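
The "Chapter M, paragraph N" idea above is easy to automate. The following Python sketch is one possible reading of it, under assumptions that are not from the thread: paragraphs are separated by blank lines, and any paragraph beginning with the word "Chapter" starts a new chapter. The `index_paragraphs` name is hypothetical.

```python
from typing import Dict, Tuple


def index_paragraphs(text: str) -> Dict[Tuple[int, int], str]:
    """Map (chapter, paragraph) to the paragraph text."""
    index: Dict[Tuple[int, int], str] = {}
    chapter = 0      # text before the first heading lands in chapter 0
    paragraph = 0
    for block in text.split("\n\n"):
        body = " ".join(block.split())  # ignore line breaks inside a paragraph
        if not body:
            continue
        if body.lower().startswith("chapter "):
            chapter += 1
            paragraph = 0
            continue
        paragraph += 1
        index[(chapter, paragraph)] = body
    return index
```

Since line breaks inside a paragraph are ignored, a citation such as (2, 1) points at the same passage however the text is wrapped or rendered, and nothing has to be added to the text itself.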

Re:misconceptions about e-books (1)

nautical9 (469723) | more than 10 years ago | (#7402602)

You're right - I misread the summary as meaning older texts to be added to a "museum", in which case you would want to preserve formatting so that existing citations would match up when looked at 100 years down the road.

As for citations not meaning much to most people, perhaps, but the works that are most often referenced and studied by others (be they scientific, religious, or [soon to be] classic literature) are also likely the ones you want to preserve and have an accurate, concise way to reference any given passage.

As for all ebooks going forward, I'm sure a format-independent methodology could and should be used. And just as I'm too lazy to RTFA, I'm also too lazy to do a bit of googling to find any existing citation formats for the "new age". :)

Re:misconceptions about e-books (1)

HoldmyCauls (239328) | more than 10 years ago | (#7401442)

According to that PDF, the cost was 32.3 cents on the dollar, or nearly a third. That's a lot of money considering most college texts are between 20 and 50 dollars. That's 6 bucks off the largest of my lit class books, and 15 off my Java and UNIX books while I was majoring in SE.

Here I'd like to note that I saved on my Shakespeare and Jonson class by finding nearly every text on Project Gutenberg (if you need a link to get there, shame on you!), while even at the used shops they were 4 and 5 pounds apiece (I'm at UWA right now on exchange. Enjoyed the Guy Fawkes celebration on the beach!).

Re:misconceptions about e-books (1)

bcrowell (177657) | more than 10 years ago | (#7402472)

The category in their breakdown is "paper, printing, and editorial costs," so only part of that 32% is for the physical production of the book. The actual cost of paper, printing and binding depends on a lot of factors. Most importantly, it depends on how many colors of ink were used (1 to 4), and on the length of the press run.

Re:misconceptions about e-books (1)

dvdeug (5033) | more than 10 years ago | (#7404512)

I would like to see the Library of Congress start accepting digital books for copyright registration, however -- it's a drag to have to send them hardcopies.

The whole point of accepting hardcopies is so they have something to store (and the preservation of paper is well studied) and check out.

Library of Congress on my Ipod (1)

Dinglenuts (691550) | more than 10 years ago | (#7399819)

Seriously, how much storage would you need for the Library of Congress? If I can fit the human genome on my iPod right now, what size hard drive do I need for a gajillion books?

Re:Library of Congress on my Ipod (1)

bitmason (191759) | more than 10 years ago | (#7400841)

Here's one guess from Brewster Kahle at Alexa Internet:

Kahle "guess-timated" the Library of Congress' existing print holdings as "about 20 terabytes or $200,000 in storage space. It would take up the space of a couple of Coke machines." Of course, unlike Alexa Internet, which takes everything on pages including video clips, sound, and graphics, Kahle's estimate for digital storage of LC's print collection reflects "only the text, all ASCII. The graphics would get very complicated to estimate."
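For anyone who wants to play with those numbers, here's a back-of-the-envelope sketch. The 20 terabytes and $200,000 come from the quote above; the figure of roughly 1 MB of plain ASCII per book is purely an assumption for illustration, not something from Kahle, and the function names are made up.

```python
def cost_per_gb(total_tb, total_cost_usd):
    """Implied storage price in dollars per gigabyte (decimal units, 1 TB = 1000 GB)."""
    return total_cost_usd / total_tb / 1000


def books_that_fit(total_tb, mb_per_book):
    """How many books of an assumed average size fit in total_tb (1 TB = 1,000,000 MB)."""
    return total_tb * 1_000_000 / mb_per_book


# Figures from the quoted estimate.
ESTIMATE_TB = 20
ESTIMATE_COST_USD = 200_000

# Assumption for illustration only: about 1 MB of ASCII text per book.
ASSUMED_MB_PER_BOOK = 1.0

print(cost_per_gb(ESTIMATE_TB, ESTIMATE_COST_USD))
print(books_that_fit(ESTIMATE_TB, ASSUMED_MB_PER_BOOK))
```

Under that assumption the estimate works out to about $10 per gigabyte and room for something like 20 million text-only books; change `ASSUMED_MB_PER_BOOK` and the book count scales accordingly.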

*snore* (0)

Anonymous Coward | more than 10 years ago | (#7399820)


Crypto-luddite? (1)

CausticWindow (632215) | more than 10 years ago | (#7399868)

That's a nice way to dismiss any criticism of the subject at hand (I'm not saying that it deserves any criticism though).

If you don't agree, you're a luddite, and if you claim you're not a luddite, disagreeing will make you a crypto-luddite. It's almost like the unbeatable logic behind "denial is the first symptom of addiction".

Library of Congress is working on it (3, Interesting)

spotteddog (234814) | more than 10 years ago | (#7399880)

The Library of Congress is already working on a program for preserving "digitally born" documents. Look at

*disclaimer: I currently work at the Library of Congress, but not on this project.

Re:Library of Congress is working on it (2, Interesting)

DavidRothman (646393) | more than 10 years ago | (#7400420)

No NIH syndrome, I'd hope. Keep in mind that the E-Book Museum proposal focuses on the artifacts that the public can see right there in person and on the Net--the machines and the media, as well as videos of old e-book references in movies, on TV, and so on. That's a different issue from content preservation per se. What's more, the TeleRead item already includes a link to that program; don't think I've denied LOC credit for existing activities. What I have in mind, of course, would make the preservation job easier by reducing the chance that LOC would be out of luck because it could not find the right machines to display dead formats and emulation was tricky. Let's hope, of course, that a nonproprietary standard e-book format arrives soon, but if nothing else, as I've noted in the TeleRead item, the E-Book Museum could help cope with the present mess. Please take a look at what I wrote for TeleRead, not just the quick summary. And by the way, I'm right across the Potomac River in Alexandria, and, though I realize you don't deal with e-books at LOC, you or colleagues are welcome to reach me at 703-370-6540. Thanks! David Rothman, for

Usenet belongs at the LOC, Royal Library, etc. (1)

SgtChaireBourne (457691) | more than 10 years ago | (#7404928)

Usenet is currently the most significant "born digital" international collection of documents. No, I don't mean all those binary groups, but the ones that are conveniently already in ASCII, ISO-8859-*, or Unicode. Amidst the noise, there is a lot of knowledge there.

A significant amount of early Internet history is there as well: Stuff you don't/won't see in AOL or MSN and stuff you certainly won't see in newspapers or books anymore because it doesn't validate today's corporate dogma.

The Usenet archives need to find some independent mirrors before Google gets torn to shreds and its remains sold off to appease shareholder pressure. It's not hard to imagine the new group of MBA overlords deciding that maintenance of the archive is not profitable enough to warrant the active effort it takes to keep it from entropy.

DRM? (1)

iantri (687643) | more than 10 years ago | (#7400363)

Won't DRM make it difficult for the Library of Congress to archive these? What about when it needs to be transferred to a new digital format (because paper has been around for ages; computer technology completely changes every 10 years)?

Re:DRM? (1)

ajs318 (655362) | more than 10 years ago | (#7401584)

Get in touch with your MP or foreign equivalent right now and point out the need for archival copies of normally-DRM'ed works to be made in some unencumbered form. As a disincentive against the misuse of DRM, the provision of an unprotected version for archival should be a precondition to validate any law which would make it an offence to bypass said DRM restrictions. So if you haven't deposited an unprotected copy of your work at your own expense with your National Library, then any anti-circumvention laws could not be used against anyone having a go at your protection {their circumvention efforts may be regarded simply as a reasonable-force effort to bypass the protection for the benefit of future generations}.

Public domain E-Book museum needs your help! (1, Interesting)

Anonymous Coward | more than 10 years ago | (#7400489)

Distributed Proofreaders is the main source of public domain electronic books. It is part of Project Gutenberg. DP consists of thousands of volunteers doing hundreds of books each month, including some math books, for which DP is using LaTeX. Thus, the project needs savvy (La)TeX folk to correct the OCRed texts.

Thus, if you have a spare ten minutes now and then, you can make a significant contribution to public domain and mathematics. The finished e-books are free, downloadable, and computer-searchable. Sign up here! []

The work is done through a web interface that lets you compare a scanned page image against OCRed text, and make any necessary changes to the text. The interface works with most browsers, from IE and Netscape to Mozilla and Opera. (I have proofread a couple pages myself, and can vouch for it being straightforward.) You can do one page whenever you have time or a hundred a day -- it's up to you. No commitments, no schedules.

If you'd like a change from mathematics, there are plenty of other books to do: from classics to pot-boilers, in English, French, German, Dutch, Finnish, Swedish, etc.

Re:Public domain E-Book museum needs your help! (1)

DavidRothman (646393) | more than 10 years ago | (#7400702)

Glad to see a plug for PG and DP! Needless to say, in the full-length post, I noted that the proposed E-Book Museum could feature a video interview with Michael Hart as well as the terminal he used in the early days of PG (or an equivalent). Would be one more way to promote PG and the related question of volunteering! I myself recently worked with other volunteers on Upton Sinclair's "The Brass Check." Bottom line? No conflict between The Computer Museum idea and PG, just synergy. - DR

What's already there? (1)

Dinglenuts (691550) | more than 10 years ago | (#7401246)

I would think that many publishers already have a great deal of their works completely digitized at this point. As an aside, PDF would be just fine for a project like this; in fact, PDF might even be overkill. And as far as how much trouble upgrading in 10 years will be, that's bollocks if the system is done right.

[spam] Current Source for Electronic Archives (2, Informative)

Flabio (111772) | more than 10 years ago | (#7401306)

People looking for electronic archives should check out [] . We have what I believe is the largest online archive of electronic documents. 25,000 documents online right now and another 225,000 waiting to be sorted by our librarians. As a warning, though, it's a mish-mash of stuff. A lot of full books, but a lot of other crap too: Old hacker 'zines, random usenet archives, and other more esoteric things.

Plus, it's an open community. Anyone can become a librarian on the site and help sort documents.

It's mainly NOT about e-books! (2, Interesting)

philipx (521085) | more than 10 years ago | (#7401323)

If you read the article, it's actually mainly NOT about books but rather about the other digital publications: zines, online newspapers, et al.

I think this is very useful, as a large number of online versions of paper zines & newspapers have far more resources than their dead-tree counterparts, Wall Street Journal and The Financial Times to name a few. So far, there has been no central or organized way to capture this information.

I also liked the bit: "This new legislation means that a vital part of the nation's heritage will be safe and accessible as an important resource for businesses and education users in the future," said Mole.

Re:It's mainly NOT about e-books! (1)

DavidRothman (646393) | more than 10 years ago | (#7404988)

Just a gentle reminder that the E-Book Museum would preserve the artifacts and tell the story of the technology. It would not be so much of a content-preservation project. That's for other worthy endeavors and proposed endeavors. As for the role of e-books in the UK content-preservation initiative, they are at least among the items included. Good enough for the point to be made! Needless to say, I couldn't agree with you more about the usefulness of preserving e-copies of nonbooks, too, such as magazines and newspapers. - DR

On a similar(ish) note... (1)

Raccroc (238757) | more than 10 years ago | (#7402846)

Around 5 or so years back the Library of Congress (or one of its peers) started digitally archiving old LPs and other recordings to preserve them. I know at one time, this archive was publicly available, but I've no idea of its current status or availability.

An example of the content: it had several hours of mp3s transferred from live interviews of hillbilly moonshiners. How-to's, stories, tales, etc...

I'm curious if anyone knows where this might be, who is running it, and if it's still around?


Museum of failed, overhyped technology (1)

Pathetic Coward (33033) | more than 10 years ago | (#7406028)

Oh, gee. Ebooks. What's next? Virtual reality? Esther Dyson?