Project Gutenberg's 32nd Birthday

timothy posted more than 10 years ago | from the read-franklin's-autobiography dept.

Education 178

David Moynihan writes "July 4th marks the 32nd anniversary of that day in 1971 when Michael Hart first sped an all-caps version of the Declaration of Independence to anyone and everyone then on what later became the web, thus founding Project Gutenberg. Thanks to an army of volunteers and the Distributed Proofreaders, this is the last year PG will have fewer than 10,000 titles. Strangely, Microsoft picked this dual anniversary of literacy and freedom to re-launch their Reader product, with three free bestsellers a week, if you activate the new version with Passport, sign a EULA, etc. Real reason for the upgrade might be that the DRM on MS's old Reader was cracked. If you're not into giving away data, or are running a system other than Windows, maybe you could take the time to tell a friend about free books online, or even help out by visiting the Distributed Proofers and editing one page per day."

178 comments

Must...avoid...Steve...Gutenberg...joke... (4, Funny)

mikeophile (647318) | more than 10 years ago | (#6368257)

Seriously, awesome work people.

YOU FAIL IT!! (-1, Troll)

Anonymous Coward | more than 10 years ago | (#6368478)

I do hearby claim your failed first post as my own! This FIRST POST is claimed for the Queen of Spain!

Re:YOU FAIL IT!! (0)

Anonymous Coward | more than 10 years ago | (#6368493)

Just because your dad likes to dress like an archaic seniorita, doesn't make him the Queen of Spain.

You can't be serious (5, Funny)

ryants (310088) | more than 10 years ago | (#6368271)

even help out by visiting the Distributed Proofers and editing one page per day.
You can't seriously be asking Slashdotters to volunteer as proofreaders.

Re:You can't be serious (4, Funny)

BabyDave (575083) | more than 10 years ago | (#6368291)

Could be worse - they could be asking the Slashdot editors!

Re:You can't be serious (1)

AndroidCat (229562) | more than 10 years ago | (#6368340)

Could be worser - they could be asking the Slashdot trolls!

What all these Soviet Russia and All Your Base changes that have been marked in? And what is goatse?

Re:You can't be serious (0, Troll)

psylent (638032) | more than 10 years ago | (#6368384)

"could be worser" ... go back to school dude.

Re:You can't be serious (0)

Anonymous Coward | more than 10 years ago | (#6368397)

>==**WHOOSH**==>

Re:You can't be serious (2, Funny)

MadCow42 (243108) | more than 10 years ago | (#6368536)

Well, it wouldn't be THAT bad, we'd just have 5 different versions of each book, each released about a day apart.

MadCow.

Re:You can't be serious (0)

Anonymous Coward | more than 10 years ago | (#6368641)

"It was the best of times, it was the *BLURST* of times?!?"

Re:You can't be serious (5, Interesting)

Aldarondo (686893) | more than 10 years ago | (#6368407)

As one that has been involved with Distributed Proofreaders for the past 18 months, yes we are serious about having Slashdot people proofread. The last time a story about D.P. ran in November, thousands of new users joined us and helped us grow and expand to our current size.

Go and check it out, there is great work being done there. (I am a bit biased though). Click here [pgdp.net] for a history of DP.

Re:You can't be serious (1, Interesting)

tommertron (640180) | more than 10 years ago | (#6368814)

The thing is, this brings up a somewhat serious point. I've proofread professionally in the past, and I know that it's hard and nobody's perfect at doing it. An open approach might work with software, because anyone can easily test it: there are bugs in the program. But without a wiki-type format (www.wikipedia.org) who is there to make sure it's proofread properly? If this is proofread incorrectly and distributed to schools and stuff, I have to worry about the quality level of the texts students are learning with if they use the free texts. I have in fact read a lot of public domain texts, and find typos and grammatical errors to be fairly common in them. Would a wiki-format help open texts? (Or maybe a moderated wiki-format.)

Eh? The text is in ASCII but requires JavaScript? (1)

WuphonsReach (684551) | more than 10 years ago | (#6368911)

So I try to go to http://www.pgdp.net/ [pgdp.net] - only to find out that the page won't load unless you enable JavaScript!

Um... I thought PG was all about not using the latest bells and whistles? (semi-facetious)

Now for the marketing... (4, Insightful)

Blaine Hilton (626259) | more than 10 years ago | (#6368272)

Now all we need is more people promoting this in schools and printing the books. Much like the IA Bookmobile [archive.org]. It seems like the people who could use this the most, don't even know it exists.

Re:Now for the marketing... (1, Insightful)

Googa (608200) | more than 10 years ago | (#6368321)

Yes, I can agree with this. We people here won't benefit from it half as much as needy school districts who could use the texts. Methinks what they really need to do is work on some awareness program, distributing the books to teachers... or even letting them know that such a resource exists. With more technology in the classroom, Gutenberg shouldn't be out of reach to many teachers.

Re:Now for the marketing... (0)

Anonymous Coward | more than 10 years ago | (#6368328)

Yes, because needy school districts have tons of computers and speedy internet connection.

Re:Now for the marketing... (0)

Googa (608200) | more than 10 years ago | (#6368368)

A speedy internet connection and tons of computers wouldn't be needed to print out documents from Gutenberg.

My point is that if schools know of this, they would realize that it would be cheaper in the long run to get texts off Gutenberg, instead of buying pre-bound books elsewhere.

Cheaper, but useful? (3, Insightful)

yerricde (125198) | more than 10 years ago | (#6368896)

A speedy internet connection and tons of computers wouldn't be needed to print out documents from Gutenberg.

It still costs money to turn downloaded digital copies of works into printed copies for 100 students in a grade level.

they would realize that it would be cheaper in the long run to get texts off Gutenberg, instead of buying pre-bound books elsewhere.

Public domain etexts, such as those offered by Project Gutenberg, would be useful in schools only under limited circumstances. Though they would be useful in literature classes in high school (and possibly middle school), forget about them in elementary school, where most books are illustrated, because most PG editions leave out illustrations. Forget about them in science classes as well; the 1911 Encyclopaedia Britannica [1911encyclopedia.com] contains outdated views of anything scientific, and anything significantly newer is tied up forever in the Bono Act and its obligatory sequels. And what keeps a publisher from tying purchases of its science books to purchases of its literature books?

Re:Now for the marketing... (2, Insightful)

AndroidCat (229562) | more than 10 years ago | (#6368610)

and speedy internet connection

The first Gutenberg books I came across were being passed around BBSs at 2400 bps or so. When they started 32 years ago, 110, maybe 300 bps. Who cares? Check the size of the files, these aren't Word documents, you know.
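
As a rough illustration of the point about file sizes, here is a minimal sketch of the arithmetic. The 400 KB book size is an assumption picked for illustration, not a figure from the thread, and the 10 bits per byte figure assumes simple 8N1 serial framing (real 110 baud links often used 11 bits per character).

```python
def transfer_seconds(size_bytes: int, bits_per_second: int, bits_per_byte: int = 10) -> float:
    """Estimate the seconds needed to move size_bytes over a serial link.

    bits_per_byte=10 assumes 8 data bits plus one start and one stop bit.
    """
    return size_bytes * bits_per_byte / bits_per_second


# Hypothetical plain-text book of 400 KB (an assumed size, for illustration only).
book_bytes = 400 * 1024

for bps in (110, 300, 2400):
    hours = transfer_seconds(book_bytes, bps) / 3600
    print(f"{bps:>5} bps: about {hours:.1f} hours")
```

Under these assumptions a whole book is slow but workable even on the oldest links, which is consistent with plain text being cheap to pass around.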

Re:Now for the marketing... (0)

Anonymous Coward | more than 10 years ago | (#6368779)

The city of Atlanta spends $12,000 per student every year. If they choose to waste it instead of purchasing computers then it is their own fault.

All caps? (0, Funny)

Anonymous Coward | more than 10 years ago | (#6368273)

They had AOL back then?

You can't even make an AOL joke? (0)

Anonymous Coward | more than 10 years ago | (#6368377)

On Slashdot? Sheesh.

Gutenberg (-1, Troll)

Anonymous Coward | more than 10 years ago | (#6368274)

It's german for LINUX SUCKS!

Re:Gutenberg (-1, Troll)

Anonymous Coward | more than 10 years ago | (#6368310)

Hi this is Richard Stallman, you may remember me from trolls such as, "losing her to GNU" and "Why run Linux on a Mac."
In 1971 Project GNUtenberg's website was designed by a intern using GNU/Microsoft FrontPage, she made a typo and we have only recently discovered this error, please rename all references to GNUtenberg project, the websight will be updated vary soon.

Teh 5dwm is released (-1, Troll)

Anonymous Coward | more than 10 years ago | (#6368287)

ERIK MASSON: YOU ROCK MY WORLD! (-1, Troll)

Anonymous Coward | more than 10 years ago | (#6368545)

Thank god, you FINALLY released the code.
GOOD WORK YOU SACK OF SHIT

ATTN Crapflooders (-1, Offtopic)

Anonymous Coward | more than 10 years ago | (#6368296)

What is the script/program you all use? Shitstorm doesn't work for me unless I register a bunch of accounts.

Thanks

doh (1, Funny)

Anonymous Coward | more than 10 years ago | (#6368300)

Download Error

You'll need to install and activate the current version of Microsoft Reader before you can download these Owner-Exclusive titles.

Click here to get started now.


No Linux version!? Gah.

Re:doh (0)

Anonymous Coward | more than 10 years ago | (#6368452)

try cat or less

very timely for me (5, Interesting)

b17bmbr (608864) | more than 10 years ago | (#6368315)

i am going to be teaching modern civ next year in high school (i have been at the junior high for 7 years), and have already gone to the site and gotten works from aristotle, plato, locke, montesquieu, et al. thanks guys. there is still something to be said for a classical education. glad somebody is doing all they can to preserve the classics, especially with all the assaults on it from the social reconstructionists.

founding fathers (4, Funny)

Tablizer (95088) | more than 10 years ago | (#6368317)

...first sped an all-caps version of the Declaration of Independence to anyone and everyone then on what later became the web

I knew it! This country was founded by COBOLers.

Re:founding fathers (3, Funny)

Anonymous Coward | more than 10 years ago | (#6368343)

ADD 1 TO POST-POINTS.
MOVE "Funny" TO POST-STATUS.

(That's Cobol, for those who don't know)

Re:founding fathers (1)

mayotte (686049) | more than 10 years ago | (#6368909)

I was thinking of something along the same lines. Imagine a public reading of that text, SHOUTED AT THE TOP OF HIS LUNGS.

Really great work by the guys behind the project! (5, Interesting)

jaemark (601833) | more than 10 years ago | (#6368322)

There's really a problem though about getting the word out to people, in pretty much the same way the popularity of libraries today has been dropping. A good idea would be a separate advocacy site to come up with lists of texts in the project (i.e. What's New?, Most Popular, etc.) to help people wade in immediately.

Re:Really great work by the guys behind the projec (3, Insightful)

Cthefuture (665326) | more than 10 years ago | (#6368447)

Yes, they need something like that badly.

I remember poking around on PG not long ago but soon forgot about it.

If you're not looking for something specific then the site is kinda, meh. As you suggested, they need a news site, ratings, and other stats so you can see what's available.

And sections. "Technical", "Poetry", etc. Otherwise it's not very useful to the casual browser.

Re:Really great work by the guys behind the projec (-1, Flamebait)

Knife_Edge (582068) | more than 10 years ago | (#6368806)

"the popularity of libraries today has been dropping."

This is because stupid people have more children than well-educated ones. Because of this, as the population increases, the amount of interest in education and even reading remains about the same. Therefore dropping proportionate to the increase in population.

I think it is unfair to say that the popularity of libraries is dropping, just that the level of interest is not increasing with increased population. If you accept that stupid people have more children, this is what you would expect. Right?

Re:Really great work by the guys behind the projec (2, Interesting)

Anonymous Coward | more than 10 years ago | (#6368875)

Want to know what's new, etc? The Project Gutenberg website admittedly sucks, and their ASCII adherence admittedly verges on dogma, but there is a good substitute:

The Online Books Page
http://digital.library.upenn.edu/books/ [upenn.edu]

It currently has 20,000 FREE titles listed, from hundreds (at least!) of sources, in all subjects, beautifully categorized by title, author and subject--and topped off by an up-to-date what's new listing and a fine search engine. Much props to John Mark Ockerbloom and the University of Pennsylvania for supporting the site.

P.S. Won't one of you nice Slashdotters with time or interest in good works consider doing a complete redesign of the PG site, a full-text on-site search engine for the texts, a better categorization system and just a decent, half-respectable look? It don't get no respect lookin' as it does now. Among other things, the lack of internal organization means that individual texts get shafted in Google rankings.

Re:Really great work by the guys behind the projec (0)

Anonymous Coward | more than 10 years ago | (#6368990)

There is a feature on DP right now that allows you to see the 10 latest projects posted to Project Gutenberg. It is in RSS format and at http://www.pgdp.net/c/feeds/backend.php?content=posted&type=rss

Joseph Gruber
DP Developer

More free books (5, Informative)

Cruciform (42896) | more than 10 years ago | (#6368324)

The Baen Free library [baen.com] has a number of titles available in several formats.

It's a great way to introduce readers to a series or a talented new author.

'reader' books not much cheaper (3, Insightful)

Chmarr (18662) | more than 10 years ago | (#6368332)

Just on a whim, I decided to see how much cheaper titles in Microsoft Reader format were than a physical book.

I went to the MS Reader site and followed the links to the on-line publishers sites (such as B&N and amazon). In most cases, the reader format is only $1 cheaper, and sometimes $2 more expensive, than the corresponding paper book (soft or hardcover).

So... why in the world would anyone want to use a format that ties them to the computer?? With a paperback, I can read it anywhere, read for as long as I want without having to change batteries, and even pass the book onto a friend.

If they want to make the electronic formats more attractive, they need to make them a LOT cheaper than the corresponding paper version.

Re:'reader' books not much cheaper (2, Interesting)

Jonathan (5011) | more than 10 years ago | (#6368361)

So... why in the world would anyone want to use a format that ties them to the computer?? With a paperback, I can read it anywhere, read for as long as I want without having to change batteries, and even pass the book onto a friend

Well, I don't use MS-Reader myself (For commercial e-books I like the cross-platform Mobipocket), but a major reason I like e-books is I like to read them on my PDA -- not to save money. I carry my PDA around anyway, and having e-books means less to carry. I would purchase all my books as e-books if they were available as such.

Re:'reader' books not much cheaper (3, Interesting)

Joe Tie. (567096) | more than 10 years ago | (#6368453)

Someone else mentioned the fact that he's got a reader with him all the time anyway, which makes it pretty convenient to have a book or three in there. I'm not going to bring a book around with me everywhere I go just on the off chance that I might get stuck in a long line, or waiting for someone. But when such an event happens, having good reading material right at hand is very nice. Also nice is being able to have a selection of books in there at any one time, just in case I finish one book while waiting somewhere.

Battery life isn't much of an issue for me. I've got an older ipaq, and even with that I can usually squeeze about ten hours out of it with the addition of an extra battery pack that's small enough to tote around with the pda. Hooking it up isn't much of an issue. Take out of pocket, plug into pda. And if at home, the power situation wouldn't be an issue.

Re:'reader' books not much cheaper (1)

donutello (88309) | more than 10 years ago | (#6368508)

I have never really used the reader however an advantage to having the book electronically is being able to search.

Re:'reader' books not much cheaper (1)

dissy (172727) | more than 10 years ago | (#6368609)

> and even pass the book onto a friend.

Ahh but you are forgetting, in the USA, you cant do that.
Well, you can, but then you are violating copyright and thus a criminal.

The law specifically says you can not distribute a work that is copyrighted without the copyright holders permission.

The only reason its not _illegal_ is because of fair use laws, but the DMCA removed most of those, and the next version of law change will no doubt remove most or all of the rest.

Its only a matter of time if things dont start getting better soon.

Software and music labels already go after people selling used copyrighted materials online (Ebay and amazon and such)
Once its in their best interests to do this to real world stores, they will. And they will win there too.

Its sad, and it sucks, and i hate it too.. but its true :{

Re:'reader' books not much cheaper (1, Informative)

Anonymous Coward | more than 10 years ago | (#6368837)

Passing a hardcover or paperback book on to a friend is not a copyright violation in the U.S. and does not make you a criminal.

The principle that protects you is not Fair Use, but First Sale Doctrine -- which says that once a copyright holder distributes a copy of a work, the copyright holder loses any right to control further redistribution of that copy.

First sale doctrine (3, Informative)

yerricde (125198) | more than 10 years ago | (#6368847)

The law specifically says you can not distribute a work that is copyrighted without the copyright holders permission.

True, 17 USC 106 [cornell.edu] says that, but it limits itself "Subject to sections 107 through 121", such as 17 USC 109 [cornell.edu]:

Notwithstanding the provisions of section 106(3), the owner of a particular copy or phonorecord lawfully made under this title, or any person authorized by such owner, is entitled, without the authority of the copyright owner, to sell or otherwise dispose of the possession of that copy or phonorecord.

fair use laws, but the DMCA removed most of those

From the DMCA: "Nothing in this section [cornell.edu] shall affect rights, remedies, limitations, or defenses to copyright infringement, including fair use, under this title."

Re:'reader' books not much cheaper (1)

Jerf (17166) | more than 10 years ago | (#6368913)

I went to the MS Reader site and followed the links to the on-line publishers sites (such as B&N and amazon). In most cases, the reader format is only $1 cheaper, and sometimes $2 more expensive, than the corresponding paper book (soft or hardcover).

These facts being plainly obvious, the logical conclusion is either that A: The cost of setting up the Reader infrastructure is so high that these high prices must be charged to recoup them, or B: They want them to fail.

I don't know which it is. But there comes a time where the choice "They don't realize how stupid this is" ceases to be an option, and I think this is one of those times. These are not stupid people, they are out to make a buck, and if they aren't making money directly, they either expect to make money in the future, or are making it indirectly.

Re:'reader' books not much cheaper (1)

mikeboone (163222) | more than 10 years ago | (#6368914)

I agree that the eBook prices are too high. I've settled for reading the classics on my Handspring Visor.

Check out Plucker Books [pluckerbooks.com]. These are Gutenberg books formatted for the Plucker reader.

I still prefer a real book, but these come in handy when I'm feeding my infant son...bottle in one hand and Visor in the other.

Huh??? (2, Insightful)

lilricky (632829) | more than 10 years ago | (#6368335)

"...to anyone and everyone then on what later became the web..." What?? In 1971 http protocol was around? Or is the author trying to suggest that the internet became the web? I thought the web was part of the internet, not a replacement for. Perhaps Im misreading the article.

Re:Huh??? (2, Insightful)

dissy (172727) | more than 10 years ago | (#6368629)

> "...to anyone and everyone then on what later became the web..." What??

I think they are saying in 1971 it was distributed to anyone and everyone...
Then, on what later became the web, they distributed it there too.

Keeping in mind the web ripped most of its ideas from gopher, and FTP before that, so the web wasnt a breakthrough idea out of nothingness.
But i dont think they meant it as 'distributed on one medium which later that medium turned into the web'

Thats at least how i believe it was supposed to be read.. Hard to tell without commas and what not ;}

XML please (3, Insightful)

DrXym (126579) | more than 10 years ago | (#6368352)

Gutenberg is great and all, but it really needs to dump the text format. So much information is lost that it makes reading some texts extremely difficult. Some format that preserved chapter headings, footnotes, illustrations etc. would be a massive step forward.

Re:XML please (4, Informative)

starseeker (141897) | more than 10 years ago | (#6368410)

I think they discuss this somewhere. The whole point of ASCII is that it can be accessed simply, by almost any machine. It is as stable a format as you will find for data storage, anywhere. They are committed to these books being widely readable, and ASCII is the best way to assure this.

However, I agree that some books (most actually) lose something in ASCII. What I would like to see is a project which works off the basic Gutenberg texts and formats them in a readable way, preserves illustrations, etc. But it should be an add on to the project, not the main project. Also, remember that that level of preservation is much harder than just typing in and proofreading - you have to consider formatting and scanning images as well.

As a temporary measure, it would be nice to see someone do an XML markup that can be easily translated into LaTeX, so people can have pdfs with nice fonts, table of contents, title page, etc. That would be a step up. But to do it properly would take a separate effort, and a very large scale one even by Gutenberg standards. Worthwhile, yes. But involved.

Re:XML please (1)

AndroidCat (229562) | more than 10 years ago | (#6368468)

a very large scale one even by Gutenberg standards

I wonder. Does Gutenberg keep their sources in ASCII or something else that they runoff to produce the ASCII final version? It might be that they already have formatting information that a smarter runoff process could use. (Heh, I can dream, right?)

Re:XML please (3, Informative)

belbo (11799) | more than 10 years ago | (#6368529)

The final ASCII version is also produced by hand. After two rounds of proofing, the text gets into a queue. From that queue, a 'post-processor' checks it out and reformats it according to the Gutenberg guidelines, along with any error corrections that might still be necessary. Then she or he uploads the final version to Project Gutenberg, where the 'whitewashers' check the text yet again before posting it to the archive.

About the XML: You are in fact welcome to produce an XML version, I believe some fellows at DP indeed do that already. However, the main version is the simple text version, since you can read that with everything. But nothing keeps you from also posting an XML or PDF or TeX or whatever version.

belbo, post-processor at DP

(Boy I do hope there are no spelling errors in this *g*)

Re:XML please (1)

AndroidCat (229562) | more than 10 years ago | (#6368664)

It seems a lot of work. (From here at a distance.) I know that even when they started there were tools for that sort of work. (I just found my DTSS RUNOFF*** manual, whee!)

Ah well, if they have a standard way of formatting ASCII text then producing an XML version from it shouldn't be too daunting. (Me, once again from a distance.)

But an automated translation to Klingon, priceless! (I'm joking, that would be daunting!)

Re:XML please (4, Insightful)

fm6 (162816) | more than 10 years ago | (#6368533)

The whole point of ASCII is that it can be accessed simply, by almost any machine.
Just because you store something in XML, doesn't mean people have to use XML to read it. The whole point of XML is to have a format that you can easily transform. Transforming into ASCII is particularly easy.
XML markup that can be easily translated into LaTeX
If it's a good content-oriented XML app, it's easily transformed into LaTeX, or anything else. If it isn't a good content-oriented XML app (the StarOffice native format comes to mind) then it shouldn't be used for an online document repository.

I think the basic problem with the Gutenberg/DP people is that they've been doing things a certain way for so long, and they don't want to retool. And I can see their point -- changing over to XML is a lot of work. And the core DP team already seems pretty busy keeping the web site going.

On the other hand, I do wish they'd make it a priority. Right now I'm a volunteer proofreader, concentrating on getting out the famous Britannica 11th edition [wikipedia.org]. The amount of information that gets lost in scanning in Greek and other text with weird phonological conventions is just appalling. And the conventions for math and science formulas and equations produce a complex linear format I can't believe anyone would actually want to read.

Then again, it wouldn't be that hard to go back and insert proper markup. For 90% of the text there's a simple transform between the Gutenberg conventions and a reasonable XML format. The other 10% probably need another look anyway, and wouldn't be hard to do if they've saved the scan images. I haven't had the heart to ask if they do.
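
A minimal sketch of what such a "simple transform" might look like, under loudly stated assumptions: blank lines separate paragraphs, a block starting with "CHAPTER" is a chapter heading, and `_underscores_` mark italics. These conventions and the `book`/`chapter`/`p`/`em` element names are hypothetical, chosen for illustration, and are not the actual Gutenberg or DP rules.

```python
import re
import xml.etree.ElementTree as ET


def text_to_xml(text: str) -> ET.Element:
    """Convert plain text (under the assumed conventions) into an XML tree."""
    book = ET.Element("book")
    chapter = None
    for block in re.split(r"\n\s*\n", text.strip()):
        # Re-join hard-wrapped lines into a single logical paragraph.
        block = " ".join(line.strip() for line in block.splitlines())
        if not block:
            continue
        if block.upper().startswith("CHAPTER"):
            chapter = ET.SubElement(book, "chapter", title=block)
            continue
        if chapter is None:
            # Text before the first heading goes into an untitled chapter.
            chapter = ET.SubElement(book, "chapter", title="")
        para = ET.SubElement(chapter, "p")
        # With one capture group, re.split alternates plain and italic pieces.
        parts = re.split(r"_([^_]+)_", block)
        para.text = parts[0]
        for i in range(1, len(parts), 2):
            em = ET.SubElement(para, "em")
            em.text = parts[i]
            em.tail = parts[i + 1]
    return book


sample = "CHAPTER I\n\nIt was a _dark_\nnight.\n\nSecond paragraph."
print(ET.tostring(text_to_xml(sample), encoding="unicode"))
```

The remaining 10% mentioned above (Greek, formulas, tables) is exactly what a transform like this cannot recover, since that information never made it into the plain text.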

Re:XML please (1)

andrewjjenkins (617179) | more than 10 years ago | (#6368773)

They scan anyways - the proofreaders compare the ASCII version to the scanned image of a page to make sure they match.

Re:XML please (0)

Anonymous Coward | more than 10 years ago | (#6368439)

I totally agree. Once the format was standardized, developers could more easily create software to display or search the information in whichever way they choose. End users could then use different viewers depending on their intended use for the information.

Re:XML please (3, Informative)

DarkOx (621550) | more than 10 years ago | (#6368445)

The entire point of the project is to preserve the content in a format that is both human and machine readable. See, if I don't have any software from the present here in fifteen years and XML is long dead, I will still be able to read standard ASCII text even if I am just cat(ing) it through less or printing it as is. I can't reasonably read a book that is filled with XML tags, and if there is no longer software to parse them then it's not too useful. I am not saying that it would be hard to write such software, but the concept is to make sure it's easy and always easy to get the data. Also they do put chapter breaks in as text, so if you want to find one, most word processors and e-book readers these days, even the fifteen year old ones, can find text strings.

Re:XML please (4, Insightful)

Eloquence (144160) | more than 10 years ago | (#6368480)

I can't reasonably read a book that is filled with XML tags, and if there is no longer software to parse them then it's not too useful.

This is complete bullshit. With a proper setup you would convert the source into multiple output formats, including TXT, but you would keep the source in a format that maintains meta information such as formatting, chapters and pages. XML is used in the entire industry exactly with the expectation that it will be around for decades. Even if it won't, the open source code that we have to parse it will not magically disappear -- PG would keep using it to generate output texts from the XML source through all these years. You might as well argue that ASCII will go away.

Re:XML please (2, Insightful)

GigsVT (208848) | more than 10 years ago | (#6368503)

With a proper setup you could read MS Word 2000 docs 100 years from now too. The whole point is to not make it reliant on any particular software, or any particular fad.

XML hasn't been around long enough to say whether it is a fad or not. ASCII has been around longer than most of us have existed.

What would Captain Kirk say? (0)

Anonymous Coward | more than 10 years ago | (#6368506)

Imagine! What if he reads out the Yangs' Holy text, and "We The People" isn't in at least bold text?

Re:XML please (2, Insightful)

Vann_v2 (213760) | more than 10 years ago | (#6368554)

With some works the layout itself is an important part in comprehending them. To blindly remove the formatting so that everyone can read it is an injustice to the original author.

Re:XML please (3, Insightful)

DrXym (126579) | more than 10 years ago | (#6368568)

Yeah but the entire point of XML is that it defines structure not presentation. If you want to go off and produce something which is readable in some other format (e.g. text), feed the document through some XSL transformation or perl script and it pops out the other end in any way you desire. Someone else can feed it through something that produces a PDF, someone else a Palm e-Book, someone else braille. And this can all be automated on the server. Everyone is happy.


As for XML being long dead, this is highly unlikely. XML is just structured data and is itself just text. It would be trivial 5, 10, or even 100 years from now to pull out the data from the xml format in any way you please. Unless the grammar is horribly mangled (MS Office), it would even be possible to infer it without even knowing the grammar. I would trust Gutenberg to collectively come up with a format which would be simple for proof readers and parsers alike.

Re:XML please (4, Informative)

Teancum (67324) | more than 10 years ago | (#6368511)

Michael Hart has repeatedly made mention that he does not want to get caught up into the fad of the moment with text formatting issues, and that plain old ASCII is one constant that hasn't needed changing. Indeed, you can open up the original Declaration of Independence document with your standard web browser, and you can still read it just fine. I dare you to try and find any other data format that was commonly used 32 years ago that you can still read with current equipment.

With that said, I believe that XML is perhaps going to have the staying power that ASCII text has had for the past many years. And there are many volunteer projects that you can get involved with that do this including:

The HTML Writers Guild [hwg.org] - Originally they were trying to convert all of the Gutenberg texts to HTML, which has admittedly been a reasonable standard for a good number of years. Currently they are now going to a version of XML with some standard headings for titles, copyright info (or lack thereof), chapter headings and so forth. More is on this website.

Project Gutenberg XML [pgxml.org]This is a group more dedicated to the XML, but has a very similar purpose.

The point here is that once the data is put into ASCII text format, projects like this can and are being done. If you really feel that you want to help with the effort, please join one of these. Also, at any time you can also take the Project Gutenberg files yourself and do this, but at least this gives you a forum to share your work once you are done.

Re:XML please (1)

DrXym (126579) | more than 10 years ago | (#6368625)

The thing is, XML is just plain ASCII too (assuming you mandate not to use Unicode or some weird charset), so therefore you're not reducing the ability of people to read the text. At worst they'd be inconvenienced by extra tags if they tried to read it raw, but then again they wouldn't have to.
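A small sketch of the ASCII point, using Python's standard ElementTree module: serialized as us-ascii, any character outside ASCII is written as a numeric character reference, so the file stays 7-bit clean and nothing is lost. The function name is mine.

```python
import xml.etree.ElementTree as ET

def to_ascii_xml(xml_source):
    """Re-serialize XML so the bytes on disk are pure 7-bit ASCII."""
    root = ET.fromstring(xml_source)
    # Characters outside ASCII come out as &#NNN; references.
    return ET.tostring(root, encoding="us-ascii")

if __name__ == "__main__":
    data = to_ascii_xml("<p>caf\u00e9</p>")
    print(data)                       # the accented letter becomes &#233;
    print(ET.fromstring(data).text)   # parsing restores the original character
```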


The reason for this is that XML is easily translatable into just about anything else that the grammar allows for. So I don't see that it would make any difference to the project goals if the 'master copy' for every document were in XML and a plain ASCII transform was immediately produced and kept in sync with it. People could still grab a .txt file if they wanted, but those of us who want to read something on a Palm Pilot, or comfortably in a browser, would be able to do just that.
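Keeping the .txt in sync could be as dumb as a script that regenerates any text copy older than its XML master. A minimal sketch, assuming one foo.xml per book in a single directory and a crude "drop every tag" conversion; none of this reflects how the project's servers are actually set up.

```python
from pathlib import Path
import xml.etree.ElementTree as ET

def xml_to_text(xml_path):
    """Flatten one XML file to plain text by dropping every tag."""
    root = ET.parse(xml_path).getroot()
    return " ".join(" ".join(root.itertext()).split()) + "\n"

def sync_text_copies(directory):
    """Regenerate foo.txt for every foo.xml whose text copy is missing or stale."""
    rebuilt = []
    for xml_path in sorted(Path(directory).glob("*.xml")):
        txt_path = xml_path.with_suffix(".txt")
        stale = (not txt_path.exists()
                 or txt_path.stat().st_mtime < xml_path.stat().st_mtime)
        if stale:
            # errors="replace" turns any non-ASCII character into "?".
            txt_path.write_text(xml_to_text(xml_path),
                                encoding="ascii", errors="replace")
            rebuilt.append(txt_path.name)
    return rebuilt
```

Run from cron or on upload, this means the .txt never has to be edited by hand.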

using XML doesn't prevent using ASCII (1)

Trepidity (597) | more than 10 years ago | (#6368715)

One of the advantages of XML is that it's very easily transformable. If Project Gutenberg were to produce XML texts, it'd be trivial for them to automatically convert them to plain ASCII and make that version available as well.

Re:using XML doesn't prevent using ASCII (0)

Anonymous Coward | more than 10 years ago | (#6368778)

>One of the advantages of XML is that it's very easily transformable.
>If Project Gutenberg were to produce XML texts, it'd be trivial for
>them to automatically convert them to plain ASCII and make that
>version available as well.

If you're going to be converting the XML to plain ASCII, why bother to use XML to begin with? Sounds like a huge waste of time and effort to produce the XML version.

[sigh] (1)

Trepidity (597) | more than 10 years ago | (#6368961)

The point is that many of us would prefer an XML version. The argument against this was that ASCII is a longer-lasting archive format. My counter-argument was that an ASCII version can trivially be produced from the XML both for archival purposes and for those who would prefer such a version.

XML is not a file format (0)

Anonymous Coward | more than 10 years ago | (#6368940)

How can XML have staying power? It isn't a file format.

It is essentially a meta-format. You can put any tags in there you want. And that's the problem with it. Same problem as TIFF. Anyone can generate one, but few can read others' files, because to do so you need to understand every tag that could possibly be in there.

And since the format is so flexible, people create new tags every day. So programs written a year ago have zero chance of understanding a file. Just like TIFF.

If Gutenberg were to switch to anything it should be RTF; it's been around 10 years and is still going strong.

Re:XML please (0)

Anonymous Coward | more than 10 years ago | (#6368810)

In the works, and has been for a while. I have just released my vision paper as to where Distributed Proofreaders (DP) is headed and where we would like to take Gutenberg in the future.

Conversion on the fly to various formats is a major goal, but first we need a good source of high-quality marked-up etexts. To create this source we are going to be reworking some of the processes at DP.

You can read my paper here [pgdp.net] (http://www.pgdp.net/vision/)

And comment on it in the DP forums [pgdp.net] (http://www.pgdp.net/phpBB2/viewforum.php?f=4) (yes, you must make an account to post)

Charles Franks
Founder, Distributed Proofreaders

Re:XML please (1)

Q Who (588741) | more than 10 years ago | (#6368818)

Great idea!

I am sure it will be seriously considered... after, say, 25 years.

(If there is still XML)

how about HTML? (0)

Anonymous Coward | more than 10 years ago | (#6368956)

HTML preserves formatting and illustrations, and being an ASCII format, it is recoverable even if one doesn't have an HTML browser.
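To show what "recoverable" means here, a rough sketch that pulls the readable text back out of HTML with nothing but Python's standard library. Class and function names are mine, and joining the text pieces with spaces is a simplification.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect the text between tags and ignore the tags themselves."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)

def html_to_text(html_source):
    parser = TextExtractor()
    parser.feed(html_source)
    parser.close()
    # Join the pieces and collapse runs of whitespace into single spaces.
    return " ".join(" ".join(parser.parts).split())

if __name__ == "__main__":
    print(html_to_text("<h1>Title</h1><p>Some <i>old</i> text.</p>"))
```

Illustrations are another matter, of course; only the words come back this way.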

Oh, who reads books anymore anyway? (4, Funny)

Faust7 (314817) | more than 10 years ago | (#6368354)

I absorb all information directly through a USB link from my laptop to my head. Pretty nice, except for the typographical migraines. I always have ibuprofen in hand when visiting Slashdot.

Is that full-speed or hi-speed USB? (0)

Anonymous Coward | more than 10 years ago | (#6368809)

Is that full-speed or hi-speed USB?

cool (0)

Anonymous Coward | more than 10 years ago | (#6368360)

Way to go, /. This publicity is sure to help the project. Those who haven't got accounts can start helping, or at least consider it. There are bound to be a few people with extra time on their hands to kill who haven't heard of distributed proofreading and are willing to do it.

Too bad... (5, Interesting)

Insurgent2 (615836) | more than 10 years ago | (#6368370)

Unfortunately, with the copyright periods being extended so long, the material will only be of (ancient) historical interest. The 98 percent of copyrighted works that are unpublished and should be on there get to sit collecting dust instead of benefiting mankind.

Re:Too bad... (0)

Anonymous Coward | more than 10 years ago | (#6368546)

This points up why we need to institute "Intellectual Property Taxes". Copyrights on intellectual property could be held out of the public domain as long as some minimal taxes are paid. As with real property, if there is no benefit to the owner, he will sell it or let it revert to the public domain. Presently the owner has no incentive to release material to the public domain. There is no excuse for material to languish unused.
I would propose that the property tax schedule be based on a self-appraisal of the value. The owner sets a price on its value and must honor any buyer who meets the price.

Business Model (1, Interesting)

AndroidCat (229562) | more than 10 years ago | (#6368381)

1. Gather great PD books.
2. Hard work to put them in computer form.
3. ????
4. Profit! (For all humanity.)

Hip-Hip-Hooray for a job well done!

What the?? (0, Flamebait)

Pave Low (566880) | more than 10 years ago | (#6368422)

Nice... another opportunity to take an undeserved potshot at Microsoft for no apparent reason. Doesn't it ever get old?

Newsflash: Microsoft is not trying to promote literacy or freedom. They are trying to make money, like just about every other business.

If you want to criticize their Reader/ebook business go ahead, but it's rather petty that the submitter had to attach it to a completely unrelated story. Instead of more information and background about Project Gutenberg, we get this crap.

pot...kettle...black... (0)

Anonymous Coward | more than 10 years ago | (#6368443)

While we're on the subject of attaching criticism and potshots to unrelated stories, maybe you should check your sig.

Re:pot...kettle...black... (0)

Anonymous Coward | more than 10 years ago | (#6368501)

It looks like someone here doesn't know what a sig is.

MS Reader is crapola (2, Interesting)

blair1q (305137) | more than 10 years ago | (#6368437)

"cannot open this title on a Terminal Services session"

What bollocks. Free software and free books but you can't read them over a network link to your own compute server? Microsoft, as usual, screws the pooch.

Now. How do I uninstall this without removing my adenoids?

Ptui! (1)

usotsuki (530037) | more than 10 years ago | (#6368446)

This is why copyrights shouldn't be more than 25 years.

I say, make 'em 10 years renewable up to 50 (and non-transferable).

If only there were more works there like, er, hmm, Roald "Charlie & the Chocolate Factory"/"Matilda"/"The Witches" Dahl. :}

Meh, well, better than nothing. Too bad though they don't have the Tomson New Testament of 1576 [tripod.com].

-uso.

Mac disclaimer on PG files (1, Troll)

ArsSineArtificio (150115) | more than 10 years ago | (#6368498)

From the disclaimer/header on Project Gutenberg files:

If you have an FTP program (or emulator), please
FTP directly to the Project Gutenberg archives:
[Mac users, do NOT point and click. . .type]


Given that a) Macs, being Unix-based, have command-line FTP like everybody else and b) the idea of a point-and-click interface has now passed so far from being a bizarre and contemptible innovation that lots of people are trying hard to develop nice-looking Linux GUIs... isn't this snarky instruction now more than a little dated?

ASA

first figlet! (-1, Troll)

Figlet Troll (686897) | more than 10 years ago | (#6368500)

_______________________________________
|_|_____|_|_|_|___________(_)_______/__|__________ _____
|_|/___\|_|_|_|_|_|_|_'__\|_\_\/_/_|_|__/__`_|/__` _/___|
|_|_(_)_|_|_|_|_|_|_|_|_|_|_|> ____|___|_(_|_|_(_|_\___\
|_| \___/ |_|_|_|\__,_|_|_|_|_/_/\_\_|_|__\__,_|\__,_|___/
______________________________________________|___ /

We should all actually read this (4, Insightful)

tie_guy_matt (176397) | more than 10 years ago | (#6368692)

Putting a flag on your front porch is a great way to celebrate the 4th of July. An even better way to celebrate the United States' birthday would be to go to this site and actually read the documents that define us as a country.

In this day and age, when it seems everyone is a suspected terrorist and our liberties are stripped one by one in the name of homeland security, and in the name of the rights of large companies, I wish some of our elected officials would actually read these documents sometime.

A red, white, and blue flag isn't what makes this country great, nor is an extremely high gross domestic product -- it is the set of ideas that were written over 200 years ago that makes the USA great.

So everyone go to this site and read those documents. Even if you aren't American you should still read those documents because everyone has the right to the freedoms that our founding fathers wrote about.

Re:We should all actually read this (0)

Anonymous Coward | more than 10 years ago | (#6368848)

Tell me who's the real patriots

The Archie Bunker slobs waving flags?

Or the people with the guts to work

For some real change


Read more [lyricscafe.com].

Thanks for support, plans for future (5, Informative)

gbnewby (74175) | more than 10 years ago | (#6368789)

Thanks to everyone who has helped contribute eBooks and other support to Project Gutenberg! If you haven't already, please visit Distributed Proofreaders [pgdp.net] and proof a page today!

Lots of plans for the future:

  • Post-#10000 formatting changes. We'll be rearranging our directories to make it easier to find things. Likely we'll go with something OAI (OpenArchives.org) compliant.
  • Conversion on the fly to many formats. We'll be putting eBooks into XML format (mostly using teixlite.dtd, we think) for conversion on the fly to many other formats.
  • New ways to donate. "Sponsor a book"
  • More contemporary content. We receive donations nearly every week from currently published authors who want to make their stuff available to a wider audience (e.g., Doctorow's Down and Out [ibiblio.org]).
  • Your ideas! Visit gutenberg.net [gutenberg.net] to sign up for newsletters, find out how to get started producing an eBook, and find eBooks.
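For the curious, conversion on the fly might look something like the sketch below: one XML source, and a table of renderers keyed by the format the reader asked for. This is only an illustration. The element names (text, head, p) are loosely modeled on TEI but are assumptions here, not the final markup, and the function and table names are hypothetical.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical sample; element names are assumptions, not the real DTD.
SAMPLE = ("<text><head>Chapter 1</head>"
          "<p>Call me Ishmael.</p><p>Tom &amp; Huck.</p></text>")

def render_txt(root):
    """Plain text: headings in capitals, blank line between blocks."""
    lines = []
    for elem in root:
        text = "".join(elem.itertext()).strip()
        lines.append(text.upper() if elem.tag == "head" else text)
    return "\n\n".join(lines) + "\n"

def render_html(root):
    """HTML: map each source element to an HTML tag, escaping the text."""
    tag_map = {"head": "h2", "p": "p"}
    parts = []
    for elem in root:
        out_tag = tag_map.get(elem.tag, "div")
        text = html.escape("".join(elem.itertext()).strip())
        parts.append(f"<{out_tag}>{text}</{out_tag}>")
    return "\n".join(parts) + "\n"

# One entry per output format; adding a format means adding a renderer.
CONVERTERS = {"txt": render_txt, "html": render_html}

def convert(xml_source, fmt):
    try:
        renderer = CONVERTERS[fmt]
    except KeyError:
        raise ValueError(f"unsupported format: {fmt}") from None
    return renderer(ET.fromstring(xml_source))

if __name__ == "__main__":
    print(convert(SAMPLE, "txt"), end="")
    print(convert(SAMPLE, "html"), end="")
```

The master file is parsed once per request and never modified, so new output formats can be added later without touching the eBooks themselves.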


Thanks especially to our main and backup distribution sites, iBiblio [ibiblio.org] and The Internet Archive [archive.org]. And thanks to the THOUSANDS of volunteers who have brought us nearly to our 10,000th eBook.



Dr. Gregory B. Newby

Chief Executive and Director

Project Gutenberg Literary Archive Foundation
http://gutenberg.net

A 501(c)(3) not-for-profit organization with EIN 64-6221541

gbnewby@pglaf.org

A decent fast scanner? (0)

Anonymous Coward | more than 10 years ago | (#6368796)

Does there exist a decent FAST scanner using free software that runs on GNU/Linux or *BSD?

Especially when you only need to scan text, it seems that every scanner on the market takes > 10 seconds per page.

Where are the 1-3 second scanners? What do PG volunteers use?

Something based on DV cameras? (1)

yerricde (125198) | more than 10 years ago | (#6368966)

Where are the 1-3 second scanners?

Wouldn't it be possible to rig up a high-speed scanner based on digital video technology? Or are CCD and CMOS image sensors not fine enough yet?
