Is Free Software Ready For E-publishing?

Unknown Lamer posted about 3 years ago | from the information-wants-to-be-free dept.

Open Source 221

johanneswilm writes "Over more than 3 years I have been writing my PhD thesis on the politics of Nicaragua. Being the most professional system for PDF generation, I went with LaTeX, and, to make the text accessible for the editors, I used the LyX editor. Now that the publication date comes near, I found I had to spend considerable time creating a script to convert the manuscript to formats such as Epub as none of the available tools were quite ready to do it automatically. Is LaTeX only good for writers in the natural sciences? Is the open source community boycotting ebook formats, as Richard Stallman has proposed? Are there better tools to do the same?"

Wordpad (0, Troll)

Dr.Bob,DC (2076168) | about 3 years ago | (#36984840)


When I was at Life Chiropractic College I used Microsoft Write then Wordpad for all my assignments. Those are free and handled some fairly large documents with clipart pictures (spines, vertebrae, hands, etc.)

Now that I'm making money, we use MS Office for all our office PCs. That's not free though. I bet Wordpad would do what you need and there are lots of clipart libraries you can download or find online for your thesis.

Re:Wordpad (-1)

Anonymous Coward | about 3 years ago | (#36984944)

Thank you. Next time, staying on-topic would be much appreciated, fascinating though these insights into your meaningless existence are.

Re:Wordpad (-1)

Anonymous Coward | about 3 years ago | (#36985196)

Are you kidding?

He got through a whole post without mentioning how you get subluxations from being on the same continent as a computer. Clearly he's been to some sort of lunatics anonymous, and is recovering from his crazy addiction.

Re:Wordpad (-1)

Anonymous Coward | about 3 years ago | (#36985308)

Hmm, you're right. I guess this is a very good point and he should be commended for staying so close to the topic.

In which case, Dr. Bob, I apologise. Thank you for saying something more or less relevant. I look forward to your next missive with great excitement.

You should had compared (4, Insightful)

zget (2395308) | about 3 years ago | (#36984844)

Being the most professional system for PDF generation, I went with LaTeX

Now that the publication date comes near, I found I had to spend considerable time creating a script to convert the manuscript to formats such as Epub

It sure sounds like the most professional system!

The truth is, if you want your job done, you look at the merits of every possible program without considering if it's open source or not. There is good software like Apache that is mostly good for web hosting (unless you have certain requirements). Then there is lots of shit. The same is true for proprietary software tho. But if you want to get something real done, it's just stupid to limit yourself to only open source OR proprietary software. Pick the best tool for the job.

Re:You should had compared (-1)

Anonymous Coward | about 3 years ago | (#36984906)

Pick the best tool for the job.

Thank you, captain obvious.

Re:You should had compared (-1, Flamebait)

orasio (188021) | about 3 years ago | (#36984970)

-1, Shill

Re:You should had compared (-1)

Anonymous Coward | about 3 years ago | (#36985180)

-1, Fanboy

Re:You should had compared (-1)

Anonymous Coward | about 3 years ago | (#36985236)

Exactly what is he a "shill" for? He did not recommend or even mention a single product anywhere in his post.

Re:You should had compared (1)

AndrewNeo (979708) | about 3 years ago | (#36985334)

Apache! Burn the fanboy!

Re:You should had compared (0)

Alex Belits (437) | about 3 years ago | (#36985964)

Microsoft.

"Use best tool for the job" is now a code word for "Hey, look, Microsoft has some shiny thing we want you to use -- we promise, it's better than what you use now!"

Re:You should had compared (5, Informative)

TheRaven64 (641858) | about 3 years ago | (#36985002)

My fourth book (Go Phrasebook) is due to be published soon. I send 3 copies to the publisher:

  • Print, PDF, generated by pdflatex. Black and white with crop marks.
  • eBook PDF, generated by pdflatex, with cross-referencing hyperlinks and colour for the syntax highlighting.
  • XHTML, generated by some code I wrote [gna.org], with hyperlinks and cross references and semantic markup in the code listings generated by clang for [Objective-]C[C++].

The publisher can then just tweak the CSS for the ePub (XHTML) version. A C code listing has lots of span tags marking words as keywords, typedefs, macro uses, variables, and so on. How these are presented is controlled from the CSS, as is all of the rest of the styling.

The important thing is to make sure you separate content from presentation. If you use a lot of TeX markup in your chapters, then it's hard to use anything other than [La]TeX to typeset it. If you use simple semantic markup with all of the macros defined in a document class, then you can parse the same markup easily with something else and then transform it into some other format.

You could use some sort of XML and generate TeX from it, but typing XML is horrible. I like to work in vim, and with a couple of macros entering LaTeX is really easy.
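To make the "separate content from presentation" point concrete, here is a minimal sketch, not the code linked above. It assumes the chapters use only simple, non-nested semantic macros; the macro names and the tag mapping (MACRO_TO_TAG) are hypothetical and chosen purely for illustration.

```python
import re
from html import escape

# Hypothetical mapping from semantic LaTeX macros to (XHTML tag, CSS class).
# In a real project these names would come from your own document class.
MACRO_TO_TAG = {
    "chapter": ("h1", None),
    "section": ("h2", None),
    "emph": ("em", None),
    "keyword": ("span", "keyword"),
}

# Matches \name{body} where body contains no nested braces.
MACRO_RE = re.compile(r"\\(\w+)\{([^{}]*)\}")


def render_macro(match):
    """Turn one \\name{body} occurrence into an XHTML element."""
    name, body = match.group(1), match.group(2)
    if name not in MACRO_TO_TAG:
        return match.group(0)  # leave unknown macros untouched
    tag, css_class = MACRO_TO_TAG[name]
    attr = f' class="{css_class}"' if css_class else ""
    return f"<{tag}{attr}>{body}</{tag}>"


def to_xhtml(source):
    """Escape XML special characters first, then map macros to elements."""
    return MACRO_RE.sub(render_macro, escape(source, quote=False))


print(to_xhtml(r"\section{Intro} uses \keyword{func} & more"))
```

Because the styling lives in CSS classes rather than in the markup, the same source can still go through pdflatex unchanged. Nested macros would need a real parser rather than a regular expression.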

Re:You should had compared (2)

gatzke (2977) | about 3 years ago | (#36985220)

For mathy stuff, LaTeX is great. I am not sure how to separate that content from the TeX itself. And I never understood why MathML did not just use TeX.

And I am a huge proponent of LyX. Much easier to get students to use than advocating vi+LaTeX....

Re:You should had compared (1)

TheRaven64 (641858) | about 3 years ago | (#36985926)

My code doesn't convert the math stuff yet, but it's pretty easy to do. There are existing libraries out there that will generate MathML from LaTeX, and it's easy for a parser to detect math mode stuff bracketed by dollar signs and just pass that off to a library.

MathML didn't use TeX because it does more. TeX is great for typesetting maths, but it doesn't give you any semantic information. MathML captures both - how an equation should look and what it means, although both parts are optional.
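A rough sketch of the "detect math mode bracketed by dollar signs" step, assuming only inline `$...$` math (no `$$...$$` or `\[...\]`) and treating `\$` as a literal dollar sign. The `math_to_mathml` argument is a placeholder for whatever LaTeX-to-MathML library you pick; no particular library's API is assumed here.

```python
import re

# An unescaped "$", the shortest possible body, then another unescaped "$".
MATH_RE = re.compile(r"(?<!\\)\$(.+?)(?<!\\)\$", re.DOTALL)


def split_math(text):
    """Split text into ("text", s) and ("math", s) chunks, in order."""
    chunks = []
    pos = 0
    for match in MATH_RE.finditer(text):
        if match.start() > pos:
            chunks.append(("text", text[pos:match.start()]))
        chunks.append(("math", match.group(1)))
        pos = match.end()
    if pos < len(text):
        chunks.append(("text", text[pos:]))
    return chunks


def convert(text, math_to_mathml):
    """Pass math chunks to the supplied converter; keep other text as is."""
    return "".join(
        math_to_mathml(chunk) if kind == "math" else chunk
        for kind, chunk in split_math(text)
    )


# Stand-in converter that only wraps the TeX source; a real one would parse it.
print(convert(r"Cost \$5, area $x^2$ done", lambda tex: "<math>" + tex + "</math>"))
```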

Re:You should had compared (1)

gatzke (2977) | about 3 years ago | (#36985342)

What was wrong with using latex2html to generate hyperlinked files and such? What is the advantage of using your code? I did not see a readme or faq.

I have used latex2html successfully in a few cases, but maybe I am missing something.

Re:You should had compared (2)

impaledsunset (1337701) | about 3 years ago | (#36985526)

Because it doesn't work? There is no generic LaTeX to HTML convertor out there, and all of them handle only specific use-cases.

In particular, all convertors either try to parse part of the LaTeX code or try to interpret the DVI that is produced as a result. With the first approach you'd have to limit your use of packages, because otherwise the convertor will fail unless it's a complete LaTeX implementation. With the second approach you won't be able to use XeLaTeX because it doesn't produce standard DVI files, and instead produces a new extended format which is not supported by any other tools yet. And you should really be using XeLaTeX for serious work.

latex2html uses the first approach, and it fails to work with documents that use UTF-8 international characters, and it will fail particularly spectacularly if you use some more interesting packages.

The best approach is to use a subset of LaTeX that you can safely convert to another format. Using tools written by yourself is actually easier than using the tools that are available, because the available tools also handle only a specific scenario, only you don't know which one. I have my own personal simplistic format that I convert to XeLaTeX code, and I plan to create XHTML convertors when I happen to need them. It's really simple: I have a few macros that I replace with XeLaTeX code and a template.
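For anyone wondering what "a few macros plus a template" could look like, here is a small sketch. The input format is invented for the example (lines starting with "# " or "## " are headings, *word* is emphasis), and it is not the poster's actual format. It also does not escape LaTeX special characters in the input.

```python
import re
from string import Template

# Minimal XeLaTeX wrapper; fontspec is the usual font package under XeLaTeX.
DOCUMENT = Template(r"""\documentclass{book}
\usepackage{fontspec}
\begin{document}
$body
\end{document}
""")

EMPH_RE = re.compile(r"\*(.+?)\*")


def to_xelatex(source):
    """Translate the hypothetical mini-format into a XeLaTeX document."""
    out = []
    for line in source.splitlines():
        if line.startswith("## "):
            out.append(r"\section{" + line[3:] + "}")
        elif line.startswith("# "):
            out.append(r"\chapter{" + line[2:] + "}")
        else:
            # *word* becomes \emph{word}
            out.append(EMPH_RE.sub(lambda m: r"\emph{" + m.group(1) + "}", line))
    return DOCUMENT.substitute(body="\n".join(out))


print(to_xelatex("# Intro\nSome *bold* text"))
```

The appeal of this approach is that an XHTML back end only has to understand the same three constructs, not all of LaTeX.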

Re:You should had compared (1)

gatzke (2977) | about 3 years ago | (#36985918)

A quick google, looks like latex2html may work on files with UTF-8 chars in them:

After a lot of research I finally could find the right invocation of latex2html to generate HTML correctly with accents:

latex2html -html_version 4.0,latin1,unicode book.tex

From http://miguel.leugim.com.mx/index.php/2008/05/18/latex2html-and-utf8-encoding/ [leugim.com.mx]

But you are right on use of more interesting packages, latex2html probably can't handle them. But I don't see the advantage of writing my own code and limiting myself to simple cases.

Re:You should had compared (1)

rwa2 (4391) | about 3 years ago | (#36986010)

Meh, it worked OK for my thesis. I managed not to need too many international characters, I guess.

lyx (eventually I migrated it to emacs so I could stop wrestling with the GUI, which was having a cow with some of my larger figures) -> latex -> latex2html -> pluckr

also had a makefile to generate/update the .dvi -> .ps -> .pdf targets as well.

Re:You should had compared (1)

lee1 (219161) | about 3 years ago | (#36985724)

You could use some sort of XML and generate TeX from it, but typing XML is horrible. I like to work in vim, and with a couple of macros entering LaTeX is really easy.

I type both xml and LaTeX with vim, and I find both are easy if you use the right plugins. Using the xml plugin vim will type all the angle brackets for you, complete your tags, format, and generally take the drudgery out of writing xml.

Re:You should had compared (0)

Anonymous Coward | about 3 years ago | (#36985046)

The truth is, if you want your job done, you look at the merits of every possible program without considering if it's open source or not.

I don't think you realize where you are.

Re:You should had compared (1)

mkkohls (2386704) | about 3 years ago | (#36985054)

I agree completely. Though as a student the best way it seems to get full featured software without pirating or spending an arm and a leg is open source. The quality is there if you know what you're looking for.

Re:You should had compared (3, Insightful)

udoschuermann (158146) | about 3 years ago | (#36985192)

Actually it makes perfect sense if you do not wish to support the mindset of proprietary software, and the dependencies and liabilities that such an association creates (and not just for yourself, either!) Obviously there is a price to be paid for refusing to run with wolves, hence the posted question: Is there a way to accomplish what needs to be done using only FLOSS (Free/Libre/Open Source Software)?

When copyleft restricts which tools may be used (2, Informative)

tepples (727027) | about 3 years ago | (#36985330)

But if you want to get something real done, it's just stupid to limit yourself to only open source OR proprietary software. Pick the best tool for the job.

Be careful: sometimes, especially in cases of works under a "copyleft" or "share-alike" license, a work's copyright license limits which tools for the job are lawful. For example, some licenses require works to be made available in an editable format that isn't Java-trapped [gnu.org].* See, for example, sentences containing "Transparent" in the GNU Free Documentation License [gnu.org] and sentences containing "technological" in CC BY-SA [creativecommons.org]. You can use proprietary tools yourself, but you also have to make sure that the work can be edited with free tools.

*The term's origin is historical, predating IcedTea.

Re:When copyleft restricts which tools may be used (4, Insightful)

Anonymous Coward | about 3 years ago | (#36985846)

Be careful: sometimes, especially in cases of works under a "copyleft" or "share-alike" license, a work's copyright license limits which tools for the job are lawful

What are you talking about? This would only be relevant if we were discussing templates and packages to be embedded as part of the document, it has nothing to do with software.

(If you still don't understand: GPL/GFDL/CC-SA only affect derivative works [derivative as defined by copyright law], a document made in MS Word is not subject to the copyright of MS Word unless you decompile Word and paste the code into the document. This simple fact is why you can (and many people, most obviously Apple, do) compile proprietary applications using GCC. DISCLAIMER: I'm talking about copyright law here, not contract law. If the MS Word EULA [which is a contract rather than a copyright license] says that all MS Word documents must be copyrighted a certain way then that may be a problem, fortunately I'm not aware of any applications that have such a clause)

Re:considering if it's open source (1)

TaoPhoenix (980487) | about 3 years ago | (#36985434)

Sorry, I resoundingly disagree.

Looking at your brand new user name, some members would call your post a broad shill for all proprietary closed programs.

The entire point of Open Source is that it can be moved to new innovative uses. Open Source will be slightly-to-much harder to use in many cases! But that is not the point of Open Source! The point is that a valid computing experience can be made out of open components. Yes, someone will have locked down the "1-click" version of a feature with a patent. So it takes you three clicks. Three clicks is way better than spending an hour kludging it.

Re:You should had compared (1)

hedwards (940851) | about 3 years ago | (#36986036)

The truth is, if you want your job done, you look at the merits of every possible program without considering if it's open source or not. There is good software like Apache that is mostly good for web hosting (unless you have certain requirements). Then there is lots of shit. The same is true for proprietary software tho. But if you want to get something real done, it's just stupid to limit yourself to only open source OR proprietary software. Pick the best tool for the job.

The problem there is that you also need to factor in the cost of Windows or OSX if you want to use proprietary software that's designed for professionals. If you're using one of those platforms then the calculation is much easier, but if you're not interested in giving those companies money then your options are severely limited.

through HTML (2)

fph il quozientatore (971015) | about 3 years ago | (#36984940)

The best way to go seems to be LaTeX->HTML->ePUB. I guess many of your problems do not come from LaTeX itself, but from the fact that the LaTeX code that LyX outputs is... well... not meant for human editing and for further work. (haven't worked with LyX in a while, though -- maybe the quality of the TeX it produces has considerably improved in the meantime).

Re:through HTML (2)

gatzke (2977) | about 3 years ago | (#36985086)

I have been using LyX for over a decade, and I feel it is a great tool for using LaTeX without the headache of LaTeX. The code it produces is not great, but it is reasonably readable.

Re:through HTML (1)

tehniobium (1042240) | about 3 years ago | (#36985238)

Once you practice a little bit with LaTeX (we're talking using it for a couple of weeks) there really is no headache.

In fact, the lack of headache from LaTeX is what makes it better than any WYSIWYG editor out there (LyX is good, but still a headache imo)

Re:through HTML (3, Interesting)

gatzke (2977) | about 3 years ago | (#36985484)

There are some things LyX is better at than pure LaTeX code.

I can see the current version of my figures, not just rely on the file name.

I can add references from a list instead of trying to remember what labels I have used.

I can search bib items and add / order citations easily.

I can make complex tables without forgetting some damn }

I can generate and view a new version of my document in a single keypress.

I can see my equations without having to mentally render them, while still using most of my TeX knowledge (\alpha _12 in LyX is the same as \alpha_{12})

Students can make the transition from Word a little more readily. Remember, LyX is not WYSIWYG, it is WYSIWYM (what you mean) so the on screen representation is close to the final but not exact.

Plus you have access to tons of menu options that you may not be aware of. I learn more about LaTeX by using and exploring LyX. And you can always use pure code if you want, for any fancy stuff.

calibre (3, Informative)

Anonymous Coward | about 3 years ago | (#36984942)

calibre is a free and open source e-book library management application that can convert to and from most ebook formats, and it does a pretty good job of it.

http://calibre-ebook.com/

Easy solution (1)

theatreman (931919) | about 3 years ago | (#36984960)

LaTeX -> pdf then convert here: http://www.2epub.com/ [2epub.com]

Re:Easy solution (5, Insightful)

TheRaven64 (641858) | about 3 years ago | (#36985060)

Going through PDF is horrible. LaTeX contains a lot of semantic markup. ePub is XHTML, which is a form of semantic markup. PDF is a presentation format. So, you start with semantic markup, discard it all, and then try to generate it again by magic.

You end up with something that looks vaguely like the PDF, but loses most of the semantic information (e.g. section / chapter breaks). Worse, you often don't want the ePub version to look like the PDF - they're aimed at different form factors.

Re:Easy solution (5, Insightful)

digitig (1056110) | about 3 years ago | (#36985170)

The trouble is, PDF is a pretty rotten format for e-readers, because it's all page-layout oriented and so produces output that doesn't scale well for different screen formats and text sizes. It's the wrong format for the job. And DVI has pretty much the same problems. The problem isn't that free software isn't ready for ePublishing -- Calibre and Sigil do the job well. The problem is that there's a disconnect between the assumptions LaTeX makes about a document and the assumptions that are valid for ePublishing. Sorry if it's restating the blindingly obvious, but you didn't want the best system for PDF generation, you wanted the best system for PDF and EPUB generation, and that probably isn't LaTeX.

Re:Easy solution (0)

lee1 (219161) | about 3 years ago | (#36985818)

But you can make your pdf with any page size and font. You could easily make a set of pdfs for all the common devices from a single LaTeX source, and the result would look much better, typographically, than the typical ebook.

Re:Easy solution (2)

ConceptJunkie (24823) | about 3 years ago | (#36985904)

This is fine. Then if I purchase an e-Book, I only need the PDF version specific to the device I'm currently using (a Nook Classic)... oh, and any device I might ever want to use for the rest of my life. A proper eBook format cannot be tied to a specific page format.

I like PDF for computer use, but the parent is right... it's definitely "rotten" for e-Readers. I've tried converting PDF to ePub to use on my Nook and it's a hit-or-miss proposition, with much more "miss" than "hit".

Re:Easy solution (2)

Dog-Cow (21281) | about 3 years ago | (#36985982)

Who defines the "common" devices? How do you handle something like an Android or iDevice, where the orientation can be changed? PDFs are not a good format for anything destined for a screen instead of paper. That computer monitors are (mostly) large enough to display most of the common paper size (letter/A4) is fortunate, but should not be relied upon.

...PROFIT!! (1, Flamebait)

djsmiley (752149) | about 3 years ago | (#36984982)

1. Realise no scripts exist for problem
2. Write scripts
3. Release scripts as open source
4. Don't post pointless problem on slashdot
5. ???
7. PROFIT!

(We don't talk about point. 6)

Re:...PROFIT!! (5, Insightful)

Khan Fused (446871) | about 3 years ago | (#36985148)

1. Realise no scripts exist for problem
    1.1 Realize that someone writing a thesis on Nicaraguan politics may not know how to program
    1.2 Begin learning to program
    1.3 Spend more time learning to program
2. Write scripts
    2.1 Divert time from PhD thesis to write scripts
    2.2 Spend more time (diverted from PhD program) learning to program sufficiently to write workable scripts to solve stated issue
3. Release scripts as open source
    3.1 Fail to complete PhD thesis in time due to time spent programming

Re:...PROFIT!! (1)

kravlor (597242) | about 3 years ago | (#36985382)

This is very true. I wish I had mod points!

Re:...PROFIT!! (4, Funny)

dstar (34869) | about 3 years ago | (#36985406)

4. Realize this is exactly what happened to Knuth.
        4.1 Take consolation in the fact that at least it's just a thesis, not the next volume of TAOCP.

Re:...PROFIT!! (0)

Anonymous Coward | about 3 years ago | (#36985856)

Arghh! Beat me to it.

Re:...PROFIT!! (1)

Anonymous Coward | about 3 years ago | (#36985452)

4. The scripts have a bigger audience than yet another thesis on Nicaraguan politics.

Re:...PROFIT!! (0)

Anonymous Coward | about 3 years ago | (#36985696)

One of the best comments I have read on here. I am a technology user, not a hard core technology lover. There is room for both so it is nice to see a recognition that not everyone needs to be a proficient programmer.

Perhaps you should have (0)

Anonymous Coward | about 3 years ago | (#36985010)

considered what LyX/LaTeX were actually designed for. LaTeX is an interface to the TeX *typesetting* program, which does exactly that: it sets type on a page for printing (on paper) purposes. It just so happens that TeX can "print" to digital formats like PDF and DVI, which also just happen to support some "ePublishing" features like indices, search and table of contents. TeX/LaTeX/LyX are themselves NOT ePublishing platforms, they are means of applying visual styles to text. Since ePublishing itself doesn't even make sense as a term (is a website with information considered "ePublishing"? I think so -- so perhaps you should have authored your paper on a web CMS) when you consider the technical intricacies of everything involved, I think you'd have been better off investigating all "publishing" options before even starting to author your paper in a complex system. If it was me: I would have started to write text in a plain text file until I had decided the best route to go. If graphics and equations were required, I would have moved to a generic HTML + css method.

CSS paged media (2)

tepples (727027) | about 3 years ago | (#36985388)

If graphics and equations were required, I would have moved to a generic HTML + css method.

Most web browsers that I've seen are based on the model of rendering a web page to a scroll that is 960px wide by infinitely tall. But in the real world of print, the codex has replaced the scroll. The paged media module in CSS3 [w3.org] is still only a Working Draft. So which web browser would you recommend that has thorough support for MathML and for CSS paged media?

Formats are all over (0)

Anonymous Coward | about 3 years ago | (#36985018)

Work with Calibre if you need to convert formats.

epub is basically compressed html with some indexes; I think it's even .zip format.

Calibre can handle most of this stuff for you if you can get it into an acceptable format from tex.
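To illustrate the "compressed html with some indexes" description, here is a sketch of the container layout using only the Python standard library. An EPUB is a zip archive whose first entry is an uncompressed file named mimetype, plus META-INF/container.xml pointing at a package (OPF) file. The single-chapter layout and the file names content.opf and chapter1.xhtml are choices made for this example. A file built this way is unlikely to pass a validator as is, since EPUB 2 also expects a toc.ncx; treat it as a picture of the structure rather than a finished book.

```python
import io
import zipfile
from xml.sax.saxutils import escape

CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

CONTENT_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>{title}</dc:title>
    <dc:language>en</dc:language>
    <dc:identifier id="bookid">{identifier}</dc:identifier>
  </metadata>
  <manifest>
    <item id="chapter1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="chapter1"/>
  </spine>
</package>
"""


def build_epub_bytes(title, identifier, chapter_xhtml):
    """Return the bytes of a skeletal single-chapter EPUB archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # The mimetype entry must come first and must not be compressed.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", CONTAINER_XML)
        archive.writestr("content.opf", CONTENT_OPF.format(
            title=escape(title), identifier=escape(identifier)))
        archive.writestr("chapter1.xhtml", chapter_xhtml)
    return buffer.getvalue()
```

Writing the result of build_epub_bytes(...) to a file with an .epub extension and opening it in an archive tool shows the same layout you would see inside a commercial ePub.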

Calibre? (1)

anton.karl (1843146) | about 3 years ago | (#36985022)

Why not just use Calibre to convert from pdf? It is very easy to use and supports all kinds of formats. See: http://calibre-ebook.com/ [calibre-ebook.com]

Re:Calibre? (3, Informative)

ConceptJunkie (24823) | about 3 years ago | (#36985968)

Because PDF to ePub conversion generally gives you pretty awful results. Nothing against Calibre. I use it. But most PDFs I've tried to convert for my Nook Classic have had less than stellar results: readable if you're lucky, but not nicely formatted. And if there are embedded images, all bets are off.

RE: Stallman link (3, Interesting)

craftycoder (1851452) | about 3 years ago | (#36985032)

Stallman complains about DRM and a lack of anonymity with eBooks. It seems to me that this story relates very closely to legally acquired music. While it is still difficult to legally acquire digital music anonymously, it is easy to get it without DRM. I suspect books will follow this same path if consumers value it as a feature. In practice there is in fact little anonymity in the purchase of real books as everyone wants you to swipe your "club" card and use your debit card to make the purchase but his point is well taken. The option to buy an unpopular book in secret is nice.

With time and interest from consumers we will have DRM free books.

Anonymity is dead and gone and I didn't even get an invitation to the funeral. We should all mourn its passing.

Re: Stallman link (0)

pnewhook (788591) | about 3 years ago | (#36985248)

Yes, well Stallman is quite the crackpot, so anything he says should be questioned as to ulterior purposes.

What rock did you just crawl out from under. (1)

bobs666 (146801) | about 3 years ago | (#36985438)

Richard Stallman is one of the geniuses of our time. All he wants is to share software with his friends, and to have his friends share it back.

johanneswilm also wants to share. It's clear from his wanting to use FOSS in his process.

Perhaps you need to rethink who is the crackpot here? The answer is clear here.

Re:What rock did you just crawl out from under. (0)

Anonymous Coward | about 3 years ago | (#36985828)

I really value RMS's work, but most people can't make a living by being the Free Software Pope.

Re:What rock did you just crawl out from under. (1)

ConceptJunkie (24823) | about 3 years ago | (#36986002)

I think it's fair to say RMS is both a genius and a crackpot. His goals are laudable, and I support them, but what he's willing to sacrifice to achieve them isn't necessarily so.

Re: Stallman link (1)

Richard_at_work (517087) | about 3 years ago | (#36985410)

I don't think anonymity is dead, I think it's quite alive and kicking.

What did die was the expectation that others should protect your anonymity for you.

Open source solves problems programmers have (1)

Qzukk (229616) | about 3 years ago | (#36985034)

I went to arxiv.org and picked a dozen or so papers from the "new" list and clicked their other format links. They're available in pdf, ps, and dvi formats. This is hardly a complete analysis since I don't have any access to the "real" journals, but I have to wonder how many journals and universities are demanding papers in ebook format.

Open Source generally scratches itches. You may be one of the first people with the itch of converting theses to ebook formats.

Re:Open source solves problems programmers have (3, Informative)

TheRaven64 (641858) | about 3 years ago | (#36985108)

I've not published anything in a journal for a couple of years, but in computer science every journal worth reading accepts PDF submissions and either provides a LaTeX style, asks for your LaTeX source to edit themselves, or tells you which standard LaTeX style to use. It's a good first check for a journal - if they don't encourage LaTeX submissions, they probably suck. Apparently the same is true in mathematics and physics, but less so in other subjects. In the humanities it's common for journals to require MS Word documents (and place insanely strict requirements on the formatting of the bibliography that are trivial with BibTeX and very hard with MS Word, from what I've heard).

Re:Open source solves problems programmers have (2)

Phillip2 (203612) | about 3 years ago | (#36985486)

Short answer; none want epub formats as submissions. But this doesn't mean to say that there is not a desire to produce them from submissions. Lots of scientists and academics want to read articles on the go, without having to carry around lots of paper.

My own experience, however, is that the big move up is from PDF to HTML. This improves the reading experience enormously. EPUB on the other hand is limited. Many ebook readers don't work that well for academic content: mathematics is dealt with badly with non-scalable fonts, graphs and images are poor, citations are not well supported. I haven't seen a huge use case for epub yet.

Boycotting? Hardly (2)

Clopy (857418) | about 3 years ago | (#36985042)

"Is the open source community boycotting ebook formats?"

Hardly. Calibre [calibre-ebook.com] is an excellent converter and library manager, and it's compatible with most of the readers out there for syncing. You could try converting from PDF to ePub with it, although PDF is a lousy input format.

Re:Boycotting? Hardly (1)

pseudonomous (1389971) | about 3 years ago | (#36985556)

That's linked in TFA, so apparently it didn't work in this case.

Re:Boycotting? Hardly (1)

geantvert (996616) | about 3 years ago | (#36986110)

The link to the Richard Stallman page is not against ebooks but about Amazon ebooks, or to keep it simple, about DRM, proprietary formats and all the other nice features introduced to "protect" users.

There is nothing wrong with open ebook formats such as EPUB (XHTML+CSS+XML) as they remain DRM free.

         

Pandoc (4, Informative)

bbk (33798) | about 3 years ago | (#36985062)

I've found pandoc (here: http://johnmacfarlane.net/pandoc/ [johnmacfarlane.net] ) to be very useful for generating PDF/ePub/LaTeX/etc from Markdown formatted text files.
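As a rough sketch of that workflow: pandoc picks the output format from the file extension given to -o, -s (standalone) asks for a complete document rather than a fragment, and --toc adds a table of contents. The wrapper below is only an illustration; the function names and the choice of output formats are made up for the example, and PDF is left out because it additionally needs a LaTeX engine installed.

```python
import shutil
import subprocess
from pathlib import Path


def pandoc_command(source, target, toc=True):
    """Build the pandoc argument list for one conversion."""
    command = ["pandoc", "-s", str(source), "-o", str(target)]
    if toc:
        command.append("--toc")
    return command


def convert_all(source, extensions=("epub", "tex", "html")):
    """Convert one Markdown file to several formats next to the source."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    source = Path(source)
    outputs = []
    for extension in extensions:
        target = source.with_suffix("." + extension)
        subprocess.run(pandoc_command(source, target), check=True)
        outputs.append(target)
    return outputs


print(pandoc_command("thesis.md", "thesis.epub"))
```

Calling convert_all("thesis.md") would then leave thesis.epub, thesis.tex and thesis.html beside the Markdown source, assuming pandoc is installed.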

Re:Pandoc (3, Informative)

metamatic (202216) | about 3 years ago | (#36985740)

Indeed. For a PhD thesis on the politics of Nicaragua, I'd have started with markdown and then converted that to ePub and LaTeX.

Boycott? I Think the Tools Merely Lack Maturity (5, Insightful)

eldavojohn (898314) | about 3 years ago | (#36985074)

Others have told me that the financial gain of publishing an academic book may be up to 700 USD. In comparison to current Scandinavian wages that really means very little, so I don’t think that earning another 700 USD should be a motive to restrict the access to one’s thoughts.

First of all I would like to commend you and thank you for this sentiment.

Is the open source community boycotting ebook formats, as Richard Stallman has proposed?

I don't understand. Stallman decries e-book formats that aren't open. There are many open e-book formats [wikipedia.org] --including ePub. Granted, there are tools out there, like Calibre, that allow you (with varying degrees of success) to crack and convert to these formats, but why bother? As you can see in that table, almost everyone supports PDF. You are misunderstanding Stallman's gripe. It's not that we are boycotting e-books; it's that e-book makers are trying to carve out their own proprietary section of the electronic market, readers and creators included. So let them take their ball and play elsewhere. As you noted in your blog, this isn't the only problem:

Most ebook-readers out there do not implement the Epub-standard perfectly. That means that although one has an Epub that follows all the standards, one can be quite sure that it will not display properly on all the readers. Kovid Goyal, the creator of the Calibre ebook management software has done a good job in creating conversion scripts that create Epubs for all the different readers. Unfortunately they do this by breaking compatibility with the standard, and many distribution sites will only check whether your Epub complies with the standards and not whether the book will actually look good in the reader.

Most readers handle PDF, so I would just stick to the output of LaTeX. I might suggest that your expectations are misdirected at the open source community and might be better directed at the makers of readers that apparently force you to break standards. It's the IE6 conundrum all over again.

Stallman didn't suggest boycotting ebook formats, just the DRM associated with them (big surprise there). The problem you are experiencing is that sometimes it's difficult to go from one open standard to another. The tools are lacking in maturity and I'm guessing that since my Android phone can easily display PDFs for me that there's not a lot of people demanding this ePub support that apparently needs multiple flavors for each device (and Calibre helps you with this). The tools exist [johnmacfarlane.net] but they'll only get you so far and I think the really special stuff that LaTeX does well is what you'll find yourself needing to fine tune in the end product. Look at how long it's taken LaTeX to get that beautiful and I think you'll discover that making a magical cure-all converter to ${random format} can be a non-trivial task.

If you start a Kickstarter and get your university to donate hosting toward making an open, free market for academic papers in any open format, I'd definitely throw in $20 (I've spent about $200 on Kickstarter in the past two years). Either that or maybe throw your lot in with arXiv and work with them to fund more format support [arxiv.org]?

Re:Boycott? I Think the Tools Merely Lack Maturity (1)

strangeattraction (1058568) | about 3 years ago | (#36985488)

There is an open free market for scientific publishing called PLoS: http://www.plos.org./ [www.plos.org]. PDFs suck on eReaders, mainly because the text does not reflow for different-size readers. The reason eReaders don't support ePub as well as they should is that most eReaders are not sold for profit but to hook you into the distributor's DRM'd products, à la Amazon. It is not their priority. Converters just suck. Enough said.

Pandoc (3, Interesting)

cyocum (793488) | about 3 years ago | (#36985098)

The solution to your problems is Pandoc [johnmacfarlane.net] which can convert LaTeX to EPUB if you like. Now, it will probably take some fiddling on your part with the output but it very much smooths the process.
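For anyone who wants to script that conversion, a minimal Python sketch might look like the following. It assumes pandoc is installed and on the PATH; the file names thesis.tex and thesis.epub and both helper functions are hypothetical, and LaTeX sources with custom macros or unusual packages will probably still need manual cleanup afterwards.

```python
import shutil
import subprocess


def pandoc_command(source, target):
    """Build the pandoc argument list for a LaTeX -> EPUB conversion."""
    # -f (input format) and -o (output file) are standard pandoc options;
    # the output format is inferred from the .epub extension of the target.
    return ["pandoc", "-f", "latex", source, "-o", target]


def convert(source, target):
    """Run pandoc if it is available; return True on success."""
    if shutil.which("pandoc") is None:
        # Assumption: pandoc is expected on the PATH; give up otherwise.
        return False
    result = subprocess.run(pandoc_command(source, target))
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical file names, used only for illustration.
    print(" ".join(pandoc_command("thesis.tex", "thesis.epub")))
```

Keeping the command construction separate from the call makes it easy to check what would be run before touching any files.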

Asciidoc (1)

olafura (539592) | about 3 years ago | (#36985120)

You should try Asciidoc [methods.co.nz] or docbook directly. I don't know if LaTeX has enough information to be faithfully converted to epub. But Asciidoc can reuse the LaTeX notations for a number of things.

I think it's a question of the right tool for the job: DocBook is very widely used and is designed for working with books, and Asciidoc and similar tools are for the non-masochistic among us who prefer to edit text files and not raw XML. If you like a GUI, there are plenty of GUIs for DocBook, including LyX [lyx.org].

Re:Asciidoc (1)

Enry (630) | about 3 years ago | (#36985454)

I agree with the DocBook recommendation. I was writing a lot of DocBook back in 2000/2001 and it would output to just about any format, including PDF and a few of the ebook formats that were available at the time (plucker? among others).

The thing that most people (including the original poster) miss is separating content from presentation. HTML tried to do this and failed miserably. When you start thinking about things like layout while writing, you're spending more time trying to make it look good rather than actually be good. Especially in the publishing industry, there are people who can do the layout for you, and DocBook makes the layout portion quite easy even if you have widely varying output types.

Take a look at what the Linux Documentation Project was able to do.

RMS not boycotting e-books (5, Informative)

spf13 (1468419) | about 3 years ago | (#36985194)

While he states "We must reject e-books until they respect our freedom," he also outlines 7 things Amazon's e-books do that violate this freedom. Fortunately, EPUB is the most widely accepted e-book format and it has none of these 7.

  1. Available anonymously.
  2. Standard ownership applies.
  3. License determined by vendor, but many have very liberal licenses including CC and public domain.
  4. Open format based on html.
  5. Lending rules same as physical book.
  6. No inherent DRM (though Adobe has a version compatible with DRM).
  7. No one can remotely delete it any more than any other file on your computer.

RMS isn't against e-books. He's against Amazon's approach to e-books.

Re:RMS not boycotting e-books (2)

icebraining (1313345) | about 3 years ago | (#36985316)

In fact, he specifically mentions "Project Gutenberg" as distributing freedom-respecting ebooks, and they distribute EPUBs.

Re:RMS not boycotting e-books (1)

sourcerror (1718066) | about 3 years ago | (#36985866)

They release books in other formats as well, like HTML.

Re:RMS not boycotting e-books (1)

Enry (630) | about 3 years ago | (#36985494)

No one can remotely delete it any more than any other file on your computer

Given you can copy the .azw file to your local system and save it (thus preventing Amazon from deleting it), this doesn't sound like much of a problem.

A word processor? (1)

superdude72 (322167) | about 3 years ago | (#36985254)

Is there some reason MS Word or OpenOffice + stylesheets aren't up to the task? It sounds like you might be overcomplicating things.

Re:A word processor? (1)

fartrader (323244) | about 3 years ago | (#36985318)

Is there some reason MS Word or OpenOffice + stylesheets aren't up to the task? It sounds like you might be overcomplicating things.

I totally agree. Wrote my computer science PhD in Word with no problems. IMHO LaTeX is the right tool for the job if you have lots of equations, and unless his PhD is on psychohistory I very much doubt he needs that kind of power.

Re:A word processor? (1)

gbjbaanb (229885) | about 3 years ago | (#36985390)

or you need it converted. IANAP (I'm not a publisher) but I doubt they use Word to generate the necessary typesetting for a print run, and Word is pretty bad at converting a document to anything else whilst keeping the formatting intact.

Re:A word processor? (1)

Anonymous Coward | about 3 years ago | (#36985804)

or you need it converted. IANAP (I'm not a publisher) but I doubt they use Word to generate the necessary typesetting for a print run, and Word is pretty bad at converting a document to anything else whilst keeping the formatting intact.

I do deal with publishers, and actually, it's pretty much the standard for manuscripts to be Word documents these days. Most technical books, at least, aren't so rigidly formatted as to require a PageMaker-type of program.

Re:A word processor? (0)

Anonymous Coward | about 3 years ago | (#36985428)

Word & OpenOffice documents look like ass. You'd need Adobe before it'll look as nice as LaTeX.

Re:A word processor? (1)

kravlor (597242) | about 3 years ago | (#36985448)

Setting aside the fact that LaTeX will perform typesetting, those word processing tools utterly fail for creation of documents with lots of (or complex) equations.

They are also very cumbersome for generating cross-references, bibliographic formatting, and management of figures/tables.

One killer feature MS Word *does* have over TeX-based solutions for now is excellent commenting, change tracking and shared collaboration features.

I know both worlds well, having used MS Word for collaborative proposal writing, and TeX for scientific publishing. I strongly prefer LaTeX.

Re:A word processor? (1)

serviscope_minor (664417) | about 3 years ago | (#36985534)

One killer feature MS Word *does* have over TeX-based solutions for now is excellent commenting, change tracking and shared collaboration features.

I have done some cooperative writing.

LaTeX supports comments trivially and always has: anything from a % to the end of the line. There are also packages which allow you to insert comments into the text itself.

For change tracking, Word is OK if you pass the document around from one person to another in sequence. However, since LaTeX uses plain text files, you can use it with any VCS, and those are vastly more advanced and capable.
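To illustrate why plain text helps here, this is a rough Python sketch that uses the standard difflib module as a stand-in for what a VCS diff shows. The two revisions and the show_changes helper are made up for the example; a real workflow would use the diff of whatever version control system you prefer.

```python
import difflib

# Two hypothetical revisions of the same LaTeX source (content is made up).
OLD = r"""\section{Introduction}
% TODO: tighten this paragraph
This thesis looks at politics.
"""

NEW = r"""\section{Introduction}
% TODO: tighten this paragraph
This thesis examines Nicaraguan politics.
"""


def show_changes(old, new):
    """Return a unified diff of two text revisions, one string per line."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="thesis.tex (old)",
            tofile="thesis.tex (new)",
            lineterm="",
        )
    )


if __name__ == "__main__":
    print("\n".join(show_changes(OLD, NEW)))
```

Only the changed sentence shows up as a removed and an added line; the % comment is carried along as unchanged context.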

Re:A word processor? (1)

Phillip2 (203612) | about 3 years ago | (#36985870)

I sometimes use Google Docs these days for collaborative writing, as it avoids the "pass the Word doc around" nightmare, although with Dropbox the latter has got easier.
In the end, though, the proposal gets turned into a Word doc for final formatting, because that is what people expect.

In terms of change tracking, I find this only works in Word for a few people. Otherwise, you end up with change tracks everywhere and it's just an unreadable mess. TeX/LaTeX in a versioning system can also work, although again only with so many changes and only if everyone is geeky enough to be able to use it.

I wish there were a perfect workflow, but there really isn't.

Phil

Re:A word processor? (1)

serviscope_minor (664417) | about 3 years ago | (#36985468)

Is there some reason MS Word or OpenOffice + stylesheets aren't up to the task?

Have you ever written a long document in both Word/OO and latex?

Basically everything that should be trivial, e.g. cross referencing, references and bibliographies, contents, figures, applying consistent styles (including bibliographic styles), separating content from presentation, quality typography, is trivial in latex.

Also, since latex is in text files, you can use any version control system you wish.

It sounds like you might be overcomplicating things.

Not at all. The learning curve is such that for any even moderately long document, it is worth learning latex from scratch rather than using a word processor.

Once you're used to it, it is so much simpler to use.

Re:A word processor? (1)

Hatta (162192) | about 3 years ago | (#36985754)

Any text processing tool that requires you to use the mouse is overcomplicating things.

LaTeX the problem (0)

Anonymous Coward | about 3 years ago | (#36985268)

LaTeX is the problem, not the FOSS movement. It's horribly obsolete and makes almost everything harder than it needs to be. LaTeX and TeX are the wrong answer for everything, natural science included, unless you have to set large amounts of mathematics.

SGML? (0)

Anonymous Coward | about 3 years ago | (#36985306)

I don't know if it meets your needs, but did you look into SGML and all the related facilities? That suite seems to enjoy good support, especially in Debian-based distros.

i heard you liked to describe descriptions (0)

Anonymous Coward | about 3 years ago | (#36985358)

"Over more than most of the timely use of my time of 3 years I ..."

I heard you liked to describe descriptions so I put an adjective next to your adjective to describe your description

Calibre? (1)

sheepe2004 (1029824) | about 3 years ago | (#36985412)

Googling "pdf to epub" turns up this: http://lifehacker.com/5509965/how-can-i-convert-pdfs-and-other-ebooks-to-the-epub-format [lifehacker.com]. It talks about an open source program called Calibre which can apparently convert PDFs to EPUB (and many other formats). So presumably pdflatex followed by this would give exactly what you needed.

The best tool is the one you already use (1)

Phillip2 (203612) | about 3 years ago | (#36985420)

I have helped to create a site for scientists to post their articles on the web. One of the problems is that academics tend to love their tools and do not want to switch, often because they have relatively elaborate workflows and practices, which can cope with their lives; whether this involves writing lots of maths, spending lots of time offline travelling, collaboration or whatever.

We got around this by just using WordPress. Many of the tools out there can already communicate with a blog: this includes Word which, like it or not, is the main tool that scientists use. Others have mentioned things such as asciidoc (which I use). It's okay for short articles, but for a thesis, I would want to use latex. The support for editing in asciidoc is just not as advanced, particularly if you want to do cross-references, citations, graphs and so on.

There is currently not a good latex -> HTML solution -- in the end, I used PlasTeX to create a tool unimaginatively called latextowordpress. Not perfect, but it works okay in most cases.

http://www.russet.org.uk/blog/2010/08/latex-to-wordpress/ [russet.org.uk]

Once you are in WordPress, EPUB and PDF fall out for free, as there are standard plugins for generating these. Personally, I don't do so; I have not found any substantive advantage over HTML, but they are there if you want them.

The process of publishing in this way is not entirely slick, but the results are quite nice. See http://knowledgeblog.org/ [knowledgeblog.org] and subdomains for examples. And even if the process could be improved, my experience suggests that it is easier than using a commercial publisher. In many cases, it is even less error-prone, as you can see the final published form as you are going, without human intervention in the way.

HTML and LaTeX are not semantic markup (1)

strangeattraction (1058568) | about 3 years ago | (#36985424)

First, semantic markup refers to enhancing the text by providing information about its meaning. HTML (CSS) and LaTeX specify the layout of the text regardless of its meaning. Secondly, the open source packages are a little behind the curve regarding ePub and support for MathML in browsers. Having a coherent tool that publishes to PDF, HTML and ePub and supports equations well would be a great boon to scientific and technical publishing. BTW, it also needs to be scriptable. That said, if people know of something I don't, please fess up: I do scientific publishing and you would save me considerable time if solutions already exist.

Don't follow RMS. (0)

jellomizer (103300) | about 3 years ago | (#36985458)

You should listen to him, then judge for yourself whether what he is saying is insightful or just drivel.
RMS is very intelligent, but he is a utopian idealist who looks at situations more academically than practically and has a strong distrust of opposing ideals, so he is closed-minded.
If you just follow RMS then you are not thinking for yourself. RMS has some good points and great ideas, and others that are just mindless rants to rage against "The Man".
Just because RMS calls for a boycott of something doesn't mean one should follow it.

Is there a compelling case for EPUB? (1)

The Famous Brett Wat (12688) | about 3 years ago | (#36985550)

My version of e-publishing was, "write the thesis in LaTeX, output in PDF via pdfLaTeX, and upload the PDF to Google Books." Instant global accessibility for anyone that wants it (well, instant after the processing period) -- certainly a heck of a lot better than any exposure my University can offer, although I gave them the PDF too, and they supposedly make it available somewhere. It's not EPUB, sure, and I would convert it to other formats if I felt that the effort was worth it, but maximising availability was more important to me than making it convenient for small form-factor e-book readers. I considered EPUB, but I feel that PDF is good enough, particularly given the effort that went into making it look nice in its published dimensions.

If I were going to write another book, however, I'd finish my half-baked "writer's mark-up language" project first. It's a markup language designed to be writer-friendly, medium-agnostic, and readily translated into other forms like HTML and LaTeX for actual rendering. I don't have any immediate plans to write another book, though: writing the thesis has taken the edge off my enthusiasm for the subject for now.

PDF is fine (1)

evanh (627108) | about 3 years ago | (#36985622)

What more do you want? PDF works well as an e-document.

Re:PDF is fine (1)

Phillip2 (203612) | about 3 years ago | (#36985820)

Actually, it doesn't. The results don't look that great, don't work in a web browser, often fail in screen-readers, are harder to archive, and are very difficult to extract text from. PDF is really pretty much a legacy format.

Re:PDF is fine (1)

grumbel (592662) | about 3 years ago | (#36986040)

The results don't look that great, don't work in a web browser,

Neither does ePub. With PDF the situation is actually a little better as Chrome will get (or already has) native PDF support and Plugins for other browsers are also rather widespread. That ePub is internally just HTML in a Zip doesn't really help when the browser doesn't have a way to deal with that.
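To make the "HTML in a Zip" point concrete, here is a small Python sketch using only the standard zipfile module. It builds a toy archive in memory and lists the documents inside it. The member names are invented, and the archive is deliberately incomplete: a real EPUB also needs a META-INF/container.xml and an OPF package file, so treat this as an illustration of the container idea, not a valid book.

```python
import io
import zipfile


def build_toy_epub():
    """Return the bytes of a toy EPUB-like archive (not a valid EPUB)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The EPUB container expects an uncompressed "mimetype" entry
        # as the first file in the archive.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        # Hypothetical content files: ordinary XHTML and CSS.
        archive.writestr("OEBPS/chapter1.xhtml",
                         "<html><body><p>Hello</p></body></html>")
        archive.writestr("OEBPS/style.css", "p { margin: 0; }")
    return buffer.getvalue()


def list_documents(epub_bytes):
    """List the XHTML/HTML members inside an EPUB-style archive."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        return [name for name in archive.namelist()
                if name.endswith((".xhtml", ".html"))]


if __name__ == "__main__":
    print(list_documents(build_toy_epub()))
```

Pulling the pages out is the easy part; what a browser lacks is the logic that follows the package file's reading order and stitches the pieces into a book.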

This is one of the weird things with browsers that I never really understood: They are essentially the primary tool for consuming text these days, yet they are incredibly shitty at actually handling something that is the size of a book. Heck, they even fail at nicely rendering something that is just raw HTML, violating every rule of good typography with their default style (Readability and other third-party hacks help, however).

RMS against ebooks? (1)

Compaqt (1758360) | about 3 years ago | (#36985638)

While it's true that ebooks present the possibility of digital restrictions management, Smashwords [smashwords.com], an ebook distribution site, doesn't use DRM, AFAIK.

pandoc is your answer... (1)

Lumpy (12016) | about 3 years ago | (#36985642)

http://johnmacfarlane.net/pandoc/ [johnmacfarlane.net]

Unless you do some really wacky latex stuff, pandoc works great

org-mode exports to pdf, html and OpenDocumentText (1)

complex_pi (2030154) | about 3 years ago | (#36985864)

It is based on plain text and allows cross linking, references, equations. http://orgmode.org/ [orgmode.org]

Why Not Just... (1)

Greyfox (87712) | about 3 years ago | (#36985930)

Write a new DVI outputter for epub?

I really like TeX and LaTeX (but don't google the latter without some additional modifying keywords...) and used to maintain my resume in it. Turns out most contracting companies don't want a static document they can't modify, so I ended up dropping the whole thing into a big E-Lisp data structure which I serialize into EIEIO objects and then emit to some other markup language. I wrote emitters for HTML and plain text, but really you can do anything. I have it on my to-do list to rewrite it in another object-oriented language one of these days, and possibly change the storage format to XML. It wouldn't be hard to get it in XML format -- I'd just have to change my HTML markup emitter a little bit.
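The "one data structure, several emitters" idea can be sketched in a few lines. This is Python rather than the Emacs Lisp described above, and the resume fields and function names are invented for illustration, so it is a guess at the shape of the approach rather than the actual code.

```python
import html

# Hypothetical resume data; the field names are invented for illustration.
RESUME = {
    "name": "A. Hacker",
    "jobs": [
        {"title": "Programmer", "employer": "Example Corp"},
        {"title": "Sysadmin", "employer": "Foo & Bar Ltd"},
    ],
}


def emit_text(resume):
    """Render the resume as plain text with an underlined heading."""
    lines = [resume["name"], "=" * len(resume["name"])]
    for job in resume["jobs"]:
        lines.append(f"* {job['title']}, {job['employer']}")
    return "\n".join(lines)


def emit_html(resume):
    """Render the same data as an HTML fragment, escaping special characters."""
    items = "".join(
        f"<li>{html.escape(job['title'])}, {html.escape(job['employer'])}</li>"
        for job in resume["jobs"]
    )
    return f"<h1>{html.escape(resume['name'])}</h1><ul>{items}</ul>"


if __name__ == "__main__":
    print(emit_text(RESUME))
    print(emit_html(RESUME))
```

Adding another output format then means writing one more emitter function; the stored data does not change.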

What else you want? (1)

Faisal Rehman (2424374) | about 3 years ago | (#36985952)

All you need is proficiency in LaTeX and that's it. There is no shortcut. Be patient and become an expert. The more expert you are, the faster and more productive you will be. Hifi tools make you lazy [associatedcontent.com] .

WHY CONVERT!? There's a New Kid on the Block. (1)

rjbradlow (1683712) | about 3 years ago | (#36986044)

Living in a Winblows world with all the brainwashed money and fanboys behind it, it's hard to fight mass ignorance. Enter FastPencil.com... http://www.fastpencil.com/company/media [fastpencil.com] http://www.fastpencil.com/company/learning_center#TB_inline?height=648&width=800&inlineId=importing-pdf [fastpencil.com]