DocBook 5

samzenpus posted more than 3 years ago | from the read-all-about-it dept.

Books 68

frisket writes "Definitive guides by the authors or maintainers of software systems tend to have the edge over other documentation because of the insight they provide. DocBook 5: The Definitive Guide comes well up to scratch. DocBook has long been the de facto standard for computer system documentation in XML (and SGML before that), and Norm Walsh has revised and updated both the language and the documentation in a concise and valuable form, usable both by beginners and by tech doc experts." Read on for the rest of frisket's review.

DocBook is a rich XML vocabulary, primarily for the documentation of software systems. It provides markup both for the structure of your documents and for the descriptive detail of your writing, to an extent that few other XML systems match. Like XML itself, DocBook's popularity rests on its robustness, scope, and extensibility; and Walsh makes it clear that the Technical Committee has tried hard to balance stability and adaptability in releasing a new major version which does have a few backward-incompatible changes.

This is a reference book, so the initial chapters (1-5) are short (70 pages) but full of clear explanations of how DocBook works, what it does, and how to use it. Part II is 400 pages, covering every element type in the language, with a detailed description of what it is for, how and where to use it, and how it interacts with everything else. Both for the beginner and the expert, these descriptions are the key to effective use, and Walsh's explanations are clear and comprehensive.

For those of you who have been using DocBook in earlier incarnations, the changes are not deal-breakers, and many of them are welcome rationalizations of the way things have grown organically over the years. It still walks like a duck and quacks like a duck (and the book still has a duck on the cover), so it immediately feels like the same format that you're used to — the changes to element types are relatively few. Chapter 1 (Getting Started) has a brief history, a summary of the changes, and an explanation of the namespace and availability.

If you've never used DocBook before, its structure will still be familiar: in Chapter 2 (Creating DocBook Documents) Walsh explains the division of reference material like books, articles, and manuals into chapters, sections, and subsections, with all the conventional features like lists, figures, tables, and references, as well as the technically-oriented features like equations, programming constructs, interface descriptions, and code samples.
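
As a rough illustration of the structure described above, here is a minimal sketch in Python that parses a tiny DocBook 5 document with the standard library and lists its hierarchy. The sample document and the helper name `outline` are invented for illustration and are not taken from the book; the namespace URI and the `version="5.0"` attribute are the ones DocBook 5 uses.

```python
import xml.etree.ElementTree as ET

# DocBook 5 elements live in this namespace.
DB_NS = "http://docbook.org/ns/docbook"

# A made-up sample: a book containing one chapter with one section.
SAMPLE = """<book xmlns="http://docbook.org/ns/docbook" version="5.0">
  <title>Example Manual</title>
  <chapter>
    <title>Getting Started</title>
    <section>
      <title>Installation</title>
      <para>Unpack the archive and run the installer.</para>
    </section>
  </chapter>
</book>"""


def outline(xml_text):
    """Return (depth, element name, title) for each element that has a title child."""
    root = ET.fromstring(xml_text)
    rows = []

    def walk(elem, depth):
        title = elem.find(f"{{{DB_NS}}}title")
        if title is not None:
            # Strip the "{namespace}" prefix ElementTree puts on tag names.
            name = elem.tag.split("}", 1)[1]
            rows.append((depth, name, title.text))
        for child in elem:
            walk(child, depth + 1)

    walk(root, 0)
    return rows


for depth, name, title in outline(SAMPLE):
    print("  " * depth + f"{name}: {title}")
```

The nesting of book, chapter, and section mirrors the conventional divisions of a printed manual, which is why the format tends to feel familiar even to newcomers.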

There is help in Chapter 3 (Validation) for those who construct or generate DocBook documents without the use of an XML editor (or even with them: more on editors below). The most common problems with misplaced markup (and the error messages they create) are clearly explained with examples.
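
To give a feel for the kind of problem that shows up before validation proper, here is a hedged sketch of a pre-check in Python. It is not schema validation: it only tests well-formedness, the root namespace, and the presence of a `version` attribute. The function name `precheck` and the messages are assumptions made up for this example; real validation would run the document against the DocBook RELAX NG schema with a validating tool.

```python
import xml.etree.ElementTree as ET

DB_NS = "http://docbook.org/ns/docbook"


def precheck(xml_text):
    """Return a list of problems found; an empty list means the pre-check passed.

    This is a sanity check only, not a substitute for schema validation.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        # Misplaced or unclosed markup usually surfaces here first.
        return [f"not well-formed: {err}"]

    problems = []
    if not root.tag.startswith(f"{{{DB_NS}}}"):
        problems.append("root element is not in the DocBook 5 namespace")
    if root.get("version") is None:
        problems.append("root element has no version attribute")
    return problems


good = '<article xmlns="http://docbook.org/ns/docbook" version="5.0"><title>T</title></article>'
broken = "<para><emphasis>unclosed</para>"

print(precheck(good))
print(precheck(broken))
```

A document can pass this check and still be invalid DocBook (for example, a `para` directly inside a `book`), which is exactly the class of error a schema-aware validator reports.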

Chapter 4 (Publishing) very briefly explains the role of stylesheets (CSS, XSL, and XQuery) in displaying and transforming your documents to other formats, but as these all have their own books and manuals, this book doesn't go into them in any detail.
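
The idea that a stylesheet, not the document, decides the presentation can be sketched in a few lines of Python. This is only a toy stand-in for a real XSL stylesheet: the rule table `HTML_RULES` and the functions `to_html` and `to_text` are hypothetical names invented here, and the sample sentence is made up.

```python
import xml.etree.ElementTree as ET
from html import escape

DB = "{http://docbook.org/ns/docbook}"

# Hypothetical presentation rules: DocBook element name -> HTML tag.
# Changing the look means editing this table, not the document.
HTML_RULES = {"para": "p", "emphasis": "em", "programlisting": "pre", "filename": "code"}


def to_html(elem):
    """Render an element and its children as HTML using HTML_RULES."""
    name = elem.tag.replace(DB, "")
    tag = HTML_RULES.get(name, "div")
    inner = escape(elem.text or "")
    for child in elem:
        inner += to_html(child) + escape(child.tail or "")
    return f"<{tag}>{inner}</{tag}>"


def to_text(elem):
    """Render the same element as plain text by dropping all markup."""
    return "".join(elem.itertext())


doc = ET.fromstring(
    '<para xmlns="http://docbook.org/ns/docbook">Edit <filename>app.conf</filename> '
    "and <emphasis>restart</emphasis> the service.</para>"
)

print(to_html(doc))
print(to_text(doc))
```

The same source yields two outputs, and neither rendering decision is recorded in the document itself; production toolchains do the same thing at scale with XSLT.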

Customizing DocBook is fairly commonplace, either to avoid the need to commit tag abuse, or to extend its structure into other fields (I added a new element type for typographical examples for my book on LaTeX, and it only took a few minutes). Chapter 5 provides some rules and explanation of customization layers and modularity for those who design schemas and DTDs.

The five Appendixes cover Installation, Variants, Resources, Interchange, and the GNU Free Documentation License — yes, you can read the whole thing online at docbook.org, for which Tim, Norm, and many others are to be thanked. It is a rare publisher who groks the need to be able to point someone at a reference, or quote it in email or a tweet, where a paper copy doesn't cut the mustard.

There isn't anything here about actually using an XML editor or about how to choose one. Editors do of course all come with their own documentation (much of it written using DocBook) and editor selection can be a complex business. However, there is a list of some common tools in Appendix C (Resources). Editors are a minefield, as my own research into the usability of editing software for structured documents is showing, so I can understand the omission, but some pointers to editor resources would have been useful.

The chapter on Publishing is useful for those who haven't been in the publication process before, but it could have emphasized more the need for accuracy and consistency. Experienced technical authors know this, but many other writers don't see the need for it, assuming that the publisher (or some elf) will automagically heal everything before publication. DocBook 5 and this book will help enormously, but author-edited documents sometimes unwittingly misuse or abuse the markup, no matter how exhaustive the manuals.

If you write computer documentation, or anything related to it, from a conference paper to a thesis to a book, DocBook 5 is probably what you should use if you want the document to survive and to be usable and reusable; and this is the book to help you do it.

You can purchase DocBook 5: The Definitive Guide from amazon.com. Slashdot welcomes readers' book reviews -- to see your own review here, read the book review guidelines, then visit the submission page.

68 comments

Wait, what? (1, Insightful)

Anonymous Coward | more than 3 years ago | (#33291446)

"I added a new element type for typographical examples for my book on LaTeX, and it only took a few minutes."

Why are you using DocBook for a book on LaTeX?

Re:Wait, what? (-1, Offtopic)

Anonymous Coward | more than 3 years ago | (#33291672)

Does anyone else get goosebumps when you take a poop and it first starts to slide out after holding it in after a while?

Re:Wait, what? (0)

Anonymous Coward | more than 3 years ago | (#33293378)

I get goosebumps when you take a poop!

Re:Wait, what? (1)

frisket (149522) | more than 3 years ago | (#33317578)

Why are you using DocBook for a book on LaTeX?

Because (a) I write in XML as a matter of course, and (b) the book is available online (HTML) as well as PDF (LaTeX), so it made sense to author it in something that could easily be transformed to multiple outputs.

DocBook - like HTML 1.0, only dumber (0)

Animats (122034) | more than 3 years ago | (#33291460)

DocBook is being used for what HTML was originally intended - technical publications. Why not just use HTML? It even supports pictures!

Re:DocBook - like HTML 1.0, only dumber (1)

LingNoi (1066278) | more than 3 years ago | (#33291724)

Not only that, it sounds like a horrible format if you need documentation to write in the documentation language. Just looking at their What is DocBook [docbook.org] page leaves me wondering what the hell it really is...

Re:DocBook - like HTML 1.0, only dumber (1)

ebuck (585470) | more than 3 years ago | (#33292878)

Not only that, it sounds like a horrible format if you need documentation to write in the documentation language. Just looking at their What is DocBook [docbook.org] page leaves me wondering what the hell it really is...

Even how to write English is documented in English, so why do you argue that any language which can use itself to document how to make more of itself is bad?

Re:DocBook - like HTML 1.0, only dumber (1)

LingNoi (1066278) | more than 3 years ago | (#33297076)

When writing documentation in Word or OpenOffice I don't need to read an entire book on how to do it. That's why this bloated design-by-committee XML language is a complete waste of time. Your analogy fails because it's not even slightly related and doesn't at all translate to real life. When typing this comment I wasn't constantly referring to a dictionary.

Re:DocBook - like HTML 1.0, only dumber (1)

borgar (31365) | more than 3 years ago | (#33299082)

If you are one person writing a 50 page document Word may very well be perfect. However, imagine you have 20 people who need to collaborate on keeping a 1000 page documentation set updated.

Now imagine doing this in Word.

Re:DocBook - like HTML 1.0, only dumber (1)

Meshach (578918) | more than 3 years ago | (#33291806)

DocBook is being used for what HTML was originally intended - technical publications. Why not just use HTML? It even supports pictures!

HTML was originally meant as a subset or replacement of SGML. The primary goal was to be able to share documents; technical or not. Tim Berners-Lee's main goal in creating HTML was to have a way to share information easily.

Re:DocBook - like HTML 1.0, only dumber (1)

Michael Kristopeit (1751814) | more than 3 years ago | (#33291832)

you can't make any money selling books on HTML 1.0 anymore... you also can't make any money releasing languages that are fully implemented and don't require updating... hence #5.

DocBook is garbage.

Re:DocBook - like HTML 1.0, only dumber (2, Informative)

slide-rule (153968) | more than 3 years ago | (#33291888)

One of the big reasons is that HTML lacks semantic meaning beyond simple paragraph constructs. Documentation-oriented markup languages (of which I'm more familiar with DITA) and schemas can seem arbitrarily complicated to a casual observer, granted; but having an identifier that clarifies "this" paragraph being an instruction that should be executed by the user, and "that" paragraph being merely an example can allow for some rules-based (automated) processing to exist between authorship and production that wouldn't be possible lacking some notion of the semantic purposes of a random collection of raw paragraphs.

Re:DocBook - like HTML 1.0, only dumber (1)

Michael Kristopeit (1751814) | more than 3 years ago | (#33292050)

you mean an identifier like the html "id" attribute? or a more generalized "class" attribute?

how is forcing everyone to declare their paragraphs and authorship in a fixed format better than allowing them to declare it however they want, achieving the same thing more simply?

if you want rules-based automated processing, you can do that with html... pretty much the entire modern web is built on that. HTML+CSS+CMS > DocBook

Re:DocBook - like HTML 1.0, only dumber (1, Insightful)

Anonymous Coward | more than 3 years ago | (#33293458)

"HTML+CSS+CMS > My woefully inadequate understanding of DocBook"
FTFY.

Re:DocBook - like HTML 1.0, only dumber (2, Insightful)

frisket (149522) | more than 3 years ago | (#33317700)

if you want rules-based automated processing, you can do that with html

You can, the same way that you can write a book in PostScript.

But if you want all the internal checks on consistency and effectivity that make a piece of documentation robust and persistent, you probably don't want to do it in HTML and CSS.

Re:DocBook - like HTML 1.0, only dumber (1)

jdeisenberg (37914) | more than 3 years ago | (#33292506)

A short look at the Docbook element reference (about halfway down the page at http://www.docbook.org/tdg5/en/html/docbook.html [docbook.org] ) will show some of the elements that are relevant when publishing a *book*; elements for citations, bibliographies, indexing, callouts, glossaries, etc. HTML does not provide these elements.

Re:DocBook - like HTML 1.0, only dumber (1)

madddddddddd (1710534) | more than 3 years ago | (#33293058)

html does provide those elements... any element could be a citation element or index element... it is up to the CMS and CSS to organize them however the author wants. removing the authors decision making, and forcing them to use a framework full of elements they'll never use is not a step forward.

Re:DocBook - like HTML 1.0, only dumber (0)

Anonymous Coward | more than 3 years ago | (#33372278)

Oh, ok. Show me the HTML and CSS required to insert a footnote that, for @media print, will place a superscript footnote number in the text, automatically restarting with the ordinal footnote number "1" on each new page, and place the corresponding footnote text at the bottom of the physical page on which it was referenced.

MS Word gives lots of footnote options... placement (bottom of page, bottom of text), number format, start at #, continuous numbering or restart each section or page (and yet still, with all the options it gives it's too inflexible for any serious author to use it). So, show me how you do that in HTML and CSS.

Re:DocBook - like HTML 1.0, only dumber (1)

madddddddddd (1710534) | more than 3 years ago | (#33372328)

hey, idiot, i also said CMS... CONTENT MANAGEMENT. when you want to print, SOMETHING has to set the @media to "print".... THAT SOMETHING will alter the page layout and add the footnotes.

now let me know how docbook handles footnotes INSIDE OF footnotes? oh no, it can't you say? when are the docbook developers planning on adding it? they aren't you say? with a custom CMS, i can make that feature at my leisure. with docbook, you are just locked into a DIFFERENT way of doing things someone else's way.

Re:DocBook - like HTML 1.0, only dumber (1)

frisket (149522) | more than 3 years ago | (#33317730)

...some of the elements that are relevant when publishing a *book*; elements for citations, bibliographies, indexing, callouts, glossaries, etc. HTML does not provide these elements.

Earlier versions of HTML did provide a lot of these, but the W3C took 'em all out (deprecated them) because it was clear that no-one in their right minds would author a complex technical book in HTML.

DocBook DITA (1)

readthemall (1531267) | more than 3 years ago | (#33293110)

DocBook is being used for what HTML was originally intended - technical publications.

True, DocBook is used mainly for technical publications. Not true, HTML was intended for implementing the hypertext (that's why HT is part of the name) and not specifically for technical publications.

Why not just use HTML? It even supports pictures!

Because DocBook provides much more meaningful elements for technical publications than HTML. Because DocBook is intended mainly for documents published on paper, while HTML is intended for Web pages displayed in a browser. There is a reason why nobody uses HTML for technical publications.

The real question must be, why use DocBook when we already have DITA [wikipedia.org] ? While both formats are designed specifically for technical publications, DITA is superior.

Re:DocBook DITA (1)

Pinky's Brain (1158667) | more than 3 years ago | (#33293896)

"already have DITA"

That implies the reverse order of invention as actually occurred. DITA might be superior, I have no idea ... haven't really used either. DocBook seems a bit more actively developed though, no official RelaxNG schema for DITA for instance.

Re:DocBook DITA (2, Informative)

aamcf (651492) | more than 3 years ago | (#33295270)

I've used both DocBook and DITA. While you can do the same jobs with both of them, DocBook is better, in my experience, for linear documents, while DITA seems to work well for non-linear stuff. DITA also uses topic maps, which can be hard for people to understand.

Re:DocBook DITA (1)

frisket (149522) | more than 3 years ago | (#33317790)

The real question must be, why use DocBook when we already have DITA

DITA is descended from DocBook in many ways. DITA is an architecture with which a willing participant can, if she tries long and hard enough, come up with a system that can be used for authoring large series of technical documents.

DocBook works right out of the box.

Re:DocBook - like HTML 1.0, only dumber (1)

frisket (149522) | more than 3 years ago | (#33317606)

HTML was never intended for formal publications: TBL did it for people at CERN to read lab reports and internal papers.

Of course you can abuse HTML for anything you want...just look at the web :-)

DocBook? More like ShitBook amirite?!?! (-1, Troll)

Anonymous Coward | more than 3 years ago | (#33291554)

DocBook? More like ShitBook amirite?!?!

Go suck a dong and die.

CmdrTaco has a micropenis.

Editors are a minefield (1)

wiredog (43288) | more than 3 years ago | (#33291840)

If everyone would just use the sensible choice: EMACS Vi Notepad Pico!

Re:Editors are a minefield (1)

Haeleth (414428) | more than 3 years ago | (#33293836)

Only one of those comes with a validating XML editor built in. And, sadly, it only comes with a schema for DocBook 4, though it should be simple enough to update.

Re:Editors are a minefield (1)

frisket (149522) | more than 3 years ago | (#33317832)

...it only comes with a schema for DocBook 4, though it should be simple enough to update.

Takes all of 30 seconds to download DocBook5, unzip it, and declare it in a new document.

Math markup (0)

Anonymous Coward | more than 3 years ago | (#33291928)

Does docbook finally have a usable solution for embedding mathematics?

MathML [wikipedia.org] is a hyperverbose turd, especially if you're used to a reasonable markup language like LaTeX.

language != application (2)

bigogre (315585) | more than 3 years ago | (#33291938)

XML (and SGML before it) is a meta language. From that you derive a description language for the specific use. HTML meets the needs for an on-line presentation of information. HTML is not designed and does not work well for printed materials. DocBook is designed to be used for multiple ways of presenting information and has the features for books and other printed media.

To use a bad analogy, think of XML and C. You can write the "hello world" example in C, but it doesn't replace a database application written in C. C can be used for big or small applications. XML can be used for relatively simple description languages (such as HTML) or very rich description languages for large, complex documents (such as DocBook).

First doc? (0)

Anonymous Coward | more than 3 years ago | (#33291946)

I guess not.

It is for Docbook specialists (1)

beachdog (690633) | more than 3 years ago | (#33292216)

The subtitle "...The Definitive Guide" means this book is for specialists that work with the DocBook publication tag language.

The information in this book isn't for the user of the word processor or editor program.

DocBook is a syntax and tag language and this is a book for people who work with the tag language.

Re:It is for Docbook specialists (1)

frisket (149522) | more than 3 years ago | (#33317896)

The subtitle "...The Definitive Guide" means this book is for specialists that work with the DocBook publication tag language.

Correct, but not exclusively specialists.

The information in this book isn't for the user of the word processor

Correct: people who use wordprocessors either have a team of expensive editors to format their documents for publication, or are only interested in making it look pretty, rather than correct (and pretty).

or editor program.

Incorrect: the book is for anyone writing technical documents in the field of computing who needs or wants to use XML to do it.

DocBook is a syntax and tag language and this is a book for people who work with the tag language.

Yep, either writing, editing, formatting, publishing, archiving, or transforming the document to new target formats.

DocBook is horrible (1, Interesting)

bjourne (1034822) | more than 3 years ago | (#33292364)

DocBook is probably the absolute worst document writing format I have ever had the displeasure of working with. It seems to have been born in some deranged XML lover's wet dream in which "documents" are "self-documenting," semantic structure is more important than content, and structure is kept separate from presentation. You know, all those generally good ideas that become very dangerous when taken too far, which DocBook exemplifies. "The more XML the better" seems to have been their guiding principle. In HTML, P is the tag for paragraphs; not so in DocBook. I guess P wasn't descriptive enough, so it had to be PARA instead. In HTML, to create a preformatted block you often use PRE. Well, obviously that was too simple for DocBook, so you have to nest two tags: <INFORMALEXAMPLE><PROGRAMLISTING>source code</PROGRAMLISTING></INFORMALEXAMPLE>.

Maybe you are asking, who the hell came up with the INFORMALEXAMPLE tag? Well, in DocBook you cannot just say "give me a block with fixed-width font"; you have to be "semantic" because you must separate presentation from structure. This is the reason why the maintainers of the DocBook standard have to continuously invent new tags for use cases they didn't think of. For example, there are all these tags for describing different programming language identifiers: KEYWORD, FUNCTION, CLASSNAME, STRUCTNAME, TOKEN, PROPERTY, TYPE, etc. They all make it so the words within the tags are formatted using italic text. But what if the programming language you are writing about has a concept not covered by DocBook's standardized tags? Then you're out of luck. You either cheat and use a different tag which happens to produce the same italicized presentation you want, or you submit an enhancement proposal to DocBook and wait for them to standardize your new tag. If you choose the former, you quickly realize that your carefully marked up DocBook text is nothing more than glorified HTML with retardedly verbose tag names; in the latter case you will never complete your documentation, because there will always be tags you need that you can't have.

Please let me know why I should read that. (-1, Flamebait)

Anonymous Coward | more than 3 years ago | (#33292900)

Please let me know why I should read that. Because, frankly, I don't know who the fuck you are and why you'd be a go-to source for document formats. Is your name Leslie? Or Donald? No? Then why should you be able to tell anyone else whether DocBook is a good or bad document format?

Re:Please let me know why I should read that. (1)

mewshi_nya (1394329) | more than 3 years ago | (#33294842)

There's a reason why GNOME docs are moving away from DocBook...

Re:Please let me know why I should read that. (1)

Pinky's Brain (1158667) | more than 3 years ago | (#33295890)

They are switching to something more domain specific though, not to some general alternative.

Re:Please let me know why I should read that. (1)

frisket (149522) | more than 3 years ago | (#33318428)

It's actually because good XML editors are expensive. The free ones are designed for XML experts who know all about markup: there are no usable XML editors (free or non-free) for non-experts in markup, as I showed last year [balisage.net] .

Re:Please let me know why I should read that. (1)

Pinky's Brain (1158667) | more than 3 years ago | (#33319744)

It's an interesting bit of research you are plugging ... but Gnome is switching to Mallard, which is XML+RelaxNG, not switching away from XML.

Re:DocBook is horrible (5, Insightful)

ebuck (585470) | more than 3 years ago | (#33293122)

My experiences are quite different; DocBook is simply awesome.

If you start selecting tags to make the output look the way you want it to look, you don't understand XML (and subsequently shouldn't be using DocBook).

Any sane documentation project separates the formatting from the content; because, when you need to update the look of the documentation, you don't want to spend days checking each individual document element to determine if it is the correct font, point, weight, etc. Only the novices create documentation that doesn't permit consistent formatting. Using something that goes half way there (inconsistently applied style sheets) eventually leaves you with the same amount of work to make sure your documentation shows as it should. Style sheets may be applied consistently, but you can't know for sure without paying the "verify the whole document" price.

With XML, there is no formatting in the document. All of the formatting is done with the XSL document. If you didn't like the format, layout, font, italics, or whatever of the output documentation, the correct choice was to change the XSL document used to build the output, not to go on a happy hunting DocBook tag search for the tags that made it look how you wanted it.

The fact that you couldn't find the <code> tag but could find the other tags you've mentioned is just depressing, especially when those tags are most often sub-tags of a code tag block.

Just wait until you need to generate HTML help, Text file documentation, a web page manual, and a printed PDF of the same core documentation. The single-source design of DocBook will be much better appreciated then, if you learn how to use it.

Re:DocBook is horrible (2)

jgrahn (181062) | more than 3 years ago | (#33294052)

If you start selecting tags to make the output look the way you want it to look, you don't understand XML (and subsequently shouldn't be using DocBook).

Well, to *me* it sounded as if the grandparent understood it well enough -- he just thought it sucked.

Any sane documentation project separates the formatting from the content; because, when you need to update the look of the documentation, you don't want to spend days checking each individual document element to determine if it is the correct font, point, weight, etc.

Well, not littering your document with exact font sizes and colors is one thing, filling it with (for the most part) useless semantic tagging is another. And I don't believe you can produce *really* good printed text without tinkering -- you may have to reformulate sentences to get rid of orphans, not to mention tweaking the hyphenation.

I sometimes think MediaWiki hit the sweet spot here. Although God knows it has plenty of ugly features ... and of course it's closely tied to HTML as output.

Re:DocBook is horrible (1)

frisket (149522) | more than 3 years ago | (#33318490)

If you start selecting tags to make the output look the way you want it to look, you don't understand XML (and subsequently shouldn't be using DocBook).

Well, to *me* it sounded as if the grandparent understood it well enough -- he just thought it sucked.

I'm not so sure. It sounded as if he still thought it was some kind of programming language, and he clearly believed that the semantics of the element types were irrevocably bound to the formatting that he saw in his application: a common mistake.

It's called single source documentation (0)

Anonymous Coward | more than 3 years ago | (#33294554)

Anyone who has participated in a significant documentation effort or a multiple-authorship project understands the need for the DocBook XML schema. Too often "authors" depend on MS Word formatting styles, which just don't cut it. Worse yet, authors also don't bother to use styles and just format each element in their document individually, leaving someone else to figure out what they meant. DocBook helps facilitate communication between the authoring process and the publishing process.

Just wait until you need to generate HTML help, Text file documentation, a web page manual, and a printed PDF of the same core documentation. The single-source design of DocBook will be much better appreciated then, if you learn how to use it.

I couldn't agree with ebuck more!!!

Re:DocBook is horrible (1, Flamebait)

bjourne (1034822) | more than 3 years ago | (#33294576)

If you start selecting tags to make the output look the way you want it to look, you don't understand XML (and subsequently shouldn't be using DocBook).

Anyone who has some experience writing documentation knows that the main objective is to write beautiful and readable documentation, not choosing the right markup...

The fact that you couldn't find the <code> tag but could find the other tags you've mentioned is just depressing, especially when those tags are most often sub-tags of a code tag block.

The CODE tag is new in DocBook 4.3. Version of jade shipped with Ubuntu 9.10 is 1.2.1 and it does not know about the CODE tag. That's another problem with DocBook, it is a moving target with a standard that moves faster than the tools that support it.

Just wait until you need to generate HTML help, Text file documentation, a web page manual, and a printed PDF of the same core documentation. The single-source design of DocBook will be much better appreciated then, if you learn how to use it.

I doubt most people who express that belief have actually tried to publish the same documentation in HTML and PDF form. DocBook produces PDF by first converting the document to LaTeX (so one is left wondering, why not use LaTeX itself in the first place?) and then using its tools to export to PDF. The result is a document as ugly and badly typeset as an O'Reilly book. The HTML output basically looks like a raw data dump of the text, like this book [docbook.org] for example. That's underwhelming to say the least, considering that 50%+ of a DocBook document is spent writing XML markup.

If you really want to know why DocBook sucks so much, you should check out Sphinx [pocoo.org] which is a document writing system done right. For some reason, it can manage without the overly verbose XML and idiotic semantic markup and still produce high quality documents that blow DocBook's out of the water.

Re:DocBook is horrible (1)

aamcf (651492) | more than 3 years ago | (#33295320)

The CODE tag is new in DocBook 4.3. Version of jade shipped with Ubuntu 9.10 is 1.2.1 and it does not know about the CODE tag. That's another problem with DocBook, it is a moving target with a standard that moves faster than the tools that support it.

Don't use the latest version unless you need something only it provides.

DocBook produces PDF by first converting the document to LaTeX

It can do that, but any time I've used DocBook, I've generated the PDF using XSL-FO and FOP. No LaTeX involved.

Re:DocBook is horrible (1)

frisket (149522) | more than 3 years ago | (#33318756)

If you start selecting tags to make the output look the way you want it to look, you don't understand XML (and subsequently shouldn't be using DocBook).

Anyone who has some experience writing documentation knows that the main objective is to write beautiful and readable documentation, not choosing the right markup...

It's not usually the author's job to do the beautification unless it's for self-publication through Lulu or for internal consumption. Conventional publishing houses have sets of standards governing how things must appear in their series, and the author does not get much chance to influence that except on special occasions.

Any author who is spending more time making it look pretty than actually writing readable, usable text needs firing. And they shouldn't have to spend time choosing the right markup (although, goddess help us, they do): they need editing software that provides them with the right choices, and right now it's the editors that suck, not the markup.

That's another problem with DocBook, it is a moving target with a standard that moves faster than the tools that support it.

This will always happen if you restrict yourself to pre-written tools.

I doubt most people who express that belief have actually tried to publish the same documentation in HTML and PDF form. DocBook produces PDF by first converting the document to LaTeX (so one is left wondering, why not use LaTeX itself in the first place?) and then using its tools to export to PDF.

DocBook itself does not produce anything: it's a markup format, not a program. You are confusing the markup format with one of the tools used to process it (OpenJade). There are dozens, if not hundreds, of others. Go look at them, or just write your own in XSLT. And I explained in an earlier comment why I transform my XML to LaTeX to produce PDF (another reason is that LaTeX produces very high quality PDF, which the other affordable XSL-FO tools don't).

...50%+ of a DocBook document is spent writing XML markup.

Then you are definitely either using the wrong editing tools, or the tools you are using are badly-written.

If you really want to know why DocBook sucks so much, you should check out Sphinx [pocoo.org] which is a document writing system done right. For some reason, it can manage without the overly verbose XML and idiotic semantic markup and still produce high quality documents that blow DocBook's out of the water.

Sphinx is very pretty, but it's going down a blind alley industrially unless it uses XML.

Re:DocBook is horrible (0)

Anonymous Coward | more than 3 years ago | (#33295586)

Just wait until you need to generate HTML help, Text file documentation, a web page manual, and a printed PDF of the same core documentation. The single-source design of DocBook will be much better appreciated then, if you learn how to use it.

This is one of those false goals that always leads people down the path to horrible, horrible end results. It reminds me of Java - write once, run on any platform - and suck big time because nothing is optimal for any of those platforms. (I'm a fan of Java, by the way, but I totally acknowledge its failings in this regard).

My company went through a DocBook obsession for a while and in the end we just gave up - it was just horrible. Most of the editors were unusable and there were very few WYSIWYG options, so unless people were major geeks it was repulsive to use. Even geeks found it hostile to use. I actually agree that improving the semantic content of the documents is good; DocBook is just a horrible way to do it. Something like HTML microformats would be a better option.

Re:DocBook is horrible (1)

lee1 (219161) | more than 3 years ago | (#33301040)

Just wait until you need to generate HTML help, Text file documentation, a web page manual, and a printed PDF of the same core documentation.

There are alternatives. tbook [sourceforge.net] is another XML application that succeeds very well at this. Its author explains in detail why he didn't just use DocBook: the main reason is that it forces you to write deeply nested tags to express simple things.

Re:DocBook is horrible (1)

vocaro (569257) | more than 3 years ago | (#33297590)

It seems to have been born in some deranged xml-lovers wet dream in which ... structure is kept separate from presentation

You say that like it's a bad thing.

Re:DocBook is horrible (1)

cobbaut (232092) | more than 3 years ago | (#33298334)

In HTML, to create a preformatted block you often use PRE. Well obviously that was too simple for DocBook so you have to nest two tags INFORMALEXAMPLE PROGRAMLISTING source code /PROGRAMLISTING /INFORMALEXAMPLE.

You could use SCREEN /SCREEN.

Re:DocBook is horrible (1)

frisket (149522) | more than 3 years ago | (#33318372)

It sounds as if you have been seriously damaged by incompetent software, incompetent programmers, incompetent teachers, or incompetent managers, possibly all four.

...some deranged xml-lovers wet dream in which "documents" are "self-documenting," semantic structure is more important than content and structure is kept separate from presentation.

Semantic structure usually outlasts content, so in a sense it is more important. If you have ever had to publish someone else's work which was done using a wordprocessor or HTML, you'll know that you do indeed need to separate presentation out to a separately-manageable layer.

[...] In HTML, P is the tag for paragraphs, not so in DocBook, guess P wasn't descriptive enough so it had to be PARA instead.

DocBook started around the time that HTML was becoming popular. At that time, a lot of [SGML] DTDs used P as the paragraph element type name (TEI was another) but were finding that other semantics in other fields had an equally valid claim on "P" (particularly the powerful vertical-market vocabularies in aerospace, rail, and medical applications); and P also risked carrying the wrong semantics when used outside the English-language areas. PARA was equally heavily used in other applications, so it was pretty much a toss-up as to which to go for.

In HTML, to create a preformatted block you often use PRE. Well obviously that was too simple for DocBook so you have to nest two tags INFORMALEXAMPLE PROGRAMLISTING source code /PROGRAMLISTING /INFORMALEXAMPLE.

Not in any version of DocBook I have ever used. INFORMALEXAMPLE and PROGRAMLISTING are peers, and are two different element types for two different purposes. PROGRAMLISTING is for program listings, and INFORMALEXAMPLE is for informal examples. Perhaps you misread the names.

Maybe you are asking, who the hell came up with the INFORMALEXAMPLE tag?

I don't know, but the documents I write have both formal and informal examples, so I use both EXAMPLE and INFORMALEXAMPLE. The first usually gets formatted as a fullwidth block, in a different typeface, with a number and a title, and it gets indexed and put into the list of examples. The second usually gets formatted to the narrower width of a block quotation, and set in the body font without any title or referencing, because it's, uh, informal.

Well in DocBook you can not just say "give me a block with fixed-width font"

Um. You can, and I do. Frequently. I'm not clear what makes you think you can't. This is why I think you have been rather badly misled.

you have to be "semantic" because you must separate presentation from structure.

That's correct. That's what it's all about. But no-one is forcing you to do it (I hope) if you don't want to. Many people are very happy creating documents in Microsoft Word, and I'm equally happy to make a living from repairing the damage they cause :-)

This is the reason why the maintainers of the DocBook standard have to continuously invent new tags for use cases they didn't think of.

If you read the book you'll find that it's very discontinuous. I think you might have meant "continually". And, unsurprisingly, computing has changed and evolved over the years since 1991, so DocBook has tried (fairly successfully) to keep pace with the industry.

For example, there are all these tags for describing different programming language identifiers: KEYWORD, FUNCTION, CLASSNAME, STRUCTNAME, TOKEN, PROPERTY, TYPE, etc. They all make it so the word within the tags is formatted using italic text.

I don't know what software you are using, but in my system they do different things for different reasons. If you're using some low-grade formatter or editor that uses italics for everything, I suggest you trash it and get something better. In any case, you're still thinking presentationally instead of conceptually. The reason for the different element types is that the software can do stuff with them, especially automation: in my thesis I deal a lot with functions, so all mentions of functions get indexed. Likewise packages, using PACKAGE, but they also become links, which functions don't. I barely touch the rest.
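As a rough illustration of that kind of automation, the sketch below collects every FUNCTION mention in a small DocBook 5 document and records which sections it appears in. It uses Python's standard library; the sample document and the `function_index` helper are made up for the example and are not the toolchain described above.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

NS = {"db": "http://docbook.org/ns/docbook"}  # DocBook 5 namespace

# Illustrative document: two sections that mention functions and a package.
SAMPLE = """<chapter xmlns="http://docbook.org/ns/docbook">
  <section><title>Input</title>
    <para>Use <function>open</function> and <function>read</function>.</para>
  </section>
  <section><title>Output</title>
    <para>Use <function>open</function> from <package>io</package>.</para>
  </section>
</chapter>"""

def function_index(source):
    """Map each function name to the titles of the sections that mention it."""
    root = ET.fromstring(source)
    index = defaultdict(list)
    for section in root.iterfind("db:section", NS):
        title = section.findtext("db:title", default="", namespaces=NS)
        for func in section.iterfind(".//db:function", NS):
            name = (func.text or "").strip()
            if title not in index[name]:
                index[name].append(title)
    return dict(index)

for name, sections in sorted(function_index(SAMPLE).items()):
    print(name, "->", ", ".join(sections))
```

None of this is possible if every identifier is simply marked as italic: the index depends on the markup saying what the thing is, not how it looks.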

But what if the programming language you are writing about in the text has a different concept not covered by DocBook's standardized tags? Then you're out of luck.

You really have been exposed to some crummy information. One of the major reasons for using DocBook is that you can customize it, so if you need a new element type, you can add it. I did this for my book on LaTeX, because I needed to be able to show typographical changes accurately, and have them related contextually to their attributes. This took about 15 mins to do, so I now have a customization layer which has served me well for nearly a decade. I can't imagine the nightmare of trying to maintain this kind of thing in a wordprocessor.

You either cheat and use a different tag which happens to produce the same presentational italicized text you want or you submit an enhancement proposal to DocBook and wait for them to standardize your new tag.

You can certainly submit a proposal, and a lot of the development changes have come from exactly that: user proposals. You can also commit tag abuse if you wish; that's up to you. But it's far simpler to add what you need in a reproducible manner.

If you choose the former, you quickly realize that your carefully marked up DocBook text is nothing more than glorified HTML, with retardedly verbose tag names

Why would the length of the element type names be a concern? Surely you're not still using a plaintext editor where you can actually see them? And risk trespassing on them? There are plenty of good synchronous typographical XML editors which shield you from ever seeing a pointy bracket. Glorified HTML is actually pretty close: no-one in their right mind would use HTML for large or complex documentation — it needed glorifyin'.

in the latter case you will never complete your documentation because there will always be tags you'll need that you can't have.

Someone has been keeping you in the dark about XML. There are never tags that you can't have: does the word "extensible" ring any bells? I suggest you go and find out a bit more: I think you'll be surprised.

do77 (-1, Flamebait)

Anonymous Coward | more than 3 years ago | (#33293382)

Why Not? It's quick

What about reStructuredText? (1)

Jacked (785403) | more than 3 years ago | (#33293510)

How does reStructuredText [sourceforge.net] stack up against DocBook? It's on my "look into later" list for technical documentation. My first impression of it was pretty good, especially combined with the Sphinx [pocoo.org] document generator.

Re:What about reStructuredText? (1)

trampel (464001) | more than 3 years ago | (#33293738)

reStructuredText reminds me of the markup used by various Wikis ... while it's far easier to type than anything related to XML, it's also far more limited.

Re:What about reStructuredText? (3, Interesting)

Manhigh (148034) | more than 3 years ago | (#33293766)

I use Sphinx for all of my Python development and am really happy with it. Autogeneration forces me to write reasonable docstrings into my code, and I'm pretty pleased with the HTML output.

I still think I'd prefer LaTeX for large scale, intended-for-print documents, though.

Re:What about reStructuredText? (0)

Anonymous Coward | more than 3 years ago | (#33295542)

You could learn AsciiDoc instead: it can generate DocBook :)

Re:What about reStructuredText? (1)

Jacked (785403) | more than 3 years ago | (#33307164)

Thanks for the tip, AsciiDoc looks really good, too.

I don't need all the features of DocBook, as my writing is internal documentation of software operation, API docs and various processes, and usually viewed onscreen. Not books or anything significantly complex.

Just use a lightweight language (0)

Anonymous Coward | more than 3 years ago | (#33295296)

Please be a little indulgent with yourself and do not write in DocBook or XML: soon you will just stop writing because of the pain.

Use something like AsciiDoc: http://www.methods.co.nz/asciidoc/ then convert to DocBook or straight to any of the other supported output languages.

Docbook (2, Informative)

starseeker (141897) | more than 3 years ago | (#33296534)

I have some experience with Docbook, although probably not enough to qualify as an expert. From what I've seen so far:

Pro:

1. Generating pdf, html and (sometimes) man pages from a single source document. This is probably the biggest single win for Docbook.

2. Combining parts of documents with xinclude. If you have four documents of different types which need to contain the same introductory description of a tool (say) or a synopsis of command arguments (book, man page, short article, comprehensive encyclopedia, etc...) you can write the description once in one document and xinclude that specific piece of the document in other documents.
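The XInclude reuse described in point 2 can be sketched with Python's standard library. This is only an illustration: the fragment name `frob-intro.xml`, the in-memory `FRAGMENTS` table and the two host documents are invented, and as far as I know `xml.etree.ElementInclude` only pulls in whole resources, so selecting a specific piece of another document via XPointer would need a fuller XInclude processor.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

DB = "{http://docbook.org/ns/docbook}"  # DocBook 5 namespace

# Hypothetical shared description, held in memory instead of on disk.
FRAGMENTS = {
    "frob-intro.xml": (
        '<para xmlns="http://docbook.org/ns/docbook">'
        "frob converts widgets into gadgets.</para>"
    ),
}

def loader(href, parse, encoding=None):
    """Resolve xi:include targets from FRAGMENTS rather than the filesystem."""
    text = FRAGMENTS[href]
    return ET.fromstring(text) if parse == "xml" else text

def build(source):
    """Parse a document and expand its xi:include elements in place."""
    root = ET.fromstring(source)
    ElementInclude.include(root, loader=loader)
    return root

# Two different document types that share the same introductory paragraph.
TEMPLATE = """<{root} xmlns="http://docbook.org/ns/docbook"
    xmlns:xi="http://www.w3.org/2001/XInclude">
  <title>{title}</title>
  <xi:include href="frob-intro.xml"/>
</{root}>"""

article = build(TEMPLATE.format(root="article", title="frob quick start"))
chapter = build(TEMPLATE.format(root="chapter", title="The frob tool"))

for doc in (article, chapter):
    print(doc.tag[len(DB):], "->", doc.find(DB + "para").text)
```

Editing the one fragment changes both documents the next time they are built, which is the point of writing the description once.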

Cons:

1. Toolchain. TeX distributions get this right - install texlive with all the packages and you're done - you can handle any LaTeX document. For Docbook, it's a struggle to figure out what you NEED, never mind how to install it. Once you get it worked out you can integrate it into your build system and forget it, but it takes a while to get there.

2. You need to learn a lot of languages to customize the look of your output documents, and it's not exactly for the faint of heart. I suppose this is kind of a wash between TeX and Docbook, since both don't invite casual tinkering with the look of output, but it's a bit scary. I believe the Firebird RDBMS manual is an example.

3. Finding the "right" tags for what you're trying to do. Price of doing business of course, but there are a LOT of tags to sort through.

LaTeX of course mops the floor with Docbook when it comes to things like mathematics or pstricks, but to be fair about it that's not what Docbook was intended for.

LaTeX using DocBook? (1)

vocaro (569257) | more than 3 years ago | (#33297560)

I added a new element type for typographical examples for my book on LaTeX, and it only took a few minutes

Wait a second... you're writing a book on LaTeX using DocBook?

Does not compute...

Shouldn't you be using LaTeX to write a book on LaTeX?

Re:LaTeX using DocBook? (1)

frisket (149522) | more than 3 years ago | (#33318812)

No, LaTeX sucks for document management (and don't get me started on LyX, please). I generate LaTeX from DocBook with XSLT. Using XML for large-scale documentation simply has too many benefits to ignore. I use LaTeX for formatting because that's what it was designed for, and it still does that better than anything else. But XML whipped its ass for document management when XML was still SGML.