The purpose of this page is to give an overview of the DocBook format. It explains the advantages of the format, links to further reading on the subject and contains a short tutorial.
2. What is DocBook?
DocBook is an XML-based standard used for many of today's documentation tasks. To create a DocBook document you write XML files that describe the document's structure: its sections, paragraphs and other elements. The XML syntax will look familiar if you have seen HTML before. Unlike HTML, XML is a general-purpose markup syntax with stricter rules, and XML documents can be converted into complete web pages and other output formats.
3. What are the Advantages of DocBook?
DocBook is an OASIS standard and the format in which many open source projects store their documentation. DocBook is developed in the open: the project is hosted at SourceForge and is made available under an open source license. DocBook is available as a Document Type Definition (DTD) and as an XML Schema (XSD). The project has a large developer and support community spanning both open source and commercial groups.
The most important reasons why the project uses DocBook include:
- DocBook is a standard
- DocBook is open source
- DocBook is used by most major projects
- DocBook has a large developer and support community
DocBook is also an XML application and XML technologies solve a number of publishing problems for documentation teams, including:
- Collaborative authoring
- Cross-platform editing
- Multi-channel publishing
- Improving information quality and consistency
- Enhancing functionality of electronic output
- Avoiding vendor lock-in
More information on these points can be found at http://www.sastc.org.za/index.php?option=com_content&task=view&id=18&Itemid=35
If you already understand XML, you are in a good position to start learning DocBook. If you do not, the good news is that learning DocBook will help you learn XML. The first two books listed below are a must-read for anyone just starting with DocBook.
4. Further Reading
- DocBook - The Definitive Guide: http://www.docbook.org/tdg/en/html/docbook.html
- DocBook XSL - The Complete Guide: http://www.sagehill.net/docbookxsl/index.html
- DocBook crash course: http://opensource.bureau-cornavin.com/crash-course/index.html
Yelp, the GNOME help browser, uses DocBook/XML files directly. Templates may be found at http://developer.gnome.org/projects/gdp/templates.html
Please read the DocBookReference if you want to know which tags to use where (and are too lazy or confused to look at the official reference).
If you have installed the package 'docbook-defguide', you can access the guide through your web browser at:
http://localhost/doc/docbook-defguide/html/docbook.html (assuming that your Apache still has /doc aliased to /usr/share/doc)
You can also access it from the command line using:
While reading these works it is useful to experiment. For this you will need an XML publishing toolchain and an XML editor. The DocBook web site and wiki provide links to more information on the toolchain and the editors you can use to author DocBook documents.
For an explanation of the Ubuntu Documentation Project's use of DocBook, see the "[Ubuntu DocBook Interchange Protocol]."
5. Quick Tutorial
5.1. What does DocBook look like?
DocBook defines a number of 'tags', just as HTML does. To set the author's name you would write something like:
<author><firstname>Christoph</firstname><surname>Haas</surname></author>
As you can see, this is very similar to HTML. A working example of a complete XML document appears in the "Hello World" section of this tutorial.
The 'flavor' used to write these tags is XML, hence the name DocBook/XML. (The other 'flavor' is SGML, which is not greatly different. XML is stricter than SGML, and HTML is itself an SGML application. Most people consider SGML obsolete, which is why documentation in Debian is currently being converted to XML.) Even if you have not used XML before, you should not have much trouble.
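To make the strictness of XML concrete, here is a minimal sketch in Python. It is not part of the DocBook toolchain, and the helper name is_well_formed is made up for this illustration. It uses the parser from the Python standard library, which only checks that the markup is well-formed XML; it does not validate the document against the DocBook DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML (well-formedness only, no DTD validation)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# Every opening tag has a matching closing tag: accepted.
good = "<para>Hello <emphasis>world</emphasis></para>"

# The <emphasis> tag is never closed: an XML parser rejects this.
bad = "<para>Hello <emphasis>world</para>"

print(is_well_formed(good))
print(is_well_formed(bad))
```

The first fragment is accepted and the second is rejected, because in XML every element must be closed explicitly.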
5.2. Style sheets
To create the output document from your XML input you also need a style sheet. The style sheets are written in 'XSL Transformations' (XSLT), a language that is part of the 'Extensible Stylesheet Language' (XSL) family. Basically, an XSLT style sheet describes how to convert one document into another. Usually you will not need to know how the style sheets work internally. You also need a 'processor' that takes the XML and the XSLT and creates the output file from them. We will use the free 'xsltproc' program for that purpose.
There are a number of stylesheets available to convert your document into:
- PostScript
- PDF
- XHTML
- man
- texinfo
Other converters exist to convert DocBook into further formats, such as the help format used by Yelp, the GNOME help browser.
5.3. Hello World
First you need to install the following packages:
- xsltproc (the XSL Transformations Processor)
- docbook-xsl (stylesheets for HTML, XHTML, HTML Help and others)
- docbook-defguide (The Definitive Guide to DocBook - recommended)
Enter these lines into a file and call it test.xml:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  "http://docbook.org/xml/4.2/docbookx.dtd">
<article>
  <title>My first DocBook document</title>
  <sect1>
    <title>The greeting</title>
    <para>
      Hello world
    </para>
  </sect1>
</article>
Please note that you should use UTF-8 as a character encoding. You may need to switch your terminal and your editor to UTF-8 mode too.
Run this command::
xsltproc -o test.html /usr/share/xml/docbook/stylesheet/nwalsh/xhtml/docbook.xsl test.xml
You should find a file 'test.html' in the current directory. View it with your favorite web browser.
Now what did that line actually do?
'xsltproc' is the converter program. '-o test.html' sets the output file. The next parameter '.../docbook.xsl' is the stylesheet you are using for the conversion - this one converts XML to XHTML. And finally the 'test.xml' tells xsltproc where your input file is located.
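The parts of that command line can also be assembled in a small script. The following is only a sketch in Python: the helper name build_xsltproc_command is invented for this example, and the stylesheet path is simply the one used above, which may differ on other systems. The conversion is only attempted when xsltproc, the input file and the stylesheet are all present.

```python
import os
import shutil
import subprocess

# Stylesheet path taken from the command above; it may differ on your system.
XHTML_STYLESHEET = "/usr/share/xml/docbook/stylesheet/nwalsh/xhtml/docbook.xsl"


def build_xsltproc_command(input_file, output_file, stylesheet=XHTML_STYLESHEET):
    """Return the argument list for: xsltproc -o OUTPUT STYLESHEET INPUT."""
    return ["xsltproc", "-o", output_file, stylesheet, input_file]


command = build_xsltproc_command("test.xml", "test.html")
print(" ".join(command))

# Run the conversion only if everything it needs is actually available.
ready = (
    shutil.which("xsltproc") is not None
    and os.path.exists("test.xml")
    and os.path.exists(XHTML_STYLESHEET)
)
if ready:
    subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems if a file name contains spaces.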
5.4. Customising style sheets
You will probably be disappointed with the look of the DocBook output. It is great to have it convert the document automatically but it probably does not fit into your web design or 'corporate identity' at all. There is a remedy however.
Style sheets usually provide a number of parameters that you can adjust. Usually you write your own stylesheet that imports the 'standard' style sheet. For example::
<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:import href="/usr/share/xml/docbook/stylesheet/nwalsh/xhtml/docbook.xsl"/>
  <xsl:param name="toc.max.depth">1</xsl:param>
  <xsl:param name="html.stylesheet" select="'/ubuntu.css'"/>
  <xsl:template name="user.header.content">
    <a href="/">Back to main page</a>
  </xsl:template>
</xsl:stylesheet>
This stylesheet first imports the docbook.xsl mentioned earlier. It also sets a few parameters:
- Set the maximum depth of the TOC (table of contents) to '1'. So only the <sect1> sections will be included in the TOC.
- The final XHTML document will use the 'ubuntu.css' style sheet (CSS).
- Include a link to the main page on top of the page.
These settings only work with the XHTML style sheet. For other output formats you need other settings. The settings above are documented at /usr/share/doc/docbook-xsl/doc/html/index.html
You will also want to use http://www.sagehill.net/docbookxsl/ as a reference.
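As a quick sanity check of a customisation style sheet like the one above, you can list the parameters it overrides. The sketch below is an illustration only and not part of the DocBook tools: the function name list_overridden_params is made up, and the embedded style sheet is a shortened copy of the example (the header template is left out).

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"

# Shortened copy of the customisation style sheet from the example above.
CUSTOM_XSL = """<?xml version='1.0'?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:import href="/usr/share/xml/docbook/stylesheet/nwalsh/xhtml/docbook.xsl"/>
  <xsl:param name="toc.max.depth">1</xsl:param>
  <xsl:param name="html.stylesheet" select="'/ubuntu.css'"/>
</xsl:stylesheet>
"""


def list_overridden_params(xsl_text):
    """Return {parameter name: value} for each top-level xsl:param element."""
    root = ET.fromstring(xsl_text)
    params = {}
    for param in root.findall(f"{{{XSL_NS}}}param"):
        # A parameter is set either by a select expression or by element content.
        value = param.get("select")
        if value is None:
            value = (param.text or "").strip()
        params[param.get("name")] = value
    return params


for name, value in list_overridden_params(CUSTOM_XSL).items():
    print(f"{name} = {value}")
```

Note that the value of html.stylesheet keeps its inner quotes, because the select attribute holds an XPath expression and the quotes mark it as a string.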
If you have multiple XML files or style sheets you may want to have all the processing done in a Makefile. For example::
# Add your language file here:
TARGETS = faq.html
XSLTPROC = /usr/bin/xsltproc
XSL = ubuntu.xsl

# Recipe lines must begin with a tab character.
%.html: %.xml $(XSL)
	@$(XSLTPROC) -o $@ $(XSL) $<

all: $(TARGETS)

clean:
	@rm -f *.html
6. DocBook to PDF
The simplest way to convert a DocBook document to PDF is to install the xsl-fo stylesheet (to convert to FO format) and fop (to convert FO to PDF). For some reason, the xsl-fo stylesheet is in the docbook-xsl-doc-pdf package.
sudo aptitude install fop docbook-xsl-doc-pdf
Now, to convert your DocBook file to PDF, run:
xsltproc -o intermediate-fo-file.fo \
  /usr/share/xml/docbook/stylesheet/nwalsh/fo/docbook.xsl input-docbook-file.xml
fop -pdf final-pdf-file.pdf -fo intermediate-fo-file.fo
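The two steps above can also be sketched in Python. This is an illustration under stated assumptions, not an official tool: the function name docbook_to_pdf_commands is invented here, the stylesheet path is the one from the command above, and the intermediate and final file names are simply derived from the input file name.

```python
from pathlib import Path

# FO stylesheet path taken from the command above; adjust it for your system.
FO_STYLESHEET = "/usr/share/xml/docbook/stylesheet/nwalsh/fo/docbook.xsl"


def docbook_to_pdf_commands(input_file, stylesheet=FO_STYLESHEET):
    """Return the two commands (XML to FO, then FO to PDF) as argument lists."""
    source = Path(input_file)
    fo_file = str(source.with_suffix(".fo"))
    pdf_file = str(source.with_suffix(".pdf"))
    to_fo = ["xsltproc", "-o", fo_file, stylesheet, str(source)]
    to_pdf = ["fop", "-pdf", pdf_file, "-fo", fo_file]
    return [to_fo, to_pdf]


commands = docbook_to_pdf_commands("networkmanager-manual.xml")
for command in commands:
    print(" ".join(command))
```

Each returned list can be passed to subprocess.run(command, check=True) once xsltproc and fop are installed; the second command must only run after the first has succeeded.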
Here's an example Makefile for a DocBook document named networkmanager-manual.xml:
STYLESHEETS_DIR = /usr/share/xml/docbook/stylesheet/nwalsh

all: html pdf

html:
	xsltproc -o networkmanager-manual.html $(STYLESHEETS_DIR)/xhtml/docbook.xsl networkmanager-manual.xml

fo:
	xsltproc -o networkmanager-manual.fo $(STYLESHEETS_DIR)/fo/docbook.xsl networkmanager-manual.xml

pdf: fo
	fop -pdf networkmanager-manual.pdf -fo networkmanager-manual.fo

clean:
	rm -rf networkmanager-manual.html networkmanager-manual.fo networkmanager-manual.pdf
To use the Makefile, just change the input and output names (networkmanager-manual.*) to whatever you want them to be.
Note: The W3C has blocked downloads of the necessary DTD files by unknown user agents. Because of this (at least in Karmic), fop throws a TransformerException reporting that the W3C server returned an HTTP 503 response. The workaround seems to be to set up a local DTD repository.
The dblatex program can also convert DocBook to PDF.
7. Editing Programs
This editor has syntax highlighting and code snippets for DocBook and many other languages.
- conglomerate (WYSIWYG)
Somewhat beta. Doesn't hide the gory details. You still need to read the DocBook reference. Just makes it graphical.
- VIM file type plugin "xmledit"
Good for vim-lovers. See http://www.vim.org/scripts/script.php?script_id=301
A recent update based on the above script. http://www.vim.org/scripts/script.php?script_id=1397
- EMACS XML support
Some say DocBook is easy to write only under psgml. Some use Emacs only for psgml-mode.
nxml-mode however is far superior to psgml-mode. It does real-time syntax and error highlighting.
See also DocBookEditors.