DocBook: an XML vocabulary for writing documentation, manuals and technical books
DocBook is a vocabulary of XML comprising tags with defined semantics and syntax (DTD) for writing documentation, manuals and technical books. You can serve a DocBook XML file with a stylesheet, or you can transform it with XSLT stylesheets into one or several HTML files, into an EPUB file with dbtoepub, or even into a PDF file for printing.
The stylesheets are documented in DocBook XSL: The Complete Guide, by Bob Stayton, freely available at http://www.sagehill.net.
O'Reilly & Associates publishes a book called DocBook: The Definitive Guide, by Norman Walsh and Leonard Muellner. You can buy it or read it online.
Block Elements
Sections usually have a title, and maybe a titleabbrev and/or a subtitle, and possibly an info element. They typically hold some subsections and/or paragraphs.
There are three types of sectioning elements in DocBook:
- Explicitly numbered sections, sect1 ... sect5, which must be properly nested and can only be five levels deep.
- Recursive section elements, which are alternatives to the numbered sections and have unbounded depth.
- simplesect elements, which are terminal. A simplesect can occur as a leaf section in either recursive sections or any of the numbered sections, or directly in components. The important semantic distinction of simplesect elements is that they never appear in the table of contents.
Sectioning elements cannot float in a component: you can place paragraphs and other block elements before a section, but you cannot place anything after it.
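The containment rules above can be sketched in a small DocBook 5 fragment. The titles and text are invented for illustration; only the element names come from the description above.

```xml
<chapter xmlns="http://docbook.org/ns/docbook" version="5.0">
  <title>Installation</title>
  <!-- Block elements may appear before the first section. -->
  <para>This paragraph precedes the sections.</para>
  <section>
    <title>Requirements</title>
    <para>Recursive section elements nest to any depth.</para>
    <section>
      <title>Hardware</title>
      <para>A nested section.</para>
      <simplesect>
        <title>Disk space</title>
        <!-- A simplesect is a leaf: it holds no further sections. -->
        <para>Terminal content goes here.</para>
      </simplesect>
    </section>
  </section>
  <!-- A para placed here, after the sections, would be invalid. -->
</chapter>
```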
Sectioning: a Hierarchy of Containment
- set: optional
- book or article
- part: optional, only in books
- chapter
Within a chapter, you may find either a hierarchy of nested:
- sect1
- sect2
- sect3
- sect4, and
- sect5
Or a hierarchy of nested section elements.
article*
Out of Flow Block Elements
These may lie at the front, back or to a side.
Front Matter Elements*
preface, toc, partintro, ...
Back Matter Elements*
appendix, glossary, index, ...
Out of Flow Elements
sidebar
Additional constraints: sidebar must not occur among the children or descendants of sidebar.
Description
A sidebar is a short piece of text, rarely longer than a single column or page, that is presented outside the narrative flow of the main text.
Sidebars are often used for digressions or interesting observations that are related, but not directly relevant, to the main text.
Example
<article xmlns='http://docbook.org/ns/docbook'>
<title>Example sidebar</title>
<section>
<title>An Example Section</title>
<para>Some narrative text.</para>
<sidebar>
<title>A Sidebar</title>
<para>Sidebar content.</para>
</sidebar>
<para>The continuing flow of the narrative text, as if the
sidebar was not present.</para>
</section>
</article>
annotation
Additional attributes: annotates...
Additional constraints: annotation must not occur among the children or descendants of annotation.
Description
The annotation element is a block annotation. Block annotations can be used for pop-ups and other out-of-line effects.
An annotation element is associated with another element by using a reference to an xml:id value. The association can go in either direction. An annotation element can use an annotates attribute on itself to point to an xml:id on another element. Or the other element can use an annotations attribute (one of the common attributes) on itself to point to an xml:id on an annotation element. There is no assumption that an annotation element is associated with its parent or any other ancestor element.
The attribute type of annotations and annotates is plain text, not IDREF or IDREFS. That enables modular content files to form associations with elements in other files without generating validation errors.
Example
<article xmlns='http://docbook.org/ns/docbook'>
<title>Example of an annotation</title>
<annotation xml:id="note-parts-list">
<para>This list is not comprehensive.</para>
</annotation>
<para annotations="note-parts-list">An automobile contains an engine, wheels, doors, and windows.</para>
</article>
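Since the association can go in either direction, the same annotation could presumably be written the other way around, with the annotation pointing at the paragraph through its annotates attribute. This is a sketch; the xml:id value parts-list is made up for the illustration.

```xml
<article xmlns="http://docbook.org/ns/docbook" version="5.0">
  <title>Example of an annotation</title>
  <!-- The annotation points at its target with annotates. -->
  <annotation annotates="parts-list">
    <para>This list is not comprehensive.</para>
  </annotation>
  <!-- The target only needs an xml:id; it carries no annotations attribute. -->
  <para xml:id="parts-list">An automobile contains an engine, wheels,
  doors, and windows.</para>
</article>
```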
epigraph
Description
An epigraph is a short inscription, often a quotation or poem, set at the beginning of a document or component. Epigraphs are usually related somehow to the content that follows them and may help set the tone for the component.
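A minimal sketch of an epigraph at the start of a chapter follows. The attribution and the quotation text are placeholders, not taken from any real work.

```xml
<chapter xmlns="http://docbook.org/ns/docbook" version="5.0">
  <title>Getting Started</title>
  <epigraph>
    <!-- attribution is optional and names the source of the quotation. -->
    <attribution>An anonymous reviewer</attribution>
    <para>Placeholder text for the quotation.</para>
  </epigraph>
  <para>The chapter text begins here.</para>
</chapter>
```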
Lists
Lists are made up of items. The simplest list is the simplelist. In a variablelist terms are enclosed between term tags.
- TTF
- TrueType fonts.
Numbered and unnumbered lists
They are either an orderedlist or an itemizedlist. Both are made up of listitem elements.
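As a sketch, the two list types differ only in the wrapper element. The section wrapper and all of the text below are invented for illustration.

```xml
<section xmlns="http://docbook.org/ns/docbook" version="5.0">
  <title>Installing</title>
  <!-- Numbered list: order matters. -->
  <orderedlist>
    <listitem><para>Unpack the archive.</para></listitem>
    <listitem><para>Run the installer.</para></listitem>
  </orderedlist>
  <!-- Bulleted list: order does not matter. -->
  <itemizedlist>
    <listitem><para>Works offline.</para></listitem>
    <listitem><para>Needs no administrator rights.</para></listitem>
  </itemizedlist>
</section>
```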
Lists of Definitions
A variablelist is a list of terms and definitions or descriptions.
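The TTF entry shown earlier could be marked up roughly as follows; each varlistentry pairs one or more term elements with a listitem.

```xml
<variablelist xmlns="http://docbook.org/ns/docbook">
  <varlistentry>
    <term>TTF</term>
    <listitem>
      <para>TrueType fonts.</para>
    </listitem>
  </varlistentry>
</variablelist>
```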
While glossary elements are usually limited to component or section boundaries, appearing at the end of a book or chapter, for instance, glosslist elements can appear anywhere that the other list types are allowed.
Using a glosslist in running text, instead of a variablelist, for example, maintains the semantic distinction of a glossary. This distinction may be necessary if you want to automatically point to the members of the list with glossterm elements in the body of the text.
<glosslist>
<glossentry><glossterm>C</glossterm>
<glossdef>
<para>A programming language invented by K&amp;R.</para>
</glossdef>
</glossentry>
<glossentry><glossterm>Pascal</glossterm>
<glossdef>
<para>A programming language invented by Niklaus Wirth.</para>
</glossdef>
</glossentry>
</glosslist>
Would be presented as:
- C
- A programming language invented by K&R.
- Pascal
- A programming language invented by Niklaus Wirth.
Admonitions
There are five types of admonitions in DocBook: caution, important, note, tip, and warning.
All of the admonitions have the same structure: an optional title followed by paragraph-level elements. DocBook does not impose any specific semantics on the individual admonitions.
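A minimal sketch of an admonition with the optional title follows. The wording is invented, and any of the five element names could be substituted for warning.

```xml
<warning xmlns="http://docbook.org/ns/docbook">
  <!-- The title is optional. -->
  <title>Back up first</title>
  <para>This command overwrites the existing configuration file.</para>
</warning>
```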
Examples, Figures, and Tables
Examples, figures, and tables are supported with the block-level elements: example, informalexample, figure, informalfigure, table, and informaltable.
The distinction between formal and informal elements is that formal elements have titles while informal ones do not.
'example's in DocBook
This is an 'example':
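As a sketch of the formal/informal distinction, the fragment below shows an example with a title next to an informalexample without one. The section wrapper, the xml:id, and the listing content are assumptions made for the illustration.

```xml
<section xmlns="http://docbook.org/ns/docbook" version="5.0">
  <title>Greeting the reader</title>
  <!-- Formal: carries a title, so it can be numbered and listed. -->
  <example xml:id="ex-hello">
    <title>A shell one-liner</title>
    <programlisting>echo "Hello, DocBook"</programlisting>
  </example>
  <!-- Informal: same content model, but no title. -->
  <informalexample>
    <programlisting>echo "Hello again"</programlisting>
  </informalexample>
</section>
```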
'figure's in DocBook
This is an example SVG image:
This is the code:
<mediaobject>
<imageobject>
<imagedata>
<svg xmlns="http://www.w3.org/2000/svg"
width="100" height="100" version="1.1">
<rect x="20" y="20" width="80" height="80" style="fill:blue; stroke:green; stroke-width: 2; fill-opacity: 0.5; stroke-opacity: 0.9"/>
</svg>
</imagedata>
</imageobject>
</mediaobject>
'table's in DocBook
DocBook supports CALS tables (defined with tgroup, colspec, spanspec, thead, tfoot, tbody, row, entry, entrytbl, and caption) and HTML tables (defined with col, colgroup, thead, tfoot, tbody, tr, td, and caption). Both describe tables geometrically using rows, columns, and cells.
CALS tables
Example table
| Horizontal Span | | a3 | a4 | a5 |
|---|---|---|---|---|
| f1 | f2 | f3 | f4 | f5 |
| b1 | b2 | b3 | b4 | Vertical Span |
| c1 | Span Both | | c4 | |
| d1 | | | d4 | d5 |
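CALS markup along the following lines could produce a table like the one above. This is a sketch: the title and the alignment attributes are assumptions, and the spans are inferred from the cell labels. Horizontal spans use namest and nameend, and vertical spans use morerows.

```xml
<table xmlns="http://docbook.org/ns/docbook" frame="all">
  <title>Example table</title>
  <tgroup cols="5" align="left" colsep="1" rowsep="1">
    <colspec colname="c1"/>
    <colspec colname="c2"/>
    <colspec colname="c3"/>
    <colspec colname="c4"/>
    <colspec colname="c5"/>
    <thead>
      <row>
        <!-- Spans columns c1 through c2. -->
        <entry namest="c1" nameend="c2" align="center">Horizontal Span</entry>
        <entry>a3</entry>
        <entry>a4</entry>
        <entry>a5</entry>
      </row>
    </thead>
    <tfoot>
      <row>
        <entry>f1</entry>
        <entry>f2</entry>
        <entry>f3</entry>
        <entry>f4</entry>
        <entry>f5</entry>
      </row>
    </tfoot>
    <tbody>
      <row>
        <entry>b1</entry>
        <entry>b2</entry>
        <entry>b3</entry>
        <entry>b4</entry>
        <!-- morerows="1" extends the cell one row down. -->
        <entry morerows="1" valign="middle">Vertical Span</entry>
      </row>
      <row>
        <entry>c1</entry>
        <!-- Spans two columns and two rows. -->
        <entry namest="c2" nameend="c3" morerows="1" align="center">Span Both</entry>
        <entry>c4</entry>
      </row>
      <row>
        <entry>d1</entry>
        <entry>d4</entry>
        <entry>d5</entry>
      </row>
    </tbody>
  </tgroup>
</table>
```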
HTML tables
| Head 1 | Head 2 |
|---|---|
| Body 1 | Body 2 |
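The same two-by-two table might be written with the HTML table model as follows. The caption text is an assumption. Note that an HTML-style table takes a caption where a CALS table takes a title.

```xml
<table xmlns="http://docbook.org/ns/docbook">
  <caption>Sample HTML table</caption>
  <thead>
    <tr>
      <th>Head 1</th>
      <th>Head 2</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Body 1</td>
      <td>Body 2</td>
    </tr>
  </tbody>
</table>
```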
Paragraphs
There are three paragraph elements: para, simpara (simple paragraphs may not contain other block-level elements), and formalpara (formal paragraphs have titles).
Equations
There are two block-equation elements, equation and informalequation (for inline equations, use inlineequation).
Informal equations don't have titles. For reasons of backward compatibility, equations are not required to have titles. However, it may be more difficult for some style-sheet languages to properly enumerate equations if they lack titles.
Procedures and Tasks
A procedure contains step elements, which may contain substeps or stepalternatives.
The task element is a wrapper around the procedure element that provides additional, optional elements, including tasksummary, taskprerequisites, example, and taskrelated.
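A sketch of a task wrapping a procedure with nested substeps follows. The subject matter is invented; only the element names are taken from the description above.

```xml
<task xmlns="http://docbook.org/ns/docbook" version="5.0">
  <title>Replacing the toner cartridge</title>
  <tasksummary>
    <para>Swap an empty cartridge for a new one.</para>
  </tasksummary>
  <taskprerequisites>
    <para>Have a compatible replacement cartridge at hand.</para>
  </taskprerequisites>
  <procedure>
    <step><para>Open the front cover.</para></step>
    <step>
      <para>Remove the old cartridge.</para>
      <!-- substeps groups the finer-grained steps of this step. -->
      <substeps>
        <step><para>Press the release lever.</para></step>
        <step><para>Pull the cartridge straight out.</para></step>
      </substeps>
    </step>
    <step><para>Insert the new cartridge and close the cover.</para></step>
  </procedure>
</task>
```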
Docbook tables*
Inline Elements
Users of DocBook are provided with a surfeit of inline elements. Inline elements are used to mark up running text. In published documents, inline elements often cause a font change or other small change, but they do not cause line or paragraph breaks. In practice, writers generally settle on the tagging of inline elements that suits their time and subject matter. This may be a large number of elements or only a handful. What is important is that you choose to mark up not every possible item, but only those for which distinctive tagging will be useful in the production of the finished document for the readers who will search through it.
The following comprehensive list may be a useful tool for the process of narrowing down the elements that you will choose to mark up; it is not intended to overwhelm you by its sheer length. For convenience, we've divided the inlines into several subcategories.
The classification used here is not meant to be authoritative, only helpful in providing a feel for the nature of the inlines. Several elements appear in more than one category, and arguments could be made to support the placement of additional elements in other categories or entirely new categories.
Cross-references
The cross-reference inlines identify both explicit cross-references, such as link, and implicit cross-references, such as glossterm. You can make most of the implicit references explicit with a linkend attribute.
- anchor: A spot in the document
- citation: An inline bibliographic reference to another published work
- citerefentry: A citation to a reference page
- citetitle: The title of a cited work
- firstterm: The first occurrence of a term
- glossterm: A glossary term
- link: A hypertext link
- olink: A link that addresses its target indirectly
- xref: A cross-reference to another part of the document
DocBook supports a rich set of elements and features for creating cross references. Here are the basic kinds of cross references you can create:
- Within a document.
- Between documents.
- To websites.
- Specialized cross references.
Cross references within a document
DocBook 4 employs the built-in feature of XML for cross referencing within a document, in which attributes are used to identify starting and ending points for cross references. In an XML DTD, an attribute can be assigned an attribute type of ID. That type of attribute is used to label an element as a potential target (end point) of a cross reference. The attribute name does not have to be ID, it just needs to be declared as that type. In DocBook 4, the attribute assigned this purpose just happens to be named id as well. In the DocBook DTD, almost every element can have an id attribute, with attribute type ID. That means any element in your document could be the target of a cross reference.
In DocBook 5, you use xml:id instead of id. The xml:id attribute is predefined to have attribute type ID, whether or not a schema is available to confirm it.
According to the XML standard, every value of an attribute with type ID must be unique. In DocBook, this means every instance of an id (DocBook 4) or xml:id (DocBook 5) attribute in a document must be unique within that document. When you validate your document, the validator will tell you if you have any duplicate id attributes.
Other attributes can have a type of IDREF, which is used to point to an ID on another element to form a cross reference. There are a handful of elements in the DocBook DTD that have attributes of type IDREF, of various names. There are two main elements that are used to create cross references to targets within the same document:
- xref
- An automatic cross reference that generates the text of the reference. For that purpose, it is an empty element that takes no content of its own. The stylesheets control what output is generated. The generated text can be the target element's titleabbrev (if it has one) or title, number label (if it has one), or both.
- link
- A cross reference where you supply the text of the reference as the content of the link element. Therefore it must not be an empty element.
Both of these elements require a linkend attribute whose value must match some id or xml:id value in the document. Here are two examples in DocBook 4:
Internal Cross References Example
<chapter id="intro">
<title>Introduction</title>
<para>Welcome to our new product. One of its new features is a
<link linkend="WidgetIntro">widget</link>. Other new features include ...
</para>
</chapter>
<chapter id="WidgetIntro">
<title>Introducing the Widget</title>
<para>Widgets are just one of our new features. For a complete list of new
features, see <xref linkend="intro"/>.
</para>
</chapter>
...
Here is some guidance for using these cross reference elements:
Use xref when you want to reference another element's title or number. It will automatically get the current information so you do not have to maintain such references.
Use link when you want to create a less formal reference that does not include the title or number. You can use whatever words you want.
When adding an id or xml:id attribute, put it on the element itself, not the title. The stylesheets know how to find the title for each element being referenced.
Not all elements are appropriate as targets of an xref, because they do not have a title or number. For example, you may want to cross reference to a para, but you wouldn't want the whole paragraph to be copied as the reference text. See the xreflabel attribute under Common Attributes in DocBook for a way to use xref anyway.
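For comparison, the internal cross-reference example earlier in this section might look as follows in DocBook 5, where xml:id replaces id and the elements live in the DocBook namespace. The enclosing book and its title are assumptions added to make the fragment self-contained.

```xml
<book xmlns="http://docbook.org/ns/docbook" version="5.0">
  <title>Product Guide</title>
  <chapter xml:id="intro">
    <title>Introduction</title>
    <!-- link: the author supplies the reference text. -->
    <para>Welcome to our new product. One of its new features is a
    <link linkend="WidgetIntro">widget</link>.</para>
  </chapter>
  <chapter xml:id="WidgetIntro">
    <title>Introducing the Widget</title>
    <!-- xref: empty element; the stylesheet generates the text. -->
    <para>For a complete list of new features, see
    <xref linkend="intro"/>.</para>
  </chapter>
</book>
```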
Universal linking in DocBook 5
In DocBook 4, only specialized elements are used for creating links within and between documents. In DocBook 4, you can use xref or link with linkend attributes to form links within a DocBook document, you can use olink to form links between DocBook documents, and you can use ulink to form an arbitrary URL link.
In DocBook 5, almost all elements can be used as the basis for a link. That's because almost all elements have a set of attributes that are defined in the XLink namespace, such as xlink:href. For example, you can turn a command element into a link that targets the reference page for the command.
<para>Use the <command xlink:href="#ref-preview">Preview</command> command to generate a preview.</para>
The XML Linking Language (XLink) has been a W3C standard since 2001. That standard says that any XML element can become the source or target of a link if it has the universal XLink attributes on it. These attributes are in their own namespace named http://www.w3.org/1999/xlink. Because these attributes are in their own namespace, they do not interfere with any native attributes declared for an element.
An xlink:href attribute value can have several different forms:
- An attribute such as xlink:href="#intro" refers to an xml:id attribute that exists in the current document. This is similar to the DocBook 4 link and xref elements. The link and xref elements were retained in DocBook 5.
- An attribute such as xlink:href="http://docbook.org" refers to an arbitrary URL. This is similar to the DocBook 4 ulink element, which was removed in DocBook 5. Instead of ulink, use a link element with a URL in its xlink:href attribute.
- An olink-style link from any element can be formed using two attributes. If there is an xlink:role="http://docbook.org/xlink/role/olink" attribute present, then a link attribute of the form xlink:href="targetdoc#targetptr" is interpreted as the two parts of an olink. The olink element itself is retained in DocBook 5. See the chapter "Olinking between documents" in DocBook XSL: The Complete Guide to learn more about DocBook olinks.
At the same time, the familiar DocBook linking attribute linkend has also been added anywhere an XLink can be used. The linkend attribute is limited to linking to an xml:id target within the same document.
The universal linking mechanism enables you to create logical links between any two DocBook elements. However, such logical links may or may not be expressible in formatted output. For example, if you put an xlink:href on an inline element, then the text of the inline element can become clickable link text in the output. However, if you put an xlink:href attribute on a block element such as section, then it is doubtful that making all the text in the section into a clickable link will be useful. The DocBook stylesheets currently only handle xlink:href on inline elements for this reason. If you want to express linking from a block element, you will have to customize the stylesheet to do so, perhaps by putting a clickable icon in the margin.
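The three forms of xlink:href described above can be sketched side by side on inline elements. The target #ref-preview comes from the earlier example; the olink target userguide#install and the link texts are hypothetical.

```xml
<para xmlns="http://docbook.org/ns/docbook"
      xmlns:xlink="http://www.w3.org/1999/xlink">
  <!-- 1. Fragment identifier: targets an xml:id in the current document. -->
  Use the <command xlink:href="#ref-preview">Preview</command> command.
  <!-- 2. Arbitrary URL: takes the place of the DocBook 4 ulink element. -->
  Visit the <link xlink:href="http://docbook.org">DocBook site</link>.
  <!-- 3. olink style: the role marks the href as targetdoc#targetptr. -->
  Read the <citetitle xlink:role="http://docbook.org/xlink/role/olink"
      xlink:href="userguide#install">installation notes</citetitle>.
</para>
```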
Markup
The following inlines are used to mark up text for special presentation:
- foreignphrase: A word or phrase in a language other than the primary language of the document
- wordasword: A word meant specifically as a word and not representing anything else
- computeroutput: Data, generally text, displayed or presented by a computer
- literal: Inline text that is some literal value
- markup: A string of formatting markup in text that is to be represented literally
- prompt: A character or string indicating the start of an input field in a computer display
- replaceable: Content that may or must be replaced by the user
- tag: A component of XML (or SGML) markup
- userinput: Data entered by the user
Mathematics
DocBook does not define a complete set of elements for representing equations. The Mathematical Markup Language (MathML) [MathML] is a standard that defines a comprehensive grammar for representing equations. MathML markup may be used in any of the equation elements (equation, informalequation, and inlineequation). For simple mathematical equations that do not require extensive markup, the mathphrase element is an alternative.
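As a sketch, the same simple formula is shown below once with mathphrase and once with MathML. The formula, the surrounding text, and the mml prefix are chosen only for illustration.

```xml
<section xmlns="http://docbook.org/ns/docbook"
         xmlns:mml="http://www.w3.org/1998/Math/MathML" version="5.0">
  <title>Two ways to write an equation</title>
  <!-- mathphrase: plain text plus subscript and superscript. -->
  <para>Energy and mass are related by
  <inlineequation>
    <mathphrase>E = mc<superscript>2</superscript></mathphrase>
  </inlineequation>.</para>
  <!-- MathML inside a block equation without a title. -->
  <informalequation>
    <mml:math>
      <mml:mrow>
        <mml:mi>E</mml:mi>
        <mml:mo>=</mml:mo>
        <mml:mi>m</mml:mi>
        <mml:msup>
          <mml:mi>c</mml:mi>
          <mml:mn>2</mml:mn>
        </mml:msup>
      </mml:mrow>
    </mml:math>
  </informalequation>
</section>
```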
User interfaces
These elements describe aspects of a user interface:
Programming languages and constructs
Many of the technical inlines in DocBook are related to programming:
Operating systems
These inlines identify parts of an operating system, or an operating environment:
General purpose
There are also a number of general-purpose technical inlines:
Attributes in DocBook
Common Attributes in DocBook
There are many common attributes that occur on every DocBook element. They are summarized here for brevity and to make the additional attributes that occur on many elements stand out.
- annotations
- It holds text and identifies one or more annotations that apply to this element.
- dir
- Identifies the direction of text in an element. Its allowed values are: ltr (Left-to-right text), rtl (Right-to-left text), lro (Left-to-right override), and rlo (Right-to-left override).
- remap
- Provides the name or similar semantic identifier assigned to the content in some previous markup scheme
- revisionflag
- Identifies the revision status of the element. Its values are: changed (The element has been changed), added (the element is new, i.e. it has been added to the document), deleted (the element has been deleted), and off (revision markup has been explicitly turned off for this element).
- role
- Provides additional, user-specified classification for an element.
- version
- Specifies the DocBook version of the element and its descendants.
- xml:base
- Specifies the base URI of the element and its descendants.
- xml:id
- Identifies the unique ID value of the element.
- xml:lang
- Specifies the natural language of the element and its descendants
- xreflabel
- Provides the text that is to be generated for a cross-reference to the element.
For elements like para or note that do not have a title or number, you can add an xreflabel attribute to the element. That attribute should contain the text you want to appear when an xref points to that element. The following is an example:
<para id="ChooseSCSIid" xreflabel="choosing a SCSI id">The methods for choosing a <acronym>SCSI</acronym> id are ...
</para>
...
<para>
See the paragraph on <xref linkend="ChooseSCSIid"/>.
</para>
The xref in the second paragraph points to the first paragraph. When processed, the second paragraph will read "See the paragraph on choosing a SCSI id." and an HTML hot link will take the reader to the beginning of the paragraph.
The advantage of xreflabel over using a link element is to provide consistent reference text to that element. If you decide to change the wording, you only have to change the xreflabel, and all xref references to it will change. If you had used link instead, then you would have to find and edit each instance to change the text.
If you put an xreflabel on an element that normally does have generated text, the attribute will override the default generated text.
Common Effectivity Attributes
The common attributes include a collection of effectivity attributes. These attributes are available for authors to identify to whom a particular element applies. Effectivity attributes are often used for profiling: building documents that contain information only relevant to a particular audience.
For example, a section might be identified as available only to readers with a top-secret security clearance or a paragraph might be identified as affecting only users running the implementation provided by a particular vendor.
- arch
- Designates the computer or chip architecture to which the element applies
- audience
- Designates the intended audience to which the element applies; for example, system administrators, programmers, or new users
- condition
- Provides a standard place for application-specific effectivity
- conformance
- Indicates standards conformance characteristics of the element
- os
- Indicates the operating system to which the element is applicable
- revision
- Indicates the editorial revision to which the element belongs
- security
- Indicates something about the security level associated with the element to which it applies
- userlevel
- Indicates the level of user experience for which the element applies
- vendor
- Indicates the computer vendor to which the element applies
- wordsize
- Indicates the word size (width in bits) of the computer architecture to which the element applies
The names of the effectivity attributes are suggestive of several classes of common effectivity information. The semantically neutral condition attribute was added to give authors a place to put values that don't fit neatly into one of the other alternatives.
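A sketch of effectivity attributes in use follows. The attribute values (linux, windows, advanced, internal), the command, and the text are all invented; DocBook leaves the vocabulary of values to the author.

```xml
<section xmlns="http://docbook.org/ns/docbook" version="5.0">
  <title>Starting the server</title>
  <!-- Only one of these two paragraphs applies to a given reader. -->
  <para os="linux">Run <command>exampled --start</command> from a shell.</para>
  <para os="windows">Start the service from the Services console.</para>
  <!-- Several effectivity attributes may be combined on one element. -->
  <para userlevel="advanced" condition="internal">Debug logging can be
  enabled in the configuration file.</para>
</section>
```

With the DocBook XSL profiling stylesheets, a parameter such as profile.os is typically used to select which of these values are kept in a given build.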
Common Linking Attributes
The following attributes occur on all elements that can be the start of a link. They are summarized here once for brevity and to make the additional attributes that occur on many elements stand out.
- linkend (IDREF) or linkends (IDREFS)
- Points to an internal link target by identifying the value of its xml:id attribute.
- xlink:actuate
- Identifies the XLink actuate behavior of the link:
- onLoad: An application should traverse to the ending resource immediately on loading the starting resource.
- onRequest: An application should traverse from the starting resource to the ending resource only on a post-loading event triggered for the purpose of traversal.
- other: The behavior of an application traversing to the ending resource is unconstrained by this specification. The application should look for other markup present in the link to determine the appropriate behavior.
- none: The behavior of an application traversing to the ending resource is unconstrained by this specification. No other markup is present to help the application determine the appropriate behavior.
- xlink:arcrole
- Identifies the XLink arcrole of the link
- xlink:href
- Identifies a link target with a URI
- xlink:role
- Identifies the XLink role of the link
- xlink:show
- Identifies the XLink show behavior of the link:
- new: An application traversing to the ending resource should load the resource in a new window, frame, pane, or other relevant presentation context.
- replace: An application traversing to the ending resource should load the resource in the same window, frame, pane, or other relevant presentation context in which the starting resource was loaded.
- embed: An application traversing to the ending resource should load its presentation in place of the presentation of the starting resource.
- other: The behavior of an application traversing to the ending resource is unconstrained by XLink. The application should look for other markup present in the link to determine the appropriate behavior.
- none: The behavior of an application traversing to the ending resource is unconstrained by this specification. No other markup is present to help the application determine the appropriate behavior.
- xlink:title
- Identifies the XLink title of the link
- xlink:type
- Identifies the XLink link type
Code in DocBook
You may use block programlisting or inline userinput and computeroutput.
Annotating Program Listings
It is often the case that you need to comment on lines of code to explain them. There are three mechanisms you can use for that purpose:
- line annotations,
- line numbering, or
- callouts.
Line annotations
You can mix lineannotation elements in with your code to explain something directly in the text. For example:
<programlisting># constructor
sub new {
my ($file, $output) = @_; <lineannotation>Store args</lineannotation>
my $dir = basename $file; <lineannotation>Get dir name</lineannotation>
}
</programlisting>
Line annotations in the stock stylesheet print as italic, but they inherit the monospace font family of the programlisting.
Line annotations do not work in external code files that are brought in as plain text, because the < character that starts the element will be escaped as &lt; when it is brought in. You can use lineannotation elements in files brought in with <xi:include parse="xml">, but then you have to be careful to escape other XML characters in your program file. Line annotations also cannot be used with examples marked as CDATA, because any lineannotation element will not be recognized as an XML element.
Line numbering
You can add line numbers to the listing, and then your paragraphs can refer to the line numbers. Currently line numbering is only available with the Java processors Saxon and Xalan, not xsltproc, because it is done with an extension function.
Line numbers are turned on by a linenumbering attribute on each programlisting element that needs line numbering. By default, the numbering starts at 1, but you can assign your own starting number with the optional startinglinenumber attribute. You can also continue the numbering from the most recent programlisting that had line numbering by adding a continuation="continues" attribute to the current element. The following is an example with startinglinenumber:
<programlisting linenumbering="numbered" startinglinenumber="12">
Once your lines are numbered, you can refer to the line numbers in the paragraphs. The problem with line numbers, though, is you cannot see them until the text is formatted at least once. Also, if you edit the code, the line numbers may change and you will need to adjust your number references. It is useful for stable code examples, though.
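A sketch of two numbered listings follows, the second continuing the numbering of the first. The section wrapper and the Perl lines are invented for illustration.

```xml
<section xmlns="http://docbook.org/ns/docbook" version="5.0">
  <title>Numbered listings</title>
  <!-- Numbering starts at 12 instead of the default 1. -->
  <programlisting linenumbering="numbered" startinglinenumber="12">my $dir = dirname($file);
print "$dir\n";</programlisting>
  <para>Line 12 computes the directory name.</para>
  <!-- Numbering carries on from the previous numbered listing. -->
  <programlisting linenumbering="numbered" continuation="continues">print "done\n";</programlisting>
</section>
```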
Callouts
You can use callouts to mark specific locations in a program listing and link explanatory text to the marks. In DocBook, the callout element contains the explanatory text. The mark, which is called a callout bug, is most easily placed using the co element. Those two elements can be linked to each other to allow the reader to conveniently move back and forth between them.
The callout bug is usually rendered as a white number in a black circle.
<programlisting>
#ifndef _My_Parser_h_ <co id="condition-co" linkends="condition" />
#define _My_Parser_h_
#include "MyFetch.h" <co id="headerfile-co" linkends="headerfile" />
class My_Parser <co id="classdef-co" linkends="classdef" />
{
public:
//
// Construction/Destruction
//
My_Parser(); <coref linkend="classdef-co"/>
virtual ~My_Parser() = 0;
virtual int parse(MyFetch &amp;fetcher) = 0;
};
#endif
</programlisting>
<calloutlist>
<callout arearefs="condition-co" id="condition" >
<para>Make this conditional.</para>
</callout>
<callout arearefs="headerfile-co" id="headerfile">
<para>Load necessary constants.</para>
</callout>
<callout arearefs="classdef-co" id="classdef">
<para>Define new class</para>
</callout>
</calloutlist>
- Use a co element to place a callout bug in your code sample. The element is empty, with all the information in attributes.
- Give the co an id or xml:id value so the callout text can be linked directly to the callout bug location.
- Its linkends attribute value (condition) should match the id or xml:id value of its callout element. That forms a link from the callout bug to the text.
- Use a coref instead of a co when you want to create a duplicate bug number. That is, when you have more than one location in your code that needs to refer to the same callout paragraph, use a coref for any but the first location. The linkend of the coref must point to the id of the master co element. The duplicate callout icons will all hotlink to the same callout paragraph. But the icon next to the callout paragraph will link back only to the master co element.
- A calloutlist contains a set of callout elements, and formats them as a list.
- Each callout element is paired up with a co element. The numbering order is based on the co order, so you should keep the callout elements in the same order.
- The arearefs attribute value matches the id value of its co callout bug. That forms a link from the callout text to the callout bug.
- Give the callout an id value so the callout bug can link directly to its callout text.
Links Inside and Outside
Many elements admit linking attributes, which effectively turn their contents into a hot link. The most common attributes are linkend, which builds an internal link to an id within the document, and xlink:href, which builds an external link to a URI.
Note that you need to declare the XLink namespace in your document instance to use xlink:href and other XLink attributes, like so:
<book xmlns="http://docbook.org/ns/docbook"
xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0">
Another useful XLink attribute is xlink:show, which establishes how the link is shown and may be new, replace, embed, other, or none.
Here are some examples using (internal) linking attributes:
- Link to another element in the document bearing an id attribute such as xml:id='some_destination':
<link linkend='some_destination'>jump</link>
Alternatively, the xlink:href attribute may contain a hash-preceded fragment identifier to create a link within a document. For example:
<command xlink:href="#dir">DIR</command>
- Link to a website, a web page, etc.:
<link xlink:href='https://www.domain.xyz/'>hot text of the hyperlink</link>
The uri Tag
The uri element identifies a Uniform Resource Identifier (URI) in content. Although DocBook does not mandate any values for the type attribute, several useful values have been suggested:
- xmlnamespace for an XML namespace name; for example, http://docbook.org/ns/docbook
- saxfeaturename for a SAX feature name; for example, http://xml.org/sax/features/namespaces
- saxpropertyname for a SAX property name; for example, http://xml.org/sax/properties/declaration-handler
- soapaction for a SOAP action; see SOAP Version 1.2 Part 2: Adjuncts
- rddlpurpose for an RDDL (Resource Directory Description Language) purpose
- rddlnature for an RDDL (Resource Directory Description Language) nature
- homepage for a home page; for example, http://nwalsh.com/
- weblog for a web log; for example, http://norman.walsh.name/
- webpage for a web page; for example, http://docbook.org/schemas/
- website for a website; for example, http://docbook.org
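A short sketch of the uri element with two of the suggested type values follows; the surrounding sentence is invented.

```xml
<para xmlns="http://docbook.org/ns/docbook">
  DocBook 5 elements live in the
  <uri type="xmlnamespace">http://docbook.org/ns/docbook</uri> namespace,
  and the project site is <uri type="website">http://docbook.org</uri>.
</para>
```

Unlike link, uri only identifies the string as a URI; it does not by itself ask for a hyperlink.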
Uniform info Elements
DocBook versions earlier than DocBook V5.0 use unique elements for block information. For example, a book element would contain a bookinfo element. This was done to support different content models for different block elements. DTDs only allow one content model for each element, so a different element name was required for each block's information element. Since RELAX NG does not have this limitation, an element can have a different content model in different contexts. Therefore, the array of info elements (articleinfo, bookinfo, etc.) has been replaced with a single info element.
Thus, an info element contains two series:
- An interleave of: title? titleabbrev? subtitle?
- Zero or more of the following "Info" elements:
  - abstract: A summary
  - address: A real-world address, generally a postal address
  - artpagenums: The page numbers of an article as published
  - author: The name of an individual author
  - authorgroup: Wrapper for author information when a document has multiple authors or collaborators
  - authorinitials: The initials or other short identifier for an author
  - bibliocoverage: The spatial or temporal coverage of a document
  - biblioid: An identifier for a document
  - bibliosource: The source of a document
  - collab: Identifies a collaborator
  - confgroup: A wrapper for document meta-information about a conference
  - contractsponsor: The sponsor of a contract
  - contractnum: The contract number of a document
  - copyright: Copyright information about a document
  - cover: Additional content for the cover of a publication
  - date: The date of publication or revision of a document
  - edition: The name or number of an edition of a document
  - editor: The name of the editor of a document
  - issuenum: The number of an issue of a journal
  - keywordset: A set of keywords describing the content of a document
  - legalnotice: A statement of legal obligations or requirements
  - mediaobject: A displayed media object (video, audio, image, etc.)
  - org: An organization and associated metadata
  - orgname: The name of an organization
  - othercredit: A person or entity, other than an author or editor, credited in a document
  - pagenums: The numbers of the pages in a book, for use in a bibliographic entry
  - printhistory: The printing history of a document
  - pubdate: The date of publication of a document
  - publisher: The publisher of a document
  - publishername: The name of the publisher of a document
  - releaseinfo: Information about a particular release of a document
  - revhistory: A history of the revisions to a document
  - seriesvolnums: Numbers of the volumes in a series of books
  - subjectset: A set of terms describing the subject matter of a document
  - volumenum: The volume number of a document in a set (as of books in a set or articles in a journal)
  - Any element from group db._any
  - annotation: An annotation
  - extendedlink: An XLink extended link
  - bibliomisc: Untyped bibliographic information
  - bibliomset: A cooked container for related bibliographic information
  - bibliorelation: The relationship of a document to another
  - biblioset: A raw container for related bibliographic information
  - itermset: A set of index terms in the meta-information of a document
  - productname: The formal name of a product
  - productnumber: A number assigned to a product
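To make the two series concrete, here is a small sketch of a DocBook V5.0 book whose info element carries a title and subtitle followed by a handful of the "Info" elements listed above. The names, dates and wording are placeholders invented for this example.

```xml
<book xmlns="http://docbook.org/ns/docbook" version="5.0">
  <!-- A single info element replaces the older bookinfo -->
  <info>
    <title>An Example Book</title>
    <subtitle>With a Uniform info Element</subtitle>
    <!-- Hypothetical author and dates, for illustration only -->
    <author>
      <personname>
        <firstname>Jane</firstname>
        <surname>Doe</surname>
      </personname>
    </author>
    <pubdate>2013</pubdate>
    <copyright>
      <year>2013</year>
      <holder>Jane Doe</holder>
    </copyright>
    <abstract>
      <para>A one-paragraph summary of the book.</para>
    </abstract>
  </info>
  <chapter>
    <title>First Chapter</title>
    <para>...</para>
  </chapter>
</book>
```

The same info element, with the same name, could be used inside the chapter or an article; only the allowed content may differ by context.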
Required title Element and version Attribute
DocBook V5.0 requires a title element, either as a direct child or inside info, on large block elements such as article.
DocBook V5.0 no longer requires a Document Type Declaration. However, because processors may need to know the version of an instance, DocBook V5.0 has added the version attribute, which must appear on the root element of a DocBook document. The version attribute may also appear on other elements, and mixing of versions is allowed.
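A minimal sketch of a DocBook V5.0 instance without a Document Type Declaration might look like this. The title and content are placeholders; the namespace and the version attribute on the root element are what identify the vocabulary and its version to a processor.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- No DOCTYPE: the namespace plus the version attribute identify the document -->
<article xmlns="http://docbook.org/ns/docbook" version="5.0">
  <title>A Minimal Article</title>
  <para>...</para>
</article>
```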
Including Images in DocBook Documents
To include a picture in the text, use the graphic tag (DocBook 4 only; DocBook V5.0 removed it).
(from http://bochs.sourceforge.net/doc/docbook/documentation/basics.html)
Or, in any version, use a mediaobject:
<mediaobject>
<imageobject>
<imagedata format='PNG' fileref='image_folder/Figure1.png'/>
</imageobject>
</mediaobject>
A Callout to an Image or a Listing
In publishing, a call-out or callout is a short string of text connected by a line, arrow, or similar graphic to a feature of an illustration or technical drawing, and giving information about that feature. The term is also used to describe a short piece of text set in larger type than the rest of the page and intended to attract attention.
A similar device in word processing is a special text box with or without a small "tail" that can be pointed to different locations on a document.
Wikipedia
Callouts are marks, frequently numbered and typically on a graphic (imageobjectco) or verbatim environment (programlistingco or screenco), that are described in a calloutlist.
Callouts, such as numbered bullets, are an annotation mechanism. In an online system, these bullets are frequently hot, and clicking on them sends you to the corresponding annotation. Annotation mechanisms for quoted text are explained elsewhere.
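As an illustration, the sketch below annotates a command line with two callouts. It uses inline co marks inside a plain programlisting rather than the programlistingco wrapper mentioned above, which is a simpler variant of the same idea; the xml:id values and the annotation text are invented for the example.

```xml
<!-- Each co element places a callout mark; ids are hypothetical -->
<programlisting>xsltproc --output myfile.html docbook.xsl <co xml:id="co.stylesheet"/> myfile.xml <co xml:id="co.source"/></programlisting>
<calloutlist>
  <!-- arearefs points back at the mark being described -->
  <callout arearefs="co.stylesheet">
    <para>The stylesheet to apply.</para>
  </callout>
  <callout arearefs="co.source">
    <para>The DocBook source file.</para>
  </callout>
</calloutlist>
```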
The alt Element Used as a Backdoor
Much like a processing instruction in XML, an alt element may be used... Below is an example of a TeX mathematical formula inclusion meant to be processed by the dblatex engine:
<equation id="eq-with-no-title">
<alt>C = \alpha + \beta Y^{\gamma} + \epsilon</alt>
</equation>
Making a Glossary in DocBook
A glossary, like a bibliography, is often constructed by hand. However, some applications are capable of building a skeletal glossary from glossary term markup in the document. If all of your terms are defined in some glossary database, it may even be possible to construct the complete glossary automatically.
To enable automatic glossary generation, or simply automatic linking from glossary terms in the text to glossary entries, you must add markup to your documents. In the text, you mark up a term for compilation later with the inline glossterm tag. This tag can have a linkend attribute whose value is the ID of the actual entry in the glossary.
For instance, if you have this markup in your document:
<glossterm linkend="xml">Extensible Markup Language</glossterm> is a new standard...
your glossary might look like this:
<glossary><title>Example Glossary</title>
⋮
<glossdiv><title>E</title>
<glossentry xml:id="xml"><glossterm>Extensible Markup Language</glossterm>
<acronym>XML</acronym>
<glossdef>
<para>Some reasonable definition here.</para>
<glossseealso otherterm="sgml"/>
</glossdef>
</glossentry>
</glossdiv>
⋮
</glossary>
Note that the glossterm tag reappears in the glossary to mark up the term and distinguish it from its definition within the glossentry. The linkend of the glossterm in the text refers to the xml:id of the glossentry in the glossary itself. You can use the link between source and glossary to create a link in electronic formats, as we have done with the HTML and PDF forms of the glossary in this book.
You can use the baseform attribute on glossterm and firstterm when the term marked up in context is in a different form, for example, plural. Here is an example:
<para> Using <glossterm baseform="DTD">DTDs</glossterm> can be hazardous to your sanity. </para>
Making an Address/Phone/Agenda Book
The info tag
The info element contains meta-information about the element that contains it. An info element may contain a title and/or a subtitle element. You may find an example of how it is used here.
Describing a Person in DocBook
A person can be described in DocBook as in the following example:
<info>
<title>Example author</title>
<author>
<personname>
<honorific>Mr</honorific>
<firstname>Norman</firstname>
<surname>Walsh</surname>
<othername role='mi'>D</othername>
</personname>
<affiliation>
<shortaffil>ATI</shortaffil>
<jobtitle>Senior Application Analyst</jobtitle>
<orgname>ArborText, Inc.</orgname>
<orgdiv>Application Development</orgdiv>
</affiliation>
</author>
</info>
Tag person may be used instead of tag author. Additionally, tag personblurb is used to hold a short description of a person.
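A sketch of the person variant, with a personblurb, could look like the fragment below. The person, the organization and the blurb text are all hypothetical, and where such a fragment may legally appear depends on the schema version in use.

```xml
<!-- Hypothetical person; not taken from any real document -->
<person>
  <personname>
    <firstname>Jane</firstname>
    <surname>Doe</surname>
  </personname>
  <affiliation>
    <jobtitle>Technical Writer</jobtitle>
    <orgname>Example Organization</orgname>
  </affiliation>
  <personblurb>
    <para>Jane Doe writes manuals in DocBook.</para>
  </personblurb>
</person>
```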
Places and Addresses
The following tags are used:
- email
- address, containing otheraddr, city, country, state, street, postcode and pob
- phone
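Put together, an entry in an address book might be sketched as follows. All values are made up for illustration; note that address preserves line breaks, so the layout inside it is significant.

```xml
<!-- Illustrative address; every value here is a placeholder -->
<address><street>1 Example Street</street>
<city>Springfield</city> <postcode>12345</postcode>
<country>Exampleland</country>
<phone>+00 555 0100</phone>
<email>jane.doe@example.org</email></address>
```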
Summarizing and Commenting in DocBook
The abstract may be used for summarizing. An abstract can occur in most components of DocBook. It is expected to contain some sort of summary of the content with which it is associated (by containment). An abstract is a block element and may contain info and paragraph elements.
Other tags for summarizing are epigraph and sidebar.
An epigraph is a short inscription, often a quotation or poem, set at the beginning of a document or component. Epigraphs are usually related somehow to the content that follows them and may help set the tone for the component. It may contain info, attribution, literallayout, and paragraph elements.
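For instance, a chapter opening with an epigraph might be sketched like this. The chapter title is invented; the quotation is the familiar line from Hamlet.

```xml
<chapter xmlns="http://docbook.org/ns/docbook">
  <title>On Writing Short Manuals</title>
  <epigraph>
    <!-- attribution comes before the quoted text -->
    <attribution>William Shakespeare, <citetitle>Hamlet</citetitle></attribution>
    <para>Brevity is the soul of wit.</para>
  </epigraph>
  <para>...</para>
</chapter>
```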
Backus-Naur Notation in DocBook*
Dates in DocBook
Use tag date for an inline date; DocBook does not specify the format of the date.
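Since the format is left open, the sketch below simply assumes an ISO 8601 style date, which is one reasonable convention and not something DocBook mandates.

```xml
<!-- The YYYY-MM-DD format is an assumption; DocBook accepts any text here -->
<para>The program was last revised on <date>2013-11-23</date>.</para>
```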
Namespaces Declared in DocBook
Generally, operator colon (:) is used to select any element from any namespace except:
- http://docbook.org/ns/docbook
- http://www.w3.org/1999/xhtml
Two common public namespaces are assumed: those for MathML (mml) and SVG (svg).
Here is an example of inclusion of a mathematical formula (mml). It just prints x³:
<article xmlns='http://docbook.org/ns/docbook'
xmlns:mml="http://www.w3.org/1998/Math/MathML">
<title>Example mml-math</title>
<informalequation>
<mml:math>
<mml:msup>
<mml:mi>x</mml:mi>
<mml:mn>3</mml:mn>
</mml:msup>
</mml:math>
</informalequation>
</article>
And here is an example of inclusion of an SVG image:
<article xmlns='http://docbook.org/ns/docbook'
version="5.0">
<title>Example svg-svg</title>
<mediaobject>
<imageobject>
<imagedata>
<svg xmlns="http://www.w3.org/2000/svg"
width="100" height="100" version="1.1">
<rect x="20" y="20" width="80" height="80"
style="fill:blue; stroke:green; stroke-width: 2;
fill-opacity: 0.5; stroke-opacity: 0.9"/>
</svg>
</imagedata>
</imageobject>
</mediaobject>
</article>
Simplified Docbook*
Transforming to PostScript, PDF ...
Transforming to PDF, PostScript, LaTeX through dblatex
dblatex supports links, tables, bookmarks... Nevertheless, it has failed on my large files. In that case you can generate only the LaTeX source:
dblatex --type=tex my_file.xml
And then:
latex my_file.tex
to get a dvi file, later to be converted to PostScript or PDF format:
dvipdf my_file.dvi
Yet if you want EPS support, do:
dvips my_file.dvi && ps2pdf my_file.ps
dvipdfm and dvipdfmx don't generate links.
Unicode is supported through switch: -b xetex or --backend=xetex.
XSL parameters can be set like this: -P param=value or --param=param=value.
Transforming to PostScript through sdop
sdop reads DocBook XML input, processes it into page images, and writes the result as PostScript. This can be turned into PDF using an application such as ps2pdf. SDoP is "simple" because (a) it does not check that the input conforms to the DTD, and (b) it supports only a simple subset of DocBook features. (from the man page)
sdop was written by Philip Hazel, Cambridge, England, who last revised it on November 23rd, 2013.
EPS images are supported out of the box.
Supported DocBook Elements
A subset of DocBook, based on Simplified DocBook, is supported. The Simplified DocBook elements that are not supported are authorgroup, biblioxxx, entrytbl, and spanspec.
In the following table, elements that are defined for Simplified DocBook are marked with a dagger.
| † abbrev | Ignored |
| † abstract | Ignored |
| † acronym | Ignored |
| address | |
| † affiliation | |
| † appendix | Supports id |
| † article | |
| † articleinfo | |
| † attribution | |
| † audiodata | Ignored |
| † audioobject | Ignored |
| † author | |
| † authorblurb | Ignored |
| † authorinitials | |
| † blockquote | |
| book | |
| bookinfo | |
| † caption | |
| chapter | Supports id |
| † citetitle | Treated as emphasis |
| colophon | |
| † colspec | Supports align, char, charoff, colwidth, colsep |
| † command | |
| † computeroutput | Treated as literal |
| † copyright | Supported in articleinfo and bookinfo |
| † corpauthor | |
| † date | |
| † edition | |
| † editor | |
| Italic by default | |
| † emphasis | Supports role |
| † entry | Supports align, char, charoff |
| † epigraph | Works like blockquote |
| † example | Supports id |
| † figure | Supports id |
| † filename | |
| † firstname | |
| † footnote | |
| † footnoteref | Only on same page as the footnote |
| formalpara | |
| function | |
| † holder | |
| † honorific | |
| † imagedata | Supports align, depth, fileref, format, scale, scalefit, width |
| † imageobject | |
| index | Supports role |
| indexterm | Supports class, id, role |
| informalfigure | |
| † informaltable | Supports frame, colsep, rowsep |
| † inlinemediaobject | Treated as mediaobject |
| † issuenum | |
| † itemizedlist | Supports mark |
| † jobtitle | |
| † keyword | Ignored |
| † keywordset | Ignored |
| † legalnotice | |
| † lineage | |
| † lineannotation | Uses a small italic font |
| † link | |
| † listitem | |
| † literal | |
| † literallayout | Supports class |
| † mediaobject | |
| † note | Works like blockquote |
| † objectinfo | Ignored |
| † option | |
| † orderedlist | Supports numeration |
| † orgname | |
| † othercredit | |
| † othername | |
| † para | |
| † phrase | |
| preface | |
| primary | |
| † programlisting | Treated as screen |
| † pubdate | |
| † publishername | |
| † quote | |
| † releaseinfo | |
| † replaceable | Italic by default |
| † revdescription | |
| † revhistory | |
| † revision | |
| † revnumber | |
| † revremark | Ignored |
| † row | Supports rowsep |
| screen | |
| secondary | |
| † section | Supports id |
| † sectioninfo | Ignored |
| sectn | Treated as section |
| see | |
| seealso | |
| † sidebar | Works like blockquote |
| simpara | |
| † subject | Ignored |
| † subjectset | Ignored |
| † subjectterm | Ignored |
| subscript | No small font in titles; ignored in table of contents |
| † subtitle | For articles, chapters, appendixes, and indexes |
| superscript | No small font in titles; ignored in table of contents |
| † surname | |
| † systemitem | Ignored |
| † table | Supports frame, colsep, rowsep |
| † tbody | |
| † term | |
| tertiary | |
| † textobject | |
| † tfoot | |
| † tgroup | Supports align, char, charoff, cols, colsep, rowsep |
| † thead | |
| † title | |
| † titleabbrev | |
| † trademark | Ignored |
| † ulink | Supports url |
| † userinput | Monospaced by default |
| † variablelist | |
| † varlistentry | |
| varname | |
| † videodata | Ignored |
| † videoobject | Ignored |
| † volumenum | |
| † xref | Supports linkend |
| † year | |
Links in Sdop
Attribute id is only supported for chapter and section elements and is used for making links[, but then the documentation for links is inadequate].
If an index is generated, the page numbers it contains are made into clickable links. This and the generation of clickable links from xref elements are turned off by setting xref_links parameter to no. The colour of these links is by default the same colour as the surrounding text, but it can be explicitly specified by xref_rgb, as in <?sdop xref_rgb="0,0,1"?>.
Here is a full example:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="docbook.css"?>
<book>
<title>A sdop test file</title>
<para>...</para>
<chapter id='ch01'>
<title>First Chapter</title>
<para>...</para>
</chapter>
<chapter>
<title>Second Chapter</title>
<para>An xref: <xref linkend='ch01'/>...</para>
<para>A ulink: <ulink url='circularia.t15.org'/>...</para>
</chapter>
</book>
Transforming to EPUB, HTML, XHTML ...
There is a huge (16MB) package of stylesheets which on my Ubuntu machine live at /usr/share/xml/docbook/stylesheet/docbook-xsl. These are documented in DocBook XSL: The Complete Guide, Fourth (Web) Edition, by Bob Stayton, to whom I am indebted. This free guide, currently to be found at http://www.sagehill.net/docbookxsl/, explains everything you need to do all sorts of transformations to your DocBook source.
The examples use xsltproc, easily available on Linux. On Windows you may use Saxon, at http://saxon.sourceforge.net/, or Xalan, which runs on the Java Virtual Machine.
Transforming to EPUB
I use dbtoepub like this:
dbtoepub --css style.css book.xml
It takes about one minute per megabyte and produces a file book.epub. I don't think that the stylesheet supplied is followed, though. For instance, an element like <emphasis>modularity</emphasis> may have been transformed to <span class="emphasis"><em>modularity</em></span>, but my DocBook stylesheet has styles for element emphasis, not for element em inside element span with attribute class="emphasis".
Links Within the (XML) Document
The link target is identified through a DocBook 5 xml:id="your_identifier" attribute in a visible element. Then you can use a plain link element with a linkend attribute:
<link linkend="your_identifier">link_text</link>
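Both ends of such a link can be seen together in the sketch below. The section title and the sentence around the link are invented; only the xml:id and linkend pairing matters.

```xml
<article xmlns="http://docbook.org/ns/docbook" version="5.0">
  <title>Linking Example</title>
  <!-- The target: any visible element carrying an xml:id -->
  <section xml:id="your_identifier">
    <title>The Destination</title>
    <para>...</para>
  </section>
  <section>
    <title>Elsewhere</title>
    <!-- The source: linkend repeats the target's xml:id exactly -->
    <para>See <link linkend="your_identifier">link_text</link> for details.</para>
  </section>
</article>
```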
Transforming to HTML through XSLT
Transforming to a single HTML file
Just use the html/docbook.xsl. As this stylesheet does not name the output, use a switch as in
xsltproc --output myfile.html docbook.xsl myfile.xml
Processing part of a document
You may want to generate output for only part of a large DocBook document. For example, you might need just one book from a set document, or one chapter from a book document, etc. The DocBook XSL stylesheets have a parameter that lets you output part of a document.
There are two conditions that must be met:
- The content you want to output is contained in a single element. You cannot use this feature to output an arbitrary selection of elements.
- The selected element must have an id attribute on it.
You process the document as you normally would, but you set the stylesheet parameter rootid to the id attribute value of the element you want to process. For example, if you have a book with three chapters:
<book>
<chapter id="introduction">
...
</chapter>
<chapter id="installing">
...
</chapter>
<chapter id="administering">
...
</chapter>
</book>
You can generate an HTML file for the second chapter with a command like the following:
xsltproc --stringparam rootid "installing" --output chap2.html html/docbook.xsl myfile.xml
The chap2.html output file will contain just the second chapter. The entire document is still parsed, so there will not be significant savings of processing time, and the selected content is still processed within the context of the entire document. Any cross references to other chapters will be properly formed, but the links will not actually go anywhere because the targets are not included in the output.
Transforming to multiple HTML files (chunking)
Include the line: <?xml-stylesheet type="text/xsl" href="/usr/share/xml/docbook/stylesheet/docbook-xsl/html/chunk.xsl" ?> at the top of your document and process with an XSLT processor like xsltproc. Or better still, pass the stylesheet as a parameter to the XSLT processor (xsltproc or whatever). The default behaviour chunks up to sect1 elements. If you want to chunk the first sect1 sections too, include --stringparam chunk.first.sections 1 in your command line.
If you want the output to go into a chosen subdirectory, write a processing instruction just after the opening tag of the element like <?dbhtml dir="HowToMakeABook" ?>. (On the other hand, you don't want to specify an output file name.) You can also write processing instructions to determine how files are named by writing an analogous processing instruction following the opening tag of a chunkable element, like this:
<chapter><?dbhtml filename="intro.html" ?> <title>Introduction</title>
This way your XML file will get transformed into its component *.html files, corresponding to chapters, sections, etc., plus an index.html file.
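The two processing instructions can be sketched together in one small document. The titles are placeholders, and the exact output location is an expectation based on the description above rather than something verified here: the chapter chunk should be written as intro.html under the HowToMakeABook directory.

```xml
<book>
  <!-- Output directory for the chunks of this book -->
  <?dbhtml dir="HowToMakeABook" ?>
  <title>How to Make a Book</title>
  <chapter>
    <!-- File name for this chapter's chunk -->
    <?dbhtml filename="intro.html" ?>
    <title>Introduction</title>
    <para>...</para>
  </chapter>
</book>
```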