Liam Borgstrom (ed.)

As you have learnt, XML is a metalanguage[1], which makes it well suited to publishing: it can be used to catalogue and structure any sort of content. The idea of marking up content was addressed in PUB 210, where you were required to mark up a manuscript in your own custom code. A typesetter would then take that information, apply the necessary DTP markup to the text and format it accordingly. However, in a visual layout environment, any descriptors applied to the content have organisational value only to the human eye, and the formatting is limited to the program in use.

XML is a favoured format because of its interoperability between platforms: it is a universal standard, itself defined by standards. As such, it creates an ideal environment for sharing and editing content. Because it separates content from formatting, content can be edited and re-formatted simultaneously, allowing for more seamless integration between the initial and production phases of a publication’s life cycle.

Two basic principles apply to any digital workflow in which you want to publish to multiple formats:

  1. You need an interim format, one that can contain the bulk of the content and formatting and easily be converted to other formats.
  2. It is easier to use that interim format from the start than to back-convert to it.
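To make the first principle concrete, here is a minimal sketch in Python of one interim XML source being converted to two output formats. The tag names (`chapter`, `title`, `para`) and both conversion functions are invented for this illustration and do not belong to any standard; a production workflow would more likely use style sheets than hand-written functions.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical interim document: the content is stored once, with no formatting.
SOURCE = """<chapter>
  <title>XML workflows</title>
  <para>Content is stored once.</para>
  <para>Formatting is decided per output.</para>
</chapter>"""


def to_html(root):
    """Render the interim XML as a minimal HTML fragment."""
    parts = ["<h1>" + html.escape(root.findtext("title")) + "</h1>"]
    for para in root.findall("para"):
        parts.append("<p>" + html.escape(para.text) + "</p>")
    return "\n".join(parts)


def to_plain_text(root):
    """Render the same content as plain text with an upper-case title."""
    lines = [root.findtext("title").upper(), ""]
    lines.extend(para.text for para in root.findall("para"))
    return "\n".join(lines)


root = ET.fromstring(SOURCE)
html_output = to_html(root)
text_output = to_plain_text(root)
print(html_output)
print(text_output)
```

The source is never edited to suit either output: each format applies its own decisions to the same structured content, which is what makes simultaneous release possible.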

Based on what you’ve read above, you can see why basing a workflow on XML is a popular choice. We can distinguish between a workflow in which XML is used throughout the process (XML-in) and one in which XML-based formatting is applied after normal practices, as a conversion process (XML-out).


With the XML-out pattern, the publishing process can follow its traditional course. The ‘out’ in XML-out can often be read as ‘outsourced’, as backlist titles in particular must often follow this process. It is also an easy way for publishers to produce multiple digital formats without having to learn new skills. It makes particular sense where specialist skills are needed, such as for the creation of an app-book, as the capabilities for in-depth programming, animation and audio-video production are not necessarily available in the average publishing house. This sort of workflow is the simplest for the average publisher, as it doesn’t upset existing workflows; however, it does add costs, as new people are needed to perform the conversions.


With the XML-in pattern, the extra costs are not necessarily avoided, but greater control can be exerted over them, and the initial outlay is offset in the long term as both the speed and the efficiency of the publishing process improve. An XML-in workflow is done in-house and relies on the typesetting being done with XML. Because XML is an intermediary format, styling decisions can be made for each of the intended formats, and when the time for export comes, all formats can be released simultaneously. This can lengthen the typesetting stage, but (as with normal DTP publishing) many of the standards can be created beforehand. Because XML requires a well-defined styling language, the kind of ‘intuitive layout’ possible with graphical layout systems cannot be done, and so any standards set during the authoring phase are applied consistently throughout the document by the style sheet. Such a system also allows for a degree of automation: once a structure has been applied to the manuscript (by either the author or the editor), the style sheet need merely be attached and checked for errors.

An integrated publishing system

The whole process can be subverted to a large degree and made more collaborative by the use of an integrated platform. This is especially valid for commissioned works. While it requires a substantial change to the working environment, the working manuscript can cease to form linearly[2] and instead form dynamically[3]. To do this, the workflow is organised around a central file system (often a piece of software), which splits the publishing functions for the manuscript into individual components:

  • Authoring (preferably structured)
  • Editing (revisions are tracked and archived)
  • Styling (style rules are applied for each content type)
  • Multimedia (files are linked to the manuscript, but edited and produced externally)

This workflow can have a steep learning curve, depending on the software used. It upsets the traditional order of the publishing process, as the structural decisions for the document must be made before authoring can occur (with commissioned works) or must be adapted to fit the manuscript (with uncommissioned works). The reliance on mark-up languages means that every document must have an associated ‘document type’[4]. While these types can be custom-made fairly simply, this extra step need not be repeated for every project, as one document type can be reused for multiple projects.

Document types and schemata

XML’s ability to create languages means that XML by itself can be impossible to work with. In an environment where anything goes, who’s to say what doesn’t? You are.

Each XML document in such a workflow needs a reference to either a Document Type Definition (DTD) or an XML Schema[5] (XSD) against which it can be validated. These two formats define exactly what is possible in an XML document: which tags are accepted, and where they are allowed to go. Many standard DTDs exist to define particular types of documents, such as DocBook (technical documents), DITAmap (topic guides) and the OPF and NCX (for ePub).

For other types of documents, proprietary or bespoke DTDs can be made to suit the specific needs of the document in question, for example a DTD for a recipe.
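As a rough illustration of what a recipe document type might contain, the Python sketch below pairs a small DTD with a hand-written check that restates its rules. Everything here is an assumption made for the example: the element names (`recipe`, `title`, `ingredient`, `step`), the `RULES` table and the `is_valid` function are hypothetical, and the check is a simplified stand-in for a validating parser, since Python's standard library does not validate against DTDs.

```python
import xml.etree.ElementTree as ET

# Hypothetical recipe DTD, kept here for reference only: the standard library
# parser does not validate against DTDs, so RULES restates it by hand.
RECIPE_DTD = """
<!ELEMENT recipe (title, ingredient+, step+)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT ingredient (#PCDATA)>
<!ELEMENT step (#PCDATA)>
"""

# For each container tag: the child tags it accepts, in order, and whether
# each child may repeat. Every listed child must appear at least once.
RULES = {"recipe": [("title", False), ("ingredient", True), ("step", True)]}
# Tags that may hold text but no child elements.
TEXT_ONLY = {"title", "ingredient", "step"}


def is_valid(element):
    """Return True if this element and its descendants follow the rules above."""
    children = list(element)
    if element.tag in TEXT_ONLY:
        return not children
    if element.tag not in RULES:
        return False  # unknown tag: not part of this document type
    position = 0
    for tag, repeatable in RULES[element.tag]:
        count = 0
        while position < len(children) and children[position].tag == tag:
            count += 1
            position += 1
        if count == 0 or (count > 1 and not repeatable):
            return False
    # No unexpected children may remain, and each child must itself be valid.
    return position == len(children) and all(is_valid(c) for c in children)


GOOD = ("<recipe><title>Tea</title><ingredient>Water</ingredient>"
        "<ingredient>Tea leaves</ingredient><step>Steep.</step></recipe>")
# Same tags, but a step appears before the ingredients.
BAD = ("<recipe><title>Tea</title><step>Steep.</step>"
       "<ingredient>Water</ingredient></recipe>")

good_ok = is_valid(ET.fromstring(GOOD))
bad_ok = is_valid(ET.fromstring(BAD))
print(good_ok, bad_ok)  # True False
```

Both documents are well-formed XML; only the first follows the document type. That distinction between "well-formed" and "valid" is what a DTD or XSD adds.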

A DTD (or XSD) can be crafted to describe whatever type of content is necessary, and can have varying levels of granularity[6], which determines to what level the content can be structured. That is, is a paragraph just a paragraph, or is it a container for emphasised characters, links, images, footnotes…?
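The difference in granularity can be seen by marking up the same sentence twice. In this hedged sketch the tag names (`para`, `link`, `emphasis`) and the `depth` function are invented for the example; depth of nesting is used only as a simple proxy for granularity.

```python
import xml.etree.ElementTree as ET

# Coarse granularity: a paragraph is just a paragraph.
COARSE = "<para>See the glossary for details.</para>"
# Finer granularity: the paragraph is a container for further elements.
FINE = "<para>See the <link><emphasis>glossary</emphasis></link> for details.</para>"


def depth(element):
    """Count the nesting levels at and below this element."""
    return 1 + max((depth(child) for child in element), default=0)


coarse_root = ET.fromstring(COARSE)
fine_root = ET.fromstring(FINE)

coarse_depth = depth(coarse_root)
fine_depth = depth(fine_root)

# The words themselves are identical; only the structure around them differs.
same_text = "".join(coarse_root.itertext()) == "".join(fine_root.itertext())
print(coarse_depth, fine_depth, same_text)  # 1 3 True
```

The finer markup costs more effort at the authoring or editing stage, but it gives the style sheet more to work with: a link or emphasised phrase can only be styled separately if the document type allows it to be tagged.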

  1. A language for describing languages
  2. Gaining value stage by stage
  3. With all stages occurring more or less simultaneously
  4. or XML schema
  5. pattern
  6. Granularity: the depth of definition within an XML document. This relates to a hierarchical model in which all the elements that should appear in a given document are defined. The greater the granularity, the deeper the hierarchy.


Publishing in the Digital Environment Copyright © 2013 by Liam Borgstrom (ed.). All Rights Reserved.