DIGITAL TALKING BOOK STANDARDS DEVELOPED BY NLS AND PARTNERS UNDER NISO AUSPICES
The following article appears in the journal "Library Hi Tech," Volume 19, Number 1, ISSN 0737-8831 and is used with permission.
By John Cookson, Michael Moodie, and Lloyd Rasmussen
John Cookson (email@example.com), Michael Moodie (firstname.lastname@example.org) and Lloyd Rasmussen (email@example.com) are all based at the National Library for the Blind and Physically Handicapped (NLS), The Library of Congress, Washington, DC, USA.
Keywords: Disabled people, Information technology, Standards, Blind people, Books.
Abstract: The functionality, compatibility, and longevity planned for future digital talking books require clear, exact definitions of component format and content. NLS will achieve this by working with a diverse team of experts to establish an applicable standard. This article outlines the plan, describes progress, and indicates what further work is necessary to complete the standard.
Electronic access: The research register for this journal is available at http://www.mcbup.com/research_registers. The current issue and full text archive of this journal are available at http://www.emerald-library.com/ft.
Under the auspices of the National Information Standards Organization (NISO), a standards developer accredited by the American National Standards Institute, NLS is leading a committee of experts in the development of a digital talking book (DTB) standard. For a more detailed discussion of why and how the standards process was begun, please see our CSUN 97 paper, titled: "Talking books: toward a digital model".
The committee has taken a general approach to a DTB standard to accommodate a wide variety of books, users, producers, and playback devices. There is interest in compatibility with commercial electronic books as well as in ease of interaction on an international basis. To this end the committee shares members with groups of similar interest, specifically the DAISY Consortium, the World Wide Web Consortium, and the Open eBook Forum. This perspective allows us to use rather than duplicate prior work and enhances the standard's prospects for longevity and support. The committee's membership may be found on the NISO Standards Committee site.
A DTB is envisioned to be, in its fullest implementation, a group of digitally encoded files containing an audio portion recorded in human speech; the full text of the work in electronic form, marked with the tags of a descriptive markup language; and a linking file that synchronizes the text and audio portions.
As this document illustrates, such a structure will allow the DTB user a broad range of capabilities not possible with cassette talking books. The standard uses concepts and components found in other open web-based standards, specifically, Open eBook, Synchronized Multimedia Integration Language (SMIL), and Extensible Markup Language (XML). For more details on these please see:
- Open eBook Forum.
- Synchronized Multimedia.
- Extensible Markup Language (XML).
The standard covers three different classes of players, from very simple to very sophisticated, and six different types of books, again from simple to complex. All text files use the ASCII character set. For an overview of the larger NLS digital audio development project, please see the companion article in this issue entitled "National library service for the blind and physically handicapped: digital plans and progress" (pp. 15-18).
The provisions of the NISO DTB standard are expressed in two different kinds of documents: normative and informative. Normative documents define the characteristics of a product required for standard compliance. Informative documents provide general information about the standard and recommend ways to achieve compliance. The informative documents presently consist of the following:
Prioritized list of features for digital talking book playback devices. As the name indicates, this document describes characteristics of DTB playback systems. It allows for three different types of players: very simple hand-helds, mainly for linear leisure reading; more complex portables, mainly for students and professionals; and user-supplied computer-based players capable of supporting the most sophisticated features. Features are prioritized as essential, highly desirable, or useful for each type of player, and include functions such as variable-speed reading, book-marking, and the ability to immediately access items listed in a table of contents. The full feature set may be found in the Digital Talking Book Standards Committee's document titled "Playback device guidelines."
Document navigation features list. Although the prioritized feature set mentioned above contains general navigation requirements, the topic is complex enough to motivate a separate explanatory document. The Navigation Features List describes mechanisms for immediate random access to selected areas of a book, and other capabilities such as searching, highlighting, excerpting, and skipping user-selected elements.
Structuring guidelines for digital talking books. The Structuring Guidelines document tells DTB producers how to put XML tags into text so that the relationships between components are properly represented. It suggests where tags from the allowable set should be inserted into a document and indicates the proper syntax. For example, <p> marks the beginning of a paragraph and </p> marks its end. Consult the document titled "DAISY structure guidelines" for more information.
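As a rough illustration of the kind of markup the guidelines describe, the following Python sketch parses a small tagged fragment, checks that its tags are balanced, and lists its paragraphs. Only the <p> tag comes from the text above; the <book> and <h1> element names and the sample sentences are assumptions made for this example and are not taken from the DTB tag set.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: only <p> is named in the article; <book> and <h1>
# are placeholder element names chosen for this sketch.
SAMPLE = """<book>
  <h1>Chapter 1</h1>
  <p>First paragraph of the chapter.</p>
  <p>Second paragraph of the chapter.</p>
</book>"""


def is_well_formed(xml_text):
    """Report whether the text parses, i.e. every tag is properly closed."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def paragraphs(xml_text):
    """Return the text of every <p> element, in document order."""
    root = ET.fromstring(xml_text)
    return [p.text for p in root.iter("p")]
```

A producer's tool could run a check like is_well_formed() before any further processing, since a fragment with an unclosed <p> cannot be navigated reliably.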
Open eBook Standard. We refer to this standard because it represents an industry effort to achieve compatibility among various playback devices and content from various producers. We recognize that for the widest and most enduring support it is advisable to converge on standards that dominate the consumer market. Moreover, eBook participants have a keen and authoritative interest in resolving difficult issues such as digital rights management (DRM) methods and metadata requirements. One section of the eBook standard that is of particular interest is the package file. This file lists the components of a given product and indicates various relationships among them. For example, the spine area of the package file lists product files in a logical linear reading order. It appears that with minor modifications, the package file specification, which is embodied in an XML document type definition (DTD), would be suitable for use in the DTB standard. The modifications would expand the allowed file types to include various audio formats.
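The role of the spine can be sketched in a few lines of Python. The fragment below assumes a package file with a manifest of items and a spine of references to them, which is the general shape of an Open eBook package; the ids, file names, and media types are invented for illustration, and the audio entry stands in for the kind of file type the proposed modifications would allow.

```python
import xml.etree.ElementTree as ET

# Hypothetical package file. The ids, hrefs, and media types are
# assumptions made for this sketch, not values from any specification.
PACKAGE = """<package>
  <manifest>
    <item id="intro" href="RevStdIntro.XML" media-type="text/xml"/>
    <item id="hist" href="RevStdHist.XML" media-type="text/xml"/>
    <item id="hist-audio" href="RevStdHist.MP3" media-type="audio/mpeg"/>
  </manifest>
  <spine>
    <itemref idref="intro"/>
    <itemref idref="hist"/>
  </spine>
</package>"""


def reading_order(package_xml):
    """Resolve the spine's references to file names, in linear reading order."""
    root = ET.fromstring(package_xml)
    # The manifest lists every component; index it by id for lookup.
    manifest = {item.get("id"): item.get("href") for item in root.iter("item")}
    # The spine names a subset of those components in reading order.
    return [manifest[ref.get("idref")] for ref in root.iter("itemref")]
```

In this sketch the audio file appears in the manifest but not in the spine, which shows how the list of components and the reading order are kept separate.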
The normative documents that comprise the standard presently consist of the following:
Digital talking book (DTB); Document type definition (DTD). This technical paper defines what XML tags are to be used to indicate the structure of a particular document and the proper syntax for their use. As with all DTDs, it is typically used by parser software to verify that target documents are "valid," i.e. that, in addition to being "well formed," they use only the tags and arrangements the DTD permits. A DTD is typically read only by a computer. The application of this DTD is the subject of the Structuring Guidelines. For a view of the DTD and its history, please see the document titled "Document type definition (DTD) for digital talking books."
DTB Bookmark DTD. This technical paper defines the structure, syntax, and content of a bookmark file. Bookmark files are portable files to be composed and read by various playback devices. They are designed to allow a user to set a large number of bookmarks or highlight many sections and attach text or audio labels to them. To ensure compatibility, it is necessary to define a standard format. This DTD is nearing completion. In service it would be directly used only by player software, not by the patron.
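Because the bookmark DTD is not yet final, the following Python sketch only illustrates the idea of a portable file that one player writes and another reads. The <bookmarks> and <bookmark> element names, the position attribute, and the position strings are all hypothetical and should not be read as the format the DTD will define.

```python
import xml.etree.ElementTree as ET


def build_bookmark_file(marks):
    """Serialize (position, label) pairs to a hypothetical bookmark document."""
    root = ET.Element("bookmarks")  # element names are assumptions
    for position, label in marks:
        mark = ET.SubElement(root, "bookmark", {"position": position})
        mark.text = label  # a text label attached by the user
    return ET.tostring(root, encoding="unicode")


def read_bookmark_file(xml_text):
    """Recover the (position, label) pairs written by build_bookmark_file()."""
    root = ET.fromstring(xml_text)
    return [(m.get("position"), m.text) for m in root.iter("bookmark")]
```

A round trip through both functions returns the original list, which is the property a standard format is meant to guarantee across different playback devices.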
DTB navigation control center for XML applications (NCX) DTD. This technical paper defines the structure, syntax, and content of a file called the Navigation Control Center, which a player uses to provide direct access to various areas of the book being read. The NCX is typically built by software and accessed directly only by player software. This DTD is nearing completion.
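To suggest how a player might use such a file, the Python sketch below walks a nested list of navigation points and looks up the target for a heading. Since the NCX DTD is still in draft, the <ncx> and <navPoint> element names, the label and target attributes, and the target strings are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical navigation file; the structure is an assumption for this sketch.
NCX = """<ncx>
  <navPoint label="Chapter 1" target="RevStd.SMIL#ch1">
    <navPoint label="Section 1.1" target="RevStd.SMIL#ch1s1"/>
  </navPoint>
  <navPoint label="Chapter 2" target="RevStd.SMIL#ch2"/>
</ncx>"""


def flatten(element, depth=0):
    """Yield (depth, label, target) for each navigation point, in reading order."""
    for child in element.findall("navPoint"):
        yield depth, child.get("label"), child.get("target")
        yield from flatten(child, depth + 1)


def find_target(ncx_xml, label):
    """Return the target for the first navigation point with this label."""
    for _, found_label, target in flatten(ET.fromstring(ncx_xml)):
        if found_label == label:
            return target
    return None
```

The depth value would let a simple player offer chapter-level jumps only, while a more capable player exposes every level.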
DTB package file DTD. This technical paper defines the structure, syntax, and content of a file called the package file. The concept and most of the details are borrowed from the Open eBook Forum. The package file would be built by software with producer intervention and accessed directly only by playback software. This DTD is posted on the Open eBook site and suggested modifications will be the subject of discussions at the eBook Forum.
DTB file specification. This technical paper defines the types of ASCII (text) and binary (audio and image) files that are allowed in a DTB. Most of the text files are of the XML type and consist of the following:
- Book text with tags added to indicate its structure, e.g. RevStd.XML.
- Package file to identify the DTB, list contents, include metadata, e.g. RevStd.OPF.
- SMIL file for fast access and synchronization of text with audio, e.g. RevStd.SMIL.
- NCX file to enable fast access to book components, e.g. RevStd.NCX.
- Bookmark file containing points of interest marked by the user, e.g. RevStd.BKM.
There is one other type of text file besides XML: CSS files that tell the player how to present the material to the user, e.g. RevStd.CSS.
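Of the text files listed above, the SMIL file is the one that ties text to audio. The Python sketch below reads a small SMIL-style fragment and builds a table from text fragments to audio clips. The par, text, and audio elements and the clip-begin and clip-end attributes follow SMIL 1.0 conventions, but the fragment identifiers, timings, and file pairings are invented for this example and do not come from the DTB specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical synchronization file; ids and timings are assumptions.
SMIL = """<smil>
  <body>
    <seq>
      <par>
        <text src="RevStd.XML#p1"/>
        <audio src="RevStdIntro.WAV" clip-begin="npt=0.0s" clip-end="npt=4.5s"/>
      </par>
      <par>
        <text src="RevStd.XML#p2"/>
        <audio src="RevStdIntro.WAV" clip-begin="npt=4.5s" clip-end="npt=9.0s"/>
      </par>
    </seq>
  </body>
</smil>"""


def seconds(clip_value):
    """Convert a value such as 'npt=4.5s' to a float number of seconds."""
    return float(clip_value.split("=", 1)[1].rstrip("s"))


def sync_table(smil_xml):
    """Map each text fragment to (audio file, start seconds, end seconds)."""
    table = {}
    for par in ET.fromstring(smil_xml).iter("par"):
        text, audio = par.find("text"), par.find("audio")
        table[text.get("src")] = (
            audio.get("src"),
            seconds(audio.get("clip-begin")),
            seconds(audio.get("clip-end")),
        )
    return table
```

With such a table, a player that has jumped to a paragraph in the text can start the narration at the matching point in the audio file.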
Binary files can be divided into two classes, audio and image. Audio files are as follows:
- PCM files represent audio as numeric samples, as music CDs do, e.g. RevStdFwd.WAV.
- ADPCM files are similar to PCM but more compact, e.g. RevStdIntro.WAV.
- MPEG files are very compact but have adequate fidelity, e.g. RevStdHist.MP3.