Accessible HTML Production Using DAISY and NIMAS: Best Practices

CAST, Inc.; 40 Harvard Mills Square #3; Wakefield, MA 01880
(781) 245-8547; TTY (781) 245-9320; Fax (781) 245-5212


DAISY and NIMAS in HTML
Guide to Accessible HTML Production for DAISY 3 and NIMAS 1.1

Prepared by Valerie Hendricks

This report was updated with support from the AIM and NIMAS centers, cooperative agreements
between CAST and the U.S. Department of Education, Office of Special Education Programs (OSEP),
cooperative agreement nos. H327T090001 and H327P090001. The opinions expressed herein do not
necessarily reflect the policy or position of the U.S. Department of Education, Office of Special
Education Programs, and no official endorsement by the Department should be inferred.

Table of Contents

Introduction
DAISY, NIMAS, and HTML
    Comparing the Standards
    Document Structure
    Mark-Up: Elements and Attributes
    Mark-Up: Separation of Content and Presentation
The Need for Accessible HTML Materials
HTML Production Using DAISY and NIMAS
    Review Source Content
    Edit Source Content
    Presentation: Formatting, Navigation, Rendering
    Testing and Review of HTML Conversion
References
Resources
Acknowledgments

Introduction
The National Instructional Materials Accessibility Standard (NIMAS) is a technical standard used to
produce source files in XML format that may be used to develop multiple specialized outputs in a
variety of formats for students with print disabilities. The XML and image source files of a NIMAS
fileset can be used to create Braille, large print, HTML, DAISY digital talking books (DTBs) using
human voice audio or text-to-speech synthetic audio, and more. (The NIMAS applies to instructional
materials published [available for purchase or in print] on or after July 19, 2006.)

The NIMAS is a subset of the DAISY Standard. DAISY stands for Digital Accessible Information
SYstem and refers to an international standard for creating a variety of DTBs: digital books that are a
combination of synchronized text and audio. The DAISY specification is made up of internationally
agreed-upon rules and requirements necessary to create digital and audio books, including XML and
SMIL file requirements and structural and other aspects.

This guide is based on the DAISY 2005-3 and NIMAS 1.1 technical specifications.

Both DAISY and the NIMAS use XML and therefore XML’s most important asset: the separation of
content from its presentation. Using content’s structure and components as a source, a variety of
outputs or products with very different format, layout, presentation, and features can be created. The
results may then be used to support a diverse group of learners and users with print disabilities.

This guide is intended to assist publishers and other producers of accessible instructional materials
(AIM) to create accessible HTML outputs sourced from NIMAS filesets. Several of the elements of the
DAISY Standard and the NIMAS technical specification do not have HTML equivalents. This guide
was prepared to address the question of how DAISY/NIMAS elements without direct corresponding
elements can be converted into HTML and to cover important aspects of a NIMAS to HTML
conversion project.

DAISY, NIMAS, and HTML
Comparing the Standards
While DAISY, the NIMAS, and HTML share many fundamental characteristics, they also differ in
significant ways. Requirements for DAISY files, NIMAS filesets, and HTML documents and some
useful information for comparison are outlined below.

An HTML document consists of one or more HTML files and, optionally, additional files such as
stylesheets, images, or JavaScript files. HTML elements are usually grouped into three main kinds:

  • those that describe the structure and the organization of content components (such as
    paragraphs and headings)
  • those that describe the format and layout of the content (such as emphasis [<i>, <b>] and
    spacing [<colgroup>, <sup>])
  • those that link one part of content to another or to outside content (such as a link to a
    glossary or a link to a web page)

Semantic HTML refers to HTML code that describes content rather than presentation. HTML was
originally intended to be semantic, so writing it this way is a best practice. It is also consistent
with the NIMAS, which requires the separation of content and presentation, and current HTML
practice is moving back toward semantic mark-up. HTML is most compatible with multiple browsers
and with assistive technology (AT) when it strictly follows the HTML standards and the semantic
definitions of the HTML elements.

A DAISY fileset consists of four or more files: an (optional) XML text content file, an OPF file
(metadata), an NCX file (navigation), a SMIL file (synchronization), and (optionally) one or more audio
files. Image files are often present as well.

Within the XML text content file, DAISY elements are classified as major structural, block, and inline
elements. Major structural elements include levels (<level1> through <level6>), <frontmatter>, <bodymatter>,
and <rearmatter>. Examples of block elements include <list>, <p>, and <sidebar>. Examples of
inline elements include <a>, <pagenum>, and <span>.

DAISY filesets often have many files for two fundamental reasons: to support enhanced
navigation, and to support synchronization of text and audio. The NCX, for example, can contain
information that supports navigation by heading and by page, as well as information to allow direct
navigation to tables, figures, sidebars, and other content within the book. DAISY filesets can contain
textual content only, audio content only, or a mixture of the two. In the latter instance, a mechanism is
needed to allow text to be displayed by a DAISY reader while an audio version of that same text is
being played. Readers capable of processing DAISY DTBs provide synchronized playback of mixed
text/audio books using information in SMIL files; others can provide audio versions of textual content
using only automatically-generated synthesized speech.

A NIMAS fileset consists of three or more files: an XML source file, an OPF file (metadata), one or
two PDF files (print work title and copyright page[s]), and any image files present in the print source.

The NIMAS Technical Specification contains a Baseline Element Set that is made up of a relatively
short list of elements meant to cover all of the basic, but no more than the basic, needs of publishers
and other producers of educational works, primarily textbooks. The entire remaining list of elements
available in the DAISY standard is included in the NIMAS as optional elements. Producers are
encouraged to use optional elements where appropriate.

When creating an HTML conversion from a DAISY or NIMAS source, typically the only files used will
be the XML source file and all of the image files.
Document Structure
In many print works, particularly textbooks and other structured works, it is often important to
indicate that certain segments of text or other content share something beyond their basic content
type (such as “text” or “image”): for example, a common meaning, a relationship to each other, or a
shared appearance. Properly structured documents are critical for accessibility. In DAISY, the
NIMAS, and HTML, these aspects of a work are addressed by structure and mark-up. (Presentation
aspects such as format, layout, and style are added at a later stage.) Creating structure and
mark-up that accurately reflect the content of the work in question is the goal of both DAISY and
NIMAS source files and of HTML outputs.

Structural Elements
In DAISY, level elements are perhaps the most fundamental of structural elements. The following
example provides some additional consideration of this basic element.

<level1 class="part">
        <h1>Part 1</h1>
        <p>This is an introductory paragraph for Part 1.</p>
        <level2 class="chapter">
                <h2>Chapter 1</h2>
                <p>This paragraph is part of Chapter 1.</p>
        </level2>
        <level2 class="chapter">
                <h2>Chapter 2</h2>
                <p>This paragraph is part of Chapter 2.</p>
        </level2>
</level1>

The example above shows a top-level section of a textbook identified using <level1 class="part">, which
has two sections identified using <level2 class="chapter">. These levels could have other content
components (nested properly) within, such as additional chapters, sidebars, block quotes, etc.

<level1 class="part">
  <h1>Part 1</h1>
  <level2 class="chapter">
    <h2>Chapter 1</h2>
    <level3 class="lesson">
      <h3>Lesson 1</h3>
    </level3>
    <level3 class="lesson">
      <h3>Lesson 2</h3>
    </level3>
    <level3 class="lesson">
      <h3>Lesson 3</h3>
    </level3>
  </level2>
</level1>

The example above illustrates a structural organization of content components (part, chapter, and
lessons) and corresponding possible mark-up in DAISY or NIMAS XML.
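
As a rough illustration of how a level structure like this might carry over to HTML, the following Python sketch renames each level element to a <div> and keeps its class attribute. It is only a sketch under stated assumptions: the sample has no XML namespace (real DAISY and NIMAS files usually declare one), the function name levels_to_divs is hypothetical, and the same step could equally be done in XSLT or another tool.

```python
import xml.etree.ElementTree as ET

# Small sample in the style of the examples above (no namespace, by assumption).
SOURCE = """<level1 class="part">
  <h1>Part 1</h1>
  <level2 class="chapter">
    <h2>Chapter 1</h2>
    <level3 class="lesson"><h3>Lesson 1</h3></level3>
    <level3 class="lesson"><h3>Lesson 2</h3></level3>
  </level2>
</level1>"""

# <level> plus <level1> through <level6>
LEVEL_TAGS = {"level"} | {f"level{n}" for n in range(1, 7)}


def levels_to_divs(root):
    """Rename every level element to <div>, keeping its attributes."""
    for element in root.iter():
        if element.tag in LEVEL_TAGS:
            # Keep an existing class; otherwise record the original tag name.
            element.set("class", element.get("class", element.tag))
            element.tag = "div"
    return root


html_root = levels_to_divs(ET.fromstring(SOURCE))
print(ET.tostring(html_root, encoding="unicode"))
```

Because the headings (<h1> through <h6>) already exist in HTML, only the level wrappers need to change in this sketch; the nesting of the source is preserved as nested <div> elements.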
Mark-Up: Elements and Attributes
DAISY, the NIMAS, and HTML share a number of common document structures (structural
components), but may implement them differently. Two representative examples where
DAISY/NIMAS and HTML have conceptual differences are outlined as follows:

HTML has a <title> element for designating the title of a document. It is intended to signify the
title of the work as a whole and is placed in the <head> section at the top of an HTML document.

DAISY and the NIMAS have the elements <doctitle> and <covertitle>. The <doctitle> element serves to
indicate the official, formal title of the entire document and is placed at the top of an XML document.
The <covertitle> element contains the title that appears on the cover of the work, which often differs
from a work’s full title.

<doctitle>Jane Eyre: An Autobiography</doctitle>
<covertitle>Jane Eyre</covertitle>

<doctitle>Through the Looking Glass, and What Alice Found There</doctitle>
<covertitle>Through the Looking Glass</covertitle>

HTML uses the <caption> element to indicate the caption of a table. It appears only once per table and
must be placed at the top of the table mark-up, just after the opening <table> tag.

DAISY and the NIMAS also use the <caption> element to indicate the caption of a table, and, when so
used, must be coded in the same way. However, <caption> is also used within the <imggroup> element.
When used with images, additional options and placements are available.

<imggroup>
<img id="u01.c04.011" src="./images/unit1/chapter4/img001.jpg" alt="Photo of a panda eating bamboo"/>
<caption imgref="u01.c04.011">The diet of the Giant Panda consists of up to 90% bamboo</caption>
</imggroup>

Mark-Up: Separation of Content and Presentation
One of the main reasons that a DAISY or a NIMAS fileset is a source file, appropriate for use in
creating a wide variety of outputs, is the fact that content and its presentation are separated. This
separation is crucial to the usefulness of a source file for the creation of specialized formats.
However, it is still almost always necessary for output conversions to provide a styled and formatted
finished product. One of the most common ways to include information about display details for HTML
outputs is the use of CSS. The DAISY Consortium distributes a basic, default stylesheet for use with
DAISY files meant to be used online, with a browser. Such a stylesheet may be modified for HTML
display purposes. Note that NIMAS filesets and, in many cases, DAISY filesets do not, strictly
speaking, contain stylesheets; the use of CSS is described here as an appropriate addition for HTML.

The Need for Accessible HTML Materials
HTML is a format widely used around the world and supported by a vast array of hardware, software,
and web-based applications. Much of the available freeware, shareware, public domain content, and
other no-cost products and resources support the HTML format. In addition, virtually everyone with
access to computers and to the Internet is thoroughly familiar with and comfortable using HTML.
Given HTML’s worldwide distribution, use of this format to provide accessible instructional materials is
logical, efficient, and allows enormous variety of delivery and use. Additional considerations include
use of many of HTML’s advantages, associating additional supplementary materials, integrating
additional components such as MathML or MusicML content, and secondary concerns such as user
preference for HTML and the relative ease of HTML conversion from a DAISY or NIMAS source.

However, HTML in and of itself is not necessarily an accessible medium. Creation of an accessible
HTML document requires the inclusion of additional components, such as information that allows
effective navigation (such as hyperlinks), the provision of alternative text (alt attributes and long
descriptions) for images, and ensuring that the structure of a source work is adequately preserved
and portrayed. (One or more of these additional components may already be present in DAISY or
NIMAS source files, allowing for an easier conversion to HTML.)

Web sites/pages devoted to accessible HTML:

  • Creating Accessible HTML
  • HTML Techniques for Web Content Accessibility
  • Web Standards Project Accessible HTML/XHTML Forms

For additional information regarding the instructional need for accessible HTML, please refer to the
following resources:

  • Accessible Media: Text
  • Accessible Textbooks in the K–12 Classroom II
  • All About AIM

HTML Production Using DAISY and NIMAS
Thanks to the power of XML, used for mark-up of content by both standards, the conversion process
from DAISY or NIMAS to HTML is fairly straightforward. The process includes the following:

      • Review the source content
          o check to be sure the source files are valid XML, conformant to either the DAISY
              standard or to the NIMAS, and use consistent mark-up that is accurate to the print
              source throughout (identify any structural or content flaws for correction)
          o check the source files for missing components (for example, missing images)
          o check the structure and components of the source for unique or seldom-used
              organization, parts, or sequence
      • Edit the source content if needed and/or desired
          o correct any errors or inconsistencies found during review
          o address any missing parts issues
          o address unusual or challenging pieces
          o choose alternate mark-up for elements without an HTML equivalent
          o create any additional navigation to be included in the work (for example, attributes or
      • Convert the source content
          o to HTML format
          o and create any visual presentation additions for the work (for example, a CSS
      • Test and review the HTML conversion
          o verify content, structure, styling, navigation, accessibility, etc.
          o check to be sure that the conversion can be used with the applications, tools, etc., for
              which it was made
          o validate HTML code (see below for validation resources)

In the sections that follow, each of these steps is discussed in more detail and information included
regarding specific issues and strategies for resolving them. It’s important to note that not all of the
issues identified in these sections may be present or need to be addressed in any one particular
HTML conversion, nor is coverage exhaustive. However, using these guidelines will help produce an
HTML output that is readable, navigable, and useful.

Review Source Content
Prior to conversion, it’s important to review source content to ensure that it can be used to produce
accessible HTML. There are a number of checks and enhancements that can be performed prior to
the actual conversion that will help ensure the resulting HTML conversion is of high quality and is
useful to the widest variety of users.
Complete, Accurate Filesets
Check to be sure the source files are valid XML and are conformant to either the DAISY standard or
to the NIMAS. Many XML editors include a validation feature that will check an XML file for
well-formedness and against a DTD listed in the document type declaration. XML editors that can
validate a source file include XMLSpy, Dreamweaver, oXygen, and Stylus Studio. Online validators
include the W3C’s Mark-Up Validation Service, Theano GmbH’s XML Validation, and Edinburgh
Language Technology Group (LTG)’s XML well-formedness checker and validator.

Review the source file to identify any structural or content flaws for correction. Make sure that an
actual image file is present for each <img> element in the XML, and that the filename path
corresponds to the correct image. Review the source file for components such as one-time features
or complex pieces. Check for missing or duplicated content.
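
Two of the checks described above, well-formedness and the presence of an image file for each <img> element, can be scripted. The following is a minimal sketch using only the Python standard library; the function name check_source and the demonstration fileset are hypothetical. Note that the standard-library parser checks well-formedness only; it does not validate against the DAISY or NIMAS DTD, so a validating editor or service is still needed for conformance.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def check_source(xml_path):
    """Return a list of problems found in the XML source file."""
    try:
        tree = ET.parse(xml_path)
    except ET.ParseError as error:
        return [f"not well-formed: {error}"]
    base_dir = os.path.dirname(os.path.abspath(xml_path))
    problems = []
    for element in tree.iter():
        # Compare local names so namespaced and plain files both work.
        if element.tag.rsplit("}", 1)[-1] != "img":
            continue
        src = element.get("src")
        if not src:
            problems.append("img element without a src attribute")
        elif not os.path.isfile(os.path.join(base_dir, src)):
            problems.append(f"missing image file: {src}")
    return problems


# Demonstration with a throwaway fileset: one image present, one missing.
with tempfile.TemporaryDirectory() as folder:
    os.makedirs(os.path.join(folder, "images"))
    open(os.path.join(folder, "images", "img001.jpg"), "wb").close()
    source = os.path.join(folder, "book.xml")
    with open(source, "w", encoding="utf-8") as handle:
        handle.write('<book><img src="./images/img001.jpg" alt="a"/>'
                     '<img src="./images/img002.jpg" alt="b"/></book>')
    demo_problems = check_source(source)

print(demo_problems)
```

The check only confirms that a file exists at each path; whether the path points to the correct image still has to be reviewed by eye.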
Consistent, Correct Structure
Check the source file to be sure that structural mark-up is consistent throughout a work and
accurately reflects its source. Examples of structural errors or mismatches to repair include:

  • Content components of the same kind are marked up differently
       o Example: A Q&A sidebar that appears at the end of each chapter is marked up one way in
         one chapter, and another way in other chapters
       o Example: Recurring margin notes marked up differently each time they appear
  • Content components are not marked up according to the source work
       o Example: Page breaks in incorrect locations
       o Example: Sidebar content marked up as a paragraph
  • Excessive use of <sidebar>, <p>, or other elements that do not correspond to content type

It would not be possible to list all potential inconsistencies or errors that might be found in a source
fileset, but these items concern the entire work as a whole and should go a long way to ensuring that
a file is of high quality and accurately reflects its print source.
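
One inexpensive way to look for the first kind of inconsistency is to list every class value that appears on more than one kind of element. The sketch below assumes that recurring components are labeled with class attributes, which a given fileset may or may not do; the sample mark-up and the function names are hypothetical, and a flagged class is a prompt for manual review, not proof of an error.

```python
from collections import Counter
import xml.etree.ElementTree as ET

# Hypothetical sample: the same Q&A feature is a <sidebar> twice and a <p> once.
SAMPLE = """<book>
  <level1 class="chapter"><sidebar class="qa"><p>Q and A</p></sidebar></level1>
  <level1 class="chapter"><sidebar class="qa"><p>Q and A</p></sidebar></level1>
  <level1 class="chapter"><p class="qa">Q and A</p></level1>
</book>"""


def class_usage(root):
    """Map each class value to a Counter of the element names that carry it."""
    usage = {}
    for element in root.iter():
        for name in element.get("class", "").split():
            usage.setdefault(name, Counter())[element.tag] += 1
    return usage


def inconsistent_classes(root):
    """Return the class values that appear on more than one kind of element."""
    return {name: dict(tags)
            for name, tags in class_usage(root).items()
            if len(tags) > 1}


report = inconsistent_classes(ET.fromstring(SAMPLE))
print(report)
```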
Edit Source Content
Editing of the source files can be done with any one of a number of XML editors or even in a text
editor such as Notepad. Using XML editing software has several advantages over hand-coding: most
programs correct errors as they are typed or on demand, save time by auto-completing tags and
attributes, and often include sophisticated search, find, and replace features.


[Figure: XML in Notepad]

The vast majority of producers will prefer to edit DAISY or NIMAS source files in a free or
commercially available XML editor. However, there is one rare instance where it may be pragmatic to
convert the source file to RTF as part of the conversion process. Specifically, Braille transcribers
often find that the use of RTF to help create Braille products is a significant aid because the use of
this format means a very large file can be divided into smaller, more manageable pieces. Further, the
longstanding practice of using RTF for Braille conversion means that experienced Braille transcribers
work efficiently and comfortably in this format. Converting to RTF may also be useful to those who
must produce an HTML output without an understanding of the DAISY or NIMAS specifications or of
XML. RTF can be used beneficially in these select cases.

A useful and high quality tool that can convert DAISY or NIMAS filesets to RTF is the online
conversion tool at the TechAdapt Accessible Media Center, or TAMC. The TAMC converts XML into
RTF in just a few seconds and offers a range of choices for conversion and output details. The DAISY
Pipeline converts files via its “DTBook to RTF” conversion script. The DAISY Pipeline can also
convert an edited RTF file back again to a DAISY DTB. Note that the Pipeline requires Java Runtime
Environment version 5 or later, while the TAMC requires version 1.6 or later.

[Figure: TAMC screenshot]

[Figure: DAISY Pipeline screenshot]

Elements in DAISY and NIMAS without HTML Equivalents
Many DAISY and NIMAS elements have equivalents in HTML that make conversion very
straightforward. Others have near matches or are well known and can be handled during a
conversion process without difficulty. However, there are elements in the NIMAS and in the DAISY
standard that are not present in or recognized by HTML. It is necessary to address these in any
conversion from a NIMAS fileset to an HTML output, whether by editing the XML file or by script.
Many of the more important examples are listed in the comparison table below.

Comparison of Elements in DAISY and NIMAS without HTML Equivalents

DAISY 3 & NIMAS 1.1 Baseline Element Set             HTML 4.01 & XHTML 1.1
Specification        Element                         Example possible replacement
-------------------  ------------------------------  ----------------------------------
DAISY 3              <annoref>, <annotation>         <a>, <div>
DAISY 3 & NIMAS 1.1  <author>                        <span>
DAISY 3 & NIMAS 1.1  <bodymatter>                    <div> (no explicit parallel)
DAISY 3              <book>                          <body>, <div> (no explicit parallel)
DAISY 3              <bridgehead>                    <h2>, <h3>, <h4>
DAISY 3              <byline>                        <span>
DAISY 3              <covertitle>                    <span>
DAISY 3              <dateline>                      <span>
DAISY 3 & NIMAS 1.1  <docauthor>                     <span>
DAISY 3 & NIMAS 1.1  <doctitle>                      <p>, <span>
DAISY 3 & NIMAS 1.1  <dtbook>                        <body>
DAISY 3              <epigraph>                      <p>, <span>, <div>
DAISY 3 & NIMAS 1.1  <frontmatter>                   <div>, <p> (no explicit parallel)
DAISY 3 & NIMAS 1.1  <hd>                            <h1>, <h2>, <h3>, <h4>
DAISY 3 & NIMAS 1.1  <imggroup>                      <div>
DAISY 3 & NIMAS 1.1  <level>/<level1>–<level6>       <div>, <p> (or new file)
DAISY 3 & NIMAS 1.1  <lic>                           <li>, <span>
DAISY 3 & NIMAS 1.1  <line>/<linegroup>/<linenum>    <span>; <p> and <br/>
DAISY 3 & NIMAS 1.1  <list>                          <ol>; <ul>
DAISY 3 & NIMAS 1.1  <note>/<noteref>                <span>, <div>
DAISY 3 & NIMAS 1.1  <pagenum>                       <span>
DAISY 3              <poem>                          <p>, <span>, <div>
DAISY 3 & NIMAS 1.1  <prodnote>                      *See explanation below
DAISY 3 & NIMAS 1.1  <rearmatter>                    <div>, <p> (no explicit parallel)
DAISY 3              <sent>                          <span>
DAISY 3 & NIMAS 1.1  <sidebar>                       <p>, <div>
DAISY 3              <w>                             <span>

*The <prodnote> element is used for a producer’s note regarding content. Currently, it is most often used to describe
complex images for which an alt attribute is inadequate. The render attribute is required and must be set to either
“required” or “optional.” Each <prodnote> should be associated with the content to which it refers. See the Alternative
Text section below for more information about how to convert the <prodnote> element to HTML.
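
When the replacement is a simple rename, the comparison table can be applied by script. The sketch below is one possible approach, not a prescribed method: it covers only a handful of rows, picks a single replacement where the table offers several, and records the original element name in the class attribute so that CSS can still target it. The REPLACEMENTS mapping, the function name replace_elements, and the sample are hypothetical, and the sample has no XML namespace. Attributes are copied unchanged, so any that are not valid in HTML would still need attention.

```python
import xml.etree.ElementTree as ET

# A subset of the comparison table, with one replacement chosen per element.
REPLACEMENTS = {
    "bodymatter": "div",
    "frontmatter": "div",
    "rearmatter": "div",
    "imggroup": "div",
    "sidebar": "div",
    "author": "span",
    "docauthor": "span",
    "sent": "span",
    "w": "span",
}

SAMPLE = """<bodymatter>
  <sidebar class="qa"><p>Text <sent>One sentence.</sent></p></sidebar>
</bodymatter>"""


def replace_elements(root, replacements=REPLACEMENTS):
    """Rename mapped elements and add the old name to the class attribute."""
    for element in root.iter():
        html_tag = replacements.get(element.tag)
        if html_tag is None:
            continue  # element already exists in HTML or is handled elsewhere
        classes = element.get("class", "").split()
        if element.tag not in classes:
            classes.insert(0, element.tag)
        element.set("class", " ".join(classes))
        element.tag = html_tag
    return root


converted = replace_elements(ET.fromstring(SAMPLE))
print(ET.tostring(converted, encoding="unicode"))
```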

Examples of alternate mark-up for HTML

<span class="annoref">
<a class="annoref">
<div class="annotation">

<span class="author">
<span class="author-name">

<div class="bodymatter">
<div class="main-content">

<div class="imggroup">
<img id="p019-001" src="./images/U01C01/p019-001.jpg" alt="Map of southwest Asia with Israel highlighted and an inset photo of Israel"/>
<p class="caption" id="c019-001">Place: Israel has a dry climate in the south, and a wetter climate in the north, with
prosperous farms and thriving cities.</p>
</div>

Alternate mark-up for the <lic> element is conceptually divided into two categories due to the nature of
content that goes to make up lists. The first can be thought of as regular list content, such as a top
ten list, that generally addresses one straightforward idea. The second is more complex, and can be
thought of either as referring content or as content with multiple parts (two or more pieces),
depending on its nature. The use of <lic> is inappropriate for straightforward, one-concept lists, so
alternate mark-up for this kind (when encountered) is comparatively simple: just drop the <lic>.
Mark-up for the second, more complex sort of list requires more thought and editing.

Examples of lists with more complex content that may be coded using <lic> in XML and would
need alternate mark-up in HTML:

Alternate mark-up for more complex lists, such as tables of contents, indices, catalogues, etc.,
sometimes poses a bit of a challenge. Descriptive examples for the most commonly found list of this
type, the table of contents (TOC), are outlined below.

Example of a basic TOC entry in XML:

<lic class="entry-title">Chapter 1</lic>
<lic class="pagenum">5</lic>

Sample HTML:

<li class="entry-title">Chapter 1</li>
<li class="pagenum">5</li>

Sample HTML with a linked page number:

<li class="entry-title">Chapter 1</li>
<li class="pagenum"><a href="/unit2/chapter1/U2CH1page005">5</a></li>

Note that it is crucially important never to combine the two halves of an entry of a table of contents,
index, or other list that contains location information. Text and page numbers must not be coded as if
they were one piece of content or one piece of information.

In an index or TOC, for example, an entry is made up of two parts. One half of the entry contains text,
and the other half contains a page number (or, potentially, other location information). This two-part
form follows centuries of established usage and is what users expect. Coding this kind of list’s
entries correctly also enables searching and navigating content by text (such as a search by chapter
title) and by page/location.

CSS can be used to create the traditional look of a table of contents. For example:

li.part, li.hdr1, li.chap, li.sect, li.hdr2, li.hdr3 {
    float: left;
    width: 80%;
    /* padding-right: 400px;
       padding-left: 50px; */
}
/* assumed selector for the page-number half of each entry */
li.pagenum {
    float: left;
    width: 20%;
}

Would be rendered in a browser as:

[Figure: the table of contents as rendered in a browser]

Enhancing Navigation

One of the major strengths of the DAISY file format is its excellent support for navigation. Using a
combination of hyperlinks, skippable/escapable structures, and a Navigation Control File (NCX), a
well-marked DAISY fileset can enhance an end user’s reading experience by making it easy to find
and navigate to different content sections.

However, DAISY and NIMAS filesets produced by third-party conversion services often do not
leverage DAISY/NIMAS or HTML navigation support features, and the strengths of the DAISY options
are often lost on conversion to HTML. To retain these options and features, HTML code must be
added to provide navigational links, menus, TOCs, etc.

Value-added features such as linking content components can enhance navigation of an HTML
conversion and greatly improve the product overall. One of the most useful and important examples
of enhanced navigation is the addition of links to a TOC and, if only one additional feature is to be
added to a conversion for accessibility, the creation of a linked TOC is the best option.

Tables of contents

correct mark-up
Tables of contents (TOCs)—as well as other lists of a similar nature such as indices—can be
formatted in many different ways. The most important requirement to follow in marking up TOCs,
indices, and the like correctly is to maintain their entries. Never combine the text half of an entry with
the location information half of an entry.


<list type="ol" enum="1" depth="2">
   <li>
      <lic class="entry">Evidence of Evolution</lic>
      <lic class="pagenum">182</lic>
   </li>
   <li>
      <lic class="entry">Earth Science: The Fossil Record</lic>
      <lic class="pagenum">189</lic>
   </li>
</list>

Because the two <lic> elements inside each list item (<li> element) in the example above together
make up one complete entry, their contents should never be merged into a single piece of content.
Some TOC entries, however, are not coded correctly to reflect the type of content that is contained
in such a list. If appropriate elements and attributes are not used, the result is incorrect. One
possible error in mark-up is shown below.
Example of an error in HTML mark-up:
 <li class="entry">Evidence of Evolution </li>
 <li class="pagenum">182</li>
 <li class="entry">Earth Science: The Fossil Record</li>
 <li class="pagenum">189</li>

The resulting HTML display would show the following errors:

One correct way to mark up TOC entries, while replacing the <lic> element unrecognized by HTML, is
as follows:

<ul class="toc">
<li class="entry">Evidence of Evolution</li>
<li class="pagenum">182</li>
<li class="entry">Earth Science: The Fossil Record</li>
<li class="pagenum">189</li>
</ul>

Note that the use of an unordered list, <ul>, as well as the application of styles via CSS, will enable
this table of contents entry to be displayed correctly on screen. Use of attributes to distinguish these
types of lists and their entry halves from other kinds of lists that may be found in a work is key to
being able to render and use them correctly upon conversion to HTML. It is important to mark up the
content accurately, both to preserve its purpose and to let features of HTML be used to
advantage. Never add content such as invented text or images to serve as a substitute for mark-up.
(For an example of a complex list that is not a TOC or an index, see the following art exhibition list.)
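
As a rough illustration of the conversion described above, the following Python sketch turns each DAISY <lic> into an HTML <li> that keeps its class attribute, inside a <ul class="toc">. The function name, the sample input, and the choice of Python's standard xml.etree module are assumptions made for illustration; they are not part of DAISY, NIMAS, or any conversion tool. Real source files usually declare an XML namespace, which this sketch does not handle.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample input: a DAISY-style list in which each <li>
# holds the two <lic> halves of one TOC entry.
DAISY_TOC = """
<list type="ol" enum="1" depth="2">
  <li>
    <lic class="entry">Evidence of Evolution</lic>
    <lic class="pagenum">182</lic>
  </li>
  <li>
    <lic class="entry">Earth Science: The Fossil Record</lic>
    <lic class="pagenum">189</lic>
  </li>
</list>
"""

def toc_list_to_html(daisy_xml):
    """Return an HTML <ul class="toc"> with one <li> per <lic>, classes preserved."""
    source = ET.fromstring(daisy_xml)
    ul = ET.Element("ul", {"class": "toc"})
    for lic in source.iter("lic"):
        li = ET.SubElement(ul, "li", {"class": lic.get("class", "")})
        # Keep each half in its own element; never merge entry text and page number.
        li.text = "".join(lic.itertext()).strip()
    return ET.tostring(ul, encoding="unicode")

html = toc_list_to_html(DAISY_TOC)
print(html)
```

Because the class values survive the conversion, a stylesheet can still tell entry text from page numbers.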

adding links
By adding links to a TOC, whether by use of hyperlinks added directly to the source file or by use of
code that will then be employed by script in a transform, navigation and ease of use of the HTML
conversion can be greatly enhanced. DAISY players allow for direct navigation to page numbers, so
the information in a table of contents is useful without explicit links. In HTML, however, the user will
not be able to find referenced pages unless the conversion process builds in appropriate links and
navigation features. When properly done, such navigational enhancement creates an HTML output
that is as accessible and navigable as possible. Examples of mark-up for navigation links are shown
below.

<li class="tocline"><a href="#id12345">Evidence of Evolution</a></li>
<li class="tocpage">182</li>
 <h3 id="id12345">Evidence of Evolution</h3>

When the fileset is converted to HTML, the hyperlink from the TOC entry allows the reader to easily
go to the “Evidence of Evolution” heading in the book.


<li class="tocline">Evidence of Evolution</li>
<li class="tocpage"><a href="#p0182">182</a></li>
<span id="p0182" class="normal">182</span>

When the fileset is converted to HTML, the hyperlink from the TOC entry allows the reader to easily
go to page 182.
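
The page-number link pattern above could be generated rather than typed by hand. The Python sketch below is one possible approach. The id convention ("p" plus a four-digit page number, as in p0182) is inferred from the example, and the function names are hypothetical. Roman-numeral front-matter pages would need separate handling.

```python
from html import escape

def page_anchor_id(page_number):
    """Build an id such as 'p0182' (assumed convention: 'p' plus four digits)."""
    return "p{:04d}".format(int(page_number))

def toc_entry_html(title, page_number):
    """Return the two list items for one TOC entry, linking the page number."""
    target = page_anchor_id(page_number)
    return (
        '<li class="tocline">{}</li>\n'
        '<li class="tocpage"><a href="#{}">{}</a></li>'
    ).format(escape(title), target, page_number)

def page_marker_html(page_number):
    """Return the in-text page marker that the TOC link points to."""
    return '<span id="{}" class="normal">{}</span>'.format(
        page_anchor_id(page_number), page_number
    )

print(toc_entry_html("Evidence of Evolution", 182))
print(page_marker_html(182))
```

Deriving both the link and its target from the same function keeps the two ids from drifting apart.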

For a series of progressively more complex examples of how to generate TOCs via XSLT, see Jesper
Tverskov’s TOC for XHTML with XSLT. See also the W3C’s XSL Transformations (XSLT)
specification 1.0 for more information.

According to the DAISY Consortium, when <pagenum> is not used, <span> is recommended:

<span class="page-normal">7</span>

Visible indications of page breaks are useful (for example, for a student who needs to cite print edition
page numbers in a writing assignment, even though an alternate format of the book is being used).
However, make sure to use appropriate styling so that page numbers are set off from main text and
do not confuse the reader.
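
A minimal sketch of the <pagenum> to <span> replacement follows, written in Python with regular expressions. It assumes double-quoted attributes and a page attribute such as page="normal" on the source element. The class names other than page-normal, and the function name, are assumptions for illustration.

```python
import re

# Matches a <pagenum> element and captures its attributes and content.
PAGENUM = re.compile(r'<pagenum\b([^>]*)>(.*?)</pagenum>', re.DOTALL)
PAGE_ATTR = re.compile(r'\bpage="([^"]*)"')
ID_ATTR = re.compile(r'\bid="([^"]*)"')

def pagenum_to_span(xml_text):
    """Replace each <pagenum> with a <span class="page-..."> that keeps its id."""
    def replace(match):
        attrs, content = match.group(1), match.group(2)
        page = PAGE_ATTR.search(attrs)
        ident = ID_ATTR.search(attrs)
        # Fall back to "normal" when the source gives no page type.
        css_class = "page-" + (page.group(1) if page else "normal")
        id_part = ' id="{}"'.format(ident.group(1)) if ident else ""
        return '<span class="{}"{}>{}</span>'.format(css_class, id_part, content)
    return PAGENUM.sub(replace, xml_text)

print(pagenum_to_span('<pagenum page="normal" id="p7">7</pagenum>'))
```

Keeping the id on the <span> means that any links to the page number continue to work after conversion.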

<poem> serves as a good representative sample of an element that may require or benefit from more
thoughtful and careful mark-up:

DAISY source file:

<line>If I cannot see her, at least I can think of her, and so be happy;</line>
<line>To light the beggar's hut no candle is better than moonlight.</line>

HTML file:

<div class="poetry-box">
<p class="poetry">If I cannot see her, at least I can think of her, and so be happy;<br/>
To light the beggar's hut no candle is better than moonlight.</p>
</div>

CSS document for use with HTML output:

div.poetry-box
{
width: 400px;
border-style: solid;
border-width: thin;
border-color: blue;
padding-left: 5px;
padding-right: 5px;
}

p.poetry
{
font-size: 12pt;
margin-top: 5px;
}

p.poetry + p.poetry
{
margin-top: 20px;
}
The attributes (example: class="poetry-box") allow for the poetry content to be marked up more in
keeping with the intent and purpose of the <poem> element.

DAISY provides sample mark-up for this element as follows:

<div class="sidebar">

This modification is simple and straightforward and preserves information about the component’s
type. Use of <p> is not recommended in this case, since errors in coding sidebar content as inline
paragraphs could be prolonged in this way.
Providing Accessibility Support for Images
For many books—particularly textbooks—images form an integral part of instructional materials and
the reading experience. In order to provide content accessibility, alternative text and long descriptions
(LDs) should be included. The alt attribute is required for images in both DAISY and NIMAS filesets;
however, it is often used as a placeholder in the source file and alternative text must be added in the
production process.
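
Placeholder alt attributes can be located before alternative text is written. The Python sketch below, using the standard html.parser module, lists images whose alt text is missing, empty, or a placeholder. The placeholder values are hypothetical and should be replaced with whatever the source files actually contain. Flagged images still need human review, since this check cannot judge whether existing text is adequate.

```python
from html.parser import HTMLParser

class AltAudit(HTMLParser):
    """Collect the src of every <img> whose alt text is missing or a placeholder."""

    # Hypothetical placeholder values; adjust to match the source files.
    PLACEHOLDERS = {"", "image", "alt", "placeholder"}

    def __init__(self):
        super().__init__()
        self.needs_alt = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = attributes.get("alt")
        if alt is None or alt.strip().lower() in self.PLACEHOLDERS:
            self.needs_alt.append(attributes.get("src", "(no src)"))

def images_needing_alt(html_text):
    """Return the src values of images that still need alternative text."""
    audit = AltAudit()
    audit.feed(html_text)
    return audit.needs_alt

sample = (
    '<img src="a.jpg" alt="Photo of a lion"/>'
    '<img src="b.jpg" alt=""/>'
    '<img src="c.jpg"/>'
)
print(images_needing_alt(sample))
```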
Alternative Text
The following excerpt explains alternative text for images and outlines how to provide it:

       Writing for Accessibility: Alt Tags and Long Descriptions

       An alt tag is a brief description of an image. The "alt" in alt tag stands for "alternative" and an alt
       tag is alternative text—another option to the image. Alt tags should state the type of image and a
       brief summary of the image. They should not have any unnecessary text. Alt tag text should be
       approximately four to ten words long. Alt tag text is designed to be brief. The point is to capture
       the function of the graphic and to express it in terms that make sense.

        Every image has an alt tag associated with it. An alt tag must appear for every purposeful image.
        The alt tags appear on screen with mouse-over, or when the mouse is moved over the image.
        Assistive technology such as a voice-output screen-reader will not "read" an image but will read
        the alt tag instead. Text-only browsers display alt tags over the image placeholder.

        A long description is a detailed description of an image that supports or adds meaning to the text.
        Long descriptions are context-specific. The details given depend on how the image supports or
        supplements the text. Their purpose is to provide content information conveyed by the image so
        that students who are unable to "read" the image, for whatever reason, still have access to the
        information relevant to instruction that is conveyed by the image.

        Long descriptions are provided whenever an alt tag is not sufficient to convey the content of an
        image. Long descriptions should be written for each image (map, timeline, picture, chart, graph,
        photo, etc.) that supports the text or gives additional or new information needed to understand
        content or topic. A long description should be included whenever an alt tag cannot provide
        sufficient information about the object and its purpose for inclusion. Remember that long
        descriptions vary according to learning goals. Try to create a balance between brevity and
        sufficient information so that every learner can access key content.

                      Editorial Process Guidelines for Creating Accessible Digital Textbooks. CAST, Inc. 2004


<img src="./images/boy-with-ball.jpg" alt="Small boy with a red ball"/>

<img src="./images/ch02/p056-002.jpg" alt="Photo of a lion"/>

These are examples of images that may not need more than an alt tag.

This more complex image of a map—
<img src="./images/U04/C01/04-01-006.jpg" alt="Map of Shang, China" longdesc="./LDs/U04C02/longdesc04-01-

—will need a long description to convey its information:
<p>This map is entitled "Shang China" and shows the area occupied by the Shang Civilization in circa 1100 B.C. The
map shows Asia in the west, China in the south, and the Yellow Sea and the Pacific Ocean in the east. Rivers are
indicated by black lines.</p>

<p>The area labeled "Shang Civilization, c. 1100 B.C." is shown in orange and extends in length from the Huang Ho river
in the west to the Yellow Sea in the east. It extends in width from about 100 miles north of the 40°N latitude to about 200
miles north of the Chang Jiang River. The final portion of the Chang Jiang River is the southern border of the far
southeast corner of the orange area. The city of An-yang is marked with a black dot about 200 miles in from the coast of
the Yellow Sea along the Huang Ho River.</p>

Long descriptions are usually added to images using a hyperlink that refers to another HTML file.
Here is an example:

<img id="p011-001" src="./images/U01C02/p011-001.jpg" alt="Illustration of peaks and troughs in a line of waves"
<a href="./LDs/U01C02/longdescp011-001.htm" onclick="return popupLD('./LDs/U01C02/longdescp011-001.htm');"
title="Image description link" class="ldlink" target="_blank">Image description</a>
<title>NIMAS Exemplar 6: Science LDp011-001</title>
<link rel="stylesheet" href="Ex6Science.css" type="text/css"/>
<p>This illustration shows the peaks and troughs in a line of waves. A dotted horizontal line shows the middle or neutral
level and a wavy line goes up and down across the dotted line. A line above with arrows points out the highest peak of
each wave along the line. A line below with arrows points out the lowest trough of each wave along the line.</p>
<p>Click the Close button or use CTRL+W to close this window.</p>

See the AIM Center’s Exemplars page for more examples of images, alt tags, and LDs.
Text in Images
Some images are not in fact graphical or illustrative in nature but can be all, mostly, or heavily made
up of text. How to code such images poses a bit of a challenge. The following excerpt from the Text in
Images section of NIMAS Files Best Practices explains key differences between images which
contain text and how to provide accessibility for them.

        “The key to choosing whether or not to mark up text in images as alt and LD text or as body
        text is to determine whether or not the text is an integral part of the image (chart; map =
        alt and LD), whether the content is presented visually for variety or other non-instructional
        reason (some icons; repeated reminders = alt), and whether or not the text would stand alone
        if its image were not present (menu; scorecard = body text).

        “This last point of whether or not embedded image text could stand alone without its visual is
        an important one since image-dependent embedded text must remain part of its image (as
        a long description) in order for the text to retain its meaning and stand-alone embedded
        text is often far more accessible if converted into body text because image-
        independent embedded text is, effectively, body text presented visually.”

See also page 8 of NIMAS Files Best Practices for detailed examples.
Image Formats
The NIMAS specifies three image formats for use in filesets: SVG, PNG, and JPG. Each format has
its own strengths and weaknesses and each is more suitable for certain types of images than others.
JPG is currently the most widely used.

Scalable Vector Graphics (SVG) is an XML-based format. SVG images can be manipulated in a
variety of ways without loss of quality, and the format provides search features that greatly
enhance the accessibility of the image.

SVG is a preferred format for images in the NIMAS technical specification. With the release of
Internet Explorer 9 in early 2011, it is supported by the latest versions of all major web browsers
(though they may not support all SVG features, such as animation). If you are targeting older
browsers, or limited versions of browsers such as those on some mobile devices, then you may need
to convert SVG graphics to another image format.

If a fileset contains SVGs, and it is not known prior to conversion whether the output will be used
with tools and resources that support SVG, it is wise to provide images converted to either PNG or
JPG format. (PNG and JPG are the other two of the three image formats named in the NIMAS
specification. While it may be necessary to modify images for size, resolution, or other purposes,
converting images to a format other than these three is not recommended.)

Image conversion tools include the following:

      Adobe Illustrator
      Gimpshop
      ImageMagick
      Inkscape vector drawing tool
      Photoshop

Edit the XML source file as well as the OPF manifest to reflect the new image filenames.
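
The manifest update can be scripted. The following Python sketch rewrites <item> entries in the OPF text so that converted images point to the new filename and media type. It works on the text directly so that the rest of the file, including any DOCTYPE declaration, is left untouched. The function name and default extensions are assumptions, and the sketch expects double-quoted attributes. The same href change must still be made wherever the images are referenced in the XML source file.

```python
import re

ITEM = re.compile(r'<item\b[^>]*>')

def update_manifest(opf_text, old_ext=".svg", new_ext=".png", new_type="image/png"):
    """Point manifest items for converted images at the new file and media type."""
    def fix_item(match):
        item = match.group(0)
        href = re.search(r'href="([^"]*)"', item)
        if not href or not href.group(1).lower().endswith(old_ext):
            return item  # not a converted image; leave unchanged
        new_href = href.group(1)[: -len(old_ext)] + new_ext
        item = item.replace(href.group(0), 'href="{}"'.format(new_href))
        return re.sub(r'media-type="[^"]*"', 'media-type="{}"'.format(new_type), item)
    return ITEM.sub(fix_item, opf_text)

opf = (
    '<manifest>'
    '<item id="img1" href="images/fig1.svg" media-type="image/svg+xml"/>'
    '<item id="css" href="book.css" media-type="text/css"/>'
    '</manifest>'
)
print(update_manifest(opf))
```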
Image Resolution and File Size
Images for print works are almost always prepared to a high resolution for optimal printed results, and
images in NIMAS filesets are required to be between 300 and 600 dpi. Images intended for use
online are usually produced at a much lower resolution, typically 96 dpi or 72 dpi, the approximate
resolution of a computer screen. These resolutions work well on-screen, and reduce the total file size
per image. If a fileset contains high-resolution, print-ready images, it may be an improvement in an
HTML final output product to reduce the images’ resolution and re-size them. It may also make the
conversion nimbler and easier to use.
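
To see what such a reduction means in pixels, the short Python sketch below computes the dimensions that keep an image at the same physical size when its resolution is lowered. The function name and the sample image size are illustrative assumptions.

```python
def screen_pixel_size(width_px, height_px, source_dpi=300, target_dpi=96):
    """Return pixel dimensions that keep the same physical size at a lower resolution."""
    new_width = round(width_px * target_dpi / source_dpi)
    new_height = round(height_px * target_dpi / source_dpi)
    return new_width, new_height

# A 1500 x 900 pixel image at 300 dpi measures 5 x 3 inches in print.
print(screen_pixel_size(1500, 900))           # (480, 288) at 96 dpi
print(screen_pixel_size(1500, 900, 300, 72))  # (360, 216) at 72 dpi
```

In this example, going from 300 dpi to 96 dpi cuts the pixel count to roughly one tenth of the original, which accounts for most of the file-size saving.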

Many software programs offer image editing features, a few of which are listed below:

      Gimpshop
      Image Converter Plus
      Photoshop
      Pixillion

Other tools are available that include an image reduction feature that can be used to convert images
to 72 dpi:

      NIMAS Conversion Tool (uses ImageMagick internally)
      TechAdapt Accessible Media Center (TAMC)

Images should be reviewed after conversion to ensure that they are still sharp and clear enough to be

Presentation: Formatting, Navigation, Rendering
HTML without an accompanying CSS will not have styling and other presentation information and
visually it will appear to be very plain. It’s important to decide what presentation styles are needed for
an HTML conversion in relation to both fidelity to the source work (as needed) and accessibility and
readability features. See the DAISY Consortium’s CSS Syntax Overview page for more information.

Content may also benefit greatly from additional links for enhanced navigation, especially in
TOCs, tables, figures, references, and glossaries. Another value-added option that may be
needed or beneficial for intended users is to link directive or descriptive text to the item it
refers to.

For example, the following text—

<p>Refer to Table 1 for a list of common metric length measurement units.</p>

—could contain a text link at “Table 1” to provide navigation to Table 1’s location:

<p>Refer to <a href="#table01">Table 1</a> for a list of common metric length measurement units.</p>
<table id="table01">

For a glossary word appearing within a work’s text, such as—
<p>A scientist who studies living things is called a <dfn>biologist</dfn>.</p>

(with <dfn> tags used to identify a defining instance) a link could be provided for navigation directly to
the word’s entry in the backmatter’s glossary section:

<p>A scientist who studies living things is called a <dfn><a href="#glossary-biologist">biologist</a></dfn>.</p>

Alternatively, the following mark-up

<p>A scientist who studies living things is called a <dfn class="glossary">biologist</dfn>.</p>

could be used with JavaScript added to create links.
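
The same links could also be added during conversion instead of by script in the browser. The Python sketch below is one such alternative and is not the JavaScript approach mentioned above. It wraps the content of each <dfn class="glossary"> in a link, assuming glossary entries carry ids of the form glossary-term, as in the #glossary-biologist example. The function name is hypothetical.

```python
import re

# Matches a glossary term that contains plain text only.
DFN = re.compile(r'<dfn class="glossary">([^<]+)</dfn>')

def link_glossary_terms(html_text):
    """Wrap each glossary term in a link to its assumed glossary entry id."""
    def add_link(match):
        term = match.group(1)
        # Build an id fragment such as "glossary-biologist" from the term text.
        slug = re.sub(r'[^a-z0-9]+', '-', term.strip().lower()).strip('-')
        return '<dfn class="glossary"><a href="#glossary-{}">{}</a></dfn>'.format(slug, term)
    return DFN.sub(add_link, html_text)

print(link_glossary_terms(
    '<p>A scientist who studies living things is called a '
    '<dfn class="glossary">biologist</dfn>.</p>'
))
```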

These are just a few examples of the many opportunities for and many ways of adding links to
appropriate content to make that content and the work as a whole much more accessible and useful.
Converting DAISY/NIMAS to HTML via Conversion Tools
There are a number of free tools that can perform a conversion directly from DAISY/NIMAS to HTML.
These include—

        The NIMAS Conversion Tool
        The TechAdapt Accessible Media Center (TAMC)
        The DAISY Pipeline

These tools provide varying capabilities for dividing a DAISY or NIMAS fileset into smaller sections;
changing image size, resolution, and format; including or excluding certain types of content; and so
on. All provide fairly straightforward user interfaces that do not require in-depth knowledge of DAISY
or NIMAS to use. Of course, it is also possible to use XSLT (eXtensible Stylesheet Language for
Transformations) for conversions.
Additional Considerations
Character Encoding
DAISY, NIMAS, and HTML are all intended to support substantially all of the world’s written
languages. In order to do so, designation of character encoding is required as part of the source file.
DAISY and NIMAS encourage use of the Unicode UTF-8 character set (but UTF-16 can also be
used). UTF-8 is widely used in HTML as well. There is no need to change the designated character
set of a DAISY or NIMAS source file.
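
One way to carry the source file's encoding through to the HTML output is sketched below in Python. It reads the encoding named in the XML declaration and builds a matching <meta> element. The function names are hypothetical, and the sketch falls back to UTF-8 when no encoding is declared.

```python
import re

# Captures the encoding named in an XML declaration at the start of the file.
XML_DECL = re.compile(r'^\s*<\?xml[^>]*\bencoding=["\']([A-Za-z0-9._-]+)["\']')

def declared_encoding(xml_text, default="UTF-8"):
    """Return the encoding named in the XML declaration, or the default."""
    match = XML_DECL.search(xml_text)
    return match.group(1) if match else default

def html_charset_meta(xml_text):
    """Build a <meta> element announcing the same encoding in the HTML output."""
    return '<meta http-equiv="Content-Type" content="text/html; charset={}"/>'.format(
        declared_encoding(xml_text)
    )

print(html_charset_meta('<?xml version="1.0" encoding="UTF-8"?><dtbook/>'))
```

The declaration only states the encoding; the HTML file itself must also be saved in that encoding.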
Skippability and Escapability
Skippability is a term used to describe the skippable structure feature of the DAISY standard that
allows a DTB to be more readable and to make its navigation easier and more efficient by encoding
skippable structures into the digital book source fileset. According to the DAISY Consortium, which
describes best practices for preparing digital content, skippable structures are—

  optional
  allow users to skip content components (especially useful for those components that are repetitive
    or seldom-used)
  navigation points (allowing users to locate these specific points at another time)
  coded in such a way that they will be compatible with future revisions to DAISY

Escapability is a similar function that allows a portion of content to be navigated away from during
playback. For more information, see DAISY’s Specifications for the Digital Talking Book.

Since class attributes and text elements added to a source file for skippability and escapability are
ignored in almost all cases of HTML rendering and additional code is found in the SMIL file which
would not be used at all for an HTML output, it is considered safe to disregard the skippability and
escapability notations in the fileset. Detailed information about skippable and escapable structures
within a DTB fileset is available online at the URL cited above.
MathML in HTML
MathML is an XML-based mark-up language for works in mathematics and science that contain
equations and symbolic content. At this writing, many NIMAS filesets and DAISY books display
mathematical content by the use of images with alternative text. The primary reasons for this are the
lack of market saturation of players that support MathML and the relatively small number of tools and
software able to support MathML. However, this is rapidly changing. MathML was officially accepted
as a modular extension of the DAISY standard in October 2010 and, therefore, is an optional part of
the NIMAS. Math processing software programs are available that support MathML. The NIMAS
Center strongly recommends that producers move to incorporate MathML into their workflow.

Filesets which contain MathML may be converted into HTML and used with any of the currently
available methods for rendering and processing MathML content. As of this date, browsers that
support MathML and render it onscreen include—

       Amaya
       Dadzilla
       Firefox
       Opera
       Safari

The MathJax project is an open-source Javascript library that can enable MathML in HTML pages to
permit display by virtually all web browsers.

Testing and Review of HTML Conversion
Testing and review of HTML output conversions is critical, especially if an automated conversion
process was used. Generated outputs may not convert content components as expected. Hands-on
conversions of XML source files also need to be checked for accuracy and desired results.

Consider the following when checking an HTML conversion:

      Content (all content is present and structured correctly)
      Visual rendering (on-screen appearance)
      Layout and formatting choices for HTML display
      Necessary and/or desired components present and accurate (example: page numbers
       correspond to correct content and display consistently at the correct spot on all pages)
      Navigation by levels (both needed and/or desired)
      Navigation accuracy (links go to correct sections, pages, etc.)
      Links (any outside, non-navigation, or other links work and go to correct location)
      Images (render correctly and have alternative text, as well as long descriptions if necessary)
      Usability (conversion works with tools/software intended for use)
      Accessibility (there are many automated tools available for testing for common HTML
       accessibility issues)

Always validate both HTML and CSS. Valid documents are more likely to work with a variety of
browsers and assistive technology.
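
Part of the navigation accuracy check can be automated. The Python sketch below, using the standard html.parser module, reports links within a page whose target id does not exist. It covers only same-page links of the form href="#id", and the class and function names are assumptions for illustration. It complements, and does not replace, validation and hands-on review.

```python
from html.parser import HTMLParser

class LinkAudit(HTMLParser):
    """Record every id and every same-page link target in an HTML document."""

    def __init__(self):
        super().__init__()
        self.ids = set()
        self.targets = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if attributes.get("id"):
            self.ids.add(attributes["id"])
        href = attributes.get("href") or ""
        if tag == "a" and href.startswith("#") and len(href) > 1:
            self.targets.append(href[1:])

def broken_internal_links(html_text):
    """Return link targets that have no matching id in the document."""
    audit = LinkAudit()
    audit.feed(html_text)
    return [target for target in audit.targets if target not in audit.ids]

page = (
    '<p>Refer to <a href="#table01">Table 1</a> and '
    '<a href="#table02">Table 2</a>.</p>'
    '<table id="table01"></table>'
)
print(broken_internal_links(page))
```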

For additional resources, visit the AIM Center web site:

HTML links found throughout this document are listed below in order of appearance.


DAISY Standard
basic, default stylesheet
Creating Accessible HTML
HTML Techniques for Web Content Accessibility
Web Standards Project Accessible HTML/XHTML Forms
Accessible Media: Text
Accessible Textbooks in the K–12 Classroom II
All About AIM
Stylus Studio
Mark-Up Validation Service
XML Validation
XML well-formedness checker and validator
DAISY Pipeline
art exhibition list
Jesper Tverskov’s TOC for XHTML with XSLT
W3C’s XSL Transformations (XSLT) specification 1.0
NIMAS Files Best Practices
Adobe Illustrator
Gimpshop (x2)
Inkscape vector drawing tool
Photoshop (x2)
Image Converter Plus
NIMAS Conversion Tool (x2)
TechAdapt Accessible Media Center (TAMC) (x2)
CSS Syntax Overview
DAISY Pipeline
DAISY’s Specifications for the Digital Talking Book
[citation in inline text]
Math & Images
AIM Center web site
DAISY Consortium
Content Conversion Services
DAISY Pipeline (FAQ)
NIMAS Conversion Tool
TAMC Online Conversion Tool
CSS Tutorial
Zen Garden
HTML Code Tutorial
HTML Goodies
W3C HTML Tutorial
Web Accessibility
WebAIM’s Web Accessibility Principles
WebAIM’s WCAG 2.0 Checklist
Web Accessibility for Section 508
oXygen XML editor
W3C XML Tutorial
XML Basics


The NIMAS Center acknowledges the contribution of Chris Von See of TechAdapt and thanks the
U.S. Department of Education, Office of Special Education Programs (OSEP) for their support.

