PDF Techniques for Web Content Accessibility Guidelines 2.0

March 4, 2016

AH Formatter implements a combination of FOs and properties from XSL 1.1 plus Antenna House extensions that let you use the accessibility features of PDF in its output.

"PDF Techniques for WCAG 2.0" is a W3C Working Group Note (http://www.w3.org/WAI/GL/WCAG20-TECHS/pdf) that describes PDF-specific techniques for meeting "Web Content Accessibility Guidelines 2.0" (http://www.w3.org/TR/WCAG20/) success criteria. The text of the Note is excerpted from "Techniques for WCAG 2.0" (http://www.w3.org/WAI/GL/WCAG20-TECHS/).

"PDF Techniques for WCAG 2.0" shows you how to use applications such as Microsoft Word and Adobe Acrobat to produce accessible PDFs. This tutorial uses the same headings (in the same order) to show you how to use AH Formatter to produce the same effects. The quoted text after each heading is an excerpt from the corresponding "PDF Techniques for WCAG 2.0" section. The quotes are included to help explain the rationale for each technique.

Almost all of the AH Formatter techniques shown here require Tagged PDF output (http://www.antennahouse.com/product/ahf60/docs/ahf-pdf.html#taggedpdf). You enable it either by checking Tagged PDF in the PDF Option Setting Dialog (http://www.antennahouse.com/product/ahf60/docs/ahf-gui.html#pdf-option-settings-dialog) or by specifying -tpdf (http://www.antennahouse.com/product/ahf60/docs/ahf-xslcmd.html#taggedpdf) with the Command-line interface. Tagged PDF output is not available with AH Formatter Lite.

Applying text alternatives to images with the Alt entry in PDF documents

PDF documents may be enhanced by providing alternative descriptions for images, formulas, and other items that do not translate naturally into text. In fact, such text alternatives are required for accessibility: alternate descriptions are human-readable text that can be vocalized by text-to-speech technology for the benefit of users with vision disabilities.

Alternate text can be specified with the axf:alttext extension property (http://www.antennahouse.com/product/ahf60/docs/ahf-ext.html#axf.alttext). For example:

    <fo:external-graphic src="..." axf:alttext="AltText"/>

When formatting CSS, you can use the -ah-alttext extension property or the HTML alt attribute.

When no alternate text is provided, AH Formatter uses the value of the role property, if present, or otherwise a single space character (U+0020) as the alternate text. This prevents an accessibility check from reporting an error for missing alternate text where alternate text is unnecessary. It is your responsibility to provide alternate text where it is needed.

Creating bookmarks in PDF documents

A person with cognitive disabilities may prefer a hierarchical outline that provides an overview of the document rather than reading and traversing through many pages. This is also a conventional means of navigating a document that benefits all users.

AH Formatter implements the XSL 1.1 (http://www.w3.org/TR/xsl/) FOs for representing bookmarks (http://www.w3.org/TR/xsl/#d0e14206): fo:bookmark-tree, fo:bookmark, and fo:bookmark-title.
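As a minimal sketch of the XSL 1.1 bookmark FOs, the following document defines a two-level bookmark tree. The master name, the id values (ch1 and ch1-1), and the titles are hypothetical and only illustrate how each fo:bookmark points at a target through internal-destination.

```xml
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="page" page-height="297mm" page-width="210mm" margin="20mm">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <!-- fo:bookmark-tree comes after fo:layout-master-set and before the first fo:page-sequence -->
  <fo:bookmark-tree>
    <fo:bookmark internal-destination="ch1">
      <fo:bookmark-title>Chapter 1</fo:bookmark-title>
      <!-- Nested fo:bookmark elements create the lower outline levels -->
      <fo:bookmark internal-destination="ch1-1">
        <fo:bookmark-title>Section 1.1</fo:bookmark-title>
      </fo:bookmark>
    </fo:bookmark>
  </fo:bookmark-tree>
  <fo:page-sequence master-reference="page">
    <fo:flow flow-name="xsl-region-body">
      <fo:block id="ch1">Chapter 1</fo:block>
      <fo:block id="ch1-1">Section 1.1</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```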

However, if you want to both generate multiple PDFs from one FO (http://www.antennahouse.com/product/ahf60/docs/ahf-ext.html#IDADYBF) and include either identical or separate sets of bookmarks in each volume, you'll need to use axf:outline-level and the related extension properties (http://www.antennahouse.com/product/ahf60/docs/ahf-ext.html#pdf-bookmark), which AH Formatter implemented before XSL 1.1 was available. This is partly because XSL 1.1 does not define output to multiple volumes, but mostly because axf:outline-level and the related properties annotate existing FOs, which makes it possible to produce bookmarks based on the structure of each individual volume.
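A sketch of the annotation approach, assuming that axf:outline-title and axf:outline-expand are among the related properties and take the values shown (check the linked extension reference before relying on them). The heading text is hypothetical.

```xml
<!-- The heading block itself carries the bookmark information -->
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format"
          xmlns:axf="http://www.antennahouse.com/names/XSL/Extensions"
          axf:outline-level="1"
          axf:outline-title="Chapter 1"
          axf:outline-expand="false">Chapter 1</fo:block>
```

Because the bookmark is attached to the FO that it describes, no separate fo:bookmark-tree has to be kept in step with the content of each volume.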

Ensuring correct tab and reading order in PDF documents

For sighted users, the logical order of PDF content is also the visual order on the screen. For keyboard and assistive technology users, the tab order through content, including interactive elements (form fields and links), determines the order in which these users can navigate the content. The tab order must reflect the logical order of the document.

Logical structure is created when a document is saved as tagged PDF. The reading order of a PDF document is the tag order of document elements, including interactive elements.

When you produce Tagged PDF with AH Formatter, the reading order follows from the document order of the FOs in the FO document.

Hiding decorative images with the Artifact tag in PDF documents

In PDF, artifacts are generally graphics objects or other markings that are not part of the authored content. Examples of artifacts include page header or footer information, lines or other graphics separating sections of the page, or decorative images.

fo:static-content, marks at line breaks, fo:table-header (except the one at the beginning of the table), and fo:table-footer (except the one at the end of the table) are all automatically tagged as Artifact in the PDF so they do not form part of the reading order. You can also annotate any FO with axf:pdftag="Artifact" (http://www.antennahouse.com/product/ahf60/docs/ahf-ext.html#axf.pdftag) to similarly remove it from the reading order.
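For example, a purely decorative image could be marked as follows. This is a sketch only: the file name ornament.svg is hypothetical.

```xml
<!-- The block and its image are tagged as Artifact, so they are left out of the reading order -->
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format"
          xmlns:axf="http://www.antennahouse.com/names/XSL/Extensions"
          axf:pdftag="Artifact"
          text-align="center">
  <fo:external-graphic src="url(ornament.svg)"/>
</fo:block>
```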

Indicating required form controls in PDF forms

The objective of this technique is to notify the user when a field that must be completed has not been completed in a PDF form. Required fields are implemented using the /Ff entry in the form field's dictionary (see Table 220 in Section 12.7 (Interactive Forms) of PDF 1.7 (ISO 32000-1)). This is normally accomplished using a tool for authoring PDF.

You can include forms in PDF generated by AH Formatter by using the axf:form extension (http://www.antennahouse.com/product/ahf60/docs/ahf-ext.html#acroform). You can mark individual form fields as required by using the axf:field-required extension property (http://www.antennahouse.com/product/ahf60/docs/ahf-ext.html#axf.field-required).
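A rough sketch of a required text field follows. It assumes that the field is written as an axf:form-field element inside axf:form, that the properties are written as axf:-prefixed attributes, and that axf:field-type, axf:field-name, and the value "true" are accepted as shown. The field name is hypothetical, so confirm the exact syntax in the linked extension reference.

```xml
<axf:form xmlns:fo="http://www.w3.org/1999/XSL/Format"
          xmlns:axf="http://www.antennahouse.com/names/XSL/Extensions">
  <fo:block>
    Email address (required):
    <!-- axf:field-required marks the field as mandatory in the PDF form -->
    <axf:form-field axf:field-type="text"
                    axf:field-name="email"
                    axf:field-required="true"/>
  </fo:block>
</axf:form>
```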

Using table elements for table markup in PDF Documents

Tabular information must be presented in a way that preserves relationships within the information even when users cannot see the table or the presentation format is changed. Information is considered tabular when logical relationships among text, numbers, images, or other data exist in two dimensions (vertical and horizontal). These relationships are represented in columns and rows, and the columns and rows must be recognizable in order for the logical relationships to be perceived.

The Tagged PDF that AH Formatter generates includes the correct PDF tags for the table-related FOs, and for table-related elements when formatting CSS. See http://www.antennahouse.com/product/ahf60/docs/ahf-pdf.html#taggedpdf

FO

FO                PDF Element
fo:table          Table
fo:table-caption  Caption
fo:table-header   THead
fo:table-footer   TFoot
fo:table-body     TBody
fo:table-row      TR
fo:table-cell     TD

You can tag fo:table-cell in the table header as TH using axf:pdftag.
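A minimal sketch of header cells tagged as TH. The column titles are hypothetical, and the fragment shows only the header of a table.

```xml
<fo:table-header xmlns:fo="http://www.w3.org/1999/XSL/Format"
                 xmlns:axf="http://www.antennahouse.com/names/XSL/Extensions">
  <fo:table-row>
    <!-- Without axf:pdftag, these cells would be tagged as TD -->
    <fo:table-cell axf:pdftag="TH">
      <fo:block>Product</fo:block>
    </fo:table-cell>
    <fo:table-cell axf:pdftag="TH">
      <fo:block>Price</fo:block>
    </fo:table-cell>
  </fo:table-row>
</fo:table-header>
```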

CSS

HTML     PDF Element
caption  Caption
table    Table
tr       TR
td       TD
th       TH
thead    THead
tfoot    TFoot
tbody    TBody

Performing OCR on a scanned PDF document to provide actual text

A document that consists of scanned images of text is inherently inaccessible because the content of the document is images, not searchable text. Assistive technologies cannot read or extract the words; users cannot select, edit, resize, or reflow text nor can they change text and background colors; and authors cannot manipulate the PDF for accessibility.

This does not directly apply to the PDF that AH Formatter generates from an FO document. However, if the FO incorporates scanned images of text, then it would apply. One way to include the text of a scanned raster image is to convert the raster image to Tagged PDF, apply OCR on the image, and embed the Tagged PDF in the PDF generated from the FO document. See http://www.antennahouse.com/product/ahf60/docs/ahf-pdf.html#pdf-embedding

Providing definitions for abbreviations via an E entry for a structure element

PDF documents may be enhanced by providing expansions for abbreviations. In fact, such expansions are required for accessibility to ensure understanding by people who have difficulty decoding words; rely on screen magnification (which may obscure context); have limited memory; or who have difficulty using context to aid understanding.

Not available at this time.

Providing headings by marking content with heading tags in PDF documents

Because headings indicate the start of important sections of content, it is possible for assistive technology users to access the list of headings and to jump directly to the appropriate heading and begin reading the content. This ability to "skim" the content through the headings and go directly to content of interest significantly speeds interaction for users who would otherwise access the content slowly.

AH Formatter does not automatically generate heading elements in the Tagged PDF for any FOs in the FO document. However, you can tag any FO as the appropriate heading level using axf:pdftag.
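For instance, headings in a flow might be tagged like this. The sketch assumes that the PDF heading tag names H1 and H2 are passed directly as axf:pdftag values. The heading text and font settings are hypothetical.

```xml
<fo:flow xmlns:fo="http://www.w3.org/1999/XSL/Format"
         xmlns:axf="http://www.antennahouse.com/names/XSL/Extensions"
         flow-name="xsl-region-body">
  <!-- Visual styling and the PDF tag are specified independently -->
  <fo:block axf:pdftag="H1" font-size="18pt" font-weight="bold">Installation</fo:block>
  <fo:block>Introductory paragraph.</fo:block>
  <fo:block axf:pdftag="H2" font-size="14pt" font-weight="bold">Requirements</fo:block>
  <fo:block>Details of the requirements.</fo:block>
</fo:flow>
```

In an XSLT stylesheet, the tag would typically be set once in the template that formats each heading level.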

When formatting CSS, AH Formatter automatically tags HTML heading elements as their corresponding heading level in the Tagged PDF:

HTML  PDF Element
h1    H1
h2    H2
h3    H3
h4    H4
h5    H5
h6    H6

Providing labels for interactive form controls in PDF documents

Form controls allow users to interact with a PDF document by filling in information or indicating choices which can then be submitted for processing. Assistive technology users must be able to recognize and understand the form fields, make selections, and provide input to complete the forms, and submit the form, just as sighted users can. Understandable labels that convey the purpose of each form control are essential to form accessibility.

As noted above, you can include forms in PDF generated by AH Formatter by using the axf:form extension (http://www.antennahouse.com/product/ahf60/docs/ahf-ext.html#acroform). You can add descriptions to individual form fields by using the axf:field-description extension property (http://www.antennahouse.com/product/ahf60/docs/ahf-ext.html#axf.field-description).
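A sketch of a field with a description, under the same assumptions as for other form fields: properties written as axf:-prefixed attributes on axf:form-field, with axf:field-type and axf:field-name available as shown. The field name and description text are hypothetical.

```xml
<axf:form xmlns:fo="http://www.w3.org/1999/XSL/Format"
          xmlns:axf="http://www.antennahouse.com/names/XSL/Extensions">
  <fo:block>
    Phone:
    <!-- The description gives assistive technology a label for the field -->
    <axf:form-field axf:field-type="text"
                    axf:field-name="phone"
                    axf:field-description="Daytime telephone number"/>
  </fo:block>
</axf:form>
```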

Providing links and link text using the Link annotation and the /Link structure element in PDF documents

Links in PDF documents are represented by a Link tag and objects in its sub-tree, consisting of a link object reference (or Link annotation) and one or more text objects. The text object or objects inside the Link tag are used by assistive technologies to provide a name for the link.

AH Formatter automatically tags fo:basic-link or, when formatting CSS, HTML a elements that have an href attribute as Link in the Tagged PDF.
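A minimal example of a link in FO. The surrounding sentence and the styling are illustrative only.

```xml
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format">
  See the
  <!-- The text inside fo:basic-link becomes the link text that assistive technology reads -->
  <fo:basic-link external-destination="url(http://www.w3.org/TR/WCAG20/)"
                 color="blue">Web Content Accessibility Guidelines 2.0</fo:basic-link>
  for the success criteria.
</fo:block>
```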

Providing name, role, value information for form fields in PDF documents

Providing name, role, state, and value information for all form components enables compatibility with assistive technology, such as screen readers, screen magnifiers, and speech recognition software used by people with disabilities.

This information can be supplied using extension properties on axf:form-field. See http://www.antennahouse.com/product/ahf60/docs/ahf-ext.html#form-field

Providing replacement text using the /Alt entry for links in PDF documents

Links in PDF documents are represented by a Link tag and objects in its sub-tree, consisting of a link object reference (or Link annotation) and one or more text objects. The text object or objects inside the Link tag are used by assistive technologies to provide a name for the link.

Authors can replace the default link text by providing an /Alt entry for the Link tag. When the Link tag has an /Alt entry, screen readers ignore the value of any visible text objects in the Link tag and use the /Alt entry value for the link text.

Replacement text can be provided using axf:alttext. See https://www.antennahouse.com/product/ahf64/ahf-ext.html#axf.alttext
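A sketch that assumes axf:alttext is placed directly on the fo:basic-link. The visible text and the replacement text are hypothetical.

```xml
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format"
          xmlns:axf="http://www.antennahouse.com/names/XSL/Extensions">
  <!-- The visible text is terse, so the replacement text spells out the target -->
  <fo:basic-link external-destination="url(http://www.w3.org/TR/WCAG20/)"
                 axf:alttext="Web Content Accessibility Guidelines 2.0">WCAG 2.0</fo:basic-link>
</fo:block>
```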

Providing running headers and footers in PDF documents

Running headers and footers help make content easier to use and understandable by providing repeated information in a consistent and predictable way. The content of headers and footers will vary widely depending on the document scope and content, the audience, and design decisions.

Running headers and footers are implemented using fo:static-content.
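As a minimal sketch, the following document places a running header in the region-before and a page number in the region-after. The master name, dimensions, and header text are hypothetical.

```xml
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="page" page-height="297mm" page-width="210mm" margin="15mm">
      <!-- The body margins leave room for the header and footer regions -->
      <fo:region-body margin-top="15mm" margin-bottom="15mm"/>
      <fo:region-before extent="10mm"/>
      <fo:region-after extent="10mm"/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="page">
    <!-- Repeated on every page of the page sequence -->
    <fo:static-content flow-name="xsl-region-before">
      <fo:block text-align="center">User Guide</fo:block>
    </fo:static-content>
    <fo:static-content flow-name="xsl-region-after">
      <fo:block text-align="center">Page <fo:page-number/></fo:block>
    </fo:static-content>
    <fo:flow flow-name="xsl-region-body">
      <fo:block>Body text.</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```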

Providing submit buttons with the submit-form action in PDF forms

The objective of this technique is to provide a mechanism that allows users to explicitly request a change of context using the submit-form action in a PDF form.

Submit buttons can be provided using the extensions for PDF form actions. See http://www.antennahouse.com/product/ahf60/docs/ahf-ext.html#form-action

Setting the default language using the /Lang entry in the document catalog of a PDF document

Both assistive technologies and conventional user agents can render text more accurately when the language of the document is identified. Screen readers can load the correct pronunciation rules. Visual browsers can display characters and scripts correctly. Media players can show captions correctly. As a result, users with disabilities are better able to understand the content.

You can specify the default language by setting the xml:lang property on the fo:root.
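For example, a document whose default language is English could start as follows. The page master and content are placeholders.

```xml
<!-- xml:lang on fo:root sets the default language for the whole document -->
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format" xml:lang="en">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="page" page-height="297mm" page-width="210mm" margin="20mm">
      <fo:region-body/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <fo:page-sequence master-reference="page">
    <fo:flow flow-name="xsl-region-body">
      <fo:block>Document content.</fo:block>
    </fo:flow>
  </fo:page-sequence>
</fo:root>
```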

Specifying consistent page numbering for PDF documents

Many documents use specific page number formats within a document. Commonly, front matter is numbered with lowercase Roman numerals. The main content, starting on the page numbered 1, may actually be the fifth or sixth page in the document. Similarly, appendices may begin with page number 1 plus a prefix of the appendix letter (e.g., "A-1").

Authors should make sure that the page numbering of their converted documents is reflected in any page number displays in their user agent. Consistency in presenting the document's page numbers will help make navigating the document more predictable and understandable.

Using fo:page-number (http://www.w3.org/TR/xsl/#fo_page-number) and its related properties can ensure that page numbering is consistent.
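A sketch of one way to number front matter, main content, and an appendix differently. It assumes that the related properties include the standard XSL initial-page-number and format properties on fo:page-sequence. The master name and content are hypothetical.

```xml
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <fo:layout-master-set>
    <fo:simple-page-master master-name="page" page-height="297mm" page-width="210mm" margin="15mm">
      <fo:region-body margin-bottom="15mm"/>
      <fo:region-after extent="10mm"/>
    </fo:simple-page-master>
  </fo:layout-master-set>
  <!-- Front matter: lowercase Roman numerals (i, ii, iii, ...) -->
  <fo:page-sequence master-reference="page" initial-page-number="1" format="i">
    <fo:static-content flow-name="xsl-region-after">
      <fo:block text-align="center"><fo:page-number/></fo:block>
    </fo:static-content>
    <fo:flow flow-name="xsl-region-body"><fo:block>Preface</fo:block></fo:flow>
  </fo:page-sequence>
  <!-- Main content: numbering restarts at 1 with Arabic numerals -->
  <fo:page-sequence master-reference="page" initial-page-number="1" format="1">
    <fo:static-content flow-name="xsl-region-after">
      <fo:block text-align="center"><fo:page-number/></fo:block>
    </fo:static-content>
    <fo:flow flow-name="xsl-region-body"><fo:block>Chapter 1</fo:block></fo:flow>
  </fo:page-sequence>
  <!-- Appendix: numbering restarts at 1, with a literal prefix in the footer (A-1, A-2, ...) -->
  <fo:page-sequence master-reference="page" initial-page-number="1" format="1">
    <fo:static-content flow-name="xsl-region-after">
      <fo:block text-align="center">A-<fo:page-number/></fo:block>
    </fo:static-content>
    <fo:flow flow-name="xsl-region-body"><fo:block>Appendix A</fo:block></fo:flow>
  </fo:page-sequence>
</fo:root>
```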

Specifying the document title using the Title entry in the document information dictionary of a PDF document

Document titles identify the current location without requiring users to read or interpret page content. User agents make the title of the page easily available to the user for identifying the page. For instance, a user agent may display the page title in the window title bar or as the name of the tab containing the page.

Using fo:title (http://www.w3.org/TR/xsl/#fo_title) will produce a title in the generated PDF.
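A minimal fragment showing where fo:title goes. The master name "page" and the title text are hypothetical.

```xml
<fo:page-sequence xmlns:fo="http://www.w3.org/1999/XSL/Format"
                  master-reference="page">
  <!-- fo:title must be the first child of fo:page-sequence -->
  <fo:title>AH Formatter Accessibility Guide</fo:title>
  <fo:flow flow-name="xsl-region-body">
    <fo:block>Document content.</fo:block>
  </fo:flow>
</fo:page-sequence>
```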

Specifying the language for a passage or phrase with the Lang entry in PDF documents

Both assistive technologies and conventional user agents can render text more accurately when the language is identified. Screen readers can load the correct pronunciation rules. As a result, users with disabilities are better able to understand the content.

The xml:lang property can be used on any FO.
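For example, a French phrase inside an English paragraph could be marked with fo:inline. The text is illustrative only.

```xml
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format" xml:lang="en">
  The restaurant's motto is
  <!-- Only this phrase is identified as French -->
  <fo:inline xml:lang="fr" font-style="italic">bon appétit</fo:inline>,
  which is printed on every menu.
</fo:block>
```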

Using Adobe Acrobat Pro's Table Editor to repair mistagged tables

Not applicable.

Using List tags for lists in PDF documents

When markup is used that visually formats items as a list but does not indicate the list relationship, users may have difficulty navigating the information. An example of such visual formatting is simply using line-breaks to separate list items.

The Tagged PDF that AH Formatter generates includes the correct PDF tags for the list-related FOs, and for list-related elements when formatting CSS. See http://www.antennahouse.com/product/ahf60/docs/ahf-pdf.html#taggedpdf

FO

FO                  PDF Element
fo:list-block       L
fo:list-item        LI
fo:list-item-label  Lbl
fo:list-item-body   LBody
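A minimal list written with these FOs. The labels, item text, and distances are hypothetical.

```xml
<fo:list-block xmlns:fo="http://www.w3.org/1999/XSL/Format"
               provisional-distance-between-starts="8mm"
               provisional-label-separation="2mm">
  <fo:list-item>
    <fo:list-item-label end-indent="label-end()">
      <fo:block>1.</fo:block>
    </fo:list-item-label>
    <fo:list-item-body start-indent="body-start()">
      <fo:block>Install the software.</fo:block>
    </fo:list-item-body>
  </fo:list-item>
  <fo:list-item>
    <fo:list-item-label end-indent="label-end()">
      <fo:block>2.</fo:block>
    </fo:list-item-label>
    <fo:list-item-body start-indent="body-start()">
      <fo:block>Enable Tagged PDF output.</fo:block>
    </fo:list-item-body>
  </fo:list-item>
</fo:list-block>
```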

CSS

HTML        PDF Element  Comment
ul          L
ol          L
li          LI
li::marker  Lbl          list label
dl          L
dt          Lbl
dd          LBody

Indicating when user input falls outside the required format or values in PDF forms

Ensuring that users are aware an error has occurred, can determine what is wrong, and can correct it are key to software usability and accessibility. Meeting this objective helps ensure that all users can complete form-based transactions with ease and confidence.

You can specify the field format category and the field format for individual form fields by using the axf:field-format-category extension property (http://www.antennahouse.com/product/ahf60/docs/ahf-ext.html#axf.field-format-category) and the axf:field-format extension property (http://www.antennahouse.com/product/ahf60/docs/ahf-ext.html#axf.field-format), respectively.
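A rough sketch of a date field follows. The category value "date" and the format string "yyyy/mm/dd" are assumptions about the accepted values, as are the axf:-prefixed attribute spellings and the hypothetical field name. Consult the linked property descriptions for the values that AH Formatter actually supports.

```xml
<axf:form xmlns:fo="http://www.w3.org/1999/XSL/Format"
          xmlns:axf="http://www.antennahouse.com/names/XSL/Extensions">
  <fo:block>
    Start date:
    <!-- The category selects the kind of validation; the format gives the expected pattern -->
    <axf:form-field axf:field-type="text"
                    axf:field-name="start-date"
                    axf:field-format-category="date"
                    axf:field-format="yyyy/mm/dd"/>
  </fo:block>
</axf:form>
```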

Providing interactive form controls in PDF documents

Form controls allow users to interact with a PDF document by filling in information or indicating choices, which can then be submitted for processing. Users who rely on keyboard access must be able to recognize and understand the form fields, make selections, and provide input to complete the forms, and submit the form, just as sighted users can.

You specify the type of individual form fields by using the axf:field-type extension property (http://www.antennahouse.com/product/ahf60/docs/ahf-ext.html#axf.field-type).