PDF Output

PDF output is a standard function of AH Formatter V6.5. The PDF versions that can be output are as follows:

AH Formatter V6.5 outputs PDFs with the following features:

See also PDF Output Settings for more details.

PDF/X no-LT

PDF/X is defined in ISO 15930 and is a subset of PDF intended for exchanging print data. Essentially all the information needed for printing is included in the PDF file. AH Formatter V6.5 can output the following versions of PDF/X. PDF/X cannot be output with AH Formatter V6.5 Lite.

The following shows the main features for PDF/X.

| Feature | PDF/X-1a | PDF/X-2 | PDF/X-3 | PDF/X-4 |
|---|---|---|---|---|
| All fonts must be embedded | yes | yes | yes | yes |
| The output intent must be specified | yes | yes | yes | yes |
| ICC profiles specified as the output intent must be embedded | no | no | no | yes |
| Supports CMYK, Spot color | yes | yes | yes | yes |
| Supports Grayscale | yes | yes | yes | yes |
| Supports RGB | no | yes | yes | yes |
| Supports transparency | no | no | no | yes |
| Supports PDFs with encryption; setting of password, the restriction for printing and the restriction for changing must not be done | no | no | no | no |
| PDF can contain links or annotations, etc. in the print area | no | no | no | no |

In PDF/X, all fonts must be embedded. When a font that is not allowed to be embedded is used, PDF/X cannot be generated.

When outputting PDF/X, most user settings, such as font embedding, are ignored and the values required by PDF/X are used instead. For example, when the version of the output PDF is set to PDF/X, Font Embedding in the PDF Option Setting Dialog is grayed out.

When an image that is incompatible with PDF/X is specified, the processing depends on error-on-pdfx-fault specified in PDF Output Settings. See also error-on-pdfx-fault.

To specify the output intent by ICC profile, specify the URL of the ICC profile in the src property of fo:color-profile. In this case, either omit the color-profile-name property or specify "#CMYK", "#GrayScale" or "#RGB". If this property is omitted, "#CMYK" is assumed. For example:

<fo:declarations>
 <fo:color-profile src="url('file:///C:/WINDOWS/system32/spool/drivers/color/JapanColor2001Coated.icc')" color-profile-name="#CMYK"/>
</fo:declarations>
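As a rough sketch of where this declaration sits in a complete document, the following minimal FO file places fo:declarations between fo:layout-master-set and the first fo:page-sequence, which is the position XSL-FO expects. The master name, page size and sample text are illustrative assumptions. The ICC profile path is the one from the example above and must exist on your system. The PDF/X version itself is still selected in the PDF Output Settings.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
 <fo:layout-master-set>
  <!-- "page" is a hypothetical master name; A4 size is an assumption -->
  <fo:simple-page-master master-name="page" page-width="210mm" page-height="297mm">
   <fo:region-body margin="20mm"/>
  </fo:simple-page-master>
 </fo:layout-master-set>
 <!-- Output intent: exactly one ICC profile, here for CMYK -->
 <fo:declarations>
  <fo:color-profile src="url('file:///C:/WINDOWS/system32/spool/drivers/color/JapanColor2001Coated.icc')" color-profile-name="#CMYK"/>
 </fo:declarations>
 <fo:page-sequence master-reference="page">
  <fo:flow flow-name="xsl-region-body">
   <fo:block>Sample text</fo:block>
  </fo:flow>
 </fo:page-sequence>
</fo:root>
```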

With PDF/X, only one output intent can be specified. When multiple output intents are specified, which one is adopted is undefined.

With PDF/X, the only ICC profiles that can be embedded as the output intent are those whose device class is "prtr". ICC profiles of any other device class cannot be used.

With PDF/X-1 to PDF/X-3, the output condition identifier can also be specified with the src property of fo:color-profile. However, the output condition identifier cannot be specified with PDF/X-4. It is specified in the form of a URI fragment. For example:

<fo:declarations>
 <fo:color-profile src="#OutputConditionIdentifier=CGATS TR 001&amp;RegistryName=http://www.color.org"/>
</fo:declarations>

The first character must be #. After that, the parameters are listed, separated by &amp;. Each parameter is in the form name=value. The names of the parameters are as follows (they map to the items of the OutputIntent dictionary for PDF/X).

Information about the profile can also be given by appending fragment parameters in the same format to the URL of the ICC profile. The URL can also be relative to the base URI specified by axf:base-uri.

<fo:declarations axf:base-uri="url('file:///C:/WINDOWS/system32/spool/drivers/color/')">
 <fo:color-profile src="url('Photoshop5DefaultCMYK.icc#Info=Photoshop5')"/>
</fo:declarations>

When the output intent is specified by the output condition identifier, the output intent is not embedded in PDF/X. When it is specified by the ICC profile, the ICC profile is embedded in PDF/X.

When the standard output intent is specified with a URL, embedding of the ICC profile is not required, so embed-std-output-intent in the PDF Output Settings selects whether the ICC profile is embedded or not. However, with PDF/X-4, this setting is ignored and the ICC profile is always embedded. When the output intent is specified by the output condition identifier, it is not embedded regardless of the setting of embed-std-output-intent. See also the following to learn more about the standard output intent.

CMYK Characterization Data

When the output intent is not specified in FO, default-output-intent in the PDF Output Settings will be adopted.

The standard ICC profile can be downloaded from Adobe.

Adobe ICC profiles

The downloadable profiles may already be bundled with Adobe Acrobat and installed on your system. If your OS is Windows, please look in the %windir%\system32\spool\drivers\color directory.

PDF/A no-LT

PDF/A is defined by ISO 19005 and is a specification intended for long-term preservation of electronic documents. AH Formatter V6.5 can output the following versions of PDF/A. PDF/A cannot be output with AH Formatter V6.5 Lite.

The following shows the main features for PDF/A:

| Feature | PDF/A-1a | PDF/A-1b | PDF/A-2a | PDF/A-2b | PDF/A-2u | PDF/A-3a | PDF/A-3b | PDF/A-3u |
|---|---|---|---|---|---|---|---|---|
| All fonts must be embedded | yes | yes | yes | yes | yes | yes | yes | yes |
| ICC profiles must be embedded | yes | yes | yes | yes | yes | yes | yes | yes |
| Files must be tagged | yes | no | yes | no | no | yes | no | no |
| Files must include XMP metadata | yes | yes | yes | yes | yes | yes | yes | yes |
| Files may include encryption | no | no | no | no | no | no | no | no |
| Files may include LZW Compression | no | no | no | no | no | no | no | no |
| Files may include Transparent images | no | no | yes | yes | yes | yes | yes | yes |
| Files may refer to the external content | no | no | no | no | no | no | no | no |
| Files may include JavaScript | no | no | no | no | no | no | no | no |
| Unicode must be used for text | no | no | yes | no | yes | yes | no | yes |
| Any files other than PDF/A can be embedded | no | no | no | no | no | yes | yes | yes |

PDF/A as well as PDF/X require that all fonts are embedded; if a font cannot be embedded due to security restrictions or other issues, a PDF/A or PDF/X will not be generated. Also, the embedding of the ICC profile is required with PDF/A, so when specifying the output intent, only the URL specification of the ICC profile is effective.

Most settings (including font embedding) are set to the values that PDF/A requires, and the user settings are ignored. With PDF/A-1a, PDF/A-2a or PDF/A-3a, tagging is always performed. See also Tagged PDF.

XMP metadata is automatically generated from the document information of PDF. The original XMP can also be specified by <axf:document-info name="xmp">.

CAUTION: In AH Formatter V6.5, you cannot specify PDF/A and PDF/X simultaneously.

Tagged PDF no-LT

An ordinary PDF has no document structure in its content. For example, sentences are cut off at the end of each line, and in a multi-column layout the first line of the right column follows the first line of the left column. Therefore, when a visually impaired person, for example, reads the PDF with some kind of reading software, it is very difficult to read the text in the correct order. The same applies to text extraction from PDF.

Tagged PDF adds structure to PDF documents by embedding tags into the PDF. The structure added to the document makes the information in the PDF more reusable. For this reason, Tagged PDF is required to create accessible PDF documents. See also the following to know more about Tagged PDF specifications.

ISO 32000-1:2008 14.8 Tagged PDF

Tagged PDF is an important element of PDF accessibility. An accessible PDF has further requirements in addition to tagging; the complete set of requirements is defined in PDF/UA.

AH Formatter V6.5 embeds the following tags (StructElem) for each FO element.

| FO element | PDF element | Comment |
|---|---|---|
| fo:root | Document | |
| fo:page-sequence | Part | |
| fo:flow | Sect | |
| fo:static-content | Sect | |
| fo:block | P or Div | P when it has the content of inline-level, otherwise Div |
| fo:block-container | Div or Sect | Sect when absolute-position="fixed" or "absolute", otherwise Div |
| fo:inline | Span or Reference | Reference when the child of fo:footnote, otherwise Span |
| fo:inline-container | Span | |
| fo:leader | Span | |
| fo:page-number | Span | |
| fo:page-number-citation | Span | |
| fo:page-number-citation-last | Span | |
| fo:scaling-value-citation | Span | |
| fo:index-page-citation-list | Span | |
| fo:bidi-override | Span | |
| fo:footnote | | The footnote-reference-area embeds a Sect that contains all the footnotes on the page |
| fo:footnote-body | Note | |
| fo:float | Sect | |
| fo:external-graphic | Figure or Formula | Formula in case of MathML, otherwise Figure |
| fo:instream-foreign-object | Figure or Formula | Formula in case of MathML, otherwise Figure |
| fo:basic-link | Link | |
| fo:list-block | L | |
| fo:list-item | LI | |
| fo:list-item-label | Lbl | |
| fo:list-item-body | LBody | |
| fo:table | Table | |
| fo:table-caption | Caption | |
| fo:table-header | THead | |
| fo:table-footer | TFoot | |
| fo:table-body | TBody | |
| fo:table-row | TR | |
| fo:table-cell | TH or TD | TH within fo:table-header, otherwise TD |
| axf:form-field | Form | |
| axf:ruby | Ruby | |
| axf:ruby-base | RB | |
| axf:ruby-text | RT | |
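
To make the mapping above concrete, the fragment below annotates a small list with the structure elements that the table assigns to each FO element. The comments only restate the table. The list content and indent values are illustrative assumptions, and the fo prefix is assumed to be bound to the XSL-FO namespace by the enclosing document.

```xml
<!-- fo:list-block -> L -->
<fo:list-block provisional-distance-between-starts="10mm">
 <!-- fo:list-item -> LI -->
 <fo:list-item>
  <!-- fo:list-item-label -> Lbl -->
  <fo:list-item-label end-indent="label-end()">
   <!-- fo:block with inline-level content -> P -->
   <fo:block>1.</fo:block>
  </fo:list-item-label>
  <!-- fo:list-item-body -> LBody -->
  <fo:list-item-body start-indent="body-start()">
   <!-- fo:block with inline-level content -> P -->
   <fo:block>First item</fo:block>
  </fo:list-item-body>
 </fo:list-item>
</fo:list-block>
```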

AH Formatter V6.5 embeds the following tags (StructElem) for each HTML element.

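| HTML element | PDF element | Comment |
|---|---|---|
| html | Document | |
| div | Div | |
| h1 | H1 | |
| h2 | H2 | |
| h3 | H3 | |
| h4 | H4 | |
| h5 | H5 | |
| h6 | H6 | |
| p | P | |
| ul | L | |
| ol | L | |
| li | LI | |
| li::marker | Lbl | list label |
| dl | L | |
| dt | Lbl | |
| dd | LBody | |
| blockquote | BlockQuote | |
| caption | Caption | |
| table | Table | |
| tr | TR | |
| td | TD | |
| th | TH | |
| thead | THead | |
| tfoot | TFoot | |
| tbody | TBody | |
| ruby | Ruby | |
| rb | RB | |
| rt | RT | |
| span | Span | |
| img | Figure | |
| a[href] | Link | |
| other block elements | Div | |
| other inline elements | Span | |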

There are some tags that are not structural elements.

| PDF element | Comment |
|---|---|
| Artifact | It is mapped to the contents distinguished from the text of a page. static-content that is repeatedly outputted at each line break, and table-header (except the one at the beginning of the table) and table-footer (except the one at the end of the table) serve as Artifact. The same applies when axf:pdftag="Artifact" is specified. |
| ReversedChars | It is mapped to the text which runs from right to left, such as Arabic, etc. |
| Span | It is used for setting up the character string before being processed for the display as ActualText. It is different from Span in the structural element. The character strings before being processed mean the character string before the hyphenation is processed, or the character string before complicated glyph substitution is done in Thai etc., for example. |

In order to create Tagged PDF, you can check Tagged PDF in the PDF Option Setting Dialog, or you can specify -tpdf with the Command-line interface.

Using the axf:pdftag extension property, you can specify an arbitrary tag name to PDF.
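
As a brief sketch, the fragment below overrides the default tag of two blocks with axf:pdftag. By default a fo:block with inline content is tagged P. Here the first block is tagged H1 instead and the second is marked as an Artifact. The text content is an illustrative assumption, and the fo and axf prefixes are assumed to be bound by the enclosing document (axf to http://www.antennahouse.com/names/XSL/Extensions).

```xml
<!-- Tagged as H1 instead of the default P -->
<fo:block axf:pdftag="H1" font-size="18pt">Chapter title</fo:block>

<!-- Marked as Artifact, i.e. not part of the document's logical structure -->
<fo:block axf:pdftag="Artifact">Draft watermark text</fo:block>
```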

AH Formatter V6.5 handles each check item of Adobe Acrobat's Accessibility Full Check (in the case of Acrobat 7.0) as follows:

See also PDF Embedding to know more about embedding tagged PDF in tagged PDF or the restrictions by the PDF versions.

It's impossible to output Tagged PDF with AH Formatter V6.5 Lite.

PDF/UA no-LT

PDF/UA is defined by ISO 14289-1 and is a specification intended to improve the accessibility of PDF, based on the ISO 32000-1 (PDF 1.7) specification. AH Formatter V6.5 can output the following version of PDF/UA. PDF/UA cannot be output with AH Formatter V6.5 Lite.

The following shows the main features of PDF/UA.

PDF Embedding

A PDF document can be embedded in another PDF. Fillable PDF forms can also be embedded.

This is done by using <fo:external-graphic>, in the same way as for an image. Please refer to Graphics.

<fo:external-graphic src="embedded.pdf#page=3"/>

As shown above, specify the page number to embed in the URI. A range of pages is specified in the form #page=<FirstPage>-<LastPage>. When no page number is specified, the first page is embedded.

  • embedded.pdf#page=3-5 : from the 3rd page to the 5th page
  • embedded.pdf#page=3- : from the 3rd page to the last page
  • embedded.pdf#page=3 : the 3rd page only
  • embedded.pdf : the first page only

When no size is specified, the page is embedded at the page size of the embedded PDF. To scale it, specify content-width or content-height as follows:

<fo:external-graphic src="embedded.pdf#page=3" content-width="50%"/>

Multiple pages can be embedded consecutively.

<fo:external-graphic src="embedded.pdf#page=3-5"/>

The example above embeds the 3rd to the 5th page consecutively. To embed all pages, specify as follows:

<fo:external-graphic src="embedded.pdf#page=1-"/>

When specifying PDF by utilizing the data scheme (RFC2397), the page number can be specified as the parameter of the media type as follows.

<fo:external-graphic src="data:application/pdf;page=3;base64,JVBERi0xLjQKJeLjz9M..."/>

A PDF can also be embedded as a background. This is useful for using a form as the background and formatting only the content data on top of it. When specifying a PDF as a background, specify axf:background-repeat="no-repeat" on fo:simple-page-master or fo:page-sequence as follows (on @page in the case of CSS). axf:background-repeat="repeat" cannot be specified.

<fo:simple-page-master axf:background-image="background.pdf"
                       axf:background-repeat="no-repeat" ...>

When embedding a PDF as the background, not only one page but also two or more consecutive pages can be embedded. To do so, specify axf:background-repeat="paginate" as follows:

<fo:simple-page-master axf:background-image="background.pdf#page=3-5"
                       axf:background-repeat="paginate" ...>

In this example, the 3rd to the 5th pages are embedded as the background. When the content of fo:flow generates fewer pages than the number of embedded PDF pages, pages are added so that all the embedded PDF pages are output. Therefore, it is not a problem even if fo:flow is empty. When the content of fo:flow generates more pages, the pages beyond the number of embedded PDF pages have no background image. Specify the pages in the form #page=<FirstPage>-<LastPage>. When axf:background-repeat="paginate" is not specified, the -<LastPage> portion is ignored.

  • background.pdf#page=3-5 : from the 3rd page to the 5th page
  • background.pdf#page=3- : from the 3rd page to the last page
  • background.pdf#page=3 : the 3rd page only
  • background.pdf : all pages
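
Putting the pieces together, the following is a minimal sketch of a complete FO document that uses pages 3 to 5 of background.pdf as a paginated background. The master name, page size and block text are illustrative assumptions. The background properties are written exactly as in the example above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format"
         xmlns:axf="http://www.antennahouse.com/names/XSL/Extensions">
 <fo:layout-master-set>
  <!-- "form" is a hypothetical master name; A4 size is an assumption -->
  <fo:simple-page-master master-name="form"
                         page-width="210mm" page-height="297mm"
                         axf:background-image="background.pdf#page=3-5"
                         axf:background-repeat="paginate">
   <fo:region-body margin="20mm"/>
  </fo:simple-page-master>
 </fo:layout-master-set>
 <fo:page-sequence master-reference="form">
  <fo:flow flow-name="xsl-region-body">
   <!-- This content fits on one page; pages are added so that
        pages 3 to 5 of background.pdf are all output -->
   <fo:block>Data printed on top of the form</fo:block>
  </fo:flow>
 </fo:page-sequence>
</fo:root>
```

With content this short, the description above implies a three-page result, one page per embedded background page.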

When axf:background-image or axf:background-repeat is specified on both fo:page-sequence and fo:simple-page-master, fo:simple-page-master takes priority. A PDF can also be embedded in fo:region-body/before/after/start/end by specifying it on fo:simple-page-master.

axf:background-repeat="paginate" cannot be specified with AH Formatter V6.5 Lite.

The version of embedded PDF must be less than or equal to the version of PDF to output. The following table shows the acceptable combination with PDF/X, etc.

Embedded PDF (column headers):
PDF: 1.3 1.4 1.5 1.6 1.7
PDF/X: 1a:2001 3:2002 1a:2003 2:2003 3:2003 4
PDF/A: 1a 1b 2a 2b 2u 3a 3b 3u
PDF/UA-1: 1.5 1.6 1.7
Output PDF (rows):
PDF1.3 Y Y Y
PDF1.4 Y Y Y Y Y Y Y Y Y
PDF1.5 Y Y Y Y Y Y Y Y Y Y Y
PDF1.6 Y Y Y Y Y Y Y Y Y Y Y Y Y Y
PDF1.7 Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y
PDF/X-1a:2001 Y
PDF/X-3:2002 Y Y
PDF/X-1a:2003 Y Y
PDF/X-2:2003 Y Y Y Y Y
PDF/X-3:2003 Y Y Y Y
PDF/X-4:2010 Y Y Y Y Y
PDF/A-1a:2005
PDF/A-1b:2005 P P
PDF/A-2a:2011
PDF/A-2b:2011 P P P P P
PDF/A-2u:2011 P P P
PDF/A-3a:2012
PDF/A-3b:2012 P P P P P P P P
PDF/A-3u:2012 P P P P P
PDF/UA-1:2014 (PDF1.5) Y
PDF/UA-1:2014 (PDF1.6) Y Y
PDF/UA-1:2014 (PDF1.7) Y Y Y

The following shows the restrictions when embedding PDF.

Font Output

Adobe Type1 fonts (including Adobe Standard 14 Fonts), TrueType fonts (including OpenType fonts with TrueType outlines), OpenType fonts (PostScript outlines), WOFF (Web Open Font Format) and Macintosh TrueType font data fork suitcases are supported for PDF output. Other font formats are not supported. For more details, refer to "Fonts".

AH Formatter V6.5 requires that the fonts specified in documents are installed on your system in order to use them correctly. For how to install fonts in the Windows version, please refer to Windows help or follow the installation instructions supplied with the fonts. In the Windows version, fonts placed outside the font folder can also be output to PDF. In that case, you need to specify some settings in the font configuration file.

CAUTION: When the PDF includes a transparent image that is displayed with Adobe Acrobat/Reader, the character might appear somewhat bolder. This is a known problem of Adobe Acrobat/Reader.

Character Sets, Encoding

The following character sets are supported:

  • Adobe Standard Latin character set
  • Symbol character set
  • ZapfDingbats character set
  • Japanese character set (Adobe-Japan1-Supplement2)
  • Simplified Chinese character set (Adobe-GB1-Supplement2)
  • Traditional Chinese character set (Adobe-CNS1-Supplement0)
  • Korean character set (Adobe-Korea1-Supplement1)

All characters are processed as Unicode within AH Formatter V6.5. For Chinese, Japanese and Korean (CJK), AH Formatter V6.5 maps Unicode to the glyphs of each CJK character set by using the following CMaps.

  • Japanese : UniJIS-UCS2-H(V) UniJIS-UCS2-HW-H(V)
  • Simplified Chinese : UniGB-UCS2-H(V)
  • Traditional Chinese : UniCNS-UCS2-H(V)
  • Korean : UniKS-UCS2-H(V)

The characters that do not belong to the above character sets are embedded in the PDF by getting the glyphs from the font files. This process is done only for TrueType, OpenType fonts.

Font Embedding

Embedding fonts makes it possible to display PDF files correctly even in an environment where the fonts are not installed.

In the default setting of TrueType font processing, only the outlines of glyphs that are not defined by CMap are embedded. When embedding of a TrueType font is prohibited by the font vendor, an error occurs and processing stops. This error can be avoided by an option that replaces such characters with white space and continues the PDF output. You can also specify an option to embed all glyphs of a font, whether or not the character is defined by CMap.

In the default setting of Adobe Type1 font processing, only the outlines of fonts that have font-specific encoding are embedded. An option to embed all glyphs of a font, whether the font has standard or font-specific encoding, can also be specified.

You cannot embed a font in PDF if the font is not allowed to be embedded. See also PDF Output Settings to learn how to specify the fonts you want to embed.

In the following cases, the font is always embedded regardless of the setting. If the font is not allowed to be embedded, it cannot be used.

  • Fonts for the following scripts
    • Arab : Arabic
    • Hebr : Hebrew
    • Deva : Devanagari
    • Beng : Bengali no-LT
    • Guru : Gurmukhi no-LT
    • Gujr : Gujarati no-LT
    • Orya : Oriya no-LT
    • Taml : Tamil no-LT
    • Telu : Telugu no-LT
    • Knda : Kannada no-LT
    • Mlym : Malayalam no-LT
    • Sinh : Sinhala no-LT
    • Thai : Thai
    • Khmr : Khmer no-LT
    • Laoo : Lao no-LT
    • Mymr : Myanmar no-LT
  • Ligatured forms (see axf:ligature-mode)
  • Glyphs modified by font-variant (except for an emulation of small-caps)
  • Characters that are transformed into a Variation Sequence
  • Unicode characters that cannot be expressed in 16 bits

Image Output

For more information about supported graphic images, refer to the "Graphics".

Vector Images

The following vector images are output to PDF as vector primitives, replaced with PDF operators:

In the Windows version, vector images other than the above are converted into raster images and output to PDF. The resolution of the rasterized image in the resulting PDF can be set in dpi; please refer to rasterize-resolution in PDF Output Settings. In non-Windows versions, vector images that cannot be output to PDF are ignored.

Please refer to EPS in Graphics for details.

CAUTION: With AH Formatter V6.5 Lite, MathML can be used only when the "AH Formatter MathML Option" is installed.
CAUTION: If the AH Formatter CGM Option is not installed on your Formatter, please refer to Graphics for more details.

Raster Images

Generally, raster graphic data is compressed in the graphic file using a compression algorithm. If the compression method and the original (uncompressed) image format are both compatible with the PDF file format, the compressed raster graphic data is embedded directly into the PDF file. If the compression method or the original image format is not compatible with the PDF format, the graphic data is uncompressed and converted to a PDF-compatible bitmap format for output. If the graphic data cannot be uncompressed, it cannot be processed.

The converted bitmap graphic is compressed using JPEG or ZLIB compression and embedded into the PDF file. Refer to the color-compression and color-jpeg-quality attributes in PDF Output Settings of the Option Setting File. When raster image data is embedded directly into the PDF file, these attributes are not applied.

The raster images which can be embedded directly in a PDF are as follows:

There are the following restrictions:

  • Progressive JPEG, Interlaced GIF are transformed into regular JPEG or GIF images.
  • 16-bit color in PNG or TIFF is reduced to 8-bit color.
  • When alpha channel is attached to PNG or TIFF, it is divided.
  • There are some unsupported TIFF formats.
  • JPEG2000 is embedded into PDF only when it is PDF 1.5 or later. For other versions, it is embedded after being converted to JPEG etc.

Down sampling

In AH Formatter V6.5, raster images embedded in PDF can be downsampled.

How they are downsampled can be specified in Compression in the PDF Option Setting Dialog or in the Option Setting File.

The compression has the following restrictions:

JPEG compression is available when the following criteria are met; if not, ZLIB compression is used.

  • BitsPerComponent is 8.
  • The color space is any of CMYK, RGB, grayscale or CIE L*a*b*.

JPEG2000 compression is available when the following criteria are met; if not, JPEG compression is used.

  • PDF version is PDF1.5 or later.
  • BitsPerComponent is 8.
  • The color space is any of RGB, grayscale or CIE L*a*b*.

Multimedia no-LT

The multimedia data, such as video, audio, etc. can be embedded or linked in PDF. The following shows the example using <fo:external-graphic>.

<fo:external-graphic src="video.mpg" content-type="video/mpeg"
                     axf:multimedia-treatment="embed"
                     axf:poster-image="poster.jpg"
                     axf:show-controls="true"
                     width="640pt" height="400pt"/>

In HTML, <object> or <video> can be used.

<object data="video.mpg" type="video/mpeg"
        width="640" height="400"
        style="-ah-poster-image:url('poster.jpg'); -ah-show-controls:true">
</object>

or

<video src="video.mpg" type="video/mpeg"
       width="640" height="400"
       poster="poster.jpg" controls="controls">
</video>

The setting of content-type (or 'type' attribute in HTML) is required. AH Formatter V6.5 assumes that the data specified as src follows the content-type. The contents of data are not checked. A poster image can be specified by axf:poster-image (or 'poster' attribute in HTML <video>). A poster image can be displayed in the GUI. When width and height are not set, the size of the reference area is assumed. Whether to embed or link multimedia can be specified with axf:multimedia-treatment.

CAUTION: In PDF, multimedia is embedded as a type of annotation. Please specify axf:annotation-flags="Print" to print a poster image.
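
As a sketch of the caution above, the following variation of the earlier example adds axf:annotation-flags="Print" so that the poster image is printed. Placing the property on the same fo:external-graphic as the multimedia properties is an assumption made for illustration. The file names are those of the earlier example.

```xml
<fo:external-graphic src="video.mpg" content-type="video/mpeg"
                     axf:multimedia-treatment="embed"
                     axf:poster-image="poster.jpg"
                     axf:annotation-flags="Print"
                     axf:show-controls="true"
                     width="640pt" height="400pt"/>
```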

When axf:show-controls (or 'controls' attribute in HTML <video>) is specified, a control bar is shown under the playing multimedia object. To prevent overlapping the control bar with another object, it is necessary to make enough space below the multimedia object.

CAUTION: Whether the control bar is shown or not depends on the multimedia data, the viewer or the player.

The following shows the content types accepted as multimedia by default:

The additional content types other than these can be added by <multimedia> in the Option Setting File. The following example shows a combination of typical extensions and content types.

| Extension | Content type |
|---|---|
| *.3g2 | video/x-msvideo |
| *.3gp | video/x-msvideo |
| *.aac | audio/basic |
| *.aiff | audio/x-aiff |
| *.asf | video/x-ms-asf |
| *.au | audio/basic |
| *.avi | video/x-ms-wm |
| *.dv | video/x-dv |
| *.f4v | video/mp4 |
| *.flv | video/x-msvideo |
| *.ivf | video/x-ivf |
| *.m1v | video/x-mpeg |
| *.m2v | video/x-mpeg |
| *.m4a | audio/mp4 |
| *.m4b | audio/x-m4b |
| *.m4v | video/mp4 |
| *.mid, *.midi | audio/x-midi |
| *.mov | video/quicktime |
| *.mp2 | audio/x-mpeg |
| *.mp3 | audio/x-mp3 |
| *.mp4 | video/mp4 |
| *.mpe, *.mpeg, *.mpg | video/mpeg |
| *.qt | video/quicktime |
| *.swf | application/x-shockwave-flash |
| *.wav | audio/x-wav |
| *.wma | audio/x-ms-wma |
| *.wmv | audio/x-ms-wmv |
| *.wmx | audio/x-ms-wmx |

Whether these can be played correctly depends on the PDF viewer. When playback fails with an error reporting a missing player or codec, installing the corresponding player or codec may make playback possible.

Restrictions

Rich media V6.5 no-LT

Multimedia can be embedded as a rich media annotation (referred to here simply as rich media). Rich media does not require an external player for playback. To specify rich media, set axf:multimedia-treatment on <fo:external-graphic> as follows. In HTML, specify it with <object> or <video>. Note that the valid properties differ. For details, see Images, Multimedia.

<fo:external-graphic src="video.mpg" content-type="video/mpeg"
                     axf:multimedia-treatment="richmedia"
                     axf:poster-image="poster.jpg"
                     width="640pt" height="400pt"/>

The following multimedia can be embedded as the rich media.

| Extension | Content type |
|---|---|
| *.3g2 | video/x-msvideo |
| *.3gp | video/x-msvideo |
| *.f4v | video/mp4 |
| *.flv | video/x-msvideo |
| *.m4v | video/mp4 |
| *.mov | video/quicktime |
| *.mp3 | audio/x-mp3 |
| *.mp4 | video/mp4 |
| *.swf | application/x-shockwave-flash |

Even when the container format is supported, the video and audio codecs must be compatible with Adobe Acrobat's Flash Player in order to play correctly.

Restrictions

Miscellaneous