Last updated: April 28, 2008

WordMLToFO Stylesheet V2.1

What is WordMLToFO Stylesheet?

WordMLToFO Stylesheet is an XSLT stylesheet that transforms "WordML", the XML file format introduced with Microsoft Word 2003, into an FO XML file that complies with the W3C Recommendation "Extensible Stylesheet Language (XSL)". XSL is a specification for laying out and formatting XML documents. With the WordMLToFO stylesheet, you can generate FO files from Word documents for any purpose.

WordMLToFO Structure

Operation with other XSL-FO formatting applications is not guaranteed.

Because the document models of Microsoft Word and XSL-FO are fundamentally different, perfect conversion is not possible. The formatted result may therefore not be rendered correctly.

Antenna House now offers a Microsoft Office compatible renderer, Server Based Converter (SBC). Because SBC is more compatible and performs better than WordMLToFO Stylesheet, we strongly recommend that you try SBC. (January 22, 2006)

What is WordML?

WordML is formally called "Wordprocessor Markup Language". Until now, the main native file formats of Microsoft Word have been the binary format (.doc extension) and Rich Text Format (.rtf extension). WordML is an XML file format and is said to be fully compatible with these native file formats. In addition, WordML has the following features.

You can build the following XML applications based on WordML features:

The WordMLToFO stylesheet is an application of the latter kind. Microsoft released the WordML specification in November 2003. If you are interested in WordML, you can download the specification from the following URL:

The Office 2003 XML Reference Schemas include a document titled "Overview of WordprocessingML" that briefly explains the WordML structure with examples.

WordMLToFO Function

The WordMLToFO stylesheet generates the FO file using the following functions.

Element mapping

The WordMLToFO stylesheet maps WordML elements to XSL-FO elements as follows.

Document element | WordML element | XSL-FO element
Paragraph | w:p | fo:block
Inline (text-run) | w:r | fo:inline
Bullet and numbering | w:p (paragraph that has w:pPr/w:listPr) | fo:list-block, fo:list-item, fo:list-item-label, fo:list-item-body
Table | w:tbl, w:tr, w:tc | fo:table, fo:table-row, fo:table-cell
Image | w:pict | fo:external-graphic
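
The first two rows of the mapping can be illustrated with a minimal Python sketch. It is not the stylesheet's actual implementation, which is written in XSLT. The helper name paragraph_to_block and the sample paragraph are hypothetical, and the sketch ignores properties, lists, tables, and images.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for WordML (Word 2003) and XSL-FO.
W = "http://schemas.microsoft.com/office/word/2003/wordml"
FO = "http://www.w3.org/1999/XSL/Format"


def paragraph_to_block(p):
    """Map one w:p element to fo:block, and each w:r child to fo:inline.

    Hypothetical helper for illustration only; formatting properties
    (w:pPr, w:rPr) are ignored.
    """
    block = ET.Element(f"{{{FO}}}block")
    for run in p.findall(f"{{{W}}}r"):
        inline = ET.SubElement(block, f"{{{FO}}}inline")
        # The text of a run is carried by its w:t children.
        inline.text = "".join(t.text or "" for t in run.findall(f"{{{W}}}t"))
    return block


# A made-up WordML paragraph with two text runs.
sample = (
    f'<w:p xmlns:w="{W}">'
    "<w:r><w:t>Hello, </w:t></w:r>"
    "<w:r><w:t>world</w:t></w:r>"
    "</w:p>"
)

block = paragraph_to_block(ET.fromstring(sample))
ET.register_namespace("fo", FO)  # serialize with the conventional fo: prefix
print(ET.tostring(block, encoding="unicode"))
```

Each text run becomes its own fo:inline inside the fo:block, which mirrors the one-to-one mapping in the table.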

Style expansion

A Word document contains many styles, which are applied to paragraphs, text runs, or tables. The content is finally formatted according to the result of applying those styles. Styles include table styles, paragraph styles, and character styles. In contrast, XSL-FO has no concept of styles. Every formatting property must be written in the FO file as the final result after the styles have been applied. The WordMLToFO stylesheet therefore applies the following styles and outputs the final result to the FO file.

Document element | Condition | Applied styles
Paragraph | Paragraph inside the table | Table style, Paragraph style
Paragraph | Paragraph outside the table | Paragraph style
Inline (text-run) | Inline in the paragraph inside the table | Table style, Paragraph style, Character style
Inline (text-run) | Inline in the paragraph outside the table | Paragraph style, Character style
Row or Cell in the Table | - | Table style
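
Style expansion can be sketched in Python as a merge of property sets. This is only an illustration and not the stylesheet's code. It assumes that styles are applied in the order listed in the table and that a later style overrides an earlier one for the same property. The function expand_styles and the sample property values are hypothetical.

```python
def expand_styles(*layers):
    """Merge style layers into one flat property set.

    Assumption for this sketch: layers are given in application order
    and a later layer overrides an earlier one for the same property.
    """
    result = {}
    for layer in layers:
        result.update(layer)
    return result


# Made-up styles, expressed with XSL-FO property names.
table_style = {"font-family": "Arial", "font-size": "10pt"}
paragraph_style = {"font-size": "12pt", "color": "black"}
character_style = {"color": "red", "font-weight": "bold"}

# Inline in a paragraph inside a table: table, paragraph, character style.
inline_in_table = expand_styles(table_style, paragraph_style, character_style)

# Inline in a paragraph outside a table: paragraph, character style.
inline_outside_table = expand_styles(paragraph_style, character_style)

print(inline_in_table)
print(inline_outside_table)
```

The merged dictionary corresponds to the flat set of formatting properties that has to be written on the fo:inline element, since XSL-FO cannot refer back to a named style.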

Running Environment

WordMLToFO is a stylesheet based on the XSLT 1.0 W3C Recommendation. It uses some extension functions for handling RTFs (Result Tree Fragments). So far we have tested it with the following XSLT processors.

XSLT processor | Notes
Saxon 6.5.3 | Tested using Sun Java SDK, Java 2 Platform, Standard Edition 1.4.1 or higher. Saxon 7 has not been tested yet. Instant Saxon is not supported.
MSXML3, MSXML4 | Line layout calculation is simplified.
.NET | To execute the stylesheet, the EXSLT.NET Library is required, so you must add the ExsltTransform class to the calling program. For details, refer to "Building Practical Solutions with EXSLT.NET".


The current implementation has the following limitations.


The following is the result obtained using MSXML4 as the XSLT processor. Click an image to see a larger version. (The document content is fictional.)

Word display image

Word view of the original Word document

XSL Formatter display image

The resulting FO formatted using XSL Formatter

You can download sample data from here.

The WordMLToFO Stylesheet (except for the source code) is built into V4.1. Please download the V4.1 evaluation version and check the formatted result of your own documents.

Product Information

Name: WordMLToFO Stylesheet
Price: $200
Contents: Stylesheet source file, external Java library (.jar file), User's manual, Sample Data

The customer may install and use one copy of the product on a single computer.
The customer needs written permission from Antenna House to deliver or distribute the product to third parties.
The customer may alter or modify the stylesheet source code in the product at the customer's own responsibility.

If you are interested in WordMLToFO Stylesheet, please feel free to contact us by e-mail.

