Structural Differences Between Word and HTML and Their Impact on Conversion

In recent years, the roles within a web page have been separated: HTML describes the structure of the document, and CSS describes its layout and style.

HTML has tags that express structure, and the tag that encloses a string determines the role, or meaning, of that string. In addition, the way tags and strings are nested determines whether two pieces of content are in a parent-child relationship or stand in parallel as siblings.

Together, these two mechanisms allow an HTML document to be described in a structured manner.


Structured HTML (HTML5 example)
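
As a rough illustration of this kind of structure, the following Python sketch walks a small HTML5 fragment and lists each tag indented by its nesting depth. The fragment and the names SAMPLE, OutlinePrinter, and outline are made up for this illustration and do not come from the product; only the standard library's html.parser module is used.

```python
from html.parser import HTMLParser

# Hypothetical HTML5 fragment, written only for this illustration.
SAMPLE = """
<article>
  <h1>Product Manual</h1>
  <section>
    <h2>Installation</h2>
    <p>Run the installer.</p>
  </section>
  <section>
    <h2>Usage</h2>
    <p>Open a document.</p>
  </section>
</article>
"""


class OutlinePrinter(HTMLParser):
    """Records every tag and text string together with its nesting depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.lines = []

    def handle_starttag(self, tag, attrs):
        # An opening tag is listed at the current depth; its content is one level deeper.
        self.lines.append("  " * self.depth + tag)
        self.depth += 1

    def handle_endtag(self, tag):
        # A closing tag returns to the parent's level.
        self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text:  # skip whitespace between tags
            self.lines.append("  " * self.depth + repr(text))


def outline(html):
    """Return the indented outline of an HTML string as a list of lines."""
    parser = OutlinePrinter()
    parser.feed(html)
    parser.close()
    return parser.lines


if __name__ == "__main__":
    print("\n".join(outline(SAMPLE)))
```

In the printed outline, the two section elements appear at the same indentation (parallel), while each h2 and p appears one level below its section (parent-child).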

In contrast, Word documents were originally designed to be printed, so the content of a document is closely tied to its layout and appearance, and formatting is applied directly to the text.

This makes it difficult to convert a Word document to HTML while preserving its original structure, unless the document uses the Word features that carry structural information, such as headings, paragraphs, tables of contents, and cross-references.

The "HTML on Word" conversion tool produces output that conforms to HTML specifications such as HTML5 and XHTML 1.0. To do so, it relies on a proprietary conversion engine that analyzes the styles specified in Word and converts them into structural HTML. Layout and other styles can then be applied to this structure using a separately prepared CSS file.
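
The general idea of turning Word styles into structural tags can be sketched as follows. This is not the product's conversion engine, which is proprietary; the mapping table STYLE_TO_TAG and the function paragraphs_to_html are assumptions made for this illustration. The sketch assumes each paragraph is already available as a pair of style name and text.

```python
from html import escape

# Hypothetical mapping from Word paragraph style names to HTML elements.
# The rules the actual product applies are not described in this document.
STYLE_TO_TAG = {
    "Heading 1": "h1",
    "Heading 2": "h2",
    "Normal": "p",
}


def paragraphs_to_html(paragraphs):
    """Convert (style_name, text) pairs into one HTML element per paragraph.

    Styles missing from the mapping fall back to a plain <p> element.
    """
    out = []
    for style, text in paragraphs:
        tag = STYLE_TO_TAG.get(style, "p")
        # escape() protects characters such as < and & in the text.
        out.append(f"<{tag}>{escape(text)}</{tag}>")
    return "\n".join(out)


if __name__ == "__main__":
    doc = [
        ("Heading 1", "Overview"),
        ("Normal", "This paragraph uses the Normal style."),
        ("Heading 2", "Details"),
    ]
    print(paragraphs_to_html(doc))
```

Note that the output carries only the structural role of each paragraph. Fonts, spacing, and other appearance settings are left for a CSS file to supply.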

Note: If no CSS is applied, the HTML is displayed using the web browser's default styles.

For more information on the essentials of converting a Word document to HTML, please refer to "Chapter 6 Word Editing Guidelines" in the online manual.

Caution: Layout and style in Word are not fully reproduced when a document is converted to HTML.

When converting a Word document to HTML, only the basic layout and some of the styles in the Word document will be preserved in the HTML output. Other layout and style information will not be included in the HTML.

To apply additional styling to the HTML output, you can use style sheets. This product includes sample CSS files that you can specify during the conversion process, and the resulting HTML output will be linked to the specified CSS file.
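
How an HTML file is linked to a style sheet can be illustrated with a short Python sketch. The function html_page and the file name sample.css are hypothetical and are not taken from the product; the sketch only shows the standard link element that connects an HTML document to an external CSS file.

```python
from html import escape


def html_page(title, body_html, css_href=None):
    """Wrap body_html in a minimal HTML5 page, linking css_href if one is given."""
    head = ['<meta charset="utf-8">', f"<title>{escape(title)}</title>"]
    if css_href:
        # With a link element, the browser applies the external style sheet.
        head.append(f'<link rel="stylesheet" href="{escape(css_href)}">')
    # Without one, the browser falls back to its built-in default styles.
    return (
        "<!DOCTYPE html>\n<html>\n<head>\n"
        + "\n".join(head)
        + "\n</head>\n<body>\n"
        + body_html
        + "\n</body>\n</html>"
    )


if __name__ == "__main__":
    # "sample.css" is a placeholder name for this illustration.
    print(html_page("Sample", "<h1>Sample</h1>", css_href="sample.css"))
```

Because the style sheet is referenced rather than embedded, the same HTML can be given a different appearance by pointing the link at a different CSS file.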

If you want to use your own layout for the HTML output, you will need to prepare a separate CSS file. Knowledge of CSS and HTML is required to create a custom CSS file that achieves the desired layout.