Our proprietary analysis program analyzes docx files created in Word and converts them into HTML5- or XHTML 1.0-compliant HTML that is much simpler than Word's standard HTML output and free of extraneous tags.
Word has a built-in function for saving a document as HTML. However, so that the layout and styling can be reproduced and re-edited in Word, it attaches a large number of layout and style specifications directly to the tags for text, images, and so on as "style" attributes. This generally makes the result unsuitable for publishing on the Web and makes the HTML difficult to customize or modify.
In some cases the output also fails to reflect the document's structure: the page in a Web browser reproduces the Word layout to some extent, but the tags used do not match the HTML structure.
"HTML on Word" analyzes the contents of the docx file, minimizes layout and style information, and converts the structure of the text in the Word document into an appropriate HTML structure. Because there are no extra layout or style specifications, the generated HTML is simple and easy to customize or modify.
Layout and style can be specified separately using CSS, making it easy to build a web page with the HTML structure and the design kept separate.
Styles, paragraphs, etc. specified in Word are analyzed and converted into equivalent HTML tags for output.
The table below lists some of the tags to be converted. For detailed conversion specifications, please refer to "Conversion Specifications" in the online manual.
Word style | HTML tag |
---|---|
Body text | <body>-</body> |
Heading 1 to 6 (Outline level 1 to 6) | <h1>-<h6> (Note: for HTML5, a <section> is output for each heading.) |
Heading 7 to 9 (Outline level 7 to 9) | <p class="l7">-<p class="l9"> |
Paragraph (normal) | <p>-</p> |
Bullets | <ul><li>-</li></ul> |
Paragraphs with numbering | <ol><li>-</li></ol> |
Image | <img src="Path of output image"> |
Table | <table><tbody><tr><td>-</td></tr></tbody></table> |
Table style option: Title row | <thead><tr><td>-</td></tr></thead> |
Table style option: First column | <tr><th>-</th><td>-</td>-</tr> |
Table cells | <td>-</td> |
Hyperlink | <a href="URL">-</a> |
Etc.
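The heading rows of the table can be read as a simple rule on the outline level. The following is a minimal Python sketch of that rule only, not the product's implementation; the function name `tag_for_outline_level` is hypothetical, and the class names for levels between 7 and 9 are assumed to follow the "l7" to "l9" pattern shown in the table.

```python
def tag_for_outline_level(level: int) -> str:
    """Return the opening tag for a heading at the given outline level (1-9).

    Levels 1-6 map to <h1>-<h6>; levels 7-9 map to <p class="l7">-<p class="l9">,
    following the table above. The function name is hypothetical.
    """
    if 1 <= level <= 6:
        return f"<h{level}>"
    if 7 <= level <= 9:
        return f'<p class="l{level}">'
    raise ValueError(f"outline level must be 1 to 9, got {level}")


print(tag_for_outline_level(2))  # <h2>
print(tag_for_outline_level(8))  # <p class="l8">
```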
The "Table of Contents" that can be automatically created in Word is converted into text links that can be used like a table of contents on a web page.
Text links generated for each heading (outline level) make it easy to navigate to the desired heading.
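As an illustration of what "text links for each heading" can look like, here is a minimal Python sketch that builds a flat list of links from heading anchors and titles. The markup is an assumption: the `<ul>`/`<li>` structure, the anchor ids, and the function name `toc_nav` are hypothetical, and the actual output of the converter may differ (for example, it may nest entries by outline level).

```python
from html import escape


def toc_nav(headings: list[tuple[str, str]]) -> str:
    """Build a flat table-of-contents fragment from (anchor_id, title) pairs.

    The <ul>/<li> layout and the anchor ids are assumptions made for
    illustration; they are not taken from the converter's real output.
    """
    items = "".join(
        f'<li><a href="#{escape(anchor, quote=True)}">{escape(title)}</a></li>'
        for anchor, title in headings
    )
    return f"<nav><ul>{items}</ul></nav>"


print(toc_nav([("intro", "Introduction"), ("usage", "Usage & Options")]))
```

Titles are HTML-escaped so that characters such as "&" in a heading do not break the generated markup.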
A number of enhancements have been made to the table of contents to make it easier to lay out and more convenient to use.
*1 This is the default value when the table of contents is inserted with Word's "Built-In" table of contents feature and left unedited.
*2 Only the inside of the <nav> tag is output as a separate HTML file for loading with JavaScript. Tags such as <html>, <head>, and <body> are not output.
HTML can now be output by splitting a Word document into chapters, sections, and other specified outline level units.
When running from the command line, specify the "-split" option followed by the desired outline level (1 to 3). The document is then split at each heading style or paragraph that has the specified outline level, and each part is output as a separate HTML file.
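When the converter is driven from a script, the option can be assembled and validated before the command is run. The sketch below builds only the "-split" part of the argument list. The executable name and the input and output arguments are not given in this section, so they are deliberately left out, and the helper name `split_args` is hypothetical.

```python
def split_args(outline_level: int) -> list[str]:
    """Return the command-line arguments for the "-split" option.

    Only outline levels 1 to 3 are accepted, as described above. The
    executable name and file arguments are not covered here; see the
    online manual for the full command line.
    """
    if outline_level not in (1, 2, 3):
        raise ValueError("-split accepts outline levels 1 to 3")
    return ["-split", str(outline_level)]


print(split_args(2))  # ['-split', '2']
```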
Splitting pages turns even long documents into compact, easy-to-read Web pages, because it reduces the amount of scrolling per page and keeps the file size loaded at one time to a minimum.
In this case, if the document contains a table of contents inserted with Word's table of contents function, the table of contents and its links are output to every HTML file.
The table of contents can also be output as a separate HTML file by specifying an option. In that case, the table of contents is not output in the individual split HTML files. The separate table of contents file can be loaded into each HTML file using JavaScript, or used to create a dedicated table of contents page.
Please refer to the sample that reads the HTML file of the table of contents.
→ Samples
When outputting split HTML, the "-pagenavi" option can be used to output "Prev/Next" links that allow the user to move through the split HTML pages in order.
Links are output at the top and bottom of the body text. The output link can be in Japanese or English.
Parameter / Value | Output |
---|---|
-pagenavi ja | "前へ" "次へ" |
-pagenavi [Other than ja, or no value] | "Prev" "Next" |
Note: Where there is no previous or next page, as on the first or last page, the corresponding link is not output.
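The behavior described above (label selection by language, and no link where a neighboring page does not exist) can be sketched in Python as follows. This illustrates the rule only and is not the product's implementation; the function name `page_nav`, the file names, and the `<a>` markup are hypothetical.

```python
from typing import Optional

# Labels per the table above: "ja" gives Japanese, anything else gives English.
LABELS = {"ja": ("前へ", "次へ")}
DEFAULT_LABELS = ("Prev", "Next")


def page_nav(pages: list[str], index: int, lang: Optional[str] = None) -> list[str]:
    """Return the Prev/Next links for the page at `index` in `pages`.

    A link is omitted when there is no previous or next page.
    The <a> markup and file names are assumptions for illustration.
    """
    prev_label, next_label = LABELS.get(lang, DEFAULT_LABELS)
    links = []
    if index > 0:
        links.append(f'<a href="{pages[index - 1]}">{prev_label}</a>')
    if index < len(pages) - 1:
        links.append(f'<a href="{pages[index + 1]}">{next_label}</a>')
    return links


pages = ["part1.html", "part2.html", "part3.html"]  # hypothetical file names
print(page_nav(pages, 0))        # only a "Next" link on the first page
print(page_nav(pages, 1, "ja"))  # both links, with Japanese labels
```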
Privacy Policy | © 2025 Antenna House, Inc.