Optimizing Formatter Performance

May 11, 2022

We get a lot of questions about how we measure or optimize Formatter performance when generating PDFs, so here are some tips we'd like to share. First, a caveat: we can't provide meaningful performance metrics, because formatting throughput depends almost entirely on the configuration of the input document.

From our internal tests, the optimal number of simultaneous jobs is almost always between n-1 and n*2, where n is the number of available cores.
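As a rough illustration of that rule of thumb, here is a minimal Python sketch that computes the range for a given machine. The function name `job_range` is hypothetical, and the floor of one job on a single-core machine is our assumption rather than something the tests measured. Note that `os.cpu_count()` reports logical CPUs, which may differ from the number of cores actually available to your process.

```python
import os


def job_range(cores=None):
    """Return (low, high) bounds for simultaneous jobs, following n-1 to n*2."""
    # Fall back to the logical CPU count reported by the OS (or 1 if unknown).
    n = cores if cores is not None else (os.cpu_count() or 1)
    # Assumption: never go below one job, even when n-1 would be zero.
    low = max(1, n - 1)
    high = n * 2
    return low, high


if __name__ == "__main__":
    low, high = job_range()
    print(f"Try between {low} and {high} simultaneous jobs")
```

The range is only a starting point; the best value within it still has to be found by timing your own documents.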

To optimize performance, make sure there is enough physical memory for every job you are running, because switching to virtual memory greatly slows down processing. Forward references, such as tables of contents, can increase memory usage because more of the document has to be held in memory until the references can be resolved.
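One way to apply this is to cap the job count by memory as well as by cores. The sketch below assumes you have measured a peak memory figure per job for your own documents; the function names, the `reserve_bytes` parameter, and the sample figures are all hypothetical, not values from our tests.

```python
GIB = 1024 ** 3


def max_jobs_for_memory(physical_bytes, per_job_bytes, reserve_bytes=0):
    """How many jobs fit in physical memory without spilling to virtual memory."""
    if per_job_bytes <= 0:
        raise ValueError("per_job_bytes must be positive")
    # Leave some memory for the OS and other processes (assumed reserve).
    usable = max(0, physical_bytes - reserve_bytes)
    return usable // per_job_bytes


def pick_job_count(cores, physical_bytes, per_job_bytes, reserve_bytes=0):
    """Take the smaller of the CPU-based ceiling (n*2) and the memory ceiling."""
    by_cpu = cores * 2
    by_memory = max_jobs_for_memory(physical_bytes, per_job_bytes, reserve_bytes)
    # Always allow at least one job so the queue can make progress.
    return max(1, min(by_cpu, by_memory))


if __name__ == "__main__":
    # Hypothetical machine: 4 cores, 16 GiB RAM, 3 GiB peak per job, 2 GiB reserved.
    print(pick_job_count(4, 16 * GIB, 3 * GIB, 2 * GIB))
```

Documents with many forward references would call for a larger per-job estimate, which lowers the memory ceiling accordingly.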

In terms of document structure, you want to minimize the use of auto-formatted tables, particularly large ones that you expect to break across pages. Also, if you reference images that are fetched over HTTP from a remote server, Formatter has to wait until those requests complete before continuing with the document. Fixing FO errors caused by invalid property values also speeds up processing, as does minimizing the size of the input FO by taking advantage of XSL-FO property inheritance.
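For the remote image case, one possible workaround is to download the images ahead of time and point the FO at local copies, so the formatting run itself never waits on the network. The following is only a sketch under several assumptions: the function name `localize_images` is hypothetical, the `fetch` callable (which downloads a URL and returns a local path) is something you would supply, and the simple regular expressions only handle double-quoted `src` attributes.

```python
import re

# Matches a double-quoted src attribute and captures its value.
SRC_ATTR = re.compile(r'src="([^"]*)"')
# Matches an http(s) URL, stopping at whitespace, quotes, or a closing paren.
REMOTE_URL = re.compile(r"https?://[^\s'\")]+")


def localize_images(fo_text, fetch):
    """Rewrite remote image references in FO text to local paths.

    fetch(url) is a caller-supplied function (an assumption of this sketch)
    that downloads the resource and returns the local path to use instead.
    """
    cache = {}  # download each distinct URL only once

    def replace(match):
        found = REMOTE_URL.search(match.group(1))
        if found is None:
            return match.group(0)  # already local, leave untouched
        url = found.group(0)
        if url not in cache:
            cache[url] = fetch(url)
        return "src=\"url('{}')\"".format(cache[url])

    return SRC_ATTR.sub(replace, fo_text)


if __name__ == "__main__":
    sample = "<fo:external-graphic src=\"url('http://example.com/a.png')\"/>"
    print(localize_images(sample, lambda url: "cache/a.png"))
```

Running this as a preprocessing step also lets you download the images in parallel or reuse them across jobs, which the formatting run itself cannot do while it is blocked on a request.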