The focus here was on really simple documents, such as a single sentence with minimal formatting. The use case is a set of thousands of documents where only a minority contains complex formatting and the rest are that simple.

Performance work usually focuses on one specific complex feature, e.g. lots of bookmarks or lots of document-level user-defined metadata, so there was still room for improvement when it comes to trivial documents.

I managed to reduce the cost of the conversion to one fifth of the original cost in both directions; the chart above shows the impact of my work for the ODT → XHTML direction. The steps that helped:

Recognize XHTML as a value for the FilterOptions key in the HTML (StarWriter) export filter. This avoids the need to go via XSLT, which would be expensive.

Add a new NoFileSync flag to the frame::XStorable::storeToURL() API. If you know you’ll only read the result after the conversion has finished, you can use it to avoid an expensive fsync() call for each and every file, which helps HDDs a lot while adding no overhead on SSDs.

If you already know your input format, then specifying an explicit FilterName key for the frame::XComponentLoader::loadComponentFromURL() API avoids spending time on detecting a file format you already know.
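To make the three steps more concrete, here is a minimal Python (PyUNO) sketch of how a caller might combine them for the ODT → XHTML direction. It is an illustration under assumptions, not the code used for the measurements: the helper names (`load_args`, `store_args`, `to_property_values`, `convert`) are made up for this example, the exact spelling of the FilterOptions value (`"xhtml"`) is assumed, `writer8` is used as the import filter name for ODT, and `desktop` is expected to be an already obtained `com.sun.star.frame.Desktop` instance.

```python
from typing import Any, List, Tuple


def load_args(filter_name: str = "writer8") -> List[Tuple[str, Any]]:
    """Load options: name the input filter up front to skip format detection."""
    return [
        ("Hidden", True),
        ("FilterName", filter_name),  # assumption: "writer8" is the ODT filter
    ]


def store_args() -> List[Tuple[str, Any]]:
    """Store options: XHTML via the HTML (StarWriter) filter, without fsync()."""
    return [
        ("FilterName", "HTML (StarWriter)"),
        ("FilterOptions", "xhtml"),  # assumption: exact spelling of the value
        ("NoFileSync", True),  # only safe if the result is read after conversion
    ]


def to_property_values(pairs: List[Tuple[str, Any]]) -> tuple:
    """Turn (name, value) pairs into UNO PropertyValue structs.

    The UNO imports are done lazily, so the option helpers above can be
    used and inspected without a LibreOffice installation.
    """
    import uno  # noqa: F401  (enables the com.sun.star import hook)
    from com.sun.star.beans import PropertyValue

    values = []
    for name, value in pairs:
        prop = PropertyValue()
        prop.Name = name
        prop.Value = value
        values.append(prop)
    return tuple(values)


def convert(desktop: Any, source_url: str, target_url: str) -> None:
    """Convert one document; both arguments are file:// URLs."""
    document = desktop.loadComponentFromURL(
        source_url, "_blank", 0, to_property_values(load_args())
    )
    try:
        document.storeToURL(target_url, to_property_values(store_args()))
    finally:
        document.close(True)
```

When converting thousands of documents, the same `desktop` would be reused for every `convert()` call, so the per-document cost is limited to loading and storing.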