ODF Plugfest: Making office tools interoperable


On the 14th and 15th of October, the federal, regional, and community governments of Belgium organized an ODF Plugfest in Brussels. After plugfests in the Netherlands, Italy, and Spain, this was the fourth vendor-neutral event where all the lead developers and architects of Open Document Format (ODF) implementations — open source and proprietary — can meet to discuss interoperability issues and present what's available on the market.

The first day of the event was reserved for the ODF implementers: AbiSource, DIaLOGIKa, Google, IBM, Itaapy, KO GmbH, Microsoft, Novell, Sun/Oracle, and OpenOffice.org. LibreOffice developers weren't able to make it for the first day, but for now the new OpenOffice.org fork doesn't differ very much from its upstream with respect to ODF support. The various vendors worked through interoperability testing scenarios focused on specific parts of the ODF standard. The scenarios can be consulted on the wiki of the OpenDoc Society (click on "Scenarios" and then on "20100415") and in the program. For example, the ODF implementers tested interoperability of the YEARFRAC spreadsheet function, an important function that is involved in many financial calculations.
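
To illustrate why a function like YEARFRAC is a natural interoperability test, here is a minimal Python sketch. It assumes only the two simplest day-count conventions (actual/360 and actual/365); the helper name yearfrac_actual is hypothetical, and the sketch is not taken from the plugfest scenarios or from any office suite. The same pair of dates yields a different year fraction under each convention, so two implementations must agree on exactly which convention a given basis selects.

```python
from datetime import date


def yearfrac_actual(start: date, end: date, days_in_year: int) -> float:
    """Year fraction: actual elapsed days divided by a fixed year length.

    Hypothetical helper covering only the actual/360 and actual/365
    conventions; real spreadsheet implementations support more bases.
    """
    return abs((end - start).days) / days_in_year


start, end = date(2010, 1, 1), date(2010, 7, 1)

# Same dates, two conventions, two different results.
act_360 = yearfrac_actual(start, end, 360)
act_365 = yearfrac_actual(start, end, 365)

print(f"actual/360: {act_360:.6f}")
print(f"actual/365: {act_365:.6f}")
```

A financial calculation that multiplies an interest rate by this fraction will differ between applications if they disagree on the convention, which is the kind of discrepancy a shared test scenario can expose.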

Another testing scenario concerned change tracking with tables. This is a notoriously difficult topic, and the different vendors had a lengthy discussion about a proposal from DeltaXML to improve this in ODF. There were also some scenarios involving digital signatures in ODF 1.2 as well as some smaller tests. Based on the outcome of the tests, the developers of the various office suites surely have some homework to do.

The evolution of a document format

The second day of the event had a more classical, conference-like approach, with presentations by different ODF stakeholders. After the welcome speech by organizer Bart Hanssens, IBM's Rob Weir gave an update about the ODF specification as the chair of the OASIS OpenDocument technical committee. He emphasized that ODF 1.0 is still maintained (errata are published), even if ODF 1.1 is the current version and ODF 1.2 is already in the last phase of development. It's clear that ODF 1.2 is a much bigger development than the two previous versions: while the ODF 1.0 specification has 700 pages and the ODF 1.1 specification has just 30 more, the current Committee Draft 05 of the ODF 1.2 specification has more than 1200 pages.

The biggest differences between ODF 1.0/1.1 and ODF 1.2 lie in the addition of a spreadsheet formula language, OpenFormula (which accounts for 240 pages in the specification), an explicit scheme for digital signatures, and RDF/XML and RDFa capabilities. Moreover, the conformance language has been reworded for easier conversion to an ISO standard, and more than 2000 reported issues have been solved. Rob expects that ODF 1.2, which is "almost done", will be approved by OASIS at the end of January 2011. At the end of his talk, Rob speculated somewhat about what could go into "ODF Next" (version 1.3?): modularization (inspired by the W3C's modularization effort for HTML), which would make possible something like a "web profile" for ODF, enhanced SVG and XForms integration, enhanced signing of documents, and better change tracking. As an interesting side note, Rob mentioned that OASIS can only approve a specification as a standard if at least three implementations exist.

Bart Severi of the Flemish government talked about the policy of his government concerning archiving digital documents. Office documents that have to be preserved for a long time will in practice be converted to PDF/A (if the behavior is not important) or ODF (if the scripts are needed). His colleague Bart Hanssens from the Belgian government department Fedict (and also chair of the ODF Interoperability and Conformance technical committee of OASIS) talked about the possibility of digitally signing ODF documents with the eID, an electronic ID card issued to all Belgian people. ODF 1.0/1.1 allow signing a document, but do not specify how this should happen. In contrast, ODF 1.2 specifies the reuse of the W3C recommendation XML-DSIG and hints that XAdES can be used. XAdES (XML Advanced Electronic Signatures) builds upon XML-DSIG and is compliant with the European Directive 1999/93/EC. Fedict has an eID proof-of-concept applet for web browsers, which is available under the LGPL.
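
As a rough illustration of what a defined signature scheme makes possible, the Python sketch below checks whether an ODF package carries a document signature entry. It assumes that ODF 1.2 packages store document signatures in META-INF/documentsignatures.xml; the function names are hypothetical, and the code is unrelated to the Fedict applet. It only detects the entry and does not validate the XML-DSIG signature itself.

```python
import io
import zipfile

# Assumed location of document signatures inside an ODF 1.2 package.
SIGNATURE_ENTRY = "META-INF/documentsignatures.xml"


def has_document_signature(odf_file) -> bool:
    """Return True if the package contains a document signature entry."""
    with zipfile.ZipFile(odf_file) as package:
        return SIGNATURE_ENTRY in package.namelist()


def make_package(signed: bool) -> io.BytesIO:
    """Build a tiny in-memory stand-in for an ODF package.

    This is a placeholder for demonstration, not a valid document.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        package.writestr("content.xml", "<placeholder/>")
        if signed:
            package.writestr(SIGNATURE_ENTRY, "<placeholder/>")
    buffer.seek(0)
    return buffer


print(has_document_signature(make_package(signed=True)))
print(has_document_signature(make_package(signed=False)))
```

Because an ODF file is an ordinary ZIP archive, a check like this needs nothing beyond the standard library; verifying the signature would require a dedicated XML signature library.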

Alex Brown presented his online document validator Office-o-tron, which understands ODF (1.0, 1.1, and draft 1.2) and OOXML. Office-o-tron is open source software licensed under the Mozilla Public License v1.1. Sander Marechal gave an update about the Officeshots web application, which allows users to upload ODF documents and will generate output from various office applications. Officeshots makes it possible to automate the process of investigating ODF interoperability, as we described in November 2009. Since the last plugfest in Spain, development has focused more on the back-end, so there are not many new features, but Sander explained that support for more office applications is on the roadmap, as well as an easier installation routine for volunteers who want to host a rendering server.

Updates from office products

Then there were some short presentations with updates about office products supporting ODF. Casper Boemann from the German company KO GmbH talked about the latest features in KOffice, such as animation support, text line breaking that is compatible with OpenOffice.org (KOffice replicated the hyphenation and justification functionality of OpenOffice.org to get line breaks that behave identically), text wrapping around both sides of multiple shapes, and drop caps which can now be shown in different layouts. He also pointed out that a limited version of KOffice has been ported to Maemo on the N900. Marc Maurer from AbiSource noted that AbiWord is the default word processor on 1.5 million OLPC XO-1 laptops, and said that the word processor almost supports the current draft of ODF 1.2. Google's Nathan Hurst explained that Google Docs supports ODF 1.0 at the moment, but ODF 1.2 support is in the works. Moreover, the Quick View functionality for PDF files in the search engine will also be implemented for the ODF file format.

Microsoft's UK National Standards Officer John Phillips announced that they are researching the differences between ODF 1.1 and ODF 1.2 (ODF 1.1 is already supported in Microsoft Office 2007 by a service pack, and in Microsoft Office 2010 natively), and he re-affirmed his company's commitment to support ODF 1.2 within 9 months of ISO publication of the standard. However, when an audience member asked when the Mac version of Office will support ODF (currently it doesn't), John answered that there is not enough customer demand to implement it.

Oracle was represented by Oliver-Rainer Wittmann, one of the developers of OpenOffice.org. He talked about the work done since the last plugfest and about Oracle Cloud Office, a web and mobile office suite integrated with Oracle Open Office. Rob Weir spoke about IBM Lotus Symphony and said that the 3.0 release should be available at the end of 2010. At the last minute the organizers added a presentation by Cor Nouws about LibreOffice. Cor, a member of the Document Foundation and owner of a Dutch OpenOffice.org consulting company, explained that LibreOffice plans to be a good member of the ODF community and will share new ODF features as soon as possible by communicating in the ODF technical committee.

A healthy ODF ecosystem

It's easy to forget that ODF documents are not only handled by word processors or office suites in general. There's a whole ecosystem of tools that can convert, enrich, or manipulate ODF files, and a couple of them were presented at the ODF plugfest. Karl Morten Ramberg presented the OFS Collaboration Suite, a secure real-time collaboration suite which can be used from a web editor or from within OpenOffice.org or Microsoft Office. The client-server architecture has some interesting functionality. When a user opens a document, the server checks which sections the user has read and/or write access to. The server then removes any edit and copy/paste functionality from the read-only sections. The sections the user has no read access to are removed completely from the document that the user receives from the server. Currently the collaboration suite has plug-ins for texts and spreadsheets, and support for KOffice and IBM Lotus Symphony is planned for a later release.
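
The per-section filtering described above could look roughly like the following Python sketch. Everything in it is an assumption made for illustration: the access map, the function name filter_sections, and the choice to mark read-only sections with the text:protected attribute are hypothetical and say nothing about how the OFS Collaboration Suite is actually implemented.

```python
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
SECTION = "{%s}section" % TEXT_NS
NAME = "{%s}name" % TEXT_NS
PROTECTED = "{%s}protected" % TEXT_NS


def filter_sections(root, access):
    """Drop unreadable sections and mark read-only ones as protected.

    `access` maps a section name to "read" or "write" (hypothetical
    scheme); sections missing from the map are treated as unreadable.
    """
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag != SECTION:
                continue
            level = access.get(child.get(NAME), "none")
            if level == "none":
                parent.remove(child)          # user never receives it
            elif level == "read":
                child.set(PROTECTED, "true")  # visible but not editable
    return root


doc = ET.fromstring(
    '<body xmlns:text="%s">'
    '<text:section text:name="public"><text:p>Hello</text:p></text:section>'
    '<text:section text:name="draft"><text:p>In progress</text:p></text:section>'
    '<text:section text:name="secret"><text:p>Hidden</text:p></text:section>'
    '</body>' % TEXT_NS
)

filtered = filter_sections(doc, {"public": "read", "draft": "write"})
remaining = [s.get(NAME) for s in filtered.iter(SECTION)]
print(remaining)
```

Removing the unreadable sections on the server, rather than hiding them in the client, means the restricted content never leaves the server at all.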

DIaLOGIKa's Wolfgang Keber (one of the developers of the Microsoft-funded ODF Converter project) made a general remark based on his experience consulting for the European Commission: ODF implementations should not only think about backward compatibility (to be able to read old archived documents), but also about forward compatibility. Opening a new ODF 1.2 document in an older 2.x version of OpenOffice.org should go as smoothly as possible, he said. This is particularly an issue for large institutions, where different application and document versions co-exist during a long migration phase. Giorgio Migliaccio from the Belgian company LetterGen presented its business-focused tools, such as its flagship product LetterGen, which generates ODF documents (legal contracts, manuals, insurance documents, and so on) based on an incoming XML message and the definition of templates and business rules. This can be done interactively or in batch mode. Right now LetterGen only runs on Windows, but the next release, 3.0 (expected in mid-2011), will also target other platforms.

The developers of two interesting ODF converters were also present at the ODF plugfest. Werner Donné talked about his proprietary project ODFToEPub, which allows anyone with an ordinary word processor to produce an e-book in the EPub format from a document in the ODT file format. There's a plug-in for OpenOffice.org (running on Linux, Windows, and Mac OS X) that adds an export function to convert an ODT file to EPub, but there's also a standalone interactive Java program, and Werner mentioned that a batch version may be coming.

Another special ODF converter is ODT2Braille, an LGPL 3+-licensed Braille extension to OpenOffice.org Writer, enabling authors to export documents as Braille files and even print them directly to a Braille embosser. Christophe Strobbe from the Katholieke Universiteit Leuven has been a researcher in web accessibility for people with disabilities since 2001 and he is the developer of the ODT2Braille project, which is part of the European project Aegis. The latest release is alpha 0.02 from 30 August 2010 and it reuses existing open source tools like liblouisxml, liblouis, pef2text, and odt2daisy. It's currently a Windows-only extension because of some minor incompatibility issues, but Christophe said that versions for Linux and Mac OS X will follow. For future versions, he also wants to support a larger set of embossers and probably also support Braille in the Calc and Impress applications.

There were also some talks about ODF libraries. Luis Belmar-Letelier from Itaapy talked about the lpOD project, which is an open source library with a set of high-level APIs for the Python, Perl, and Ruby languages. According to Luis, developers using lpOD don't have to know the details of the internal XML representation of the documents they manipulate, so they can focus on the high-level structure of the documents. Oracle's Oliver-Rainer Wittmann talked about ODFDOM, an open source Java-based ODF API that is part of the ODF Toolkit. He said that the conversion from ODF 1.0/1.1 to ODF 1.2 is on the project's agenda. Finally, KO GmbH's Jos van den Oever presented ODFKit, a C++ library for handling documents in ODF, which reuses WebKit functionality such as framework abstractions, code generation, JavaScript bindings, XML parsing, and XSLT processing.

ODF on the web

An especially interesting project that was presented is WebODF, which wants to bring ODF to the web. Jos van den Oever started from the observation that a lot of office suites are moving into the "cloud". Examples are Microsoft Live Office, Google Docs, and Zoho. But where are the free software alternatives for the cloud? None of OpenOffice.org, KOffice, AbiWord, or Gnumeric has a cloud version with ODF support. That was the motivation for Jos to start a project to fill in this gap and let users view and edit ODF documents on the web without handing control of the document over to some company's servers.

The strategy Jos followed was to use just HTML and JavaScript for the web application. The application then loads the XML stream of the ODF document as is into the HTML document and puts it into the DOM tree. Styling is done by applying CSS rules that are directly derived from the <office:styles> and <office:automatic-styles> elements in the ODF document. That is how WebODF was born; it is a project with the initial goal of creating a simple ODF viewer and editor for offline and online use, implemented in HTML5.
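
The idea of deriving CSS rules from ODF style elements can be sketched in a few lines. WebODF itself is written in JavaScript; the Python below is only an illustration under stated assumptions. The function name styles_to_css and the attribute selector it emits are hypothetical, and the sketch relies on the fact that many fo: formatting attributes in ODF share their names with CSS properties.

```python
import xml.etree.ElementTree as ET

OFFICE_NS = "urn:oasis:names:tc:opendocument:xmlns:office:1.0"
STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"
FO_NS = "urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0"

NS = {"office": OFFICE_NS, "style": STYLE_NS, "fo": FO_NS}
FO_PREFIX = "{%s}" % FO_NS
STYLE_NAME = "{%s}name" % STYLE_NS


def styles_to_css(automatic_styles):
    """Turn fo:* text properties of each style into one CSS rule.

    The selector format is an assumption made for this sketch.
    """
    rules = []
    for style in automatic_styles.findall("style:style", NS):
        declarations = []
        for props in style.findall("style:text-properties", NS):
            for attr, value in props.attrib.items():
                if attr.startswith(FO_PREFIX):
                    # fo:font-weight="bold" becomes "font-weight: bold"
                    declarations.append(
                        "%s: %s" % (attr[len(FO_PREFIX):], value)
                    )
        if declarations:
            selector = '[text|style-name="%s"]' % style.get(STYLE_NAME)
            rules.append("%s { %s }" % (selector, "; ".join(declarations)))
    return rules


sample = ET.fromstring(
    '<office:automatic-styles xmlns:office="%s" xmlns:style="%s" xmlns:fo="%s">'
    '<style:style style:name="T1" style:family="text">'
    '<style:text-properties fo:font-weight="bold" fo:color="#ff0000"/>'
    '</style:style>'
    '</office:automatic-styles>' % (OFFICE_NS, STYLE_NS, FO_NS)
)

css = styles_to_css(sample)
for rule in css:
    print(rule)
```

Since the ODF elements stay in the DOM unchanged, rules like these are enough for the browser to render the document without converting its content to HTML first.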

The small code base consists of one HTML5 file and eight JavaScript files, each of which is a few hundred lines of code. The most interesting part is that it doesn't need server-side code execution: the JavaScript code is executed in the user's browser and saving the document to the web server is done using WebDAV. It supports both the Gecko and WebKit HTML engines. There is also an implementation on top of QtWebKit, which is for better desktop integration, and an ODFKit implementation. This means that WebODF is an easy way to add ODF support to almost any application, be it in HTML, Gtk, or QML. KO GmbH has received funding from NLnet to improve the current WebODF prototype and see how far the idea goes. Interested readers can try the online demo.

Not only for big companies

The fourth ODF plugfest managed to attract 50-something attendees: apart from the developers of ODF office suites, there were a few small companies who deliver services, IT people from the Belgian federal and regional governments, and various interested ODF users. The momentum behind ODF 1.2 is one of the things that struck your author during the ODF plugfest: it's a huge change from ODF 1.0/1.1 and many parts of the draft are already supported by a lot of tools.

When talking about a big standard such as ODF, people generally think that it's mostly used by big companies like IBM and Oracle, and huge projects such as OpenOffice.org. However, Michiel Leenaars, who gave a presentation about the OpenDoc Society, made the striking observation that small companies and projects can have a lot of impact in the ODF ecosystem. His case in point was that KO GmbH, a small 6-person support company for KOffice, gave three talks at the ODF plugfest. This is promising for small innovative developer teams that want to fully participate in the ODF ecosystem.

The value of the ongoing series of ODF plugfests lies not only in the talks to update attendees about the latest work, but even more in the test scenarios where developers of competing products collaborate to attain the common goal of better interoperability. The presentation of different ODF tools was also inspiring: it shows that there's a healthy ecosystem that is forming around the document format standard.

