

We are always looking for sample files in various file formats. We need them for the regression test suites of the existing libraries, but also for introspecting unknown formats (e.g., those on the "suggested" list, but not limited to it). We are not committed to writing a filter for any particular format, but having some files (or, even better, a person who can create more on request) definitely improves the chances of that happening :-)

If you are interested and want to help, read on to see which documents we are looking for and how to create samples for regression testing and for introspection.

Creating sample documents for regression testing

These documents will be publicly available in a regression test repository, so we need an acknowledgement that they are available under the CC-BY-SA license. Files of unknown origin (e.g., found on the Internet) are not acceptable. Please try to follow these suggestions when creating sample sets:

Create small documents, each covering one isolated feature (or sub-feature) of the format, e.g., paragraph formatting, character formatting, headers/footers, shape transformations, etc.

Create documents for a _single_ version of the application. Always specify the version and operating system (Windows, macOS, Linux, OS/2, DOS etc.) when you are contributing the files to us.

Use at least somewhat meaningful file names, e.g., sample1.xyz is bad, ellipse.xyz or footnote.xyz is good.

Samples do not have to be created from scratch: it is possible to take an existing set (provided there is one :-) and save the files using a newer (or older) version of the file format. Just please tell us that you have done that.
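
The naming suggestion above can be checked mechanically before a sample set is sent. The following is only a sketch: the list of "generic" prefixes and the function names are assumptions made for illustration, not rules the project enforces.

```python
import re
from pathlib import Path

# Hypothetical pattern: an optional generic prefix, optional separators
# and optional digits, with nothing else in the name.
GENERIC_STEM = re.compile(
    r"(sample|test|doc|document|file|untitled|new)?[\s_\-]*\d*",
    re.IGNORECASE,
)


def is_generic_name(filename: str) -> bool:
    """Return True if the name (without extension) names no feature."""
    return GENERIC_STEM.fullmatch(Path(filename).stem) is not None


def generic_names(directory: str) -> list[str]:
    """List files in a directory whose names should probably be changed."""
    return sorted(
        p.name
        for p in Path(directory).iterdir()
        if p.is_file() and is_generic_name(p.name)
    )
```

With this pattern, sample1.xyz is reported, while ellipse.xyz and footnote.xyz are not.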

What features should be covered

The following list is a rough guide for features that should be covered by sample documents (provided that the application supports them). Ideally, every bullet point should be covered by a single document or a small number of documents.

all predefined shapes (or a representative sample, if there are too many of them)

shape transformations, e.g., for a simple shape and a complex shape (bezier curves or NURBS)

shape fills

line strokes

various arrows for line endings

use of layers

text in different languages (it does not have to be grammatically or stylistically correct--automatic translation is okay)

text in scripts using different writing directions (Chinese, Arabic, Hebrew, ...)

tables (including use of merged cells and non-default borders)

grouped shapes

more text properties, e.g., a different font, font size, subscript, superscript, color

paragraph properties, e.g., justification, first line indentation, line height

paragraph rules/borders

tabs

styles

bitmap images, ideally using various different input formats (e.g., JPEG, PNG, BMP)

embedded fonts

document metadata, e.g., title, author, description, keywords

use of master pages (or page masters, page styles, page templates, or whatever else it is called)

shape connectors

headers and footers

footnotes

embedded formulas

different types of image anchoring (to page, to paragraph, etc.)

different page sizes in a single document

multi-column text

fields (page count, page number etc.)

Wanted samples

Apple Keynote 5.0-5.2. Just a single file is needed, to ensure that the version string is detected correctly.

CorelDRAW - all versions.

CorelDRAW Exchange (CMX). Both Windows and Mac files.

Microsoft Visio - all versions.

Creating sample documents for introspecting an unknown format

These documents would only be used for our internal needs, so any file is fine, regardless of its source. That means that files randomly collected on the Internet are fine--after all, some samples are better than no samples. If you are creating a sample document yourself, please try to follow the general suggestions described above, with the following additions:

Create very minimal documents (e.g., a single text line or a single shape).

Create several variations of the same base document, e.g., a slightly moved or resized shape, a couple of characters added to a paragraph, etc.

Use at least somewhat meaningful file names, e.g., sample1.xyz is bad, ellipse.xyz or footnote.xyz is good. Variations of a base document can be numbered.

Export each document to PDF or make a screenshot of it opened in the original application, if possible.
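
To illustrate why minimal documents and small variations are useful, here is a rough sketch of how two variants of a base document might be compared byte by byte. It is not the tooling the project actually uses; the function names are made up for this example. It also assumes that the change does not shift the rest of the file and that the format is not compressed; otherwise a positional comparison reports mostly noise.

```python
from pathlib import Path


def changed_ranges(a: bytes, b: bytes) -> list[tuple[int, int]]:
    """Return half-open (start, end) offset ranges where a and b differ.

    Bytes are compared position by position. If the inputs differ in
    length, the extra tail of the longer one counts as a difference.
    """
    ranges = []
    start = None
    common = min(len(a), len(b))
    for i in range(common):
        if a[i] != b[i]:
            if start is None:
                start = i  # a new run of differing bytes begins
        elif start is not None:
            ranges.append((start, i))  # the run ended at offset i
            start = None
    longest = max(len(a), len(b))
    if longest > common and start is None:
        start = common  # trailing bytes exist in only one input
    if start is not None:
        ranges.append((start, longest))
    return ranges


def diff_files(base: str, variant: str) -> list[tuple[int, int]]:
    """Compare two sample files on disk."""
    return changed_ranges(Path(base).read_bytes(), Path(variant).read_bytes())
```

For example, comparing a base document with a variant in which a single shape was moved would ideally yield a few short ranges, which are candidates for the place where the coordinates are stored.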

Contributing sample documents

There are two ways to get your sample documents to us. The first is via e-mail or some internet storage service. The second is via Gerrit; this is only available in certain cases. Both are described in more detail in the following text. Please never attach packs of sample documents to Bugzilla: it has a size limit for attachments.

Via e-mail

Pack all your sample files into a single archive (e.g., zip), upload it to some internet storage service and send info about it to info@documentliberation.org. The e-mail should have the subject "sample documents for <Your format>" and you should include info about the application version, the operating system (macOS, Windows, Linux, OS/2, DOS etc.) and the origin of the documents (created by you, or from other sources). For documents created by yourself, we assume that you agree to provide them under the CC-BY-SA 4.0 license.
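
The packing step can be scripted. The sketch below uses Python's standard zipfile module; the function name, its parameters and the INFO.txt file are assumptions made for this example, not something the project requires. It stores the requested information inside the archive and returns the subject line described above.

```python
import zipfile
from pathlib import Path


def pack_samples(sample_dir: str, archive: str, fmt: str, app_version: str,
                 operating_system: str, origin: str) -> str:
    """Zip all files under sample_dir and return the e-mail subject."""
    root = Path(sample_dir)
    target = Path(archive).resolve()
    info = (
        f"Format: {fmt}\n"
        f"Application version: {app_version}\n"
        f"Operating system: {operating_system}\n"
        f"Origin: {origin}\n"
    )
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        # INFO.txt is a hypothetical convention used only in this sketch.
        zf.writestr("INFO.txt", info)
        for path in sorted(root.rglob("*")):
            # Skip directories and the archive itself if it lives in root.
            if path.is_file() and path.resolve() != target:
                zf.write(path, path.relative_to(root).as_posix())
    return f"sample documents for {fmt}"
```

The same information should still be repeated in the body of the e-mail, so that it can be read without opening the archive.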

Alternatively, you can send your files (preferably in a single archive) attached to an e-mail to dtardon@redhat.com or fridrich.strba@libreoffice.org. The subject and content of the e-mail should be the same as explained in the previous paragraph.

Via Gerrit

Several of our import libraries and their test suites are hosted on http://gerrit.libreoffice.org. That means that the way to contribute to them is the same as for, e.g., LibreOffice. The needed info about our Gerrit setup is available here. This is not so convenient for one-time contributions, as the initial setup takes some effort, but it makes further contributions much easier.

The test repositories that are available in that way are libabw-test, libcdr-test, libetonyek-test, libfreehand-test, libmspub-test, libpagemaker-test and libvisio-test. You need to clone the repository that should contain test files of the format you prepared (which one that is should hopefully be clear from the repository names; if it is not, this page contains an overview of all the libraries and the formats they can handle).

For adding your files, you can use the simple method:

add your sample files to a new directory created in the cloned repository;

commit, putting info about the files (the same as described above in the section about e-mail) into the commit message;

send for review.

Or the slightly more complicated method (but only if you know what you are doing):