From MobileRead

There are many file formats used for eBooks. Usually, but not always, the file extension matches the name of the format. In some cases the extension may have two forms, where one form is limited to 3 characters (consistent with early Windows requirements).

eBook Formats

This section attempts to define and identify all (or most) of the eBook formats. With the great proliferation of formats for etext, a person wanting to read an eBook can easily become confused. The most important formats are always the ones that work on the device or devices you own, but if you have a choice, the most important are those with the most eBook dealers or the most eBooks available. Today there are essentially two kinds of eBook format: the various formats provided by Amazon, and ePub versions 2 and 3. A third popular format is PDF from Adobe, but it does not tend to work as well on portable mobile devices because of their smaller screens. PDF is better suited to computers since it often expects full paper-size pages. Check Popular Formats Statistics for a list of popular formats as determined by the number of views made in this wiki.

As many reading devices settle on a sub-list of formats, ePUB and PDF have emerged as the leading formats, with Adobe DRM support on both. Amazon does not support ePUB and insists on AZW and its variants. AZW is no longer just one format but a whole series of different formats (see the next section). You will often also see TXT, which has only minimal formatting, and HTML, where eBook readers often ignore any complicated formatting. More general-purpose portable devices such as tablets will have loadable applications for these and other formats.

Main formats

These are the formats most commonly available commercially.

AZW - An Amazon proprietary format. This is usually the MOBI format with or without DRM. The DRM is unique to the Amazon Kindle. Files with this extension can be any of the Kindle formats.

AZW1 - An Amazon proprietary format. It is the TPZ format, always with a custom DRM.

AZW3 - See KF8.

AZW4 - An Amazon proprietary format. It is the PDF format in a PDB wrapper, and usually (always?) with DRM.

EPUB - An open format defined by the Open eBook Forum of the International Digital Publishing Forum (IDPF). It is based on XHTML, XML and CSS2. It is an evolving standard. Current specifications are found at the IDPF web site. Adobe, Barnes & Noble and Apple all have their own (incompatible) DRM systems for this format. There is now a new version of this format called ePub 3 but it is not yet in wide use.

KF8 - (Also called AZW3) It is basically ePub compiled with the PDB wrapper and with Amazon DRM. This format is supported by all Amazon readers from the Kindle Keyboard 3 onwards.

KFX - A semi-compiled format from Amazon designed to give better typography on Kindle devices. It comes with a new DRM system.

MOBI - MobiPocket format, usable with MobiPocket's own reading software on almost any PDA or smartphone. Mobipocket's Windows PC software can convert .chm, .doc, .html, .ocf, .pdf, .rtf, and .txt files to this format. Kindle uses this format as well.

PDB - Palm Database File. Can hold several different e-book formats targeting Palm-enabled devices. It is commonly used for PalmDOC (AportisDoc) e-books and eReader formats, as well as many others.

PDF - Portable Document Format created by Adobe for their Acrobat products. It is the de facto standard for document interchange. Software support exists for almost every computer platform and handheld device. Some devices have problems with PDF since most available content is scaled for either A4 or letter format, neither of which is easily readable when reduced to fit on small screens. Some readers, including the Sony PRS505, can reflow some PDF documents to accommodate the small screen. Some eBook readers, including the iRex iLiad, have a pan-and-zoom feature that aids readability but exacts a price in ergonomics.

PRC - Palm Resource File. Often holds a Mobipocket eBook but occasionally holds an eReader or AportisDoc eBook.

TPZ - Topaz file extension used on Amazon Kindle. Topaz is a collection of glyphs arranged on pages, along with an unproofed OCR text version. It is an Amazon proprietary format, used to make older books available quickly, since conversion is essentially automatic from scans of the pages of a book, yet it still reflows very well.
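Because an extension such as AZW or PRC can hide several different formats, one way to see what a file really is is to look at its first bytes. The following is a minimal sketch, not a complete detector: the function name `sniff_ebook_container` is hypothetical, and it relies only on widely documented signatures (PDF files begin with `%PDF-`, ZIP-based files such as EPUB begin with `PK\x03\x04`, and Palm Database files carry a 4-byte type and 4-byte creator code at byte offsets 60 to 67). DRM status and the finer Kindle variants are not detected.

```python
def sniff_ebook_container(data: bytes) -> str:
    """Guess the container type from the leading bytes of a file."""
    if data.startswith(b"%PDF-"):
        return "PDF"
    if data.startswith(b"PK\x03\x04"):
        # EPUB is delivered as a ZIP archive; other ZIP files match too.
        return "ZIP-based (possibly EPUB)"
    if len(data) >= 68:
        # Palm Database header: type code at 60-63, creator code at 64-67.
        type_creator = data[60:68]
        if type_creator == b"BOOKMOBI":
            return "PDB (Mobipocket family)"
        if type_creator == b"TEXtREAd":
            return "PDB (PalmDOC)"
    return "unknown"
```

In practice the function would be fed the first hundred or so bytes read from the file in binary mode; anything it reports as "unknown" still has to be identified by other means.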

Other formats

Some of the formats in the list below are only available for a few or even one type of device. Some are more standardized. Be sure your device will read the format you choose.

Other File Types

A number of e-book readers and PDAs play music (usually in MP3) and display graphics (usually in JPG) even outside of eBooks.

LRC - An annotation format originally intended for lyrics. It can also be used for read-along eBooks. It is a text file that can be synced with audio or video files with an appropriate program.

Some eBook readers also provide support for reading typical business office files such as RTF, DOC/DOCX, XLS/XLSX, PPT/PPTX.
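To illustrate how an LRC text file can be synced with audio, here is a small sketch of a parser. It assumes the common LRC convention of one or more `[mm:ss.xx]` time tags at the start of each line; the function name `parse_lrc` is hypothetical, and metadata tags such as `[ti:...]` are simply skipped.

```python
import re

# One time tag such as [01:23.45]: minutes, then seconds with optional fraction.
_TIME_TAG = re.compile(r"\[(\d+):(\d{2}(?:\.\d+)?)\]")


def parse_lrc(text):
    """Return a list of (seconds, line_text) pairs sorted by time."""
    entries = []
    for raw in text.splitlines():
        raw = raw.strip()
        pos = 0
        times = []
        # A line may carry several time tags when the same text repeats.
        while True:
            match = _TIME_TAG.match(raw, pos)
            if not match:
                break
            times.append(int(match.group(1)) * 60 + float(match.group(2)))
            pos = match.end()
        content = raw[pos:].strip()
        for t in times:
            entries.append((t, content))
    entries.sort(key=lambda entry: entry[0])
    return entries
```

A player would then show, at any playback position, the last entry whose time is not later than that position.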

Audio Formats

See also: Sound

These are for music and audio books. Some are specific to speech.

AA - Audible.com Audio proprietary format with four different levels of DRM.

AAC - Advanced Audio Coding is more of a container than a format, as within an AAC file the music can be encoded in multiple ways, from iTunes M4P all the way to a lossless compression.

AAX - Enhanced audio from Audible.com. It is embedded with other features, such as images, graphs, maps, or links.

MP3 - Currently the most popular music compression format. It is widely used throughout the Internet and plays on almost every portable music player. This format is also used for some audio books.

WMA - Windows Media Audio is an audio format developed by Microsoft to compete with MP3.

OGG - Free, open standard container for Vorbis audio compression codec files, as well as for Free Lossless Audio Codec (FLAC), Speex speech compression codec, and Theora lossy video codec.

Graphic Formats

See also: Graphics

BMP - BitMaP image file is an uncompressed graphics format developed by Microsoft.

GIF - Graphics Interchange Format was developed in 1987 by CompuServe and is a lossless graphics format designed for the reproduction of line drawings rather than photographs. Widely used on the Internet for logotypes and drawings. The coding scheme was patented because it uses LZW lossless compression; however, the patents ran out in 2004. PNG was developed to replace GIF and has no patent issues. Main drawbacks are color support (max. 256 colors) and only a 1-bit alpha channel (transparency bit).

JPG - (or JPEG) stands for Joint Photographic Experts Group and is a lossy compressed graphics format designed to support photographs rather than line art. It was developed in 1992 and issued as the ISO 10918-1 standard in 1994. The quality depends directly on the amount of compression employed. Widely used on the Internet and by most digital camera manufacturers. A newer format is called JPEG 2000.

PNG - Portable Network Graphics format is a bitmapped graphic format that employs a lossless compression system. Designed to improve upon and replace GIF files, PNG does not require a patent license. Main drawback is the complexity of its color model.

SVG - A vector graphics format that is supported by ePUB.

SWF - Shockwave Flash is currently the dominant format for displaying "animated" vector graphics on the Web.

TIF - (or TIFF) Tagged Image File Format is a container that can hold images in a wide variety of bitmapped or even vector formats. They can also be compressed or uncompressed. If compressed they can use RLE, JPG, LZW, Zip or potentially other formats. This standard is owned by Adobe. Main drawback is that it is so versatile that saying that TIF is a supported format may mean nothing, since there are really many TIFF formats.

IW44 - A simplified subset of DJVU.

Compression Formats

These are lossless compression formats that reduce the amount of space required to store a document. Text, unlike music, can be compressed a great deal. Sometimes the compression can be as much as 90%. These formats are considered to be containers in that they can hold multiple files. In some cases the ability to hold multiple files is more important than the actual compression.

Don't confuse compression formats with eBook formats. Although some eBook readers list them as supported formats, these readers can only extract the archive to get to the file or files inside. The reader must still support the actual underlying eBook format. Also, some eBook formats already include compression.
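The container idea can be sketched with Python's standard `zipfile` module. The example below builds a small demonstration archive in memory and then lists what is inside it, together with the stored and original sizes. The helper names `build_demo_archive` and `list_archive` and the file names inside the archive are made up for illustration; a reader that "supports ZIP" would still need to handle whatever formats the listed files turn out to be.

```python
import io
import zipfile


def build_demo_archive() -> bytes:
    """Create an in-memory ZIP holding two small text-based files."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # Highly repetitive text, so it compresses very well.
        zf.writestr("book.txt", "It was a dark and stormy night. " * 200)
        zf.writestr("notes.html", "<p>hello</p>")
    return buf.getvalue()


def list_archive(data: bytes):
    """Return (name, stored_size, original_size) for each file in a ZIP."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return [(i.filename, i.compress_size, i.file_size) for i in zf.infolist()]


if __name__ == "__main__":
    for name, stored, original in list_archive(build_demo_archive()):
        print(f"{name}: {original} bytes, stored as {stored} bytes")
```

The listing shows the two roles separately: the archive holds multiple files, and each file is individually compressed, with repetitive text shrinking far more than a tiny file can.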

RAR - A file compression system providing one of the most compact resultant files currently available in wide distribution. The premier tool for RAR is WinRAR but 7ZIP works as well.

ZIP - The most universal of the compression tools. Slightly less efficient than RAR files, ZIP files have been around longer and enjoy more support.

LHA - A Japanese-developed compressed archive file format. A Microsoft Compressed (LZH) Folder Add-on is included with the Japanese version of Windows to use this format.

GZIP - A compression format (.gz) that was developed by the GNU team. It is designed to be compressed or decompressed on the fly and only supports one file. Often the file is a tar (.tar) format, which is a container (archive) format. When used together the file extension is usually .tgz or .tar.gz.

BZIP2 - Compresses files using the Burrows-Wheeler block sorting text compression algorithm and Huffman coding. Compression is generally considerably better than that achieved by more conventional LZ77 and LZ78-based compressors, and approaches the performance of the PPM (prediction by partial matching) family of statistical compressors. The file extension is generally .bz2.
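The division of labor between tar and gzip described above can be shown with Python's standard `tarfile` and `gzip` modules. This is only a sketch: `make_tar_gz` and `list_tar_gz` are hypothetical helper names, and everything is kept in memory instead of being written to a .tar.gz file.

```python
import gzip
import io
import tarfile


def make_tar_gz(files: dict) -> bytes:
    """Bundle {name: bytes} into one tar stream, then gzip that single stream."""
    tar_buf = io.BytesIO()
    with tarfile.open(fileobj=tar_buf, mode="w") as tar:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    # gzip sees only one input: the tar archive as a whole.
    return gzip.compress(tar_buf.getvalue())


def list_tar_gz(data: bytes):
    """Return the member names stored in a gzip-compressed tar archive."""
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as tar:
        return tar.getnames()
```

Here tar does the job of holding several files and gzip does the compression, which is why the two extensions usually appear together.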

Supported Format Matrix

Note 1 - Requires a manufacturer supplied conversion program and Word.

Note 2 - Requires a manufacturer supplied conversion program.

Note 3 - Only supported inside of a document.

Note 4 - Only supported on PRS-505, PRS-700, PRS-600, PRS-300.

Note 5 - Chinese version of iLiad only.

For more information