Lzip

Introduction

Lzip is a lossless data compressor with a user interface similar to that of gzip or bzip2. Lzip can compress about as fast as gzip (lzip -0) or compress most files more than bzip2 (lzip -9). Decompression speed is intermediate between gzip and bzip2. Lzip is better than gzip and bzip2 from a data recovery perspective. Lzip has been designed, written, and tested with great care to replace gzip and bzip2 as the standard general-purpose compressed format for unix-like systems.

The lzip file format is designed for data sharing and long-term archiving, taking into account both data integrity and decoder availability:

The lzip format provides very safe integrity checking and some data recovery means. The lziprecover program can repair bit flip errors (one of the most common forms of data corruption) in lzip files, and provides data recovery capabilities, including error-checked merging of damaged copies of a file.

The lzip format is as simple as possible (but not simpler). The lzip manual provides the source code of a simple decompressor along with a detailed explanation of how it works, so that with the help of the lzip manual alone a digital archaeologist could extract the data from an lzip file long after quantum computers eventually render LZMA obsolete.

Additionally, the lzip reference implementation is copylefted, which guarantees that it will remain free forever.
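
As a rough illustration of how little is needed to inspect a member, the Python sketch below parses the 6-byte header and checks the 20-byte trailer. The field layout (magic "LZIP", version 1, one coded dictionary-size byte, then CRC32, data size and member size stored little-endian) and the 4 KiB to 512 MiB range are assumed from the format description in the lzip manual and should be verified there. The function names are hypothetical, and no LZMA decoding is attempted.

```python
import struct
import zlib

MAGIC = b"LZIP"
HEADER_SIZE = 6      # magic (4) + version (1) + coded dictionary size (1)
TRAILER_SIZE = 20    # CRC32 (4) + data size (8) + member size (8)


def decode_dict_size(coded):
    """Decode the one-byte dictionary size field (layout assumed from the manual)."""
    base_log2 = coded & 0x1F      # bits 4-0: base 2 logarithm of the base size
    sixteenths = coded >> 5       # bits 7-5: sixteenths of the base size to subtract
    base = 1 << base_log2
    return base - sixteenths * (base // 16)


def parse_header(header):
    """Validate a member header and return its dictionary size in bytes."""
    if len(header) != HEADER_SIZE or header[:4] != MAGIC:
        raise ValueError("not an lzip member header")
    if header[4] != 1:
        raise ValueError("unsupported format version")
    size = decode_dict_size(header[5])
    if not (1 << 12) <= size <= (1 << 29):
        raise ValueError("invalid dictionary size")
    return size


def check_trailer(trailer, data, member_size):
    """Return True if the trailer agrees with the decompressed data and member size."""
    if len(trailer) != TRAILER_SIZE:
        raise ValueError("trailer must be 20 bytes")
    crc, data_size, stored_member_size = struct.unpack("<IQQ", trailer)
    return (crc == zlib.crc32(data)
            and data_size == len(data)
            and stored_member_size == member_size)
```

For example, decode_dict_size(0xD3) evaluates to 327680 bytes (320 KiB): a base size of 2^19 minus six sixteenths of it.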

A nice feature of the lzip format is that a corrupt byte is easier to repair the nearer it is to the beginning of the file. Therefore, with the help of lziprecover, losing an entire archive just because of a corrupt byte near the beginning is a thing of the past.

Lzip uses the same well-defined exit status values used by bzip2, which makes it safer than compressors returning ambiguous warning values (like gzip) when it is used as a back end for other programs like tar or zutils.
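
A front end can rely on these values to tell failure modes apart. The following Python sketch assumes the meanings documented in the lzip manual (0 normal exit, 1 environmental problem, 2 corrupt or invalid input, 3 internal consistency error) and that an lzip executable is available on the PATH. The names describe_exit and check_integrity are hypothetical helpers, not part of lzip.

```python
import subprocess

# Meanings assumed from the lzip manual; verify against the installed version.
EXIT_MEANINGS = {
    0: "normal exit",
    1: "environmental problem (file not found, invalid option, I/O error)",
    2: "corrupt or invalid input file",
    3: "internal consistency error",
}


def describe_exit(status):
    """Translate an exit status into a human-readable description."""
    return EXIT_MEANINGS.get(status, "unexpected exit status")


def check_integrity(path, lzip="lzip"):
    """Run 'lzip -t' on path and return (status, description)."""
    result = subprocess.run([lzip, "-t", path], capture_output=True)
    return result.returncode, describe_exit(result.returncode)
```

Because a corrupt file (2) is distinguished from a missing file or bad option (1), a calling program can decide whether to attempt recovery or to fix its own invocation.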

Introductory links

Benchmark - Some tests showing how well lzip can replace gzip and bzip2 as a general-purpose compressor for unix-like systems from a performance point of view.

Quality assurance - Design, development and testing of lzip.

The lzip format (slides) - Talk given at the GNU Hackers Meeting 2019.

Xz format inadequate for long-term archiving - This article describes the reasons why you should switch to lzip if you are using xz for anything other than compressing short-lived executables.

Other features

Lzip will automatically use for each file the largest dictionary size that exceeds neither the file size nor the limit given. Keep in mind that the decompression memory requirement is affected at compression time by the choice of dictionary size limit.
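
The selection rule can be read as a simple clamp. The Python sketch below is a simplified interpretation, not lzip's actual code: it assumes the 4 KiB minimum and 512 MiB maximum dictionary sizes given in the lzip manual, ignores any rounding to sizes representable in the file header, and uses the hypothetical name effective_dict_size.

```python
MIN_DICT_SIZE = 1 << 12   # 4 KiB, assumed from the lzip manual
MAX_DICT_SIZE = 1 << 29   # 512 MiB, assumed from the lzip manual


def effective_dict_size(file_size, limit):
    """Largest dictionary size exceeding neither file_size nor limit (simplified)."""
    limit = max(MIN_DICT_SIZE, min(limit, MAX_DICT_SIZE))
    return max(MIN_DICT_SIZE, min(file_size, limit))
```

Under this reading, a 100 kB file compressed with an 8 MiB limit needs only about 100 kB of dictionary at decompression time, while a multi-gigabyte file compressed with the same limit needs the full 8 MiB.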

When compressing, lzip replaces every file given in the command line with a compressed version of itself, with the name "original_name.lz".

(De)compressing a file is much like copying or moving it; therefore lzip preserves the access and modification dates, permissions, and, when possible, ownership of the file just as "cp -p" does. (If the user ID or the group ID can't be duplicated, the file permission bits S_ISUID and S_ISGID are cleared.)
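
The behavior described above can be approximated in a few lines. This Python sketch is an illustration of the "cp -p" style rule, not lzip's implementation; copy_metadata is a hypothetical helper, and it assumes a POSIX-like system for the ownership step.

```python
import os
import stat


def copy_metadata(src, dst):
    """Copy ownership (when possible), permissions and timestamps from src to dst."""
    st = os.stat(src)
    mode = stat.S_IMODE(st.st_mode)
    try:
        os.chown(dst, st.st_uid, st.st_gid)
    except (OSError, AttributeError):
        # Ownership could not be duplicated: drop set-user-ID and set-group-ID
        # so the new owner does not inherit elevated permission bits.
        mode &= ~(stat.S_ISUID | stat.S_ISGID)
    # chmod comes after chown because changing ownership may reset these bits.
    os.chmod(dst, mode)
    os.utime(dst, ns=(st.st_atime_ns, st.st_mtime_ns))
```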

Lzip is able to read from some types of non-regular files if the option '--stdout' is specified.

If no file names are specified, lzip compresses (or decompresses) from standard input to standard output. In this case, lzip will decline to write compressed output to a terminal, as this would be entirely incomprehensible and therefore pointless.

Lzip will correctly decompress a file which is the concatenation of two or more compressed files. The result is the concatenation of the corresponding decompressed files. Integrity testing of concatenated compressed files is also supported.

Lzip can produce multimember files, and lziprecover can safely recover the undamaged members in case of file damage. Lzip can also split the compressed output into volumes of a given size, even when reading from standard input. This allows the direct creation of multivolume compressed tar archives.
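
One way to see why members can be handled independently is to locate them without decompressing anything. The Python sketch below walks a multimember file backwards using the member size stored in each trailer. It assumes the layout described in the lzip manual (6-byte header starting with "LZIP", 20-byte trailer whose last 8 bytes hold the member size in little-endian order), does not handle trailing data or damaged members, and uses the hypothetical name member_offsets.

```python
import struct

HEADER_SIZE = 6
TRAILER_SIZE = 20


def member_offsets(blob):
    """Return a list of (offset, size) pairs, one per member, in file order."""
    members = []
    end = len(blob)
    while end > 0:
        if end < HEADER_SIZE + TRAILER_SIZE:
            raise ValueError("truncated member")
        # The last 8 bytes of the trailer hold the total size of the member.
        (member_size,) = struct.unpack_from("<Q", blob, end - 8)
        start = end - member_size
        if (member_size < HEADER_SIZE + TRAILER_SIZE or start < 0
                or blob[start:start + 4] != b"LZIP"):
            raise ValueError("bad member size or missing header")
        members.append((start, member_size))
        end = start
    members.reverse()
    return members
```

A recovery tool can use such an index to test each member separately and keep the ones whose integrity check succeeds.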

Lzip is able to compress and decompress streams of unlimited size by automatically creating multimember output. The members so created are large, about 2 PiB each.

In spite of its name (Lempel-Ziv-Markov chain-Algorithm), LZMA is not a concrete algorithm; it is more like "any algorithm using the LZMA coding scheme". For example, the option '-0' of lzip uses the scheme in almost the simplest way possible: issuing the longest match it can find, or a literal byte if it can't find a match. Conversely, a much more elaborate way of finding coding sequences of minimum size than the one currently used by lzip could be developed, and the resulting sequence could also be coded using the LZMA coding scheme.

Lzip currently implements two variants of the LZMA algorithm: fast (used by option '-0') and normal (used by all other compression levels).
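
The greedy strategy described for option '-0' (emit the longest match available, otherwise a literal byte) can be sketched as a toy parser. The Python code below is not lzip's implementation: it uses a brute-force search instead of hash chains or match finders, the window and length limits are arbitrary illustrative values, and the tokens are not range coded. The names greedy_parse and expand are hypothetical.

```python
def greedy_parse(data, window=4096, min_len=3, max_len=273):
    """Split data into ("literal", byte) and ("match", distance, length) tokens."""
    tokens = []
    pos = 0
    while pos < len(data):
        best_len, best_dist = 0, 0
        # Brute-force search of the sliding window for the longest match.
        for cand in range(max(0, pos - window), pos):
            length = 0
            while (length < max_len and pos + length < len(data)
                   and data[cand + length] == data[pos + length]):
                length += 1
            if length > best_len:
                best_len, best_dist = length, pos - cand
        if best_len >= min_len:
            tokens.append(("match", best_dist, best_len))
            pos += best_len
        else:
            tokens.append(("literal", data[pos]))
            pos += 1
    return tokens


def expand(tokens):
    """Rebuild the original bytes from the token sequence."""
    out = bytearray()
    for token in tokens:
        if token[0] == "literal":
            out.append(token[1])
        else:
            _, dist, length = token
            for _ in range(length):
                out.append(out[-dist])   # byte by byte, so overlapping copies work
    return bytes(out)
```

A more elaborate encoder would weigh the coded cost of alternative token sequences instead of always taking the longest match, yet its output could be expanded by the same decoder.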

The high compression of LZMA comes from combining two basic, well-proven compression ideas: sliding dictionaries (LZ77/78) and Markov models (the thing used by every compression algorithm that uses a range encoder or similar order-0 entropy coder as its last stage) with segregation of contexts according to what the bits are used for.
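
The benefit of segregating contexts can be illustrated with a toy adaptive bit model. The Python sketch below assumes the 11-bit probabilities and shift-by-5 update commonly described for LZMA, and it measures ideal coded size with a logarithm instead of running a real range encoder. BitModel and coded_size are hypothetical names, and the context labels are arbitrary.

```python
import math

PROB_BITS = 11                       # probabilities are scaled by 2**11 (assumed)
PROB_INIT = 1 << (PROB_BITS - 1)     # start at one half
MOVE_BITS = 5                        # adaptation speed (assumed)


class BitModel:
    """Adaptive estimate of the probability that the next bit is 0."""

    def __init__(self):
        self.prob = PROB_INIT

    def cost(self, bit):
        """Ideal cost in bits of coding this bit under the current estimate."""
        p0 = self.prob / (1 << PROB_BITS)
        return -math.log2(p0 if bit == 0 else 1.0 - p0)

    def update(self, bit):
        if bit == 0:
            self.prob += ((1 << PROB_BITS) - self.prob) >> MOVE_BITS
        else:
            self.prob -= self.prob >> MOVE_BITS


def coded_size(bits, contexts):
    """Total ideal cost when each bit is modelled by the BitModel of its context."""
    models = {}
    total = 0.0
    for bit, ctx in zip(bits, contexts):
        model = models.setdefault(ctx, BitModel())
        total += model.cost(bit)
        model.update(bit)
    return total
```

Feeding the same alternating bit stream once with a single shared context and once with one context per bit role shows the effect: the shared model stays near one bit per symbol, while the segregated models quickly learn that each role is highly predictable.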

The ideas embodied in lzip are due to (at least) the following people: Abraham Lempel and Jacob Ziv (for the LZ algorithm), Andrey Markov (for the definition of Markov chains), G.N.N. Martin (for the definition of range encoding), Igor Pavlov (for putting all the above together in LZMA), and Julian Seward (for bzip2's CLI).

Related projects

Plzip - A multi-threaded compressor using the lzip file format.

Lzlib - A compression library for the lzip file format, written in C.

Lunzip - A decompressor for lzip files, written in C.

Clzip - A C implementation of lzip for systems lacking a C++ compiler.

Lziprecover - A data recovery tool and decompressor for the lzip format.

Zutils - Replacement for zcat, zdiff, zgrep, etc., that understands lzip, bzip2 and gzip formats.

Pdlzip - A limited, "public domain" C implementation of the lzip data compressor, intended for those who can't distribute GPL licensed Free Software. Pdlzip is also able to decompress legacy lzma-alone (.lzma) files.

Lzd - An educational decompressor for the lzip format.

Xlunzip - A test tool for the lzip_decompress linux module.

Tarlz - An archiver with multimember lzip compression.

Documentation

The manual is available in the info system of the GNU Operating System. Use "info" to access the top level info page. Use "info lzip" to access the lzip section directly.

An online manual for lzip can be found here.

Download

The latest released version of lzip can be found at http://download.savannah.gnu.org/releases/lzip/. You may also subscribe to lzip-bug and receive an email every time a new version is released.

A Windows32 port of lzip can be downloaded from the Savannah download link just above. More ports of lzip for Windows can be found in the Links section below. A Windows port (32 and 64 bits) of plzip can be downloaded from the plzip page above.

Once lzip is installed, the files from archive "foo.tar.lz" can be extracted using the commands "tar -xf foo.tar.lz" or "lzip -cd foo.tar.lz | tar -xf -".

How to get help

For general discussion of bugs in lzip, the mailing list lzip-bug@nongnu.org is the most appropriate forum. Please send messages as plain text, not as HTML, base64-encoded MIME, or multiple alternative formats. Please include a descriptive subject line; if every subject line reads "bug in lzip", it is impossible to tell the messages apart.

An archive of the bug report mailing list is available at http://lists.gnu.org/mailman/listinfo/lzip-bug.

How to help

To contact the author, either to report a bug or to contribute fixes or improvements, send mail to lzip-bug@nongnu.org. Please send messages as plain text. Patches should be in unified diff format against the latest version and should include a text description.

See also the lzip project page at Savannah.

Links

7-Zip ZStandard Edition - A version of 7-Zip with lzip decompression support built in.

Atool, Patool - Command line archive managers that understand lzip files.

GNU Automake - A Makefile generator able to create lzip-compressed tarballs.

Documentation as an indicator of code quality - A different review of lzip.

Dragora GNU/Linux - A GNU/Linux distribution using lzip in its package system.

Easylzma - C library and tools for lzip and lzma-alone file formats.

File Roller - An archive manager for GNOME that understands lzip files.

Lesspipe.sh - View the contents of lzipped files with the pager less.

Libarchive - Multi-format archive and compression library with lzip support.

Littleutils - Convert your files to lzip format.

Man-db - An implementation of the Unix man command able to read lzipped pages.

Midnight Commander - A visual file manager that understands lzip files.

RPM - RPM Package Manager that uses lzip to compress packages.

GNU Tar - Automatically create and extract lzip-compressed tar archives.

GNU Texinfo - The GNU Documentation System understands lzip-compressed manuals.

Z - A simple, safe and convenient front-end for lzip, bzip2 and gzip.

Download lzip for AIX, ALT Linux, Amiga, Android, Arch Linux, DOS, Debian, Exherbo, Fedora, FreeBSD, Gentoo, HP-UX, Mac (fink), NetBSD, NixOS, OS/2, Slackware, Solaris (OpenCSW), Windows (Cygwin), Windows (ezwinports).

Bindings (Interfaces to languages other than C/C++)

Common Lisp, Haskell.

Licensing

Lzip is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 2 of the License, or (at your option) any later version.

Copyright © 2020 Antonio Diaz Diaz

Lzip logo Copyright © 2013 Sonia Diaz Pacheco

You are free to copy, modify and distribute all or part of this article without limitation.

Updated: 2020-03-15