Portable Network Graphics (PNG) Specification (Second Edition)

Information technology — Computer graphics and image processing — Portable Network Graphics (PNG): Functional specification. ISO/IEC 15948:2003 (E)

W3C Recommendation 10 November 2003

This version: http://www.w3.org/TR/2003/REC-PNG-20031110

Latest version: http://www.w3.org/TR/PNG

Previous version: http://www.w3.org/TR/2003/PR-PNG-20030520

Editor: David Duce, Oxford Brookes University (Second Edition)

Authors: See author list

Please refer to the errata for this document, which may include some normative corrections.

See also the translations of this document.

Copyright © 2003 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark, document use and software licensing rules apply.

This document describes PNG (Portable Network Graphics), an extensible file format for the lossless, portable, well-compressed storage of raster images. PNG provides a patent-free replacement for GIF and can also replace many common uses of TIFF. Indexed-color, grayscale, and truecolor images are supported, plus an optional alpha channel. Sample depths range from 1 to 16 bits.

PNG is designed to work well in online viewing applications, such as the World Wide Web, so it is fully streamable with a progressive display option. PNG is robust, providing both full file integrity checking and simple detection of common transmission errors. Also, PNG can store gamma and chromaticity data for improved color matching on heterogeneous platforms.

This specification defines an Internet Media Type image/png.

Status of this document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document is the 14 October 2003 W3C Recommendation of the PNG specification, second edition. It is also International Standard, ISO/IEC 15948:2003. The two documents have exactly identical content except for cover page and boilerplate differences as appropriate to the two organisations.

This International Standard is strongly based on the W3C Recommendation 'PNG Specification Version 1.0' which was reviewed by W3C members, approved as a W3C Recommendation and published in October 1996. This second edition incorporates all known errata and clarifications.

A complete review of the document has been done by ISO/IEC/JTC 1/SC 24 in collaboration with W3C and the PNG development group (the original authors of the PNG 1.0 Recommendation) in order to transform that Recommendation into an ISO/IEC international standard. A major design goal during this review was to avoid changes that will invalidate existing files, editors, or viewers that conform to W3C Recommendation PNG Specification Version 1.0.

The PNG specification enjoys a good level of implementation with good interoperability. At the time of this publication more than 180 image viewers could display PNG images and over 100 image editors could read and write valid PNG files. Full support of PNG is required for conforming SVG viewers; at the time of publication all eighteen SVG viewers had PNG support. HTML has no required image formats, but over 60 HTML browsers had at least basic support of PNG images.

Public comments on this W3C Recommendation are welcome. Please send them to the archived list png-group@w3.org.

The latest information regarding patent disclosures related to this document is available on the Web. As of this publication, the PNG Group are not aware of any royalty-bearing patents they believe to be essential to PNG.

This document has been produced by ISO/IEC JTC1 SC24 and the PNG Group as part of the Graphics Activity within the W3C Interaction Domain.

Note: To provide the highest quality images, this specification uses SVG diagrams with a PNG fallback using the HTML object element. SVG-enabled browsers will see the SVG figures with selectable text, other browsers will display the raster PNG version. W3C is aware that there is a known incompatibility between the unsupported beta of Adobe SVG plugin for Linux and Mozilla versions greater than 0.9.9 due to changes in the plug-in API, causing a browser crash. Therefore, a normative PNG-only alternative version is available that does not use an object element. The two versions are otherwise identical.

Available languages

The English version of this specification is the only normative version. However, for translations in other languages see http://www.w3.org/Consortium/Translation/.

The design goals for this International Standard were:

Portability: encoding, decoding, and transmission should be software and hardware platform independent.

Completeness: it should be possible to represent truecolour, indexed-colour, and greyscale images, in each case with the option of transparency, colour space information, and ancillary information such as textual comments.

Serial encode and decode: it should be possible for datastreams to be generated serially and read serially, allowing the datastream format to be used for on-the-fly generation and display of images across a serial communication channel.

Progressive presentation: it should be possible to transmit datastreams so that an approximation of the whole image can be presented initially, and progressively enhanced as the datastream is received.

Robustness to transmission errors: it should be possible to detect datastream transmission errors reliably.

Losslessness: filtering and compression should preserve all information.

Performance: any filtering, compression, and progressive image presentation should be aimed at efficient decoding and presentation. Fast encoding is a less important goal than fast decoding. Decoding speed may be achieved at the expense of encoding speed.

Compression: images should be compressed effectively, consistent with the other design goals.

Simplicity: developers should be able to implement the standard easily.

Interchangeability: any standard-conforming PNG decoder shall be capable of reading all conforming PNG datastreams.

Flexibility: future extensions and private additions should be allowed for without compromising the interchangeability of standard PNG datastreams.

Freedom from legal restrictions: no algorithms should be used that are not freely available.

This International Standard specifies a datastream and an associated file format, Portable Network Graphics (PNG, pronounced "ping"), for a lossless, portable, compressed individual computer graphics image transmitted across the Internet. Indexed-colour, greyscale, and truecolour images are supported, with optional transparency. Sample depths range from 1 to 16 bits. PNG is fully streamable with a progressive display option. It is robust, providing both full file integrity checking and simple detection of common transmission errors. PNG can store gamma and chromaticity data as well as a full ICC colour profile for accurate colour matching on heterogeneous platforms. This Standard defines the Internet Media type "image/png". The datastream and associated file format have value outside of the main design goal.

The following normative documents contain provisions which, through reference in this text, constitute provisions of this International Standard. For dated references, subsequent amendments to, or revisions of, any of these publications do not apply. However, parties to agreements based on this International Standard are encouraged to investigate the possibility of applying the most recent editions of the normative documents indicated below. For undated references, the latest edition of the normative document referred to applies. Members of ISO and IEC maintain registers of currently valid International Standards.

ISO 639:1988, Code for the representation of names of languages.

ISO/IEC 646:1991, International Organization for Standardization, Information technology — ISO 7-bit coded character set for information interchange.

ISO/IEC 3309:1993, Information Technology — Telecommunications and information exchange between systems — High-level data link control (HDLC) procedures — Frame structure.

ISO/IEC 8859-1:1998, Information technology — 8-bit single-byte coded graphic character sets — Part 1: Latin alphabet No. 1.

For convenience, here is a non-normative sample text file describing the codes and associated character names.

ISO/IEC 9899:1990(R1997), Programming languages — C.

ISO/IEC 10646-1:1993/AMD.2, Information technology — Universal Multiple-Octet Coded Character Sets (UCS) — Part 1: Architecture and Basic Multilingual Plane.

IEC 61966-2-1, Multimedia systems and equipment — Colour measurement and management — Part 2-1: Default RGB colour space — sRGB, available at http://www.iec.ch/.

CIE-15.2, CIE, "Colorimetry, Second Edition". CIE Publication 15.2-1986. ISBN 3-900-734-00-3.

ICC-1, International Color Consortium, "Specification ICC.1: 1998-09, File Format for Color Profiles", 1998, available at http://www.color.org/

ICC-1A, International Color Consortium, "Specification ICC.1A: 1999-04, Addendum 2 to ICC.1: 1998-09", 1999, available at http://www.color.org/

RFC-1123, Braden, R., Editor, "Requirements for Internet Hosts — Application and Support", STD 3, RFC 1123, USC/Information Sciences Institute, October 1989.

http://www.ietf.org/rfc/rfc1123.txt

RFC-1950, Deutsch, P. and Gailly, J-L., "ZLIB Compressed Data Format Specification version 3.3", RFC 1950, Aladdin Enterprises, May 1996.

http://www.ietf.org/rfc/rfc1950.txt

RFC-1951, Deutsch, P., "DEFLATE Compressed Data Format Specification version 1.3", RFC 1951, Aladdin Enterprises, May 1996.

http://www.ietf.org/rfc/rfc1951.txt

RFC-2045, Freed, N. and Borenstein, N., "MIME (Multipurpose Internet Mail Extensions) Part One: Format of Internet Message Bodies", RFC 2045, Innosoft, First Virtual, November 1996.

http://www.ietf.org/rfc/rfc2045.txt

RFC-2048, Freed, N., Klensin, J. and Postel, J., "Multipurpose Internet Mail Extensions (MIME) Part Four: Registration Procedures", RFC 2048, Innosoft, MCI, ISI, November 1996.

http://www.ietf.org/rfc/rfc2048.txt

RFC-3066, Alvestrand, H., "Tags for the Identification of Languages", RFC 3066, Cisco Systems, January 2001. (Obsoletes RFC 1766.)

http://www.ietf.org/rfc/rfc3066.txt

For the purposes of this International Standard the following definitions apply.

3.2.1 CRC Cyclic Redundancy Code. A CRC is a type of check value designed to detect most transmission errors. A decoder calculates the CRC for the received data and checks by comparing it to the CRC calculated by the encoder and appended to the data. A mismatch indicates that the data or the CRC were corrupted in transit.

3.2.2 CRT Cathode Ray Tube: a common type of computer display hardware.

3.2.3 LSB Least Significant Byte of a multi-byte value.

3.2.4 LUT Look Up Table. In frame buffer hardware, a LUT can be used to map indexed-colour pixels into a selected set of truecolour values, or to perform gamma correction. In software, a LUT can often be used as a fast way of implementing any mathematical function of a single integer variable.

3.2.5 MSB Most Significant Byte of a multi-byte value.

This International Standard specifies the PNG datastream, and places some requirements on PNG encoders, which generate PNG datastreams, PNG decoders, which interpret PNG datastreams, and PNG editors, which transform one PNG datastream into another. It does not specify the interface between an application and either a PNG encoder, decoder, or editor. The precise form in which an image is presented to an encoder or delivered by a decoder is not specified. Four kinds of image are distinguished.

The source image is the image presented to a PNG encoder.

The reference image, which only exists conceptually, is a rectangular array of rectangular pixels, all having the same width and height, and all containing the same number of unsigned integer samples, either three (red, green, blue) or four (red, green, blue, alpha). The array of all samples of a particular kind (red, green, blue, or alpha) is called a channel. Each channel has a sample depth in the range 1 to 16, which is the number of bits used by every sample in the channel. Different channels may have different sample depths. The red, green, and blue samples determine the intensities of the red, green, and blue components of the pixel's colour; if they are all zero, the pixel is black, and if they all have their maximum values (2^sampledepth - 1), the pixel is white. The alpha sample determines a pixel's degree of opacity, where zero means fully transparent and the maximum value means fully opaque. In a three-channel reference image all pixels are fully opaque. (It is also possible for a four-channel reference image to have all pixels fully opaque; the difference is that the latter has a specific alpha sample depth, whereas the former does not.) Each horizontal row of pixels is called a scanline. Pixels are ordered from left to right within each scanline, and scanlines are ordered from top to bottom. A PNG encoder may transform the source image directly into a PNG image, but conceptually it first transforms the source image into a reference image, then transforms the reference image into a PNG image. Depending on the type of source image, the transformation from the source image to a reference image may require the loss of information. That transformation is beyond the scope of this International Standard. The reference image, however, can always be recovered exactly from a PNG datastream.

The PNG image is obtained from the reference image by a series of transformations: alpha separation, indexing, RGB merging, alpha compaction, and sample depth scaling. Five types of PNG image are defined (see 6.1: Colour types and values). (If the PNG encoder actually transforms the source image directly into the PNG image, and the source image format is already similar to the PNG image format, the encoder may be able to avoid doing some of these transformations.) Although not all sample depths in the range 1 to 16 bits are explicitly supported in the PNG image, the number of significant bits in each channel of the reference image may be recorded. All channels in the PNG image have the same sample depth. A PNG encoder generates a PNG datastream from the PNG image. A PNG decoder takes the PNG datastream and recreates the PNG image.

The delivered image is constructed from the PNG image obtained by decoding a PNG datastream. No specific format is specified for the delivered image. A viewer presents an image to the user as close to the appearance of the original source image as it can achieve.

The relationships between the four kinds of image are illustrated in figure 4.1.

Figure 4.1 — Relationships between source, reference, PNG, and display images

The relationships between samples, channels, pixels, and sample depth are illustrated in figure 4.2.

Figure 4.2 — Relationships between sample, sample depth, pixel, and channel

The RGB colour space in which colour samples are situated may be specified in one of three ways:

by an ICC profile; by specifying explicitly that the colour space is sRGB when the samples conform to this colour space; by specifying the value of gamma and the 1931 CIE x,y chromaticities of the red, green, and blue primaries used in the image and the reference white point.

For high-end applications the first method provides the most flexibility and control. The second method enables one particular colour space to be indicated. The third method enables the exact chromaticities of the RGB data to be specified, along with the gamma correction (the power function relating the desired display output with the image samples) to be applied (see Annex C: Gamma and chromaticity). It is recommended that explicit gamma information also be provided when either the first or second method is used, for use by PNG decoders that do not support full ICC profiles or the sRGB colour space. Such PNG decoders can still make sensible use of gamma information. PNG decoders are strongly encouraged to use this information, plus information about the display system, in order to present the image to the viewer in a way that reproduces as closely as possible what the image's original author saw.

Gamma correction is not applied to the alpha channel, if present. Alpha samples always represent a linear fraction of full opacity.

A number of transformations are applied to the reference image to create the PNG image to be encoded (see figure 4.3). The transformations are applied in the following sequence, where square brackets mean the transformation is optional:

[alpha separation]
indexing or ( [RGB merging] [alpha compaction] )
sample depth scaling

When every pixel is either fully transparent or fully opaque, the alpha separation, alpha compaction, and indexing transformations can cause the recovered reference image to have an alpha sample depth different from the original reference image, or to have no alpha channel. This has no effect on the degree of opacity of any pixel. The two reference images are considered equivalent, and the transformations are considered lossless. Encoders that nevertheless wish to preserve the alpha sample depth may elect not to perform transformations that would alter the alpha sample depth.

Figure 4.3 — Reference image to PNG image transformation

If all alpha samples in a reference image have the maximum value, then the alpha channel may be omitted, resulting in an equivalent image that can be encoded more compactly.
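The alpha separation test can be sketched in a few lines. This is only an illustration: the function name `drop_opaque_alpha` and the representation of pixels as `(r, g, b, a)` tuples are assumptions made for the example, not part of this International Standard.

```python
def drop_opaque_alpha(pixels, alpha_depth):
    """Hypothetical helper: pixels is a list of (r, g, b, a) tuples.

    Returns (pixels, has_alpha). The alpha channel is dropped only when
    every alpha sample has the maximum value for the given sample depth.
    """
    max_alpha = (1 << alpha_depth) - 1
    if all(p[3] == max_alpha for p in pixels):
        # Every pixel is fully opaque, so the alpha channel carries no information.
        return [p[:3] for p in pixels], False
    return list(pixels), True
```

For example, `drop_opaque_alpha([(1, 2, 3, 255)], 8)` yields a three-sample pixel list, while any pixel with an alpha value below 255 keeps the alpha channel in place.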

If the number of distinct pixel values is 256 or less, and the RGB sample depths are not greater than 8, and the alpha channel is absent or exactly 8 bits deep or every pixel is either fully transparent or fully opaque, then an alternative representation called indexed-colour may be more efficient for encoding. Each pixel is replaced by an index into a palette. The palette is a list of entries each containing three 8-bit samples (red, green, blue). If an alpha channel is present, there is also a parallel table of 8-bit alpha samples.
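A minimal sketch of the indexing step follows. It assumes the samples have already been brought to 8 bits and that pixels are hashable tuples; the name `build_palette` is hypothetical. It checks only the "256 or fewer distinct pixel values" condition, not the sample depth conditions.

```python
def build_palette(pixels):
    """Hypothetical helper: pixels is a list of (r, g, b) or (r, g, b, a) tuples.

    Returns (palette, indices), or None when the image has more than
    256 distinct pixel values and so cannot be indexed.
    """
    palette = []     # distinct pixel values, in order of first appearance
    index_of = {}    # pixel value -> position in palette
    indices = []     # one palette index per pixel
    for p in pixels:
        if p not in index_of:
            if len(palette) == 256:
                return None  # a 257th distinct value: indexing is not possible
            index_of[p] = len(palette)
            palette.append(p)
        indices.append(index_of[p])
    return palette, indices
```

When the pixels carry an alpha sample, the fourth element of each palette entry in this sketch plays the role of the parallel alpha table.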

Figure 4.4 — Indexed-colour image

A suggested palette or palettes may be constructed even when the PNG image is not indexed-colour in order to assist viewers that are capable of displaying only a limited number of colours.

For indexed-colour images, encoders can rearrange the palette so that the table entries with the maximum alpha value are grouped at the end. In this case the table can be encoded in a shortened form that does not include these entries.
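One way an encoder might perform this rearrangement is sketched here. The function name `compact_alpha_table` and the list-based representation are assumptions for illustration; the sketch relies on Python's stable sort so that the relative order of entries is otherwise preserved.

```python
def compact_alpha_table(palette, alphas, indices, max_alpha=255):
    """Hypothetical helper: move fully opaque palette entries to the end.

    Returns (new_palette, shortened_alphas, new_indices). The shortened
    alpha table omits the trailing entries whose alpha is max_alpha.
    """
    # False sorts before True, so non-opaque entries come first.
    order = sorted(range(len(palette)), key=lambda i: alphas[i] == max_alpha)
    remap = {old: new for new, old in enumerate(order)}
    new_palette = [palette[i] for i in order]
    shortened = [alphas[i] for i in order if alphas[i] != max_alpha]
    new_indices = [remap[i] for i in indices]
    return new_palette, shortened, new_indices
```

A decoder that reads the shortened table would treat every palette entry beyond its end as fully opaque.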

If the red, green, and blue channels have the same sample depth, and for each pixel the values of the red, green, and blue samples are equal, then these three channels may be merged into a single greyscale channel.
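The per-pixel equality test is simple to express. In this sketch the name `merge_rgb_to_grey` is hypothetical, and the three channels are assumed to share one sample depth already, so only the equality of the samples is checked.

```python
def merge_rgb_to_grey(pixels):
    """Hypothetical helper: pixels is a list of (r, g, b) tuples.

    Returns a list of grey samples when r == g == b for every pixel,
    otherwise None (the image must stay truecolour).
    """
    if all(r == g == b for r, g, b in pixels):
        return [r for r, _, _ in pixels]
    return None
```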

For non-indexed images, if there exists an RGB (or greyscale) value such that all pixels with that value are fully transparent while all other pixels are fully opaque, then the alpha channel can be represented more compactly by merely identifying the RGB (or greyscale) value that is transparent.
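The condition for alpha compaction can be checked as sketched here. The name `find_transparent_colour` is an assumption for illustration, as is the choice to return None when no pixel is transparent at all (in that case alpha separation would apply instead).

```python
def find_transparent_colour(pixels, max_alpha=255):
    """Hypothetical helper: pixels is a list of tuples ending in an alpha sample.

    Returns the single colour value that can stand for "fully transparent",
    or None if the alpha channel cannot be compacted this way.
    """
    transparent, opaque = set(), set()
    for *colour, alpha in pixels:
        colour = tuple(colour)
        if alpha == 0:
            transparent.add(colour)
        elif alpha == max_alpha:
            opaque.add(colour)
        else:
            return None  # partial transparency needs a full alpha channel
    # Exactly one transparent colour, and it is never used by an opaque pixel.
    if len(transparent) == 1 and not (transparent & opaque):
        return next(iter(transparent))
    return None
```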

In the PNG image, not all sample depths are supported (see 6.1: Colour types and values), and all channels shall have the same sample depth. All channels of the PNG image use the smallest allowable sample depth that is not less than any sample depth in the reference image, and the possible sample values in the reference image are linearly mapped into the next allowable range for the PNG image. Figure 4.5 shows how samples of depth 3 might be mapped into samples of depth 4.

Figure 4.5 — Scaling sample values

Allowing only a few sample depths reduces the number of cases that decoders have to cope with. Sample depth scaling is reversible with no loss of data, because the reference image sample depths can be recorded in the PNG datastream. In the absence of recorded sample depths, the reference image sample depth equals the PNG image sample depth. See 12.5: Sample depth scaling and 13.12: Sample depth rescaling.
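As a sketch of sample depth scaling, the following maps a sample linearly from one depth to another. Two things here are assumptions of the example rather than statements from this clause: the set `ALLOWED_DEPTHS` (the depths permitted for greyscale images; other image types allow fewer, see 6.1) and the choice to round to the nearest output value.

```python
ALLOWED_DEPTHS = (1, 2, 4, 8, 16)  # assumption: the greyscale case

def png_depth(reference_depths):
    """Smallest allowed depth that is not less than any reference depth."""
    need = max(reference_depths)
    return min(d for d in ALLOWED_DEPTHS if d >= need)

def scale_sample(value, in_depth, out_depth):
    """Linearly map value from [0, 2^in_depth - 1] to [0, 2^out_depth - 1]."""
    max_in = (1 << in_depth) - 1
    max_out = (1 << out_depth) - 1
    # Integer form of floor(value * max_out / max_in + 0.5).
    return (2 * value * max_out + max_in) // (2 * max_in)
```

With these definitions, `[scale_sample(v, 3, 4) for v in range(8)]` gives `[0, 2, 4, 6, 9, 11, 13, 15]`: zero maps to zero and the maximum maps to the maximum, as a linear mapping requires.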

Figure 4.6 — Possible PNG image pixel types

The transformation of the reference image results in one of five types of PNG image (see figure 4.6):

Truecolour with alpha: each pixel consists of four samples: red, green, blue, and alpha.

Greyscale with alpha: each pixel consists of two samples: grey and alpha.

Truecolour: each pixel consists of three samples: red, green, and blue. The alpha channel may be represented by a single pixel value. Matching pixels are fully transparent, and all others are fully opaque. If the alpha channel is not represented in this way, all pixels are fully opaque.

Greyscale: each pixel consists of a single sample: grey. The alpha channel may be represented by a single pixel value as in the previous case. If the alpha channel is not represented in this way, all pixels are fully opaque.

Indexed-colour: each pixel consists of an index into a palette (and into an associated table of alpha values, if present).

The format of each pixel depends on the PNG image type and the bit depth. For PNG image types other than indexed-colour, the bit depth specifies the number of bits per sample, not the total number of bits per pixel. For indexed-colour images, the bit depth specifies the number of bits in each palette index, not the sample depth of the colours in the palette or alpha table. Within the pixel the samples appear in the following order, depending on the PNG image type.

Truecolour with alpha: red, green, blue, alpha. Greyscale with alpha: grey, alpha. Truecolour: red, green, blue. Greyscale: grey. Indexed-colour: palette index.
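The relationship between image type, bit depth, and pixel size can be illustrated as follows. The table `CHANNELS` restates the sample lists above; the function names are hypothetical, and the rounding of a scanline up to a whole number of bytes is an assumption of this sketch that is not stated in this clause.

```python
# Number of values stored per pixel, taken from the sample order listed above.
CHANNELS = {
    "truecolour with alpha": 4,
    "greyscale with alpha": 2,
    "truecolour": 3,
    "greyscale": 1,
    "indexed-colour": 1,  # a single palette index
}

def bits_per_pixel(image_type, bit_depth):
    """Bit depth is per sample (or per palette index), not per pixel."""
    return CHANNELS[image_type] * bit_depth

def scanline_bytes(image_type, bit_depth, width):
    """Bytes needed for one scanline, rounded up to a whole byte (assumption)."""
    return (bits_per_pixel(image_type, bit_depth) * width + 7) // 8
```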

A conceptual model of the process of encoding a PNG image is given in figure 4.7. The steps refer to the operations on the array of pixels or indices in the PNG image. The palette and alpha table are not encoded in this way.

Pass extraction: to allow for progressive display, the PNG image pixels can be rearranged to form several smaller images called reduced images or passes.

Scanline serialization: the image is serialized a scanline at a time. Pixels are ordered left to right in a scanline and scanlines are ordered top to bottom.

Filtering: each scanline is transformed into a filtered scanline using one of the defined filter types to prepare the scanline for image compression.

Compression: occurs on all the filtered scanlines in the image.

Chunking: the compressed image is divided into conveniently sized chunks. An error detection code is added to each chunk.

Datastream construction: the chunks are inserted into the datastream.

Pass extraction (see figure 4.8) splits a PNG image into a sequence of reduced images where the first image defines a coarse view and subsequent images enhance this coarse view until the last image completes the PNG image. The set of reduced images is also called an interlaced PNG image. Two interlace methods are defined in this International Standard. The first method is a null method; pixels are stored sequentially from left to right and scanlines from top to bottom. The second method makes multiple scans over the image to produce a sequence of seven reduced images. The seven passes for a sample image are illustrated in figure 4.8. See clause 8: Interlacing and pass extraction.
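The seven-pass method can be sketched as below. The starting offsets and step sizes in `ADAM7` are not given in this clause; they are the values commonly associated with the interlace method detailed in clause 8 and should be checked against that clause. The function name and the list-of-rows representation are assumptions for illustration.

```python
# (x_start, y_start, x_step, y_step) for passes 1 to 7; see clause 8.
ADAM7 = [
    (0, 0, 8, 8), (4, 0, 8, 8), (0, 4, 4, 8), (2, 0, 4, 4),
    (0, 2, 2, 4), (1, 0, 2, 2), (0, 1, 1, 2),
]

def extract_passes(rows):
    """Hypothetical helper: rows is a list of scanlines (lists of pixels).

    Returns seven reduced images, each a list of scanlines. A reduced
    image may be empty when the source image is very small.
    """
    passes = []
    for x0, y0, dx, dy in ADAM7:
        reduced = [row[x0::dx] for row in rows[y0::dy]]
        passes.append([r for r in reduced if r])  # drop empty scanlines
    return passes
```

For an 8 by 8 image the seven passes contain 1, 1, 2, 4, 8, 16, and 32 pixels respectively, which together account for all 64 pixels exactly once.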

Figure 4.7 — Encoding the PNG image

Figure 4.8 — Pass extraction

Each row of pixels, called a scanline, is represented as a sequence of bytes.

PNG standardizes one filter method and several filter types that may be used to prepare image data for compression. It transforms the byte sequence in a scanline to an equal length sequence of bytes preceded by a filter type byte (see figure 4.9 for an example). The filter type byte defines the specific filtering to be applied to a specific scanline. The encoder shall use only a single filter method for an interlaced PNG image, but may use different filter types for each scanline in a reduced image. See clause 9: Filtering.
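To illustrate how a filter type byte precedes an equal-length byte sequence, here is a sketch of two filter types. The numbering (0 for no filtering, 1 for a filter that subtracts the corresponding byte of the pixel to the left) is taken from clause 9 rather than from this clause, and the function names and the `bpp` parameter (bytes per complete pixel) are assumptions of the example.

```python
def filter_none(scanline):
    """Filter type 0: the bytes are passed through unchanged."""
    return bytes([0]) + bytes(scanline)

def filter_sub(scanline, bpp):
    """Filter type 1: each byte minus the byte bpp positions earlier, modulo 256."""
    out = bytearray([1])
    for i, b in enumerate(scanline):
        left = scanline[i - bpp] if i >= bpp else 0
        out.append((b - left) % 256)
    return bytes(out)

def unfilter_sub(filtered, bpp):
    """Reverse filter_sub; shows that the filter loses no information."""
    if filtered[0] != 1:
        raise ValueError("not a type 1 filtered scanline")
    out = bytearray()
    for i, b in enumerate(filtered[1:]):
        left = out[i - bpp] if i >= bpp else 0
        out.append((b + left) % 256)
    return bytes(out)
```

A smooth gradient such as the bytes 10, 20, 30, 40 becomes 10, 10, 10, 10 after the second filter, which is the kind of repetition that the compression stage handles well.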

Figure 4.9 — Serializing and filtering a scanline

The sequence of filtered scanlines in the pass or passes of the PNG image is compressed (see figure 4.10) by one of the defined compression methods. The concatenated filtered scanlines form the input to the compression stage. The output from the compression stage is a single compressed datastream. See clause 10: Compression.
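A sketch of the compression stage using Python's standard `zlib` module is given here. It assumes the zlib/deflate format described in the RFC-1950 and RFC-1951 normative references; the function names are hypothetical.

```python
import zlib

def compress_filtered(filtered_scanlines):
    """Concatenate the filtered scanlines and compress them as one stream."""
    return zlib.compress(b"".join(filtered_scanlines))

def decompress_filtered(compressed):
    """Recover the concatenated filtered scanlines."""
    return zlib.decompress(compressed)
```

Because the scanlines are concatenated before compression, the compressor can exploit redundancy across scanline boundaries, and the output is a single datastream that the chunking stage can then split at arbitrary byte positions.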

Chunking provides a convenient breakdown of the compressed datastream into manageable chunks (see figure 4.10). Each chunk has its own redundancy check. See clause 11: Chunk specifications.

Figure 4.10 — Compression

Ancillary information may be associated with an image. Decoders may ignore all or some of the ancillary information. The types of ancillary information provided are described in Table 4.1.

Table 4.1 — Types of ancillary information

Type of information: Description

Background colour: Solid background colour to be used when presenting the image if no better option is available.

Gamma and chromaticity: Gamma characteristic of the image with respect to the desired output intensity, and chromaticity characteristics of the RGB values used in the image.

ICC profile: Description of the colour space (in the form of an International Color Consortium (ICC) profile) to which the samples in the image conform.

Image histogram: Estimates of how frequently the image uses each palette entry.

Physical pixel dimensions: Intended pixel size and aspect ratio to be used in presenting the PNG image.

Significant bits: The number of bits that are significant in the samples.

sRGB colour space: A rendering intent (as defined by the International Color Consortium) and an indication that the image samples conform to this colour space.

Suggested palette: A reduced palette that may be used when the display device is not capable of displaying the full range of colours in the image.

Textual data: Textual information (which may be compressed) associated with the image.

Time: The time when the PNG image was last modified.

Transparency: Alpha information that allows the reference image to be reconstructed when the alpha channel is not retained in the PNG image.

The PNG datastream consists of a PNG signature (see 5.2: PNG signature) followed by a sequence of chunks (see clause 11: Chunk specifications). Each chunk has a chunk type which specifies its function.

There are 18 chunk types defined in this International Standard. Chunk types are four-byte sequences chosen so that they correspond to readable labels when interpreted in the ISO 646.IRV:1991 character set. The first four are termed critical chunks, which shall be understood and correctly interpreted according to the provisions of this International Standard. These are:

IHDR: image header, which is the first chunk in a PNG datastream.

PLTE: palette table associated with indexed PNG images.

IDAT: image data chunks.

IEND: image trailer, which is the last chunk in a PNG datastream.

The remaining 14 chunk types are termed ancillary chunk types, which encoders may generate and decoders may interpret.

Errors in a PNG datastream fall into two general classes:

transmission errors or damage to a computer file system, which tend to corrupt much or all of the datastream; syntax errors, which appear as invalid values in chunks, or as missing or misplaced chunks. Syntax errors can be caused not only by encoding mistakes, but also by the use of registered or private values, if those values are unknown to the decoder.

PNG decoders should detect errors as early as possible, recover from errors whenever possible, and fail gracefully otherwise. The error handling philosophy is described in detail in 13.2: Error handling.

For some facilities in PNG, there are a number of alternatives defined, and this International Standard allows other alternatives to be defined by registration. According to the rules for the designation and operation of registration authorities in the ISO/IEC Directives, the ISO and IEC Councils have designated the following as the registration authority:

The World-Wide Web Consortium Host at ERCIM

The Registration Authority for PNG

INRIA-Sophia Antipolis

BP 93

06902 Sophia Antipolis Cedex

FRANCE

Email: png-group@w3.org

To ensure timely processing the Registration Authority should be contacted by email.

The following entities may be registered:

chunk type; text keyword.

The following entities are reserved for future standardization:

undefined field values less than 128; filter method; filter type; interlace method; compression method.

This clause defines the PNG signature and the basic properties of chunks. Individual chunk types are discussed in clause 11: Chunk specifications.

The first eight bytes of a PNG datastream always contain the following (decimal) values:

137 80 78 71 13 10 26 10

This signature indicates that the remainder of the datastream contains a single PNG image, consisting of a series of chunks beginning with an IHDR chunk and ending with an IEND chunk.
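A decoder's first step is typically to compare the leading bytes against these eight values. The following sketch does that; the names `PNG_SIGNATURE` and `has_png_signature` are chosen for this example.

```python
# The eight decimal values listed above, as a byte string.
PNG_SIGNATURE = bytes([137, 80, 78, 71, 13, 10, 26, 10])

def has_png_signature(data):
    """Return True when data begins with the eight-byte PNG signature."""
    return data[:8] == PNG_SIGNATURE
```

Bytes 2 to 4 (80, 78, 71) are the letters "PNG" in ISO 646, which makes the signature recognizable when a file is inspected by hand.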

Each chunk consists of three or four fields (see figure 5.1). The meaning of the fields is described in Table 5.1. The chunk data field may be empty.

Figure 5.1 — Chunk parts

Table 5.1 — Chunk fields

Length: A four-byte unsigned integer giving the number of bytes in the chunk's data field. The length counts only the data field, not itself, the chunk type, or the CRC. Zero is a valid length. Although encoders and decoders should treat the length as unsigned, its value shall not exceed 2^31-1 bytes.

Chunk Type: A sequence of four bytes defining the chunk type. Each byte of a chunk type is restricted to the decimal values 65 to 90 and 97 to 122. These correspond to the uppercase and lowercase ISO 646 letters (A-Z and a-z) respectively for convenience in description and examination of PNG datastreams. Encoders and decoders shall treat the chunk types as fixed binary values, not character strings. For example, it would not be correct to represent the chunk type IDAT by the equivalents of those letters in the UCS 2 character set. Additional naming conventions for chunk types are discussed in 5.4: Chunk naming conventions.

Chunk Data: The data bytes appropriate to the chunk type, if any. This field can be of zero length.

CRC: A four-byte CRC (Cyclic Redundancy Code) calculated on the preceding bytes in the chunk, including the chunk type field and chunk data fields, but not including the length field. The CRC can be used to check for corruption of the data. The CRC is always present, even for chunks containing no data. See 5.5: Cyclic Redundancy Code algorithm.

The chunk data length may be any number of bytes up to the maximum; therefore, implementors cannot assume that chunks are aligned on any boundaries larger than bytes.
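The chunk layout can be sketched with Python's standard `struct` and `zlib` modules. Two points are assumptions of this example rather than statements from this clause: that multi-byte integers are stored most significant byte first, and that `zlib.crc32` computes the same CRC as the algorithm in 5.5. The function names are hypothetical, and a truncated buffer is not handled gracefully.

```python
import struct
import zlib

def make_chunk(chunk_type, data=b""):
    """Build length + type + data + CRC; the CRC covers type and data only."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)

def read_chunk(buf, offset=0):
    """Parse one chunk at offset; returns (type, data, offset of next chunk)."""
    (length,) = struct.unpack_from(">I", buf, offset)
    if length > 2**31 - 1:
        raise ValueError("chunk length exceeds 2^31-1")
    chunk_type = buf[offset + 4:offset + 8]
    data = buf[offset + 8:offset + 8 + length]
    (crc,) = struct.unpack_from(">I", buf, offset + 8 + length)
    if zlib.crc32(chunk_type + data) & 0xFFFFFFFF != crc:
        raise ValueError("CRC mismatch")
    # 4 (length) + 4 (type) + length (data) + 4 (CRC)
    return chunk_type, data, offset + 12 + length
```

An empty chunk still occupies twelve bytes, since the length, chunk type, and CRC fields are always present.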

Chunk types are chosen to be meaningful names when the bytes of the chunk type are interpreted as ISO 646 letters. Chunk types are assigned so that a decoder can determine some properties of a chunk even when the type is not recognized. These rules allow safe, flexible extension of the PNG format, by allowing a PNG decoder to decide what to do when it encounters an unknown chunk. (The chunk types standardized in this International Standard are defined in clause 11: Chunk specifications, and the way to add non-standard chunks is defined in clause 14: Editors and extensions.) The naming rules are normally of interest only when the decoder does not recognize the chunk's type.

Four bits of the chunk type, the property bits, namely bit 5 (value 32) of each byte, are used to convey chunk properties. This choice means that a human can read off the assigned properties according to whether the letter corresponding to each byte of the chunk type is uppercase (bit 5 is 0) or lowercase (bit 5 is 1). However, decoders should test the properties of an unknown chunk type by numerically testing the specified bits; testing whether a character is uppercase or lowercase is inefficient, and even incorrect if a locale-specific case definition is used.

The property bits are an inherent part of the chunk type, and hence are fixed for any chunk type. Thus, CHNK and cHNk would be unrelated chunk types, not the same chunk with different properties.

The semantics of the property bits are defined in Table 5.2.

Table 5.2 — Semantics of property bits

Ancillary bit: first byte. 0 (uppercase) = critical, 1 (lowercase) = ancillary.

Critical chunks are necessary for successful display of the contents of the datastream, for example the image header chunk (IHDR). A decoder trying to extract the image, upon encountering an unknown chunk type in which the ancillary bit is 0, shall indicate to the user that the image contains information it cannot safely interpret.

Ancillary chunks are not strictly necessary in order to meaningfully display the contents of the datastream, for example the time chunk (tIME). A decoder encountering an unknown chunk type in which the ancillary bit is 1 can safely ignore the chunk and proceed to display the image.

Private bit: second byte. 0 (uppercase) = public, 1 (lowercase) = private.

A public chunk is one that is defined in this International Standard or is registered in the list of PNG special-purpose public chunk types maintained by the Registration Authority (see 4.9: Extension and registration). Applications can also define private (unregistered) chunk types for their own purposes. The names of private chunks have a lowercase second letter, while public chunks will always be assigned names with uppercase second letters. Decoders do not need to test the private-chunk property bit, since it has no functional significance; it is simply an administrative convenience to ensure that public and private chunk names will not conflict. See clause 14: Editors and extensions and 12.10.2: Use of private chunks.

Reserved bit: third byte. 0 (uppercase) in this version of PNG. If the reserved bit is 1, the datastream does not conform to this version of PNG.

The significance of the case of the third letter of the chunk name is reserved for possible future extension. In this International Standard, all chunk names shall have uppercase third letters.

Safe-to-copy bit: fourth byte. 0 (uppercase) = unsafe to copy, 1 (lowercase) = safe to copy.

This property bit is not of interest to pure decoders, but it is needed by PNG editors. This bit defines the proper handling of unrecognized chunks in a datastream that is being modified. Rules for PNG editors are discussed further in 14.2: Behaviour of PNG editors.

EXAMPLE The hypothetical chunk type "cHNk" has the property bits:

    cHNk  <-- 32 bit chunk type represented in text form
    ||||
    |||+- Safe-to-copy bit is 1 (lower case letter; bit 5 is 1)
    ||+-- Reserved bit is 0     (upper case letter; bit 5 is 0)
    |+--- Private bit is 0      (upper case letter; bit 5 is 0)
    +---- Ancillary bit is 1    (lower case letter; bit 5 is 1)

Therefore, this name represents an ancillary, public, safe-to-copy chunk.
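The numeric test recommended above (checking bit 5 of each byte rather than calling a case-conversion routine) might look like the following Python sketch. The function name and the dictionary keys are illustrative choices, not names defined by this International Standard.

```python
PROPERTY_BIT = 32  # bit 5 of each chunk type byte


def chunk_properties(chunk_type: bytes) -> dict:
    """Return the four property bits of a chunk type as booleans.

    The bits are tested numerically on the raw bytes, so the result does
    not depend on any locale-specific definition of upper or lower case.
    """
    if len(chunk_type) != 4:
        raise ValueError("a chunk type is exactly four bytes")
    return {
        "ancillary": bool(chunk_type[0] & PROPERTY_BIT),
        "private": bool(chunk_type[1] & PROPERTY_BIT),
        "reserved": bool(chunk_type[2] & PROPERTY_BIT),
        "safe_to_copy": bool(chunk_type[3] & PROPERTY_BIT),
    }
```

For the hypothetical chunk type in the example, `chunk_properties(b"cHNk")` reports the ancillary and safe-to-copy bits as set and the private and reserved bits as clear.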

CRC fields are calculated using standardized CRC methods with pre and post conditioning, as defined by ISO 3309 [ISO-3309] and ITU-T V.42 [ITU-T-V42]. The CRC polynomial employed is

x^{32} + x^{26} + x^{23} + x^{22} + x^{16} + x^{12} + x^{11} + x^{10} + x^{8} + x^{7} + x^{5} + x^{4} + x^{2} + x + 1

In PNG, the 32-bit CRC is initialized to all 1's, and then the data from each byte is processed from the least significant bit (1) to the most significant bit (128). After all the data bytes are processed, the CRC is inverted (its ones complement is taken). This value is transmitted (stored in the datastream) MSB first. For the purpose of separating into bytes and ordering, the least significant bit of the 32-bit CRC is defined to be the coefficient of the x^{31} term.

Practical calculation of the CRC often employs a precalculated table to accelerate the computation. See Annex D: Sample Cyclic Redundancy Code implementation.
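As an illustration of the procedure just described, here is a bit-at-a-time Python sketch (deliberately not the table-driven form of Annex D). The constant 0xEDB88320 is the polynomial above written with its bits reversed, which is the form needed when each byte is processed least significant bit first; the function name is hypothetical.

```python
REFLECTED_POLY = 0xEDB88320  # CRC polynomial with bit order reversed


def png_crc(data: bytes) -> int:
    """Compute the 32-bit CRC over `data` (chunk type plus chunk data)."""
    crc = 0xFFFFFFFF                      # pre-conditioning: all 1's
    for byte in data:
        crc ^= byte
        for _ in range(8):                # least significant bit first
            if crc & 1:
                crc = (crc >> 1) ^ REFLECTED_POLY
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF               # post-conditioning: invert
```

Applied to the four type bytes of an empty IEND chunk, this yields 0xAE426082, the value stored (MSB first) at the end of a PNG datastream. A table-driven version trades 1 KiB of memory for eight times fewer loop iterations.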

The constraints on the positioning of the individual chunks are listed in Table 5.3 and illustrated diagrammatically in figure 5.2 and figure 5.3. These lattice diagrams represent the constraints on positioning imposed by this International Standard. The lines in the diagrams define partial ordering relationships. Chunks higher up shall appear before chunks lower down. Chunks which are horizontally aligned and appear between two other chunk types (higher and lower than the horizontally aligned chunks) may appear in any order between the two higher and lower chunk types to which they are connected. The superscript associated with the chunk type is defined in Table 5.4. It indicates whether the chunk is mandatory, optional, or may appear more than once. A vertical bar between two chunk types indicates alternatives.

Table 5.4 — Meaning of symbols used in lattice diagrams

    Symbol  Meaning
    +       One or more
    1       Only one
    ?       Zero or one
    *       Zero or more
    |       Alternative

Figure 5.2 — Lattice diagram: PNG images with PLTE in datastream

Figure 5.3 — Lattice diagram: PNG images without PLTE in datastream

As explained in 4.4: PNG image there are five types of PNG image. Corresponding to each type is a colour type, which is the sum of the following values: 1 (palette used), 2 (truecolour used) and 4 (alpha used). Greyscale and truecolour images may have an explicit alpha channel. The PNG image types and corresponding colour types are listed in Table 6.1.

Table 6.1 — PNG image types and colour types

    PNG image type          Colour type
    Greyscale               0
    Truecolour              2
    Indexed-colour          3
    Greyscale with alpha    4
    Truecolour with alpha   6
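Because the colour type is a sum of flag values, the number of values stored for each pixel can be derived from it. The Python sketch below shows one way to do this; the constant and function names are illustrative, and for indexed-colour the count refers to the single palette index stored in the datastream, not to the palette entry it selects.

```python
PALETTE_USED = 1
TRUECOLOUR_USED = 2
ALPHA_USED = 4

IMAGE_TYPES = {
    0: "greyscale",
    2: "truecolour",
    3: "indexed-colour",
    4: "greyscale with alpha",
    6: "truecolour with alpha",
}


def values_per_pixel(colour_type: int) -> int:
    """Number of samples (or palette indices) stored for each pixel."""
    if colour_type not in IMAGE_TYPES:
        raise ValueError("colour type %d is not defined" % colour_type)
    if colour_type & PALETTE_USED:
        return 1                          # one palette index per pixel
    count = 3 if colour_type & TRUECOLOUR_USED else 1
    if colour_type & ALPHA_USED:
        count += 1                        # alpha follows the colour samples
    return count
```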

The allowed bit depths and sample depths for each PNG image type are listed in 11.2.2: IHDR Image header.

Greyscale samples represent luminance if the transfer curve is indicated (by gAMA, sRGB, or iCCP) or device-dependent greyscale if not. RGB samples represent calibrated colour information if the colour space is indicated (by gAMA and cHRM, or sRGB, or iCCP) or uncalibrated device-dependent colour if not.

Sample values are not necessarily proportional to light intensity; the gAMA chunk specifies the relationship between sample values and display output intensity. Viewers are strongly encouraged to compensate properly. See 4.2: Colour spaces, 13.13: Decoder gamma handling and Annex C: Gamma and chromaticity.

In a PNG datastream transparency may be represented in one of four ways, depending on the PNG image type (see 4.3.2: Alpha separation and 4.3.5: Alpha compaction).

Truecolour with alpha, greyscale with alpha: an alpha channel is part of the image array.

Truecolour, greyscale: a tRNS chunk contains a single pixel value distinguishing the fully transparent pixels from the fully opaque pixels.

Indexed-colour: a tRNS chunk contains the alpha table that associates an alpha sample with each palette entry.

Truecolour, greyscale, indexed-colour: there is no tRNS chunk present and all pixels are fully opaque.

An alpha channel included in the image array has 8-bit or 16-bit samples, the same size as the other samples. The alpha sample for each pixel is stored immediately following the greyscale or RGB samples of the pixel. An alpha value of zero represents full transparency, and a value of 2^{sampledepth} - 1 represents full opacity. Intermediate values indicate partially transparent pixels that can be composited against a background image to yield the delivered image.

The colour values in a pixel are not premultiplied by the alpha value assigned to the pixel. This rule is sometimes called "unassociated" or "non-premultiplied" alpha. (Another common technique is to store sample values premultiplied by the alpha value; in effect, such an image is already composited against a black background. PNG does not use premultiplied alpha. In consequence an image editor can take a PNG image and easily change its transparency.) See 12.4: Alpha channel creation and 13.16: Alpha channel processing.

All integers that require more than one byte shall be in network byte order (as illustrated in figure 7.1): the most significant byte comes first, then the less significant bytes in descending order of significance (MSB LSB for two-byte integers, MSB B2 B1 LSB for four-byte integers). The highest bit (value 128) of a byte is numbered bit 7; the lowest bit (value 1) is numbered bit 0. Values are unsigned unless otherwise noted. Values explicitly noted as signed are represented in two's complement notation.

PNG four-byte unsigned integers are limited to the range 0 to 2^{31} - 1 to accommodate languages that have difficulty with unsigned four-byte values. Similarly PNG four-byte signed integers are limited to the range -(2^{31} - 1) to 2^{31} - 1 to accommodate languages that have difficulty with the value -2^{31}.
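The byte order and range rules above can be sketched in Python with the standard `struct` module, where the `>` prefix selects network (big-endian) byte order. The helper names are hypothetical.

```python
import struct

PNG_INT_MAX = 2**31 - 1


def write_uint32(value: int) -> bytes:
    """Encode a PNG four-byte unsigned integer, most significant byte first."""
    if not 0 <= value <= PNG_INT_MAX:
        raise ValueError("outside the PNG four-byte unsigned range")
    return struct.pack(">I", value)


def read_uint32(buf: bytes, offset: int = 0) -> int:
    """Decode a PNG four-byte unsigned integer and enforce the range limit."""
    (value,) = struct.unpack_from(">I", buf, offset)
    if value > PNG_INT_MAX:
        raise ValueError("outside the PNG four-byte unsigned range")
    return value


def read_int32(buf: bytes, offset: int = 0) -> int:
    """Decode a PNG four-byte signed (two's complement) integer."""
    (value,) = struct.unpack_from(">i", buf, offset)
    if value == -2**31:
        raise ValueError("-2^31 is excluded from the PNG signed range")
    return value
```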

Figure 7.1 — Integer representation in PNG

A PNG image (or pass, see clause 8: Interlacing and pass extraction) is a rectangular pixel array, with pixels appearing left-to-right within each scanline, and scanlines appearing top-to-bottom. The size of each pixel is determined by the number of bits per pixel.

Pixels within a scanline are always packed into a sequence of bytes with no wasted bits between pixels. Scanlines always begin on byte boundaries. Permitted bit depths and colour types are restricted so that in all cases the packing is simple and efficient.

In PNG images of colour type 0 (greyscale) each pixel is a single sample, which may have precision less than a byte (1, 2, or 4 bits). These samples are packed into bytes with the leftmost sample in the high-order bits of a byte followed by the other samples for the scanline.

In PNG images of colour type 3 (indexed-colour) each pixel is a single palette index. These indices are packed into bytes in the same way as the samples for colour type 0.

When there are multiple pixels per byte, some low-order bits of the last byte of a scanline may go unused. The contents of these unused bits are not specified.

PNG images that are not indexed-colour images may have sample values with a bit depth of 16. Such sample values are in network byte order (MSB first, LSB second). PNG permits multi-sample pixels only with 8 and 16-bit samples, so multiple samples of a single pixel are never packed into one byte.
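The packing rules above determine both the byte length of a scanline and how sub-byte samples are extracted. The following Python sketch assumes single-sample pixels (colour types 0 and 3) for the unpacking helper; both function names are illustrative.

```python
def scanline_bytes(width: int, bit_depth: int, values_per_pixel: int = 1) -> int:
    """Bytes occupied by one scanline, excluding the filter type byte.

    Scanlines begin on byte boundaries, so the bit count is rounded up.
    """
    bits = width * bit_depth * values_per_pixel
    return (bits + 7) // 8


def unpack_samples(scanline: bytes, width: int, bit_depth: int) -> list:
    """Extract `width` single-sample pixels of depth 1, 2, 4 or 8 bits.

    The leftmost sample sits in the high-order bits of each byte; unused
    low-order bits in the last byte are ignored.
    """
    if bit_depth not in (1, 2, 4, 8):
        raise ValueError("this sketch handles bit depths 1, 2, 4 and 8 only")
    per_byte = 8 // bit_depth
    mask = (1 << bit_depth) - 1
    samples = []
    for i in range(width):
        byte = scanline[i // per_byte]
        shift = 8 - bit_depth * (i % per_byte + 1)
        samples.append((byte >> shift) & mask)
    return samples
```

For example, with a bit depth of 2 the byte 0xB4 (binary 10 11 01 00) unpacks to the four samples 2, 3, 1, 0.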

PNG allows the scanline data to be filtered before it is compressed. Filtering can improve the compressibility of the data. The filter step itself results in a sequence of bytes of the same size as the incoming sequence, but in a different representation, preceded by a filter type byte. Filtering does not reduce the size of the actual scanline data. All PNG filters are strictly lossless.

Different filter types can be used for different scanlines, and the filter algorithm is specified for each scanline by a filter type byte. The filter type byte is not considered part of the image data, but it is included in the datastream sent to the compression step. An intelligent encoder can switch filters from one scanline to the next. The method for choosing which filter to employ is left to the encoder.

See clause 9: Filtering.

Pass extraction (see figure 4.8) splits a PNG image into a sequence of reduced images (the interlaced PNG image) where the first image defines a coarse view and subsequent images enhance this coarse view until the last image completes the PNG image. This allows progressive display of the interlaced PNG image by the decoder and allows images to "fade in" when they are being displayed on-the-fly. On average, interlacing slightly expands the datastream size, but it can give the user a meaningful display much more rapidly.

Two interlace methods are defined in this International Standard, methods 0 and 1. Other values of interlace method are reserved for future standardization (see 4.9: Extension and registration).

With interlace method 0, the null method, pixels are extracted sequentially from left to right, and scanlines sequentially from top to bottom. The interlaced PNG image is a single reduced image.

Interlace method 1, known as Adam7, defines seven distinct passes over the image. Each pass transmits a subset of the pixels in the reference image. The pass in which each pixel is transmitted (numbered from 1 to 7) is defined by replicating the following 8-by-8 pattern over the entire image, starting at the upper left corner:

    1 6 4 6 2 6 4 6
    7 7 7 7 7 7 7 7
    5 6 5 6 5 6 5 6
    7 7 7 7 7 7 7 7
    3 6 4 6 3 6 4 6
    7 7 7 7 7 7 7 7
    5 6 5 6 5 6 5 6
    7 7 7 7 7 7 7 7

Figure 4.8 shows the seven passes of interlace method 1. Within each pass, the selected pixels are transmitted left to right within a scanline, and selected scanlines sequentially from top to bottom. For example, pass 2 contains pixels 4, 12, 20, etc. of scanlines 0, 8, 16, etc. (where scanline 0, pixel 0 is the upper left corner). The last pass contains all of scanlines 1, 3, 5, etc. The transmission order is defined so that all the scanlines transmitted in a pass will have the same number of pixels; this is necessary for proper application of some of the filters. The interlaced PNG image consists of a sequence of seven reduced images. For example, if the PNG image is 16 by 16 pixels, then the third pass will be a reduced image of two scanlines, each containing four pixels (see figure 4.8).

Scanlines that do not completely fill an integral number of bytes are padded as defined in 7.2: Scanlines.

NOTE If the reference image contains fewer than five columns or fewer than five rows, some passes will be empty.
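The size of each reduced image follows from the 8-by-8 pattern. The Python sketch below encodes, for each pass, the starting column and row and the column and row steps read off that pattern; the table and function names are illustrative.

```python
# (x_start, y_start, x_step, y_step) for passes 1 to 7, read off the pattern.
ADAM7_PASSES = [
    (0, 0, 8, 8),
    (4, 0, 8, 8),
    (0, 4, 4, 8),
    (2, 0, 4, 4),
    (0, 2, 2, 4),
    (1, 0, 2, 2),
    (0, 1, 1, 2),
]


def pass_dimensions(width: int, height: int) -> list:
    """Return [(pass_width, pass_height), ...] for the seven passes.

    A pass whose width or height is zero is empty.
    """
    dims = []
    for x0, y0, dx, dy in ADAM7_PASSES:
        # Count of positions x0, x0 + dx, ... that are < width (ceiling division).
        w = max(0, (width - x0 + dx - 1) // dx)
        h = max(0, (height - y0 + dy - 1) // dy)
        dims.append((w, h))
    return dims
```

For a 16 by 16 image the third entry is (4, 2), matching the example above of two scanlines of four pixels each. For a 4 by 4 image the second pass has width zero, which illustrates the NOTE about empty passes.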

Filtering transforms the PNG image with the goal of improving compression. PNG allows for a number of filter methods. All the reduced images in an interlaced image shall use a single filter method. Only filter method 0 is defined by this International Standard. Other filter methods are reserved for future standardization (see 4.9 Extension and registration). Filter method 0 provides a set of five filter types, and individual scanlines in each reduced image may use different filter types.

PNG imposes no additional restriction on which filter types can be applied to an interlaced PNG image. However, the filter types are not equally effective on all types of data. See 12.8: Filter selection.

Filtering transforms the byte sequence in a scanline to an equal length sequence of bytes preceded by the filter type. Filter type bytes are associated only with non-empty scanlines. No filter type bytes are present in an empty pass. See 13.8: Interlacing and progressive display.

Filters are applied to bytes, not to pixels, regardless of the bit depth or colour type of the image. The filters operate on the byte sequence formed by a scanline that has been represented as described in 7.2: Scanlines. If the image includes an alpha channel, the alpha data is filtered in the same way as the image data.

Filters may use the original values of the following bytes to generate the new byte value:

    x  the byte being filtered;
    a  the byte corresponding to x in the pixel immediately before the pixel containing x (or the byte immediately before x, when the bit depth is less than 8);
    b  the byte corresponding to x in the previous scanline;
    c  the byte corresponding to b in the pixel immediately before the pixel containing b (or the byte immediately before b, when the bit depth is less than 8).

Figure 9.1 shows the relative positions of the bytes x, a, b, and c.

PNG filter method 0 defines five basic filter types as listed in Table 9.1. Orig(y) denotes the original (unfiltered) value of byte y. Filt(y) denotes the value after a filter has been applied. Recon(y) denotes the value after the corresponding reconstruction function has been applied. The PaethPredictor function used by the Paeth filter type is defined below.

Filter method 0 specifies exactly this set of five filter types and this shall not be extended. This ensures that decoders need not decompress the data to determine whether it contains unsupported filter types: it is sufficient to check the filter method in IHDR.

Table 9.1 — Filter types

    Type  Name     Filter Function                                                  Reconstruction Function
    0     None     Filt(x) = Orig(x)                                                Recon(x) = Filt(x)
    1     Sub      Filt(x) = Orig(x) - Orig(a)                                      Recon(x) = Filt(x) + Recon(a)
    2     Up       Filt(x) = Orig(x) - Orig(b)                                      Recon(x) = Filt(x) + Recon(b)
    3     Average  Filt(x) = Orig(x) - floor((Orig(a) + Orig(b)) / 2)               Recon(x) = Filt(x) + floor((Recon(a) + Recon(b)) / 2)
    4     Paeth    Filt(x) = Orig(x) - PaethPredictor(Orig(a), Orig(b), Orig(c))    Recon(x) = Filt(x) + PaethPredictor(Recon(a), Recon(b), Recon(c))

For all filters, the bytes "to the left of" the first pixel in a scanline shall be treated as being zero. For filters that refer to the prior scanline, the entire prior scanline and bytes "to the left of" the first pixel in the prior scanline shall be treated as being zeroes for the first scanline of a reduced image.

Reversing the effect of a filter requires the decoded values of the prior pixel on the same scanline, the pixel immediately above the current pixel on the prior scanline, and the pixel just to the left of the pixel above.

Unsigned arithmetic modulo 256 is used, so that both the inputs and outputs fit into bytes. Filters are applied to each byte regardless of bit depth. The sequence of Filt values is transmitted as the filtered scanline.

The sum Orig(a) + Orig(b) shall be performed without overflow (using at least nine-bit arithmetic). floor() indicates that the result of the division is rounded to the next lower integer if fractional; in other words, it is an integer division or right shift operation.
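The encoder side of filter types 0 to 3 can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the function name is hypothetical, `bpp` is an assumed parameter giving the distance in bytes between x and a (the number of bytes per complete pixel, or 1 when the bit depth is less than 8), and the Paeth type is left out because its predictor is defined later.

```python
def filter_scanline(filter_type: int, cur: bytes, prev: bytes, bpp: int) -> bytes:
    """Return the filter type byte followed by the filtered scanline.

    `prev` is the unfiltered prior scanline, or all zeroes for the first
    scanline of a reduced image. Handles filter types 0 to 3 only.
    """
    out = bytearray([filter_type])
    for i, x in enumerate(cur):
        a = cur[i - bpp] if i >= bpp else 0   # bytes left of the first pixel are 0
        b = prev[i]
        if filter_type == 0:                  # None
            predictor = 0
        elif filter_type == 1:                # Sub
            predictor = a
        elif filter_type == 2:                # Up
            predictor = b
        elif filter_type == 3:                # Average
            # Python integers do not overflow, so a + b is exact here.
            predictor = (a + b) // 2
        else:
            raise ValueError("filter type not handled in this sketch")
        out.append((x - predictor) % 256)     # unsigned arithmetic modulo 256
    return bytes(out)
```

In a language with fixed-width integers, the Average predictor needs at least nine bits for the intermediate sum, as required above.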

The Paeth filter function computes a simple linear function of the three neighbouring pixels (left, above, upper left), then chooses as predictor the neighbouring pixel closest to the computed value. The algorithm used in this International Standard is an adaptation of the technique due to Alan W. Paeth [PAETH].

The PaethPredictor function is defined in the code below. The logic of the function and the locations of the bytes a, b, c, and x are shown in figure 9.1. Pr is the predictor for byte x.

    p = a + b - c
    pa = abs(p - a)
    pb = abs(p - b)
    pc = abs(p - c)
    if pa <= pb and pa <= pc then Pr = a
    else if pb <= pc then Pr = b
    else Pr = c
    return Pr

Figure 9.1: The PaethPredictor function

The calculations within the PaethPredictor function shall be performed exactly, without overflow.

The order in which the comparisons are performed is critical and shall not be altered. The function tries to establish in which of the three directions (vertical, horizontal, or diagonal) the gradient of the image is smallest.

Exactly the same PaethPredictor function is used by both encoder and decoder.
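A direct Python transcription of PaethPredictor, together with a decoder-side reconstruction routine for all five filter types, might look like the sketch below. The function names are illustrative, and `bpp` is an assumed parameter giving the distance in bytes between x and a (bytes per complete pixel, or 1 when the bit depth is less than 8).

```python
def paeth_predictor(a: int, b: int, c: int) -> int:
    """Return whichever of a, b, c is closest to p = a + b - c.

    The comparison order (a, then b, then c) is kept exactly as specified.
    """
    p = a + b - c
    pa = abs(p - a)
    pb = abs(p - b)
    pc = abs(p - c)
    if pa <= pb and pa <= pc:
        return a
    if pb <= pc:
        return b
    return c


def reconstruct_scanline(filtered: bytes, prev: bytes, bpp: int) -> bytes:
    """Undo the filter on one scanline (filter type byte plus Filt values).

    `prev` is the reconstructed prior scanline, or all zeroes for the first
    scanline of a reduced image.
    """
    filter_type, data = filtered[0], filtered[1:]
    recon = bytearray(len(data))
    for i, filt in enumerate(data):
        a = recon[i - bpp] if i >= bpp else 0
        b = prev[i]
        c = prev[i - bpp] if i >= bpp else 0
        if filter_type == 0:
            predictor = 0
        elif filter_type == 1:
            predictor = a
        elif filter_type == 2:
            predictor = b
        elif filter_type == 3:
            predictor = (a + b) // 2
        elif filter_type == 4:
            predictor = paeth_predictor(a, b, c)
        else:
            raise ValueError("undefined filter type %d" % filter_type)
        recon[i] = (filt + predictor) % 256
    return bytes(recon)
```

Python integers are unbounded, so the intermediate value p is computed exactly; in other languages a signed type wider than eight bits is needed.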

Only PNG compression method 0 is defined by this International Standard. Other values of compression method are reserved for future standardization (see 4.9: Extension and registration). PNG compression method 0 is deflate/inflate compression with a sliding window (which is an upper bound on the distances appearing in the deflate stream) of at most 32768 bytes. Deflate compression is an LZ77 derivative [ZL].

Deflate-compressed datastreams within PNG are stored in the "zlib" format, which has the structure:

    zlib compression method/flags code    1 byte
    Additional flags/check bits           1 byte
    Compressed data blocks                n bytes
    Check value                           4 bytes

Further details on this format are given in the zlib specification [RFC-1950].

For PNG compression method 0, the zlib compression method/flags code shall specify method code 8 (deflate compression) and an LZ77 window size of not more than 32768 bytes. The zlib compression method number is not the same as the PNG compression method number in the IHDR chunk (see 11.2.2 IHDR Image header). The additional flags shall not specify a preset dictionary.

If the data to be compressed contain 16384 bytes or fewer, the PNG encoder may set the window size by rounding up to a power of 2 (256 minimum). This decreases the memory required for both encoding and decoding, without adversely affecting the compression ratio.

The compressed data within the zlib datastream are stored as a series of blocks, each of which can represent raw (uncompressed) data, LZ77-compressed data encoded with fixed Huffman codes, or LZ77-compressed data encoded with custom Huffman codes. A marker bit in the final block identifies it as the last block, allowing the decoder to recognize the end of the compressed datastream. Further details on the compression algorithm and the encoding are given in the deflate specification [RFC-1951].

The check value stored at the end of the zlib datastream is calculated on the uncompressed data represented by the datastream. The algorithm used to calculate this is not the same as the CRC calculation used for PNG chunk CRC field values. The zlib check value is useful mainly as a cross-check that the deflate and inflate algorithms are implemented correctly. Verifying the individual PNG chunk CRCs provides confidence that the PNG datastream has been transmitted undamaged.
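The zlib framing can be inspected with a few lines of Python. The bit positions used below (method code in the low four bits of the first byte, window size exponent in the high four bits, preset dictionary flag in bit 5 of the second byte, header check modulo 31, Adler-32 check value stored most significant byte first) are taken from the zlib specification [RFC-1950] rather than from this clause, and the function name is hypothetical.

```python
import struct
import zlib


def inspect_zlib_stream(stream: bytes) -> dict:
    """Report the header fields and check value of a zlib datastream."""
    cmf, flg = stream[0], stream[1]
    (check_value,) = struct.unpack(">I", stream[-4:])
    return {
        "method": cmf & 0x0F,                     # 8 means deflate
        "window_size": 1 << ((cmf >> 4) + 8),     # LZ77 window in bytes
        "preset_dictionary": bool(flg & 0x20),
        "header_check_ok": (cmf * 256 + flg) % 31 == 0,
        "check_value": check_value,               # Adler-32 of uncompressed data
    }


def acceptable_for_png(info: dict) -> bool:
    """Apply the PNG compression method 0 constraints to the header fields."""
    return (
        info["method"] == 8
        and info["window_size"] <= 32768
        and not info["preset_dictionary"]
        and info["header_check_ok"]
    )
```

A stream produced by `zlib.compress` with default settings passes `acceptable_for_png`, and its check value equals `zlib.adler32` of the uncompressed input, which is a different function from the chunk CRC.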

The sequence of filtered scanlines is compressed and the resulting data stream is split into IDAT chunks. The concatenation of the contents of all the IDAT chunks makes up a zlib datastream. This datastream decompresses to filtered image data.

It is important to emphasize that the boundaries between IDAT chunks are arbitrary and can fall anywhere in the zlib datastream. There is not necessarily any correlation between IDAT chunk boundaries and deflate block boundaries or any other feature of the zlib data. For example, it is entirely possible for the terminating zlib check value to be split across IDAT chunks.

Similarly, there is no required correlation between the structure of the image data (i.e., scanline boundaries) and deflate block boundaries or IDAT chunk boundaries. The complete filtered PNG image is represented by a single zlib datastream that is stored in a number of IDAT chunks.
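Since IDAT boundaries are arbitrary, a decoder can feed the chunk contents to a streaming inflater one after another. The Python sketch below uses `zlib.decompressobj` for this; the function name is illustrative and the input is assumed to be the data fields of the IDAT chunks in datastream order.

```python
import zlib


def inflate_idat(idat_payloads) -> bytes:
    """Decompress the concatenated contents of a sequence of IDAT chunks.

    Each payload may end anywhere in the zlib datastream, including in the
    middle of the trailing check value.
    """
    inflater = zlib.decompressobj()
    out = bytearray()
    for payload in idat_payloads:
        out += inflater.decompress(payload)
    out += inflater.flush()
    if not inflater.eof:
        raise ValueError("zlib datastream ended prematurely")
    return bytes(out)
```

The result is the sequence of filtered scanlines, each still preceded by its filter type byte.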

PNG also uses compression method 0 in iTXt, iCCP, and zTXt chunks. Unlike the image data, such datastreams are not split across chunks; each such chunk contains an independent zlib datastream (see 10.1: Compression method 0).

The PNG datastream consists of a PNG signature (see 5.2: PNG signature) followed by a sequence of chunks. Each chunk has a chunk type which specifies its function. This clause defines the PNG chunk types standardized in this International Standard. The PNG datastream structure is defined in clause 5: Datastream structure. This also defines the order in which chunks may appear. For details specific to encoders see 12.11: Chunking. For details specific to decoders see 13.5: Chunking.

Critical chunks are those chunks that are absolutely required in order to successfully decode a PNG image from a PNG datastream. Extension chunks may be defined as critical chunks (see clause 14: Editors and extensions), though this practice is strongly discouraged.

A valid PNG datastream shall begin with a PNG signature, immediately followed by an IHDR chunk, then one or more IDAT chunks, and shall end with an IEND chunk. Only one IHDR chunk and one IEND chunk are allowed in a PNG datastream.

The four-byte chunk type field contains the decimal values

73 72 68 82

The IHDR chunk shall be the first chunk in the PNG datastream. It contains:

    Width                 4 bytes
    Height                4 bytes
    Bit depth             1 byte
    Colour type           1 byte
    Compression method    1 byte
    Filter method         1 byte
    Interlace method      1 byte

Width and height give the image dimensions in pixels. They are PNG four-byte unsigned integers. Zero is an invalid value.

Bit depth is a single-byte integer giving the number of bits per sample or per palette index (not per pixel). Valid values are 1, 2, 4, 8, and 16, although not all values are allowed for all colour types. See 6.1: Colour types and values.

Colour type is a single-byte integer that defines the PNG image type. Valid values are 0, 2, 3, 4, and 6.

Bit depth restrictions for each colour type are imposed to simplify implementations and to prohibit combinations that do not compress well. The allowed combinations are defined in Table 11.1.

Table 11.1 — Allowed combinations of colour type and bit depth

    PNG image type          Colour type  Allowed bit depths  Interpretation
    Greyscale               0            1, 2, 4, 8, 16      Each pixel is a greyscale sample
    Truecolour              2            8, 16               Each pixel is an R,G,B triple
    Indexed-colour          3            1, 2, 4, 8          Each pixel is a palette index; a PLTE chunk shall appear.
    Greyscale with alpha    4            8, 16               Each pixel is a greyscale sample followed by an alpha sample.
    Truecolour with alpha   6            8, 16               Each pixel is an R,G,B triple followed by an alpha sample.

The sample depth is the same as the bit depth except in the case of indexed-colour PNG images (colour type 3), in which the sample depth is always 8 bits (see 4.4: PNG image).

Compression method is a single-byte integer that indicates the method used to compress the image data. Only compression method 0 (deflate/inflate compression with a sliding window of at most 32768 bytes) is defined in this International Standard. All conforming PNG images shall be compressed with this scheme.

Filter method is a single-byte integer that indicates the preprocessing method applied to the image data before compression. Only filter method 0 (adaptive filtering with five basic filter types) is defined in this International Standard. See clause 9: Filtering for details.

Interlace method is a single-byte integer that indicates the transmission order of the image data. Two values are defined in this International Standard: 0 (no interlace) or 1 (Adam7 interlace). See clause 8: Interlacing and pass extraction for details.
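The IHDR fields and the constraints on them can be checked together. The Python sketch below parses the 13-byte data field and validates it against Table 11.1 and the method values defined in this International Standard; the function name and dictionary keys are illustrative.

```python
import struct

ALLOWED_BIT_DEPTHS = {
    0: (1, 2, 4, 8, 16),   # greyscale
    2: (8, 16),            # truecolour
    3: (1, 2, 4, 8),       # indexed-colour
    4: (8, 16),            # greyscale with alpha
    6: (8, 16),            # truecolour with alpha
}


def parse_ihdr(data: bytes) -> dict:
    """Parse and validate the data field of an IHDR chunk."""
    if len(data) != 13:
        raise ValueError("IHDR data field is 13 bytes long")
    (width, height, bit_depth, colour_type,
     compression, filter_method, interlace) = struct.unpack(">IIBBBBB", data)
    if not (0 < width <= 2**31 - 1 and 0 < height <= 2**31 - 1):
        raise ValueError("invalid image dimensions")
    if bit_depth not in ALLOWED_BIT_DEPTHS.get(colour_type, ()):
        raise ValueError("colour type and bit depth combination not allowed")
    if compression != 0 or filter_method != 0:
        raise ValueError("undefined compression or filter method")
    if interlace not in (0, 1):
        raise ValueError("undefined interlace method")
    return {
        "width": width, "height": height, "bit_depth": bit_depth,
        "colour_type": colour_type, "compression_method": compression,
        "filter_method": filter_method, "interlace_method": interlace,
    }
```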

The four-byte chunk type field contains the decimal values

80 76 84 69

The PLTE chunk contains from 1 to 256 palette entries, each a three-byte series of the form:

    Red      1 byte
    Green    1 byte
    Blue     1 byte

The number of entries is determined from the chunk length. A chunk length not divisible by 3 is an error.

This chunk shall appear for colour type 3, and may appear for colour types 2 and 6; it shall not appear for colour types 0 and 4. There shall not be more than one PLTE chunk.

For colour type 3 (indexed-colour), the PLTE chunk is required. The first entry in PLTE is referenced by pixel value 0, the second by pixel value 1, etc. The number of palette entries shall not exceed the range that can be represented in the image bit depth (for example, 2^4 = 16 for a bit depth of 4). It is permissible to have fewer entries than the bit depth would allow. In that case, any out-of-range pixel value found in the image data is an error.

For colour types 2 and 6 (truecolour and truecolour with alpha), the PLTE chunk is optional. If present, it provides a suggested set of colours (from 1 to 256) to which the truecolour image can be quantized if it cannot be displayed directly. It is, however, recommended that the sPLT chunk be used for this purpose, rather than the PLTE chunk. If neither PLTE nor sPLT chunks are present and the image cannot be displayed directly, quantization has to be done by the viewing system. However, it is often preferable for the selection of colours to be done once by the PNG encoder. (See 12.6: Suggested palettes.)

Note that the palette uses 8 bits (1 byte) per sample regardless of the image bit depth. In particular, the palette is 8 bits deep even when it is a suggested quantization of a 16-bit truecolour image.

There is no requirement that the palette entries all be used by the image, nor that they all be different.
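The palette rules above translate into a short parsing routine. In the Python sketch below, the function name is illustrative and `bit_depth` is an assumed optional parameter that should be supplied only for colour type 3, where the palette size is bounded by the image bit depth.

```python
def parse_plte(data: bytes, bit_depth=None) -> list:
    """Return the palette as a list of (red, green, blue) tuples.

    Pass `bit_depth` for indexed-colour images so that the entry count can
    be checked against the range of the pixel values.
    """
    if len(data) % 3 != 0:
        raise ValueError("PLTE length is not divisible by 3")
    count = len(data) // 3
    if not 1 <= count <= 256:
        raise ValueError("PLTE holds from 1 to 256 entries")
    if bit_depth is not None and count > 2 ** bit_depth:
        raise ValueError("more palette entries than the bit depth can index")
    # Each entry is 8 bits per sample regardless of the image bit depth.
    return [tuple(data[i:i + 3]) for i in range(0, len(data), 3)]
```

A pixel value is then valid only if it is less than the length of the returned list.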

The four-byte chunk type field contains the decimal values

73 68 65 84

The IDAT chunk contains the actual image data which is the output stream of the compression algorithm. See clause 9: Filtering and clause 10: Compression for details.

There may be multiple IDAT chunks; if so, they shall appear consecutively with no other intervening chunks. The compressed datastream is then the concatenation of the contents of the data fields of all the IDAT chunks.

The four-byte chunk type field contains the decimal values

73 69 78 68

The IEND chunk marks the end of the PNG datastream. The chunk's data field is empty.
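Putting the critical chunks together, a very small encoder for 8-bit greyscale images could be sketched in Python as follows. Several choices here are assumptions made for brevity: the function names are hypothetical, every scanline uses filter type 0, all image data goes into a single IDAT chunk, and the eight signature bytes are taken from 5.2: PNG signature rather than from this clause.

```python
import struct
import zlib

# Signature bytes as given in 5.2: PNG signature.
PNG_SIGNATURE = bytes([137, 80, 78, 71, 13, 10, 26, 10])


def make_chunk(chunk_type: bytes, data: bytes = b"") -> bytes:
    """Frame `data` as a chunk: length, type, data, CRC."""
    crc = zlib.crc32(chunk_type + data)
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def encode_grey8(rows) -> bytes:
    """Encode rows of 0..255 sample values as a non-interlaced PNG datastream."""
    height = len(rows)
    width = len(rows[0])
    # Bit depth 8, colour type 0, compression 0, filter 0, interlace 0.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Each scanline is preceded by its filter type byte (0 = None).
    filtered = b"".join(b"\x00" + bytes(row) for row in rows)
    return (
        PNG_SIGNATURE
        + make_chunk(b"IHDR", ihdr)
        + make_chunk(b"IDAT", zlib.compress(filtered))
        + make_chunk(b"IEND")
    )
```

A practical encoder would choose filter types per scanline and might split the compressed data across several consecutive IDAT chunks.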

The ancillary chunks defined in this International Standard are listed in the order in 4.7.2: Chunk types. This is not the order in which they appear in a PNG datastream. Ancillary chunks may be ignored by a decoder. For each ancillary chunk, the actions described are under the assumption that the decoder is not ignoring the chunk.

The four-byte chunk type field contains the decimal values

116 82 78 83

The tRNS chunk specifies either alpha values that are associated with palette entries (for indexed-colour images) or a single transparent colour (for greyscale and truecolour images). The tRNS chunk contains:

    Colour type 0
        Grey sample value            2 bytes
    Colour type 2
        Red sample value             2 bytes
        Green sample value           2 bytes
        Blue sample value            2 bytes
    Colour type 3
        Alpha for palette index 0    1 byte
        Alpha for palette index 1    1 byte
        ...etc...                    1 byte

For colour type 3 (indexed-colour), the tRNS chunk contains a series of one-byte alpha values, corresponding to entries in the PLTE chunk. Each entry indicates that pixels of the corresponding palette index shall be treated as having the specified alpha value. Alpha values have the same interpretation as in an 8-bit full alpha channel: 0 is fully transparent, 255 is fully opaque, regardless of image bit depth. The tRNS chunk shall not contain more alpha values than there are palette entries, but a tRNS chunk may contain fewer values than there are palette entries. In this case, the alpha value for all remaining palette entries is assumed to be 255. In the common case in which only palette index 0 need be made transparent, only a one-byte tRNS chunk is needed, and when all palette indices are opaque, the tRNS chunk may be omitted.

For colour types 0 or 2, two bytes per sample are used regardless of the image bit depth (see 7.1: Integers and byte order). Pixels of the specified grey sample value or RGB sample values are treated as transparent (equivalent to alpha value 0); all other pixels are to be treated as fully opaque (alpha value 2^{bitdepth} - 1). If the image bit depth is less than 16, the least significant bits are used and the others are 0.

A tRNS chunk shall not appear for colour types 4 and 6, since a full alpha channel is already present in those cases.

NOTE For 16-bit greyscale or truecolour data, only pixels matching the entire 16-bit values in tRNS chunks are transparent. Decoders have to postpone any sample depth rescaling until after the pixels have been tested for transparency.
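The two forms of tRNS can be handled as in the Python sketch below. The function names are illustrative. For colour type 2 the helper simply returns the three two-byte values in the order in which they are stored.

```python
import struct


def palette_alpha(trns_data: bytes, palette_size: int) -> list:
    """Alpha value for every palette entry (colour type 3).

    Entries beyond the end of the tRNS data are fully opaque (255).
    """
    if len(trns_data) > palette_size:
        raise ValueError("tRNS has more entries than the palette")
    return list(trns_data) + [255] * (palette_size - len(trns_data))


def transparent_value(trns_data: bytes, colour_type: int) -> tuple:
    """The single transparent pixel value for colour types 0 and 2.

    Returns a 1-tuple (grey) or a 3-tuple of two-byte sample values.
    Compare it with pixels before any sample depth rescaling.
    """
    if colour_type == 0:
        return struct.unpack(">H", trns_data)
    if colour_type == 2:
        return struct.unpack(">HHH", trns_data)
    raise ValueError("tRNS carries a single value only for colour types 0 and 2")
```

With a four-entry palette and a one-byte tRNS chunk containing 0, only palette index 0 becomes transparent and the remaining three entries default to 255.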

The four-byte chunk type field contains the decimal values

99 72 82 77

The cHRM chunk may be used to specify the 1931 CIE x,y chromaticities of the red, green, and blue display primaries used in the image, and the referenced white point. See Annex C: Gamma and chromaticity for more information. The iCCP and sRGB chunks provide more sophisticated support for colour management and control.

The cHRM chunk contains:

    White point x    4 bytes
    White point y    4 bytes
    Red x            4 bytes
    Red y            4 bytes
    Green x          4 bytes
    Green y          4 bytes
    Blue x           4 bytes
    Blue y           4 bytes

Each value is encoded as a four-byte PNG unsigned integer, representing the x or y value times 100000.

EXAMPLE A value of 0.3127 would be stored as the integer 31270.
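The scaling by 100000 can be sketched in Python as follows. The field names and function names are illustrative, and rounding to the nearest integer on encoding is an assumption of this sketch.

```python
import struct

CHRM_FIELDS = (
    "white_x", "white_y", "red_x", "red_y",
    "green_x", "green_y", "blue_x", "blue_y",
)


def encode_chrm(values: dict) -> bytes:
    """Pack eight chromaticity values, each stored as value times 100000."""
    scaled = [round(values[name] * 100000) for name in CHRM_FIELDS]
    return struct.pack(">8I", *scaled)


def parse_chrm(data: bytes) -> dict:
    """Unpack the cHRM data field into floating-point chromaticities."""
    return {
        name: raw / 100000
        for name, raw in zip(CHRM_FIELDS, struct.unpack(">8I", data))
    }
```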

The cHRM chunk is allowed in all PNG datastreams, although it is of little value for greyscale images.

An sRGB chunk or iCCP chunk, when present and recognized, overrides the cHRM chunk.

The four-byte chunk type field contains the decimal values

103 65 77 65

The gAMA chunk specifies the relationship between the image samples and the desired display output intensity. Gamma is defined in 3.1.20: gamma.

In fact specifying the desired display output intensity is insufficient. It is also necessary to specify the viewing conditions under which the output is desired. For gAMA these are the reference viewing conditions of the sRGB specification [IEC 61966-2-1], which are based on ISO 3664 [ISO-3664]. Adjustment for different viewing conditions is normally handled by a Colour Management System. If the adjustment is not performed, the error is usually small. Applications desiring high colour fidelity may wish to use an sRGB chunk or iCCP chunk.

The gAMA chunk contains:

Image gamma 4 bytes

The value is encoded as a four-byte PNG unsigned integer, representing gamma times 100000.

EXAMPLE A gamma of 1/2.2 would be stored as the integer 45455.
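A minimal sketch of this encoding, assuming big-endian four-byte storage for the PNG unsigned integer; the names encode_gama and decode_gama are hypothetical.

```python
import struct


def encode_gama(gamma):
    """Return the four bytes of gAMA chunk data for a gamma value."""
    return struct.pack(">I", round(gamma * 100000))


def decode_gama(data):
    """Return the gamma value stored in four bytes of gAMA chunk data."""
    (stored,) = struct.unpack(">I", data)
    return stored / 100000


print(struct.unpack(">I", encode_gama(1 / 2.2))[0])
```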

See 12.2: Encoder gamma handling and 13.13: Decoder gamma handling for more information.

An sRGB chunk or iCCP chunk, when present and recognized, overrides the gAMA chunk.

The four-byte chunk type field contains the decimal values

105 67 67 80

The iCCP chunk contains:

Profile name          1-79 bytes (character string)
Null separator        1 byte (null character)
Compression method    1 byte
Compressed profile    n bytes

The profile name may be any convenient name for referring to the profile. It is case-sensitive. Profile names shall contain only printable Latin-1 characters and spaces (only character codes 32-126 and 161-255 decimal are allowed). Leading, trailing, and consecutive spaces are not permitted. The only compression method defined in this International Standard is method 0 (zlib datastream with deflate compression, see 10.3: Other uses of compression). The compression method entry is followed by a compressed profile that makes up the remainder of the chunk. Decompression of this datastream yields the embedded ICC profile.

If the iCCP chunk is present, the image samples conform to the colour space represented by the embedded ICC profile as defined by the International Color Consortium [ICC]. The colour space of the ICC profile shall be an RGB colour space for colour images (PNG colour types 2, 3, and 6), or a greyscale colour space for greyscale images (PNG colour types 0 and 4). A PNG encoder that writes the iCCP chunk is encouraged to also write gAMA and cHRM chunks that approximate the ICC profile, to provide compatibility with applications that do not use the iCCP chunk. When the iCCP chunk is present, PNG decoders that recognize it and are capable of colour management [ICC] shall ignore the gAMA and cHRM chunks and use the iCCP chunk instead and interpret it according to [ICC-1] and [ICC-1A]. PNG decoders that are used in an environment that is incapable of full-fledged colour management should use the gAMA and cHRM chunks if present.

A PNG datastream should contain at most one embedded profile, whether specified explicitly with an iCCP chunk or implicitly with an sRGB chunk.

The four-byte chunk type field contains the decimal values

115 66 73 84

To simplify decoders, PNG specifies that only certain sample depths may be used, and further specifies that sample values should be scaled to the full range of possible values at the sample depth. The sBIT chunk defines the original number of significant bits (which can be less than or equal to the sample depth). This allows PNG decoders to recover the original data losslessly even if the data had a sample depth not directly supported by PNG.

The sBIT chunk contains:

Colour type 0
    significant greyscale bits    1 byte

Colour types 2 and 3
    significant red bits          1 byte
    significant green bits        1 byte
    significant blue bits         1 byte

Colour type 4
    significant greyscale bits    1 byte
    significant alpha bits        1 byte

Colour type 6
    significant red bits          1 byte
    significant green bits        1 byte
    significant blue bits         1 byte
    significant alpha bits        1 byte

Each depth specified in sBIT shall be greater than zero and less than or equal to the sample depth (which is 8 for indexed-colour images, and the bit depth given in IHDR for other colour types). Note that sBIT does not provide a sample depth for the alpha channel that is implied by a tRNS chunk; in that case, all of the sample bits of the alpha channel are to be treated as significant. If the sBIT chunk is not present, then all of the sample bits of all channels are to be treated as significant.
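The layout and range rule above might be checked as in the following sketch. The function name parse_sbit, the channel labels, and the dictionary return type are illustrative assumptions; only the byte counts and the range check come from the text.

```python
# Channel order of the sBIT data for each colour type.
SBIT_CHANNELS = {
    0: ("grey",),
    2: ("red", "green", "blue"),
    3: ("red", "green", "blue"),
    4: ("grey", "alpha"),
    6: ("red", "green", "blue", "alpha"),
}


def parse_sbit(data, colour_type, bit_depth):
    """Return a mapping of channel name to significant bits."""
    channels = SBIT_CHANNELS[colour_type]
    if len(data) != len(channels):
        raise ValueError("wrong sBIT length for this colour type")
    # Indexed-colour images always have a sample depth of 8.
    sample_depth = 8 if colour_type == 3 else bit_depth
    for bits in data:
        if not 0 < bits <= sample_depth:
            raise ValueError("significant bits out of range")
    return dict(zip(channels, data))


print(parse_sbit(bytes([5, 6, 5]), 2, 8))
```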

The four-byte chunk type field contains the decimal values

115 82 71 66

If the sRGB chunk is present, the image samples conform to the sRGB colour space [IEC 61966-2-1] and should be displayed using the specified rendering intent defined by the International Color Consortium [ICC-1] and [ICC-1A].

The sRGB chunk contains:

Rendering intent    1 byte

The following values are defined for rendering intent:

0    Perceptual               for images preferring good adaptation to the output device gamut at the expense of colorimetric accuracy, such as photographs.
1    Relative colorimetric    for images requiring colour appearance matching (relative to the output device white point), such as logos.
2    Saturation               for images preferring preservation of saturation at the expense of hue and lightness, such as charts and graphs.
3    Absolute colorimetric    for images requiring preservation of absolute colorimetry, such as previews of images destined for a different output device (proofs).

It is recommended that a PNG encoder that writes the sRGB chunk also write a gAMA chunk (and optionally a cHRM chunk) for compatibility with decoders that do not use the sRGB chunk. Only the following values shall be used.

gAMA
    Gamma            45455

cHRM
    White point x    31270
    White point y    32900
    Red x            64000
    Red y            33000
    Green x          30000
    Green y          60000
    Blue x           15000
    Blue y           6000

When the sRGB chunk is present, it is recommended that decoders that recognize it and are capable of colour management [ICC] ignore the gAMA and cHRM chunks and use the sRGB chunk instead. Decoders that recognize the sRGB chunk but are not capable of colour management [ICC] are recommended to ignore the gAMA and cHRM chunks, and use the values given above as if they had appeared in gAMA and cHRM chunks.
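For a decoder that recognizes sRGB but performs no colour management, the recommendation above amounts to substituting the listed values for whatever gAMA and cHRM chunks are present. The sketch below illustrates this; the chunks dictionary of already-decoded chunk values and the function name effective_gama_chrm are hypothetical.

```python
# Stored integer values (times 100000) recommended alongside the sRGB chunk.
SRGB_GAMA = 45455
SRGB_CHRM = {
    "white": (31270, 32900),
    "red": (64000, 33000),
    "green": (30000, 60000),
    "blue": (15000, 6000),
}


def effective_gama_chrm(chunks):
    """Return the (gAMA, cHRM) values a non-colour-managed decoder would use.

    chunks maps chunk names to decoded values; this layout is an assumption.
    """
    if "sRGB" in chunks:
        # Ignore any gAMA and cHRM chunks and use the sRGB equivalents.
        return SRGB_GAMA, dict(SRGB_CHRM)
    return chunks.get("gAMA"), chunks.get("cHRM")


print(effective_gama_chrm({"sRGB": 0, "gAMA": 100000})[0])
```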

It is recommended that the sRGB and iCCP chunks do not both appear in a PNG datastream.

PNG provides the tEXt, iTXt, and zTXt chunks for storing text strings associated with the image, such as an image description or copyright notice. Keywords are used to indicate what each text string represents. Any number of such text chunks may appear, and more than one with the same keyword is permitted.

The following keywords are predefined and should be used where appropriate.

Title            Short (one line) title or caption for image
Author           Name of image's creator
Description      Description of image (possibly long)
Copyright        Copyright notice
Creation Time    Time of original image creation
Software         Software used to create the image
Disclaimer       Legal disclaimer
Warning          Warning of nature of content
Source           Device used to create the image
Comment          Miscellaneous comment

Other keywords may be defined for other purposes. Keywords of general interest can be registered with the PNG Registration Authority (see 4.9 Extension and registration). It is also permitted to use private unregistered keywords. (Private keywords should be reasonably self-explanatory, in order to minimize the chance that the same keyword is used for incompatible purposes by different people.)

Keywords shall contain only printable Latin-1 [ISO-8859-1] characters and spaces; that is, only character codes 32-126 and 161-255 decimal are allowed. To reduce the chances for human misreading of a keyword, leading spaces, trailing spaces, and consecutive spaces are not permitted in keywords, nor is the non-breaking space (code 160) since it is visually indistinguishable from an ordinary space.

Keywords shall be spelled exactly as registered, so that decoders can use simple literal comparisons when looking for particular keywords. In particular, keywords are considered case-sensitive. Keywords are restricted to 1 to 79 bytes in length.
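The keyword rules above can be collected into a single check, sketched below on the raw Latin-1 bytes of the keyword. The function name is_valid_keyword is an assumption made for illustration.

```python
def is_valid_keyword(keyword):
    """Check a keyword (as bytes) against the syntactic keyword rules."""
    # Length is restricted to 1 to 79 bytes.
    if not 1 <= len(keyword) <= 79:
        return False
    # Only codes 32-126 and 161-255 are allowed (this excludes code 160).
    if any(not (32 <= b <= 126 or 161 <= b <= 255) for b in keyword):
        return False
    # No leading, trailing, or consecutive spaces.
    if keyword.startswith(b" ") or keyword.endswith(b" "):
        return False
    return b"  " not in keyword


print(is_valid_keyword(b"Creation Time"))
```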

For the Creation Time keyword, the date format defined in section 5.2.14 of RFC 1123 is suggested, but not required [RFC-1123].

In the tEXt and zTXt chunks, the text string associated with a keyword is restricted to the Latin-1 character set plus the linefeed character. Text strings in zTXt are compressed into zlib datastreams using deflate compression (see 10.3: Other uses of compression). The iTXt chunk can be used to convey characters outside the Latin-1 set. It uses the UTF-8 encoding of UCS [ISO/IEC 10646-1]. There is an option to compress text strings in the iTXt chunk.

The four-byte chunk type field contains the decimal values

116 69 88 116

Each tEXt chunk contains a keyword and a text string, in the format:

Keyword           1-79 bytes (character string)
Null separator    1 byte (null character)
Text string       0 or more bytes (character string)

The keyword and text string are separated by a zero byte (null character). Neither the keyword nor the text string may contain a null character. The text string is not null-terminated (the length of the chunk defines the ending). The text string may be of any length from zero bytes up to the maximum permissible chunk size less the length of the keyword and null character separator.

The keyword indicates the type of information represented by the text string as described in 11.3.4.2: Keywords and text strings.

Text is interpreted according to the Latin-1 character set [ISO-8859-1]. The text string may contain any Latin-1 character. Newlines in the text string should be represented by a single linefeed character (decimal 10). Characters other than those defined in Latin-1 plus the linefeed character have no defined meaning in tEXt chunks. Text containing characters outside the repertoire of ISO/IEC 8859-1 should be encoded using the iTXt chunk.
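A minimal sketch of splitting tEXt chunk data into its keyword and text string, assuming the chunk data has already been extracted and its CRC verified. The function name parse_text is hypothetical.

```python
def parse_text(data):
    """Split tEXt chunk data into (keyword, text), both decoded as Latin-1."""
    keyword, separator, text = data.partition(b"\x00")
    if not separator:
        raise ValueError("missing null separator")
    if not 1 <= len(keyword) <= 79:
        raise ValueError("keyword must be 1 to 79 bytes")
    if b"\x00" in text:
        raise ValueError("text string may not contain a null character")
    # The text is not null-terminated; it runs to the end of the chunk.
    return keyword.decode("latin-1"), text.decode("latin-1")


print(parse_text(b"Title\x00Sunset over the bay"))
```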

The four-byte chunk type field contains the decimal values

122 84 88 116

The zTXt and tEXt chunks are semantically equivalent, but the zTXt chunk is recommended for storing large blocks of text.

A zTXt chunk contains:

Keyword                       1-79 bytes (character string)
Null separator                1 byte (null character)
Compression method            1 byte
Compressed text datastream    n bytes

The keyword and null character are the same as in the tEXt chunk (see 11.3.4.3: tEXt Textual data). The keyword is not compressed. The compression method entry defines the compression method used. The only value defined in this International Standard is 0 (deflate/inflate compression). Other values are reserved for future standardization (see 4.9 Extension and registration). The compression method entry is followed by the compressed text datastream that makes up the remainder of the chunk. For compression method 0, this datastream is a zlib datastream with deflate compression (see 10.3: Other uses of compression). Decompression of this datastream yields Latin-1 text that is identical to the text that would be stored in an equivalent tEXt chunk.
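The zTXt layout might be decoded as in the following sketch, which uses the zlib module of the Python standard library for compression method 0. The function name parse_ztxt is an assumption.

```python
import zlib


def parse_ztxt(data):
    """Split zTXt chunk data into (keyword, text), inflating the text."""
    keyword, separator, rest = data.partition(b"\x00")
    if not separator or not 1 <= len(keyword) <= 79:
        raise ValueError("bad keyword or missing null separator")
    if not rest:
        raise ValueError("missing compression method")
    if rest[0] != 0:
        raise ValueError("unknown compression method")
    # Everything after the method byte is one zlib datastream.
    text = zlib.decompress(rest[1:])
    return keyword.decode("latin-1"), text.decode("latin-1")


chunk = b"Description\x00\x00" + zlib.compress(b"A long description. " * 20)
print(parse_ztxt(chunk)[0])
```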

The four-byte chunk type field contains the decimal values

105 84 88 116

An iTXt chunk contains:

Keyword               1-79 bytes (character string)
Null separator        1 byte (null character)
Compression flag      1 byte
Compression method    1 byte
Language tag          0 or more bytes (character string)
Null separator        1 byte (null character)
Translated keyword    0 or more bytes
Null separator        1 byte (null character)
Text                  0 or more bytes

The keyword is described in 11.3.4.2: Keywords and text strings.

The compression flag is 0 for uncompressed text, 1 for compressed text. Only the text field may be compressed. The compression method entry defines the compression method used. The only compression method defined in this International Standard is 0 (zlib datastream with deflate compression, see 10.3: Other uses of compression). For uncompressed text, encoders shall set the compression method to 0, and decoders shall ignore it.

The language tag defined in [RFC-3066] indicates the human language used by the translated keyword and the text. Unlike the keyword, the language tag is case-insensitive. It is an ISO 646.IRV:1991 [ISO 646] string consisting of hyphen-separated words of 1-8 alphanumeric characters each (for example cn, en-uk, no-bok, x-klingon, x-KlInGoN). If the first word is two or three letters long, it is an ISO language code [ISO-639]. If the language tag is empty, the language is unspecified.

The translated keyword and text both use the UTF-8 encoding of UCS [ISO/IEC 10646-1], and neither shall contain a zero byte (null character). The text, unlike other textual data in this chunk, is not null-terminated; its length is derived from the chunk length.

Line breaks should not appear in the translated keyword. In the text, a newline should be represented by a single linefeed character (decimal 10). The remaining control characters (1-9, 11-31, 127-159) are discouraged in both the translated keyword and text. In UTF-8 there is a difference between the characters 128-159 (which are discouraged) and the bytes 128-159 (which are often necessary).

The translated keyword, if not empty, should contain a translation of the keyword into the language indicated by the language tag, and applications displaying the keyword should display the translated keyword in addition.
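The field order of the iTXt chunk can be walked as in the sketch below. The function name parse_itxt and the returned dictionary are illustrative assumptions; lower-casing the language tag is one possible way to honour its case-insensitivity and is not required by the text.

```python
import zlib


def parse_itxt(data):
    """Decode iTXt chunk data into a dictionary of its fields."""
    keyword, separator, rest = data.partition(b"\x00")
    if not separator or not 1 <= len(keyword) <= 79:
        raise ValueError("bad keyword or missing null separator")
    if len(rest) < 2:
        raise ValueError("missing compression flag or method")
    flag, method = rest[0], rest[1]
    language, separator, rest = rest[2:].partition(b"\x00")
    if not separator:
        raise ValueError("language tag is not null-terminated")
    translated, separator, text = rest.partition(b"\x00")
    if not separator:
        raise ValueError("translated keyword is not null-terminated")
    if flag == 1:
        if method != 0:
            raise ValueError("unknown compression method")
        text = zlib.decompress(text)  # only the text field is compressed
    elif flag != 0:
        raise ValueError("invalid compression flag")
    return {
        "keyword": keyword.decode("latin-1"),
        "language": language.decode("ascii").lower(),
        "translated_keyword": translated.decode("utf-8"),
        "text": text.decode("utf-8"),
    }


sample = b"Title\x00\x00\x00en\x00\x00Sunset"
print(parse_itxt(sample)["text"])
```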

The four-byte chunk type field contains the decimal values

98 75 71 68

The bKGD chunk specifies a default background colour to present the image against. If there is any other preferred background, either user-specified or part of a larger page (as in a browser), the bKGD chunk should be ignored. The bKGD chunk contains:

Colour types 0 and 4
    Greyscale        2 bytes

Colour types 2 and 6
    Red              2 bytes
    Green            2 bytes
    Blue             2 bytes

Colour type 3
    Palette index    1 byte

For colour type 3 (indexed-colour), the value is the palette index of the colour to be used as background.

For colour types 0 and 4 (greyscale, greyscale with alpha), the value is the grey level to be used as background in the range 0 to (2^bitdepth)-1. For colour types 2 and 6 (truecolour, truecolour with alpha), the values are the colour to be used as background, given as RGB samples in the range 0 to (2^bitdepth)-1. In each case, for consistency, two bytes per sample are used regardless of the image bit depth. If the image bit depth is less than 16, the least significant bits are used and the others are 0.
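The three bKGD layouts might be decoded as follows. This is a sketch under the assumption that the colour type and bit depth have already been read from IHDR; the function name parse_bkgd and the returned dictionary keys are hypothetical.

```python
import struct


def parse_bkgd(data, colour_type, bit_depth):
    """Decode bKGD chunk data according to the image colour type."""
    if colour_type == 3:
        if len(data) != 1:
            raise ValueError("indexed-colour bKGD is one byte")
        return {"palette_index": data[0]}
    if colour_type in (0, 4):
        names = ("grey",)
    elif colour_type in (2, 6):
        names = ("red", "green", "blue")
    else:
        raise ValueError("unknown colour type")
    if len(data) != 2 * len(names):
        raise ValueError("wrong bKGD length for this colour type")
    # Two bytes per sample, most significant byte first.
    values = struct.unpack(">%dH" % len(names), data)
    max_sample = (1 << bit_depth) - 1
    if any(value > max_sample for value in values):
        raise ValueError("background sample exceeds the image bit depth")
    return dict(zip(names, values))


print(parse_bkgd(b"\x00\xff\x00\x80\x00\x00", 2, 8))
```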

The four-byte chunk type field contains the decimal values

104 73 83 84

The hIST chunk contains a series of two-byte (16-bit) unsigned integers:

Frequency    2 bytes (unsigned integer)
...etc...

The hIST chunk gives the approximate usage frequency of each colour in the palette. A histogram chunk can appear only when a PLTE chunk appears. If a viewer is unable to provide all the colours listed in the palette, the histogram may help it decide how to choose a subset of the colours for display.

There shall be exactly one entry for each entry in the PLTE chunk. Each entry is proportional to the fraction of pixels in the image that have that palette index; the exact scale factor is chosen by the encoder.

Histogram entries are approximate, with the exception that a zero entry specifies that the corresponding palette entry is not used at all in the image. A histogram entry shall be nonzero if there are any pixels of that colour.
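One way an encoder could derive histogram entries from exact pixel counts is sketched below. The scale factor used here (mapping the most frequent colour to 65535) is an arbitrary choice, since the text leaves the scale factor to the encoder; the function name build_hist_entries is hypothetical.

```python
def build_hist_entries(pixel_counts):
    """Scale per-palette-entry pixel counts into 16-bit hIST entries."""
    peak = max(pixel_counts, default=0)
    entries = []
    for count in pixel_counts:
        if count == 0:
            # Zero is reserved for palette entries that are never used.
            entries.append(0)
        else:
            # Clamp to at least 1 so that a used colour is never reported as 0.
            entries.append(max(1, count * 65535 // peak))
    return entries


print(build_hist_entries([0, 1, 1000000]))
```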

NOTE When the palette is a suggested quantization of a truecolour image, the histogram is necessarily approximate, since a decoder may map pixels to palette entries differently than the encoder did. In this situation, zero entries should not normally appear, because any entry might be used.

The four-byte chunk type field contains the decimal values

112 72 89 115

The pHYs chunk specifies the intended pixel size or aspect ratio for display of the image. It contains:

Pixels per unit, X axis    4 bytes (PNG unsigned integer)
Pixels per unit, Y axis    4 bytes (PNG unsigned integer)
Unit specifier             1 byte

The following values are defined for the unit specifier:

0    unit is unknown
1    unit is the metre

When the unit specifier is 0, the pHYs chunk defines pixel aspect ratio only; the actual size of the pixels remains unspecified.

If the pHYs chunk is not present, pixels are assumed to be square, and the physical size of each pixel is unspecified.
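A minimal sketch of reading the pHYs data and, when the unit is the metre, converting to the more familiar pixels per inch (1 inch = 0.0254 m). The function names are assumptions, and the conversion to inches is a convenience that this International Standard does not define.

```python
import struct


def parse_phys(data):
    """Return (pixels_per_unit_x, pixels_per_unit_y, unit_specifier)."""
    if len(data) != 9:
        raise ValueError("pHYs chunk data is exactly 9 bytes")
    return struct.unpack(">IIB", data)


def pixels_per_inch(pixels_per_metre):
    """Convert a pixels-per-metre density to pixels per inch."""
    return pixels_per_metre * 0.0254


x, y, unit = parse_phys(struct.pack(">IIB", 2835, 2835, 1))
if unit == 1:
    print(round(pixels_per_inch(x)), round(pixels_per_inch(y)))
else:
    print("aspect ratio only:", x, ":", y)
```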

The four-byte chunk type field contains the decimal values

115 80 76 84

The sPLT chunk contains:

Palette name      1-79 bytes (character string)
Null separator    1 byte (null character)
Sample depth      1 byte
Red               1 or 2 bytes
Green             1 or 2 bytes
Blue              1 or 2 bytes
Alpha             1 or 2 bytes
Frequency         2 bytes
...etc...

Each palette entry is six bytes or ten bytes containing five unsigned integers (red, green, blue, alpha, and frequency).

There may be any number of entries. A PNG decoder determines the number of entries from the length of the chunk remaining after the sample depth byte. This shall be divisible by 6 if the sPLT sample depth is 8, or by 10 if the sPLT sample depth is 16. Entries shall appear in decreasing order of frequency. There is no requirement that the entries all be used by the image, nor that they all be different.

The palette name can be any convenient name for referring to the palette (for example "256 colour including Macintosh default", "256 colour including Windows-3.1 default", "Optimal 512"). The palette name may aid the choice of the appropriate suggested palette when more than one appears in a PNG datastream.

The palette name is case-sensitive, and subject to the same restrictions as the keyword parameter for the tEXt chunk. Palette names shall contain only printable Latin-1 characters and spaces (only character codes 32-126 and 161-255 decimal are allowed). Leading, trailing, and consecutive spaces are not permitted.

The sPLT sample depth shall be 8 or 16.

The red, green, blue, and alpha samples are either one or two bytes each, depending on the sPLT sample depth, regardless of the image bit depth. The colour samples are not premultiplied by alpha, nor are they precomposited against any background. An alpha value of 0 means fully transparent. An alpha value of 255 (when the sPLT sample depth is 8) or 65535 (when the sPLT sample depth is 16) means fully opaque. The sPLT chunk may appear for any PNG colour type. Entries in sPLT use the same gamma and chromaticity values as the PNG image, but may fall outside the range of values used in the colour space of the PNG image; for example, in a greyscale PNG image, each sPLT entry would typically have equal red, green, and blue values, but this is not required. Similarly, sPLT entries can have non-opaque alpha values even when the PNG image does not use transparency.

Each frequency value is proportional to the fraction of the pixels in the image for which that palette entry is the closest match in RGBA space, before the image has been composited against any background. The exact scale factor is chosen by the PNG encoder; it is recommended that the resulting range of individual values reasonably fills the range 0 to 65535. A PNG encoder may artificially inflate the frequencies for colours considered to be "important", for example the colours used in a logo or the facial features of a portrait. Zero is a valid frequency meaning that the colour is "least important" or that it is rarely, if ever, used. When all the frequencies are zero, they are meaningless, that is to say, nothing may be inferred about the actual frequencies with which the colours appear in the PNG image.

Multiple sPLT chunks are permitted, but each shall have a different palette name.
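The sPLT layout described above, including the derivation of the entry count from the chunk length, can be sketched as follows. The function name parse_splt and the tuple representation of each entry are assumptions.

```python
import struct


def parse_splt(data):
    """Decode sPLT chunk data into (name, sample_depth, entries).

    Each entry is a (red, green, blue, alpha, frequency) tuple.
    """
    name, separator, rest = data.partition(b"\x00")
    if not separator or not 1 <= len(name) <= 79:
        raise ValueError("bad palette name or missing null separator")
    if not rest:
        raise ValueError("missing sample depth")
    depth, body = rest[0], rest[1:]
    if depth == 8:
        entry_format, entry_size = ">4BH", 6
    elif depth == 16:
        entry_format, entry_size = ">5H", 10
    else:
        raise ValueError("sPLT sample depth shall be 8 or 16")
    if len(body) % entry_size:
        raise ValueError("remaining length is not a whole number of entries")
    entries = [
        struct.unpack_from(entry_format, body, offset)
        for offset in range(0, len(body), entry_size)
    ]
    return name.decode("latin-1"), depth, entries


chunk = b"Web\x00\x08" + struct.pack(">4BH", 255, 0, 0, 255, 100)
print(parse_splt(chunk))
```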

The four-byte chunk type field contains the decimal values

116 73 77 69

The tIME chunk gives the time of the last image modification (not the time of initial image creation). It contains:

Year      2 bytes (complete; for example, 1995, not 95)
Month     1 byte (1-12)
Day       1 byte (1-31)
Hour      1 byte (0-23)
Minute    1 byte (0-59)
Second    1 byte (0-60) (to allow for leap seconds)

Universal Time (UTC) should be specified rather than local time.

The tIME chunk is intended for use as an automatically-applied time stamp that is updated whenever the image data are changed.
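A sketch of producing the seven bytes of tIME data from a timezone-aware timestamp, converting to UTC as recommended above. The function name encode_time is hypothetical; the two-byte year is assumed to be stored most significant byte first, like other PNG integers.

```python
import struct
from datetime import datetime, timezone


def encode_time(moment):
    """Return the 7 bytes of tIME chunk data for an aware datetime."""
    if moment.tzinfo is None:
        raise ValueError("a timezone-aware datetime is required")
    utc = moment.astimezone(timezone.utc)
    return struct.pack(
        ">H5B", utc.year, utc.month, utc.day, utc.hour, utc.minute, utc.second
    )


stamp = encode_time(datetime(2003, 11, 10, 12, 30, 45, tzinfo=timezone.utc))
print(stamp.hex())
```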

This clause gives requirements and recommendations for encoder behaviour. A PNG encoder shall produce a PNG datastream from a PNG image that conforms to the format specified in the preceding clauses. Best results will usually be achieved by following the additional recommendations given here.

See Annex C: Gamma and chromaticity for a brief introduction to gamma issues.

PNG encoders capable of full colour management [ICC] will perform more sophisticated calculations than those described here and may choose to use the iCCP chunk. If it is known that the image samples conform to the sRGB specification [IEC 61966-2-1], encoders are strongly encouraged to write the sRGB chunk without performing additional gamma handling. In both cases it is recommended that an appropriate gAMA chunk be generated for use by PNG decoders that do not recognize the iCCP chunk or sRGB chunk.

A PNG encoder has to determine:

a) what value to write in the gAMA chunk;
b) how to transform the provided image samples into the values to be written in the PNG datastream.

The value to write in the gAMA chunk is that value which causes a PNG decoder to behave in the desired way. See 13.13: Decoder gamma handling.

The transform to be applied depends on the nature of the image samples and their precision. If the samples represent light intensity in floating-point or high precision integer form (perhaps from a computer graphics renderer), the encoder may perform "gamma encoding" (applying a power function with exponent less than 1) before quantizing the data to integer values for inclusion in the PNG datastream. This results in fewer banding artifacts at a given sample depth, or allows smaller samples while retaining the same visual quality. An intensity level expressed as a floating-point value in the range 0 to 1 can be converted to a datastream image sample by:

integer_sample = floor((2^sampledepth - 1) * intensity^encoding_exponent + 0.5)

If the intensity in the equation is the desired output intensity, the encoding exponent is the gamma value to be used in the gAMA chunk.
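The conversion from a floating-point intensity to a datastream sample can be sketched directly from the equation, with the maximum sample value taken as 2 raised to the sample depth, minus 1. The function name encode_sample is an assumption.

```python
import math


def encode_sample(intensity, sample_depth, encoding_exponent):
    """Gamma-encode an intensity in the range 0 to 1 and quantize it."""
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("intensity must be in the range 0 to 1")
    max_sample = (1 << sample_depth) - 1
    return math.floor(max_sample * intensity ** encoding_exponent + 0.5)


# Mid-grey scene intensity at 8 bits with an encoding exponent of 1/2.2.
print(encode_sample(0.5, 8, 1 / 2.2))
```

With an encoding exponent of 1 the same intensity maps to 128, which shows how gamma encoding allocates more sample values to darker intensities.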

If the intensity available to the PNG encoder is the original scene intensity, another transformation may be needed. There is sometimes a requirement for the displayed image to have higher contrast than the original source image. This corresponds to an end-to-end transfer function from original scene to display output with an exponent greater than 1. In this case:

gamma = encoding_exponent/end_to_end_exponent

If it is not known whether the conditions under which the original image was captured or calculated warrant such a contrast change, it may be assumed that the display intensities are proportional to original scene intensities, i.e. the end-to-end exponent is 1 and hence:

gamma = encoding_exponent

If the image is being written to a datastream only, the encoder is free to choose the encoding exponent. Choosing a value that causes the gamma value in the gAMA chunk to be 1/2.2 is often a reasonable choice because it minimizes the work for a PNG decoder displaying on a typical video monitor.

Some image renderers may simultaneously write the image to a PNG datastream and display it on-screen. The displayed pixels should be gamma corrected for the display system and viewing conditions in use, so that the user sees a proper representation of the intended scene.

If the renderer wants to write the displayed sample values to the PNG datastream, avoiding a separate gamma encoding step for the datastream, the renderer should approximate the transfer function of the display system by a power function, and write the reciprocal of the exponent into the gAMA chunk. This will allow a PNG decoder to reproduce what was displayed on screen for the originator during rendering.

However, it is equally reasonable for a renderer to compute displayed pixels appropriate for the display device, and to perform separate gamma encoding for data storage and transmission, arranging to have a value in the gAMA chunk more appropriate to the future use of the image.

Computer graphics renderers often do not perform gamma encoding, instead making sample values directly proportional to scene light intensity. If the PNG encoder receives sample values that have already been quantized into integer values, there is no point in doing gamma encoding on them; that would just result in further loss of information. The encoder should just write the sample values to the PNG datastream. This does not imply that the gAMA chunk should contain a gamma value of 1.0 because the desired end-to-end transfer function from scene intensity to display output intensity is not necessarily linear. However, the desired gamma value is probably not far from 1.0. It may depend on whether the scene being rendered is a daylight scene or an indoor scene, etc.

When the sample values come directly from a piece of hardware, the correct gAMA value can, in principle, be inferred from the transfer function of the hardware and lighting conditions of the scene. In the case of video digitizers ("frame grabbers"), the samples are probably in the sRGB colour space, because the sRGB specification was designed to be compatible with modern video standards. Image scanners are less predictable. Their output samples may be proportional to the input light intensity since CCD sensors themselves are linear, or the scanner hardware may have already applied a power function designed to compensate for dot gain in subsequent printing (an exponent of about 0.57), or the scanner may have corrected the samples for display on a monitor. It may be necessary to refer to the scanner's manual or to scan a calibrated target in order to determine the characteristics of a particular scanner. It should be remembered that gamma relates samples to desired display output, not to scanner input.

Datastream format converters generally should not attempt to convert supplied images to a different gamma. The data should be stored in the PNG datastream without conversion, and the gamma value should be deduced from information in the source datastream if possible. Gamma alteration at datastream conversion time causes re-quantization of the set of intensity levels that are represented, introducing further roundoff error with little benefit. It is almost always better to just copy the sample values intact from the input to the output file.

If the source datastream describes the gamma characteristics of the image, a datastream converter is strongly encouraged to write a gAMA chunk. Some datastream formats specify the display exponent (the exponent of the function which maps image samples to display output rather than the other direction). If the source file's gamma value is greater than 1.0, it is probably a display exponent, and the reciprocal of this value should be used for the PNG gamma value. If the source file format records the relationship between image samples and a quantity other than display output, it will be more complex than this to deduce the PNG gamma value.

If a PNG encoder or datastream converter knows that the image has been displayed satisfactorily using a display system whose transfer function can be approximated by a power function with exponent display_exponent, the image can be marked as having the gamma value:

gamma = 1/display_exponent
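The heuristic for datastream converters described above (a source gamma greater than 1.0 is probably a display exponent) might be sketched as follows. The function name png_gamma_from_source is hypothetical, and the heuristic is only a guess about the meaning of the source value, as the text notes.

```python
def png_gamma_from_source(source_gamma):
    """Guess the PNG gamma value from a gamma recorded in another format."""
    if source_gamma <= 0:
        raise ValueError("gamma must be positive")
    if source_gamma > 1.0:
        # Probably a display exponent: PNG stores its reciprocal.
        return 1.0 / source_gamma
    return source_gamma


# Value to store in the gAMA chunk for a display exponent of 2.2.
print(round(png_gamma_from_source(2.2) * 100000))
```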

It is better to write a gAMA chunk with a value that is approximately correct than to omit the chunk and force PNG decoders to guess an approximate gamma. If a PNG encoder is unable to infer the gamma value, it is preferable to omit the gAMA chunk. If a guess has to be made this should be left to the PNG decoder.

Gamma does not apply to alpha samples; alpha is always represented linearly.

See also 13.13: Decoder gamma handling.

See Annex C: Gamma and chromaticity for references to colour issues.

PNG encoders capable of full colour management [ICC] will perform more sophisticated calculations than those described here and may choose to use the iCCP chunk. If it is known that the image samples conform to the sRGB specification [IEC 61966-2-1], PNG encoders are strongly encouraged to use the sRGB chunk.

If it is possible for the encoder to determine the chromaticities of the source display primaries, or to make a strong guess based on the origin of the image, or the hardware running it, the encoder is strongly encouraged to output the cHRM chunk. If this is done, the gAMA chunk should also be written; decoders can do little with a cHRM chunk if the gAMA chunk is missing.

There are a number of recommendations and standards for primaries and white points, some of which are linked to particular technologies, for example the CCIR 709 standard [ITU-R-BT709] and the SMPTE-C standard [SMPTE-170M].

There are three cases that need to be considered:

a) the encoder is part of the generation system;
b) the source image is captured by a camera or scanner;
c) the PNG datastream was generated by translation from some other format.

In the case of hand-drawn or digitally edited images, it is necessary to determine what monitor they were viewed on when being produced. Many image editing programs allow the type of monitor being used to be specified. This is often because they are working in some device-independent space internally. Such programs have enough information to write valid cHRM and gAMA chunks, and are strongly encouraged to do so automatically.

If the encoder is compiled as a portion of a computer image renderer that performs full-spectral rendering, the monitor values that were used to convert from the internal device-independent colour space to RGB should be written into the cHRM chunk. Any colours that are outside the gamut of the chosen RGB device should be mapped to be within the gamut; PNG does not store out-of-gamut colours.

If the computer image renderer performs calculations directly in device-dependent RGB space, a cHRM chunk should not be written unless the scene description and rendering parameters have been adjusted for a particular monitor. In that case, the data for that monitor should be used to construct a cHRM chunk.

A few image formats store calibration information, which can be used to fill in the cHRM chunk. For example, TIFF 6.0 files [TIFF-6.0] can optionally store calibration information, which if present should be used to construct the cHRM chunk.

Video created with recent video equipment probably uses the CCIR 709 primaries and D65 white point [ITU-R-BT709], which are given in Table 12.1.

Table 12.1 — CCIR 709 primaries and D65 whitepoint

     R        G        B        White
x    0.640    0.300    0.150    0.3127
y    0.330    0.600    0.060    0.3290

An older but still very popular video standard is SMPTE-C [SMPTE-170M] given in Table 12.2.

Table 12.2 — SMPTE-C video standard

     R        G        B        White
x    0.630    0.310    0.155    0.3127
y    0.340    0.595    0.070    0.3290

It is not recommended that datastream format converters attempt to convert supplied images to a different RGB colour space. The data should be stored in the PNG datastream without conversion, and the source primary chromaticities should be recorded if they are known. Colour space transformation at datastream conversion time is a bad idea because of gamut mismatches and rounding errors. As with gamma conversions, it is better to store the data losslessly and incur at most one conversion when the image is finally displayed.

See also 13.14: Decoder colour handling.

The alpha channel can be regarded either as a mask that temporarily hides transparent parts of the image, or as a means for constructing a non-rectangular image. In the first case, the colour values of fully transparent pixels should be preserved for future use. In the second case, the transparent pixels carry no useful data and are simply there to fill out the rectangular image area required by PNG. In this case, fully transparent pixels should all be assigned the same colour value for best compression.

Image authors should keep in mind the possibility that a decoder will not support transparency control in full (see 13.16: Alpha channel processing). Hence, the colours assigned to transparent pixels should be reasonable background colours whenever feasible.

For applications that do not require a full alpha channel, or cannot afford the price in compression efficiency, the tRNS transparency chunk is also available.

If the image has a known background colour, this colour should be written in the bKGD chunk. Even decoders that ignore transparency may use the bKGD colour to fill unused screen area.

If the original image has premultiplied (also called "associated") alpha data, it can be converted to PNG's non-premultiplied format by dividing each sample value by the corresponding alpha value, then multiplying by the maximum value for the image bit depth, and rounding to the nearest integer. In valid premultiplied data, the sample values never exceed their corresponding alpha values, so the result of the division should always be in the range 0 to 1. If the alpha value is zero, output black (zeroes).
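The conversion from premultiplied samples can be sketched in integer arithmetic as follows. The function name unpremultiply is an assumption, and rounding an exact half upward is one reasonable reading of "rounding to the nearest integer".

```python
def unpremultiply(sample, alpha, bit_depth):
    """Convert one premultiplied sample to PNG's non-premultiplied form."""
    max_value = (1 << bit_depth) - 1
    if alpha == 0:
        return 0  # fully transparent: output black
    if sample > alpha:
        raise ValueError("premultiplied sample exceeds its alpha value")
    # Integer form of floor(sample / alpha * max_value + 0.5).
    return (2 * sample * max_value + alpha) // (2 * alpha)


print(unpremultiply(64, 128, 8))
```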

When encoding input samples that have a sample depth that cannot be directly represented in PNG, the encoder shall scale the samples up to a sample depth that is allowed by PNG. The most accurate scaling method is the linear equation:

output = floor((input * MAXOUTSAMPLE / MAXINSAMPLE) + 0.5)

where the input samples range from 0 to MAXINSAMPLE and the outputs range from 0 to MAXOUTSAMPLE (which is 2^sampledepth - 1).
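A minimal Python sketch of the linear equation follows, assuming non-negative integer samples. The function name `scale_linear` is hypothetical; the integer expression is algebraically equal to floor((input * MAXOUTSAMPLE / MAXINSAMPLE) + 0.5) and avoids floating-point rounding.

```python
def scale_linear(value: int, max_in: int, max_out: int) -> int:
    """Scale `value` from the range 0..max_in to the range 0..max_out.

    Hypothetical helper implementing
    floor(value * max_out / max_in + 0.5) with integer arithmetic only.
    """
    return (2 * value * max_out + max_in) // (2 * max_in)


# 5-bit samples (0..31) scaled to 8 bits (0..255)
print(scale_linear(0, 31, 255))   # 0
print(scale_linear(27, 31, 255))  # 222
print(scale_linear(31, 31, 255))  # 255
```

Because the function takes the maximum sample values rather than bit depths, it can also be applied when the source range is not a power of 2, for example `scale_linear(50, 100, 255)`.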

A close approximation to the linear scaling method is achieved by "left bit replication", which is shifting the valid bits to begin in the most significant bit and repeating the most significant bits into the open bits. This method is often faster to compute than linear scaling.

EXAMPLE Assume that 5-bit samples are being scaled up to 8 bits. If the source sample value is 27 (in the range from 0-31), then the original bits are:

 4 3 2 1 0
 ---------
 1 1 0 1 1

Left bit replication gives a value of 222:

 7 6 5 4 3 2 1 0
----------------
 1 1 0 1 1 1 1 0
|=======| |===|
    |       Leftmost Bits Repeated to Fill Open Bits
    |
Original Bits

which matches the value computed by the linear equation. Left bit replication usually gives the same value as linear scaling, and is never off by more than one.
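Left bit replication can be sketched in Python as follows. This is an illustrative sketch only; the function name `replicate_bits` and its parameters are hypothetical, and the loop handles the case where the source bits must be repeated more than once (for example, 1-bit or 2-bit samples scaled to 8 bits).

```python
def replicate_bits(value: int, in_depth: int, out_depth: int) -> int:
    """Scale `value` from in_depth bits to out_depth bits by left bit replication.

    Hypothetical helper; assumes 0 <= value < 2**in_depth and
    out_depth >= in_depth.
    """
    result = 0
    shift = out_depth - in_depth
    # Place whole copies of the source bits, starting at the most significant bit.
    while shift > 0:
        result |= value << shift
        shift -= in_depth
    # Fill the remaining low-order bits with the leftmost bits of the source
    # (when shift is 0, this is a final whole copy).
    result |= value >> -shift
    return result


print(replicate_bits(27, 5, 8))     # 222, as in the example above
print(replicate_bits(1, 1, 8))      # 255
print(replicate_bits(0xAB, 8, 16))  # 43947 (0xABAB)
```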

A distinctly less accurate approximation is obtained by simply left-shifting the input value and filling the low order bits with zeroes. This scheme cannot reproduce white exactly, since it does not generate an all-ones maximum value; the net effect is to darken the image slightly. This method is not recommended in general, but it does have the effect of improving compression, particularly when dealing with greater-than-8-bit sample depths. Since the relative error introduced by zero-fill scaling is small at high sample depths, some encoders may choose to use it. Zero-fill shall not be used for alpha channel data, however, since many decoders will treat alpha values of all zeroes and all ones as special cases. It is important to represent both those values exactly in the scaled data.
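The inability of zero-fill scaling to reproduce white can be seen in a short Python sketch. The function name `scale_zero_fill` is hypothetical, and the sketch assumes non-negative integer samples.

```python
def scale_zero_fill(value: int, in_depth: int, out_depth: int) -> int:
    """Scale by left-shifting and filling the low-order bits with zeroes.

    Hypothetical helper; assumes out_depth >= in_depth.
    """
    return value << (out_depth - in_depth)


# The maximum 5-bit value (white) does not reach the 8-bit maximum of 255.
print(scale_zero_fill(31, 5, 8))     # 248
# At higher depths the relative error is smaller: 65280 against a maximum of 65535.
print(scale_zero_fill(0xFF, 8, 16))  # 65280
```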

When the encoder writes an sBIT chunk, it is required to do the scaling in such a way that the high-order bits of the stored samples match the original data. That is, if the sBIT chunk specifies a sample depth of S, the high-order S bits of the stored data shall agree with the original S-bit data values. This allows decoders to recover the original data by shifting right. The added low-order bits are not constrained. All the above scaling methods meet this restriction.
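The recovery step a decoder can perform is sketched below in Python. The function name `recover_original` is hypothetical, and the check uses 5-bit data scaled to 8 bits by left bit replication as one instance of a conforming scaling method.

```python
def recover_original(stored: int, significant_bits: int, stored_depth: int) -> int:
    """Recover the original sample by discarding the added low-order bits.

    Hypothetical helper: `significant_bits` is the depth S given by sBIT,
    `stored_depth` is the sample depth used in the PNG datastream.
    """
    return stored >> (stored_depth - significant_bits)


# Every 5-bit value survives a round trip through left bit replication.
for original in range(32):
    stored = (original << 3) | (original >> 2)  # 5 bits scaled to 8 bits
    assert recover_original(stored, 5, 8) == original

print(recover_original(222, 5, 8))  # 27
```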

When scaling up source image data, it is recommended that the low-order bits be filled consistently for all samples; that is, the same source value should generate the same sample value at any pixel position. This improves compression by reducing the number of distinct sample values. This is not a mandatory requirement, and some encoders may choose not to follow it. For example, an encoder might instead dither the low-order bits, improving displayed image quality at the price of increasing file size.

In some applications the original source data may have a range that is not a power of 2. The linear scaling equation still works for this case, although the shifting methods do not. It is recommended that an sBIT chunk not be written for such images, since sBIT suggests that the original data range was exactly 0..2^S-1.

Suggested palettes may appear as sPLT chunks in any PNG datastream, or as a PLTE chunk in truecolour PNG datastreams. In either case, the suggested palette is not an essential part of the image data, but it may be used to present the image on indexed-colou