This appendix gives the reasoning behind some of the design decisions in PNG. Many of these decisions were the subject of considerable debate. The authors freely admit that another group might have made different decisions; however, we believe that our choices are defensible and consistent.
We have also addressed some of the widely known shortcomings of GIF. In particular, PNG supports truecolor images. We know of no widely used image format that losslessly compresses truecolor images as effectively as PNG does. We hope that PNG will make use of truecolor images more practical and widespread.
Some form of transparency control is desirable for applications in which images are displayed against a background or together with other images. GIF provided a simple transparent-color specification for this purpose. PNG supports a full alpha channel as well as transparent-color specifications. This allows both highly flexible transparency and compression efficiency.
Robustness against transmission errors has been an important consideration. For example, images transferred across the Internet are often mistakenly processed as text, leading to file corruption. PNG is designed so that such errors can be detected quickly and reliably.
PNG has been expressly designed not to be completely dependent on a single compression technique. Although deflate/inflate compression is mentioned in this document, PNG would still exist without it.
PNG also does not support multiple images in one file. This restriction is a reflection of the reality that many applications do not need and will not support multiple images per file. In any case, single images are a fundamentally different sort of object from sequences of images. Rather than make false promises of interchangeability, we have drawn a clear distinction between single-image and multi-image formats. PNG is a single-image format. (But see Multiple-image extension.)
GIF is no longer suitable as a universal standard because of legal entanglements. Although just replacing GIF's compression method would avoid that problem, GIF does not support truecolor images, alpha channels, or gamma correction. The spec has more subtle problems too. Only a small subset of the GIF89 spec is actually portable across a variety of implementations, but there is no codification of the most portable part of the spec.
TIFF is far too complex to meet our goals of simplicity and interchangeability. Defining a TIFF subset would meet that objection, but would frustrate users making the reasonable assumption that a file saved as TIFF from their existing software would load into a program supporting our flavor of TIFF. Furthermore, TIFF is not designed for stream processing, has no provision for progressive display, and does not currently provide any good, legally unencumbered, lossless compression method.
IFF has also been suggested, but is not suitable in detail: available image representations are too machine-specific or not adequately compressed. The overall chunk structure of IFF is a useful concept that PNG has liberally borrowed from, but we did not attempt to be bit-for-bit compatible with IFF chunk structure. Again this is due to detailed issues, notably the fact that IFF FORMs are not designed to be serially writable.
Lossless JPEG is not suitable because it does not provide for the storage of indexed-color images. Furthermore, its lossless truecolor compression is often inferior to that of PNG.
In practice, image gamma values around 1.0 and around 0.5 are both widely found. Older image standards such as GIF often do not account for this fact. The JFIF standard specifies that images in that format should use linear samples, but many JFIF images found on the Internet actually have a gamma somewhere near 0.4 or 0.5. The variety of images found and the variety of systems that people display them on have led to widespread problems with images appearing "too dark" or "too light".
PNG expects viewers to compensate for image gamma at the time that the image is displayed. Another possible approach is to expect encoders to convert all images to a uniform gamma at encoding time. While that method would speed viewers slightly, it has fundamental flaws:
See Gamma Tutorial for more information.
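As a rough illustration of display-time compensation, the sketch below computes the exponent a viewer could apply to normalized samples. It assumes that the file gamma is known, that the display system behaves like a power function with exponent 2.2, and that the desired end-to-end exponent is 1.0. The function names and the 2.2 default are assumptions made for this example.

```python
def decoding_exponent(file_gamma, display_exponent=2.2):
    """Exponent to apply to normalized samples before display.

    Assumes the desired end-to-end transfer exponent is 1.0 and that the
    display raises its input to display_exponent (2.2 is an assumed value).
    """
    return 1.0 / (file_gamma * display_exponent)


def compensate(sample, bit_depth, file_gamma, display_exponent=2.2):
    """Map an integer file sample to the integer value sent to the display."""
    max_value = (1 << bit_depth) - 1
    normalized = sample / max_value
    corrected = normalized ** decoding_exponent(file_gamma, display_exponent)
    return round(corrected * max_value)
```

Under these assumptions, an image stored with gamma 1/2.2 passes through essentially unchanged, while an image with linear samples (gamma 1.0) has its mid-tones raised before being sent to the display.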
Some image rendering techniques generate images with premultiplied alpha (the alpha value actually represents how much of the pixel is covered by the image). This representation can be converted to PNG by dividing the sample values by alpha, except where alpha is zero. The result will look good if displayed by a viewer that handles alpha properly, but will not look very good if the viewer ignores the alpha channel.
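The conversion can be sketched as follows for integer samples. This is a minimal example, not a prescribed procedure: the function name, the rounding rule, and the choice to emit zero color samples where alpha is zero (where the original color is not recoverable) are all assumptions.

```python
def unpremultiply(r, g, b, a, max_value=255):
    """Convert one premultiplied RGBA pixel to non-premultiplied form.

    Samples and alpha are integers in the range 0..max_value.
    """
    if a == 0:
        # Division is undefined here; zero color is an arbitrary choice.
        return (0, 0, 0, 0)

    def scale(c):
        # Divide by alpha/max_value with rounding, clamped to the legal range.
        return min(max_value, (c * max_value + a // 2) // a)

    return (scale(r), scale(g), scale(b), a)
```

For example, a premultiplied pixel (64, 32, 0) at alpha 128 becomes (128, 64, 0) at the same alpha.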
Although each form of alpha storage has its advantages, we did not want to require all PNG viewers to handle both forms. We standardized on non-premultiplied alpha as being the lossless and more general case.
The filter algorithms are defined to operate on bytes, rather than pixels; this gains simplicity and speed with very little cost in compression performance. Tests have shown that filtering is usually ineffective for images with fewer than 8 bits per sample, so providing pixelwise filtering for such images would be pointless. For 16 bit/sample data, bytewise filtering is nearly as effective as pixelwise filtering, because MSBs are predicted from adjacent MSBs, and LSBs are predicted from adjacent LSBs.
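The MSB/LSB point can be seen in a small sketch of the Sub filter applied bytewise. The function name is illustrative; bpp is the number of bytes per complete pixel, so for 16-bit grayscale each byte is predicted from the byte two positions earlier.

```python
def sub_filter(scanline: bytes, bpp: int) -> bytes:
    """Apply the Sub filter bytewise: each byte minus the byte bpp earlier."""
    out = bytearray(len(scanline))
    for i, value in enumerate(scanline):
        # Bytes to the left of the first pixel are treated as zero.
        left = scanline[i - bpp] if i >= bpp else 0
        out[i] = (value - left) & 0xFF  # arithmetic is modulo 256
    return bytes(out)
```

With the 16-bit samples 0x1234, 0x1236, 0x1238 and bpp = 2, the filtered bytes are 12 34 00 02 00 02: the high bytes cancel against high bytes and the low bytes against low bytes.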
The encoder is allowed to change filters for each new scanline. This creates no additional complexity for decoders, since a decoder is required to contain defiltering logic for every filter type anyway. The only cost is an extra byte per scanline in the pre-compression datastream. Our tests showed that when the same filter is selected for all scanlines, this extra byte compresses away to almost nothing, so there is little storage cost compared to a fixed filter specified for the whole image. And the potential benefits of adaptive filtering are too great to ignore. Even with the simplistic filter-choice heuristics so far discovered, adaptive filtering usually outperforms fixed filters. In particular, an adaptive filter can change behavior for successive passes of an interlaced image; a fixed filter cannot.
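A minimal sketch of per-scanline filter choice follows. To stay short it considers only filter types 0 (None) and 1 (Sub), and it picks the candidate with the smallest sum of byte magnitudes, treating bytes as signed. That heuristic and all function names are assumptions for illustration; a real encoder may choose differently.

```python
def filter_none(line: bytes, bpp: int) -> bytes:
    return bytes(line)


def filter_sub(line: bytes, bpp: int) -> bytes:
    return bytes((v - (line[i - bpp] if i >= bpp else 0)) & 0xFF
                 for i, v in enumerate(line))


def magnitude_sum(data: bytes) -> int:
    """Sum of byte values interpreted as signed, in absolute value."""
    return sum(b if b < 128 else 256 - b for b in data)


def filter_image(scanlines, bpp: int) -> bytes:
    """Prefix each scanline with its filter-type byte, chosen per line."""
    out = bytearray()
    for line in scanlines:
        candidates = [(0, filter_none(line, bpp)), (1, filter_sub(line, bpp))]
        ftype, best = min(candidates, key=lambda c: magnitude_sum(c[1]))
        out.append(ftype)  # the one extra byte per scanline
        out += best
    return bytes(out)
```

The output is the pre-compression datastream: one filter-type byte followed by the filtered bytes for each scanline.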
The ISO 8859-1 (Latin-1) character set was chosen as a compromise between functionality and portability. Some platforms cannot display anything more than 7-bit ASCII characters, while others can handle characters beyond the Latin-1 set. We felt that Latin-1 represents a widely useful and reasonably portable character set. Latin-1 is a direct subset of character sets commonly used on popular platforms such as Microsoft Windows and X Windows. It can also be handled on Macintosh systems with a simple remapping of characters.
There is presently no provision for text employing character sets other than Latin-1. We recognize that the need for other character sets will increase. However, PNG already requires that programmers implement a number of new and unfamiliar features, and text representation is not PNG's primary purpose. Since PNG provides for the creation and public registration of new ancillary chunks of general interest, we expect that text chunks for other character sets, such as Unicode, eventually will be registered and increase gradually in popularity.
   (decimal)              137  80  78  71  13  10   26  10
   (hexadecimal)           89  50  4e  47  0d  0a   1a  0a
   (ASCII C notation)    \211   P   N   G  \r  \n \032  \n
This signature both identifies the file as a PNG file and provides for immediate detection of common file-transfer problems. The first two bytes distinguish PNG files on systems that expect the first two bytes to identify the file type uniquely. The first byte is chosen as a non-ASCII value to reduce the probability that a text file may be misrecognized as a PNG file; also, it catches bad file transfers that clear bit 7. Bytes two through four name the format. The CR-LF sequence catches bad file transfers that alter newline sequences. The control-Z character stops file display under MS-DOS. The final line feed checks for the inverse of the CR-LF translation problem.
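The way each signature byte exposes a particular kind of damage can be sketched as a small classifier. The function name and the returned labels are illustrative assumptions; a real decoder might simply reject any mismatch.

```python
PNG_SIGNATURE = bytes([137, 80, 78, 71, 13, 10, 26, 10])


def diagnose_signature(header: bytes) -> str:
    """Classify the first eight bytes of a file (labels are illustrative)."""
    head = header[:8]
    if head == PNG_SIGNATURE:
        return "ok"
    if head[:4] == b"\x09PNG":
        # 0x89 with bit 7 cleared becomes 0x09.
        return "bit 7 cleared"
    if head[:4] != b"\x89PNG":
        return "not a PNG file"
    if head[4:6] != b"\r\n":
        # For example, CR-LF was rewritten as a bare LF.
        return "newline sequence altered"
    # CR-LF survived but control-Z or the final LF did not,
    # for example a bare LF rewritten as CR.
    return "final bytes altered"
```

A transfer that converts CR-LF to LF is caught by the fifth and sixth bytes, while one that rewrites a bare LF is caught by the last byte.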
A decoder may further verify that the next eight bytes contain an IHDR chunk header with the correct chunk length; this will catch bad transfers that drop or alter null (zero) bytes.
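A hedged sketch of this additional check: the eight bytes after the signature hold a four-byte big-endian length followed by the four-byte chunk type, and the IHDR data field is 13 bytes long. The function name is an assumption.

```python
import struct

IHDR_LENGTH = 13  # size in bytes of the IHDR data field


def ihdr_header_ok(data: bytes) -> bool:
    """Check bytes 8..15 of a file for a well-formed IHDR chunk header."""
    if len(data) < 16:
        return False
    length, chunk_type = struct.unpack(">I4s", data[8:16])
    return length == IHDR_LENGTH and chunk_type == b"IHDR"
```

Because the length field of IHDR is 00 00 00 0d, a transfer that strips or alters zero bytes shifts or changes these eight bytes and the check fails.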
Note that there is no version number in the signature, nor indeed anywhere in the file. This is intentional: the chunk mechanism provides a better, more flexible way to handle format extensions, as explained in Chunk naming conventions.
Limiting chunk length to (2^31)-1 bytes avoids possible problems for implementations that cannot conveniently handle 4-byte unsigned values. In practice, chunks will usually be much shorter than that anyway.
A separate CRC is provided for each chunk in order to detect badly-transferred images as quickly as possible. In particular, critical data such as the image dimensions can be validated before being used.
The chunk length is excluded from the CRC so that the CRC can be calculated as the data is generated; this avoids a second pass over the data in cases where the chunk length is not known in advance. Excluding the length from the CRC does not create any extra risk of failing to discover file corruption, since if the length is wrong, the CRC check will fail: the CRC will be computed on the wrong set of bytes and then be tested against the wrong value from the file.
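Both points can be sketched with Python's zlib.crc32, which computes the same CRC-32 and accepts a running value. The function names are assumptions, and the writer buffers the chunk body in memory only to keep the example short; a streaming writer could instead patch the length field afterward.

```python
import struct
import zlib


def write_chunk(chunk_type: bytes, pieces) -> bytes:
    """Build a chunk while updating the CRC as each piece of data arrives."""
    crc = zlib.crc32(chunk_type)  # CRC covers type and data, not length
    body = bytearray()
    for piece in pieces:
        crc = zlib.crc32(piece, crc)  # incremental update, no second pass
        body += piece
    return (struct.pack(">I", len(body)) + chunk_type + bytes(body)
            + struct.pack(">I", crc))


def chunk_is_valid(chunk: bytes) -> bool:
    """Verify a chunk, trusting its length field to locate the CRC."""
    (length,) = struct.unpack(">I", chunk[:4])
    covered = chunk[4:8 + length]          # type plus data
    stored = chunk[8 + length:12 + length]  # where the CRC should be
    return len(stored) == 4 and zlib.crc32(covered) == struct.unpack(">I", stored)[0]
```

If the length field is damaged, chunk_is_valid reads the CRC from the wrong offset and computes it over the wrong bytes, so the comparison fails even though the length itself is not covered.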
A hypothetical chunk for vector graphics would be a critical chunk, since if ignored, important parts of the intended image would be missing. A chunk carrying the Mandelbrot set coordinates for a fractal image would be ancillary, since other applications could display the image without understanding what the image represents. In general, a chunk type should be made critical only if it is impossible to display a reasonable representation of the intended image without interpreting that chunk.
The public/private property bit ensures that any newly defined public chunk type name cannot conflict with proprietary chunks that could be in use somewhere. However, this does not protect users of private chunk names from the possibility that someone else may use the same chunk name for a different purpose. It is a good idea to put additional identifying information at the start of the data for any private chunk type.
When a PNG file is modified, certain ancillary chunks may need to be changed to reflect changes in other chunks. For example, a histogram chunk needs to be changed if the image data changes. If the file editor does not recognize histogram chunks, copying them blindly to a new output file is incorrect; such chunks should be dropped. The safe/unsafe property bit allows ancillary chunks to be marked appropriately.
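The property bits discussed above are carried in bit 5 (the lowercase bit) of each byte of the four-letter chunk type. The sketch below decodes them and applies a simplified copy rule for an editor that does not recognize a chunk; the function names and the shape of the rule are assumptions for illustration.

```python
def chunk_properties(chunk_type: bytes) -> dict:
    """Decode the property bits from a four-byte chunk type."""
    if len(chunk_type) != 4:
        raise ValueError("chunk type must be exactly four bytes")
    lower = [bool(b & 0x20) for b in chunk_type]  # bit 5 of each byte
    return {
        "ancillary": lower[0],     # lowercase first letter: not critical
        "private": lower[1],       # lowercase second letter: not public
        "reserved": lower[2],
        "safe_to_copy": lower[3],  # lowercase fourth letter
    }


def may_copy_unknown(chunk_type: bytes, critical_chunks_modified: bool) -> bool:
    """Decide whether an editor may copy a chunk it does not understand."""
    props = chunk_properties(chunk_type)
    if not props["ancillary"]:
        raise ValueError("unknown critical chunk: file cannot be processed")
    return props["safe_to_copy"] or not critical_chunks_modified
```

Under this rule a histogram chunk (hIST, unsafe to copy) is dropped once the image data has changed, while a text chunk (tEXt, safe to copy) is carried across.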
Not all possible modification scenarios are covered by the safe/unsafe semantics. In particular, chunks that are dependent on the total file contents are not supported. (An example of such a chunk is an index of IDAT chunk locations within the file: adding a comment chunk would inadvertently break the index.) Definition of such chunks is discouraged. If absolutely necessary for a particular application, such chunks can be made critical chunks, with consequent loss of portability to other applications. In general, ancillary chunks can depend on critical chunks but not on other ancillary chunks. It is expected that mutually dependent information should be put into a single chunk.
In some situations it may be unavoidable to make one ancillary chunk dependent on another. Although the chunk property bits are insufficient to represent this case, a simple solution is available: in the dependent chunk, record the CRC of the chunk depended on. It can then be determined whether that chunk has been changed by some other program.
The same technique can be useful for other purposes. For example, if a program relies on the palette being in a particular order, it can store a private chunk containing the CRC of the PLTE chunk. If this value matches when the file is again read in, then it provides high confidence that the palette has not been tampered with. Note that it is not necessary to mark the private chunk unsafe-to-copy when this technique is used; thus, such a private chunk can survive other editing of the file.
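A minimal sketch of the palette case, assuming a hypothetical private chunk whose data is just the four-byte CRC of the PLTE chunk (computed over chunk type and data, as chunk CRCs are). The function names and the guard layout are invented for this example.

```python
import struct
import zlib


def chunk_crc(chunk_type: bytes, data: bytes) -> int:
    """CRC as stored in a chunk: computed over type and data."""
    return zlib.crc32(chunk_type + data)


def make_palette_guard(plte_data: bytes) -> bytes:
    """Data for a hypothetical private chunk recording the PLTE CRC."""
    return struct.pack(">I", chunk_crc(b"PLTE", plte_data))


def palette_unchanged(guard_data: bytes, plte_data: bytes) -> bool:
    """True if the palette still matches the CRC recorded earlier."""
    return guard_data == make_palette_guard(plte_data)
```

On reading the file back, a program would compare the stored guard against the current PLTE data and fall back to safe behavior if they differ.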
Other image formats have usually addressed this problem by specifying that the palette entries should appear in order of frequency of use. That is an inferior solution, because it doesn't give the viewer nearly as much information: the viewer can't determine how much damage will be done by dropping the last few colors. Nor does a sorted palette give enough information to choose a target palette for dithering, in the case that the viewer needs to reduce the number of colors substantially. A palette histogram provides the information needed to choose such a target palette without making a pass over the image data.
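As a rough sketch of how a viewer might use a histogram, the example below keeps the most frequently used palette entries and maps each dropped entry to its nearest kept color. This selection strategy is deliberately simple and is an assumption of the example, as are the function name and the squared-RGB distance measure; a real viewer would likely use a better quantizer.

```python
def choose_target_palette(palette, histogram, max_colors):
    """Pick a reduced palette using only the palette and its histogram.

    palette is a list of (r, g, b) tuples and histogram gives the usage
    count of each entry. Returns the reduced palette and, for every
    original index, the index of its replacement in the reduced palette.
    """
    by_use = sorted(range(len(palette)), key=lambda i: histogram[i], reverse=True)
    kept = sorted(by_use[:max_colors])  # keep original relative order

    def distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    mapping = [
        min(range(len(kept)), key=lambda k: distance(palette[i], palette[kept[k]]))
        for i in range(len(palette))
    ]
    return [palette[i] for i in kept], mapping
```

No pass over the image data is needed: the counts alone say which entries matter, and also how many pixels are affected by each entry that is dropped.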