JBIG2

From Wikipedia, the free encyclopedia
Image file format: JBIG2
Internet media type: image/x-jbig2
Developed by: Joint Bi-level Image Experts Group
Latest release: 2
Contained by: Portable Document Format, FAX
Standard: ITU T.88 & ISO/IEC 14492

JBIG2 is an image compression standard for bi-level images, developed by the Joint Bi-level Image Experts Group. It is suitable for both lossless and lossy compression. According to a press release[1] from the Group, in its lossless mode JBIG2 typically generates files 3–5 times smaller than Fax Group 4 and 2–4 times smaller than JBIG, the previous bi-level compression standard released by the Group. JBIG2 was published in 2000 as the international standard ITU T.88,[2] and in 2001 as ISO/IEC 14492.[3]

Functionality

Ideally, a JBIG2 encoder will segment the input page into regions of text, regions of halftone images, and regions of other data. Regions that are neither text nor halftones are typically compressed using a context-dependent arithmetic coding algorithm called the MQ coder. Textual regions are compressed as follows: the foreground pixels in the regions are grouped into symbols. A dictionary of symbols is then created and encoded, typically also using context-dependent arithmetic coding, and the regions are encoded by describing which symbols appear where. Typically, a symbol will correspond to a character of text, but this is not required by the compression method. For lossy compression the difference between similar symbols (e.g., slightly different impressions of the same letter) can be neglected; for lossless compression, this difference is taken into account by compressing one similar symbol using another as a template. Halftone images may be compressed by reconstructing the grayscale image used to generate the halftone and then sending this image together with a dictionary of halftone patterns.[4] Overall, the algorithm used by JBIG2 to compress text is very similar to the JB2 compression scheme used in the DjVu file format for coding binary images.

PDF files of version 1.4 and above may contain JBIG2-compressed data. Open-source decoders for JBIG2 are jbig2dec[5] (AGPL), the Java-based jbig2-imageio[6] (Apache-2), the JavaScript-based jbig2.js[7] (Apache-2), and the decoder by Glyph & Cog LLC found in Xpdf and Poppler[8] (both GPL). An open-source encoder is jbig2enc[9] (Apache-2).

Technical details

Typically, a bi-level image consists mainly of a large amount of textual and halftone data, in which the same shapes appear repeatedly. The bi-level image is segmented into three types of regions: text, halftone, and generic. Each region type is coded differently, and the coding methods are described in the sections below.

Text image data

Text coding is based on the nature of human visual interpretation. A human observer cannot tell the difference between two instances of the same character in a bi-level image even though they may not match exactly pixel by pixel. Therefore, only the bitmap of one representative character instance needs to be coded, instead of coding the bitmap of each occurrence of the same character individually. The coded representative of each character is then stored in a "symbol dictionary".[4] There are two encoding methods for text image data: pattern matching and substitution (PM&S) and soft pattern matching (SPM).[10]

Block diagrams of (left) pattern matching and substitution method and (right) soft pattern matching method

Pattern matching and substitution (PM&S) is the more classic coding method. The encoder performs image segmentation to isolate character-sized chunks. For each individual chunk, the encoder looks for a match in the bitmap dictionary. If a match exists, the encoder codes the index of the corresponding representative bitmap in the dictionary and the position of the character on the page. The position is usually relative to another previously coded character. If a match is not found, the segmented pixel block is coded directly and added to the dictionary. The typical steps of the pattern matching and substitution algorithm are shown in the left block diagram of the figure above. Although PM&S can achieve outstanding compression, substitution errors can occur during the process if the image resolution is low.[10]
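The PM&S loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure of any real encoder: the names `hamming` and `pms_encode`, the pixel-count threshold `max_diff`, the use of absolute rather than relative positions, and the tiny 3×5 "6" and "8" bitmaps are all invented for the example.

```python
from typing import List, Tuple

Bitmap = Tuple[Tuple[int, ...], ...]  # rows of 0/1 pixels

# Hypothetical 3x5 glyphs; they differ in exactly one pixel.
SIX: Bitmap = ((1, 1, 1), (1, 0, 0), (1, 1, 1), (1, 0, 1), (1, 1, 1))
EIGHT: Bitmap = ((1, 1, 1), (1, 0, 1), (1, 1, 1), (1, 0, 1), (1, 1, 1))


def hamming(a: Bitmap, b: Bitmap) -> int:
    """Count differing pixels; bitmaps of different sizes never match."""
    if len(a) != len(b) or len(a[0]) != len(b[0]):
        return 10**9
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def pms_encode(chunks, max_diff: int):
    """chunks: list of (x, y, bitmap). Returns (dictionary, placements)."""
    dictionary: List[Bitmap] = []
    placements: List[Tuple[int, int, int]] = []  # (dictionary index, x, y)
    for x, y, bitmap in chunks:
        match = None
        for index, representative in enumerate(dictionary):
            if hamming(bitmap, representative) <= max_diff:
                match = index  # reuse the representative: substitution
                break
        if match is None:
            dictionary.append(bitmap)  # no match: code the chunk directly
            match = len(dictionary) - 1
        placements.append((match, x, y))
    return dictionary, placements


chunks = [(0, 0, SIX), (4, 0, EIGHT), (8, 0, SIX)]
strict_dictionary, strict_placements = pms_encode(chunks, max_diff=0)
loose_dictionary, loose_placements = pms_encode(chunks, max_diff=1)
```

With `max_diff=0` the dictionary holds both glyphs. With `max_diff=1` the "8" is matched to the "6" already in the dictionary and is coded as index 0, which is the kind of substitution error that a loose match on low-resolution input can produce.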

JBIG2 improves on PM&S with optional soft pattern matching (SPM). The same segmentation and searching is performed, but for each found match, the encoder saves not only the corresponding dictionary entry, but also refinement data describing the difference between the actual chunk and the dictionary chunk. Doing so greatly reduces substitution errors.[4][a] Since the dictionary match requires that the actual character and the dictionary character are highly similar, SPM only adds a tiny amount of data.[10]
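The role of refinement data can be illustrated with a per-pixel difference mask. This is a simplification chosen for the example: a real JBIG2 encoder compresses the actual symbol using the dictionary symbol as a template rather than storing a raw XOR mask, and the function names and the 3×5 bitmaps below are hypothetical.

```python
# Hypothetical 3x5 glyphs; they differ in exactly one pixel.
SIX = ((1, 1, 1), (1, 0, 0), (1, 1, 1), (1, 0, 1), (1, 1, 1))
EIGHT = ((1, 1, 1), (1, 0, 1), (1, 1, 1), (1, 0, 1), (1, 1, 1))


def refinement(actual, representative):
    """Per-pixel XOR: 1 wherever the actual chunk differs from the dictionary entry."""
    return tuple(
        tuple(a ^ r for a, r in zip(actual_row, rep_row))
        for actual_row, rep_row in zip(actual, representative)
    )


def reconstruct(representative, diff):
    """Apply the difference mask to the dictionary entry to recover the actual chunk."""
    return tuple(
        tuple(r ^ d for r, d in zip(rep_row, diff_row))
        for rep_row, diff_row in zip(representative, diff)
    )


diff = refinement(EIGHT, SIX)          # the "8" was matched to the "6" entry
restored = reconstruct(SIX, diff)      # decoder side
changed_pixels = sum(map(sum, diff))   # how much refinement data is needed
```

Here `restored` equals `EIGHT` again, and only one pixel of the mask is set, which reflects why refinement adds little data when the match is close.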

Halftones

Halftone images can be compressed using two methods. The first is similar to the context-based arithmetic coding algorithm, which adaptively positions the template pixels in order to capture correlations between adjacent pixels. In the second method, descreening is performed on the halftone image so that the image is converted back to grayscale. The converted grayscale values are then used as indexes into a halftone bitmap dictionary of small, fixed-size bitmap patterns. This allows the decoder to render a halftone image by placing the indexed dictionary patterns next to one another.[4]
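The decoder side of the second method can be sketched as tiling dictionary patterns according to a grid of grayscale indexes. The pattern dictionary built by `make_patterns` (pixels filled in row-major order) is an assumption made for brevity, and both function names are hypothetical; real halftone dictionaries use patterns chosen by the encoder.

```python
def make_patterns(size):
    """Pattern k has its first k pixels (row-major) set, for k = 0 .. size*size."""
    patterns = []
    for level in range(size * size + 1):
        flat = [1 if i < level else 0 for i in range(size * size)]
        patterns.append([flat[r * size:(r + 1) * size] for r in range(size)])
    return patterns


def render_halftone(gray_indexes, patterns):
    """Place the indexed patterns side by side to rebuild the bi-level image."""
    size = len(patterns[0])
    page = []
    for index_row in gray_indexes:
        for r in range(size):
            row = []
            for index in index_row:
                row.extend(patterns[index][r])
            page.append(row)
    return page


patterns = make_patterns(2)              # 5 patterns of 2x2 pixels
page = render_halftone([[0, 2], [4, 1]], patterns)
# page == [[0, 0, 1, 1],
#          [0, 0, 0, 0],
#          [1, 1, 1, 0],
#          [1, 1, 0, 0]]
```

Only the small index grid and the dictionary need to be transmitted; the full-resolution bitmap is regenerated from them.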

Entropy coding

All three region types (text, halftone, and generic) may use either arithmetic coding or Huffman coding. For arithmetic coding, JBIG2 uses the MQ coder, the same entropy encoder employed by JPEG 2000.
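To give a feel for what "context-dependent" means here, the following sketch estimates how many bits an ideal arithmetic coder would spend on a bi-level image when each pixel's probability is predicted from already-coded neighbours. It is not the MQ coder: the four-pixel context, the simple counting estimator, and the function names are assumptions chosen for brevity.

```python
import math


def context_of(image, y, x):
    """Pack four already-coded neighbours (W, NW, N, NE) into a 4-bit context."""
    def pixel(yy, xx):
        if 0 <= yy < len(image) and 0 <= xx < len(image[0]):
            return image[yy][xx]
        return 0  # pixels outside the image are treated as white

    return (
        (pixel(y, x - 1) << 3)
        | (pixel(y - 1, x - 1) << 2)
        | (pixel(y - 1, x) << 1)
        | pixel(y - 1, x + 1)
    )


def ideal_code_length(image):
    """Sum of -log2(p) over all pixels, with p adapted separately per context."""
    counts = {}  # context -> [zeros seen + 1, ones seen + 1]
    bits = 0.0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            seen = counts.setdefault(context_of(image, y, x), [1, 1])
            probability = seen[value] / (seen[0] + seen[1])
            bits += -math.log2(probability)
            seen[value] += 1  # adapt the estimate for this context
    return bits


blank = [[0] * 8 for _ in range(8)]
blank_bits = ideal_code_length(blank)  # about 6 bits instead of 64 raw bits
```

For the blank 8×8 image every pixel falls into the same context, the estimate quickly becomes confident, and the total comes to log2(65) ≈ 6.02 bits. An arithmetic coder approaches such ideal lengths; a Huffman coder cannot spend less than one bit per coded symbol.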

Patents

Patents for JBIG2 are owned by IBM and Mitsubishi. Free licenses should be available on request. The JBIG and JBIG2 patents are not the same.[12][13][14]

Character substitution errors in scanned documents

Some implementations of JBIG2 using lossy compression can potentially alter the characters in documents that are scanned to PDF. Unlike some other algorithms where compression artifacts are obvious, such as blurring[15] or mosquito noise, JBIG2's "pattern matching" matches up similar-looking symbols. If the matching is implemented poorly, especially in low-resolution scans where characters are less clearly defined, similar characters may get erroneously swapped.

In 2013, various substitutions (e.g., replacing "6" with "8") were reported to happen on many Xerox WorkCentre photocopier and printer machines. Numbers printed on scanned (but not OCR-ed) documents had potentially been altered. This was demonstrated on construction blueprints and some tables of numbers; the potential impact of such substitution errors in documents such as medical prescriptions was briefly mentioned.[16][17][18] German computer scientist David Kriesel and Xerox investigated the issue.[19][20]

Xerox subsequently acknowledged that this was a long-standing software defect, and that its initial statements suggesting that only non-factory settings could introduce the substitution were incorrect. No attempt was made to recall the affected devices or to mandate updates to them, although the defect was acknowledged to affect more than a dozen product families. However, in August 2013 a software patch was made available that, when installed, automatically disabled pattern matching.[21] Documents scanned before then may still contain errors, which makes their veracity difficult to substantiate.

David Kriesel thought that "the error cause is not JBIG2 itself".[16] However, authorities in some countries have issued statements against the use of JBIG2.[22] In Germany, the Federal Office for Information Security has issued a technical guideline that says the JBIG2 encoding "MUST NOT be used" for "replacement scanning".[23] In Switzerland, the Coordination Office for the Permanent Archiving of Electronic Documents (Koordinationsstelle für die dauerhafte Archivierung elektronischer Unterlagen) has recommended against the use of JBIG2 when creating PDF documents.[24] In its transfer guidance tables, US NARA states it “will not accept digitized records in PDF that have been saved with lossy compression (e.g., JPEG, JBIG2)”, and even lists “JBIG1 or JBIG2” among terms indicating prohibited outputs. Its “acceptable codecs” lists for PDF/A omit JBIG2 entirely (they permit ZIP and lossless JPEG 2000).

Exploit

A vulnerability in the Xpdf implementation of JBIG2, re-used in Apple's iOS phone operating software, was used by the Pegasus spyware to implement a zero-click attack on iPhones by constructing an emulated computer architecture inside a JBIG2 stream. Apple fixed this "FORCEDENTRY" vulnerability in iOS 14.8 in September 2021.[25]

References

  a. ^ If the refinement data is used without any threshold of difference, encoding would be completely lossless. This could be less efficient than just tagging the whole page as a "generic region" for direct arithmetic coding.[11]
  1. ^ Press release from the Joint Bi-level Image Experts Group. Archived 2005-05-15 at the Wayback Machine.
  2. ^ "ITU-T Recommendation T.88 – T.88 : Information technology - Coded representation of picture and audio information - Lossy/lossless coding of bi-level images". Retrieved 2011-02-19.
  3. ^ "ISO/IEC 14492:2001 – Information technology – Lossy/lossless coding of bi-level images". Retrieved 2011-02-19.
  4. ^ a b c d Ono, F.; Rucklidge, W.; Arps, R.; Constantinescu, C. (2000). "JBIG2-the ultimate bi-level image coding standard". Proceedings 2000 International Conference on Image Processing. Vol. 1. IEEE. pp. 140–3. doi:10.1109/ICIP.2000.900914. ISBN 0-7803-6297-7.
  5. ^ jbig2dec decoder home page.
  6. ^ jbig2-imageio decoder plugin for Java's ImageIO.
  7. ^ jbig2.js decoder for PDF.js.
  8. ^ JBIG2Stream decoder by Glyph & Cog LLC.
  9. ^ jbig2enc encoder project home.
  10. ^ a b c Howard, P.G.; Kossentini, F.; Martins, B.; Forchhammer, S.; Rucklidge, W.J. (November 1998). "The emerging JBIG2 standard". IEEE Transactions on Circuits and Systems for Video Technology. 8 (7): 838–848. Bibcode:1998ITCSV...8..838H. doi:10.1109/76.735380. ISSN 1558-2205.
  11. ^ Langley, Adam. "jbig2enc: Documentation". GitHub. We can choose to do this for each symbol on the page, so we don't have to refine when we are only a couple of pixel off. If we refine whenever we [sic] a wrong pixel, we have lossless encoding using symbols.
  12. ^ What is the patent situation with JBIG?, archived from the original on 2012-02-23
  13. ^ What is JBIG2?, archived from the original on 2012-04-14, retrieved 2012-04-07
  14. ^ JBIG2 patents, archived from the original on 2017-09-29, retrieved 2012-04-07
  15. ^ Zhou Wang, Hamid R. Sheikh and Alan C. Bovik (2002). "No-reference perceptual quality assessment of JPEG compressed images". Proceedings 2002 International Conference on Image Processing (PDF). Archived from the original (PDF) on 2013-11-02.
  16. ^ a b "Xerox scanners/photocopiers randomly alter numbers in scanned documents". 2013-08-02. Retrieved 2013-08-04.
  17. ^ "Confused Xerox copiers rewrite documents, expert finds". BBC News. 2013-08-06. Retrieved 2013-08-06.
  18. ^ "Xerox Scanners / Photocopiers Randomly Alter Numbers". The Font Feed. 5 August 2013. Archived from the original on 26 October 2017.
  19. ^ "Xerox investigating latest mangling test findings". 2013-08-11. Retrieved 2013-08-11.
  20. ^ Update on Scanning Issue: Software Patches To Come, Xerox (blog), 2013-08-11, archived from the original on 2013-11-04, retrieved 2013-08-11
  21. ^ "Scanning and Compression White Paper" (PDF). xerox.com. Xerox Corporation. 2013. Archived from the original (PDF) on 2022-01-21.
  22. ^ Kriesel, David. "Video and Slides of my Xerox Talk at 31C3". D. Kriesel Data Science, Machine Learning, BBQ, Photos, and Ants in a Terrarium. Retrieved 31 July 2016.
    Note: The video there is dubbed over in English; if you understand German, the original might be easier to follow: David Kriesel: Traue keinem Scan, den du nicht selbst gefälscht hast
  23. ^ "BSI Technical Guidelines 03138: Replacement Scanning" (PDF). Federal Office for Information Security. Retrieved 28 December 2021.
  24. ^ "JBIG2 Compression". CECO. Archived from the original on 2025-02-19. Retrieved 2021-12-28.
  25. ^ Beer, Ian; Groß, Samuel (2021-12-15). "Project Zero: A deep dive into an NSO zero-click iMessage exploit: Remote Code Execution". Google Project Zero. Retrieved 2021-12-16.
