Rename Huffman coding to prefix coding in the bitstream spec

... since no guarantee of Huffman coding can be introduced at decoding
time and encoding didn't actually use Huffman coding in the first place

Bug: webp:551
Change-Id: I400466bb3b4a1d5506353eb3f287d658603164ee
Jyrki Alakuijala 2022-06-16 17:46:18 +00:00 committed by James Zern
parent 8895f8a345
commit 404c1622f8

@ -27,7 +27,7 @@ exactly, including the color values for zero alpha pixels. The
format uses subresolution images, recursively embedded into the format format uses subresolution images, recursively embedded into the format
itself, for storing statistical data about the images, such as the used itself, for storing statistical data about the images, such as the used
entropy codes, spatial predictors, color space conversion, and color entropy codes, spatial predictors, color space conversion, and color
table. LZ77, Huffman coding, and a color cache are used for compression table. LZ77, prefix coding, and a color cache are used for compression
of the bulk data. Decoding speeds faster than PNG have been of the bulk data. Decoding speeds faster than PNG have been
demonstrated, as well as 25% denser compression than can be achieved demonstrated, as well as 25% denser compression than can be achieved
using today's PNG format. using today's PNG format.
@ -65,9 +65,9 @@ distance mapping
entropy image entropy image
: A two-dimensional subresolution image indicating which entropy coding : A two-dimensional subresolution image indicating which entropy coding
should be used in a respective square in the image, i.e., each pixel should be used in a respective square in the image, i.e., each pixel
is a meta Huffman code. is a meta prefix code.
Huffman code prefix code
: A classic way to do entropy coding where a smaller number of bits are : A classic way to do entropy coding where a smaller number of bits are
used for more frequent codes. used for more frequent codes.
@ -75,9 +75,9 @@ LZ77
: Dictionary-based sliding window compression algorithm that either : Dictionary-based sliding window compression algorithm that either
emits symbols or describes them as sequences of past symbols. emits symbols or describes them as sequences of past symbols.
meta Huffman code meta prefix code
: A small integer (up to 16 bits) that indexes an element in the meta : A small integer (up to 16 bits) that indexes an element in the meta
Huffman table. prefix table.
predictor image predictor image
: A two-dimensional subresolution image indicating which spatial : A two-dimensional subresolution image indicating which spatial
@ -617,8 +617,8 @@ We use image data in five different roles:
1. ARGB image: Stores the actual pixels of the image. 1. ARGB image: Stores the actual pixels of the image.
1. Entropy image: Stores the 1. Entropy image: Stores the
[meta Huffman codes](#decoding-of-meta-huffman-codes). The red and green [meta prefix codes](#decoding-of-meta-prefix-codes). The red and green
components of a pixel define the meta Huffman code used in a particular components of a pixel define the meta prefix code used in a particular
block of the ARGB image. block of the ARGB image.
1. Predictor image: Stores the metadata for [Predictor 1. Predictor image: Stores the metadata for [Predictor
Transform](#predictor-transform). The green component of a pixel defines Transform](#predictor-transform). The green component of a pixel defines
@ -651,7 +651,7 @@ the image.
Each pixel is encoded using one of the three possible methods: Each pixel is encoded using one of the three possible methods:
1. Huffman coded literal: each channel (green, red, blue and alpha) is 1. prefix coded literal: each channel (green, red, blue and alpha) is
entropy-coded independently; entropy-coded independently;
2. LZ77 backward reference: a sequence of pixels are copied from elsewhere 2. LZ77 backward reference: a sequence of pixels are copied from elsewhere
in the image; or in the image; or
@ -660,9 +660,9 @@ Each pixel is encoded using one of the three possible methods:
The following sub-sections describe each of these in detail. The following sub-sections describe each of these in detail.
#### 4.2.1 Huffman Coded Literals #### 4.2.1 Prefix Coded Literals
The pixel is stored as Huffman coded values of green, red, blue and alpha (in The pixel is stored as prefix coded values of green, red, blue and alpha (in
that order). See [this section](#decoding-entropy-coded-image-data) for details. that order). See [this section](#decoding-entropy-coded-image-data) for details.
#### 4.2.2 LZ77 Backward Reference #### 4.2.2 LZ77 Backward Reference
@ -788,11 +788,11 @@ Color cache stores a set of colors that have been recently used in the image.
**Rationale:** This way, the recently used colors can sometimes be referred to **Rationale:** This way, the recently used colors can sometimes be referred to
more efficiently than emitting them using the other two methods (described in more efficiently than emitting them using the other two methods (described in
[4.2.1](#huffman-coded-literals) and [4.2.2](#lz77-backward-reference)). [4.2.1](#prefix-coded-literals) and [4.2.2](#lz77-backward-reference)).
Color cache codes are stored as follows. First, there is a 1-bit value that Color cache codes are stored as follows. First, there is a 1-bit value that
indicates if the color cache is used. If this bit is 0, no color cache codes indicates if the color cache is used. If this bit is 0, no color cache codes
exist, and they are not transmitted in the Huffman code that decodes the green exist, and they are not transmitted in the prefix code that decodes the green
symbols and the length prefix codes. However, if this bit is 1, the color cache symbols and the length prefix codes. However, if this bit is 1, the color cache
size is read next: size is read next:
@ -823,11 +823,11 @@ literals, into the cache in the order they appear in the stream.
### 5.1 Overview ### 5.1 Overview
Most of the data is coded using [canonical Huffman code][canonical_huff]. Hence, Most of the data is coded using a [canonical prefix code][canonical_huff].
the codes are transmitted by sending the _Huffman code lengths_, as opposed to Hence, the codes are transmitted by sending the _prefix code lengths_, as
the actual _Huffman codes_. opposed to the actual _prefix codes_.
In particular, the format uses **spatially-variant Huffman coding**. In other In particular, the format uses **spatially-variant prefix coding**. In other
words, different blocks of the image can potentially use different entropy words, different blocks of the image can potentially use different entropy
codes. codes.
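
Reviewer note: the overview says the code lengths are sent instead of the codes themselves. That works because a canonical prefix code can be rebuilt deterministically from its lengths. Below is a minimal sketch, assuming the conventional canonical assignment (shorter codes first, ties broken by symbol index, as described for DEFLATE in RFC 1951). `AssignCanonicalCodes` and `MAX_CODE_LENGTH` are hypothetical names for illustration, not part of the spec or of libwebp, and the order in which code bits appear in the bitstream is outside this sketch.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CODE_LENGTH 15  /* assumption: code lengths are in 0..15 */

/* Fills codes[i] with the canonical code for symbol i, given its length in
 * bits. A length of 0 marks an unused symbol; its code is left at 0. */
static void AssignCanonicalCodes(const uint8_t* code_lengths, int num_symbols,
                                 uint32_t* codes) {
  int length_count[MAX_CODE_LENGTH + 1] = {0};
  uint32_t next_code[MAX_CODE_LENGTH + 1] = {0};
  uint32_t code = 0;
  int i, len;

  /* Count how many symbols use each code length. */
  for (i = 0; i < num_symbols; ++i) {
    assert(code_lengths[i] <= MAX_CODE_LENGTH);
    ++length_count[code_lengths[i]];
  }
  length_count[0] = 0;  /* unused symbols do not consume code space */

  /* Compute the first code value for each length. */
  for (len = 1; len <= MAX_CODE_LENGTH; ++len) {
    code = (code + (uint32_t)length_count[len - 1]) << 1;
    next_code[len] = code;
  }

  /* Hand out consecutive codes within each length, in symbol order. */
  for (i = 0; i < num_symbols; ++i) {
    codes[i] = (code_lengths[i] != 0) ? next_code[code_lengths[i]]++ : 0;
  }
}
```

For the lengths `{2, 1, 3, 3}` this yields the codes `10`, `0`, `110` and `111` (in binary), which form a prefix-free set.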
@ -840,19 +840,19 @@ potentially better compression.
The encoded image data consists of several parts: The encoded image data consists of several parts:
1. Decoding and building the prefix codes \[AMENDED2\] 1. Decoding and building the prefix codes \[AMENDED2\]
1. Meta Huffman codes 1. Meta prefix codes
1. Entropy-coded image data 1. Entropy-coded image data
#### 5.2.1 Decoding and Building the Prefix Codes #### 5.2.1 Decoding and Building the Prefix Codes
There are several steps in decoding the Huffman codes. There are several steps in decoding the prefix codes.
**Decoding the Code Lengths:** **Decoding the Code Lengths:**
{:#decoding-the-code-lengths} {:#decoding-the-code-lengths}
This section describes how to read the Huffman code lengths from the bitstream. This section describes how to read the prefix code lengths from the bitstream.
The Huffman code lengths can be coded in two ways. The method used is specified The prefix code lengths can be coded in two ways. The method used is specified
by a 1-bit value. by a 1-bit value.
* If this bit is 1, it is a _simple code length code_, and * If this bit is 1, it is a _simple code length code_, and
@ -865,8 +865,8 @@ stream. This may be inefficient, but it is allowed by the format.
\[AMENDED2\] \[AMENDED2\]
This variant is used in the special case when only 1 or 2 Huffman symbols are This variant is used in the special case when only 1 or 2 prefix symbols are
in the range \[0..255\] with code length `1`. All other Huffman code lengths in the range \[0..255\] with code length `1`. All other prefix code lengths
are implicitly zeros. are implicitly zeros.
The first bit indicates the number of symbols: The first bit indicates the number of symbols:
@ -891,17 +891,17 @@ if (num_symbols == 2) {
} }
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
**Note:** Another special case is when _all_ Huffman code lengths are _zeros_ **Note:** Another special case is when _all_ prefix code lengths are _zeros_
(an empty Huffman code). For example, a Huffman code for distance can be empty (an empty prefix code). For example, a prefix code for distance can be empty
if there are no backward references. Similarly, Huffman codes for alpha, red, if there are no backward references. Similarly, prefix codes for alpha, red,
and blue can be empty if all pixels within the same meta Huffman code are and blue can be empty if all pixels within the same meta prefix code are
produced using the color cache. However, this case doesn't need a special produced using the color cache. However, this case doesn't need a special
handling, as empty Huffman codes can be coded as those containing a single handling, as empty prefix codes can be coded as those containing a single
symbol `0`. symbol `0`.
**(ii) Normal Code Length Code:** **(ii) Normal Code Length Code:**
The code lengths of the Huffman code fit in 8 bits and are read as follows. The code lengths of the prefix code fit in 8 bits and are read as follows.
First, `num_code_lengths` specifies the number of code lengths. First, `num_code_lengths` specifies the number of code lengths.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@ -910,7 +910,7 @@ int num_code_lengths = 4 + ReadBits(4);
If `num_code_lengths` is > 18, the bitstream is invalid. If `num_code_lengths` is > 18, the bitstream is invalid.
The code lengths are themselves encoded using Huffman codes: lower level code The code lengths are themselves encoded using prefix codes: lower level code
lengths `code_length_code_lengths` first have to be read. The rest of those lengths `code_length_code_lengths` first have to be read. The rest of those
`code_length_code_lengths` (according to the order in `kCodeLengthCodeOrder`) `code_length_code_lengths` (according to the order in `kCodeLengthCodeOrder`)
are zeros. are zeros.
@ -934,7 +934,7 @@ int length_nbits = 2 + 2 * ReadBits(3);
int max_symbol = 2 + ReadBits(length_nbits); int max_symbol = 2 + ReadBits(length_nbits);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A Huffman table is then built from `code_length_code_lengths` and used to read A prefix table is then built from `code_length_code_lengths` and used to read
up to `max_symbol` code lengths. up to `max_symbol` code lengths.
* Code \[0..15\] indicates literal code lengths. * Code \[0..15\] indicates literal code lengths.
@ -955,85 +955,85 @@ distance) is formed using their respective alphabet sizes:
* other literals (A,R,B): 256 * other literals (A,R,B): 256
* distance code: 40 * distance code: 40
#### 5.2.2 Decoding of Meta Huffman Codes #### 5.2.2 Decoding of Meta Prefix Codes
As noted earlier, the format allows the use of different Huffman codes for As noted earlier, the format allows the use of different prefix codes for
different blocks of the image. _Meta Huffman codes_ are indexes identifying different blocks of the image. _Meta prefix codes_ are indexes identifying
which Huffman codes to use in different parts of the image. which prefix codes to use in different parts of the image.
Meta Huffman codes may be used _only_ when the image is being used in the Meta prefix codes may be used _only_ when the image is being used in the
[role](#roles-of-image-data) of an _ARGB image_. [role](#roles-of-image-data) of an _ARGB image_.
There are two possibilities for the meta Huffman codes, indicated by a 1-bit There are two possibilities for the meta prefix codes, indicated by a 1-bit
value: value:
* If this bit is zero, there is only one meta Huffman code used everywhere in * If this bit is zero, there is only one meta prefix code used everywhere in
the image. No more data is stored. the image. No more data is stored.
* If this bit is one, the image uses multiple meta Huffman codes. These meta * If this bit is one, the image uses multiple meta prefix codes. These meta
Huffman codes are stored as an _entropy image_ (described below). prefix codes are stored as an _entropy image_ (described below).
**Entropy image:** **Entropy image:**
The entropy image defines which Huffman codes are used in different parts of the The entropy image defines which prefix codes are used in different parts of the
image, as described below. image, as described below.
The first 3-bits contain the `huffman_bits` value. The dimensions of the entropy The first 3-bits contain the `prefix_bits` value. The dimensions of the entropy
image are derived from 'huffman_bits'. image are derived from 'prefix_bits'.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
int huffman_bits = ReadBits(3) + 2; int prefix_bits = ReadBits(3) + 2;
int huffman_xsize = DIV_ROUND_UP(xsize, 1 << huffman_bits); int prefix_xsize = DIV_ROUND_UP(xsize, 1 << prefix_bits);
int huffman_ysize = DIV_ROUND_UP(ysize, 1 << huffman_bits); int prefix_ysize = DIV_ROUND_UP(ysize, 1 << prefix_bits);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
where `DIV_ROUND_UP` is as defined [earlier](#predictor-transform). where `DIV_ROUND_UP` is as defined [earlier](#predictor-transform).
The next bits contain an entropy image of width `huffman_xsize` and height The next bits contain an entropy image of width `prefix_xsize` and height
`huffman_ysize`. `prefix_ysize`.
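
Reviewer note: a small worked sketch of the size computation above, using the renamed variables. It assumes `DIV_ROUND_UP` is the usual rounding-up integer division (its definition is not part of this diff). `DivRoundUp` and `EntropyImageSize` are hypothetical helper names, and `raw_bits` stands in for the result of `ReadBits(3)`.

```c
#include <assert.h>

/* Rounding-up integer division, written as a function for this sketch. */
static int DivRoundUp(int num, int den) { return (num + den - 1) / den; }

/* raw_bits is the 3-bit value read from the bitstream (0..7). */
static void EntropyImageSize(int xsize, int ysize, int raw_bits,
                             int* prefix_bits, int* prefix_xsize,
                             int* prefix_ysize) {
  assert(raw_bits >= 0 && raw_bits <= 7);
  assert(xsize > 0 && ysize > 0);
  *prefix_bits = raw_bits + 2;                        /* block side is 1 << prefix_bits */
  *prefix_xsize = DivRoundUp(xsize, 1 << *prefix_bits);
  *prefix_ysize = DivRoundUp(ysize, 1 << *prefix_bits);
}
```

For a 100x50 image and a raw value of 2, `prefix_bits` is 4, so blocks are 16x16 pixels and the entropy image is 7x4.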
**Interpretation of Meta Huffman Codes:** **Interpretation of Meta Prefix Codes:**
For any given pixel (x, y), there is a set of five Huffman codes associated with For any given pixel (x, y), there is a set of five prefix codes associated with
it. These codes are (in bitstream order): it. These codes are (in bitstream order):
* **Huffman code #1**: used for green channel, backward-reference length and * **prefix code #1**: used for green channel, backward-reference length and
color cache color cache
* **Huffman code #2, #3 and #4**: used for red, blue and alpha channels * **prefix code #2, #3 and #4**: used for red, blue and alpha channels
respectively. respectively.
* **Huffman code #5**: used for backward-reference distance. * **prefix code #5**: used for backward-reference distance.
From here on, we refer to this set as a **Huffman code group**. From here on, we refer to this set as a **prefix code group**.
The number of Huffman code groups in the ARGB image can be obtained by finding The number of prefix code groups in the ARGB image can be obtained by finding
the _largest meta Huffman code_ from the entropy image: the _largest meta prefix code_ from the entropy image:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
int num_huff_groups = max(entropy image) + 1; int num_prefix_groups = max(entropy image) + 1;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
where `max(entropy image)` indicates the largest Huffman code stored in the where `max(entropy image)` indicates the largest prefix code stored in the
entropy image. entropy image.
As each Huffman code group contains five Huffman codes, the total number of As each prefix code group contains five prefix codes, the total number of
Huffman codes is: prefix codes is:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
int num_huff_codes = 5 * num_huff_groups; int num_prefix_codes = 5 * num_prefix_groups;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Given a pixel (x, y) in the ARGB image, we can obtain the corresponding Huffman Given a pixel (x, y) in the ARGB image, we can obtain the corresponding prefix
codes to be used as follows: codes to be used as follows:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
int position = (y >> huffman_bits) * huffman_xsize + (x >> huffman_bits); int position = (y >> prefix_bits) * prefix_xsize + (x >> prefix_bits);
int meta_huff_code = (entropy_image[pos] >> 8) & 0xffff; int meta_prefix_code = (entropy_image[pos] >> 8) & 0xffff;
HuffmanCodeGroup huff_group = huffman_code_groups[meta_huff_code]; PrefixCodeGroup prefix_group = prefix_code_groups[meta_prefix_code];
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
where, we have assumed the existence of `HuffmanCodeGroup` structure, which where, we have assumed the existence of `PrefixCodeGroup` structure, which
represents a set of five Huffman codes. Also, `huffman_code_groups` is an array represents a set of five prefix codes. Also, `prefix_code_groups` is an array
of `HuffmanCodeGroup` (of size `num_huff_groups`). of `PrefixCodeGroup` (of size `num_prefix_groups`).
The decoder then uses Huffman code group `huff_group` to decode the pixel The decoder then uses prefix code group `prefix_group` to decode the pixel
(x, y) as explained in the [next section](#decoding-entropy-coded-image-data). (x, y) as explained in the [next section](#decoding-entropy-coded-image-data).
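
Reviewer note: the lookup described above can be sketched as a pair of self-contained helpers. This assumes entropy-image pixels are stored as 32-bit ARGB values (alpha in the top byte), so that `(pixel >> 8) & 0xffff` extracts the red and green components. `MetaPrefixCode` and `NumPrefixGroups` are hypothetical names, and the index variable is called `position` throughout.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the meta prefix code for pixel (x, y) of the ARGB image: the red
 * and green components of the entropy-image pixel covering its block. */
static int MetaPrefixCode(const uint32_t* entropy_image, int prefix_xsize,
                          int prefix_bits, int x, int y) {
  int position;
  assert(x >= 0 && y >= 0 && prefix_bits >= 2);
  position = (y >> prefix_bits) * prefix_xsize + (x >> prefix_bits);
  return (int)((entropy_image[position] >> 8) & 0xffff);
}

/* Number of prefix code groups: the largest meta prefix code plus one. */
static int NumPrefixGroups(const uint32_t* entropy_image, int num_entries) {
  int max_code = 0;
  int i;
  for (i = 0; i < num_entries; ++i) {
    const int code = (int)((entropy_image[i] >> 8) & 0xffff);
    if (code > max_code) max_code = code;
  }
  return max_code + 1;
}
```

The value returned by `MetaPrefixCode` is the index into `prefix_code_groups`; an array of `NumPrefixGroups(...)` entries is large enough for every index that can occur.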
#### 5.2.3 Decoding Entropy-coded Image Data #### 5.2.3 Decoding Entropy-coded Image Data
@ -1041,10 +1041,10 @@ The decoder then uses Huffman code group `huff_group` to decode the pixel
\[AMENDED2\] \[AMENDED2\]
For the current position (x, y) in the image, the decoder first identifies the For the current position (x, y) in the image, the decoder first identifies the
corresponding Huffman code group (as explained in the last section). Given the corresponding prefix code group (as explained in the last section). Given the
Huffman code group, the pixel is read and decoded as follows: prefix code group, the pixel is read and decoded as follows:
Read next symbol S from the bitstream using Huffman code #1. Note that S is any Read next symbol S from the bitstream using prefix code #1. Note that S is any
integer in the range `0` to integer in the range `0` to
`(256 + 24 + ` [`color_cache_size`](#color-cache-code)` - 1)`. `(256 + 24 + ` [`color_cache_size`](#color-cache-code)` - 1)`.
@ -1052,15 +1052,15 @@ The interpretation of S depends on its value:
1. if S < 256 1. if S < 256
1. Use S as the green component. 1. Use S as the green component.
1. Read red from the bitstream using Huffman code #2. 1. Read red from the bitstream using prefix code #2.
1. Read blue from the bitstream using Huffman code #3. 1. Read blue from the bitstream using prefix code #3.
1. Read alpha from the bitstream using Huffman code #4. 1. Read alpha from the bitstream using prefix code #4.
1. if S >= 256 && S < 256 + 24 1. if S >= 256 && S < 256 + 24
1. Use S - 256 as a length prefix code. 1. Use S - 256 as a length prefix code.
1. Read extra bits for length from the bitstream. 1. Read extra bits for length from the bitstream.
1. Determine backward-reference length L from length prefix code and the 1. Determine backward-reference length L from length prefix code and the
extra bits read. extra bits read.
1. Read distance prefix code from the bitstream using Huffman code #5. 1. Read distance prefix code from the bitstream using prefix code #5.
1. Read extra bits for distance from the bitstream. 1. Read extra bits for distance from the bitstream.
1. Determine backward-reference distance D from distance prefix code and 1. Determine backward-reference distance D from distance prefix code and
the extra bits read. the extra bits read.
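
Reviewer note: the dispatch on S can be sketched as below. The first two branches follow the list above. The third branch (treating `S - (256 + 24)` as a color cache index) is not visible in this hunk and is an assumption based on the stated range of S and the three coding methods listed in section 4.2. `SymbolKind` and `ClassifySymbol` are hypothetical names.

```c
#include <assert.h>

typedef enum { kLiteral, kBackwardRef, kColorCache } SymbolKind;

/* Classifies symbol S (read with prefix code #1) and stores the derived
 * value in *payload: the green component, the length prefix code, or
 * (assumed) the color cache index. */
static SymbolKind ClassifySymbol(int s, int color_cache_size, int* payload) {
  assert(s >= 0 && s < 256 + 24 + color_cache_size);
  if (s < 256) {
    *payload = s;               /* green component of a literal pixel */
    return kLiteral;
  }
  if (s < 256 + 24) {
    *payload = s - 256;         /* length prefix code of a backward reference */
    return kBackwardRef;
  }
  *payload = s - (256 + 24);    /* assumption: index into the color cache */
  return kColorCache;
}
```

A decoder would go on to read red, blue and alpha with prefix codes #2 to #4 for `kLiteral`, and the extra bits and distance (prefix code #5) for `kBackwardRef`.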
@ -1109,23 +1109,23 @@ of pixels (xsize * ysize).
\[AMENDED2\] \[AMENDED2\]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<spatially-coded image> ::= <color cache info><meta huffman><data> <spatially-coded image> ::= <color cache info><meta prefix><data>
<entropy-coded image> ::= <color cache info><data> <entropy-coded image> ::= <color cache info><data>
<color cache info> ::= 1 bit value 0 | <color cache info> ::= 1 bit value 0 |
(1-bit value 1; 4-bit value for color cache size) (1-bit value 1; 4-bit value for color cache size)
<meta huffman> ::= 1-bit value 0 | <meta prefix> ::= 1-bit value 0 |
(1-bit value 1; <entropy image>) (1-bit value 1; <entropy image>)
<data> ::= <huffman codes><lz77-coded image> <data> ::= <prefix codes><lz77-coded image>
<entropy image> ::= 3-bit subsample value; <entropy-coded image> <entropy image> ::= 3-bit subsample value; <entropy-coded image>
<huffman codes> ::= <huffman code group> | <huffman code group><huffman codes> <prefix codes> ::= <prefix code group> | <prefix code group><prefix codes>
<huffman code group> ::= <huffman code><huffman code><huffman code> <prefix code group> ::= <prefix code><prefix code><prefix code>
<huffman code><huffman code> <prefix code><prefix code>
See "Interpretation of Meta Huffman codes" to See "Interpretation of Meta Prefix Codes" to
understand what each of these five Huffman codes are understand what each of these five prefix codes are
for. for.
<huffman code> ::= <simple huffman code> | <normal huffman code> <prefix code> ::= <simple prefix code> | <normal prefix code>
<simple huffman code> ::= see "Simple code length code" for details <simple prefix code> ::= see "Simple code length code" for details
<normal huffman code> ::= <code length code>; encoded code lengths <normal prefix code> ::= <code length code>; encoded code lengths
<code length code> ::= see section "Normal code length code" <code length code> ::= see section "Normal code length code"
<lz77-coded image> ::= ((<argb-pixel> | <lz77-copy> | <color-cache-code>) <lz77-coded image> ::= ((<argb-pixel> | <lz77-copy> | <color-cache-code>)
<lz77-coded image>) | "" <lz77-coded image>) | ""
@ -1136,7 +1136,7 @@ A possible example sequence:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<RIFF header><image size>1-bit value 1<subtract-green-tx> <RIFF header><image size>1-bit value 1<subtract-green-tx>
1-bit value 1<predictor-tx>1-bit value 0<color cache info>1-bit value 0 1-bit value 1<predictor-tx>1-bit value 0<color cache info>1-bit value 0
<huffman codes><lz77-coded image> <prefix codes><lz77-coded image>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[canonical_huff]: https://en.wikipedia.org/wiki/Canonical_Huffman_code [canonical_huff]: https://en.wikipedia.org/wiki/Canonical_Huffman_code