Rename Huffman coding to prefix coding in the bitstream spec

... since no guarantee of Huffman coding can be introduced at decoding
time and encoding didn't actually use Huffman coding in the first place

Bug: webp:551
Change-Id: I400466bb3b4a1d5506353eb3f287d658603164ee
This commit is contained in:
Jyrki Alakuijala 2022-06-16 17:46:18 +00:00 committed by James Zern
parent 8895f8a345
commit 404c1622f8

View File

@ -27,7 +27,7 @@ exactly, including the color values for zero alpha pixels. The
format uses subresolution images, recursively embedded into the format
itself, for storing statistical data about the images, such as the used
entropy codes, spatial predictors, color space conversion, and color
table. LZ77, Huffman coding, and a color cache are used for compression
table. LZ77, prefix coding, and a color cache are used for compression
of the bulk data. Decoding speeds faster than PNG have been
demonstrated, as well as 25% denser compression than can be achieved
using today's PNG format.
@ -65,9 +65,9 @@ distance mapping
entropy image
: A two-dimensional subresolution image indicating which entropy coding
should be used in a respective square in the image, i.e., each pixel
is a meta Huffman code.
is a meta prefix code.
Huffman code
prefix code
: A classic way to do entropy coding where a smaller number of bits are
used for more frequent codes.
@ -75,9 +75,9 @@ LZ77
: Dictionary-based sliding window compression algorithm that either
emits symbols or describes them as sequences of past symbols.
meta Huffman code
meta prefix code
: A small integer (up to 16 bits) that indexes an element in the meta
Huffman table.
prefix table.
predictor image
: A two-dimensional subresolution image indicating which spatial
@ -617,8 +617,8 @@ We use image data in five different roles:
1. ARGB image: Stores the actual pixels of the image.
1. Entropy image: Stores the
[meta Huffman codes](#decoding-of-meta-huffman-codes). The red and green
components of a pixel define the meta Huffman code used in a particular
[meta prefix codes](#decoding-of-meta-prefix-codes). The red and green
components of a pixel define the meta prefix code used in a particular
block of the ARGB image.
1. Predictor image: Stores the metadata for [Predictor
Transform](#predictor-transform). The green component of a pixel defines
@ -651,7 +651,7 @@ the image.
Each pixel is encoded using one of the three possible methods:
1. Huffman coded literal: each channel (green, red, blue and alpha) is
1. prefix coded literal: each channel (green, red, blue and alpha) is
entropy-coded independently;
2. LZ77 backward reference: a sequence of pixels are copied from elsewhere
in the image; or
@ -660,9 +660,9 @@ Each pixel is encoded using one of the three possible methods:
The following sub-sections describe each of these in detail.
#### 4.2.1 Huffman Coded Literals
#### 4.2.1 Prefix Coded Literals
The pixel is stored as Huffman coded values of green, red, blue and alpha (in
The pixel is stored as prefix coded values of green, red, blue and alpha (in
that order). See [this section](#decoding-entropy-coded-image-data) for details.
#### 4.2.2 LZ77 Backward Reference
@ -788,11 +788,11 @@ Color cache stores a set of colors that have been recently used in the image.
**Rationale:** This way, the recently used colors can sometimes be referred to
more efficiently than emitting them using the other two methods (described in
[4.2.1](#huffman-coded-literals) and [4.2.2](#lz77-backward-reference)).
[4.2.1](#prefix-coded-literals) and [4.2.2](#lz77-backward-reference)).
Color cache codes are stored as follows. First, there is a 1-bit value that
indicates if the color cache is used. If this bit is 0, no color cache codes
exist, and they are not transmitted in the Huffman code that decodes the green
exist, and they are not transmitted in the prefix code that decodes the green
symbols and the length prefix codes. However, if this bit is 1, the color cache
size is read next:
@ -823,11 +823,11 @@ literals, into the cache in the order they appear in the stream.
### 5.1 Overview
Most of the data is coded using [canonical Huffman code][canonical_huff]. Hence,
the codes are transmitted by sending the _Huffman code lengths_, as opposed to
the actual _Huffman codes_.
Most of the data is coded using a [canonical prefix code][canonical_huff].
Hence, the codes are transmitted by sending the _prefix code lengths_, as
opposed to the actual _prefix codes_.
In particular, the format uses **spatially-variant Huffman coding**. In other
In particular, the format uses **spatially-variant prefix coding**. In other
words, different blocks of the image can potentially use different entropy
codes.
@ -840,19 +840,19 @@ potentially better compression.
The encoded image data consists of several parts:
1. Decoding and building the prefix codes \[AMENDED2\]
1. Meta Huffman codes
1. Meta prefix codes
1. Entropy-coded image data
#### 5.2.1 Decoding and Building the Prefix Codes
There are several steps in decoding the Huffman codes.
There are several steps in decoding the prefix codes.
**Decoding the Code Lengths:**
{:#decoding-the-code-lengths}
This section describes how to read the Huffman code lengths from the bitstream.
This section describes how to read the prefix code lengths from the bitstream.
The Huffman code lengths can be coded in two ways. The method used is specified
The prefix code lengths can be coded in two ways. The method used is specified
by a 1-bit value.
* If this bit is 1, it is a _simple code length code_, and
@ -865,8 +865,8 @@ stream. This may be inefficient, but it is allowed by the format.
\[AMENDED2\]
This variant is used in the special case when only 1 or 2 Huffman symbols are
in the range \[0..255\] with code length `1`. All other Huffman code lengths
This variant is used in the special case when only 1 or 2 prefix symbols are
in the range \[0..255\] with code length `1`. All other prefix code lengths
are implicitly zeros.
The first bit indicates the number of symbols:
@ -891,17 +891,17 @@ if (num_symbols == 2) {
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
**Note:** Another special case is when _all_ Huffman code lengths are _zeros_
(an empty Huffman code). For example, a Huffman code for distance can be empty
if there are no backward references. Similarly, Huffman codes for alpha, red,
and blue can be empty if all pixels within the same meta Huffman code are
**Note:** Another special case is when _all_ prefix code lengths are _zeros_
(an empty prefix code). For example, a prefix code for distance can be empty
if there are no backward references. Similarly, prefix codes for alpha, red,
and blue can be empty if all pixels within the same meta prefix code are
produced using the color cache. However, this case doesn't need a special
handling, as empty Huffman codes can be coded as those containing a single
handling, as empty prefix codes can be coded as those containing a single
symbol `0`.
**(ii) Normal Code Length Code:**
The code lengths of the Huffman code fit in 8 bits and are read as follows.
The code lengths of the prefix code fit in 8 bits and are read as follows.
First, `num_code_lengths` specifies the number of code lengths.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@ -910,7 +910,7 @@ int num_code_lengths = 4 + ReadBits(4);
If `num_code_lengths` is > 18, the bitstream is invalid.
The code lengths are themselves encoded using Huffman codes: lower level code
The code lengths are themselves encoded using prefix codes: lower level code
lengths `code_length_code_lengths` first have to be read. The rest of those
`code_length_code_lengths` (according to the order in `kCodeLengthCodeOrder`)
are zeros.
@ -934,7 +934,7 @@ int length_nbits = 2 + 2 * ReadBits(3);
int max_symbol = 2 + ReadBits(length_nbits);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A Huffman table is then built from `code_length_code_lengths` and used to read
A prefix table is then built from `code_length_code_lengths` and used to read
up to `max_symbol` code lengths.
* Code \[0..15\] indicates literal code lengths.
@ -955,85 +955,85 @@ distance) is formed using their respective alphabet sizes:
* other literals (A,R,B): 256
* distance code: 40
#### 5.2.2 Decoding of Meta Huffman Codes
#### 5.2.2 Decoding of Meta Prefix Codes
As noted earlier, the format allows the use of different Huffman codes for
different blocks of the image. _Meta Huffman codes_ are indexes identifying
which Huffman codes to use in different parts of the image.
As noted earlier, the format allows the use of different prefix codes for
different blocks of the image. _Meta prefix codes_ are indexes identifying
which prefix codes to use in different parts of the image.
Meta Huffman codes may be used _only_ when the image is being used in the
Meta prefix codes may be used _only_ when the image is being used in the
[role](#roles-of-image-data) of an _ARGB image_.
There are two possibilities for the meta Huffman codes, indicated by a 1-bit
There are two possibilities for the meta prefix codes, indicated by a 1-bit
value:
* If this bit is zero, there is only one meta Huffman code used everywhere in
* If this bit is zero, there is only one meta prefix code used everywhere in
the image. No more data is stored.
* If this bit is one, the image uses multiple meta Huffman codes. These meta
Huffman codes are stored as an _entropy image_ (described below).
* If this bit is one, the image uses multiple meta prefix codes. These meta
prefix codes are stored as an _entropy image_ (described below).
**Entropy image:**
The entropy image defines which Huffman codes are used in different parts of the
The entropy image defines which prefix codes are used in different parts of the
image, as described below.
The first 3-bits contain the `huffman_bits` value. The dimensions of the entropy
image are derived from 'huffman_bits'.
The first 3-bits contain the `prefix_bits` value. The dimensions of the entropy
image are derived from 'prefix_bits'.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
int huffman_bits = ReadBits(3) + 2;
int huffman_xsize = DIV_ROUND_UP(xsize, 1 << huffman_bits);
int huffman_ysize = DIV_ROUND_UP(ysize, 1 << huffman_bits);
int prefix_bits = ReadBits(3) + 2;
int prefix_xsize = DIV_ROUND_UP(xsize, 1 << prefix_bits);
int prefix_ysize = DIV_ROUND_UP(ysize, 1 << prefix_bits);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
where `DIV_ROUND_UP` is as defined [earlier](#predictor-transform).
The next bits contain an entropy image of width `huffman_xsize` and height
`huffman_ysize`.
The next bits contain an entropy image of width `prefix_xsize` and height
`prefix_ysize`.
**Interpretation of Meta Huffman Codes:**
**Interpretation of Meta Prefix Codes:**
For any given pixel (x, y), there is a set of five Huffman codes associated with
For any given pixel (x, y), there is a set of five prefix codes associated with
it. These codes are (in bitstream order):
* **Huffman code #1**: used for green channel, backward-reference length and
* **prefix code #1**: used for green channel, backward-reference length and
color cache
* **Huffman code #2, #3 and #4**: used for red, blue and alpha channels
* **prefix code #2, #3 and #4**: used for red, blue and alpha channels
respectively.
* **Huffman code #5**: used for backward-reference distance.
* **prefix code #5**: used for backward-reference distance.
From here on, we refer to this set as a **Huffman code group**.
From here on, we refer to this set as a **prefix code group**.
The number of Huffman code groups in the ARGB image can be obtained by finding
the _largest meta Huffman code_ from the entropy image:
The number of prefix code groups in the ARGB image can be obtained by finding
the _largest meta prefix code_ from the entropy image:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
int num_huff_groups = max(entropy image) + 1;
int num_prefix_groups = max(entropy image) + 1;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
where `max(entropy image)` indicates the largest Huffman code stored in the
where `max(entropy image)` indicates the largest prefix code stored in the
entropy image.
As each Huffman code group contains five Huffman codes, the total number of
Huffman codes is:
As each prefix code group contains five prefix codes, the total number of
prefix codes is:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
int num_huff_codes = 5 * num_huff_groups;
int num_prefix_codes = 5 * num_prefix_groups;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Given a pixel (x, y) in the ARGB image, we can obtain the corresponding Huffman
Given a pixel (x, y) in the ARGB image, we can obtain the corresponding prefix
codes to be used as follows:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
int position = (y >> huffman_bits) * huffman_xsize + (x >> huffman_bits);
int meta_huff_code = (entropy_image[pos] >> 8) & 0xffff;
HuffmanCodeGroup huff_group = huffman_code_groups[meta_huff_code];
int position = (y >> prefix_bits) * prefix_xsize + (x >> prefix_bits);
int meta_prefix_code = (entropy_image[pos] >> 8) & 0xffff;
PrefixCodeGroup prefix_group = prefix_code_groups[meta_prefix_code];
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
where, we have assumed the existence of `HuffmanCodeGroup` structure, which
represents a set of five Huffman codes. Also, `huffman_code_groups` is an array
of `HuffmanCodeGroup` (of size `num_huff_groups`).
where, we have assumed the existence of `PrefixCodeGroup` structure, which
represents a set of five prefix codes. Also, `prefix_code_groups` is an array
of `PrefixCodeGroup` (of size `num_prefix_groups`).
The decoder then uses Huffman code group `huff_group` to decode the pixel
The decoder then uses prefix code group `prefix_group` to decode the pixel
(x, y) as explained in the [next section](#decoding-entropy-coded-image-data).
#### 5.2.3 Decoding Entropy-coded Image Data
@ -1041,10 +1041,10 @@ The decoder then uses Huffman code group `huff_group` to decode the pixel
\[AMENDED2\]
For the current position (x, y) in the image, the decoder first identifies the
corresponding Huffman code group (as explained in the last section). Given the
Huffman code group, the pixel is read and decoded as follows:
corresponding prefix code group (as explained in the last section). Given the
prefix code group, the pixel is read and decoded as follows:
Read next symbol S from the bitstream using Huffman code #1. Note that S is any
Read next symbol S from the bitstream using prefix code #1. Note that S is any
integer in the range `0` to
`(256 + 24 + ` [`color_cache_size`](#color-cache-code)` - 1)`.
@ -1052,15 +1052,15 @@ The interpretation of S depends on its value:
1. if S < 256
1. Use S as the green component.
1. Read red from the bitstream using Huffman code #2.
1. Read blue from the bitstream using Huffman code #3.
1. Read alpha from the bitstream using Huffman code #4.
1. Read red from the bitstream using prefix code #2.
1. Read blue from the bitstream using prefix code #3.
1. Read alpha from the bitstream using prefix code #4.
1. if S >= 256 && S < 256 + 24
1. Use S - 256 as a length prefix code.
1. Read extra bits for length from the bitstream.
1. Determine backward-reference length L from length prefix code and the
extra bits read.
1. Read distance prefix code from the bitstream using Huffman code #5.
1. Read distance prefix code from the bitstream using prefix code #5.
1. Read extra bits for distance from the bitstream.
1. Determine backward-reference distance D from distance prefix code and
the extra bits read.
@ -1109,23 +1109,23 @@ of pixels (xsize * ysize).
\[AMENDED2\]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<spatially-coded image> ::= <color cache info><meta huffman><data>
<spatially-coded image> ::= <color cache info><meta prefix><data>
<entropy-coded image> ::= <color cache info><data>
<color cache info> ::= 1 bit value 0 |
(1-bit value 1; 4-bit value for color cache size)
<meta huffman> ::= 1-bit value 0 |
(1-bit value 1; <entropy image>)
<data> ::= <huffman codes><lz77-coded image>
<meta prefix> ::= 1-bit value 0 |
(1-bit value 1; <entropy image>)
<data> ::= <prefix codes><lz77-coded image>
<entropy image> ::= 3-bit subsample value; <entropy-coded image>
<huffman codes> ::= <huffman code group> | <huffman code group><huffman codes>
<huffman code group> ::= <huffman code><huffman code><huffman code>
<huffman code><huffman code>
See "Interpretation of Meta Huffman codes" to
understand what each of these five Huffman codes are
for.
<huffman code> ::= <simple huffman code> | <normal huffman code>
<simple huffman code> ::= see "Simple code length code" for details
<normal huffman code> ::= <code length code>; encoded code lengths
<prefix codes> ::= <prefix code group> | <prefix code group><prefix codes>
<prefix code group> ::= <prefix code><prefix code><prefix code>
<prefix code><prefix code>
See "Interpretation of Meta Prefix Codes" to
understand what each of these five prefix codes are
for.
<prefix code> ::= <simple prefix code> | <normal prefix code>
<simple prefix code> ::= see "Simple code length code" for details
<normal prefix code> ::= <code length code>; encoded code lengths
<code length code> ::= see section "Normal code length code"
<lz77-coded image> ::= ((<argb-pixel> | <lz77-copy> | <color-cache-code>)
<lz77-coded image>) | ""
@ -1136,7 +1136,7 @@ A possible example sequence:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
<RIFF header><image size>1-bit value 1<subtract-green-tx>
1-bit value 1<predictor-tx>1-bit value 0<color cache info>1-bit value 0
<huffman codes><lz77-coded image>
<prefix codes><lz77-coded image>
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[canonical_huff]: https://en.wikipedia.org/wiki/Canonical_Huffman_code