webp-lossless-bitstream-spec: mv Nomenclature after Intro

Change-Id: I3337513e48a8e604b154d91993bd7ff84d6c55ad
James Zern 2022-08-02 18:50:59 -07:00
parent 79be856e6e
commit 337cf69f58

@ -37,8 +37,52 @@ using today's PNG format.
{:toc}
Nomenclature
------------
1 Introduction
--------------
This document describes the compressed data representation of a WebP
lossless image. It is intended as a detailed reference for WebP lossless
encoder and decoder implementation.
In this document, we extensively use C programming language syntax to
describe the bitstream, and assume the existence of a function for
reading bits, `ReadBits(n)`. The bytes are read in the natural order of
the stream containing them, and bits of each byte are read in
least-significant-bit-first order. When multiple bits are read at the
same time, the integer is constructed from the original data in the
original order. The most significant bits of the returned integer are
also the most significant bits of the original data. Thus the statement
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
b = ReadBits(2);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
is equivalent with the two statements below:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
b = ReadBits(1);
b |= ReadBits(1) << 1;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
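As an illustration of the bit order described above, here is a minimal sketch of a reader in C. The `BitReader` struct and the extra reader argument are assumptions made for this example; the specification only assumes that some `ReadBits(n)` exists and does not define its state. Returning zero bits past the end of the buffer is also a choice made for this sketch, not something the text mandates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader state (not defined by the specification). */
typedef struct {
  const uint8_t *data; /* input bytes, in stream order */
  size_t size;         /* number of bytes in data */
  size_t bit_pos;      /* index of the next unread bit in the stream */
} BitReader;

/* Reads n bits (0 <= n <= 32). Within each byte, the least-significant
 * bit is consumed first; the i-th bit read becomes bit i of the result.
 * Bits past the end of the buffer read as 0 (an assumption of this sketch). */
static uint32_t ReadBits(BitReader *br, int n) {
  uint32_t value = 0;
  assert(n >= 0 && n <= 32);
  for (int i = 0; i < n; ++i) {
    size_t byte_index = br->bit_pos >> 3;
    uint32_t bit = 0;
    if (byte_index < br->size) {
      bit = (br->data[byte_index] >> (br->bit_pos & 7)) & 1u;
    }
    value |= bit << i;
    ++br->bit_pos;
  }
  return value;
}
```

With this sketch, reading two bits at once and reading one bit twice (shifting the second by one) produce the same integer, which matches the equivalence stated above.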
We assume that each color component (e.g. alpha, red, blue and green) is
represented using an 8-bit byte. We define the corresponding type as
uint8. A whole ARGB pixel is represented by a type called uint32, an
unsigned integer consisting of 32 bits. In the code showing the behavior
of the transformations, alpha value is codified in bits 31..24, red in
bits 23..16, green in bits 15..8 and blue in bits 7..0, but
implementations of the format are free to use another representation
internally.
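The pixel layout above can be sketched with a few helpers. The `uint8` and `uint32` typedefs mirror the type names used in the text; the function names `PackARGB`, `Alpha`, `Red`, `Green`, and `Blue` are hypothetical and chosen only for this example.

```c
#include <stdint.h>

typedef uint8_t uint8;   /* one color component */
typedef uint32_t uint32; /* one whole ARGB pixel */

/* Packs components as alpha in bits 31..24, red in bits 23..16,
 * green in bits 15..8, and blue in bits 7..0. */
static uint32 PackARGB(uint8 a, uint8 r, uint8 g, uint8 b) {
  return ((uint32)a << 24) | ((uint32)r << 16) | ((uint32)g << 8) | (uint32)b;
}

/* Extracts the individual components from a packed pixel. */
static uint8 Alpha(uint32 argb) { return (uint8)(argb >> 24); }
static uint8 Red(uint32 argb) { return (uint8)(argb >> 16); }
static uint8 Green(uint32 argb) { return (uint8)(argb >> 8); }
static uint8 Blue(uint32 argb) { return (uint8)argb; }
```

For example, `PackARGB(0xFF, 0x12, 0x34, 0x56)` yields `0xFF123456`. An implementation may store pixels differently internally, as the text notes.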
Broadly, a WebP lossless image contains header data, transform
information and actual image data. Headers contain width and height of
the image. A WebP lossless image can go through four different types of
transformation before being entropy encoded. The transform information
in the bitstream contains the data required to apply the respective
inverse transforms.
2 Nomenclature
--------------
ARGB
: A pixel value consisting of alpha, red, green, and blue values.
@ -95,51 +139,7 @@ scan-line order
completed, continue from the left-hand column of the next row.
1 Introduction
--------------
This document describes the compressed data representation of a WebP
lossless image. It is intended as a detailed reference for WebP lossless
encoder and decoder implementation.
In this document, we extensively use C programming language syntax to
describe the bitstream, and assume the existence of a function for
reading bits, `ReadBits(n)`. The bytes are read in the natural order of
the stream containing them, and bits of each byte are read in
least-significant-bit-first order. When multiple bits are read at the
same time, the integer is constructed from the original data in the
original order. The most significant bits of the returned integer are
also the most significant bits of the original data. Thus the statement
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
b = ReadBits(2);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
is equivalent with the two statements below:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
b = ReadBits(1);
b |= ReadBits(1) << 1;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We assume that each color component (e.g. alpha, red, blue and green) is
represented using an 8-bit byte. We define the corresponding type as
uint8. A whole ARGB pixel is represented by a type called uint32, an
unsigned integer consisting of 32 bits. In the code showing the behavior
of the transformations, alpha value is codified in bits 31..24, red in
bits 23..16, green in bits 15..8 and blue in bits 7..0, but
implementations of the format are free to use another representation
internally.
Broadly, a WebP lossless image contains header data, transform
information and actual image data. Headers contain width and height of
the image. A WebP lossless image can go through four different types of
transformation before being entropy encoded. The transform information
in the bitstream contains the data required to apply the respective
inverse transforms.
2 RIFF Header
3 RIFF Header
-------------
The beginning of the header has the RIFF container. This consists of the
@ -183,7 +183,7 @@ int version_number = ReadBits(3);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
3 Transformations
4 Transformations
-----------------
Transformations are reversible manipulations of the image data that can
@ -255,7 +255,7 @@ The transform data contains the prediction mode for each block of the
image. All the `block_width * block_height` pixels of a block use the same
prediction mode. The prediction modes are treated as pixels of an image
and encoded using the same techniques described in
[Chapter 4](#image-data).
[Chapter 5](#image-data).
For a pixel _x, y_, one can compute the respective filter block address
by:
@ -460,7 +460,7 @@ The remaining part of the color transform data contains
`ColorTransformElement` instances corresponding to each block of the
image. `ColorTransformElement` instances are treated as pixels of an
image and encoded using the methods described in
[Chapter 4](#image-data).
[Chapter 5](#image-data).
During decoding, `ColorTransformElement` instances of the blocks are
decoded and the inverse color transform is applied on the ARGB values of
@ -606,12 +606,12 @@ The values are packed into the green component as follows:
the more significant bits of the green value at x / 8.
4 Image Data
5 Image Data
------------
Image data is an array of pixel values in scan-line order.
### 4.1 Roles of Image Data
### 5.1 Roles of Image Data
We use image data in five different roles:
@ -634,7 +634,7 @@ We use image data in five different roles:
[Color Indexing Transform](#color-indexing-transform). This is stored as an
image of width `color_table_size` and height `1`.
### 4.2 Encoding of Image Data
### 5.2 Encoding of Image Data
The encoding of image data is independent of its role.
@ -660,12 +660,12 @@ Each pixel is encoded using one of the three possible methods:
The following sub-sections describe each of these in detail.
#### 4.2.1 Prefix Coded Literals
#### 5.2.1 Prefix Coded Literals
The pixel is stored as prefix coded values of green, red, blue and alpha (in
that order). See [this section](#decoding-entropy-coded-image-data) for details.
#### 4.2.2 LZ77 Backward Reference
#### 5.2.2 LZ77 Backward Reference
Backward references are tuples of _length_ and _distance code_:
@ -781,14 +781,14 @@ where `distance_map` is the mapping noted above and `xsize` is the width of the
image in pixels.
#### 4.2.3 Color Cache Coding
#### 5.2.3 Color Cache Coding
{:#color-cache-code}
Color cache stores a set of colors that have been recently used in the image.
**Rationale:** This way, the recently used colors can sometimes be referred to
more efficiently than emitting them using the other two methods (described in
[4.2.1](#prefix-coded-literals) and [4.2.2](#lz77-backward-reference)).
[5.2.1](#prefix-coded-literals) and [5.2.2](#lz77-backward-reference)).
Color cache codes are stored as follows. First, there is a 1-bit value that
indicates if the color cache is used. If this bit is 0, no color cache codes
@ -818,10 +818,10 @@ by inserting every pixel, be it produced by backward referencing or as
literals, into the cache in the order they appear in the stream.
5 Entropy Code
6 Entropy Code
--------------
### 5.1 Overview
### 6.1 Overview
Most of the data is coded using a [canonical prefix code][canonical_huff].
Hence, the codes are transmitted by sending the _prefix code lengths_, as
@ -835,7 +835,7 @@ codes.
So, allowing them to use different entropy codes provides more flexibility and
potentially better compression.
### 5.2 Details
### 6.2 Details
The encoded image data consists of several parts:
@ -843,7 +843,7 @@ The encoded image data consists of several parts:
1. Meta prefix codes
1. Entropy-coded image data
#### 5.2.1 Decoding and Building the Prefix Codes
#### 6.2.1 Decoding and Building the Prefix Codes
There are several steps in decoding the prefix codes.
@ -955,7 +955,7 @@ distance) is formed using their respective alphabet sizes:
* other literals (A,R,B): 256
* distance code: 40
#### 5.2.2 Decoding of Meta Prefix Codes
#### 6.2.2 Decoding of Meta Prefix Codes
As noted earlier, the format allows the use of different prefix codes for
different blocks of the image. _Meta prefix codes_ are indexes identifying
@ -1036,7 +1036,7 @@ of `PrefixCodeGroup` (of size `num_prefix_groups`).
The decoder then uses prefix code group `prefix_group` to decode the pixel
(x, y) as explained in the [next section](#decoding-entropy-coded-image-data).
#### 5.2.3 Decoding Entropy-coded Image Data
#### 6.2.3 Decoding Entropy-coded Image Data
\[AMENDED2\]
@ -1071,7 +1071,7 @@ The interpretation of S depends on its value:
1. Get ARGB color from the color cache at that index.
6 Overall Structure of the Format
7 Overall Structure of the Format
---------------------------------
Below is a view into the format in Backus-Naur form. It does not cover