Merge changes I96bc063c,I45880467,If9e18e5a,I6ee938e4,I0a410b28, ... into main

* changes:
  webp-container-spec: add prose for rendering process
  webp-container-spec: note reserved fields MUST be ignored
  webp-lossless-bitstream-spec: improve 'small' color table stmt
  webp-container-spec: remove redundant sentence
  doc/webp-*: fix some punctuation, grammar
  webp-container-spec: clarify background color note
  webp-container-spec: come too late -> out of order
  webp-container-spec: prefer hex literals
  webp-container-spec: change SHOULD to MUST w/ANIM chunk
  webp-container-spec: add unknown fields MUST be ignored
  webp-container-spec: make padding byte=0 a MUST
  webp-container-spec: update note on trailing data
  webp-container-spec: clarify Chunk Size is in bytes
James Zern 2022-10-14 04:05:31 +00:00 committed by Gerrit Code Review
commit dc05b4db2a
2 changed files with 68 additions and 47 deletions

@ -25,7 +25,7 @@ to compress image data in a lossy way, or (ii) the WebP lossless encoding
(and possibly other encodings in the future). These encoding schemes should (and possibly other encodings in the future). These encoding schemes should
make it more efficient than currently used formats. It is optimized for fast make it more efficient than currently used formats. It is optimized for fast
image transfer over the network (e.g., for websites). The WebP format has image transfer over the network (e.g., for websites). The WebP format has
feature parity (color profile, metadata, animation etc) with other formats as feature parity (color profile, metadata, animation, etc.) with other formats as
well. This document describes the structure of a WebP file. well. This document describes the structure of a WebP file.
The WebP container (i.e., RIFF container for WebP) allows feature support over The WebP container (i.e., RIFF container for WebP) allows feature support over
@ -84,8 +84,8 @@ _uint32_
_FourCC_ _FourCC_
: A _FourCC_ (four-character code) is a _uint32_ created by concatenating four : A _FourCC_ (four-character code) is a _uint32_ created by concatenating four
ASCII characters in little-endian order. This means 'aaaa' (%x61.61.61.61) and ASCII characters in little-endian order. This means 'aaaa' (0x61616161) and
'AAAA' (%x41.41.41.41) are treated as different _FourCCs_. 'AAAA' (0x41414141) are treated as different _FourCCs_.
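As an illustration of the FourCC definition in this hunk, the following is a minimal sketch of packing four ASCII characters into a _uint32_ with the first character in the least significant byte. The function names `fourcc` and `fourcc_to_str` are hypothetical and do not come from the specification or from libwebp.

```python
def fourcc(code: str) -> int:
    """Pack four ASCII characters into a uint32, first character in the lowest byte."""
    raw = code.encode("ascii")
    if len(raw) != 4:
        raise ValueError("a FourCC has exactly four ASCII characters")
    return int.from_bytes(raw, "little")


def fourcc_to_str(value: int) -> str:
    """Inverse of fourcc(): recover the four characters from the uint32."""
    return value.to_bytes(4, "little").decode("ascii")
```

Because the comparison is made on the packed integer, 'aaaa' and 'AAAA' yield different values and are therefore different FourCCs.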
_1-based_ _1-based_
@ -123,14 +123,13 @@ Chunk FourCC: 32 bits
Chunk Size: 32 bits (_uint32_) Chunk Size: 32 bits (_uint32_)
: The size of the chunk not including this field, the chunk identifier or : The size of the chunk in bytes, not including this field, the chunk
padding. identifier or padding.
Chunk Payload: _Chunk Size_ bytes Chunk Payload: _Chunk Size_ bytes
: The data payload. If _Chunk Size_ is odd, a single padding byte -- that : The data payload. If _Chunk Size_ is odd, a single padding byte -- that MUST
SHOULD be `0` to conform with RIFF -- is added. Applications MAY use be `0` to conform with RIFF -- is added.
another value, but readers may fail to parse the file.
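The chunk layout described in this hunk can be sketched as a small writer. This is an illustration only: `write_chunk` is a hypothetical helper, and it assumes _Chunk Size_ is stored as a little-endian _uint32_, as is usual for RIFF.

```python
import struct


def write_chunk(fourcc: bytes, payload: bytes) -> bytes:
    """Serialize one chunk: FourCC, Chunk Size in bytes, payload, optional padding."""
    if len(fourcc) != 4:
        raise ValueError("chunk FourCC must be exactly four bytes")
    # Chunk Size counts only the payload, not the FourCC, the size field or padding.
    header = fourcc + struct.pack("<I", len(payload))
    # An odd-sized payload is followed by a single padding byte, which is 0.
    padding = b"\x00" if len(payload) % 2 else b""
    return header + payload + padding
```

With this layout every serialized chunk has an even length, whether or not its payload does.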
**Note:** RIFF has a convention that all-uppercase chunk FourCCs are standard **Note:** RIFF has a convention that all-uppercase chunk FourCCs are standard
chunks that apply to any RIFF file format, while FourCCs specific to a file chunks that apply to any RIFF file format, while FourCCs specific to a file
@ -166,10 +165,11 @@ File Size: 32 bits (_uint32_)
A WebP file MUST begin with a RIFF header with the FourCC 'WEBP'. The file size A WebP file MUST begin with a RIFF header with the FourCC 'WEBP'. The file size
in the header is the total size of the chunks that follow plus `4` bytes for in the header is the total size of the chunks that follow plus `4` bytes for
the 'WEBP' FourCC. The file SHOULD NOT contain anything after it. Readers MAY the 'WEBP' FourCC. The file SHOULD NOT contain any data after the data
parse such files, ignoring the trailing data. As the size of any chunk is even, specified by _File Size_. Readers MAY parse such files, ignoring the trailing
the size given by the RIFF header is also even. The contents of individual data. As the size of any chunk is even, the size given by the RIFF header is
chunks will be described in the following sections. also even. The contents of individual chunks will be described in the following
sections.
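A reader that follows the paragraph above might walk the container roughly as follows. This is a sketch under stated assumptions, not libwebp's parser: `iter_chunks` is a hypothetical name, sizes are assumed to be little-endian _uint32_ values, and the reader takes the permitted option of ignoring data beyond _File Size_.

```python
import struct


def iter_chunks(data: bytes):
    """Yield (fourcc, payload) for each chunk inside the RIFF 'WEBP' container."""
    if len(data) < 12 or data[0:4] != b"RIFF" or data[8:12] != b"WEBP":
        raise ValueError("not a WebP RIFF container")
    (file_size,) = struct.unpack_from("<I", data, 4)
    # File Size covers the 'WEBP' FourCC plus all chunks; anything past it is
    # trailing data, which this reader ignores.
    end = min(8 + file_size, len(data))
    pos = 12
    while pos + 8 <= end:
        fourcc = data[pos:pos + 4]
        (size,) = struct.unpack_from("<I", data, pos + 4)
        start = pos + 8
        if start + size > end:
            raise ValueError("chunk overruns the RIFF payload")
        yield fourcc, data[start:start + size]
        # Skip the padding byte that follows an odd-sized payload.
        pos = start + size + (size & 1)
```

Usage: `list(iter_chunks(open(path, "rb").read()))` returns the chunks in file order, where `path` stands for any WebP file.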
Simple File Format (Lossy) Simple File Format (Lossy)
@ -206,7 +206,7 @@ VP8 data: _Chunk Size_ bytes
: VP8 bitstream data. : VP8 bitstream data.
Note the fourth character in the 'VP8 ' FourCC is an ASCII space (%x20). Note the fourth character in the 'VP8 ' FourCC is an ASCII space (0x20).
The VP8 bitstream format specification can be found at [VP8 Data Format and The VP8 bitstream format specification can be found at [VP8 Data Format and
Decoding Guide][vp8spec]. Note that the VP8 frame header contains the VP8 frame Decoding Guide][vp8spec]. Note that the VP8 frame header contains the VP8 frame
@ -292,7 +292,7 @@ details about frames can be found in the [Animation](#animation) section.
All chunks SHOULD be placed in the same order as listed above. If a chunk All chunks SHOULD be placed in the same order as listed above. If a chunk
appears in the wrong place, the file is invalid, but readers MAY parse the appears in the wrong place, the file is invalid, but readers MAY parse the
file, ignoring the chunks that come too late. file, ignoring the chunks that are out of order.
**Rationale:** Setting the order of chunks should allow quicker file **Rationale:** Setting the order of chunks should allow quicker file
parsing. For example, if an 'ALPH' chunk does not appear in its required parsing. For example, if an 'ALPH' chunk does not appear in its required
@ -322,7 +322,7 @@ Extended WebP file header:
Reserved (Rsv): 2 bits Reserved (Rsv): 2 bits
: MUST be `0`. : MUST be `0`. Readers MUST ignore this field.
ICC profile (I): 1 bit ICC profile (I): 1 bit
@ -348,11 +348,11 @@ Animation (A): 1 bit
Reserved (R): 1 bit Reserved (R): 1 bit
: MUST be `0`. : MUST be `0`. Readers MUST ignore this field.
Reserved: 24 bits Reserved: 24 bits
: MUST be `0`. : MUST be `0`. Readers MUST ignore this field.
Canvas Width Minus One: 24 bits Canvas Width Minus One: 24 bits
@ -366,7 +366,7 @@ Canvas Height Minus One: 24 bits
The product of _Canvas Width_ and _Canvas Height_ MUST be at most `2^32 - 1`. The product of _Canvas Width_ and _Canvas Height_ MUST be at most `2^32 - 1`.
Future specifications MAY add more fields. Future specifications may add more fields. Unknown fields MUST be ignored.
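The "ignore reserved and unknown fields" rule in this hunk can be illustrated with a sketch that extracts the canvas size from a 'VP8X' payload. The function name `parse_vp8x_canvas` is hypothetical, and the byte offsets assume the field order shown in the hunks above (8 flag bits, 24 reserved bits, then two 24-bit values) with little-endian storage.

```python
def parse_vp8x_canvas(payload: bytes) -> tuple[int, int]:
    """Return (canvas_width, canvas_height) from a 'VP8X' chunk payload."""
    if len(payload) < 10:
        raise ValueError("VP8X payload is shorter than 10 bytes")
    # Byte 0 holds the flag bits and bytes 1..3 are reserved. Reserved bits are
    # deliberately not checked, and any bytes past offset 10 are ignored.
    width = int.from_bytes(payload[4:7], "little") + 1
    height = int.from_bytes(payload[7:10], "little") + 1
    if width * height > 2**32 - 1:
        raise ValueError("canvas area exceeds 2^32 - 1")
    return width, height
```

Note that the reader never rejects a file because a reserved bit is set; only the canvas area limit is enforced.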
### Chunks ### Chunks
@ -400,8 +400,8 @@ Background Color: 32 bits (_uint32_)
**Note**: **Note**:
* Background color MAY contain a transparency value (alpha), even if the * Background color MAY contain a non-opaque alpha value, even if the _Alpha_
_Alpha_ flag in [VP8X chunk](#extended_header) is unset. flag in [VP8X chunk](#extended_header) is unset.
* Viewer applications SHOULD treat the background color value as a hint, and * Viewer applications SHOULD treat the background color value as a hint, and
are not required to use it. are not required to use it.
@ -414,8 +414,8 @@ Loop Count: 16 bits (_uint16_)
: The number of times to loop the animation. `0` means infinitely. : The number of times to loop the animation. `0` means infinitely.
This chunk MUST appear if the _Animation_ flag in the VP8X chunk is set. This chunk MUST appear if the _Animation_ flag in the VP8X chunk is set.
If the _Animation_ flag is not set and this chunk is present, it If the _Animation_ flag is not set and this chunk is present, it MUST be
SHOULD be ignored. ignored.
ANMF chunk: ANMF chunk:
@ -466,7 +466,7 @@ Frame Duration: 24 bits (_uint24_)
Reserved: 6 bits Reserved: 6 bits
: MUST be 0. : MUST be `0`. Readers MUST ignore this field.
Blending method (B): 1 bit Blending method (B): 1 bit
@ -547,7 +547,7 @@ _padded_ chunks as described by the [RIFF file format](#riff-file-format).
Reserved (Rsv): 2 bits Reserved (Rsv): 2 bits
: MUST be `0`. : MUST be `0`. Readers MUST ignore this field.
Pre-processing (P): 2 bits Pre-processing (P): 2 bits
@ -686,8 +686,7 @@ If this chunk is not present, sRGB SHOULD be assumed.
Metadata can be stored in 'EXIF' or 'XMP ' chunks. Metadata can be stored in 'EXIF' or 'XMP ' chunks.
There SHOULD be at most one chunk of each type ('EXIF' and 'XMP '). If there There SHOULD be at most one chunk of each type ('EXIF' and 'XMP '). If there
are more such chunks, readers MAY ignore all except the first one. Also, a file are more such chunks, readers MAY ignore all except the first one.
may possibly contain both 'EXIF' and 'XMP ' chunks.
The chunks are defined as follows: The chunks are defined as follows:
@ -721,7 +720,7 @@ XMP Metadata: _Chunk Size_ bytes
: image metadata in XMP format. : image metadata in XMP format.
Note the fourth character in the 'XMP ' FourCC is an ASCII space (%x20). Note the fourth character in the 'XMP ' FourCC is an ASCII space (0x20).
Additional guidance about handling metadata can be found in the Additional guidance about handling metadata can be found in the
Metadata Working Group's [Guidelines for Handling Metadata][metadata]. Metadata Working Group's [Guidelines for Handling Metadata][metadata].
@ -747,19 +746,40 @@ original order (unless they specifically intend to modify these chunks).
### Assembling the Canvas from frames ### Assembling the Canvas from frames
Here we provide an overview of how a reader should assemble a canvas in the Here we provide an overview of how a reader MUST assemble a canvas in the case
case of an animated image. The notation _VP8X.field_ means the field in the of an animated image.
'VP8X' chunk with the same description.
Displaying an _animated image_ canvas MUST be equivalent to the following The process begins with creating a canvas using the dimensions given in the
pseudocode: 'VP8X' chunk, `Canvas Width Minus One + 1` pixels wide by `Canvas Height Minus
One + 1` pixels high. The `Loop Count` field from the 'ANIM' chunk controls how
many times the animation process is repeated. This is `Loop Count - 1` for
non-zero `Loop Count` values or infinitely if `Loop Count` is zero.
At the beginning of each loop iteration the canvas is filled using the
background color from the 'ANIM' chunk or an application defined color.
'ANMF' chunks contain individual frames given in display order. Before rendering
each frame, the previous frame's `Disposal method` is applied.
The rendering of the decoded frame begins at the Cartesian coordinates (`2 *
Frame X`, `2 * Frame Y`) using the top-left corner of the canvas as the origin.
`Frame Width Minus One + 1` pixels wide by `Frame Height Minus One + 1` pixels
high are rendered onto the canvas using the `Blending method`.
The canvas is displayed for `Frame Duration` milliseconds. This continues until
all frames given by 'ANMF' chunks have been displayed. A new loop iteration is
then begun or the canvas is left in its final state if all iterations have been
completed.
The following pseudocode illustrates the rendering process. The notation
_VP8X.field_ means the field in the 'VP8X' chunk with the same description.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
assert VP8X.flags.hasAnimation assert VP8X.flags.hasAnimation
canvas ← new image of size VP8X.canvasWidth x VP8X.canvasHeight with canvas ← new image of size VP8X.canvasWidth x VP8X.canvasHeight with
background color ANIM.background_color. background color ANIM.background_color.
loop_count ← ANIM.loopCount loop_count ← ANIM.loopCount
dispose_method ← ANIM.disposeMethod dispose_method ← Dispose to background color
if loop_count == 0: if loop_count == 0:
loop_count = ∞ loop_count = ∞
frame_params ← nil frame_params ← nil
@ -785,10 +805,12 @@ for loop = 0..loop_count - 1
frame_params.bitstream = bitstream_data frame_params.bitstream = bitstream_data
render frame with frame_params.alpha and frame_params.bitstream render frame with frame_params.alpha and frame_params.bitstream
on canvas with top-left corner at (frame_params.frameX, on canvas with top-left corner at (frame_params.frameX,
frame_params.frameY), using dispose method dispose_method. frame_params.frameY), using blending method
frame_params.blendingMethod.
canvas contains the decoded image. canvas contains the decoded image.
Show the contents of the canvas for Show the contents of the canvas for
frame_params.frameDuration * 1ms. frame_params.frameDuration * 1ms.
dispose_method = frame_params.disposeMethod
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
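For readers who want something executable, the following is a heavily simplified sketch of one loop iteration of the rendering process described in the new prose. Everything here is an assumption made for illustration: `render_loop` and the frame dictionary keys are hypothetical, pixels are plain integers, frames are assumed to be already decoded, blending is reduced to overwriting, and disposal to background is assumed to clear only the rectangle covered by the previous frame.

```python
def render_loop(canvas_w, canvas_h, background, frames):
    """Run one loop iteration and return a snapshot of the canvas per frame.

    Each frame is a dict with hypothetical keys: 'frame_x' and 'frame_y'
    (the stored Frame X / Frame Y values), 'pixels' (rows of decoded pixel
    values) and 'dispose_to_background' (bool).
    """
    # Each loop iteration starts from a canvas filled with the background color.
    canvas = [[background] * canvas_w for _ in range(canvas_h)]
    shown = []
    previous = None  # (x, y, width, height) of a frame awaiting disposal
    for frame in frames:
        # Apply the previous frame's disposal method before rendering.
        if previous is not None:
            px, py, pw, ph = previous
            for row in range(py, py + ph):
                for col in range(px, px + pw):
                    canvas[row][col] = background
        # The frame origin is (2 * Frame X, 2 * Frame Y) from the top-left corner.
        x, y = 2 * frame["frame_x"], 2 * frame["frame_y"]
        pixels = frame["pixels"]
        for dy, line in enumerate(pixels):
            for dx, value in enumerate(line):
                canvas[y + dy][x + dx] = value  # overwrite, no alpha blending
        shown.append([line[:] for line in canvas])
        rect = (x, y, len(pixels[0]), len(pixels))
        previous = rect if frame["dispose_to_background"] else None
    return shown
```

A real viewer would additionally hold each snapshot for `Frame Duration` milliseconds and repeat the iteration according to `Loop Count`.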

@ -51,7 +51,7 @@ the stream containing them, and bits of each byte are read in
least-significant-bit-first order. When multiple bits are read at the least-significant-bit-first order. When multiple bits are read at the
same time, the integer is constructed from the original data in the same time, the integer is constructed from the original data in the
original order. The most significant bits of the returned integer are original order. The most significant bits of the returned integer are
also the most significant bits of the original data. Thus the statement also the most significant bits of the original data. Thus, the statement
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
b = ReadBits(2); b = ReadBits(2);
@ -196,7 +196,7 @@ once. The transformations are used only for the main level ARGB image:
the subresolution images have no transforms, not even the 0 bit the subresolution images have no transforms, not even the 0 bit
indicating the end-of-transforms. indicating the end-of-transforms.
Typically an encoder would use these transforms to reduce the Shannon Typically, an encoder would use these transforms to reduce the Shannon
entropy in the residual image. Also, the transform data can be decided entropy in the residual image. Also, the transform data can be decided
based on entropy minimization. based on entropy minimization.
@ -566,9 +566,9 @@ pixels are bundled into a single pixel. The pixel bundling packs several
respectively. Pixel bundling allows for a more efficient joint respectively. Pixel bundling allows for a more efficient joint
distribution entropy coding of neighboring pixels, and gives some distribution entropy coding of neighboring pixels, and gives some
arithmetic coding-like benefits to the entropy code, but it can only be arithmetic coding-like benefits to the entropy code, but it can only be
used when there are a small number of unique values. used when there are 16 or fewer unique values.
`color_table_size` specifies how many pixels are combined together: `color_table_size` specifies how many pixels are combined:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
int width_bits; int width_bits;
@ -583,13 +583,12 @@ if (color_table_size <= 2) {
} }
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
`width_bits` has a value of 0, 1, 2 or 3. A value of 0 indicates no `width_bits` has a value of 0, 1, 2 or 3. A value of 0 indicates no pixel
pixel bundling to be done for the image. A value of 1 indicates that two bundling to be done for the image. A value of 1 indicates that two pixels are
pixels are combined together, and each pixel has a range of \[0..15\]. A combined, and each pixel has a range of \[0..15\]. A value of 2 indicates that
value of 2 indicates that four pixels are combined together, and each four pixels are combined, and each pixel has a range of \[0..3\]. A value of 3
pixel has a range of \[0..3\]. A value of 3 indicates that eight pixels indicates that eight pixels are combined and each pixel has a range of \[0..1\],
are combined together and each pixel has a range of \[0..1\], i.e., a i.e., a binary value.
binary value.
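The relationship between `color_table_size`, `width_bits` and the per-pixel value range stated above can be sketched as follows. The thresholds are inferred from the ranges given in the prose (a range of \[0..15\] implies at most 16 colors, and so on), and both function names are hypothetical.

```python
def bundling_width_bits(color_table_size: int) -> int:
    """Pick width_bits so that every color index fits in its share of 8 bits."""
    if color_table_size <= 2:
        return 3
    if color_table_size <= 4:
        return 2
    if color_table_size <= 16:
        return 1
    return 0


def bundle_layout(color_table_size: int) -> tuple[int, int]:
    """Return (pixels combined per bundle, bits available to each pixel)."""
    width_bits = bundling_width_bits(color_table_size)
    pixels_per_bundle = 1 << width_bits
    bits_per_pixel = 8 >> width_bits
    return pixels_per_bundle, bits_per_pixel
```

For example, a 16-entry color table gives two pixels per bundle with 4 bits each, which matches the \[0..15\] range; more than 16 entries disables bundling.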
The values are packed into the green component as follows: The values are packed into the green component as follows:
@ -659,7 +658,7 @@ Each pixel is encoded using one of the three possible methods:
3. Color cache code: using a short multiplicative hash code (color cache 3. Color cache code: using a short multiplicative hash code (color cache
index) of a recently seen color. index) of a recently seen color.
The following sub-sections describe each of these in detail. The following subsections describe each of these in detail.
#### 5.2.1 Prefix Coded Literals #### 5.2.1 Prefix Coded Literals
@ -725,7 +724,7 @@ return offset + ReadBits(extra_bits) + 1;
{:#distance-mapping} {:#distance-mapping}
As noted previously, distance code is a number indicating the position of a As noted previously, distance code is a number indicating the position of a
previously seen pixel, from which the pixels are to be copied. This sub-section previously seen pixel, from which the pixels are to be copied. This subsection
defines the mapping between a distance code and the position of a previous defines the mapping between a distance code and the position of a previous
pixel. pixel.