Merge changes I96bc063c,I45880467,If9e18e5a,I6ee938e4,I0a410b28, ... into main

* changes:
  webp-container-spec: add prose for rendering process
  webp-container-spec: note reserved fields MUST be ignored
  webp-lossless-bitstream-spec: improve 'small' color table stmt
  webp-container-spec: remove redundant sentence
  doc/webp-*: fix some punctuation, grammar
  webp-container-spec: clarify background color note
  webp-container-spec: come too late -> out of order
  webp-container-spec: prefer hex literals
  webp-container-spec: change SHOULD to MUST w/ANIM chunk
  webp-container-spec: add unknown fields MUST be ignored
  webp-container-spec: make padding byte=0 a MUST
  webp-container-spec: update note on trailing data
  webp-container-spec: clarify Chunk Size is in bytes
James Zern 2022-10-14 04:05:31 +00:00 committed by Gerrit Code Review
commit dc05b4db2a
2 changed files with 68 additions and 47 deletions


@ -25,7 +25,7 @@ to compress image data in a lossy way, or (ii) the WebP lossless encoding
(and possibly other encodings in the future). These encoding schemes should
make it more efficient than currently used formats. It is optimized for fast
image transfer over the network (e.g., for websites). The WebP format has
feature parity (color profile, metadata, animation etc) with other formats as
feature parity (color profile, metadata, animation, etc.) with other formats as
well. This document describes the structure of a WebP file.
The WebP container (i.e., RIFF container for WebP) allows feature support over
@ -84,8 +84,8 @@ _uint32_
_FourCC_
: A _FourCC_ (four-character code) is a _uint32_ created by concatenating four
ASCII characters in little-endian order. This means 'aaaa' (%x61.61.61.61) and
'AAAA' (%x41.41.41.41) are treated as different _FourCCs_.
ASCII characters in little-endian order. This means 'aaaa' (0x61616161) and
'AAAA' (0x41414141) are treated as different _FourCCs_.
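
As a rough illustration of the _FourCC_ definition above, the following Python sketch packs four ASCII characters into a _uint32_ in little-endian order. The helper name `fourcc` is hypothetical and is not defined by the specification.

```python
def fourcc(code: str) -> int:
    """Pack a four-character ASCII code into a uint32, little-endian.

    Hypothetical helper for illustration; not defined by the spec.
    """
    raw = code.encode("ascii")
    if len(raw) != 4:
        raise ValueError("a FourCC needs exactly four ASCII characters")
    # The first character ends up in the least significant byte.
    return int.from_bytes(raw, "little")


# Case matters: 'aaaa' and 'AAAA' map to different values.
assert fourcc("aaaa") == 0x61616161
assert fourcc("AAAA") == 0x41414141
# 'WEBP' read little-endian: 'W' (0x57) is the low byte.
assert fourcc("WEBP") == 0x50424557
```

Because the comparison is on the packed integer, a reader built this way distinguishes lowercase and uppercase codes without extra checks.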
_1-based_
@ -123,14 +123,13 @@ Chunk FourCC: 32 bits
Chunk Size: 32 bits (_uint32_)
: The size of the chunk not including this field, the chunk identifier or
padding.
: The size of the chunk in bytes, not including this field, the chunk
identifier or padding.
Chunk Payload: _Chunk Size_ bytes
: The data payload. If _Chunk Size_ is odd, a single padding byte -- that
SHOULD be `0` to conform with RIFF -- is added. Applications MAY use
another value, but readers may fail to parse the file.
: The data payload. If _Chunk Size_ is odd, a single padding byte -- that MUST
be `0` to conform with RIFF -- is added.
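
The size and padding rules above can be sketched as a small chunk reader. This is a minimal illustration that assumes the whole chunk is already in memory. The function name `read_chunk` and its return shape are hypothetical.

```python
import struct


def read_chunk(data: bytes, offset: int = 0):
    """Return (fourcc, payload, next_offset) for the chunk at `offset`.

    Hypothetical helper; assumes `data` holds a complete chunk.
    """
    # 8-byte chunk header: 4-byte FourCC, then a little-endian uint32 size.
    tag, size = struct.unpack_from("<4sI", data, offset)
    start = offset + 8
    payload = data[start:start + size]
    if len(payload) != size:
        raise ValueError("truncated chunk payload")
    # Chunk Size excludes the header and the padding byte, so an odd
    # size means one extra byte follows the payload.
    next_offset = start + size + (size & 1)
    return tag, payload, next_offset


# A 3-byte payload is followed by one padding byte with value 0.
example = b"EXIF" + struct.pack("<I", 3) + b"abc" + b"\x00"
assert read_chunk(example) == (b"EXIF", b"abc", 12)
```

The sketch skips over the padding byte without inspecting it. Whether a reader should reject a non-zero padding byte is a policy choice this example does not make.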
**Note:** RIFF has a convention that all-uppercase chunk FourCCs are standard
chunks that apply to any RIFF file format, while FourCCs specific to a file
@ -166,10 +165,11 @@ File Size: 32 bits (_uint32_)
A WebP file MUST begin with a RIFF header with the FourCC 'WEBP'. The file size
in the header is the total size of the chunks that follow plus `4` bytes for
the 'WEBP' FourCC. The file SHOULD NOT contain anything after it. Readers MAY
parse such files, ignoring the trailing data. As the size of any chunk is even,
the size given by the RIFF header is also even. The contents of individual
chunks will be described in the following sections.
the 'WEBP' FourCC. The file SHOULD NOT contain any data after the data
specified by _File Size_. Readers MAY parse such files, ignoring the trailing
data. As the size of any chunk is even, the size given by the RIFF header is
also even. The contents of individual chunks will be described in the following
sections.
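
To make the _File Size_ arithmetic concrete, here is a hedged Python sketch that checks the 12-byte RIFF header and reports how many bytes follow the data covered by _File Size_. The name `check_riff_header` is hypothetical, and the sketch does not validate the chunks themselves.

```python
import struct


def check_riff_header(data: bytes):
    """Return (file_size, trailing_bytes) for a WebP byte string.

    Hypothetical helper; validates only the 12-byte RIFF header.
    """
    if len(data) < 12:
        raise ValueError("too short for a RIFF header")
    riff, file_size, form = struct.unpack_from("<4sI4s", data, 0)
    if riff != b"RIFF" or form != b"WEBP":
        raise ValueError("not a WebP RIFF container")
    # File Size counts the 'WEBP' FourCC plus all chunks, but not the
    # 8 bytes taken by 'RIFF' and the size field itself.
    end = 8 + file_size
    if end > len(data):
        raise ValueError("file is truncated")
    # Anything past `end` is trailing data that a reader MAY ignore.
    return file_size, len(data) - end


# Header only (File Size = 4) followed by two stray bytes.
sample = b"RIFF" + struct.pack("<I", 4) + b"WEBP" + b"xx"
assert check_riff_header(sample) == (4, 2)
```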
Simple File Format (Lossy)
@ -206,7 +206,7 @@ VP8 data: _Chunk Size_ bytes
: VP8 bitstream data.
Note the fourth character in the 'VP8 ' FourCC is an ASCII space (%x20).
Note the fourth character in the 'VP8 ' FourCC is an ASCII space (0x20).
The VP8 bitstream format specification can be found at [VP8 Data Format and
Decoding Guide][vp8spec]. Note that the VP8 frame header contains the VP8 frame
@ -292,7 +292,7 @@ details about frames can be found in the [Animation](#animation) section.
All chunks SHOULD be placed in the same order as listed above. If a chunk
appears in the wrong place, the file is invalid, but readers MAY parse the
file, ignoring the chunks that come too late.
file, ignoring the chunks that are out of order.
**Rationale:** Setting the order of chunks should allow quicker file
parsing. For example, if an 'ALPH' chunk does not appear in its required
@ -322,7 +322,7 @@ Extended WebP file header:
Reserved (Rsv): 2 bits
: MUST be `0`.
: MUST be `0`. Readers MUST ignore this field.
ICC profile (I): 1 bit
@ -348,11 +348,11 @@ Animation (A): 1 bit
Reserved (R): 1 bit
: MUST be `0`.
: MUST be `0`. Readers MUST ignore this field.
Reserved: 24 bits
: MUST be `0`.
: MUST be `0`. Readers MUST ignore this field.
Canvas Width Minus One: 24 bits
@ -366,7 +366,7 @@ Canvas Height Minus One: 24 bits
The product of _Canvas Width_ and _Canvas Height_ MUST be at most `2^32 - 1`.
Future specifications MAY add more fields.
Future specifications may add more fields. Unknown fields MUST be ignored.
### Chunks
@ -400,8 +400,8 @@ Background Color: 32 bits (_uint32_)
**Note**:
* Background color MAY contain a transparency value (alpha), even if the
_Alpha_ flag in [VP8X chunk](#extended_header) is unset.
* Background color MAY contain a non-opaque alpha value, even if the _Alpha_
flag in [VP8X chunk](#extended_header) is unset.
* Viewer applications SHOULD treat the background color value as a hint, and
are not required to use it.
@ -414,8 +414,8 @@ Loop Count: 16 bits (_uint16_)
: The number of times to loop the animation. `0` means infinitely.
This chunk MUST appear if the _Animation_ flag in the VP8X chunk is set.
If the _Animation_ flag is not set and this chunk is present, it
SHOULD be ignored.
If the _Animation_ flag is not set and this chunk is present, it MUST be
ignored.
ANMF chunk:
@ -466,7 +466,7 @@ Frame Duration: 24 bits (_uint24_)
Reserved: 6 bits
: MUST be 0.
: MUST be `0`. Readers MUST ignore this field.
Blending method (B): 1 bit
@ -547,7 +547,7 @@ _padded_ chunks as described by the [RIFF file format](#riff-file-format).
Reserved (Rsv): 2 bits
: MUST be `0`.
: MUST be `0`. Readers MUST ignore this field.
Pre-processing (P): 2 bits
@ -686,8 +686,7 @@ If this chunk is not present, sRGB SHOULD be assumed.
Metadata can be stored in 'EXIF' or 'XMP ' chunks.
There SHOULD be at most one chunk of each type ('EXIF' and 'XMP '). If there
are more such chunks, readers MAY ignore all except the first one. Also, a file
may possibly contain both 'EXIF' and 'XMP ' chunks.
are more such chunks, readers MAY ignore all except the first one.
The chunks are defined as follows:
@ -721,7 +720,7 @@ XMP Metadata: _Chunk Size_ bytes
: image metadata in XMP format.
Note the fourth character in the 'XMP ' FourCC is an ASCII space (%x20).
Note the fourth character in the 'XMP ' FourCC is an ASCII space (0x20).
Additional guidance about handling metadata can be found in the
Metadata Working Group's [Guidelines for Handling Metadata][metadata].
@ -747,19 +746,40 @@ original order (unless they specifically intend to modify these chunks).
### Assembling the Canvas from frames
Here we provide an overview of how a reader should assemble a canvas in the
case of an animated image. The notation _VP8X.field_ means the field in the
'VP8X' chunk with the same description.
Here we provide an overview of how a reader MUST assemble a canvas in the case
of an animated image.
Displaying an _animated image_ canvas MUST be equivalent to the following
pseudocode:
The process begins with creating a canvas using the dimensions given in the
'VP8X' chunk, `Canvas Width Minus One + 1` pixels wide by `Canvas Height Minus
One + 1` pixels high. The `Loop Count` field from the 'ANIM' chunk controls how
many times the animation process is repeated. This is `Loop Count - 1` for
non-zero `Loop Count` values or infinitely if `Loop Count` is zero.
At the beginning of each loop iteration the canvas is filled using the
background color from the 'ANIM' chunk or an application defined color.
'ANMF' chunks contain individual frames given in display order. Before rendering
each frame, the previous frame's `Disposal method` is applied.
The rendering of the decoded frame begins at the Cartesian coordinates (`2 *
Frame X`, `2 * Frame Y`) using the top-left corner of the canvas as the origin.
`Frame Width Minus One + 1` pixels wide by `Frame Height Minus One + 1` pixels
high are rendered onto the canvas using the `Blending method`.
The canvas is displayed for `Frame Duration` milliseconds. This continues until
all frames given by 'ANMF' chunks have been displayed. A new loop iteration is
then begun or the canvas is left in its final state if all iterations have been
completed.
The following pseudocode illustrates the rendering process. The notation
_VP8X.field_ means the field in the 'VP8X' chunk with the same description.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
assert VP8X.flags.hasAnimation
canvas ← new image of size VP8X.canvasWidth x VP8X.canvasHeight with
background color ANIM.background_color.
loop_count ← ANIM.loopCount
dispose_method ← ANIM.disposeMethod
dispose_method ← Dispose to background color
if loop_count == 0:
loop_count = ∞
frame_params ← nil
@ -785,10 +805,12 @@ for loop = 0..loop_count - 1
frame_params.bitstream = bitstream_data
render frame with frame_params.alpha and frame_params.bitstream
on canvas with top-left corner at (frame_params.frameX,
frame_params.frameY), using dispose method dispose_method.
frame_params.frameY), using blending method
frame_params.blendingMethod.
canvas contains the decoded image.
Show the contents of the canvas for
frame_params.frameDuration * 1ms.
dispose_method = frame_params.disposeMethod
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
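
The coordinate and size arithmetic from the rendering prose can be spelled out as follows. This is only a sketch of the rectangle computation, not of decoding, disposal, or blending. The function name `frame_rect` is hypothetical.

```python
def frame_rect(frame_x, frame_y, width_minus_one, height_minus_one):
    """Return (left, top, width, height) of a frame on the canvas, in pixels.

    Hypothetical helper; arguments are the raw 'ANMF' field values.
    """
    # Offsets are stored halved, so the pixel position is twice the field.
    left = 2 * frame_x
    top = 2 * frame_y
    # Dimensions are stored minus one.
    return left, top, width_minus_one + 1, height_minus_one + 1


# Frame X = 3, Frame Y = 5, 100 x 50 pixels.
assert frame_rect(3, 5, 99, 49) == (6, 10, 100, 50)
```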


@ -51,7 +51,7 @@ the stream containing them, and bits of each byte are read in
least-significant-bit-first order. When multiple bits are read at the
same time, the integer is constructed from the original data in the
original order. The most significant bits of the returned integer are
also the most significant bits of the original data. Thus the statement
also the most significant bits of the original data. Thus, the statement
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
b = ReadBits(2);
@ -196,7 +196,7 @@ once. The transformations are used only for the main level ARGB image:
the subresolution images have no transforms, not even the 0 bit
indicating the end-of-transforms.
Typically an encoder would use these transforms to reduce the Shannon
Typically, an encoder would use these transforms to reduce the Shannon
entropy in the residual image. Also, the transform data can be decided
based on entropy minimization.
@ -566,9 +566,9 @@ pixels are bundled into a single pixel. The pixel bundling packs several
respectively. Pixel bundling allows for a more efficient joint
distribution entropy coding of neighboring pixels, and gives some
arithmetic coding-like benefits to the entropy code, but it can only be
used when there are a small number of unique values.
used when there are 16 or fewer unique values.
`color_table_size` specifies how many pixels are combined together:
`color_table_size` specifies how many pixels are combined:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
int width_bits;
@ -583,13 +583,12 @@ if (color_table_size <= 2) {
}
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
`width_bits` has a value of 0, 1, 2 or 3. A value of 0 indicates no
pixel bundling to be done for the image. A value of 1 indicates that two
pixels are combined together, and each pixel has a range of \[0..15\]. A
value of 2 indicates that four pixels are combined together, and each
pixel has a range of \[0..3\]. A value of 3 indicates that eight pixels
are combined together and each pixel has a range of \[0..1\], i.e., a
binary value.
`width_bits` has a value of 0, 1, 2 or 3. A value of 0 indicates no pixel
bundling to be done for the image. A value of 1 indicates that two pixels are
combined, and each pixel has a range of \[0..15\]. A value of 2 indicates that
four pixels are combined, and each pixel has a range of \[0..3\]. A value of 3
indicates that eight pixels are combined and each pixel has a range of \[0..1\],
i.e., a binary value.
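
The relationship between `color_table_size`, `width_bits`, and the per-pixel range can be summarized in a short Python sketch. The thresholds are inferred from the ranges stated above (2, 4, and 16 distinct values), and the function names are hypothetical.

```python
def width_bits_for(color_table_size: int) -> int:
    """Map a color table size to width_bits (thresholds inferred from the text)."""
    if color_table_size <= 2:
        return 3
    if color_table_size <= 4:
        return 2
    if color_table_size <= 16:
        return 1
    return 0


def bundle_info(color_table_size: int):
    """Return (width_bits, pixels_per_bundle, bits_per_pixel)."""
    width_bits = width_bits_for(color_table_size)
    # 2**width_bits pixels share one 8-bit green component.
    pixels_per_bundle = 1 << width_bits
    bits_per_pixel = 8 >> width_bits
    return width_bits, pixels_per_bundle, bits_per_pixel


assert bundle_info(2) == (3, 8, 1)    # eight binary pixels per bundle
assert bundle_info(16) == (1, 2, 4)   # two pixels, each in [0..15]
assert bundle_info(17) == (0, 1, 8)   # no bundling
```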
The values are packed into the green component as follows:
@ -659,7 +658,7 @@ Each pixel is encoded using one of the three possible methods:
3. Color cache code: using a short multiplicative hash code (color cache
index) of a recently seen color.
The following sub-sections describe each of these in detail.
The following subsections describe each of these in detail.
#### 5.2.1 Prefix Coded Literals
@ -725,7 +724,7 @@ return offset + ReadBits(extra_bits) + 1;
{:#distance-mapping}
As noted previously, distance code is a number indicating the position of a
previously seen pixel, from which the pixels are to be copied. This sub-section
previously seen pixel, from which the pixels are to be copied. This subsection
defines the mapping between a distance code and the position of a previous
pixel.