diff --git a/doc/webp-container-spec.txt b/doc/webp-container-spec.txt index 63ec01d7..38a3e2ce 100644 --- a/doc/webp-container-spec.txt +++ b/doc/webp-container-spec.txt @@ -25,7 +25,7 @@ to compress image data in a lossy way, or (ii) the WebP lossless encoding (and possibly other encodings in the future). These encoding schemes should make it more efficient than currently used formats. It is optimized for fast image transfer over the network (e.g., for websites). The WebP format has -feature parity (color profile, metadata, animation etc) with other formats as +feature parity (color profile, metadata, animation, etc.) with other formats as well. This document describes the structure of a WebP file. The WebP container (i.e., RIFF container for WebP) allows feature support over @@ -84,8 +84,8 @@ _uint32_ _FourCC_ : A _FourCC_ (four-character code) is a _uint32_ created by concatenating four - ASCII characters in little-endian order. This means 'aaaa' (%x61.61.61.61) and - 'AAAA' (%x41.41.41.41) are treated as different _FourCCs_. + ASCII characters in little-endian order. This means 'aaaa' (0x61616161) and + 'AAAA' (0x41414141) are treated as different _FourCCs_. _1-based_ @@ -123,14 +123,13 @@ Chunk FourCC: 32 bits Chunk Size: 32 bits (_uint32_) -: The size of the chunk not including this field, the chunk identifier or - padding. +: The size of the chunk in bytes, not including this field, the chunk + identifier or padding. Chunk Payload: _Chunk Size_ bytes -: The data payload. If _Chunk Size_ is odd, a single padding byte -- that - SHOULD be `0` to conform with RIFF -- is added. Applications MAY use - another value, but readers may fail to parse the file. +: The data payload. If _Chunk Size_ is odd, a single padding byte -- that MUST + be `0` to conform with RIFF -- is added. 
**Note:** RIFF has a convention that all-uppercase chunk FourCCs are standard chunks that apply to any RIFF file format, while FourCCs specific to a file @@ -166,10 +165,11 @@ File Size: 32 bits (_uint32_) A WebP file MUST begin with a RIFF header with the FourCC 'WEBP'. The file size in the header is the total size of the chunks that follow plus `4` bytes for -the 'WEBP' FourCC. The file SHOULD NOT contain anything after it. Readers MAY -parse such files, ignoring the trailing data. As the size of any chunk is even, -the size given by the RIFF header is also even. The contents of individual -chunks will be described in the following sections. +the 'WEBP' FourCC. The file SHOULD NOT contain any data after the data +specified by _File Size_. Readers MAY parse such files, ignoring the trailing +data. As the size of any chunk is even, the size given by the RIFF header is +also even. The contents of individual chunks will be described in the following +sections. Simple File Format (Lossy) @@ -206,7 +206,7 @@ VP8 data: _Chunk Size_ bytes : VP8 bitstream data. -Note the fourth character in the 'VP8 ' FourCC is an ASCII space (%x20). +Note the fourth character in the 'VP8 ' FourCC is an ASCII space (0x20). The VP8 bitstream format specification can be found at [VP8 Data Format and Decoding Guide][vp8spec]. Note that the VP8 frame header contains the VP8 frame @@ -292,7 +292,7 @@ details about frames can be found in the [Animation](#animation) section. All chunks SHOULD be placed in the same order as listed above. If a chunk appears in the wrong place, the file is invalid, but readers MAY parse the -file, ignoring the chunks that come too late. +file, ignoring the chunks that are out of order. **Rationale:** Setting the order of chunks should allow quicker file parsing. For example, if an 'ALPH' chunk does not appear in its required @@ -322,7 +322,7 @@ Extended WebP file header: Reserved (Rsv): 2 bits -: MUST be `0`. +: MUST be `0`. Readers MUST ignore this field. 
ICC profile (I): 1 bit @@ -348,11 +348,11 @@ Animation (A): 1 bit Reserved (R): 1 bit -: MUST be `0`. +: MUST be `0`. Readers MUST ignore this field. Reserved: 24 bits -: MUST be `0`. +: MUST be `0`. Readers MUST ignore this field. Canvas Width Minus One: 24 bits @@ -366,7 +366,7 @@ Canvas Height Minus One: 24 bits The product of _Canvas Width_ and _Canvas Height_ MUST be at most `2^32 - 1`. -Future specifications MAY add more fields. +Future specifications may add more fields. Unknown fields MUST be ignored. ### Chunks @@ -400,8 +400,8 @@ Background Color: 32 bits (_uint32_) **Note**: - * Background color MAY contain a transparency value (alpha), even if the - _Alpha_ flag in [VP8X chunk](#extended_header) is unset. + * Background color MAY contain a non-opaque alpha value, even if the _Alpha_ + flag in [VP8X chunk](#extended_header) is unset. * Viewer applications SHOULD treat the background color value as a hint, and are not required to use it. @@ -414,8 +414,8 @@ Loop Count: 16 bits (_uint16_) : The number of times to loop the animation. `0` means infinitely. This chunk MUST appear if the _Animation_ flag in the VP8X chunk is set. -If the _Animation_ flag is not set and this chunk is present, it -SHOULD be ignored. +If the _Animation_ flag is not set and this chunk is present, it MUST be +ignored. ANMF chunk: @@ -466,7 +466,7 @@ Frame Duration: 24 bits (_uint24_) Reserved: 6 bits -: MUST be 0. +: MUST be `0`. Readers MUST ignore this field. Blending method (B): 1 bit @@ -547,7 +547,7 @@ _padded_ chunks as described by the [RIFF file format](#riff-file-format). Reserved (Rsv): 2 bits -: MUST be `0`. +: MUST be `0`. Readers MUST ignore this field. Pre-processing (P): 2 bits @@ -686,8 +686,7 @@ If this chunk is not present, sRGB SHOULD be assumed. Metadata can be stored in 'EXIF' or 'XMP ' chunks. There SHOULD be at most one chunk of each type ('EXIF' and 'XMP '). If there -are more such chunks, readers MAY ignore all except the first one. 
Also, a file -may possibly contain both 'EXIF' and 'XMP ' chunks. +are more such chunks, readers MAY ignore all except the first one. The chunks are defined as follows: @@ -721,7 +720,7 @@ XMP Metadata: _Chunk Size_ bytes : image metadata in XMP format. -Note the fourth character in the 'XMP ' FourCC is an ASCII space (%x20). +Note the fourth character in the 'XMP ' FourCC is an ASCII space (0x20). Additional guidance about handling metadata can be found in the Metadata Working Group's [Guidelines for Handling Metadata][metadata]. @@ -747,19 +746,40 @@ original order (unless they specifically intend to modify these chunks). ### Assembling the Canvas from frames -Here we provide an overview of how a reader should assemble a canvas in the -case of an animated image. The notation _VP8X.field_ means the field in the -'VP8X' chunk with the same description. +Here we provide an overview of how a reader MUST assemble a canvas in the case +of an animated image. -Displaying an _animated image_ canvas MUST be equivalent to the following -pseudocode: +The process begins with creating a canvas using the dimensions given in the +'VP8X' chunk, `Canvas Width Minus One + 1` pixels wide by `Canvas Height Minus +One + 1` pixels high. The `Loop Count` field from the 'ANIM' chunk controls how +many times the animation process is repeated. This is `Loop Count - 1` for +non-zero `Loop Count` values or infinitely if `Loop Count` is zero. + +At the beginning of each loop iteration the canvas is filled using the +background color from the 'ANIM' chunk or an application defined color. + +'ANMF' chunks contain individual frames given in display order. Before rendering +each frame, the previous frame's `Disposal method` is applied. + +The rendering of the decoded frame begins at the Cartesian coordinates (`2 * +Frame X`, `2 * Frame Y`) using the top-left corner of the canvas as the origin. 
+`Frame Width Minus One + 1` pixels wide by `Frame Height Minus One + 1` pixels +high are rendered onto the canvas using the `Blending method`. + +The canvas is displayed for `Frame Duration` milliseconds. This continues until +all frames given by 'ANMF' chunks have been displayed. A new loop iteration is +then begun or the canvas is left in its final state if all iterations have been +completed. + +The following pseudocode illustrates the rendering process. The notation +_VP8X.field_ means the field in the 'VP8X' chunk with the same description. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ assert VP8X.flags.hasAnimation canvas ← new image of size VP8X.canvasWidth x VP8X.canvasHeight with background color ANIM.background_color. loop_count ← ANIM.loopCount -dispose_method ← ANIM.disposeMethod +dispose_method ← Dispose to background color if loop_count == 0: loop_count = ∞ frame_params ← nil @@ -785,10 +805,12 @@ for loop = 0..loop_count - 1 frame_params.bitstream = bitstream_data render frame with frame_params.alpha and frame_params.bitstream on canvas with top-left corner at (frame_params.frameX, - frame_params.frameY), using dispose method dispose_method. + frame_params.frameY), using blending method + frame_params.blendingMethod. canvas contains the decoded image. Show the contents of the canvas for frame_params.frameDuration * 1ms. + dispose_method = frame_params.disposeMethod ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/doc/webp-lossless-bitstream-spec.txt b/doc/webp-lossless-bitstream-spec.txt index c1ff493e..bb5a9435 100644 --- a/doc/webp-lossless-bitstream-spec.txt +++ b/doc/webp-lossless-bitstream-spec.txt @@ -51,7 +51,7 @@ the stream containing them, and bits of each byte are read in least-significant-bit-first order. When multiple bits are read at the same time, the integer is constructed from the original data in the original order. 
The most significant bits of the returned integer are -also the most significant bits of the original data. Thus the statement +also the most significant bits of the original data. Thus, the statement ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ b = ReadBits(2); @@ -196,7 +196,7 @@ once. The transformations are used only for the main level ARGB image: the subresolution images have no transforms, not even the 0 bit indicating the end-of-transforms. -Typically an encoder would use these transforms to reduce the Shannon +Typically, an encoder would use these transforms to reduce the Shannon entropy in the residual image. Also, the transform data can be decided based on entropy minimization. @@ -566,9 +566,9 @@ pixels are bundled into a single pixel. The pixel bundling packs several respectively. Pixel bundling allows for a more efficient joint distribution entropy coding of neighboring pixels, and gives some arithmetic coding-like benefits to the entropy code, but it can only be -used when there are a small number of unique values. +used when there are 16 or fewer unique values. -`color_table_size` specifies how many pixels are combined together: +`color_table_size` specifies how many pixels are combined: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ int width_bits; @@ -583,13 +583,12 @@ if (color_table_size <= 2) { } ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -`width_bits` has a value of 0, 1, 2 or 3. A value of 0 indicates no -pixel bundling to be done for the image. A value of 1 indicates that two -pixels are combined together, and each pixel has a range of \[0..15\]. A -value of 2 indicates that four pixels are combined together, and each -pixel has a range of \[0..3\]. A value of 3 indicates that eight pixels -are combined together and each pixel has a range of \[0..1\], i.e., a -binary value. +`width_bits` has a value of 0, 1, 2 or 3. A value of 0 indicates no pixel +bundling to be done for the image. A value of 1 indicates that two pixels are +combined, and each pixel has a range of \[0..15\]. 
A value of 2 indicates that +four pixels are combined, and each pixel has a range of \[0..3\]. A value of 3 +indicates that eight pixels are combined and each pixel has a range of \[0..1\], +i.e., a binary value. The values are packed into the green component as follows: @@ -659,7 +658,7 @@ Each pixel is encoded using one of the three possible methods: 3. Color cache code: using a short multiplicative hash code (color cache index) of a recently seen color. -The following sub-sections describe each of these in detail. +The following subsections describe each of these in detail. #### 5.2.1 Prefix Coded Literals @@ -725,7 +724,7 @@ return offset + ReadBits(extra_bits) + 1; {:#distance-mapping} As noted previously, distance code is a number indicating the position of a -previously seen pixel, from which the pixels are to be copied. This sub-section +previously seen pixel, from which the pixels are to be copied. This subsection defines the mapping between a distance code and the position of a previous pixel.
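The pixel-bundling hunk above ties `width_bits` to the number of packed pixels and to the per-pixel index range. A minimal C sketch of that relationship follows. The helper names `width_bits_for`, `pixels_per_bundle`, and `bits_per_pixel` are hypothetical and not part of the specification. The thresholds past `color_table_size <= 2` are inferred from the ranges stated in the text (\[0..1\], \[0..3\], \[0..15\]) and from the "16 or fewer unique values" wording, since the diff elides that part of the original code block.

```c
#include <stdint.h>

/* Hypothetical helper: choose width_bits from the color table size.
 * Thresholds are inferred from the per-pixel ranges given in the text:
 *   <= 2 entries  -> 1 bit per index  -> 8 pixels per bundle (width_bits 3)
 *   <= 4 entries  -> 2 bits per index -> 4 pixels per bundle (width_bits 2)
 *   <= 16 entries -> 4 bits per index -> 2 pixels per bundle (width_bits 1)
 *   otherwise     -> no bundling                             (width_bits 0)
 */
static int width_bits_for(int color_table_size) {
  if (color_table_size <= 2) return 3;
  if (color_table_size <= 4) return 2;
  if (color_table_size <= 16) return 1;
  return 0;
}

/* Number of original pixels combined into one bundled pixel. */
static int pixels_per_bundle(int width_bits) {
  return 1 << width_bits;
}

/* Bits available to each original pixel's index inside the 8-bit
 * green component that carries the bundle. */
static int bits_per_pixel(int width_bits) {
  return 8 >> width_bits;
}
```

Under these assumptions, a 16-entry color table gives `width_bits` 1, so two 4-bit indices share one green byte, which matches the \[0..15\] range in the text. A 17-entry table falls through to `width_bits` 0 and is not bundled.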