mirror of
https://github.com/webmproject/libwebp.git
synced 2025-01-27 15:12:54 +01:00
4e2e0a8cac
Change-Id: I8e29f1c814b0c8f6aa268562d8653989caaf281d modified: doc/webp-container-spec.txt
468 lines
16 KiB
Plaintext
468 lines
16 KiB
Plaintext
<!--
|
|
|
|
Although you may be viewing an alternate representation, this document
|
|
is sourced in Markdown, a light-duty markup scheme, and is optimized for
|
|
the [kramdown](http://kramdown.rubyforge.org/) transformer.
|
|
|
|
See the accompanying README. External link targets are referenced at the
|
|
end of this file.
|
|
|
|
-->
|
|
|
|
|
|
WebP Container Specification
|
|
============================
|
|
|
|
_Working Draft, v0.5, 20120713_
|
|
|
|
|
|
* TOC placeholder
|
|
{:toc}
|
|
|
|
|
|
Introduction
|
|
------------
|
|
|
|
WebP is an image format that uses either (i) the VP8 key frame encoding
|
|
to compress image data in a lossy way, or (ii) the WebP lossless encoding
|
|
(and possibly other encodings in the future). These encoding schemes should
|
|
make it more efficient than currently used formats. It is optimized for fast
|
|
image transfer over the network (e.g., for websites). This document describes
|
|
the structure of a WebP file.
|
|
|
|
The WebP container (i.e., RIFF container for WebP) allows feature support over
|
|
and above the basic use case of WebP (i.e., a file containing a single image
|
|
encoded as a VP8 key frame). The WebP container provides additional support
|
|
for:
|
|
|
|
* **Lossless compression.** An image can be losslessly compressed, using the
|
|
WebP Lossless Format.
|
|
|
|
* **Transparency.** An image may have transparency, i.e., an alpha channel.
|
|
|
|
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
|
|
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
|
|
document are to be interpreted as described in [RFC 2119][].
|
|
|
|
|
|
Terminology & Basics
|
|
------------------------
|
|
|
|
A WebP file contains a still image (i.e., an encoded matrix of pixels) and,
|
|
optionally, transparency information. In case we need to refer only to the
|
|
matrix of pixels, we will call it the _canvas_ of the image.
|
|
|
|
Below are additional terms used throughout this document:
|
|
|
|
Code that reads WebP files is referred to as a _reader_, while
|
|
code that writes them is referred to as a _writer_.
|
|
|
|
_uint16_
|
|
|
|
: A 16-bit, little-endian, unsigned integer.
|
|
|
|
_uint24_
|
|
|
|
: A 24-bit, little-endian, unsigned integer.
|
|
|
|
_uint32_
|
|
|
|
: A 32-bit, little-endian, unsigned integer.
|
|
|
|
_1-based_
|
|
: An unsigned integer field storing values offset by `-1`. e.g., Such a field
|
|
would store value _25_ as _24_.
|
|
|
|
The basic element of a RIFF file is a _chunk_. It consists of:
|
|
|
|
0 1 2 3
|
|
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
| Chunk FourCC |
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
| Chunk Size |
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
| Chunk Payload |
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
|
|
Chunk FourCC: 32 bits
|
|
|
|
: ASCII four character code or _chunk tag_ used for chunk identification.
|
|
|
|
Chunk Size: 32 bits (_uint32_)
|
|
|
|
: The size of the chunk (_ckSize_) not including this field, the chunk
|
|
identifier and padding.
|
|
|
|
Chunk Payload: _Chunk Size_ bytes
|
|
|
|
: The data payload. If _Chunk Size_ is odd a single padding byte that
|
|
SHOULD be `0` is added.
|
|
|
|
_ChunkHeader('ABCD')_
|
|
|
|
: This is used to describe the fourcc and size header of individual
|
|
chunks, where 'ABCD' is the fourcc for the chunk. This element's
|
|
size is 8 bytes.
|
|
|
|
: Note that, in this specification, all chunk tag characters are in
|
|
file order, not in byte order of a uint32 of any particular
|
|
architecture.
|
|
|
|
_list of chunks_
|
|
|
|
: A concatenation of multiple chunks.
|
|
|
|
: We will refer to the first chunk as having _position_ 0, the second
|
|
as position 1, etc. By _chunk with index 0 among "ABCD"_ we mean
|
|
the first chunk among the chunks of type "ABCD" in the list, the
|
|
_chunk with index 1 among "ABCD"_ is the second such chunk, etc.
|
|
|
|
A WebP file MUST begin with a single chunk with a tag 'RIFF'. All
|
|
other defined chunks are contained within this chunk. The file SHOULD
|
|
NOT contain anything after it.
|
|
|
|
The maximum size of RIFF's _ckSize_ is 2^32 minus 10 bytes. The size
|
|
of the whole file is at most 4GiB minus 2 bytes.
|
|
|
|
**Note:** some RIFF libraries are said to have bugs when handling files
|
|
larger than 1GiB or 2GiB. If you are using an existing library, check
|
|
that it handles large files correctly.
|
|
|
|
The first four bytes of the RIFF chunk contents (i.e., bytes 8-11 of the file)
|
|
MUST be the ASCII string "WEBP". They are followed by a list of chunks. As the
|
|
size of any chunk is even, the size of the RIFF chunk is also even. The
|
|
contents of the chunks in that list will be described in the following sections.
|
|
|
|
**Note:** RIFF has a convention that all-uppercase chunks are standard
|
|
chunks that apply to any RIFF file format, while chunks specific to a
|
|
file format are all lowercase. WebP does not follow this convention.
|
|
|
|
|
|
WebP file header
|
|
----------------
|
|
|
|
0 1 2 3
|
|
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
| 'R' | 'I' | 'F' | 'F' |
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
| File Size |
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
| 'W' | 'E' | 'B' | 'P' |
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
|
|
'RIFF': 32 bits
|
|
|
|
: The ASCII characters 'R' 'I' 'F' 'F'.
|
|
|
|
File Size: 32 bits (_uint32_)
|
|
|
|
: The size of the file in bytes starting at offset 8.
|
|
|
|
'WEBP': 32 bits
|
|
|
|
: The ASCII characters 'W' 'E' 'B' 'P'.
|
|
|
|
Simple file format (lossy)
|
|
--------------------------
|
|
|
|
This layout SHOULD be used if the image requires _lossy_ encoding and does not
|
|
require transparency or other advanced features provided by the extended format.
|
|
Files with this layout are smaller and supported by older software.
|
|
|
|
Simple WebP (lossy) file format:
|
|
|
|
0 1 2 3
|
|
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
| WebP file header (12 bytes) |
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
| VP8 chunk |
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
|
|
VP8 chunk:
|
|
|
|
0 1 2 3
|
|
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
| ChunkHeader('VP8 ') |
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
| VP8 data |
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
|
|
VP8 data: _Chunk Size_ bytes
|
|
|
|
: VP8 bitstream data.
|
|
|
|
The VP8 bitstream format specification can be found at [VP8 Data Format and
|
|
Decoding Guide][vp8spec]. Note that the VP8 frame header contains the VP8 frame
|
|
width and height. That is assumed to be the width and height of the canvas.
|
|
|
|
The VP8 specification describes how to decode the image into Y'CbCr
|
|
format. To convert to RGB, Rec. 601 SHOULD be used.
|
|
|
|
Simple file format (lossless)
|
|
-----------------------------
|
|
|
|
**Note:** Older readers may not support files using the lossless format.
|
|
|
|
This layout SHOULD be used if the image requires _lossless_ encoding (with an
|
|
optional transparency channel) and does not require advanced features provided
|
|
by the extended format.
|
|
|
|
Simple WebP (lossless) file format:
|
|
|
|
0 1 2 3
|
|
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
| WebP file header (12 bytes) |
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
| VP8L chunk |
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
|
|
VP8L chunk:
|
|
|
|
0 1 2 3
|
|
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
| ChunkHeader('VP8L') |
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
| VP8L data |
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
|
|
VP8L data: _Chunk Size_ bytes
|
|
|
|
: VP8L bitstream data.
|
|
|
|
The current specification of the VP8L bitstream can be found at
|
|
[WebP Lossless Bitstream Format][webpllspec]. Note that the VP8L header
|
|
contains the VP8L image width and height. That is assumed to be the width
|
|
and height of the canvas.
|
|
|
|
Extended file format
|
|
--------------------
|
|
|
|
**Note:** Older readers may not support files using the extended format.
|
|
|
|
An extended format file consists of:
|
|
|
|
* A 'VP8X' chunk with information about features used in the file.
|
|
|
|
* An optional 'ALPH' chunk with transparency information.
|
|
|
|
* The image bitstream contained in either a 'VP8 ' or 'VP8L' chunk.
|
|
|
|
All chunks SHOULD be placed in the same order as listed above. If a chunk
|
|
appears in the wrong place, the file is invalid, but readers MAY parse the
|
|
file, ignoring the chunks that come too late.
|
|
|
|
**Rationale:** Setting the order of chunks should allow quicker file
|
|
parsing. For example, if an 'ALPH' chunk does not appear in its required
|
|
position, a decoder can choose to stop searching for it. The rule of
|
|
ignoring late chunks should make programs that need to do a full search
|
|
give the same results as the ones stopping early.
|
|
|
|
Extended WebP file header:
|
|
|
|
0 1 2 3
|
|
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
| WebP file header (12 bytes) |
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
| ChunkHeader('VP8X') |
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
| Rsv |L| Rsv | Reserved |
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
| Canvas Width Minus One | ...
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
... Canvas Height Minus One |
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
|
|
Reserved (Rsv): 4 bits
|
|
|
|
: SHOULD be `0`.
|
|
|
|
Alpha (L): 1 bit
|
|
|
|
: Set if the file contains some (or all) images with transparency information
|
|
("alpha").
|
|
|
|
Reserved (Rsv): 3 bits
|
|
|
|
: SHOULD be `0`.
|
|
|
|
Reserved: 24 bits
|
|
|
|
: SHOULD be `0`.
|
|
|
|
Canvas Width Minus One: 24 bits
|
|
|
|
: _1-based_ width of the canvas in pixels.
|
|
The actual canvas width is '1 + Canvas Width Minus One'
|
|
|
|
Canvas Height Minus One: 24 bits
|
|
|
|
: _1-based_ height of the canvas in pixels.
|
|
The actual canvas height is '1 + Canvas Height Minus One'
|
|
|
|
The product of _Canvas Width_ and _Canvas Height_ MUST be at most `2^32 - 1`.
|
|
|
|
Future specifications MAY add more fields.
|
|
|
|
### Chunks
|
|
|
|
#### Alpha
|
|
|
|
0 1 2 3
|
|
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
| ChunkHeader('ALPH') |
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
|Rsv| P | F | C | Alpha Bitstream... |
|
|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|
|
|
|
Compression method (C): 2 bits
|
|
|
|
: The compression method used:
|
|
|
|
* `0`: No compression.
|
|
* `1`: Compressed using the WebP lossless format.
|
|
|
|
Filtering method (F): 2 bits
|
|
|
|
: The filtering method used:
|
|
|
|
* `0`: None.
|
|
* `1`: Horizontal filter.
|
|
* `2`: Vertical filter.
|
|
* `3`: Gradient filter.
|
|
|
|
For each pixel, filtering is performed using the following calculations.
|
|
Assume the alpha values surrounding the current `X` position are labeled as:
|
|
|
|
C | B |
|
|
---+---+
|
|
A | X |
|
|
|
|
We seek to compute the alpha value at position `X`. First, a prediction is
|
|
made depending on the filtering method:
|
|
|
|
* Method `0`: predictor = 0
|
|
* Method `1`: predictor = A
|
|
* Method `2`: predictor = B
|
|
* Method `3`: predictor = clip(A + B - C)
|
|
|
|
where `clip(v)` is equal to:
|
|
|
|
* 0 if v < 0
|
|
* 255 if v > 255
|
|
* v otherwise
|
|
|
|
The final value is derived by adding the decompressed value `X` to the
|
|
predictor and using modulo-256 arithmetic to wrap the [256-511] range
|
|
into the [0-255] one:
|
|
|
|
`alpha = (predictor + X) % 256`
|
|
|
|
There are special cases for left-most and top-most pixel positions:
|
|
|
|
* Top-left value at location (0,0) uses 0 as predictor value. Otherwise,
|
|
* For horizontal or gradient filtering methods, the left-most pixels at
|
|
location (0, y) are predicted using the location (0, y-1) just above.
|
|
* For vertical or gradient filtering methods, the top-most pixels at
|
|
location (x, 0) are predicted using the location (x-1, 0) on the left.
|
|
|
|
|
|
Pre-processing (P): 2 bits
|
|
|
|
: These INFORMATIVE bits are used to signal the pre-processing that has
|
|
been performed during compression. The decoder can use this information to
|
|
e.g. dither the values or smooth the gradients prior to display.
|
|
|
|
* `0`: no pre-processing
|
|
* `1`: level reduction
|
|
|
|
Decoders are not required to use this information in any specified way.
|
|
|
|
Reserved (Rsv): 2 bits
|
|
|
|
: SHOULD be `0`.
|
|
|
|
Alpha bitstream: _Chunk Size_ - `1` bytes
|
|
|
|
: Encoded alpha bitstream.
|
|
|
|
This optional chunk contains encoded alpha data for the image. An image
|
|
containing a 'VP8L' chunk SHOULD NOT contain this chunk.
|
|
|
|
**Rationale**: The transparency information of the image is already part
|
|
of the 'VP8L' chunk.
|
|
|
|
The alpha channel data is stored as uncompressed raw data (when
|
|
compression method is '0') or compressed using the lossless format
|
|
(when the compression method is '1').
|
|
|
|
* Raw data: consists of a byte sequence of length width * height,
|
|
containing all the 8-bit transparency values in scan order.
|
|
|
|
* Lossless format compression: the byte sequence is a compressed
|
|
image-stream (as described in the [WebP Lossless Bitstream Format]
|
|
[webpllspec]) of implicit dimension width x height. That is, this
|
|
image-stream does NOT contain any headers describing the image dimension.
|
|
|
|
**Rationale**: the dimension is already known from other sources,
|
|
so storing it again would be redundant and error-prone.
|
|
|
|
Once the image-stream is decoded into ARGB color values, following
|
|
the process described in the lossless format specification, the
|
|
transparency information must be extracted from the *green* channel
|
|
of the ARGB quadruplet.
|
|
|
|
**Rationale**: the green channel is allowed extra transformation
|
|
steps in the specification -- unlike the other channels -- that can
|
|
improve compression.
|
|
|
|
#### Bitstream (VP8/VP8L)
|
|
|
|
This chunk contains compressed image data.
|
|
|
|
A bitstream chunk may be either (i) a VP8 chunk, using "VP8 " (note the
|
|
significant fourth-character space) as its tag _or_ (ii) a VP8L chunk, using
|
|
"VP8L" as its tag.
|
|
|
|
The formats of VP8 and VP8L chunks are as described in sections
|
|
[Simple file format (lossy)](#simple-file-format-lossy)
|
|
and [Simple file format (lossless)](#simple-file-format-lossless) respectively.
|
|
|
|
#### Unknown Chunks
|
|
|
|
A file MAY contain other unknown chunks. Readers SHOULD ignore these chunks.
|
|
Writers SHOULD preserve them in their original order.
|
|
|
|
Example file layouts
|
|
--------------------
|
|
|
|
A lossy encoded image with alpha may look as follows:
|
|
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
RIFF/WEBP
|
|
+- VP8X (descriptions of features used)
|
|
+- ALPH (alpha bitstream)
|
|
+- VP8 (bitstream)
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
A losslessly encoded image may look as follows:
|
|
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
RIFF/WEBP
|
|
+- VP8X (descriptions of features used)
|
|
+- XYZW (unknown chunk)
|
|
+- VP8L (lossless bitstream)
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
[vp8spec]: http://tools.ietf.org/html/rfc6386
|
|
[webpllspec]: https://gerrit.chromium.org/gerrit/gitweb?p=webm/libwebp.git;a=blob;f=doc/webp-lossless-bitstream-spec.txt;hb=master
|
|
[metadata]: http://www.metadataworkinggroup.org/pdf/mwg_guidance.pdf
|
|
[rfc 2119]: http://tools.ietf.org/html/rfc2119
|