This speeds up WebP lossless decoding by 20%. In particular,
photographic images see a 35% speedup.
Change-Id: Idb94750342a140ec05df52c07e12be4bba335adc
speeds up those codes that are not part of the main lookup.
This gives a 10% speedup for a photographic image.
Change-Id: Ief54b0ad77db790a01314402ad351b40ac9a7be4
Specialize and simplify the alpha-decoding case, which is used when:
- no color-cache is used
- all red/blue/alpha values are the same (and hence their Huffman trees have
only 1 symbol, so we don't need to consume any bits to read them).
+ revamped the loop to use size_t and offsets instead of pointers.
~2-3% faster on Unix (gcc) but up to 25% faster lossy+alpha decoding
on Mac (llvm) and ARM.
Change-Id: I43c9688d1e4811cab0ecf0108a5b8f45781083e6
+ split AllocateInternalBuffers() into two 32b/8b variants instead of
trying to do everything in one function.
Change-Id: I35cac9fcd990a2194c95da4b2a4046ca3a514343
Considering the fact that insert to/lookup from the color cache is always 32
bit, use DecodeImageData() variant in that case.
Change-Id: I6c665a6cfbd9bd10651c1e82fa54e687cbd54a2b
src\dec\vp8l.c(816) : warning C4244: '=' : conversion from '__int64' to
'int', possible loss of data
src\dec\vp8l.c(817) : warning C4244: '=' : conversion from '__int64' to
'int', possible loss of data
Change-Id: I1d376d5dea909395bff8741aba16e8eed83a6e8f
rather than symlink the webm/vpx terms, use the same header as libvpx to
reference in-tree files
based on the discussion in:
https://codereview.chromium.org/12771026/
Change-Id: Ia3067ecddefaa7ee01550136e00f7b3f086d4af4
* "declaration of ‘index’ shadows a global declaration [-Wshadow]"
* "signed and unsigned type in conditional expression [-Wsign-compare]"
Change-Id: I891182d919b18b6c84048486e0385027bd93b57d
Earlier, such images used roughly 9 * width * height bytes for
decoding. Now they use roughly 6 * width * height bytes.
Change-Id: Ie4a681ca5074d96d64f30b2597fafdca648dd8f7
Simplify and re-organize the VP8L bit-reader functions
(e.g.: the 40-bit look-ahead code was helping much)
Speed-up with LBITS=64, on arm7-a:
=> before:
./dwebp_justify_24_neon -v bryce_ll.webp
Time to decode picture: 11.393s
File bryce_ll.webp can be decoded (dimensions: 11158 x 2156).
...
=> after (LBITS=64): Time to decode picture: 9.953s
Making the VP8L bit-reader work in 32-bit mode is going to be
harder (because we need to be able to read two symbols
at a time, each with a max length of 15 bits)
Change-Id: I89746fb103b87b5e2fd40a3208a6fbc584b88297
Fix the lossless decoder for the case when it has to apply other
inverse transforms before applying Color indexing inverse transform.
The main idea is to make ColorIndexingInverse virtually in-place: we
use the fact that the argb_cache is allocated to accommodate all
*unpacked* pixels of a macro-row, not just *packed* pixels.
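As a generic illustration of unpacking in place when the buffer is already
sized for the unpacked pixels, here is a minimal sketch. It is not the code
from this change: the function name, the choice of keeping packed indices in
the low 8 bits of each word, and the back-to-front walk are all assumptions
made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 'buf' holds 'width' words. On entry, the first
 * ceil(width / per_word) words carry packed palette indices in their low
 * 8 bits (assumed layout). On exit, buf[x] is the palette color of pixel x. */
static void UnpackIndicesInPlace(uint32_t* buf, int width, int bits_per_index,
                                 const uint32_t* palette) {
  int x;
  int per_word;
  uint32_t mask;
  assert(bits_per_index == 1 || bits_per_index == 2 ||
         bits_per_index == 4 || bits_per_index == 8);
  per_word = 8 / bits_per_index;
  mask = (1u << bits_per_index) - 1;
  /* Walk backwards: the word read for pixel x sits at index x / per_word,
   * which is <= x, so it has not been overwritten by an earlier store. */
  for (x = width - 1; x >= 0; --x) {
    const uint32_t packed = buf[x / per_word] & 0xff;
    const uint32_t index =
        (packed >> ((x % per_word) * bits_per_index)) & mask;
    buf[x] = palette[index];
  }
}
```

The backward walk is what makes the single buffer sufficient: every packed
word is read for the last time before the store that overwrites it.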
Change-Id: I27f11f3043f863dfd753cc2580bc5b36376800c4
Check for valid bounds on the 'dist' in backward reference case.
Clamp it to 1 in case of zero and negative values.
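The clamp can be sketched as follows. The helper name is hypothetical and
only the lower bound described above is shown.

```c
#include <assert.h>

/* Hypothetical helper: force a backward-reference distance to be >= 1. */
static int ClampBackwardDistance(int dist) {
  const int clamped = (dist < 1) ? 1 : dist;
  assert(clamped >= 1);
  return clamped;
}
```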
Change-Id: I78e956d4595955efa02b1f9628b475093f6ee001
moves the implementation to ParseHeadersInternal. this also allows
decoding to start at a VP8X sub-chunk, e.g. 'ALPH'.
Change-Id: I06791f87d90f888de32746ecb02705e4b0ff227a
Ensure that the lossless bit-stream doesn't allow for such cases and
safeguard the decoder against indefinite recursion.
Change-Id: Ia6d7f519291de8739f79a977a5800982872aae71
Change the lossless signature to 0x2f
Add a 1-bit indicator for 'droppable (or trivial) alpha'.
Add a 3-bit lossless version (for future extensions like yuv support).
Change the sub-resolution information to 3 bits implying range [2 .. 9]
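A minimal sketch of two of the items above, under stated assumptions: the
macro and function names are hypothetical, and the 3-bit sub-resolution field
is assumed to map to [2 .. 9] by adding 2 to the raw value.

```c
#include <assert.h>
#include <stdint.h>

#define LOSSLESS_SIGNATURE 0x2f  /* value from this message; name is made up */

/* Returns 1 if 'byte' is the lossless signature byte. */
static int IsLosslessSignature(uint8_t byte) {
  return byte == LOSSLESS_SIGNATURE;
}

/* Maps a raw 3-bit field (0..7) to the range [2 .. 9] (assumed offset of 2). */
static int SubResolutionFromBits(uint32_t raw_bits) {
  assert(raw_bits < 8);
  return (int)raw_bits + 2;
}
```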
Change-Id: Ic7b8c069240bbcd326cf5d5d4cd2dde8667851e2
Previously, it used to assume any raw bitstream is a VP8 one.
Also,
- Factor out VP8CheckSignature() & VP8LCheckSignature().
- Use a local var for *data_ptr in ParseVP8Header() for
readability.
Change-Id: I0fa8aa177dad7865e00c8898f7e7ce76a9db19d5
may be useful later for instance to bypass some code
if we know we don't use the Bundle+ColorMap transform.
Change-Id: I9dc70d18165b2363ad9ede763684ef3d8eba5903
This saves ~26 bytes of headers.
* introduce new VP8LDecodeAlphaImageStream() for decoding
* use VP8LEncodeStream() for encoding
* refactor code a bit
still TODO: make the alpha-quality/enc-method user-configurable
Change-Id: I23e599bebe335cfb5868e746e076c3358ef12e71
Limit the overall number of transformations to 4 and disallow any
duplicate transform for decoding an image.
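One way to enforce both rules is a small bitmask of transform types already
seen. This is only a sketch: the struct, the function name and the numbering
of transform types as 0..3 are assumptions for the example.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_TRANSFORMS 4  /* limit stated above */

typedef struct {
  uint32_t seen_mask;  /* bit i is set once transform type i was registered */
  int count;           /* number of transforms accepted so far */
} TransformTracker;

/* Returns 1 if the transform is accepted, 0 if it must be rejected. */
static int RegisterTransform(TransformTracker* t, int type) {
  assert(t != 0);
  if (type < 0 || type >= MAX_TRANSFORMS) return 0;  /* unknown type */
  if (t->count >= MAX_TRANSFORMS) return 0;          /* too many */
  if (t->seen_mask & (1u << type)) return 0;         /* duplicate */
  t->seen_mask |= 1u << type;
  ++t->count;
  return 1;
}
```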
Change-Id: Ic4b0ecd553db96702e117fd073617237d95e45c0
this allows later customization of data output method.
No perf diff observed, even if ProcessRows is no longer inlined.
Change-Id: I6933a3612a9cf6c108cf2776dfde0ae80c6c07c0
+ small opportunistic fixes:
* allow NULL decoded_data to be passed to DecodeStream
and clarify (with assert()) when to do so
* AllocateAndInitRescaler() was already setting error status,
as it should. No need to do it at caller's site
Change-Id: I30867e596564a7f459a0d1ddbf6f5d312414b7fd
When we are at end-of-stream, but haven't decoded all pixels, we should
return an error.
Also remove an obsolete TODO.
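The end-of-stream rule above amounts to a check like the following sketch.
The function and parameter names are hypothetical and do not reflect the
decoder's actual status codes.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 when the stream ended before all pixels were decoded,
 * which the caller should report as an error. */
static int IsTruncatedStream(int end_of_stream, size_t decoded_pixels,
                             size_t total_pixels) {
  assert(decoded_pixels <= total_pixels);
  return end_of_stream && decoded_pixels < total_pixels;
}
```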
Change-Id: I3fb1646136e706da536d537a54d1fa487a890630
- Symbols added to the tree are valid inside HuffmanTreeBuildExplicit().
- In HuffmanTreeBuildImplicit(), make sure 'root_symbol' is
valid in case of a single symbol tree.
Change-Id: I7de5de71ff28f41e2d6228b29ed8dd4a20813e99
[Basically, the condition "src - dist < data" can be wrongly evaluated
to be false if "src < dist" due to underflow. Instead, "src - data <
dist" is the correct condition, as "src > data" is always true and so
there would never be an underflow].
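The safe form of the check can be sketched like this. The helper name is
hypothetical; the point is that the subtraction is done between the two
pointers into the same buffer, never as "src - dist".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if copying from 'src - dist' stays inside the buffer that
 * starts at 'data'. Requires src >= data, so 'src - data' cannot be
 * negative. */
static int BackwardRefInBounds(const uint32_t* data, const uint32_t* src,
                               size_t dist) {
  assert(src >= data);
  return (size_t)(src - data) >= dist;
}
```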
Change-Id: Ic9f64bfe76a9acae97abc1fb7c1f4868e81f1eb8
No empty trees are codified with the simple Huffman code. The simple Huffman
code is simplified to be either a 1-bit code or 8-bit code for symbols.
Change-Id: I3e2813027b5a643862729339303d80197c497aff