Commit Graph

105 Commits

Author SHA1 Message Date
skal
1667bded67 Remove ReadOneBit() and ReadSymbolUnsafe()
Simplify and re-organize the VP8L bit-reader functions
(e.g.: the 40-bit look-ahead code was helping much)

Speed-up with LBITS=64, on arm7-a:

=> before:
./dwebp_justify_24_neon -v bryce_ll.webp
Time to decode picture: 11.393s
File bryce_ll.webp can be decoded (dimensions: 11158 x 2156).
...

=> after (LBITS=64):	Time to decode picture: 9.953s

making the VP8L bit-reader in 32 bit mode is going to be
harder (because we need to be able to read two symbols
at a time, each with max length 15 bits)

Change-Id: I89746fb103b87b5e2fd40a3208a6fbc584b88297
2013-02-20 00:13:23 +01:00
skal
b7490f8553 introduce WEBP_REFERENCE_IMPLEMENTATION compile option
This flag will make the code use no uint64, no asm, and no fancy
trick, but instead aim at being as simple and straightforward as
possible.
Main use is to help emscripten generate proper JS code.
More code needs to be simplified later.

Also: tune the BITS values to be 24 and make use of WEBP_RIGHT_JUSTIFY
Here are the typical timing for decoding a large image:

        ARM7-a:
        dwebp_justify_32_neon Time to decode picture: 3.280s
        dwebp_justify_24_neon Time to decode picture: 2.640s
        dwebp_justify_16_neon Time to decode picture: 2.723s
        dwebp_justify_8_neon Time to decode picture: 2.802s
        dwebp_justify_32 Time to decode picture: 4.264s
        dwebp_justify_24 Time to decode picture: 3.696s
        dwebp_justify_16 Time to decode picture: 3.779s
        dwebp_justify_8 Time to decode picture: 3.834s
        dwebp_32_neon Time to decode picture: 4.010s
        dwebp_24_neon Time to decode picture: 2.725s
        dwebp_16_neon Time to decode picture: 2.852s
        dwebp_8_neon Time to decode picture: 2.778s
        dwebp_32 Time to decode picture: 4.587s
        dwebp_24 Time to decode picture: 3.800s
        dwebp_16 Time to decode picture: 3.902s
        dwebp_8 Time to decode picture: 3.815s
        REFERENCE (HEAD) Time to decode picture: 3.818s

        x86_64:
        dwebp_justify_32 Time to decode picture: 0.473s
        dwebp_justify_24 Time to decode picture: 0.434s
        dwebp_justify_16 Time to decode picture: 0.450s
        dwebp_justify_8 Time to decode picture: 0.467s
        dwebp_32 Time to decode picture: 0.474s
        dwebp_24 Time to decode picture: 0.468s
        dwebp_16 Time to decode picture: 0.468s
        dwebp_8 Time to decode picture: 0.481s
        REFERENCE (HEAD) Time to decode picture: 0.436s

        i386:
        dwebp_justify_32 Time to decode picture: 0.723s
        dwebp_justify_24 Time to decode picture: 0.618s
        dwebp_justify_16 Time to decode picture: 0.626s
        dwebp_justify_8 Time to decode picture: 0.651s
        dwebp_32 Time to decode picture: 0.744s
        dwebp_24 Time to decode picture: 0.627s
        dwebp_16 Time to decode picture: 0.642s
        dwebp_8 Time to decode picture: 0.642s

Change-Id: Ie56c7235733a24f94fbfc2e4351aae36ec39c225
2013-02-14 15:46:12 +01:00
skal
3383885799 faster decoding (3%-6%)
. revamped the boolean decoder to use less shifts
. added some description and ASCII art as explanations too.
. clarified the types further (bit_t, lbit_t, range_t, etc.)
. changed the negative field 'missing_' into positive 'bits_'

Some stats, decoding some randomly encoded WebP files:

with USE_RIGHT_JUSTIFY:
  BITS=32 => 133 files, 50 loops => 7.3s (1.097 ms/file/iterations)
  BITS=24 => 133 files, 50 loops => 7.3s (1.097 ms/file/iterations)
  BITS=16 => 133 files, 50 loops => 7.4s (1.120 ms/file/iterations)
  BITS=8 => 133 files, 50 loops => 7.5s (1.128 ms/file/iterations)

without USE_RIGHT_JUSTIFY:
  BITS=32 => 133 files, 50 loops => 7.5s (1.131 ms/file/iterations)
  BITS=24 => 133 files, 50 loops => 7.6s (1.142 ms/file/iterations)
  BITS=16 => 133 files, 50 loops => 7.6s (1.143 ms/file/iterations)
  BITS=8 => 133 files, 50 loops => 7.6s (1.149 ms/file/iterations)

Change-Id: I9277fb051676c05582e9c7ea3cb5a4b2a3ffb12e
2013-02-14 15:42:58 +01:00
pascal massimino
08e7c58ee1 Merge "Provide an option to build decoder library." 2013-01-23 11:54:24 -08:00
Vikas Arora
0aeba52852 Provide an option to build decoder library.
When the config option '--enable-libwebpdecoder' is specified, the
lean decoder library 'libwebpdecoder' will be created in addition to
libwebp. Also dwebp binary will be linked to libwebpdecoder, if this
config option is specified.

Change-Id: I9de3e149b59c9a8390fae2ba660941749640e54a
2013-01-23 11:43:36 -08:00
skal
757ebcb1c1 catch malloc(0)/calloc(0) with an assert
Actually, it turns out we now should never call these functions
with a zero size, otherwise something is wrong in the logic.

Change-Id: Ie414fcbec95486c169190470a71f2cff0843782a
2013-01-23 20:09:28 +01:00
skal
42f8f9346c handle malloc(0) and calloc(0) uniformly on all platforms
also change lossless encoder logic, which was relying on explicit
NULL return from WebPSafeMalloc(0)

renamed function to CheckSizeArgumentsOverflow() explicitly

addresses issue #138

Change-Id: Ibbd51cc0281e60e86dfd4c5496274399e4c0f7f3
2013-01-22 23:40:16 +01:00
skal
ba2aa0fdda Add support for BITS=24 case
The main advantage is that you can avoid the use of uint64_t
some times, sticking to 32bit only.
Default still is BITS=32, this is mainly "in case".

Change-Id: Id694028793117ba822c37d46ef6c52fa0afed4ac
2012-11-26 23:47:08 +01:00
Urvang Joshi
23782f95b4 Separate out mux and demux code and libraries:
- Separate out mux.h and demux.h
- muxtypes.h: new header for data types common to mux/demux
- Move some misc read/write utilities to utils/utils.h
- Remove some duplicate methods.
- Separate out mux/demux libraries

Change-Id: If9b9569b10d55d922ad9317ef51710544315d6de
2012-11-19 11:40:18 -08:00
Urvang Joshi
7caab1d8f6 Some cosmetic/comment fixes.
Change-Id: Id0613f84cc53fcbeceb913c835a262451687e27b
2012-11-09 10:46:38 -08:00
skal
8f216f7e60 remove cases of equal comparison for qsort()
Returning 0 (equal) can lead to undefined behaviour.
And, in our cases we'll never have equal keys (added asserts for that)

Change-Id: Ifaf202df321d3f877ad2a03de42e0d6cdd1b2388
2012-09-25 19:03:41 +02:00
skal
2afee60a7c speed up for ARM using 8bit for boolean decoder
SBITS=8 is reported 20-30% faster on ARM (where 64bit ops
are expensive).

Also use 32bits for i32.

Change-Id: Id6a7197d805061aeb8832f20432512d0d930ebfa
2012-09-10 23:27:58 +02:00
Pascal Massimino
2cf1f81590 Merge "fix the BITS=8 case" 2012-09-03 02:36:03 -07:00
Pascal Massimino
12f78aec48 fix the BITS=8 case
spotted by Måns Rullgård (mans at mansr dot com)
Change-Id: I4720dc2eeb645af894e396739be6fa11b5fe2739
2012-09-03 02:29:14 -07:00
Pascal Massimino
6920c71f0a fix MSVC warnings regarding implicit uint64 to uint32 conversions
Change-Id: I284dae9222a3817bba3c5ba6be271b31b5bf660d
2012-09-01 07:21:45 -07:00
Pascal Massimino
bff34ac1ca harness some malloc/calloc to use WebPSafeMalloc and WebPSafeCalloc
quite a large security sweep.

Change-Id: If150dfbb46e6e9b56210473a109c8ad6ccd0cea4
2012-08-01 12:06:04 -07:00
Pascal Massimino
a3c063c714 Merge "extra size check for security" into 0.2.0 2012-08-01 12:00:30 -07:00
Pascal Massimino
f1edf62fae Merge "rationalize use of color-cache" into 0.2.0 2012-08-01 11:55:36 -07:00
Pascal Massimino
c19333173a extra size check for security
no speed diff observed by removing the test before calling BitWriterResize().

+ remove some unnecessary memset() in VP8LBitWriter
+ fix mixed code/variable-decl in BIG_ENDIAN mode

Change-Id: I36be61f83d10a43e4682b680c2dae0e494da4218
2012-08-01 00:37:24 -07:00
Pascal Massimino
906be65744 rationalize use of color-cache
* ~1-4% faster
* if it's not used, don't use it
* remove the special handling of cache_bits = 0
* remove some tests in the loops

Change-Id: I19d87c3ca731052ff532ea8b2d8e89816507b75f
2012-08-01 00:32:12 -07:00
Pascal Massimino
80cc7303ab WebPCheckMalloc() and WebPCheckCalloc():
safe size-checking versions of malloc() and calloc()

Change-Id: Iffa3138c48b9b254b3d7eaad913e1f852d9dafba
2012-07-31 16:56:39 -07:00
Pascal Massimino
7d732f905b make QuantizeLevels() store the sum of squared error
(instead of MSE).
Useful for directly storing the alpha-PSNR (in another patch)

Change-Id: I4072864f9c53eb4f38366e8025a2816eb14f504e
2012-07-23 14:26:56 -07:00
Pascal Massimino
2fc1301577 harmonize authors as "Name (mail@address)"
Change-Id: I85bfae61a37de75a5ed945a906002de2ef75149f
2012-07-19 16:09:47 -07:00
James Zern
02201c35a0 Merge "remove one malloc() by making color_cache non dynamic" into 0.2.0 2012-07-16 19:21:05 -07:00
Pascal Massimino
7f0c178e46 remove one malloc() by making color_cache non dynamic
Change-Id: I7c71a056f79a79bfacfe64a263f1eb8476c05456
2012-07-16 19:13:58 -07:00
James Zern
596dff784d VP8LFillBitWindow: use 64-bit path for msvc x64 builds
Change-Id: I14a3865f4091dd048e02abc99aa4e7f1e325e12a
2012-07-14 13:20:26 -07:00
James Zern
1d40a8bce8 configure: add pthread detection
adds a --disable-threading option to prevent the check.
uses AX_PTHREAD from:
git://git.savannah.gnu.org/autoconf-archive.git
100644 blob d90de34d14    ax_pthread.m4

Change-Id: Icf3ad1ebcf052748bc341706effcc9ba02a259e4
2012-07-12 16:27:29 -07:00
Pascal Massimino
48b39eb17a fix underflow for very short bitstreams
+ hardened the asserts

Change-Id: Ie798ef2f9d848c131f6f84a35ea28ef254822d1e
2012-07-09 16:47:34 -07:00
James Zern
f56a369ab0 fix big-endian VP8LWriteBits
bits would be lost if n_bits was > 17

Change-Id: Id315d075075338a08e10c04faa91cab347a53591
2012-07-02 14:30:41 -07:00
James Zern
18cae37b91 msvc: silence some build warnings
these are related to the removal of USE_LOSSLESS_ENCODER

Change-Id: Ib0ca5bc2766c351130507bc2e657a034c8455b34
2012-06-13 13:51:50 -07:00
Pascal Massimino
78f3e34504 Enable lossless encoder code
Remove USE_LOSSLESS_ENCODER compile flag
Update Makefile.am and makefile.unix

Change-Id: If7080c4d8f37994c7c784730c5e547bb0a851455
2012-06-13 00:26:58 -07:00
James Zern
8fbb91884e prefer webp/types.h over stdint.h
stdint.h is part of C99 and is notably lacking under MSVC

Change-Id: Iff60dcb8bdcc7f948dc35fb0b5d47478520b570f
2012-06-04 18:34:24 -07:00
Urvang Joshi
e75dc80516 Move some more defines to format_constants.h
Also remove some duplicate const/defines.

Change-Id: I0ec48866b874f546022d72e938fb65669b0b3211
2012-05-24 17:46:01 +05:30
Pascal Massimino
27f417ab66 fix orthographic typo
Change-Id: I4f62e9c125ef3bc1ab75871e6725551f1c89f5e8
2012-05-22 13:20:57 -07:00
Pascal Massimino
d73e63a726 add DequantizeLevels() placeholder
will be called by alpha post-processing, although doing nothing for now.
Gradient smoothing would be nice-to-have here. Patch welcome!

Change-Id: I534cde866bdc75da22d0f0a6d1373c90e21366f3
2012-05-22 02:28:19 -07:00
Pascal Massimino
3e863dda61 remove tcoder, switch alpha-plane compression to lossless
* Method #1 is now calling the lossless encoder on the alpha plane.
Format is not final, it's just a first draft. We need ad-hoc functions.
* removed now useless utils/alpha.*
* added utils/quant_levels.h instead
* removed the TCoder code altogether

Change-Id: I636840b6129a43171b74860e0a0fc5bb1bcffc6a
2012-05-21 06:24:48 -07:00
James Zern
778c52284b Merge "remove some variable shadowing" 2012-05-17 14:03:10 -07:00
Vikas Arora
817c9dce61 Few more HuffmanTreeToken conversions.
Change-Id: I932b5368d279f83c462c7d916978dab3e81d7709
2012-05-16 12:15:52 +05:30
James Zern
37a77a6bf4 remove some variable shadowing
Change-Id: I4348253ec6b50639095b22c4745dc26da0904466
2012-05-15 14:04:24 -07:00
Pascal Massimino
c6882c49e3 merge all tree processing into a single VP8LProcessTree()
-> 0.1% size improvement because we're calling OptimizeForRLE()
systematically now.

Change-Id: I03bd712175728e0d46323f375134cae5a241db4b
2012-05-14 05:49:02 -07:00
Vikas Arora
8b85d01c45 Added HuffmanTreeCode Struct for tree codes.
To represent tree codes (depth and bits array).

Change-Id: I87650886384dd10d95b16ab808dfd3bb573172bc
2012-05-14 13:50:35 +05:30
Pascal Massimino
432399472f add assert(tokens)
Change-Id: I952a5cd15ff0a80cff349293e6403357cbc7bd8d
2012-05-11 01:45:06 -07:00
Pascal Massimino
ac8e5e42d1 minor typo and style fix
Change-Id: If4927beb7a8f3c96379eee1fedc687a5046a6951
2012-05-11 01:17:31 -07:00
Pascal Massimino
9f566d1d36 clean-up around Huffman-encode
* refine doc
* use LUT for bit-reversal
* added assert
* Introduce HuffmanTreeToken instead of separate code/extra_bits arrays

Change-Id: I0fe0b50b55eb43a4be9f730b1abe40632a6fa7f0
2012-05-10 09:11:47 -07:00
Urvang Joshi
14757f8ae2 Make sure huffman trees always have valid symbols
- Symbols added to the tree are valid inside HuffmanTreeBuildExplicit().
- In HuffmanTreeBuildImplicit(), make sure 'root_symbol' is
valid in case of a single symbol tree.

Change-Id: I7de5de71ff28f41e2d6228b29ed8dd4a20813e99
2012-05-10 11:40:18 +05:30
James Zern
27541fbdc0 cosmetics: VP8LCreateHuffmanTree: fix indent
Change-Id: I025fada400366f8937208e1f9b8df483cd53a942
2012-05-09 12:11:22 -07:00
Urvang Joshi
2c140e113c Some cleanup in VP8LCreateHuffmanTree() (and related functions
CompareHuffmanTrees() and SetBitDepths()):
- Move 'tree_size' initialization and malloc for 'tree + tree_pool'
  outside the loop.
- Some renames/tweaks for readability.

Change-Id: I5cb3cc942afac6e9f51a0b97c57ee897677a48a2
2012-05-09 14:26:01 +05:30
James Zern
e38602d2ad Merge branch 'lossless_encoder'
* lossless_encoder: (46 commits)
  split StoreHuffmanCode() into smaller functions
  more consolidation: introduce VP8LHistogramSet
  big code clean-up and refactoring and optimization
  Some cosmetics in histogram.c
  Approximate FastLog between value range [256, 8192]
  Forgot to update out_bit_costs to symbol_bit_costs at one instance.
  Evaluate output cluster's bit_costs once in HistogramRefine.
  Simple Huffman code changes.
  Lossless decoder: remove an unneeded param in ReadHuffmanCodeLengths().
  Reducing emerging palette size from 11 to 9 bits.
  Move GetHistImageSymbols to histogram.c
  Improve predict vs no-predict heuristic.
  code-moving and clean-up
  reduce memory usage by allocating only one histo
  Restrict histo_bits to ensure histo_image size is under 32MB
  further simplification for the meta-Huffman coding
  A quick pass of cleanup in backward reference code
  Make transform bits a function of encode method (-m).
  introduce -lossless option, protected by USE_LOSSLESS_ENCODER
  Run TraceBackwards for higher qualities.
  ...

Conflicts:
	src/enc/webpenc.c

Change-Id: I9a5d98cba0889ea91d10699466939cc283da345a
2012-05-07 14:27:17 -07:00
Urvang Joshi
352a4f49ab Get rid of PackLiteralBitLengths()
[and in turn a malloc]. Also, a few related const fixes.

Change-Id: I229519b1c34d41c78d9ad2403f1e25feab3c9d93
2012-05-07 14:24:31 -07:00
Vikas Arora
84547f540c Add EncodeImageInternal() method.
Most of changes in enc/vp8l.c is cherry-picked from src/lossless/encode.c

Change-Id: I27938cb2590eccbfe1db0a454343e856bd483e75
2012-05-07 14:24:25 -07:00