Commit Graph

1164 Commits

Author SHA1 Message Date
James Zern
6e37cb942f Merge "cosmetics: backward_references.c: reindent after a7d2ee3" 2014-02-11 16:28:55 -08:00
James Zern
a42ea9742a cosmetics: backward_references.c: reindent after a7d2ee3
a7d2ee3 Optimize cache estimate logic.

Change-Id: I81dd1eea49f603465dc5f3afae8a101e5205e963
2014-02-11 15:52:22 -08:00
skal
6c32744214 Merge "fix missing __BIG_ENDIAN__ definition on some platform" 2014-02-11 14:51:11 -08:00
skal
a8b6aad155 fix missing __BIG_ENDIAN__ definition on some platform
e.g: mips-gcc doesn't define __BIG_ENDIAN__

Change-Id: Ic06bf453164ddddc69a523e7845a4993e14a1af2
2014-02-11 14:43:44 -08:00
Vikas Arora
fde2904b8a Increase initial buffer size for VP8L Bit Writer.
Increase the initial buffer size for VP8L Bit Writer from 4bpp to 8bpp.
The resize buffer is expensive (requires realloc and copy) and this additional
memory (0.5 * W * H) doesn't add much overhead on the lossless encoder.

Change-Id: Ic1fe55cd7bc3d1afadc799e4c2c8786ec848ee66
2014-02-11 11:13:21 -08:00
Vikas Arora
a7d2ee39be Optimize cache estimate logic.
Optimize 'VP8LCalculateEstimateForCacheSize' for lower quality ranges (Q < 50).
The entropy is generally lower for higher cache_bits, so start searching from
higher cache_bits and settle for a local minima, instead of evaluating all
values.

This speeds up the lossless encoding at lower qualities by 10-15%.

Change-Id: I33c1e958515a2549f2e6f64b1aab3f128660dcec
2014-02-11 10:59:01 -08:00
pascal massimino
7fb6095b03 Merge "dec_neon.c: add TransformAC3" 2014-02-11 09:17:34 -08:00
skal
bf182e837e VP8LBitWriter: use a bit-accumulator
* simplify the endian logic
* remove the need for memset()
* write 16 or 32 at a time (likely aligned)

Makes the code a bit faster on ARM (~1%)

Change-Id: I650bc5654e8d0b0454318b7a78206b301c5f6c2c
2014-02-11 09:12:45 -08:00
Djordje Pesut
3f40b4a581 Merge "MIPS: MIPS32r1: clang macro warning resolved" 2014-02-11 00:06:36 -08:00
Urvang Joshi
1684f4ee37 WebP Decoder: Mark some truncated bitstreams as invalid
Specifically, check for truncated RIFF and/or VP8(L) chunks.

For more context, see:
https://code.google.com/p/webp/issues/detail?id=185

Change-Id: I91ca2dbf05080660fbc513244fc53adc57fc04b5
2014-02-10 16:35:27 -08:00
Jovan Zelincevic
acbedac475 MIPS: MIPS32r1: clang macro warning resolved
.set macro - Enables the expansion of macro instructions.

Change-Id: I1e44fe056798aeff803cc97171724d21da1fc2bf
2014-02-10 06:50:11 -08:00
James Zern
228e4877ab dec_neon.c: add TransformAC3
based on SSE2 version

Change-Id: Icc6782955253c98e83d5984153b596ef5f1c0d34
2014-02-08 12:47:54 -08:00
skal
32aeaf115a revamp VP8LColorSpaceTransform() a bit
-> remove the 'color_transform' multiplier, use more constants, etc.

This function is particularly critical, mostly because of
GetBestColorTransformForTile().
Loop is a bit faster (maybe ~1%)

Change-Id: I90c96a3437cafb184773acef55c77e40c224388f
2014-02-05 10:37:06 +01:00
James Zern
0c7cc4ca20 Merge "Don't dereference NULL, ensure HashChain fully initialized" 2014-02-03 22:58:37 -08:00
Scott Talbot
391316fee2 Don't dereference NULL, ensure HashChain fully initialized
Found by clang's static analyzer, they look validly uninitialized
to me.

Change-Id: I650250f516cdf6081b35cdfe92288c20a3036ac8
2014-02-03 21:16:59 -08:00
skal
926ff40229 WEBP_SWAP_16BIT_CSP: remove code dup
and prepare for potentially supporting both RGBA4444 and BARG4444

Change-Id: If5200289bc6338757a2ceb2df1a19de732595052
2014-02-03 13:24:33 -08:00
Vikas Arora
1d1cd3bbd6 Fix decode bug for rgbA_4444/RGBA_4444 color-modes.
The WEBP_SWAP_16BIT_CSP flag needs to be honored while filling the Alpha (4 bits)
data in the destination buffer and while pre-multiplying the alpha to RGB colors.

Change-Id: I3b07307d60963db8d09c3b078888a839cefb35ba
2014-02-03 09:20:54 -08:00
James Zern
8934a622ac cosmetics: *_mips32.c
indent, comments, unused includes

Change-Id: Id0aabc52d05bb633f62aec022155ec27699cf5a0
2014-01-30 18:03:48 -08:00
Djordje Pesut
dd438c9a7d MIPS: MIPS32r1: Optimization of some simple point-sampling functions. PATCH [6/6]
Change-Id: I2020e71e9be5d17d4bf67cabf6c470ca43d5d838
2014-01-29 15:37:31 +01:00
Djordje Pesut
53520911c3 Added support for calling sampling functions via pointers.
Change-Id: Ic4d72e6b175a6b27bcdcc8cd97828e44ea93e743
2014-01-29 15:32:35 +01:00
Jovan Zelincevic
d16c69749b MIPS: MIPS32r1: Optimization of filter functions. PATCH [5/6]
Change-Id: Ifbd305e0514f09a587db02c3970f22190808503a
2014-01-29 15:03:45 +01:00
Djordje Pesut
04336fc7f8 MIPS: MIPS32r1: Optimization of function TransformOne. PATCH [4/6]
Change-Id: I5b98e2de940977538cf91bfa2128f4d1daa5c170
2014-01-28 20:10:43 -08:00
Djordje Pesut
92d8fc7dd4 MIPS: MIPS32r1: Optimization of function WebPRescalerImportRow. PATCH [3/6]
Change-Id: I32339a8d2d03f1a8d8638563d2b2c9e3a13a4909
2014-01-29 00:12:41 +01:00
skal
bbc23ff34c parse one row of intra modes altogether
(instead of per-macroblock)

speed unchanged.
simplified the context-saving for incremental decoding

Change-Id: I301be581bab581ff68de14c4ffe5bc0ec63f34be
2014-01-28 21:40:40 +01:00
James Zern
a2f608f9e0 Merge "MIPS: MIPS32r1: Optimization of function WebPRescalerExportRow. [2/6]" 2014-01-28 11:19:25 -08:00
Djordje Pesut
88230854e3 MIPS: MIPS32r1: Optimization of function WebPRescalerExportRow. [2/6]
Change-Id: I420e646deefac5df9a8dcec3b02da8dff02b1167
2014-01-28 11:02:01 -08:00
James Zern
c5a5b0286f decode mt+incremental: fix segfault in debug builds
VP8GetThreadMethod() may be called with a NULL headers param; correct an
assert.

broken since:
8a2fa09 Add a second multi-thread method

Change-Id: If7b6d1b8f4ec874d343a806cee5f5e6bb6438620
2014-01-27 20:33:05 -08:00
skal
9882b2f953 always use fast-analysis for all methods.
This makes the segmentation overall less prone to
local-optimum  or boundary effect.

(and overall, encoding is a little faster)

Change-Id: I35688098b0f43c28b5cb81c4a92e1575bb0eddb9
2014-01-24 18:17:04 +01:00
James Zern
2d2fc37d98 update .gitignore
- generalize *.pc for libwebp(mux|demux)
- add auto-generated mkinstalldirs

Change-Id: I7543209722d6b64830fa61f88eccf54165920cc4
2014-01-23 19:35:28 -08:00
Pascal Massimino
c1cb1933d5 disable NEON for arm64 platform
The registers and instructions are quite different to 32bit
and the assembly code needs a rewrite.

more info: http://people.linaro.org/~rikuvoipio/aarch64-talk/

Change-Id: Id75dbc1b7bf47f43a426ba2831f25bb8fa252c4f
2014-01-23 12:35:01 -08:00
Jovan Zelincevic
4d493f8db2 MIPS: MIPS32r1: Decoder bit reader function optimized. PATCH [1/6]
Solved unaligned load and added MIPS32r1 support for VP8LoadNewBytes.

Change-Id: I7782e668a8f2b9dace0ab6afeaff3449bbc7d90b
2014-01-17 22:58:40 +01:00
skal
c741183c10 make WebPCleanupTransparentArea work with argb picture
the -alpha_cleanup flag was ineffective since we switched cwebp
to using ARGB input always.

Original idea by David Eckel (dvdckl at gmail dot com)

Change-Id: I0917a8b91ce15a43199728ff4ee2a163be443bab
2014-01-17 20:19:16 +01:00
skal
5da185522b add a decoding option to flip image vertically
New API options: WebPDecoderOptions.flip and 'dwebp -flip ...'

it uses negative stride trick.

Also changed the decoder code to support user-supplied
buffers with negative stride, independently of the
WebPDecoderOptions.flip value.

Change-Id: I4dc0d06f0c87e51a3f3428be4fee2d6b5ad76053
2014-01-16 15:48:43 +01:00
James Zern
ea59a8e980 Merge "Merge tag 'v0.4.0'" 2014-01-10 17:09:19 -08:00
skal
7574bed43d fix comments related to array sizes
thanks to foivos_g4 at hotmail dot com for spotting this.

Change-Id: I8cd301bae58a6edbc3ef47607654bc21321721ca
2014-01-09 17:30:25 +01:00
James Zern
8c524db84c bump version to 0.4.0
libwebp{,decoder} - 0.4.0
libwebp libtool - 5.0.0
libwebpdecoder libtool - 1.0.0

mux/demux - 0.2.0
libtool - 1.0.0

Change-Id: Idbd067f95a6af2f0057d6a63ab43176fcdbb767d
2013-12-18 19:20:00 -08:00
James Zern
c72e08119a Merge "dec/webp.c: don't wait for data before reporting w/h" 2013-12-18 18:47:43 -08:00
James Zern
5ad653145a dec/frame.c: fix formatting
since 26d842e NEON speed up

+ drop a duplicate and from a comment

Change-Id: I710f46f83b80161064910c7efc16788b88c089fe
2013-12-18 17:16:31 -08:00
James Zern
f7fc4bc89b dec/webp.c: don't wait for data before reporting w/h
this partially reverts
f626fe2 Detect canvas and image size mismatch in decoder.

the original change would cause calls to e.g., WebPGetInfo to fail until
a portion of the image chunk was available. With lossy+alpha this meant
waiting for the entire ALPH chunk to be received.
this change restores the original behavior -- reporting the values from
VP8X if available -- while retaining some of the added canvas/image size
checks if the image data is available

Change-Id: I6295b00a2e2d0d4d8847371756af347e4a80bc0e
2013-12-18 17:09:04 -08:00
skal
66a32af5e1 Merge "NEON speed up" 2013-12-18 14:17:19 -08:00
skal
26d842eb8f NEON speed up
add TransformDC special case, and make the switch function inlined.
Recovers a few of the CPU lost during the addition of TransformAC3
(only on ARM)

Change-Id: I21c1f0c6a9cb9d1dfc1e307b4f473a2791273bd6
2013-12-18 22:32:58 +01:00
Pascal Massimino
495bef413d fix bug in TrellisQuantize
the *quantized* level should be clipped to 2047, not the
original coeff.
(similar problem was fixed in the regular quantize function
quite some time ago)

Change-Id: I2fd2f8d94561ff0204e60535321ab41a565e8f85
2013-12-17 11:08:01 -08:00
James Zern
605a712701 simplify __cplusplus ifdef
drop c_plusplus which is from a quite ancient pre-standard compiler

Change-Id: I9e357b3292a6b52b14c2641ba11f4f872c04b7fb
2013-12-16 20:16:02 -08:00
James Zern
5227d99146 drop: ifdef __cplusplus checks from C files
the prototypes are already marked in the headers

Change-Id: I172fe742200c939ca32a70a2299809b8baf9b094
2013-12-13 11:42:13 -08:00
skal
73b731fb42 introduce a special quantization function for WHT
WHT is somewhat a special case: no sharpen[] bias, etc.
Will be useful in a later CL when precision of input is changed.

Change-Id: I851b06deb94abdfc1ef00acafb8aa731801b4299
2013-12-10 14:21:47 +01:00
skal
41c0cc4b9a Make Forward WHT transform use 32bit fixed-point calculation
This is in preparation for a future change where input will
be 16bit instead of 12bit

No speed diff observed.

Note that the NEON implementation was using 32bit calc already.

Change-Id: If06935db5c56a77fc9cefcb2dec617483f5f62b4
2013-12-10 06:10:52 +01:00
skal
a3359f5d2c Only compute quantization params once
(all quantization params #1..#15 are the same)

Change-Id: If04058bd89fe2677b5b118ee4e1bcce88f0e4bf5
2013-12-10 05:36:23 +01:00
skal
d513bb62bc * fix off-by-one zthresh calculation
* remove the sharpening for non luma-AC coeffs
* adjust the bias a little bit to compensate for this

Using the multiply-by-reciprocal doesn't always give the same result
as the exact divide, given the QFIX fixed-point precision we use.
-> removed few now-unneeded SSE2 instructions (and checked for
bit-exactness using -noasm)

Change-Id: Ib68057cbdd69c4e589af56a01a8e7085db762c24
2013-12-09 13:56:04 +01:00
skal
cb63785545 Merge "fix bug due to overzealous check in WebPPictureYUVAToARGB()" 2013-12-04 00:54:35 -08:00
pascal massimino
8189885b6d Merge "EstimateBestFilter: use an int to iterate WEBP_FILTER_TYPE" 2013-12-03 23:32:41 -08:00
James Zern
c12e2369d8 cosmetics: fix a few typos
Change-Id: I73b1900b2d960c4c57ef7078df137c776b321a1b
2013-12-03 22:36:29 -08:00
skal
6f104034a1 fix bug due to overzealous check in WebPPictureYUVAToARGB()
This tests prevented views to be converted to ARGB

https://code.google.com/p/webp/issues/detail?id=178

Change-Id: I5ba66da2791e6f1d2bfd8c55b5fffe6955263374
2013-12-02 15:23:12 +01:00
James Zern
3f6c35c6f3 EstimateBestFilter: use an int to iterate WEBP_FILTER_TYPE
this change allows the code to be built with a C++ compiler without the
addition of an operator++

Change-Id: I2f2fae720b9772abfc3c540bb2e3bf9107d96cc9
2013-11-27 20:10:45 -08:00
James Zern
cc55790e37 Merge changes I8bb7a4dc,I2c180051,I021a014f,I8a224a62
* changes:
  mux: add some missing casts
  enc/vp8l: add a missing cast
  idec: add some missing casts
  ErrorStatusLossless: correct return type
2013-11-27 17:06:35 -08:00
James Zern
c536afb57b Merge "cosmetics: fix some typos" 2013-11-27 17:04:00 -08:00
skal
cbdd3e6e53 add a -dither dithering option to the decoder
Even at high quality setting, the U/V quantizer step is limited
to 4 which can lead to banding on gradient.
This option allows to selectively apply some randomness to
potentially flattened-out U/V blocks and attenuate the banding.

This option is off by default in 'dwebp', but set to -dither 50
by default in 'vwebp'.

Note: depending on the number of blocks selectively dithered,
we can have up to a 10% slow-down in decoding speed it seems.

Change-Id: Icc2446007f33ddacb60b3a80a9e63f2d5ad162de
2013-11-27 00:57:51 -08:00
James Zern
4931c3294b cosmetics: fix some typos
Change-Id: I0d6efebd817815139db5ae87236fd8911df4d53c
2013-11-26 19:21:14 -08:00
James Zern
05aacf77c2 mux: add some missing casts
+ fix a return value
+ fix a param type

Change-Id: I8bb7a4dc4c76b11140f8693c909aeb10f660e6e5
2013-11-25 20:53:15 -08:00
James Zern
617d93480e enc/vp8l: add a missing cast
Change-Id: I2c1800516eb4573ae2599866ace10017b865f23a
2013-11-25 20:48:13 -08:00
James Zern
46db286572 idec: add some missing casts
Change-Id: I021a014f1f3a597f37ba8d9f4006c8cb4100723c
2013-11-25 20:47:35 -08:00
James Zern
b524e3369c ErrorStatusLossless: correct return type
int -> VP8StatusCode

Change-Id: I8a224a622e98d401a6e401047780fa6b25cb0ae4
2013-11-25 20:45:53 -08:00
skal
cb261f790f fix a descaling bug for vertical/horizontal U/V interpolation
RGBToU/V calls expects two extra precision bits, they were only
given one by SUM2H and SUM2H macros.

For rounding coherency, also changed SUM1 macro.

Change-Id: I05f96a46f5d4f17b830d0420eaf79b066cdf78d4
2013-11-19 11:24:09 +01:00
James Zern
6198715e89 demux: split chunk parsing from ParseVP8X
makes the header and chunk parsing portions slightly more digestible

Change-Id: I48968468b2529506b830e96a2f13e8d29976179b
2013-11-16 11:06:43 -08:00
James Zern
d2e3f4e6b5 demux: add a tail pointer for chunks
a large number of non-frame chunks could slow the parse

Change-Id: I181bc73626e92263c3c5b2a6dc2bf6e6a0481d52
2013-11-16 10:57:09 -08:00
James Zern
87cffcc3c9 demux: cosmetics: s/has_frames/is_animation/
animation matches the flag name
s/has_fragments/is_fragmented/ to match.

Change-Id: I86c834e19bab6db6610ea2ccf900ab3ebffafa8a
2013-11-16 10:32:56 -08:00
James Zern
e18e66779b demux: strictly enforce the animation flag
if the flag is incorrectly set in a VP8L or VP8/ALPH single image file
the demux will now fail.

Change-Id: Id4d5f2d3f6922a29b442c5d3d0acbe3d679e468a
2013-11-14 16:01:36 -08:00
James Zern
c4f39f4a31 demux: cosmetics: remove a useless break
+ reword a comment

Change-Id: I52b64abcde2c8322b59c594f2f75b509ed367bff
2013-11-13 21:08:07 -08:00
James Zern
61cb884d79 demux: (non-exp) fail if the fragmented flag is set
otherwise make sure that all frames are marked as a fragment. there's
still some work to do with validation if fragments are expected to cover
the entire canvas.

Change-Id: Id59e95ac01b9340ba8c6039b0c3b65484b91c42f
2013-11-12 21:50:27 -08:00
skal
ff379db317 few % speedup of lossless encoding
mostly visible for method 4 and up

Change-Id: I1561d871bc055ec5f7998eb193d927927d3f2add
2013-11-12 00:09:45 +01:00
skal
df3649a287 remove all disabled code related to P-frames
it's drifting out of sync, and won't be used anyway...

Change-Id: I931b288e1c8480bf3bccd685b3a356bb6bd8e10b
2013-11-04 11:52:05 +01:00
James Zern
7708e60908 Merge "detect flatness in blocks and favor DC prediction" 2013-11-01 11:24:09 -07:00
James Zern
06b1503eff Merge "add comment about the kLevelsFromDelta[][] LUT generation" 2013-10-31 21:37:55 -07:00
skal
5935259cd0 add comment about the kLevelsFromDelta[][] LUT generation
Change-Id: Id1a91932bf05aeee421761667af351ef2200c334
2013-10-31 21:35:55 -07:00
skal
e3312ea681 detect flatness in blocks and favor DC prediction
this avoids local-minima that look bad, even if the distortion
looks low (e.g. gradients, sky,...). Mostly visible in the q=50-80 range.
Output size is mostly unchanged.

Change-Id: I425b600ec45420db409911367cda375870bc2c63
2013-11-01 00:47:04 +01:00
James Zern
ebc9b1eedf Merge "VPLBitReader bugfix: Catch error if bit_pos > LBITS too." 2013-10-31 16:38:15 -07:00
Urvang Joshi
96ad0e0aef VPLBitReader bugfix: Catch error if bit_pos > LBITS too.
Earlier we were only testing for bit_pos == LBITS. But this is not
sufficient,
as bit_pos can jump from < LBITS to > LBITS.

This was resulting in some bit-stream truncation errors not being
caught.

Note: Not a security bug though, as br->pos wasn't incremented in such
cases
and so we weren't reading beyond the buffer.

Change-Id: Idadcdcbc6a5713f8fac3470f907fa37a63074836
2013-10-30 16:33:36 -07:00
skal
a014e9c9cd tune quantization biases toward higher precision
* raise U/V quantization bias to more neutral values
* also raise the non-zero AC bias for Y1/Y2 matrices

(we need all the precision we can for U/V leves, which are often empty)
This will increase quality in the higher range (q >= 90) mostly.
Files size is exacted to raise a little (5-7%). and SSIM accordingly of course.

Change-Id: I8a9ffdb6d8fb6dadb959e3fd392e66dc5aaed64e
2013-10-30 23:57:23 +01:00
skal
1e898619cb add helpful PrintBlockInfo() function
(protected under a DEBUG_BLOCK compile flag)

Change-Id: Icb8da02dbd00e5cf856c314943c212f1c9578d9b
2013-10-30 19:25:27 +01:00
Pascal Massimino
596a6d73ce make use of 'extern' consistent in function declarations
Change-Id: I18e050db3111e52acfe97da09cdf1860f3e15936
2013-10-30 03:23:21 -07:00
Pascal Massimino
98aa33cf1e extract random utils to their own file util/random.[ch]
they'll be used for decoding too, probably.

Change-Id: Id9cbb250c74fc0e876d4ea46b1b3dbf8356d6725
2013-10-30 02:00:33 -07:00
James Zern
d3408720d8 Merge "fast auto-determined filtering strength" 2013-10-29 12:47:58 -07:00
skal
f8bfd5cd1e fast auto-determined filtering strength
kLevelsFromDelta[sharpness][delta] is an inverse look-up table
that tells the minimum filtering strength needed to trigger the
filtering of a step with amplitude 'delta'. We use this table
in various situations:

a) when computing the initial (/global) filtering
strength for each segment. We look at the quantization
step and deduce the proper filtering strength needed
to result this quantization noise (talking the -f option
into account).

b) during intra16 calculation, when a block ends up
very empty (only DC coeffs are non-zero, all ACs have
vanished). We'll rely on the in-loop filtering to
restore the smoothness (if the source was gradient-like
smooth. That's why we look at the distortion too before
triggering the filtering).

Step b) goes _in addition_ to a), potentially raising
the filtering strength if blockiness is likely.

Change-Id: Icaeca93ef21da195b079e6587a44d9edfc8e9efa
2013-10-29 20:13:29 +01:00
Pascal Massimino
ac0bf951ca small clean-up in ExpandMatrix()
Change-Id: Ib06cb1658a6548f06bb7320310b3864881b606a7
2013-10-29 19:58:57 +01:00
skal
43148b6cd2 filtering: precompute ilimit and hev_threshold
no speed change, just simplifying the logic

Change-Id: I518800494428596733d4fbae69072049828aec3c
2013-10-28 13:37:33 +01:00
Pascal Massimino
18f992ec0f simplify f_inner calculation a little
by incorporating the is_4x4 flag at init

Change-Id: I042e04aacb15181db0bf86f3212c880087519189
2013-10-28 01:49:09 -07:00
Pascal Massimino
241d11f141 add missing const
Change-Id: Id1c767d21d52197ed2e4497005eb9c4795c602f0
2013-10-25 20:34:14 +02:00
Pascal Massimino
86c0031eb2 add a 'format' field to WebPBitstreamFeatures
Change-Id: I79a688e4c34fb77527127bbdf4bc844efa6aa9a4
2013-10-25 20:34:06 +02:00
Urvang Joshi
dde91fde96 Demux: Correct the extended format validation
Earlier "f = f->next_" was executing for both inner and outer loop, thus
skipping validation of some frames.

Change-Id: Ice5cdb4ff5da78384aa0573addd3a5e5efa0b10c
2013-10-23 17:25:23 -07:00
skal
7c098d1814 Use some gamma-curve range compression when computing U/V average
This helps for discolorated chroma-subsampled edges.

Change-Id: I1d8ce87b66cb7e8b3572e6722905beabf0f50554
2013-10-18 21:28:05 +02:00
skal
0b2b05049f Use deterministic random-dithering during RGB->YUV conversion
-> helps debanding (sky, gradients, etc.)

This dithering can only be triggered when using -preset photo
or -pre 2 (as a preprocessing). Everything is unchanged otherwise.

Note that this change is likely to make the perceived PSNR/SSIM drop
since we're altering the input internally.

Change-Id: Id8d4326245d9b828141de162c94ba381b1fa5813
2013-10-17 22:36:49 +02:00
skal
8a2fa099cc Add a second multi-thread method
method 1 grouping: [parse + reconstruction] // [filtering + output]
method 2 grouping: [parse] // [reconstruction+filtering + output]

Depending on some heuristics (see VP8ThreadMethod()), we
can pick one of the other when -mt flag (or option.use_threads)
is selected.

Conservatively, we always use method #2 for now until the heuristic
is refined (so, timing should be the same the before this patch)

+ replace 'use_threads' by 'mt_method'
+ define MIN_WIDTH_FOR_THREADS constant
+ fix comment alignment

Change-Id: I11a756dea9070d6e21b1a9481d357a1e8aa0663e
2013-10-15 23:58:31 +02:00
skal
0532149c8a up to 20% faster multi-threaded decoding
Mostly visible for large images.
Reconstruction+filtering is now done in parallel to bitstream-parsing.

Change-Id: I4cc4483d803b255f4d97a2fcd9158b1c291dd900
2013-10-15 00:25:21 +02:00
skal
cb22155201 Decode a full row of bitstream before reconstructing
Needs more memory but allows for future parallelization.
Noticeably faster on ARM, slightly faster on x86

also: remove dec->filter_row_ unnecessary field

Change-Id: I044a808839b4e000c838a477e3e8688820436d9a
2013-10-10 21:29:58 +02:00
James Zern
dca8a4d315 Merge "NEON/simple loopfilter: avoid q4-q7 registers" 2013-10-10 01:58:41 -07:00
pascal massimino
9e84d901d2 Merge "NEON/TransformWHT: avoid q4-q7 registers" 2013-10-09 09:32:59 -07:00
James Zern
fc10249b36 NEON/simple loopfilter: avoid q4-q7 registers
very tiny speed improvement

Change-Id: I3024f120feb7275ce20bfff21af31ea8650a5a03
2013-10-09 18:17:31 +02:00
James Zern
2f09d63e30 NEON/TransformWHT: avoid q4-q7 registers
very tiny speed improvement

Change-Id: Iace78b9038af412d0a794845ff19f54afa88ccdc
2013-10-09 18:17:23 +02:00
skal
0b28d7ab08 use a macrofunc for setting NzCoeffs bits
(avoids code dup)

Change-Id: I776f065538e562673ca08f3bc43c7167d13254d9
2013-10-09 11:46:32 +02:00
skal
f9bbc2a034 Special-case sparse transform
If the number of non-zero coeffs is <= 3, use a
simplified transform for luma.

Change-Id: I78a1252704228d21720d4bc1221252c84338d9c8
2013-10-08 22:05:38 +02:00
Pascal Massimino
6a8c0eb718 Merge "small optimization in segment-smoothing loop" 2013-10-08 04:13:37 -07:00