Average3 created a slowdown of 1-2% in lossless decoding.
Average4 created a slowdown of 2-3% in lossless decoding.
Change-Id: Ic2e62cdd83fc897887ec2bf41ea7cadbada84fe5
...instead of the pointers stored in the array.
Should be faster (inlined) and safer.
Also: suffix explicitly the functions with _SSE2
Change-Id: Ie7de4b8876caea15067fdbe44abfedd72b299a90
Before, a first thread could enter VP8LDspInitSSE2, set
VP8LPredictorsAdd to an SSE2 version BEFORE another thread
would do the memcpy from VP8LPredictorsAdd to VP8LPredictorsAdd_C
thus leading to a C version actually being the SSE2 one (which
would then create an infinite recursion in the SSE2 predictors
at execution).
Change-Id: I224f4ceab31d38f77a1375a7e2636a6014080e3a
it actually disables the disposal / blending method
and just displays the raw delta values.
Useful for debugging.
TODO: Outline the refreshed area with a drawn rectangle?
Change-Id: I6f8cddd0aad8b953cff78a693ae7e8c31def010c
Benchmarks from vrabaud@:
8BIT/GRAY corpus speed: faster: -4.3 % , corpus size: unchanged
skal/sources_png_skal corpus speed: faster: -5.2 % , corpus size: unchanged
images/png_rgb corpus speed: faster: -5.1 % , corpus size: unchanged
images/lpcb corpus speed: unchanged, corpus size: unchanged
images/png_big corpus speed: faster: -1.7 % , corpus size: unchanged
images/png_doc corpus speed: unchanged, corpus size: unchanged
images/png_1bit corpus speed: faster: -1.2 % , corpus size: unchanged
images/jpeg_small corpus speed: unchanged, corpus size: unchanged
images/icip_core1 corpus speed: unchanged, corpus size: unchanged
images/png_gray corpus speed: faster: -2.5 % , corpus size: unchanged
images/jpeg_high_quality corpus speed: faster: -4.0 % , corpus size: unchanged
images/jpeg corpus speed: faster: -2.3 % , corpus size: unchanged
images/png_translucent corpus speed: faster: -2.8 % , corpus size: unchanged
images/gif corpus speed: faster: -1.4 % , corpus size: unchanged
images/png_opaque corpus speed: faster: -2.8 % , corpus size: unchanged
images/png_rgb_opaque corpus speed: unchanged, corpus size: unchanged
images/png_indexed corpus speed: faster: -2.0 % , corpus size: unchanged
images/all corpus speed: faster: -1.5 % , corpus size: unchanged
images/png_small corpus speed: unchanged, corpus size: unchanged
images/png corpus speed: unchanged, corpus size: unchanged
images/gif_still corpus speed: faster: -1.6 % , corpus size: unchanged
Change-Id: I69fe11baa188c5d32cbc77a84b8c0deae13d792b
avoiding triplets of data should make it easier to write SSE2 versions.
FilterRow() can now filter all input in one single pass
-> conversion is 15-20% faster (but still overall slow compared to -pre 0)
Change-Id: I14c3215e672fdecde7ec80394e814bdc7445019f
When try_both_modes=0 (that is: -m 0 or -m 1), and the mode is i4,
we were still sometimes falling back to (unexplored, uninitialized) i16 mode,
which resulted in a enc/dec mismatch.
This was mainly occurring for large images (when bit_limit is low enough)
We disable the fall-back by disabling bit_limit using a large MAX_COST threshold.
Change-Id: I0c60257595812bd813b239ff4c86703ddf63cbf8
The options are now:
-duration d -> set the whole animation to duration 'd'
-duration d,s -> set only frame 's' to duration 'd'
-duration d,s,e -> set only interval [s,d] to duration 'd'
+ style fix
Change-Id: I72e95282d520146f76696666f44280ad9506affa
this will force a constant duration for an interval of frames
in an animation.
Notes:
a) '-duration [...]' can be repeated as many times as needed.
b) intervals are taken into account in option order. If they overlap, values will be overwritten.
c) 'start' and 'end' can be omitted, but not the duration value.
d) 'end' can be equal to '0', in which case it means 'last frame'
e) single-image files are untouched (ie. not turned into an animation file).
Some example usage:
webpmux -duration 150 in.webp -o out.webp
webpmux -duration 33,10,0 in.webp -o out.webp
webpmux -duration 200,2 -duration 150,0,50 in.webp -o out.webp
Change-Id: I9b595dafa77f9221bacd080be7858b1457f54636
the min-distortion was quite too low. And we were also
considering the fully skipped macroblocks (nz=0) in the stats.
We need to have at least *some* non-zero dc coeffs (nz=0x100XXXX).
Fix also two typos in StoreMaxDelta: the v0/v1 comparison was wrong,
and the DCs[] coeffs are actually already in ZigZag order.
Change-Id: I602aaa74b36f7ce80017e506212c7d6fd9deba1f
make this function return success/failure.
an empty map or out of bounds read is treated as an error.
BUG=webp:316
Change-Id: Ic8651836915ea4dd8e0dc81ca8d5d3f247be1ff8
This reverts commit f048d38d38.
the 'len' in Remap refers to the src[] not the colormap; this change
breaks valid files
BUG=webp:316
Change-Id: I1ed40075c2194df91d345cb6f29619b1f5cc96fc
Roughly, if both the source and the reference areas are
darker too dark (R/G/B <= ~6), they are ignored.
One caveat: SSIM calculation won't work for U/V planes,
which are 128-centered and not related to luminance.
But WebPPlaneDistortion() enforces the conversion to RGB,
if needed.
Change-Id: I586c2579c475583b8c90c5baefd766b1d5aea591
Make WebPPictureDistortion() only compute distortion on A/R/G/B planes, not Y/U/V(A).
(not just for SSIM, but PSNR too).
This is to avoid problems with using SSIM on U/V channels.
If Y/U/V distortion is needed, one can always use WebPPlaneDistortion() individually.
Change-Id: If8bc9c3ac12a8d2220f03224694fc389b16b7da9
use a compile check on a separate file to avoid assuming using
arm_neon.h is safe to use without flags when just the file itself is
self-contained with GCC target pragmas.
BUG=webp:313
Change-Id: I48f92ae3e6e4a9468ea5b937c80a89ee40b2dcfd