This will well isolate contributions for original code,
generated code and SIMD (especially for Android).
Change-Id: Ie47664decc7f43c2f57260a72cab951c347281a7
(cherry picked from commit 6c62841076)
brings the final libwebp.so size down 16/20K with arm64/armv7 builds
using ndk-r13
Change-Id: I20d8aba61d6b692b0fc32f4b271e2f9872f03c28
(cherry picked from commit de568abfdb)
quiets a -Wimplicit-function-declaration with some configurations of gcc
(-std=c99).
_POSIX_C_SOURCE is preferred over _BSD_SOURCE with newer versions of
glibc
Change-Id: I378bffb13ba52ff5c4bad1433090dcc387e5d507
(cherry picked from commit bfab894739)
The image is scaled to fit the whole viewport.
Avoid some oddities with offsets, etc.
removes some TODO.
Change-Id: I52fae9ca80a2feed234f32261c7f6358d7594e21
(cherry picked from commit 9310d19258)
max_i4_header_bits_ could drop to zero for difficult image and trigger
a loop. Surprisingly, StatLoop() didn't have this bug.
Change-Id: Idc0f9eadef30a2b2f02041b994f25def30901e36
(cherry picked from commit 21e7537abe)
Pick the mode with the smallest alpha.
It only affects m0, in which case the mode decision is not re-examined
later in VP8Decimate(). Tests on some natural content png images show
PSNR increase as well as visual quality improvement.
Change-Id: Iea997e718cd7477160fa05eb7cfb35f4cec2fa9a
(cherry picked from commit 1377ac2ec1)
use a compile check on a separate file to avoid assuming using
arm_neon.h is safe to use without flags when just the file itself is
self-contained with GCC target pragmas.
BUG=webp:313
Change-Id: I48f92ae3e6e4a9468ea5b937c80a89ee40b2dcfd
(cherry picked from commit 0104d730bf)
Average3 created a slowdown of 1-2% in lossless decoding.
Average4 created a slowdown of 2-3% in lossless decoding.
Change-Id: Ic2e62cdd83fc897887ec2bf41ea7cadbada84fe5
...instead of the pointers stored in the array.
Should be faster (inlined) and safer.
Also: suffix explicitly the functions with _SSE2
Change-Id: Ie7de4b8876caea15067fdbe44abfedd72b299a90
Before, a first thread could enter VP8LDspInitSSE2, set
VP8LPredictorsAdd to an SSE2 version BEFORE another thread
would do the memcpy from VP8LPredictorsAdd to VP8LPredictorsAdd_C
thus leading to a C version actually being the SSE2 one (which
would then create an infinite recursion in the SSE2 predictors
at execution).
Change-Id: I224f4ceab31d38f77a1375a7e2636a6014080e3a
it actually disables the disposal / blending method
and just displays the raw delta values.
Useful for debugging.
TODO: Outline the refreshed area with a drawn rectangle?
Change-Id: I6f8cddd0aad8b953cff78a693ae7e8c31def010c
Benchmarks from vrabaud@:
8BIT/GRAY corpus speed: faster: -4.3 % , corpus size: unchanged
skal/sources_png_skal corpus speed: faster: -5.2 % , corpus size: unchanged
images/png_rgb corpus speed: faster: -5.1 % , corpus size: unchanged
images/lpcb corpus speed: unchanged, corpus size: unchanged
images/png_big corpus speed: faster: -1.7 % , corpus size: unchanged
images/png_doc corpus speed: unchanged, corpus size: unchanged
images/png_1bit corpus speed: faster: -1.2 % , corpus size: unchanged
images/jpeg_small corpus speed: unchanged, corpus size: unchanged
images/icip_core1 corpus speed: unchanged, corpus size: unchanged
images/png_gray corpus speed: faster: -2.5 % , corpus size: unchanged
images/jpeg_high_quality corpus speed: faster: -4.0 % , corpus size: unchanged
images/jpeg corpus speed: faster: -2.3 % , corpus size: unchanged
images/png_translucent corpus speed: faster: -2.8 % , corpus size: unchanged
images/gif corpus speed: faster: -1.4 % , corpus size: unchanged
images/png_opaque corpus speed: faster: -2.8 % , corpus size: unchanged
images/png_rgb_opaque corpus speed: unchanged, corpus size: unchanged
images/png_indexed corpus speed: faster: -2.0 % , corpus size: unchanged
images/all corpus speed: faster: -1.5 % , corpus size: unchanged
images/png_small corpus speed: unchanged, corpus size: unchanged
images/png corpus speed: unchanged, corpus size: unchanged
images/gif_still corpus speed: faster: -1.6 % , corpus size: unchanged
Change-Id: I69fe11baa188c5d32cbc77a84b8c0deae13d792b
avoiding triplets of data should make it easier to write SSE2 versions.
FilterRow() can now filter all input in one single pass
-> conversion is 15-20% faster (but still overall slow compared to -pre 0)
Change-Id: I14c3215e672fdecde7ec80394e814bdc7445019f
When try_both_modes=0 (that is: -m 0 or -m 1), and the mode is i4,
we were still sometimes falling back to (unexplored, uninitialized) i16 mode,
which resulted in a enc/dec mismatch.
This was mainly occurring for large images (when bit_limit is low enough)
We disable the fall-back by disabling bit_limit using a large MAX_COST threshold.
Change-Id: I0c60257595812bd813b239ff4c86703ddf63cbf8
The options are now:
-duration d -> set the whole animation to duration 'd'
-duration d,s -> set only frame 's' to duration 'd'
-duration d,s,e -> set only interval [s,d] to duration 'd'
+ style fix
Change-Id: I72e95282d520146f76696666f44280ad9506affa