Commit Graph

4552 Commits

Author SHA1 Message Date
31fe11a57a fix infinite loop in case of PARTITION0 overflow
max_i4_header_bits_ could drop to zero for difficult image and trigger
a loop. Surprisingly, StatLoop() didn't have this bug.

Change-Id: Idc0f9eadef30a2b2f02041b994f25def30901e36
(cherry picked from commit 21e7537abe)
2016-12-07 18:30:39 -08:00
532215dd29 Change the rule of picking UV mode in MBAnalyzeBestUVMode()
Pick the mode with the smallest alpha.
It only affects m0, in which case the mode decision is not re-examined
later in VP8Decimate(). Tests on some natural content png images show
PSNR increase as well as visual quality improvement.

Change-Id: Iea997e718cd7477160fa05eb7cfb35f4cec2fa9a
(cherry picked from commit 1377ac2ec1)
2016-12-07 18:30:33 -08:00
9c75dbd39c cwebp.1: improve some grammar
Change-Id: Id849d7e0d7573f5b8d3b2e807d95e9c628f03b1e
(cherry picked from commit 5b46f7fc80)
2016-12-07 18:30:28 -08:00
af2e05cbdf vwebp: Clear previous frame when a key triggers a redraw
otherwise, transparent areas were accumulating.

Change-Id: I066a96a2bcf0cac750b3df0c02229542b1ed3473
(cherry picked from commit c0a27fd2af)
2016-12-07 18:30:23 -08:00
26ffa2962b Add descriptions of default configuration in help info.
Change-Id: I43188fab5f57bd45aa3e564df52e36cc37b1bb2f
(cherry picked from commit 74f6f9e793)
2016-12-07 18:30:17 -08:00
7416280d75 Fix an unsigned integer overflow error in enc/cost.h
Change-Id: I9774b59c417c185f09a61a115364b9642976a100
(cherry picked from commit 0b2c58a91c)
2016-12-07 18:29:51 -08:00
13cf1d2e41 Do token recording and counting in a single loop
Change-Id: I8afd3c486b210bd67888de03e91dde7f78276f89
(cherry picked from commit 0c0fb83211)
2016-12-07 18:29:44 -08:00
eb9a4b97c5 Reset segment id if we decide not to update segment map
This avoids potential encoder and decoder mismatch.

Change-Id: I5282d3e168afc6193033ad3fce8fbc35618ab2f5
(cherry picked from commit 386e4ba2f0)
2016-12-07 18:25:06 -08:00
42ebe3b783 configure: fix NEON flag detection under gcc 6
use a compile check on a separate file to avoid assuming using
arm_neon.h is safe to use without flags when just the file itself is
self-contained with GCC target pragmas.

BUG=webp:313

Change-Id: I48f92ae3e6e4a9468ea5b937c80a89ee40b2dcfd
(cherry picked from commit 0104d730bf)
2016-12-07 18:23:55 -08:00
76ebbfff28 NEON: implement predictor #13
~5-7% faster

Change-Id: I3361b0bbc978f3721168db15778a67337309c18a
2016-12-07 14:58:49 -08:00
95b12a08ae Merge "Revert Average3 and Average4" 2016-12-07 15:38:56 +00:00
54ab2e758f Revert Average3 and Average4
Average3 created a slowdown of 1-2% in lossless decoding.
Average4 created a slowdown of 2-3% in lossless decoding.

Change-Id: Ic2e62cdd83fc897887ec2bf41ea7cadbada84fe5
2016-12-07 15:32:33 +01:00
fe12330c81 3-5% faster Predictor #5, #6, #7 and #10 for NEON
Change-Id: Ica48c7088d4384f0888dd171a47e68ebd25729b2
2016-12-07 15:25:33 +01:00
fbfb3bef7b ~2% faster predictor #10 for NEON
Change-Id: Icd9cff90c227d702c3ba319131996c5475094520
2016-12-06 13:47:35 +00:00
d4b7d801db lossless_sse2: use the local functions
...instead of the pointers stored in the array.
Should be faster (inlined) and safer.

Also: suffix explicitly the functions with _SSE2

Change-Id: Ie7de4b8876caea15067fdbe44abfedd72b299a90
2016-12-06 14:20:41 +01:00
a5e3b22574 Lossless decoder SSE2 improvements.
Change-Id: Ia901014ac63156a2e278b81e035256c30bdf8706
2016-12-06 13:45:09 +01:00
58a1f124c2 ~2% faster predictor #12 in NEON.
Change-Id: I6772bb865d0f72720a65561eb55028e538df236d
2016-12-06 10:24:27 +01:00
906c3b6392 Merge "Implement lossless transforms in NEON." 2016-12-03 16:55:14 +00:00
d23abe4e9f Implement lossless transforms in NEON.
Change-Id: I2172b1a763eb9dfe25d2b9bf1fb6501d7e192e55
2016-12-03 11:20:22 +00:00
2e6cb6f34e Give more flexibility to the predictor generating macro.
Change-Id: Ia651afa8322cb5c5ae87128340d05245c0f6a900
2016-12-02 12:33:12 -08:00
28e0bb7088 Merge "Fix race condition in multi-threading initialization." 2016-12-02 17:45:10 +00:00
647045305a Fix race condition in multi-threading initialization.
Before, a first thread could enter VP8LDspInitSSE2, set
VP8LPredictorsAdd to an SSE2 version BEFORE another thread
would do the memcpy from VP8LPredictorsAdd to VP8LPredictorsAdd_C
thus leading to a C version actually being the SSE2 one (which
would then create an infinite recursion in the SSE2 predictors
at execution).

Change-Id: I224f4ceab31d38f77a1375a7e2636a6014080e3a
2016-12-02 18:28:57 +01:00
bded7848ea img2webp: fix default -lossless value and use pic.argb=1
Change-Id: I0e5350928c1e58e0901303ee979fb4587f25d6bc
2016-12-02 13:19:02 +01:00
0e61a5134a Merge "img2webp: convert a sequence of images to an animated webp" 2016-12-02 11:59:26 +00:00
1cc79e92ac AnimEncoder: Correctly skip a frame when sub-rectangle is empty.
Change-Id: I0d288bd9561b48cf5a1eae92a1b7106ba44c664e
2016-12-02 11:50:13 +01:00
03f40955a3 img2webp: convert a sequence of images to an animated webp
Usage:

  img2webp [file-level options] [image files...] [per-frame options...]

File-level options (only used at the start of compression):
 -min_size ............ minimize size
 -loop <int> .......... loop count (default: 0, = infinite loop)
 -kmax <int> .......... maximum number of frame between key-frames
                        (0=only keyframes)
 -kmin <int> .......... minimum number of frame between key-frames
                        (0=disable key-frames altogether)
 -mixed ............... use mixed lossy/lossless automatic mode
 -v ................... verbose mode
 -h ................... this help

Per-frame options (only used for subsequent images input):
 -d <int> ............. frame duration in ms (default: 100)
 -lossless  ........... use lossless mode (default)
 -lossy ... ........... use lossy mode
 -q <float> ........... quality
 -m <int> ............. method to use

example: img2webp -loop 2 in0.png -lossy in1.jpg
                  -d 80 in2.tiff -o out.webp

Change-Id: I23771b90eaf0660f420d7ffd304e704155386286
2016-12-02 11:44:17 +01:00
ea72cd60cb add missing 'extern' keyword for predictor dcl
Change-Id: Ibf3db9b6dae91e53524c31cdfccf4678b3fa1135
2016-12-01 08:15:14 +01:00
67879e6d48 SSE implementation of decoding predictors.
Change-Id: I5c9ae63afc98013cb45ce8a91f051203ac68402c
2016-11-30 12:00:07 +01:00
34aee99026 Merge "vwebp: make 'd' key toggle the debugging of fragments" 2016-11-29 12:36:34 +00:00
a41296aef5 Fix potentially uninitialized value.
Change-Id: I721695e22474992db3094942b1ad4754ae7c0a02
2016-11-29 13:19:32 +01:00
c85adb33d2 vwebp: make 'd' key toggle the debugging of fragments
it actually disables the disposal / blending method
and just displays the raw delta values.
Useful for debugging.
TODO: Outline the refreshed area with a drawn rectangle?

Change-Id: I6f8cddd0aad8b953cff78a693ae7e8c31def010c
2016-11-28 19:47:23 +00:00
4239a1489c Make the lossless predictors work on a batch of pixels.
Change-Id: Ieaee34f1f97c375b9e97ef7e9df60aed353dffa1
2016-11-28 17:12:10 +01:00
bc18ebad2e fix extra 'const's in signatures
Change-Id: Ie433d0defbc0c6feae2eb2f11e70082f1affada8
2016-11-25 09:45:52 +01:00
71e2f5cadf Remove memcpy in lossless decoding.
Change-Id: Iba694b306486d67764e2fc5576c98a974c9b886c
2016-11-24 17:45:24 +01:00
7474d46e45 Do not use a register array in SSE.
Change-Id: I79cf95bdac1164fc4de899828e9380c23df8d141
2016-11-24 13:06:44 +01:00
67748b41db Improve latency of FTransform2.
Benchmarks from vrabaud@:
8BIT/GRAY                corpus speed: faster: -4.3 % , corpus size: unchanged
skal/sources_png_skal    corpus speed: faster: -5.2 % , corpus size: unchanged
images/png_rgb           corpus speed: faster: -5.1 % , corpus size: unchanged
images/lpcb              corpus speed: unchanged, corpus size: unchanged
images/png_big           corpus speed: faster: -1.7 % , corpus size: unchanged
images/png_doc           corpus speed: unchanged, corpus size: unchanged
images/png_1bit          corpus speed: faster: -1.2 % , corpus size: unchanged
images/jpeg_small        corpus speed: unchanged, corpus size: unchanged
images/icip_core1        corpus speed: unchanged, corpus size: unchanged
images/png_gray          corpus speed: faster: -2.5 % , corpus size: unchanged
images/jpeg_high_quality corpus speed: faster: -4.0 % , corpus size: unchanged
images/jpeg              corpus speed: faster: -2.3 % , corpus size: unchanged
images/png_translucent   corpus speed: faster: -2.8 % , corpus size: unchanged
images/gif               corpus speed: faster: -1.4 % , corpus size: unchanged
images/png_opaque        corpus speed: faster: -2.8 % , corpus size: unchanged
images/png_rgb_opaque    corpus speed: unchanged, corpus size: unchanged
images/png_indexed       corpus speed: faster: -2.0 % , corpus size: unchanged
images/all               corpus speed: faster: -1.5 % , corpus size: unchanged
images/png_small         corpus speed: unchanged, corpus size: unchanged
images/png               corpus speed: unchanged, corpus size: unchanged
images/gif_still         corpus speed: faster: -1.6 % , corpus size: unchanged

Change-Id: I69fe11baa188c5d32cbc77a84b8c0deae13d792b
2016-11-24 07:09:50 +00:00
16951b1905 Merge "Provide an SSE implementation of ConvertBGRAToRGB" 2016-11-23 16:37:35 +00:00
6540cd0eeb Provide an SSE implementation of ConvertBGRAToRGB
Change-Id: Ida11b079077a47fe3b92754f08aa30d81c301fcf
2016-11-23 16:25:51 +01:00
de568abfdb Android.mk: use -fvisibility=hidden
brings the final libwebp.so size down 16/20K with arm64/armv7 builds
using ndk-r13

Change-Id: I20d8aba61d6b692b0fc32f4b271e2f9872f03c28
2016-11-18 19:24:09 -08:00
3c2a61b099 remove some unneeded casts
Change-Id: Ie68788c77f016ed11446a55142b1bd8d96261452
2016-11-16 22:54:40 -08:00
9ac063c37f add dsp functions for SmartYUV
+ SSE2 implementation

Change-Id: I5cfdb62d68b5a95899241a097d3a2f697fbc590e
2016-11-16 14:23:06 +00:00
22efabddb4 Merge "smart_yuv: switch to planar instead of packed r/g/b processing" 2016-11-15 14:55:17 +00:00
1d6e7bf39f smart_yuv: switch to planar instead of packed r/g/b processing
avoiding triplets of data should make it easier to write SSE2 versions.

FilterRow() can now filter all input in one single pass
-> conversion is 15-20% faster (but still overall slow compared to -pre 0)

Change-Id: I14c3215e672fdecde7ec80394e814bdc7445019f
2016-11-15 14:51:34 +01:00
0a3838ca77 fix bug in RefineUsingDistortion()
When try_both_modes=0 (that is: -m 0 or -m 1), and the mode is i4,
we were still sometimes falling back to (unexplored, uninitialized) i16 mode,
which resulted in a enc/dec mismatch.
This was mainly occurring for large images (when bit_limit is low enough)

We disable the fall-back by disabling bit_limit using a large MAX_COST threshold.

Change-Id: I0c60257595812bd813b239ff4c86703ddf63cbf8
2016-11-12 02:15:28 -08:00
c0699515af webpmux -duration: set default 'end' value equal to 'start'
The options are now:
  -duration d     -> set the whole animation to duration 'd'
  -duration d,s   -> set only frame 's' to duration 'd'
  -duration d,s,e -> set only interval [s,d] to duration 'd'

+ style fix

Change-Id: I72e95282d520146f76696666f44280ad9506affa
2016-11-11 17:57:56 +00:00
83cbfa09a1 Import: use relative pointer offsets
avoids int rollover when working with large input

BUG=webp:312

Change-Id: I6ad9f93b6c4b665c559bff87716a7b847f66a20d
(cherry picked from commit 342e15f0ce)
2016-11-09 15:50:57 -08:00
a1ade40ed8 PreprocessARGB: use relative pointer offsets
avoids int rollover when working with large input

BUG=webp:312

Change-Id: I2881bec2884b550c966108beeff1bf0d8ef9f76b
(cherry picked from commit 1147ab4ee7)
2016-11-09 15:24:16 -08:00
fd4d090fd1 ConvertWRGBToYUV: use relative pointer offsets
avoids int rollover when working with large input

BUG=webp:312

Change-Id: I693cbb295df9cf94aa89294b19c0496bdbe84d18
(cherry picked from commit de9fa5074e)
2016-11-09 12:57:03 -08:00
9daad4598b ImportYUVAFromRGBA: use relative pointer offsets
avoids int rollover when working with large input

BUG=webp:312

Change-Id: I3d7b689be8d5751248a82d1021243d80d3f67203
(cherry picked from commit deb1b83199)
2016-11-09 12:56:49 -08:00
f90c60d129 Merge "add a "-duration duration,start,end" option to webpmux" 2016-11-09 19:05:12 +00:00