3333 Commits

Author SHA1 Message Date
Vincent Rabaud
95b12a08ae Merge "Revert Average3 and Average4" 2016-12-07 15:38:56 +00:00
Vincent Rabaud
54ab2e758f Revert Average3 and Average4
Average3 created a slowdown of 1-2% in lossless decoding.
Average4 created a slowdown of 2-3% in lossless decoding.

Change-Id: Ic2e62cdd83fc897887ec2bf41ea7cadbada84fe5
2016-12-07 15:32:33 +01:00
Pascal Massimino
fe12330c81 3-5% faster Predictor #5, #6, #7 and #10 for NEON
Change-Id: Ica48c7088d4384f0888dd171a47e68ebd25729b2
2016-12-07 15:25:33 +01:00
Pascal Massimino
fbfb3bef7b ~2% faster predictor #10 for NEON
Change-Id: Icd9cff90c227d702c3ba319131996c5475094520
2016-12-06 13:47:35 +00:00
Pascal Massimino
d4b7d801db lossless_sse2: use the local functions
...instead of the pointers stored in the array.
Should be faster (inlined) and safer.

Also: explicitly suffix the functions with _SSE2

Change-Id: Ie7de4b8876caea15067fdbe44abfedd72b299a90
2016-12-06 14:20:41 +01:00
Vincent Rabaud
a5e3b22574 Lossless decoder SSE2 improvements.
Change-Id: Ia901014ac63156a2e278b81e035256c30bdf8706
2016-12-06 13:45:09 +01:00
Pascal Massimino
58a1f124c2 ~2% faster predictor #12 in NEON.
Change-Id: I6772bb865d0f72720a65561eb55028e538df236d
2016-12-06 10:24:27 +01:00
Pascal Massimino
906c3b6392 Merge "Implement lossless transforms in NEON." 2016-12-03 16:55:14 +00:00
Vincent Rabaud
d23abe4e9f Implement lossless transforms in NEON.
Change-Id: I2172b1a763eb9dfe25d2b9bf1fb6501d7e192e55
2016-12-03 11:20:22 +00:00
Vincent Rabaud
2e6cb6f34e Give more flexibility to the predictor generating macro.
Change-Id: Ia651afa8322cb5c5ae87128340d05245c0f6a900
2016-12-02 12:33:12 -08:00
Vincent Rabaud
28e0bb7088 Merge "Fix race condition in multi-threading initialization." 2016-12-02 17:45:10 +00:00
Vincent Rabaud
647045305a Fix race condition in multi-threading initialization.
Before, a first thread could enter VP8LDspInitSSE2, set
VP8LPredictorsAdd to an SSE2 version BEFORE another thread
would do the memcpy from VP8LPredictorsAdd to VP8LPredictorsAdd_C
thus leading to a C version actually being the SSE2 one (which
would then create an infinite recursion in the SSE2 predictors
at execution).
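
The interleaving described above can be replayed deterministically in a single thread. The sketch below is not the libwebp code: the table names `PredictorsAdd` / `PredictorsAdd_C`, the two toy predictors and `InitSIMD()` are hypothetical stand-ins. It only illustrates why copying the C table out of the live table is fragile, and one possible way around it (filling both tables directly from the C functions); the fix actually applied in this commit may differ.

```c
#include <stdint.h>
#include <string.h>

typedef uint32_t (*PredictorFunc)(uint32_t left, uint32_t top);

#define NUM_PREDICTORS 2

/* Live dispatch table and its plain-C reference copy (hypothetical names). */
static PredictorFunc PredictorsAdd[NUM_PREDICTORS];
static PredictorFunc PredictorsAdd_C[NUM_PREDICTORS];

static uint32_t Predictor0_C(uint32_t left, uint32_t top) {
  (void)left;
  (void)top;
  return 0xff000000u;
}

static uint32_t Predictor1_C(uint32_t left, uint32_t top) {
  (void)top;
  return left;
}

/* Stand-in for a SIMD predictor that defers leftover pixels to the C table. */
static uint32_t Predictor1_SIMD(uint32_t left, uint32_t top) {
  return PredictorsAdd_C[1](left, top);
}

/* Stand-in for the SIMD init: overrides entries of the live table. */
static void InitSIMD(void) { PredictorsAdd[1] = Predictor1_SIMD; }

static void InitCTable(PredictorFunc table[NUM_PREDICTORS]) {
  table[0] = Predictor0_C;
  table[1] = Predictor1_C;
}

/* Fragile order: InitSIMD() runs before the copy (as another thread could),
 * so the "C" table captures the SIMD pointer. Calling Predictor1_SIMD would
 * then recurse into itself forever. */
static void InitRacyOrder(void) {
  InitCTable(PredictorsAdd);
  InitSIMD(); /* simulates the other thread winning the race */
  memcpy(PredictorsAdd_C, PredictorsAdd, sizeof(PredictorsAdd_C));
}

/* Safer order: the C table is never read from the live table. */
static void InitSafeOrder(void) {
  InitCTable(PredictorsAdd_C);
  InitCTable(PredictorsAdd);
  InitSIMD();
}
```

In real multi-threaded code the whole initialization would still have to run under a lock or a once-guard; the sketch only removes the dependency of the C table on the mutable one.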

Change-Id: I224f4ceab31d38f77a1375a7e2636a6014080e3a
2016-12-02 18:28:57 +01:00
Pascal Massimino
bded7848ea img2webp: fix default -lossless value and use pic.argb=1
Change-Id: I0e5350928c1e58e0901303ee979fb4587f25d6bc
2016-12-02 13:19:02 +01:00
Pascal Massimino
0e61a5134a Merge "img2webp: convert a sequence of images to an animated webp" 2016-12-02 11:59:26 +00:00
Hui Su
1cc79e92ac AnimEncoder: Correctly skip a frame when sub-rectangle is empty.
Change-Id: I0d288bd9561b48cf5a1eae92a1b7106ba44c664e
2016-12-02 11:50:13 +01:00
Pascal Massimino
03f40955a3 img2webp: convert a sequence of images to an animated webp
Usage:

  img2webp [file-level options] [image files...] [per-frame options...]

File-level options (only used at the start of compression):
 -min_size ............ minimize size
 -loop <int> .......... loop count (default: 0, = infinite loop)
 -kmax <int> .......... maximum number of frames between key-frames
                        (0=only keyframes)
 -kmin <int> .......... minimum number of frames between key-frames
                        (0=disable key-frames altogether)
 -mixed ............... use mixed lossy/lossless automatic mode
 -v ................... verbose mode
 -h ................... this help

Per-frame options (only used for subsequent images input):
 -d <int> ............. frame duration in ms (default: 100)
 -lossless ............ use lossless mode (default)
 -lossy ............... use lossy mode
 -q <float> ........... quality
 -m <int> ............. method to use

example: img2webp -loop 2 in0.png -lossy in1.jpg
                  -d 80 in2.tiff -o out.webp

Change-Id: I23771b90eaf0660f420d7ffd304e704155386286
2016-12-02 11:44:17 +01:00
Pascal Massimino
ea72cd60cb add missing 'extern' keyword for predictor dcl
Change-Id: Ibf3db9b6dae91e53524c31cdfccf4678b3fa1135
2016-12-01 08:15:14 +01:00
Vincent Rabaud
67879e6d48 SSE implementation of decoding predictors.
Change-Id: I5c9ae63afc98013cb45ce8a91f051203ac68402c
2016-11-30 12:00:07 +01:00
Pascal Massimino
34aee99026 Merge "vwebp: make 'd' key toggle the debugging of fragments" 2016-11-29 12:36:34 +00:00
Vincent Rabaud
a41296aef5 Fix potentially uninitialized value.
Change-Id: I721695e22474992db3094942b1ad4754ae7c0a02
2016-11-29 13:19:32 +01:00
Pascal Massimino
c85adb33d2 vwebp: make 'd' key toggle the debugging of fragments
it actually disables the disposal / blending method
and just displays the raw delta values.
Useful for debugging.
TODO: Outline the refreshed area with a drawn rectangle?

Change-Id: I6f8cddd0aad8b953cff78a693ae7e8c31def010c
2016-11-28 19:47:23 +00:00
Vincent Rabaud
4239a1489c Make the lossless predictors work on a batch of pixels.
Change-Id: Ieaee34f1f97c375b9e97ef7e9df60aed353dffa1
2016-11-28 17:12:10 +01:00
Pascal Massimino
bc18ebad2e fix extra 'const's in signatures
Change-Id: Ie433d0defbc0c6feae2eb2f11e70082f1affada8
2016-11-25 09:45:52 +01:00
Vincent Rabaud
71e2f5cadf Remove memcpy in lossless decoding.
Change-Id: Iba694b306486d67764e2fc5576c98a974c9b886c
2016-11-24 17:45:24 +01:00
Vincent Rabaud
7474d46e45 Do not use a register array in SSE.
Change-Id: I79cf95bdac1164fc4de899828e9380c23df8d141
2016-11-24 13:06:44 +01:00
Owen Rodley
67748b41db Improve latency of FTransform2.
Benchmarks from vrabaud@:
8BIT/GRAY                corpus speed: faster: -4.3 % , corpus size: unchanged
skal/sources_png_skal    corpus speed: faster: -5.2 % , corpus size: unchanged
images/png_rgb           corpus speed: faster: -5.1 % , corpus size: unchanged
images/lpcb              corpus speed: unchanged, corpus size: unchanged
images/png_big           corpus speed: faster: -1.7 % , corpus size: unchanged
images/png_doc           corpus speed: unchanged, corpus size: unchanged
images/png_1bit          corpus speed: faster: -1.2 % , corpus size: unchanged
images/jpeg_small        corpus speed: unchanged, corpus size: unchanged
images/icip_core1        corpus speed: unchanged, corpus size: unchanged
images/png_gray          corpus speed: faster: -2.5 % , corpus size: unchanged
images/jpeg_high_quality corpus speed: faster: -4.0 % , corpus size: unchanged
images/jpeg              corpus speed: faster: -2.3 % , corpus size: unchanged
images/png_translucent   corpus speed: faster: -2.8 % , corpus size: unchanged
images/gif               corpus speed: faster: -1.4 % , corpus size: unchanged
images/png_opaque        corpus speed: faster: -2.8 % , corpus size: unchanged
images/png_rgb_opaque    corpus speed: unchanged, corpus size: unchanged
images/png_indexed       corpus speed: faster: -2.0 % , corpus size: unchanged
images/all               corpus speed: faster: -1.5 % , corpus size: unchanged
images/png_small         corpus speed: unchanged, corpus size: unchanged
images/png               corpus speed: unchanged, corpus size: unchanged
images/gif_still         corpus speed: faster: -1.6 % , corpus size: unchanged

Change-Id: I69fe11baa188c5d32cbc77a84b8c0deae13d792b
2016-11-24 07:09:50 +00:00
Vincent Rabaud
16951b1905 Merge "Provide an SSE implementation of ConvertBGRAToRGB" 2016-11-23 16:37:35 +00:00
Vincent Rabaud
6540cd0eeb Provide an SSE implementation of ConvertBGRAToRGB
Change-Id: Ida11b079077a47fe3b92754f08aa30d81c301fcf
2016-11-23 16:25:51 +01:00
James Zern
de568abfdb Android.mk: use -fvisibility=hidden
brings the final libwebp.so size down 16/20K with arm64/armv7 builds
using ndk-r13

Change-Id: I20d8aba61d6b692b0fc32f4b271e2f9872f03c28
2016-11-18 19:24:09 -08:00
Pascal Massimino
3c2a61b099 remove some unneeded casts
Change-Id: Ie68788c77f016ed11446a55142b1bd8d96261452
2016-11-16 22:54:40 -08:00
Pascal Massimino
9ac063c37f add dsp functions for SmartYUV
+ SSE2 implementation

Change-Id: I5cfdb62d68b5a95899241a097d3a2f697fbc590e
2016-11-16 14:23:06 +00:00
Pascal Massimino
22efabddb4 Merge "smart_yuv: switch to planar instead of packed r/g/b processing" 2016-11-15 14:55:17 +00:00
Pascal Massimino
1d6e7bf39f smart_yuv: switch to planar instead of packed r/g/b processing
avoiding triplets of data should make it easier to write SSE2 versions.

FilterRow() can now filter all input in one single pass
-> conversion is 15-20% faster (but still overall slow compared to -pre 0)

Change-Id: I14c3215e672fdecde7ec80394e814bdc7445019f
2016-11-15 14:51:34 +01:00
Pascal Massimino
0a3838ca77 fix bug in RefineUsingDistortion()
When try_both_modes=0 (that is: -m 0 or -m 1), and the mode is i4,
we were still sometimes falling back to the (unexplored, uninitialized) i16 mode,
which resulted in an enc/dec mismatch.
This mainly occurred for large images (when bit_limit is low enough).
We disable the fall-back by disabling bit_limit using a large MAX_COST threshold.

Change-Id: I0c60257595812bd813b239ff4c86703ddf63cbf8
2016-11-12 02:15:28 -08:00
Pascal Massimino
c0699515af webpmux -duration: set default 'end' value equal to 'start'
The options are now:
  -duration d     -> set the whole animation to duration 'd'
  -duration d,s   -> set only frame 's' to duration 'd'
  -duration d,s,e -> set only interval [s,e] to duration 'd'
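
The three forms can be parsed as sketched below. This is an illustration only, not the webpmux source: the `DurationArg` struct, its `whole` flag and `ParseDurationArg()` are hypothetical names, and the rejection of negative values is an assumption. The one rule taken from this commit is that a missing 'end' defaults to 'start'.

```c
#include <limits.h>
#include <stdlib.h>

typedef struct {
  int duration; /* 'd' */
  int start;    /* 's', meaningful only when whole == 0 */
  int end;      /* 'e', meaningful only when whole == 0 */
  int whole;    /* 1 when only 'd' was given: applies to the whole animation */
} DurationArg;

/* Parses "d", "d,s" or "d,s,e". Returns 1 on success, 0 on malformed input. */
static int ParseDurationArg(const char* s, DurationArg* const out) {
  long v[3] = { 0, 0, 0 };
  int n = 0;
  int i;
  while (n < 3) {
    char* end = NULL;
    v[n] = strtol(s, &end, 10);
    if (end == s) return 0; /* no digits where a value was expected */
    ++n;
    s = end;
    if (*s != ',') break;
    ++s; /* skip the separator */
  }
  if (*s != '\0') return 0; /* trailing junk or a fourth value */
  for (i = 0; i < 3; ++i) {
    if (v[i] < 0 || v[i] > INT_MAX) return 0; /* assumption: no negatives */
  }
  out->duration = (int)v[0];
  out->whole = (n == 1);
  out->start = (int)v[1];
  /* 'end' defaults to 'start' when omitted, so "d,s" targets one frame. */
  out->end = (n == 3) ? (int)v[2] : (int)v[1];
  return 1;
}
```

For example, "200,2" yields start = end = 2, while "33,10,0" keeps end = 0, which the tool interprets elsewhere as 'last frame'.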

+ style fix

Change-Id: I72e95282d520146f76696666f44280ad9506affa
2016-11-11 17:57:56 +00:00
Urvang Joshi
f90c60d129 Merge "add a "-duration duration,start,end" option to webpmux" 2016-11-09 19:05:12 +00:00
Pascal Massimino
3f182d36f4 add a "-duration duration,start,end" option to webpmux
this will force a constant duration for an interval of frames
in an animation.
Notes:
 a) '-duration [...]' can be repeated as many times as needed.
 b) intervals are taken into account in option order. If they overlap, values will be overwritten.
 c) 'start' and 'end' can be omitted, but not the duration value.
 d) 'end' can be equal to '0', in which case it means 'last frame'
 e) single-image files are untouched (ie. not turned into an animation file).

Some example usage:
    webpmux -duration 150 in.webp -o out.webp
    webpmux -duration 33,10,0 in.webp -o out.webp
    webpmux -duration 200,2 -duration 150,0,50 in.webp -o out.webp

Change-Id: I9b595dafa77f9221bacd080be7858b1457f54636
2016-11-09 15:44:09 +01:00
James Zern
342e15f0ce Import: use relative pointer offsets
avoids int rollover when working with large input

BUG=webp:312

Change-Id: I6ad9f93b6c4b665c559bff87716a7b847f66a20d
2016-11-07 17:08:13 -08:00
James Zern
1147ab4ee7 PreprocessARGB: use relative pointer offsets
avoids int rollover when working with large input

BUG=webp:312

Change-Id: I2881bec2884b550c966108beeff1bf0d8ef9f76b
2016-11-07 17:08:06 -08:00
Pascal Massimino
e4cd4daf74 fix filtering auto-adjustment
the min-distortion was far too low. We were also
considering the fully skipped macroblocks (nz=0) in the stats.
We need to have at least *some* non-zero dc coeffs (nz=0x100XXXX).

Fix also two typos in StoreMaxDelta: the v0/v1 comparison was wrong,
and the DCs[] coeffs are actually already in ZigZag order.

Change-Id: I602aaa74b36f7ce80017e506212c7d6fd9deba1f
2016-11-07 06:43:51 -08:00
Pascal Massimino
e715285611 fix doc and code snippet for WebPINewDecoder()
Change-Id: I1a75fdf60f0b9f1816be28f22613438bfe21752b
2016-11-04 12:07:54 +01:00
James Zern
de9fa5074e ConvertWRGBToYUV: use relative pointer offsets
avoids int rollover when working with large input

BUG=webp:312

Change-Id: I693cbb295df9cf94aa89294b19c0496bdbe84d18
2016-11-04 00:35:04 -07:00
James Zern
deb1b83199 ImportYUVAFromRGBA: use relative pointer offsets
avoids int rollover when working with large input

BUG=webp:312

Change-Id: I3d7b689be8d5751248a82d1021243d80d3f67203
2016-11-04 00:34:58 -07:00
James Zern
cac9a36a23 gifdec,Remap: avoid out of bounds colormap read
make this function return success/failure.
an empty map or out of bounds read is treated as an error.
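
A minimal sketch of that contract, assuming 8-bit palette indices and 32-bit colors. `RemapPixels()` and its signature are hypothetical and not the gifdec function: the point is only that every index is checked against the colormap size and that failure is reported to the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Maps `len` palette indices in src[] to colors in dst[].
 * Returns 1 on success, 0 if the colormap is empty or an index is out of
 * bounds (dst[] may be partially written in that case). */
static int RemapPixels(const uint8_t* src, int len,
                       const uint32_t* colors, int color_count,
                       uint32_t* dst) {
  int i;
  if (colors == NULL || color_count <= 0) return 0; /* empty map */
  for (i = 0; i < len; ++i) {
    if (src[i] >= color_count) return 0; /* would read past the colormap */
    dst[i] = colors[src[i]];
  }
  return 1;
}
```

Note that `len` counts entries of src[], not of the colormap, which is the distinction the earlier reverted attempt got wrong.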

BUG=webp:316

Change-Id: Ic8651836915ea4dd8e0dc81ca8d5d3f247be1ff8
2016-11-02 15:01:10 -07:00
James Zern
4595e01fd0 Revert "gifdec,Remap: avoid out of bounds colormap read"
This reverts commit f048d38d38.

the 'len' in Remap refers to the src[] not the colormap; this change
breaks valid files

BUG=webp:316

Change-Id: I1ed40075c2194df91d345cb6f29619b1f5cc96fc
2016-11-01 21:03:56 -07:00
James Zern
fb52d4432a gifdec: make some constants unsigned
fixes implicit type conversion when dealing with color types

Change-Id: Ie4f25e14d8bb2748050db4fca25147164fc6adb4
2016-10-31 17:51:27 -07:00
James Zern
f048d38d38 gifdec,Remap: avoid out of bounds colormap read
sanitize the requested length to be read against the reported size of
the table

BUG=webp:316

Change-Id: I1c471e93ab696a9d21a0142cf1987ffcf8f55dd2
2016-10-31 12:56:04 -07:00
Pascal Massimino
31b1e34342 fix SSIM metric ... by ignoring too-dark area
Roughly, if both the source and the reference areas are
too dark (R/G/B <= ~6), they are ignored.

One caveat: SSIM calculation won't work for U/V planes,
which are 128-centered and not related to luminance.
But WebPPlaneDistortion() enforces the conversion to RGB,
if needed.

Change-Id: I586c2579c475583b8c90c5baefd766b1d5aea591
2016-10-20 15:17:55 +02:00
Pascal Massimino
2f51b614b0 introduce WebPPlaneDistortion to compute plane distortion
Make WebPPictureDistortion() only compute distortion on A/R/G/B planes, not Y/U/V(A).
(not just for SSIM, but PSNR too).

This is to avoid problems with using SSIM on U/V channels.
If Y/U/V distortion is needed, one can always use WebPPlaneDistortion() individually.

Change-Id: If8bc9c3ac12a8d2220f03224694fc389b16b7da9
2016-10-19 09:12:13 +02:00
James Zern
0104d730bf configure: fix NEON flag detection under gcc 6
use a compile check on a separate file, so that we do not assume
arm_neon.h is safe to use without flags when only the header itself is
self-contained thanks to GCC target pragmas.

BUG=webp:313

Change-Id: I48f92ae3e6e4a9468ea5b937c80a89ee40b2dcfd
2016-10-14 06:24:08 +00:00