Commit Graph

735 Commits

Author SHA1 Message Date
Vincent Rabaud
4239a1489c Make the lossless predictors work on a batch of pixels.
Change-Id: Ieaee34f1f97c375b9e97ef7e9df60aed353dffa1
2016-11-28 17:12:10 +01:00
Pascal Massimino
9ac063c37f add dsp functions for SmartYUV
+ SSE2 implementation

Change-Id: I5cfdb62d68b5a95899241a097d3a2f697fbc590e
2016-11-16 14:23:06 +00:00
Pascal Massimino
22efabddb4 Merge "smart_yuv: switch to planar instead of packed r/g/b processing"
2016-11-15 14:55:17 +00:00
Pascal Massimino
1d6e7bf39f smart_yuv: switch to planar instead of packed r/g/b processing
avoiding triplets of data should make it easier to write SSE2 versions.

FilterRow() can now filter all input in one single pass
-> conversion is 15-20% faster (but still overall slow compared to -pre 0)

Change-Id: I14c3215e672fdecde7ec80394e814bdc7445019f
2016-11-15 14:51:34 +01:00
Pascal Massimino
0a3838ca77 fix bug in RefineUsingDistortion()
When try_both_modes=0 (that is: -m 0 or -m 1), and the mode is i4,
we were still sometimes falling back to (unexplored, uninitialized) i16 mode,
which resulted in an enc/dec mismatch.
This was mainly occurring for large images (when bit_limit is low enough).

We disable the fall-back by disabling bit_limit using a large MAX_COST threshold.

Change-Id: I0c60257595812bd813b239ff4c86703ddf63cbf8
2016-11-12 02:15:28 -08:00
James Zern
342e15f0ce Import: use relative pointer offsets
avoids int rollover when working with large input

BUG=webp:312

Change-Id: I6ad9f93b6c4b665c559bff87716a7b847f66a20d
2016-11-07 17:08:13 -08:00
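The "relative pointer offsets" idea used in this and the following BUG=webp:312 commits can be sketched as below. This is only an illustration under assumptions: SumFirstColumn() and RowAt() are hypothetical helpers, not libwebp functions, and the actual patches may be structured differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper, not libwebp code: sums the first byte of every row.
// The row pointer is advanced by `stride` on each iteration, so no
// `y * stride` product is ever formed in `int` arithmetic, where it could
// roll over once the plane is larger than INT_MAX bytes.
static uint64_t SumFirstColumn(const uint8_t* plane, int stride, int height) {
  uint64_t sum = 0;
  const uint8_t* row = plane;
  int y;
  assert(plane != NULL && stride > 0 && height >= 0);
  for (y = 0; y < height; ++y) {
    sum += row[0];
    row += stride;  // relative offset: move one row forward
  }
  return sum;
}

// Hypothetical alternative for random access: widen to size_t before
// multiplying, so the offset is computed without int overflow.
static const uint8_t* RowAt(const uint8_t* plane, int stride, int y) {
  assert(plane != NULL && stride > 0 && y >= 0);
  return plane + (size_t)y * (size_t)stride;
}
```

Either form keeps the per-row arithmetic small; the first one matches the wording of the commit title most closely.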
James Zern
1147ab4ee7 PreprocessARGB: use relative pointer offsets
avoids int rollover when working with large input

BUG=webp:312

Change-Id: I2881bec2884b550c966108beeff1bf0d8ef9f76b
2016-11-07 17:08:06 -08:00
Pascal Massimino
e4cd4daf74 fix filtering auto-adjustment
the min-distortion was far too low. And we were also
considering the fully skipped macroblocks (nz=0) in the stats.
We need to have at least *some* non-zero dc coeffs (nz=0x100XXXX).

Fix also two typos in StoreMaxDelta: the v0/v1 comparison was wrong,
and the DCs[] coeffs are actually already in ZigZag order.

Change-Id: I602aaa74b36f7ce80017e506212c7d6fd9deba1f
2016-11-07 06:43:51 -08:00
James Zern
de9fa5074e ConvertWRGBToYUV: use relative pointer offsets
avoids int rollover when working with large input

BUG=webp:312

Change-Id: I693cbb295df9cf94aa89294b19c0496bdbe84d18
2016-11-04 00:35:04 -07:00
James Zern
deb1b83199 ImportYUVAFromRGBA: use relative pointer offsets
avoids int rollover when working with large input

BUG=webp:312

Change-Id: I3d7b689be8d5751248a82d1021243d80d3f67203
2016-11-04 00:34:58 -07:00
Pascal Massimino
2f51b614b0 introduce WebPPlaneDistortion to compute plane distortion
Make WebPPictureDistortion() only compute distortion on A/R/G/B planes, not Y/U/V(A).
(not just for SSIM, but PSNR too).

This is to avoid problems with using SSIM on U/V channels.
If Y/U/V distortion is needed, one can always use WebPPlaneDistortion() individually.

Change-Id: If8bc9c3ac12a8d2220f03224694fc389b16b7da9
2016-10-19 09:12:13 +02:00
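A per-plane distortion measure of the kind described above can be sketched as follows. PlaneSquaredError() is a hypothetical helper written for this note, not the WebPPlaneDistortion() API; the `x_step` parameter (distance in bytes between consecutive samples) is an assumption about how one channel of a packed A/R/G/B buffer could be addressed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper (not the libwebp API): accumulates the sum of squared
// differences between two planes. `x_step` is the distance in bytes between
// consecutive samples, e.g. 4 to pick one channel out of packed ARGB.
static uint64_t PlaneSquaredError(const uint8_t* src, size_t src_stride,
                                  const uint8_t* ref, size_t ref_stride,
                                  int width, int height, size_t x_step) {
  uint64_t sse = 0;
  int x, y;
  assert(src != NULL && ref != NULL && x_step > 0);
  for (y = 0; y < height; ++y) {
    // Row pointers are derived with size_t arithmetic for each valid row.
    const uint8_t* const s = src + (size_t)y * src_stride;
    const uint8_t* const r = ref + (size_t)y * ref_stride;
    for (x = 0; x < width; ++x) {
      const int diff = (int)s[(size_t)x * x_step] - (int)r[(size_t)x * x_step];
      sse += (uint64_t)(diff * diff);
    }
  }
  return sse;
}
```

For N = width * height samples, the PSNR in dB then follows as 10 * log10(255^2 * N / sse) whenever sse is non-zero.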
Pascal Massimino
4eb5df28d1 remove unused stride fields from VP8Iterator
Change-Id: I242aaa746dc53c456eb8f1a71a5a2378f26fa843
2016-10-10 18:08:47 +02:00
Vincent Rabaud
11bc423ae5 MIN_LENGTH cleanups.
No change in logic so no change in speed or compression.

Change-Id: I744161978c7d058c9b58450f330cba11731530c6
2016-10-10 15:37:45 +02:00
Vincent Rabaud
539f5a688f Fix non-included header in config.c.
When compiling as experimental, WEBP_EXPERIMENTAL_FEATURES
would not be defined because the header defining it would
not be included.
Hence runtime errors in debug mode when running:
./cwebp -lossless whatever
...
Error! Cannot encode picture as WebP
Error code: 4 (INVALID_CONFIGURATION: configuration is invalid)

(detail: WebPConfig would have a random value set for
delta_palettization as config.c does not consider
it to exist.)

Change-Id: I41761cffe81a971130ed514b195a73d1c6dac1b7
2016-10-10 13:39:17 +02:00
Vincent Rabaud
28ce304344 Remove some errors when compiling the code as C++.
This fixes some cases from
https://bugs.chromium.org/p/webp/issues/detail?id=137

Change-Id: I58f3a617bf973dbe4c5794004a01e2aea39ba53a
2016-10-05 09:39:08 +02:00
hui su
b34abcb8b1 Favor keeping the areas locally similar in spatial prediction mode selection
About 0.1% compression improvement.

Change-Id: If106ab209cc2671ef282b726e09ff2971c3e4abf
2016-10-04 16:28:24 -07:00
Vincent Rabaud
f79450ca02 Speedup ApplyMap.
If a small hash map can be used, use it to avoid binary search.
The first hash function that is tried works with the previous
use case of having indexed data in green.

Change-Id: I2f91cec5f3ca7e9c393fd829e69e09bab74f4e7c
2016-09-28 17:18:08 +02:00
Vincent Rabaud
30d43706d3 Speed-up Combined entropy for palettized histograms.
Change-Id: Ie9bdebb26c726e5b44c2dbcc84d453f85a03f419
2016-09-28 13:22:13 +02:00
Vincent Rabaud
5f1caf2987 Small LZ77 speedups.
The most common conditions are re-ordered and cached.

iter_min was recently introduced to make sure enough iterations
are made in cases where there are many matches (mostly uniform regions).
Now that those are properly analyzed, it becomes useless.

Change-Id: Id3010ee4ec66b84d602fcb926f91eb9155ad27f4
2016-09-22 14:03:25 +02:00
hui su
a2fe9bf404 Speedup TrellisQuantizeBlock().
-Skip examining quantized levels that are too high.
-Calculate last_pos_cost only when needed.

Encoding speed for m6 is increased by about 3%;
Compression performance is neutral.

Change-Id: I8af70b049587cca0375d9b3eb00479ec7c0c842a
2016-09-20 14:54:15 -07:00
Pascal Massimino
573cce270e smartYUV improvements
* switch to Rec709 transfer function in SmartYUV
* use Rec709 for Gray evaluation too.
* stop iterations if error is going up

See paragraph 1.2 and 3.2:
https://www.itu.int/dms_pubrec/itu-r/rec/bt/R-REC-BT.709-6-201506-I!!PDF-E.pdf

(digest: https://en.wikipedia.org/wiki/Rec._709#Transfer_characteristics)

suggested by pdknsk@gmail.com on the mailing list

Change-Id: I12b5f4d3e318dd5134984e1c0a4b244a620a57d7
2016-09-20 06:14:31 +00:00
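For reference, the opto-electronic transfer characteristic given in item 1.2 of the linked Rec. 709 recommendation maps linear light L (in [0, 1]) to the signal V as shown below. How exactly SmartYUV applies it (forward, inverse, or via lookup tables) is not stated in this log, so treat this only as the underlying formula.

```latex
V =
\begin{cases}
  4.5\,L,                  & 0 \le L < 0.018 \\
  1.099\,L^{0.45} - 0.099, & 0.018 \le L \le 1
\end{cases}
```

Both branches evaluate to roughly 0.081 at L = 0.018, and the second branch gives V = 1 at L = 1.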
Pascal Massimino
21e7537abe fix infinite loop in case of PARTITION0 overflow
max_i4_header_bits_ could drop to zero for difficult images and trigger
a loop. Surprisingly, StatLoop() didn't have this bug.

Change-Id: Idc0f9eadef30a2b2f02041b994f25def30901e36
2016-09-15 02:38:28 -07:00
hui su
1377ac2ec1 Change the rule of picking UV mode in MBAnalyzeBestUVMode()
Pick the mode with the smallest alpha.
It only affects m0, in which case the mode decision is not re-examined
later in VP8Decimate(). Tests on some natural content png images show
PSNR increase as well as visual quality improvement.

Change-Id: Iea997e718cd7477160fa05eb7cfb35f4cec2fa9a
2016-09-15 06:01:03 +00:00
Pascal Massimino
bfff0bf329 speed-up SSIM calculation
SSIM results are incompatible with previous version!
We're now averaging the SSIM value for each pixel instead of
printing a frame-level global SSIM value.

* Got rid of some old code
* switched to uint32_t for accumulation
* refactoring

SSIM calculation is ~4x faster now.

Change-Id: I48d838e66aef5199b9b5cd5cddef6a98411f5673
2016-09-14 16:15:43 +02:00
Pascal Massimino
a7be73280b Merge "refactor the PSNR / SSIM calculation code"
2016-09-14 06:37:56 +00:00
Pascal Massimino
50c3d7da9a refactor the PSNR / SSIM calculation code
-print_psnr is now much faster because it doesn't use the SSIM code.
The SSIM speed-up and re-write will come later.

Change-Id: Iabf565e0a8b41651d8164df1266cfeded4ab4823
2016-09-14 06:13:24 +00:00
Pascal Massimino
d6228aed6a indentation fix after I7055d3ee3bd7ed5e78e94ae82cb858fa7db3ddc0
Change-Id: I2145815d778321b9ccc7ac2775aaf64cf2372a42
2016-09-13 22:49:42 -07:00
Vincent Rabaud
6cc48b1728 Move some lossless logic out of dsp.
Change-Id: I4cfd60cd5497666a2e1c188ceada2e71b05f1505
2016-09-13 15:37:32 +02:00
Pascal Massimino
78363e9e51 Merge "Remove a redundant call to InitLeft() in VP8IteratorReset()"
2016-09-13 04:47:51 +00:00
hui su
ffd01929f1 Refactor VP8IteratorNext().
Change-Id: I7055d3ee3bd7ed5e78e94ae82cb858fa7db3ddc0
2016-09-12 15:08:24 -07:00
hui su
c4f6d9c939 Remove a redundant call to InitLeft() in VP8IteratorReset()
VP8IteratorSetRow(it, 0) already did it.

Change-Id: I410e48b5205897a6a23301d8a98aa266787676c6
2016-09-12 14:41:15 -07:00
Pascal Massimino
c27d821096 Merge "smartYUV: simplify main loop"
2016-09-12 17:14:22 +00:00
Pascal Massimino
0779529616 smartYUV: simplify main loop
we don't need to centralize best_uv[] since target_uv[] and best_rgb_uv[]
are already centralized. The diff 'W' was just in the ~[-2,2] range, so
we can ignore the correction.

Overall speed-impact is not large, though. Around ~4% faster conversion.

Output with -pre 4 is expected to be slightly different

Change-Id: Ib59f033955577c49b084d0560108020f42d84102
also: remove the useless clipping in StoreGray()
2016-09-12 18:51:40 +02:00
Vincent Rabaud
c9b45863e2 Split off common lossless dsp inline functions.
Change-Id: I64f96897b11d1c21f033c7e47b21edccb5c68738
2016-09-12 17:35:08 +02:00
Pascal Massimino
490ae5b13d smartYUV: improve initial state for faster convergence
For speed reasons, the 'gray' plane was initialized with the same
value for each 2x2 block. But in some cases (underlying camera noise, e.g.),
it could lead to instability during iteration, noise amplification,
and visible banding.
Using a precise (but slower) initialization solves the issue, and
since the convergence is faster, we might actually gain some speed.

Change-Id: I81c42101497e7096a8f60289d710f5a3bcb0ddea
2016-09-09 18:29:32 +02:00
Pascal Massimino
894232be56 smartYUV: fix and simplify the over-zealous stop criterion
We usually need at least 2 iterations to converge
(and usually not much more after that). Only 1 was not enough.

Change-Id: Iaf802ea81afa2596f4ba045c92f5eaff61623b7b
2016-09-09 10:10:13 +02:00
Pascal Massimino
32dead4ee3 WebPPictureDistortion(): free() -> WebPSafeFree()
missed one!

Change-Id: I643170451b3ac07c748b70a9abfe8af17a716b24
2016-08-30 15:43:23 +02:00
Vincent Rabaud
85cd5d061c Smarter LZ77 for uniform regions.
No need to find backward references for pixels in uniform regions
by looking at all pixels.
Only pixels at the same distance from the end need to be compared to.

Change-Id: I4f187e965f0667d3a929775726a412f7e69f6473
2016-08-26 09:53:49 +02:00
Vincent Rabaud
7f1b897bee Faster stochastic histogram merging.
Constants are such that brute force is sometimes faster for some
data (mostly big images it seems).

Change-Id: I90aef536408683535e3b09ddfa2e77a9834038f6
2016-08-19 14:52:57 +02:00
skal
6ab496ed22 fix some 'unsigned integer overflow' warnings in ubsan
I couldn't find a safe way of fixing VP8GetSigned() so I just
used the big hammer.

Change-Id: I1039bc00307d1c90c85909a458a4bc70670e48b7
2016-08-16 23:18:27 -07:00
James Zern
8a4ebc6ab0 Revert "fix 'unsigned integer overflow' warnings in ubsan"
This reverts commit e44f5248ff.

contains unintentional changes in quant.c

Change-Id: I1928f072566788b0c9ea80f6fbc9e571061f9b3e
2016-08-16 16:55:56 -07:00
skal
e44f5248ff fix 'unsigned integer overflow' warnings in ubsan
I couldn't find a safe way of fixing VP8GetSigned() so I just
used the big hammer.

Change-Id: I1039bc00307d1c90c85909a458a4bc70670e48b7
2016-08-16 15:04:41 -07:00
hui su
1269dc7cfb Refactor VP8LColorCacheContains()
Return key/index if the query is found, and -1 otherwise.
The benefit of this is to save a hashing computation.

Change-Id: Iff056be330f5fb8204011259ac814f7677dd40fe
2016-08-12 15:16:06 -07:00
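The benefit of returning the key can be sketched with a toy cache. The ColorCache type, CACHE_BITS, the hash multiplier, and the function names below are assumptions made for this illustration and are not the VP8LColorCache definitions.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_BITS 4  // hypothetical: 16 entries
#define CACHE_SIZE (1 << CACHE_BITS)

typedef struct {
  uint32_t colors[CACHE_SIZE];
} ColorCache;  // hypothetical type, not VP8LColorCache

// Multiplicative hash; the multiplier is an arbitrary odd constant chosen
// for this sketch.
static int HashColor(uint32_t argb) {
  return (int)((argb * 0x9e3779b1u) >> (32 - CACHE_BITS));
}

static void CacheInsert(ColorCache* cache, uint32_t argb) {
  cache->colors[HashColor(argb)] = argb;
}

// Returns the key (index) of `argb` if it is cached, and -1 otherwise.
// A caller that gets a hit can emit the key directly instead of hashing
// the color a second time to find it.
static int CacheContains(const ColorCache* cache, uint32_t argb) {
  const int key = HashColor(argb);
  return (cache->colors[key] == argb) ? key : -1;
}
```

With a boolean-returning lookup, the caller would have to call the hash function again after a hit to obtain the index; returning the key folds both steps into one.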
James Zern
b551e587b3 cosmetics: add {}s on continued control statements
for consistency within the codebase. in some cases simply join the
lines.

Change-Id: I071f061052e274c8a69f651ed4305befb4414a40
2016-08-03 19:08:59 -07:00
hui su
0b2c58a91c Fix an unsigned integer overflow error in enc/cost.h
Change-Id: I9774b59c417c185f09a61a115364b9642976a100
2016-07-26 13:55:09 -07:00
hui su
386e4ba2f0 Reset segment id if we decide not to update segment map
This avoids potential encoder and decoder mismatch.

Change-Id: I5282d3e168afc6193033ad3fce8fbc35618ab2f5
2016-07-25 17:08:10 -07:00
hui su
0c0fb83211 Do token recording and counting in a single loop
Change-Id: I8afd3c486b210bd67888de03e91dde7f78276f89
2016-07-19 16:28:26 -07:00
skal
5b60db5c9d FastMBAnalyze() for quick i16/i4 decision
The decision is based on the variance between DC values of each
sub-4x4 block. This heuristic is rather ok for predicting whether
the 2nd transform (intra-16) is going to help or not.
The decision threshold varies with quality (=quantization).

It's only used for -m 0 and -m 1, where no full RD-opt is performed.
It actually makes these modes noticeably faster, with an RD curve much
closer to the -m 2 mode.

Change-Id: I15f972db97ba4082cbd1dfd16bee3eb2eca701a8
2016-07-15 11:21:08 -07:00
Vincent Rabaud
2a5c417c68 Apply the RLE heuristic to LZ77.
Change-Id: I7317eed7e017ee8981f40fcf1737f97e0e3a238c
2016-07-14 20:12:48 +02:00
James Zern
bfef6c9f82 libwebp-0.5.1
- 6/14/2016: version 0.5.1
   This is a binary compatible release.
   * miscellaneous bug fixes (issues #280, #289)
   * reverted alpha plane encoding with color cache for compatibility with
     libwebp 0.4.0->0.4.3 (issues #291, #298)
   * lossless encoding performance improvements
   * memory reduction in both lossless encoding and decoding
   * force mux output to be in the extended format (VP8X) when undefined chunks
     are present (issue #294)
   * gradle, cmake build support
   * workaround for compiler bug causing 64-bit decode failures on android
     devices using clang-3.8 in the r11c NDK
   * various WebPAnimEncoder improvements

Merge tag 'v0.5.1'

* tag 'v0.5.1': (30 commits)
  update ChangeLog
  Clarify the expected 'config' lifespan in WebPIDecode()
  update ChangeLog
  Fix corner case in CostManagerInit.
  gif2webp: normalize the number of .'s in the help message
  vwebp: normalize the number of .'s in the help message
  cwebp: normalize the number of .'s in the help message
  fix rescaling bug: alpha plane wasn't filled with 0xff
  Improve lossless compression.
  'our bug tracker' -> 'the bug tracker'
  normalize the number of .'s in the help message
  pngdec,ReadFunc: throw an error on invalid read
  decode.h,WebPGetInfo: normalize function comment
  Inline GetResidual for speed.
  Speed-up uniform-region processing.
  free -> WebPSafeFree()
  DecodeImageData(): change the incorrect assert
  Fix a boundary case in BackwardReferencesHashChainDistanceOnly.
  Make sure to consider small distances in LZ77.
  add some asserts to delimit the perimeter of CostManager's operation
  ...

Change-Id: I44cee79fddd43527062ea9d83be67da42484ebfc
2016-07-06 19:31:27 -07:00