Commit Graph

665 Commits

Author SHA1 Message Date
Pascal Massimino
bc18ebad2e fix extra 'const's in signatures
Change-Id: Ie433d0defbc0c6feae2eb2f11e70082f1affada8
2016-11-25 09:45:52 +01:00
Vincent Rabaud
71e2f5cadf Remove memcpy in lossless decoding.
Change-Id: Iba694b306486d67764e2fc5576c98a974c9b886c
2016-11-24 17:45:24 +01:00
Vincent Rabaud
7474d46e45 Do not use a register array in SSE.
Change-Id: I79cf95bdac1164fc4de899828e9380c23df8d141
2016-11-24 13:06:44 +01:00
Owen Rodley
67748b41db Improve latency of FTransform2.
Benchmarks from vrabaud@:
8BIT/GRAY                corpus speed: faster: -4.3 % , corpus size: unchanged
skal/sources_png_skal    corpus speed: faster: -5.2 % , corpus size: unchanged
images/png_rgb           corpus speed: faster: -5.1 % , corpus size: unchanged
images/lpcb              corpus speed: unchanged, corpus size: unchanged
images/png_big           corpus speed: faster: -1.7 % , corpus size: unchanged
images/png_doc           corpus speed: unchanged, corpus size: unchanged
images/png_1bit          corpus speed: faster: -1.2 % , corpus size: unchanged
images/jpeg_small        corpus speed: unchanged, corpus size: unchanged
images/icip_core1        corpus speed: unchanged, corpus size: unchanged
images/png_gray          corpus speed: faster: -2.5 % , corpus size: unchanged
images/jpeg_high_quality corpus speed: faster: -4.0 % , corpus size: unchanged
images/jpeg              corpus speed: faster: -2.3 % , corpus size: unchanged
images/png_translucent   corpus speed: faster: -2.8 % , corpus size: unchanged
images/gif               corpus speed: faster: -1.4 % , corpus size: unchanged
images/png_opaque        corpus speed: faster: -2.8 % , corpus size: unchanged
images/png_rgb_opaque    corpus speed: unchanged, corpus size: unchanged
images/png_indexed       corpus speed: faster: -2.0 % , corpus size: unchanged
images/all               corpus speed: faster: -1.5 % , corpus size: unchanged
images/png_small         corpus speed: unchanged, corpus size: unchanged
images/png               corpus speed: unchanged, corpus size: unchanged
images/gif_still         corpus speed: faster: -1.6 % , corpus size: unchanged

Change-Id: I69fe11baa188c5d32cbc77a84b8c0deae13d792b
2016-11-24 07:09:50 +00:00
Vincent Rabaud
6540cd0eeb Provide an SSE implementation of ConvertBGRAToRGB
Change-Id: Ida11b079077a47fe3b92754f08aa30d81c301fcf
2016-11-23 16:25:51 +01:00
Pascal Massimino
3c2a61b099 remove some unneeded casts
Change-Id: Ie68788c77f016ed11446a55142b1bd8d96261452
2016-11-16 22:54:40 -08:00
Pascal Massimino
9ac063c37f add dsp functions for SmartYUV
+ SSE2 implementation

Change-Id: I5cfdb62d68b5a95899241a097d3a2f697fbc590e
2016-11-16 14:23:06 +00:00
Pascal Massimino
31b1e34342 fix SSIM metric ... by ignoring too-dark area
Roughly, if both the source and the reference areas are
darker too dark (R/G/B <= ~6), they are ignored.

One caveat: SSIM calculation won't work for U/V planes,
which are 128-centered and not related to luminance.
But WebPPlaneDistortion() enforces the conversion to RGB,
if needed.

Change-Id: I586c2579c475583b8c90c5baefd766b1d5aea591
2016-10-20 15:17:55 +02:00
Vincent Rabaud
28ce304344 Remove some errors when compiling the code as C++.
This fixes some cases from
https://bugs.chromium.org/p/webp/issues/detail?id=137

Change-Id: I58f3a617bf973dbe4c5794004a01e2aea39ba53a
2016-10-05 09:39:08 +02:00
Pascal Massimino
ba843a92e7 fix some SSIM calculations
* prevent 64bit overflow by controlling the 32b->64b conversions
  and preventively descaling by 8bit before the final multiply
* adjust the threshold constants C1 and C2 to de-emphasis the dark
  areas
* use a hat-like filter instead of box-filtering to avoid blockiness
  during averaging

SSIM distortion calc is actually *faster* now in SSE2, because of the
unrolling during the function rewrite.
The C-version is quite slower because still un-optimized.

Change-Id: I96e2715827f79d26faae354cc28c7406c6800c90
2016-10-04 01:09:07 -07:00
Pascal Massimino
86a84b3598 2x faster SSE2 implementation of SSIMGet
Change-Id: I53705d7ddfa595389ff2d542e5088f96f948d351
2016-09-23 23:23:06 -07:00
Pascal Massimino
7c1fb7d0ff fix uint32_t initialization (0. -> 0)
Change-Id: Ia4aae27f70c4e74ddeb5654cfabb21d785cea9cf
2016-09-14 20:26:05 +02:00
Pascal Massimino
bfff0bf329 speed-up SSIM calculation
SSIM results are incompatible with previous version!
We're now averaging the SSIM value for each pixels instead of
printing a frame-level global SSIM value.

* Got rid of some old code
* switched to uint32_t for accumulation
* refactoring

SSIM calculation is ~4x faster now.

Change-Id: I48d838e66aef5199b9b5cd5cddef6a98411f5673
2016-09-14 16:15:43 +02:00
Vincent Rabaud
64577de8ae De-VP8L-ize GetEntropUnrefinedHelper.
Having it architecture dependent resulted in an extra
function call of an extern function, hence no inlining and
a 5-10% impact on performance.

Change-Id: I0ff40d2d881edc76d3594213a64ee53097d42450
2016-09-14 13:55:24 +02:00
Pascal Massimino
a7be73280b Merge "refactor the PSNR / SSIM calculation code" 2016-09-14 06:37:56 +00:00
Pascal Massimino
50c3d7da9a refactor the PSNR / SSIM calculation code
-print_psnr is now much faster because it doesn't use the SSIM code.
The SSIM speed-up and re-write will come later.

Change-Id: Iabf565e0a8b41651d8164df1266cfeded4ab4823
2016-09-14 06:13:24 +00:00
Vincent Rabaud
dd538b192d Remove unused declaration.
Change-Id: I8ab19654df63e7ef8aad00e97d1428c7b53ee33f
2016-09-13 16:25:46 +02:00
Vincent Rabaud
6cc48b1728 Move some lossless logic out of dsp.
Change-Id: I4cfd60cd5497666a2e1c188ceada2e71b05f1505
2016-09-13 15:37:32 +02:00
Vincent Rabaud
c9b45863e2 Split off common lossless dsp inline functions.
Change-Id: I64f96897b11d1c21f033c7e47b21edccb5c68738
2016-09-12 17:35:08 +02:00
Pascal Massimino
3884972e3f remove WEBP_FORCE_ALIGNED and use memcpy() instead.
BUG=webp:297

Change-Id: I89a08debec7bb1b3f411c897260ab1bb63f77df2
2016-08-17 20:16:03 -07:00
skal
6ab496ed22 fix some 'unsigned integer overflow' warnings in ubsan
I couldn't find a safe way of fixing VP8GetSigned() so i just
used the big-hammer.

Change-Id: I1039bc00307d1c90c85909a458a4bc70670e48b7
2016-08-16 23:18:27 -07:00
James Zern
8a4ebc6ab0 Revert "fix 'unsigned integer overflow' warnings in ubsan"
This reverts commit e44f5248ff.

contains unintentional changes in quant.c

Change-Id: I1928f072566788b0c9ea80f6fbc9e571061f9b3e
2016-08-16 16:55:56 -07:00
Pascal Massimino
9d4f209f80 Merge changes I25711dd5,I43188fab
* changes:
  Fix assertions in WebPRescalerExportRow()
  Add descriptions of default configuration in help info.
2016-08-16 22:13:23 +00:00
skal
e44f5248ff fix 'unsigned integer overflow' warnings in ubsan
I couldn't find a safe way of fixing VP8GetSigned() so i just
used the big-hammer.

Change-Id: I1039bc00307d1c90c85909a458a4bc70670e48b7
2016-08-16 15:04:41 -07:00
Hui Su
27b5d991e2 Fix assertions in WebPRescalerExportRow()
Change-Id: I25711dd54e71c90a25f7b18e0ef9155e8151a15e
2016-08-16 14:32:48 -07:00
James Zern
40872fb2e6 dec_neon,NeedsHev: micro optimization
trade 2 compares + 1 logical or for max + compare

Change-Id: I785ad8efdc64db2d0609456d6e7af795ab2117d8
2016-08-08 20:12:30 -07:00
James Zern
b551e587b3 cosmetics: add {}s on continued control statements
for consistency within the codebase. in some cases simply join the
lines.

Change-Id: I071f061052e274c8a69f651ed4305befb4414a40
2016-08-03 19:08:59 -07:00
James Zern
d2e4484ef3 dsp/Makefile.am: put msa source in correct lib
upsampling_msa.c was incorrectly included in the neon convenience lib
+ sort msa sources

Change-Id: I7c4883f16a5c2fed12bfa0e8d8d6a7acd5d4fb84
2016-08-03 17:50:45 -07:00
Parag Salasakar
d3ddacb625 Add MSA optimized YUV to RGB upsampling functions
We add the following MSA optimized YUV to RGB upsampling functions:
- UpsampleRgbLinePair
- UpsampleBgrLinePair
- UpsampleRgbaLinePair
- UpsampleBgraLinePair
- UpsampleArgbLinePair
- UpsampleRgba4444LinePair
- UpsampleRgb565LinePair

Change-Id: I7264a615edc7eb376e443e9d38bd8e3c9a2cab1f
2016-07-22 14:28:30 +00:00
Parag Salasakar
9ac74f922e Add MSA optimized rescaling functions
We add the following MSA optimized rescaling functions:
- RescalerExportRowExpand
- RescalerExportRowShrink

Change-Id: Ic1c76065423b02617db94cf0c22bb564219b36e6
2016-07-19 15:52:42 +00:00
Parag Salasakar
cb19dbc1a4 Add MSA optimized color transform functions
We add the following MSA optimized color transform functions:
- TransformColor
- SubtractGreenFromBlueAndRed

Change-Id: Ib182d2b5faa7191f503ce70f0dfde0ac89402fd3
2016-07-18 13:49:24 +00:00
James Zern
5e2eb89e1f cosmetics,dsp/*msa.c: associate '*' with the type
not the variable

Change-Id: If5823e9731c406655eaf1dc1aaa2e6554ca7daad
2016-07-15 15:40:41 -07:00
skal
5b60db5c9d FastMBAnalyze() for quick i16/i4 decision
The decision is based on the variance between DC values of each
sub-4x4 block. This heuristic is rather ok for predicting whether
the 2nd transform (intra-16) is going to help or not.
The decision threshold varies with quality (=quantization).

It's only used for -m 0 and -m 1, where no full RD-opt is performed.
It actually makes these modes quite faster, with RD curve much
closer to the -m 2 mode.

Change-Id: I15f972db97ba4082cbd1dfd16bee3eb2eca701a8
2016-07-15 11:21:08 -07:00
Parag Salasakar
567e697776 Add MSA optimized CollectHistogram function
We add the following MSA optimized encoder Histogram function:
- CollectHistogram

Change-Id: I28415704ec62c3ad375de06eeef468d9f514bb2d
2016-07-15 22:51:33 +05:30
Parag Salasakar
c54ab8dd1a Add MSA optimized quantization functions
We add the following MSA optimized encoder quantization functions:
- QuantizeBlock
- Quantize2Blocks

Change-Id: Ie32b442afa99eee62d2ef48942b41116a4e157d3
2016-07-15 15:33:47 +00:00
hui su
91b59e886b Remove QuantizeBlockWHT() in enc.c
QuantizeBlockWHT() is basically identical to QuantizeBlock(),
no need to keep two copies.

Change-Id: I970cb6948da1c750c1339971a55e3b40765cdd01
2016-07-14 10:44:18 -07:00
Parag Salasakar
fe57273736 Add MSA optimized SSE functions
We add the following MSA optimized encoder SSE functions:
- SSE16x16
- SSE16x8
- SSE8x8
- SSE4x4

Change-Id: I9ef9e903019337d9975c83264a652a7282bf5d5b
2016-07-14 15:43:23 +05:30
James Zern
6b53ca876e cosmetics,(dec|enc)_sse2.c: fix indent
Change-Id: Ic3326136ddd325e911e96c2e5a7f06b3e1d60f66
2016-07-13 16:11:29 -07:00
Parag Salasakar
afe3cec813 Add MSA optimized encoder IntraChromaPreds function
We add the following MSA optimized intrapred chroma function:
- IntraChromaPreds

Change-Id: I051cd174f5ce675aeb94e648d52c5a340a133ed4
2016-07-13 18:13:51 +05:30
Parag Salasakar
7b4b05e0dc Add MSA optimized encoder Intra16Preds function
We add the following MSA optimized intrapred 16x16 function:
- Intra16Preds

Change-Id: I89a249e041fbed377cb6a328c0b973add335b980
2016-07-12 14:30:03 +05:30
Parag Salasakar
c18787a0e9 Add MSA optimized encoder Intra4Preds function
We add the following MSA optimized intrapred 4x4 function:
- Intra4Preds

Change-Id: Icf325f3dcbf98bb6210811b666ce632cae575b22
2016-07-12 04:36:56 +00:00
Parag Salasakar
7915396f40 Add MSA optimized distortion functions
We add the following MSA optimized distortion functions:
- Disto4x4
- Disto16x16

Change-Id: I0a545ed0182ea56a0d5f358639f6671c2c21b95c
2016-07-07 07:30:22 +00:00
James Zern
bfef6c9f82 libwebp-0.5.1
- 6/14/2016: version 0.5.1
   This is a binary compatible release.
   * miscellaneous bug fixes (issues #280, #289)
   * reverted alpha plane encoding with color cache for compatibility with
     libwebp 0.4.0->0.4.3 (issues #291, #298)
   * lossless encoding performance improvements
   * memory reduction in both lossless encoding and decoding
   * force mux output to be in the extended format (VP8X) when undefined chunks
     are present (issue #294)
   * gradle, cmake build support
   * workaround for compiler bug causing 64-bit decode failures on android
     devices using clang-3.8 in the r11c NDK
   * various WebPAnimEncoder improvements
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXfb1vAAoJEPnD1r24IytdtbwP/iCCEEU9scepXgh9+ICUOm1D
 6ASfz6eTYIPP4s2E+kIJKrKeGUrk7U1j6BeehjKxS3vMQxQlJvkXvepk0mdJUO4C
 okttfLahLY6DOZSAETK9SI4haE2Uuz5WGfxMe8x+4uuZZTxSLHqOCFMvU2oxo6uM
 rhErJgH3jWE9vGV9OuI8YUa109qGi8PLtErrFjXqFmAvnxJS95kJHr3MHVoulH8g
 tXrSUYTq37BCfSsxudhZTCENLhYqlXHO5tydvQVAlVbXJfpOsNLQciWUrqFiPuB9
 qhUv3smRV9YBd4XuUgFWLQcbcecQVBzIqxJ7lv41R71vi17Lu4plLjNAc0Cx70qc
 cnfe/acH+9hX0EwBzpvOpN/Lzirx1tmBKPOqnSiFpFP48RZSngLMG0mwhUufyq1I
 y6T2rEcMLRbAX/85sGMRd1AwffoW6OvgPG2LdhW2bh8u9YbA/g3qGH98z2T1JKjy
 V/TNvpTjXAdZ5XQMY8zIunv83Wp/6AWmJIRWZ+mfhw29F/F80HQG2Ss7dulbe3m2
 zpBjxdsaLj+9iZpheewrGGImZ5mJQsG7nRovtQ0VARVaRSY3xpaYug2CqXlQQ2bc
 bjdmGS9u+a4fHdk+uKTMzJEbu4RbXcOeLrvpzA+PxhUQi9WRyLIucIWeVVEDiUI2
 p7OJop9JmPjkRvvqfi5y
 =Mchr
 -----END PGP SIGNATURE-----

Merge tag 'v0.5.1'

libwebp-0.5.1
- 6/14/2016: version 0.5.1
  This is a binary compatible release.
  * miscellaneous bug fixes (issues #280, #289)
  * reverted alpha plane encoding with color cache for compatibility with
    libwebp 0.4.0->0.4.3 (issues #291, #298)
  * lossless encoding performance improvements
  * memory reduction in both lossless encoding and decoding
  * force mux output to be in the extended format (VP8X) when undefined chunks
    are present (issue #294)
  * gradle, cmake build support
  * workaround for compiler bug causing 64-bit decode failures on android
    devices using clang-3.8 in the r11c NDK
  * various WebPAnimEncoder improvements

* tag 'v0.5.1': (30 commits)
  update ChangeLog
  Clarify the expected 'config' lifespan in WebPIDecode()
  update ChangeLog
  Fix corner case in CostManagerInit.
  gif2webp: normalize the number of .'s in the help message
  vwebp: normalize the number of .'s in the help message
  cwebp: normalize the number of .'s in the help message
  fix rescaling bug: alpha plane wasn't filled with 0xff
  Improve lossless compression.
  'our bug tracker' -> 'the bug tracker'
  normalize the number of .'s in the help message
  pngdec,ReadFunc: throw an error on invalid read
  decode.h,WebPGetInfo: normalize function comment
  Inline GetResidual for speed.
  Speed-up uniform-region processing.
  free -> WebPSafeFree()
  DecodeImageData(): change the incorrect assert
  Fix a boundary case in BackwardReferencesHashChainDistanceOnly.
  Make sure to consider small distances in LZ77.
  add some asserts to delimit the perimeter of CostManager's operation
  ...

Change-Id: I44cee79fddd43527062ea9d83be67da42484ebfc
2016-07-06 19:31:27 -07:00
Parag Salasakar
435308e029 Add MSA optimized encoder transform functions
We add the following MSA optimized encoder transform functions:
- ITransform
- FTransform
- FTransformWHT

Change-Id: Ia6b17556aba5aff2d7a88208905fb45293d080a8
2016-07-05 14:35:47 +00:00
Parag Salasakar
dce64bfa1b Add MSA optimized alpha filter functions
We add the following MSA optimized alpha filter functions:
- HorizontalFilter
- VerticalFilter
- GradientFilter

Change-Id: I71e2e04050e569b8c0bf086fadf210ee16d50924
2016-07-01 19:58:25 +00:00
Parag Salasakar
429120d0af Add MSA optimized color transform functions
We add the following MSA optimized color transform functions:
- AddGreenToBlueAndRed
- TransformColorInverse

Change-Id: Iceab3813905955aa8b811253df9188512fc7de3f
2016-06-28 20:36:21 +05:30
Pascal Massimino
55b2fede7f normalize the macros' "do {...} while (0)" constructs
(so we're no longer bitten by the extra ';' problem!)

Change-Id: Icf849c97df9a7af135ba15a7906fc28590d7ce77
2016-06-27 15:30:05 -07:00
Parag Salasakar
701c772eed Add MSA optimized colorspace conversion functions
We add the following MSA optimized colorspace conversion functions:
- ConvertBGRAToRGBA
- ConvertBGRAToBGR
- ConvertBGRAToRGB

Change-Id: I76db1c829d593a06d4975d54dbafa385c82b84fb
2016-06-27 21:19:06 +00:00
Parag Salasakar
6a19793777 Add MSA optimized intra pred chroma functions
We add the following MSA optimized intra pred chroma functions:
- DC8uv
- TM8uv
- VE8uv
- HE8uv
- DC8uvNoTop
- DC8uvNoLeft
- DC8uvNoTopLeft

Change-Id: I48ad2409f334371acd38f4a70626ebcf2e10f4fe
2016-06-24 08:45:29 +00:00
Parag Salasakar
293d786f31 Added MSA optimized intra prediction 16x16 functions
1. DC16
2. TM16
3. VE16
4. HE16
5. DC16NoTop
6. DC16NoLeft
7. DC16NoTopLeft

Change-Id: I53c57c27cee40973b7ee40a7b7a7fbf0df812d1a
2016-06-23 13:09:02 +00:00