Commit Graph

3068 Commits

Author SHA1 Message Date
Pascal Massimino
423ecaf484 move some SSIM-accumulation function for dsp/
This is in preparation for some SSE2 code.

And generally speaking, the whole SSIM code needs some
revamp: we're not averaging the SSIM value at each pixels
but just computing the overall SSIM value once, for the whole
plane. The former might be better than the latter.

Change-Id: I935784a917f84a18ef08dc5ec9a7b528abea46a5
2016-03-08 07:50:09 +01:00
Pascal Massimino
f08e66245a Merge "Fix FindClosestDiscretized in near lossless:" 2016-03-04 09:34:16 +00:00
James Zern
0d40cc5ea3 enc_neon,Disto4x4: remove an unnecessary transpose
based on the sse2 change in:
9960c31 Remove an unnecessary transposition in TTransform.

~9-10.5% faster at the function-level, < 1% overall

Change-Id: I44413369b230b250fb0dbc51ff2f17cfeda609b7
2016-03-03 16:18:59 -08:00
Marcin Kowalczyk
e8feb20e39 Fix FindClosestDiscretized in near lossless:
- The result is now indeed closest among possible results for all inputs, which
  was not the case for bits>4, where the mapping was not even monotonic because
  GetValAndDistance was correct only if the significant part of initial fit in
  a byte at most twice.

- The set of results for a larger number of bits dropped is a subset of values
  for a smaller number of bits dropped. This implies that subsequent
  discretizations for a smaller number of bits dropped do not change already
  discretized pixels, which improves the quality (changes do not accumulate)
  and compression density (values tend to repeat more often).

- Errors are more fairly distributed between upwards and downwards thanks to
  bankers’ rounding, which avoids images getting darker or lighter in overall.

- Deltas between discretized values are more repetitive. This improves
  compression density if delta encoding is used.

Also, the implementation is much shorter now.

Change-Id: I0a98e7d5255e91a7b9c193a156cf5405d9701f16
2016-03-02 12:47:22 +01:00
James Zern
8200643017 anim_util: quiet static analysis warning
in ReadAnimatedWebP() frame_index is tied to the reported frame_count.

Change-Id: I69241e5d77a0b8bac1deabd992d0ad2ecf51a4ec
2016-02-23 11:13:38 -08:00
James Zern
a6f23c49b2 Merge "AnimEncoder: Support progress hook and user data." 2016-02-20 04:25:57 +00:00
James Zern
a5193774b0 Merge "Near lossless feature: fix some comments." 2016-02-20 03:51:15 +00:00
Urvang Joshi
da98d31ced AnimEncoder: Support progress hook and user data.
Pass them along to internal 'pic' object, so that progress can be reported back
and user data can also be inspected.

Change-Id: Idb5d0d4a76d07283d704a86c5892e1ad7bda09fa
2016-02-19 19:50:30 -08:00
Urvang Joshi
3335713169 Near lossless feature: fix some comments.
Change-Id: I2c5fc2a3b3fe5123d66b42bf148e361b4862dfb9
2016-02-19 19:26:39 -08:00
James Zern
0beed01aa5 cosmetics: fix indent after 2f5e898
2f5e898 fix multiple allocation for transform buffer

Change-Id: Ied5c89c0040671e2eddf23c8b7a78e0d817dd18e
2016-02-19 19:22:34 -08:00
Pascal Massimino
6753f35cac Merge "FTransformWHT optimization." 2016-02-19 09:38:04 +00:00
Vincent Rabaud
6583bb1a42 Improve SSE4.1 implementation of TTransform.
SSE4.1 is slower than the SSE2 implementation and this seems to
be due to a slow _mm_loadl_epi64 implementation by gcc
(hence a bug with my gcc 4.8) and a very slow _mm_hadd_epi32. Both
got confirmed by IACA and experiments.

Change-Id: I05607f66b7ccd8f4f42e000693aea583ffd5768f
2016-02-19 09:11:53 +01:00
Vincent Rabaud
7561d0c338 FTransformWHT optimization.
Data is packed sooner in the functions.

Change-Id: I018cfeca43f015ac755c7f209f9a97984cc0517b
2016-02-18 17:44:05 +01:00
Pascal Massimino
7ccdb734c2 fix indentation after patch #328220
Change-Id: Iccfcf4deaed6b383b9f80ae84b5b0575a4e94b5f
2016-02-18 15:14:41 +01:00
Pascal Massimino
6ec0d2a946 clarify the logic of the error path when decoding fails.
Change-Id: I2f86751ddafa4708dac3ffc9d6ec1f156e027a83
2016-02-18 14:58:18 +01:00
Vincent Rabaud
8aa352b256 Merge "Remove an unnecessary transposition in TTransform." 2016-02-18 08:15:10 +00:00
James Zern
db86088426 Merge "remove useless #include" 2016-02-17 23:17:12 +00:00
Vincent Rabaud
9960c31685 Remove an unnecessary transposition in TTransform.
Change-Id: Ib715c2d5ba659cb2db9c6832875ba508cc2fca3e
2016-02-17 21:41:28 +01:00
Vincent Rabaud
6e36b51188 Small speedup in FTransform.
It removes two _mm_unpacklo_epi32 and two _mm_sub_epi16.

Change-Id: Icdf86259f796ba855d1cda5e9c0e99cb396cb351
2016-02-17 21:26:36 +01:00
James Zern
9dbd4aad77 Merge "fix C and SIMD flags completion." 2016-02-17 19:22:20 +00:00
Vincent Rabaud
e60853eabc Add missing common_sse2.h file to makefile.unix
This was missed in bf2b4f114f

Change-Id: Ib0d2ac9f5b86196de2ac8feccbb9536530e991b2
2016-02-17 09:48:05 +01:00
Vincent Rabaud
696eb2b03a fix C and SIMD flags completion.
CMAKE_C_FLAGS is handled as a string and not a list by CMake.

Change-Id: Ia0cfb38817edef2f212a04b26601f6aea76a5684
2016-02-17 09:10:04 +01:00
Pascal Massimino
2b4fe33e00 Merge "fix multiple allocation for transform buffer" 2016-02-17 07:27:23 +00:00
Pascal Massimino
2f5e8986cf fix multiple allocation for transform buffer
We were not updating the current_width_, which is usually
not a problem, unless we use Delta Palette with small number
of colors
-> Addressed this re-entrancy problem by checking we have
enough capacity for transform buffer.

The problem is not currently visible, until we restrict
the number of gradient used in delta-palette to less than 16.
Then the buffers have different current_width_ and the problem
surfaces.

Change-Id: Icd84b919905d7789014bb6668bfb6813c93fb36e
2016-02-17 06:14:39 +01:00
Vincent Rabaud
bf2b4f114f Regroup common SSE code + optimization.
The transpose refactoring will help removing a transpose in a
later CL.

The horizontal add function helps removing a _mm_sad_epu8 in DC8uv
=> the latency/throughput went from 29/25 to 23/19

Change-Id: I5f3dfd4aad614eb079b1e83631e6a7cef49a3766
2016-02-16 18:34:34 +01:00
Pascal Massimino
4ed650a13d force "-pass 6" if -psnr or -size is used but -pass isn't.
This is to prevent users shooting in the foot using -psnr or
-size alone and not getting the expected result.

Change-Id: I67a3289e4ec0a2a813c98807f2ec5e600f52dc63
2016-02-13 10:22:32 +01:00
Nico Weber
3ef1ce98b9 yuv_sse2: fix -Wconstant-conversion warning
'implicit conversion from 'int' to 'short' changes value from 33050 to
-32486'

original patch:

https://codereview.chromium.org/1657313003/

Make libwebp build with -Wconstant-conversion from newer clangs.

After http://llvm.org/viewvc/llvm-project?rev=259271&view=rev, clang
points out that _mm_set1_epi16(33050) causes an overflow in the short
argument to _mm_set1_epi16().  Since there's no version that takes an
unsigned short, add an explicit cast to tell the compiler that this is
intentional.

No behavior change.

Change-Id: I6b4e3401b15cfbcc895f9e81b5c2dc59d43ffb9b
2016-02-02 14:52:11 -08:00
James Zern
a7a03e9f96 Merge changes I4852d18f,I51ccb85d
* changes:
  gif2webp: set enc_options.verbose = 0 w/-quiet
  anim_encode,DefaultEncoderOptions: init verbose
2016-02-02 01:49:12 +00:00
James Zern
5e122bd6d6 gif2webp: set enc_options.verbose = 0 w/-quiet
Change-Id: I4852d18f7c04f1f48684459f0ac5b002a8c2f27b
2016-02-01 17:00:51 -08:00
James Zern
ab3c2583aa anim_encode,DefaultEncoderOptions: init verbose
default to disabled

broken since:
c13245c AnimEncoder: Add a GetError() method.

Change-Id: I51ccb85d5df338570512ec1d7430ad3229f93a9f
2016-02-01 17:00:19 -08:00
James Zern
8f0dee773a Merge "configure: fix builtin detection w/-Werror" 2016-01-30 05:11:19 +00:00
James Zern
4a7b85a967 cmake: fix builtin detection w/-Werror
and other warnings enabled (-Wunused-value)

Change-Id: Ie3605628b4cbfb5bfd1e847cc504cb96f2825388
2016-01-28 23:47:11 -08:00
James Zern
b74657fb2e configure: fix builtin detection w/-Werror
and other warnings enabled (-Wunused-value)

Change-Id: Ic0080d559366a5254d0fa3ba939150445157087f
2016-01-28 23:45:35 -08:00
Vincent Rabaud
3661b98069 Add a CMakeLists.txt
The goal is not to replace our autotools configuration, but to
provide a minimal CMake file that can build the lib, cwebp and
dwebp.

Change-Id: I3be343bd698d118c5f00172449d232d87e868f23
2016-01-28 01:30:59 +01:00
Pascal Massimino
75f4af4d54 remove useless #include
Change-Id: Id34f12ec94d8be6853fabd67609a6006ac99f152
2016-01-25 09:34:10 -08:00
Pascal Massimino
6c1d763119 avoid Yoda style for comparison
Change-Id: I8ff9f96951e5e8a619f7132455dd281cbf91aa4d
2016-01-15 23:52:29 -08:00
Vincent Rabaud
8ce975ac82 SSE optimization for vector mismatch.
Change-Id: I564b822033b59d86635230f29ed6197e306a2c4f
2016-01-07 18:23:45 +01:00
James Zern
7db53831a9 libwebp-0.5.0
- 12/17/2015: version 0.5.0
   * miscellaneous bug & build fixes (issues #234, #258, #274, #275, #278)
   * encoder & decoder speed-ups on x86/ARM/MIPS for lossy & lossless
     - note! YUV->RGB conversion was sped-up, but the results will be slightly
       different from previous releases
   * various lossless encoder improvements
   * gif2webp improvements, -min_size option added
   * tools fully support input from stdin and output to stdout (issue #168)
   * New WebPAnimEncoder API for creating animations
   * New WebPAnimDecoder API for decoding animations
   * other API changes:
     - libwebp:
       WebPPictureSmartARGBToYUVA() (-pre 4 in cwebp)
       WebPConfig::exact (-exact in cwebp; -alpha_cleanup is now the default)
       WebPConfig::near_lossless (-near_lossless in cwebp)
       WebPFree() (free'ing webp allocated memory in other languages)
       WebPConfigLosslessPreset()
       WebPMemoryWriterClear()
     - libwebpdemux: removed experimental fragment related fields and functions
     - libwebpmux: WebPMuxSetCanvasSize()
   * new libwebpextras library with some uncommon import functions:
     WebPImportGray/WebPImportRGB565/WebPImportRGB4444
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWes+OAAoJEPnD1r24IytdE5UP/1SiiepxgmDB2zV1gvPOzFM6
 r+ezjDwTsJE7wQxjPjOEP/tOSCmMqs4CgA5B1nhu5df6sewOTq4OMALnoER2zTIq
 qdb70auok4OyrHw33TINApRWJ2jUUJIJUsG4s82+Nsl7Sn6Xt8eLKRcJziZGNuSQ
 y1TUuGYJujkSc1ZKYhEYXqxFK2V9UDPY+4lsNfmMloX5fKoPKhRNhvadOiWYat7R
 SDyV0taIofjRl0WBCp3yhUn7aIkp6W5Ipdeitb1TTc4OpPKKRwib8kafQfZ/w9OV
 KxBngZoaUtwtCs4XT+lJmulEWDX12FbfHcL8Fz74uA/0qF3HvvWCPxrkz8TpvwfV
 zvT3p/C0Hiz0/J+EdW4qr1/Y0qVZ7OHfsRR6TbUZkEhnbnn4O8orm9CHKw48IvuP
 70MaAaAlIFR9XBu+vo7r5+6FuT9d/XhkaWhcxROR63xWpzL1glHcGJWB92OsdOlF
 rcdN7epqdfFhXcKCWXaluRGdpBgk7AtymaWEthdmPKQ9CMUaLPQQra0IRUMbRih2
 e7zJafhSg1TM7v2VEFqOjSkkyYr/Zcp4uc5lVj7nWeiP+R4VlsGCcz61LDNDsMoD
 MS0A1EdPE92B1F55se5WL8uBdk6DLJwjFm+QDQN26Flpr9kKYocFrsuNu8Xp1b3l
 6q1+Cw6I+YtO+8SAuFrk
 =5v3s
 -----END PGP SIGNATURE-----

Merge tag 'v0.5.0'

libwebp-0.5.0
- 12/17/2015: version 0.5.0
  * miscellaneous bug & build fixes (issues #234, #258, #274, #275, #278)
  * encoder & decoder speed-ups on x86/ARM/MIPS for lossy & lossless
    - note! YUV->RGB conversion was sped-up, but the results will be slightly
      different from previous releases
  * various lossless encoder improvements
  * gif2webp improvements, -min_size option added
  * tools fully support input from stdin and output to stdout (issue #168)
  * New WebPAnimEncoder API for creating animations
  * New WebPAnimDecoder API for decoding animations
  * other API changes:
    - libwebp:
      WebPPictureSmartARGBToYUVA() (-pre 4 in cwebp)
      WebPConfig::exact (-exact in cwebp; -alpha_cleanup is now the default)
      WebPConfig::near_lossless (-near_lossless in cwebp)
      WebPFree() (free'ing webp allocated memory in other languages)
      WebPConfigLosslessPreset()
      WebPMemoryWriterClear()
    - libwebpdemux: removed experimental fragment related fields and functions
    - libwebpmux: WebPMuxSetCanvasSize()
  * new libwebpextras library with some uncommon import functions:
    WebPImportGray/WebPImportRGB565/WebPImportRGB4444

* tag 'v0.5.0':
  update ChangeLog
  faster rgb565/rgb4444/argb output
  update NEWS
  update AUTHORS
  update mailmap
  bump version to 0.5.0
  README: update help text, repo link

Change-Id: I21dc611cfd2a3cb6ed6ba5c455a5049fe615f7a1
2015-12-23 11:20:42 -08:00
James Zern
37f049490d update ChangeLog
Change-Id: I1df7a610c466eb1cfc675e030c12fcff21f57c1b
2015-12-18 00:04:24 -08:00
Pascal Massimino
7e7b6ccc7f faster rgb565/rgb4444/argb output
SSE2 and NEON implementation.

Change-Id: I342a1c3d84937b8497f0aaecb7ce9bdb7f50296b
2015-12-17 23:38:58 -08:00
James Zern
4c7f565ff0 update NEWS
Change-Id: I2000d5e7331cba4cb76648df5f16043f48ec622a
2015-12-17 23:25:23 -08:00
James Zern
1f62b6b2e6 update AUTHORS
Change-Id: Ieb8e8b98c82be3d5084915ca0cbd9997117d05ab
2015-12-17 19:45:14 -08:00
James Zern
e224fdc80b update mailmap
Change-Id: I32af2b1bcdc818d3664ef36f548951d07c35c035
2015-12-17 19:45:14 -08:00
James Zern
71100500a8 bump version to 0.5.0
libwebp{,decoder} - 0.5.0
libwebp libtool - 6.0.0
libwebpdecoder libtool - 2.0.0

mux/demux - 0.3.0
libtool - 2.0.0

Change-Id: I5346d13eb827fb5890efbb63ff3f28cea9d0c55f
2015-12-17 19:45:14 -08:00
James Zern
230a685ea7 README: update help text, repo link
Change-Id: Iebcd05932572908abe4fcd0bb39a04dde9325199
2015-12-17 19:45:14 -08:00
James Zern
d48e427b1d Merge "demux: accept raw bitstreams" 2015-12-17 22:52:10 +00:00
James Zern
99a01f4f8b Merge "Unify some entropy functions." 2015-12-17 22:35:29 +00:00
James Zern
4b025f10f7 Merge "configure: disable asserts by default" 2015-12-17 22:28:37 +00:00
James Zern
92cbddf89c Merge "fix PrintBlockInfo()" 2015-12-17 21:00:57 +00:00
Vincent Rabaud
ca509a3362 Unify some entropy functions.
The code and logic is unified when computing bit entropy + Huffman cost.

Speed-wise, we gain 8% for lossless encoding.
Logic-wise, the beginning/end of the distributions are handled properly
and the compression ratio does not change much.

Change-Id: Ifa91d7d3e667c9a9a421faec4e845ecb6479a633
2015-12-17 17:00:08 +01:00