This is in preparation for some SSE2 code.
And generally speaking, the whole SSIM code needs some
revamp: we're not averaging the SSIM value at each pixels
but just computing the overall SSIM value once, for the whole
plane. The former might be better than the latter.
Change-Id: I935784a917f84a18ef08dc5ec9a7b528abea46a5
based on the sse2 change in:
9960c31 Remove an unnecessary transposition in TTransform.
~9-10.5% faster at the function-level, < 1% overall
Change-Id: I44413369b230b250fb0dbc51ff2f17cfeda609b7
- The result is now indeed closest among possible results for all inputs, which
was not the case for bits>4, where the mapping was not even monotonic because
GetValAndDistance was correct only if the significant part of initial fit in
a byte at most twice.
- The set of results for a larger number of bits dropped is a subset of values
for a smaller number of bits dropped. This implies that subsequent
discretizations for a smaller number of bits dropped do not change already
discretized pixels, which improves the quality (changes do not accumulate)
and compression density (values tend to repeat more often).
- Errors are more fairly distributed between upwards and downwards thanks to
bankers’ rounding, which avoids images getting darker or lighter in overall.
- Deltas between discretized values are more repetitive. This improves
compression density if delta encoding is used.
Also, the implementation is much shorter now.
Change-Id: I0a98e7d5255e91a7b9c193a156cf5405d9701f16
Pass them along to internal 'pic' object, so that progress can be reported back
and user data can also be inspected.
Change-Id: Idb5d0d4a76d07283d704a86c5892e1ad7bda09fa
SSE4.1 is slower than the SSE2 implementation and this seems to
be due to a slow _mm_loadl_epi64 implementation by gcc
(hence a bug with my gcc 4.8) and a very slow _mm_hadd_epi32. Both
got confirmed by IACA and experiments.
Change-Id: I05607f66b7ccd8f4f42e000693aea583ffd5768f
We were not updating the current_width_, which is usually
not a problem, unless we use Delta Palette with small number
of colors
-> Addressed this re-entrancy problem by checking we have
enough capacity for transform buffer.
The problem is not currently visible, until we restrict
the number of gradient used in delta-palette to less than 16.
Then the buffers have different current_width_ and the problem
surfaces.
Change-Id: Icd84b919905d7789014bb6668bfb6813c93fb36e
The transpose refactoring will help removing a transpose in a
later CL.
The horizontal add function helps removing a _mm_sad_epu8 in DC8uv
=> the latency/throughput went from 29/25 to 23/19
Change-Id: I5f3dfd4aad614eb079b1e83631e6a7cef49a3766
This is to prevent users shooting in the foot using -psnr or
-size alone and not getting the expected result.
Change-Id: I67a3289e4ec0a2a813c98807f2ec5e600f52dc63
'implicit conversion from 'int' to 'short' changes value from 33050 to
-32486'
original patch:
https://codereview.chromium.org/1657313003/
Make libwebp build with -Wconstant-conversion from newer clangs.
After http://llvm.org/viewvc/llvm-project?rev=259271&view=rev, clang
points out that _mm_set1_epi16(33050) causes an overflow in the short
argument to _mm_set1_epi16(). Since there's no version that takes an
unsigned short, add an explicit cast to tell the compiler that this is
intentional.
No behavior change.
Change-Id: I6b4e3401b15cfbcc895f9e81b5c2dc59d43ffb9b
The goal is not to replace our autotools configuration, but to
provide a minimal CMake file that can build the lib, cwebp and
dwebp.
Change-Id: I3be343bd698d118c5f00172449d232d87e868f23
- 12/17/2015: version 0.5.0
* miscellaneous bug & build fixes (issues #234, #258, #274, #275, #278)
* encoder & decoder speed-ups on x86/ARM/MIPS for lossy & lossless
- note! YUV->RGB conversion was sped-up, but the results will be slightly
different from previous releases
* various lossless encoder improvements
* gif2webp improvements, -min_size option added
* tools fully support input from stdin and output to stdout (issue #168)
* New WebPAnimEncoder API for creating animations
* New WebPAnimDecoder API for decoding animations
* other API changes:
- libwebp:
WebPPictureSmartARGBToYUVA() (-pre 4 in cwebp)
WebPConfig::exact (-exact in cwebp; -alpha_cleanup is now the default)
WebPConfig::near_lossless (-near_lossless in cwebp)
WebPFree() (free'ing webp allocated memory in other languages)
WebPConfigLosslessPreset()
WebPMemoryWriterClear()
- libwebpdemux: removed experimental fragment related fields and functions
- libwebpmux: WebPMuxSetCanvasSize()
* new libwebpextras library with some uncommon import functions:
WebPImportGray/WebPImportRGB565/WebPImportRGB4444
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWes+OAAoJEPnD1r24IytdE5UP/1SiiepxgmDB2zV1gvPOzFM6
r+ezjDwTsJE7wQxjPjOEP/tOSCmMqs4CgA5B1nhu5df6sewOTq4OMALnoER2zTIq
qdb70auok4OyrHw33TINApRWJ2jUUJIJUsG4s82+Nsl7Sn6Xt8eLKRcJziZGNuSQ
y1TUuGYJujkSc1ZKYhEYXqxFK2V9UDPY+4lsNfmMloX5fKoPKhRNhvadOiWYat7R
SDyV0taIofjRl0WBCp3yhUn7aIkp6W5Ipdeitb1TTc4OpPKKRwib8kafQfZ/w9OV
KxBngZoaUtwtCs4XT+lJmulEWDX12FbfHcL8Fz74uA/0qF3HvvWCPxrkz8TpvwfV
zvT3p/C0Hiz0/J+EdW4qr1/Y0qVZ7OHfsRR6TbUZkEhnbnn4O8orm9CHKw48IvuP
70MaAaAlIFR9XBu+vo7r5+6FuT9d/XhkaWhcxROR63xWpzL1glHcGJWB92OsdOlF
rcdN7epqdfFhXcKCWXaluRGdpBgk7AtymaWEthdmPKQ9CMUaLPQQra0IRUMbRih2
e7zJafhSg1TM7v2VEFqOjSkkyYr/Zcp4uc5lVj7nWeiP+R4VlsGCcz61LDNDsMoD
MS0A1EdPE92B1F55se5WL8uBdk6DLJwjFm+QDQN26Flpr9kKYocFrsuNu8Xp1b3l
6q1+Cw6I+YtO+8SAuFrk
=5v3s
-----END PGP SIGNATURE-----
Merge tag 'v0.5.0'
libwebp-0.5.0
- 12/17/2015: version 0.5.0
* miscellaneous bug & build fixes (issues #234, #258, #274, #275, #278)
* encoder & decoder speed-ups on x86/ARM/MIPS for lossy & lossless
- note! YUV->RGB conversion was sped-up, but the results will be slightly
different from previous releases
* various lossless encoder improvements
* gif2webp improvements, -min_size option added
* tools fully support input from stdin and output to stdout (issue #168)
* New WebPAnimEncoder API for creating animations
* New WebPAnimDecoder API for decoding animations
* other API changes:
- libwebp:
WebPPictureSmartARGBToYUVA() (-pre 4 in cwebp)
WebPConfig::exact (-exact in cwebp; -alpha_cleanup is now the default)
WebPConfig::near_lossless (-near_lossless in cwebp)
WebPFree() (free'ing webp allocated memory in other languages)
WebPConfigLosslessPreset()
WebPMemoryWriterClear()
- libwebpdemux: removed experimental fragment related fields and functions
- libwebpmux: WebPMuxSetCanvasSize()
* new libwebpextras library with some uncommon import functions:
WebPImportGray/WebPImportRGB565/WebPImportRGB4444
* tag 'v0.5.0':
update ChangeLog
faster rgb565/rgb4444/argb output
update NEWS
update AUTHORS
update mailmap
bump version to 0.5.0
README: update help text, repo link
Change-Id: I21dc611cfd2a3cb6ed6ba5c455a5049fe615f7a1
The code and logic is unified when computing bit entropy + Huffman cost.
Speed-wise, we gain 8% for lossless encoding.
Logic-wise, the beginning/end of the distributions are handled properly
and the compression ratio does not change much.
Change-Id: Ifa91d7d3e667c9a9a421faec4e845ecb6479a633