Commit Graph

2216 Commits

Author SHA1 Message Date
Vincent Rabaud
9e8e1b7b2a Inline GetResidual for speed.
Change-Id: Ib4228e87dc448866229c0795ca68dabe777ef31c
2016-06-21 16:04:53 +02:00
Vincent Rabaud
7d58d1b7b9 Speed-up uniform-region processing.
Change-Id: I9a88d0ac97c31d19323c9505ebe21f375d2e96b8
2016-06-21 15:45:46 +02:00
Pascal Massimino
8ec7032bc2 simplify HistogramCombineEntropyBin()
We only perform a single pass, and swap the final histograms
into the beginning of the array as we go. Therefore, they are
already at the correct place at the end of the pass.
-> HistogramCompactBins() is removed, we just truncate the array.

output is bitwise the same.

Change-Id: I9508c96dda0f8903c927a71b06af4e6490c3249c
2016-06-21 14:13:36 +02:00
Pascal Massimino
23e29cb1e3 Merge "Fix a boundary case in BackwardReferencesHashChainDistanceOnly." into 0.5.1 2016-06-20 16:06:39 +00:00
Pascal Massimino
472a049b4e remove bin_map[] allocation altogether
... and just re-use histogram_symbols[] instead!
It's guaranteed to work.

Change-Id: Ie3b0cd5781171ded20058e8bc143fce2f69b4c68
2016-06-20 11:19:25 +02:00
Pascal Massimino
0bb23b2cf7 free -> WebPSafeFree()
avoids unbalanced memory track at the end (w/ PRINT_MEM_INFO flag on)

Change-Id: I70da087f079198bcaacd0c81593f104058dcac69
2016-06-17 17:49:07 +02:00
Pascal Massimino
a977b4b513 Merge "rewrite the bin_map clustering to use less memory" 2016-06-17 08:25:35 +00:00
Pascal Massimino
3591ba6684 rewrite the bin_map clustering to use less memory
output should bit-write the same as before, in both
low_effort and non low_effort modes.

if anything, speed is a tad faster, probably because of the
reduced memory traffic.

Change-Id: Iaa2ddcfda2aaffefe7e5b7bc89216373d1ddb194
2016-06-17 09:52:36 +02:00
James Zern
e6ac450cbd utils.[hc]: s/MAX_COLOR_COUNT/MAX_PALETTE_SIZE/
MAX_COLOR_COUNT was just a synonym and its use in the header was a bit
strange given the visibility of that define.

Change-Id: I536964ddc14a0c48263191b6afb80695b5a038e6
2016-06-16 23:14:30 -07:00
Pascal Massimino
e7b917726f Merge "DecodeImageData(): change the incorrect assert" into 0.5.1 2016-06-17 06:07:44 +00:00
Pascal Massimino
2abfa54f95 DecodeImageData(): change the incorrect assert
this function can be called not to decode pixels, but simply
to finish processing (through process_func()) the already decoded
pixels.

Change-Id: I80485e92e3c47f0aa3389476dcb82745a243fc4a
2016-06-16 22:24:30 -07:00
Vincent Rabaud
0174d18d8b Fix a boundary case in BackwardReferencesHashChainDistanceOnly.
The optimization for (len != MIN_LENGTH) actually only holds for
(len > MIN_LENGTH) but (len < MIN_LENGTH) can now happen as len can
be changed in the loop before.

Change-Id: I3f9f91a540206c80385c5fba96c3d64ab9536752
2016-06-16 19:22:28 +02:00
Parag Salasakar
6a9c262aa8 Merge "Added MSA optimized transform functions" 2016-06-16 10:53:43 +00:00
Vincent Rabaud
cfbcc5ece0 Make sure to consider small distances in LZ77.
This could corrupt certain images since commit
a3611513d2

Change-Id: Ifbe43abaafe8efb27c62af18039fea5a9dc4e062
2016-06-16 09:14:11 +00:00
Parag Salasakar
5e60c42a76 Added MSA optimized transform functions
1. TransformWHT
2. TransformTwo
3. TransformDC
4. TransformAC3

Change-Id: Ia3624cb4aed215bcaffce542b28794e643207039
2016-06-16 09:04:27 +00:00
Pascal Massimino
f2a0946a7a add some asserts to delimit the perimeter of CostManager's operation
a small protection in a fairly complex code.

Change-Id: I920e10e1fc1c35da2cf486349417048d516ff2b9
2016-06-15 20:55:32 +02:00
Pascal Massimino
9a583c66f9 fix invalid-write bug for alpha-decoding
On the non-fast path (use_8b_decode_=0) for decoding the alpha-mask,
we could end up requesting ApplyInverseTransform() with more rows
to process than NUM_ARGB_CACHE_ROWS. This could only happen on the
very last bottom rows of the image.

* ProcessRows() doesn't need to be fixed, since we never request more
  than NUM_ARGB_CACHE_ROWS rows. Added an assert for that.

* the use_8b_decode_=1 case doesn't use argb_cache_, but rather does
  the palette-decoding call directly. So, no problem here too.

Only the generic (and rather rare) case of calling ExtractAlphaRows()
was affected.

Change-Id: I58e28d590dcc08c24d237429b79614abcef1db7c
2016-06-15 16:46:19 +02:00
James Zern
6fda58f137 backward_references: quiet double->int warning
since:
059aab4 Fix a compression regression for images with long uniform
regions.

Change-Id: I1783a74220961e8bc3bb42696e3412fe4bfc4ddb
2016-06-14 15:27:58 -07:00
Pascal Massimino
a48cc9d201 Merge "Fix a compression regression for images with long uniform regions." into 0.5.1 2016-06-14 21:26:06 +00:00
Pascal Massimino
cc2720c1d5 Merge "Revert an LZ77 boundary constant." into 0.5.1 2016-06-14 21:25:08 +00:00
Vincent Rabaud
059aab4fa1 Fix a compression regression for images with long uniform regions.
Change-Id: Id87a4ac2a22daaa71e8f3132e69703b9b3ddd752
2016-06-14 21:51:10 +02:00
Vincent Rabaud
b0c7e49e58 Check more backward matches with higher quality.
Change-Id: I3f0887b0b9b7f0e69758f51783807e1583b74be2
2016-06-14 21:50:03 +02:00
Vincent Rabaud
a3611513d2 Revert an LZ77 boundary constant.
This is getting back to the old behavior which is actually better for
compression and speed with the latest patches.

Change-Id: I35884bab02589297c25d6e1e66dc5f13e05f7aa7
2016-06-14 21:42:45 +02:00
James Zern
0fb2269c4d bump version to 0.5.1
libwebp{,decoder} - 0.5.1
libwebp libtool - 6.1.0
libwebpdecoder libtool - 2.1.0

mux - 0.3.1
libtool - 2.1.0

demux (no changes) - 0.3.0
libtool - 2.0.0

Change-Id: I6f51bfaccf33ff7a1492f3e8f888324d7afc0f4b
2016-06-13 19:10:21 -07:00
Urvang Joshi
3259571e7d Refactor GetColorPalette method.
This was defined (slightly differently) at two places. Created a common
method and moved to utils/utils.[hc].

Change-Id: I66c3ac6dea24e0cd2c0eaa5440f3142b4dbbe23b
2016-06-13 18:52:37 -07:00
Pascal Massimino
1df5e26001 avoid using tmp histogram in PreparePair()
we don't need to store the resulting histogram, so no need to
call HistogramAddEval().
Allows some signature simplifications...

Change-Id: I3fff6c45f4a7c6179499c6078ff159df4ca0ac53
2016-06-12 14:32:22 -07:00
Pascal Massimino
7685123a7f fix comment typos
Change-Id: I2a55e371dbf7e62b446f6bb732c8913b85633c49
2016-06-10 13:29:07 +00:00
Vincent Rabaud
a246b921a0 Speedup backward references.
In case where the same offset is found in consecutive pixels,
the cost computation from one pixel can be re-used for the next.

Change-Id: Ic03c7d4ab95f3612eafc703349cfefd75273c3d7
2016-06-09 20:05:15 +02:00
Pascal Massimino
76d73f1835 Merge "CostManager: introduce a free-list of ~10 intervals" 2016-06-09 15:58:38 +00:00
Pascal Massimino
eab39d8147 CostManager: introduce a free-list of ~10 intervals
and also recycle the malloc'd intervals
This avoids quite some malloc/free cycles during interval managment.

Change-Id: Ic2892e7c0260d0fca0e455d4728f261fb4c3800e
2016-06-09 01:50:50 -07:00
James Zern
4c59aac0f9 Merge "mips msa webp configuration" 2016-06-09 05:39:53 +00:00
Pascal Massimino
043c33f1ce Merge "Improve speed and compression in backward reference for lossless." 2016-06-08 19:54:51 +00:00
Pascal Massimino
71be9b8c11 Merge "clarify variable names in HistogramRemap()" 2016-06-08 19:48:38 +00:00
Vincent Rabaud
0ba7fd70c6 Improve speed and compression in backward reference for lossless.
Change-Id: I664c5e68b036a2d424192962dbad873a2c70b826
2016-06-08 21:13:33 +02:00
Pascal Massimino
0481d42ad8 CostManager: cache one interval and re-use it when possible
In a lot of cases, only one interval is used. This can cause
a lot of malloc/free cycles for only 56 bytes. By caching this
single interval and re-using it, we remove this cycle in most
frequent cases.

Change-Id: Ia22d583f60ae438c216612062316b20ecb34f029
2016-06-08 14:10:06 +00:00
Pascal Massimino
41b7e6b56e Merge "histogram: fix bin calculation" 2016-06-08 11:59:16 +00:00
Pascal Massimino
96c3d62496 histogram: fix bin calculation
It could overflow if min==max

Change-Id: I11580328712879fe310e3d9c8c5e10835c4faf86
also: collect GetHistoBinIndex() variant into a single function
2016-06-08 13:38:00 +02:00
Pascal Massimino
fe9e31ef5e clarify variable names in HistogramRemap()
also simplify the loop by removing the initial special case

Change-Id: I4fc1b9eb14a42381a815354599137703c21ed9e6
2016-06-08 13:36:04 +02:00
Pascal Massimino
ce3c824712 disable near-lossless quantization if palette is used
Change-Id: I3335e8f97f99d6d786e953161adbf246e85d80d3
2016-06-08 11:29:48 +02:00
Parag Salasakar
e11da081f9 mips msa webp configuration
Change-Id: I886164d6d3d560b1249603d47391fddf20b5a3d4
2016-06-07 23:49:41 -07:00
Urvang Joshi
5f8f998d0e mux: Presence of unknown chunks should trigger VP8X chunk output.
As per the spec
(https://developers.google.com/speed/webp/docs/riff_container), only the
extended file format can contain an unknown chunk. So, when assembling a
WebP file with muxer, whenever there is an unknown chunk present, we
should create a VP8X chunk (even though none of the features are
present).

BUG=webp:294

Change-Id: I5da52d311e1853d40063d0f5026100d4325effaa
2016-06-06 09:25:21 -07:00
Pascal Massimino
cadec0b126 Merge "Sync mips32 and dsp_r2 YUV->RGB code with C verison" 2016-06-03 10:32:20 +00:00
Vincent Rabaud
d963775859 Compute the hash chain once and for all for lossless compression.
In some cases, the hash chain for a function is filled several
times:
- GetBackwardReferences -> CalculateBestCacheSize ->
BackwardReferencesLz77 that computes the hash chain
- GetBackwardReferences ->
(not always) BackwardReferencesTraceBackwards ->
BackwardReferencesHashChainDistanceOnly that computes the hash
chain in a slightly different way

Speed and compression performance are slightly changed (+ or -)
but will be homogneized in a later patch.

Change-Id: I43f0ecc7a9312c2ed6cdba1c0fabc6c5ad91c953
2016-06-03 11:42:13 +02:00
Jovan Zelincevic
50a486656d Sync mips32 and dsp_r2 YUV->RGB code with C verison
Change-Id: Ibe12f5ef596b8922225b95c36b67955a3f8b9ae4
2016-06-03 10:42:11 +02:00
Pascal Massimino
ca8d951980 remove some obsolete TODOs
Change-Id: Ied77b2dd7e3e5bb65524c0ac7b9a3fb6585cac57
2016-06-01 16:23:16 +02:00
Pascal Massimino
0b8ae8520f Merge "Move DitherCombine8x8 to dsp/dec.c" 2016-05-27 05:55:04 +00:00
Pascal Massimino
f8b7ce9ef1 Merge "test pointer to NULL explicitly" 2016-05-25 06:46:11 +00:00
Pascal Massimino
5df6f214e3 test pointer to NULL explicitly
Change-Id: I3de6f6b717a21c93f9d3144c66b7328b0b322853
2016-05-24 23:26:19 -07:00
Pascal Massimino
77f21c9c39 Move DitherCombine8x8 to dsp/dec.c
To be later optimized in SSE2

Change-Id: I0de9c89eb5166f3319bb4b0500150de271ecac05
2016-05-24 23:14:41 -07:00
Urvang Joshi
08333b8551 WebPAnimEncoder: Detect when canvas is modified, restore only when needed.
Change-Id: I6b45283e13c08c7d11c96500a65c75ce3637b06d
2016-05-23 11:17:23 -07:00
Pascal Massimino
0209d7e6d6 Merge "speed-up MapToPalette() with binary search" 2016-05-23 14:09:34 +00:00
Pascal Massimino
fdd29a3d3f speed-up MapToPalette() with binary search
goes from ~2% CPU to ~0.7% for large images
Change-Id: Ibd8a0fde9ba553f93157a49dcb7da0426209e404
2016-05-23 15:39:54 +02:00
James Zern
cf4a651bb8 Revert "Refactor GetColorPalette method."
This reverts commit 169004b1d5.

this changes the ABI, so should bump versions and add a note to NEWS
when we're ready to expose it

Change-Id: Ic5bbd0aee2b6fd0f9d438a9effedf22fe0cec4bf
2016-05-20 17:05:33 -07:00
James Zern
0a27aca3f8 Merge changes Idfa8ce83,I19adc9c4
* changes:
  WebPAnimEncoder: Restore original canvas between multiple encodes.
  Refactor GetColorPalette method.
2016-05-21 00:03:15 +00:00
Urvang Joshi
f25c4406e6 WebPAnimEncoder: Restore original canvas between multiple encodes.
tl;dr
We do the following:
- Start with transparent value of 0x00000000 instead of 0x00ffffff, so that
WebPCleanupTransparentAreaLossless() is a no-op.
- Restore the original canvas after lossy encoding, to discard changes made by
WebPCleanupTransparentArea() before the next encode.

Explanation of why:
In the mixed mode, anim_encoder tries to encode using both lossless and lossy
compression. In fact, when "min_size" option is enabled, there are at most 4
encodes that can happen in this order:
- lossless with dispose none
- lossy with dispose none
- lossless with dispose background
- lossy with dispose background

But both lossless and lossy both potentially modify the canvas during encode
(for better compression):
- Lossless: WebPCleanupTransparentAreaLossless() turns all transparent pixels
to 0x00000000
- Lossy: WebPCleanupTransparentArea() flattens some transparent pixels

So, the result is that, sometimes we feed the modified canvas to the encoder
instead of the original one, which isn't the right thing to do.

This also applies to just lossless or just lossy encoding, as multiple encodes
happen (with the two dispose methods) in those cases too.

Change-Id: Idfa8ce831a1627014785ba7d0316c42f72594455
2016-05-20 15:07:50 -07:00
Urvang Joshi
169004b1d5 Refactor GetColorPalette method.
This was defined (slightly differently) at two places. Created a common
method and moved to utils/utils.[hc].

Change-Id: I19adc9c48f2a4e2ec9d995e78add6f25172774c2
2016-05-20 15:06:38 -07:00
James Zern
576362abd7 VP8LDoFillBitWindow: support big-endian in fast path
Change-Id: I577944fe0b85505766050dba5ab5aec48b30f541
2016-05-20 00:31:05 -07:00
James Zern
ac49e4e4dc bit_reader.c: s/VP8L_USE_UNALIGNED_LOAD/VP8L_USE_FAST_LOAD/
no longer gate this on WEBP_FORCE_ALIGNED as WebPMemToUint32() provides
this service. replace that check with WORDS_BIGENDIAN as the block is
currently little-endian specific.

Change-Id: Ie04ec0179022d20dab53da878008ae049837782f
2016-05-19 23:14:02 -07:00
James Zern
d39ceb58ac VP8LDoFillBitWindow: remove stale TODO
the read size may be fixed, but the offsets into buf_ are not. forcing
an aligned read then shifting or using a temporary would be costly. this
is less important now that WebPMemToUint32() is being used.

Change-Id: I357fec8f750969cce91987abebed2f95e27a835f
2016-05-19 23:07:10 -07:00
Pascal Massimino
2ec2de1450 Merge "Speed-up BackwardReferencesHashChainDistanceOnly." 2016-05-19 05:17:03 +00:00
Vincent Rabaud
3e023c17cd Speed-up BackwardReferencesHashChainDistanceOnly.
Instead of comparing all the following pixels over len (which can
frequently reach the maximum MAX_LENGTH=4096 for some images),
intervals are stored and compared.

Change-Id: I0dafef6cc988dde3c1c03ae07305ac48901d60ee
2016-05-19 04:51:13 +00:00
Marcin Kowalczyk
f2e1efbeb7 Improve near lossless compression when a prediction filter is used.
The old implementation in enc/near_lossless.c performing a separate
preprocessing step is used only when a prediction filter is not used,
otherwise a new implementation integrated into lossless_enc.c is used.

It retains the same logic for converting near lossless quality into max
number of bits dropped, and for adjusting the number of bits based on
the smoothness of the image at a given pixel. As before, borders are not
changed.

Then, instead of quantizing raw component values, the residual after
subtract green and after prediction is quantized according to the
resulting number of bits, taking care to not cross the boundary between
255 and 0 after decoding. Ties are resolved by moving closer to the
prediction instead of by bankers’ rounding.

This results in about 15% size decrease for the same quality.

Change-Id: If3e9c388158c2e3e75ef88876703f40b932f671f
2016-05-18 20:59:02 +00:00
James Zern
e15afbce5d dsp.h: fix ubsan macro name
copy and paste error in the previous commit, change
no_sanitize("unsigned-integer-overflow") from WEBP_UBSAN_IGNORE_UNDEF ->
WEBP_UBSAN_IGNORE_UNSIGNED_OVERFLOW

Change-Id: Id178ee14df1f2c4923a91ce423241e26b60b5d32
2016-05-13 11:09:57 -07:00
James Zern
e53c9ccb24 dsp.h: add WEBP_UBSAN_IGNORE_UNSIGNED_OVERFLOW
for suppressing expected failures with -fsanitize=integer

Change-Id: I954cba45f0c96478b770ed7a6ac7491359cae075
2016-05-12 23:51:23 -07:00
James Zern
af81fdb772 utils.h: quiet -fsanitize=undefined warnings
add WEBP_UBSAN_IGNORE_UNDEF to WebPMemToUint32() / WebPUint32ToMem()
when WEBP_FORCE_ALIGNED is unset

Change-Id: I726b2e708ce29681584eb10c8874d5cf1e798756
2016-05-11 23:47:52 -07:00
James Zern
ea0be354a0 dsp.h: remove utils.h include
include utils.h directly where needed to allow utils.h to rely on
defines from dsp.h in a follow-up.

Change-Id: I32e26aaeb0b04ba60b3332f685f9a2be5a0a8d3d
2016-05-11 23:17:21 -07:00
James Zern
cd276aecd0 utils/*.c: ../utils/utils.h -> ./utils.h
Change-Id: I0154e788c864f77d15d6c287df59c0a02e6db5e9
2016-05-11 23:08:32 -07:00
James Zern
c892713104 utils/Makefile.am: add some missing headers
Change-Id: Id5b81337b92819038bc7006de8c30bfc8eb9def5
2016-05-11 23:03:52 -07:00
James Zern
ea24e026aa Merge "dsp.h: add WEBP_UBSAN_IGNORE_UNDEF" 2016-05-11 06:21:45 +00:00
James Zern
369e264e2e dsp.h: add WEBP_UBSAN_IGNORE_UNDEF
only defined when WEBP_FORCE_ALIGNED isn't. use it to quiet alignment
warnings VP8LoadNewBytes().

Change-Id: I710a74bb9375285974e97022540551a3f4eda414
2016-05-10 22:45:13 -07:00
James Zern
0d020a7892 Merge "add runtime NEON detection" 2016-05-11 05:42:13 +00:00
James Zern
5ee2136a71 Merge "add VP8LAddPixels() to lossless.h" 2016-05-10 22:39:29 +00:00
Pascal Massimino
47435a6162 add VP8LAddPixels() to lossless.h
Change-Id: I67f9118f875affa32c47adfedf9df28b0ac9957b
2016-05-10 20:30:30 +00:00
Pascal Massimino
8fa6ac68f0 remove two ubsan warnings
(regarding uint overflow)

Change-Id: I1a76e4b1268370b6b7d6a1aa93b99e57f55fd02e
2016-05-10 18:40:18 +00:00
James Zern
74fb56fb5d add runtime NEON detection
configure gets 2 new options:
--enable-neon / --enable-neon-rtcd

the NEON modules are split to their own convenience lib and built with
auto-detected flags if none are given via CFLAGS.

the /proc/cpuinfo check will only be used for armv7 targets whose
toolchain does not enable NEON by default or didn't have NEON forced by
the CFLAGS from the environment.

Change-Id: I2755bc1d065d5d6ee6143b44978c2082f8bef1c5
2016-05-06 15:32:48 -07:00
Jovan Zelincevic
4154a8395d MIPS update to new Unfilter API
Change-Id: I2b5960812954dfcabc84663382b9e032fd1eeb43
2016-05-05 15:50:34 +02:00
James Zern
6235147ed5 cherry-pick decoder fix for 64-bit android devices
This fixes decoders built against clang-3.8 (r11c). Without this change
bad conditional code would be generated causing all calls to
WebPParseHeaders() to return 4 (UNSUPPORTED_FEATURE).

Original fix:
https://android-review.googlesource.com/#/c/196123

Change-Id: Id4b4d84048d347cea110b6cf297ef9ef4fbed323
2016-05-02 22:27:42 -07:00
Marcin Kowalczyk
5f95589fae Fix WEBP_ALIGN in case the argument is a pointer to a type larger than a byte.
Change-Id: Id74a70aa02be813cdf3ab4407ce207bcd647a6a4
2016-05-02 12:40:37 -07:00
Pascal Massimino
2309fd5c1d replace num_parts_ by num_parts_minus_one_ (unsigned)
This avoids spurious warnings and extra calculations.

Change-Id: Ibbe1b893cf89ee6b31a04d7bcbf92b4b357c8628
2016-05-02 12:19:01 -07:00
James Zern
9629f4bcda SimplifySegments: quiet -Warray-bounds warning
the number of segments are previously validated, but an explicit check
is needed to avoid a warning under gcc-4.9
this is similar to the changes made in:
c8a87bb AssignSegments: quiet -Warray-bounds warning
3e7f34a AssignSegments: quiet array-bounds warning

Change-Id: Iec7d470be424390c66f769a19576021d0cd9a2fd
2016-05-02 12:17:49 -07:00
Pascal Massimino
de47492e91 Merge "update the Unfilter API in dsp to process one row independently" 2016-04-22 17:33:41 +00:00
Pascal Massimino
2102ccd091 update the Unfilter API in dsp to process one row independently
This will allow to work in-place on cropped area later.

Also sped up the inverse gradient filtering in SSE2 (~4%)

Change-Id: I463149eee95d36984328f163a1e17f8cabd87441
2016-04-21 08:10:45 +00:00
Urvang Joshi
e3912d560b WebPAnimEncoder: Restore canvas before evaluating blending possibility.
Without this, the canvas may have been modified in-between two
GenerateCandidate() calls.

Change-Id: I0364a269caabdf282c30a8fa3c61896f0247342e
2016-04-19 16:13:45 -07:00
Urvang Joshi
6e12e1e3d2 WebPAnimEncoder: Fix for single-frame optimization.
Change-Id: I2da8dc34ab9589b58cde0ecb05d71357d77f6b25
2016-04-15 12:10:07 -07:00
James Zern
602f344a36 Merge changes I1d03acac,Ifcb64219
* changes:
  anim_diff: Add an experimental option for max inter-frame diff.
  WebPAnimEncoder: FlattenSimilarPixels(): look for similar
2016-04-07 19:13:56 +00:00
Pascal Massimino
95ecccf6dc only apply color-mapping for alpha on the cropped area
This is only possible if the filtering is not VERTICAL or GRADIENT.
Otherwise, we need the spatial predictors and hence need the un-visible
part above crop_top row.

COLOR_INDEX transform is the only transform that is not predicted
from previous row. Applying the same for other transform (spatial
predict, ...) is going to be more involve and use an extra temporary row.

+ remove ApplyInverseTransformsAlpha()
(work is done directly within ExtractPalettedAlphaRows())

+ change back to using filter_ instead of unfilter_func_

Change-Id: I09e57efae4a4af00bde35f21ca6e3d73b35d7d43
2016-04-07 10:21:02 +02:00
Pascal Massimino
aa809cfeb3 only allocate alpha_plane_ up to crop_bottom row
-> reduces memory consumption in case of cropping

work for bug #288

Change-Id: Icab6eb8e67e3c5c184a9363fffcd9f6acd0b311c
2016-04-06 14:07:16 +02:00
Urvang Joshi
31f2b8d8e1 WebPAnimEncoder: FlattenSimilarPixels(): look for similar
not exactly same.

Based on lossy WebP quality setting, ignore minor differences when
flattening
similar blocks.

For 6k set, at default quality with '-min_size' option, improves
compression by 0.3%

Change-Id: Ifcb64219f941e869eb2643e231220b278aad4cd4
2016-04-06 00:17:19 -07:00
Pascal Massimino
774dfbdc34 perform alpha filtering within the decoding loop
Change-Id: I221ae20ba879cd4f7b48d012db1bc5443d968e17
2016-04-06 00:13:14 -07:00
Pascal Massimino
a4cae68de0 lossless decoding: only process decoded row up to last_row
there's some subtle changes:
  - DecodeAlphaData() may be called with pos==end because we don't want
    to decode more data (there's none left), but because we want to apply
    process_func() to all the unprocessed pixels already decoded
  - last_row is exclusive and should be understood as 'up to last_row'. Can be misleading.
  - VP8LDecodeAlphaImageStream() was testing dec->last_pixel_ for completion,
    which was wrong because last_pixel_ is the last *decoded* pixel, not the
    last *processed* one. -> test now uses last_row_, as expected

Change-Id: I1fb04ba25cd7a4775db9e3deee3e2ae80f9c0a75
2016-04-05 23:26:50 -07:00
Pascal Massimino
238cdcdbe1 Only call WebPDequantizeLevels() on cropped area
this might change some crc slightly, since WebPDequantizeLevels()
performs an analysis pass, counting levels, which impacts the smoothing.
Now, the cropping area is not the same, so minor diffs are expected here
and there.

Change-Id: I3cce1e40c6f11c25b7c841044d637685c5740352
2016-04-05 13:12:47 +00:00
Pascal Massimino
cf6c713a0b alpha: preparatory cleanup
* make ALPHNew/Delete static
* properly init ALPHDec::io_
* introduce AllocateAlphaPlane() and WebPDeallocateAlphaMemory()
* reorganize VP8DecompressAlphaRows()

but we're still allocate the full alpha-plane. Optim will come
in another patch since it's tricky

Change-Id: Ib6f190a40abb7926a71535b0ed67c39d0974e06a
2016-04-04 16:30:29 +02:00
Vincent Rabaud
b95ac0a221 Merge "VP8GetHeaders(): initialize VP8Io with sane value for crop/scale dimensions" 2016-04-04 13:50:05 +00:00
Pascal Massimino
8923139414 VP8GetHeaders(): initialize VP8Io with sane value for crop/scale dimensions
so we can use crop_width/height elsewhere, instead of testing use_cropping

Change-Id: I41705db9f6889b036ecc167d73d1a1536dc33552
2016-04-04 09:21:55 +00:00
Pascal Massimino
5828e1995e use_8b_decode -> use_8b_decode_
Change-Id: I78e2b26c9bbe1364e5d048794528f9290606b545
2016-04-04 09:00:14 +00:00
Pascal Massimino
8dca0247d2 fix bug in alpha.c that was triggering a memory error in incremental mode
this change will be superseded by patch #335160 eventually, but until then
let's fix the problem temporarily.

Change-Id: Iafd979c2ff6801e3f1de4614870ca854a4747b04
2016-04-01 00:59:39 -07:00
Urvang Joshi
9a950c5375 WebPAnimEncoder: Disable filtering when blending is used with lossy encoding.
When FlattenSimilarBlocks() was making some blocks transparent with
averaged RGB values, filtering in lossy compression was causing
blockiness just outside the edge of these blocks.
Disabling filtering for that particular case avoid these block
artifacts.

The total encoded size of the 6k GIF set remains roughly the same (in
fact, reduces a bit).

Change-Id: Ida71cbabd59d851e16d871f53d19473312b3cc77
2016-03-31 23:07:47 -07:00
Urvang Joshi
eb423903a4 WebPAnimEncoder: choose max diff for framerect based on quality.
We pick a mapping with quality 0 mapping to max diff 32, to quality 100
mapping to max_diff 1.

For 6k GIF image set, this improves compression by:
4% at quality 0
0.05% at quality 75

Benefits the MovingThumbnailer test videos too.

Change-Id: I6838ce864d41e1e65311d26b9b8115a12390a253
2016-03-31 23:07:41 -07:00
Urvang Joshi
ff0a94beda WebPAnimEncoder lossy: ignore small pixel differences for frame rectangles.
This way we can ignore some noisy pixels and get tighter frame
rectangles.

Some results:
- Correctness:
Tested that anim_diff reports all images are identical for lossless, and
similar min_psnr value for lossy and mixed modes.

Also checked output images visually to make sure there weren't any
obvious kinks.

- Compression:
A very tiny improvement for 6000 image GIF set we have (0.03%) for lossy
and mixed mode. For some of these images, frames get dropped
automatically as they have a very small diff from previous frame.

10 images from test_video_frames_png show a clear improvement in
compression though. This CL leads to 7 out of 9 lossy WebPs getting
smaller -- for one of them, this leads to a higher quality being picked
(as that’s still < 150 KB).

Change-Id: If539b9e77e1375aa15edc8f926933593a9865f1c
2016-03-31 20:58:42 -07:00
Pascal Massimino
6d8c07d375 Merge "WebPDequantizeLevels(): use stride in CountLevels()" 2016-03-31 07:49:56 +00:00
Pascal Massimino
d96fe5e079 WebPDequantizeLevels(): use stride in CountLevels()
follow-up for patch #333712

Change-Id: I3f85e3fce9e1c9d1a862a14ba56134882807718f
2016-03-31 00:08:51 -07:00
James Zern
ec1b2407a4 WebPPictureImport*: check output pointer
fixes crash with NULL output pointer in calls to simple encode api
(WebPEncodeRGB, etc.)

Change-Id: I91e7a1c0e070ea842b0a2a4ac54e981cac8629bf
2016-03-25 18:21:13 -07:00
Pascal Massimino
c07687699b Merge "Revert "Re-enable encoding of alpha plane with color cache for next release."" 2016-03-25 07:29:07 +00:00
James Zern
41f14bcbc5 WebPPictureImport*: check src pointer
fixes crash with NULL source pointer in calls to simple encode api
(WebPEncodeRGB, etc.)

Change-Id: I706d670c80298da5176aaa5ba0eb2238dd71a8f0
2016-03-24 22:52:01 -07:00
Pascal Massimino
64eed38779 Pass stride parameter to WebPDequantizeLevels()
and also pass 'VP8Io* io' extra param to VP8DecompressAlphaRows()

This is somehow in preparation for some memory optimizations in
the 'cropping' case. For now, only the easy crop_bottom case is
optimized.

Change-Id: Ib54531ba057bf62b98422dbb6c181dda626c72c2
2016-03-18 15:36:58 +01:00
Pascal Massimino
97934e2447 Revert "Re-enable encoding of alpha plane with color cache for next release."
This avoids generating file that would trigger a decoding bug
found in 0.4.0 -> 0.4.3 libwebp versions.

This reverts commit 6ecd72f845.

Change-Id: I4667cc8f7b851ba44479e3fe2b9d844b2c56fcf4
2016-03-18 11:01:54 +01:00
Pascal Massimino
e88c4ca013 fix -m 2 mode-cost evaluation (causing partition0 overflow)
The mode's bits were not taken into account, which is ok for most of cases.
But in case of super large image, with 'easy' content, their overhead starts
mattering a lot and we were omitting to optimize for these.
Now, these mode bits have their own lambda values associated, limiting
the jerkiness. We also limit (for -m 2 only) the individual number of bits
to something that will prevent the partition 0 overflow.

removed the I4_PENALTY constant, which was a rather crude approximation.
Replaced by some q-dependent expression.

fixes issue #289

Change-Id: I956ae2d2308c339adc4706d52722f0bb61ccf18c
2016-03-11 20:34:45 +01:00
Pascal Massimino
4562e83dc2 Merge "add extra meaning to WebPDecBuffer::is_external_memory" 2016-03-09 08:24:19 +00:00
Pascal Massimino
abdb109f3b add extra meaning to WebPDecBuffer::is_external_memory
If value is '2', it means the buffer is a 'slow' one, like GPU-mapped memory.

This change is backward compatible (setting is_external_memory to 2
will be a no-op in previous libraries)

dwebp:  add flags to force a particular colorspace format
new flags is:
   -pixel_format {RGB,RGBA,BGR,BGRA,ARGB,RGBA_4444,RGB_565,
                  rgbA,bgrA,Argb,rgbA_4444,YUV,YUVA}
and also,external_memory {0,1,2}
These flags are mostly for debuggging purpose, and hence are not documented.

Change-Id: Iac88ce1e10b35163dd7af57f9660f062f5d8ed5e
2016-03-09 09:00:13 +01:00
James Zern
875aec7044 enc_neon,cosmetics: break long comment
Change-Id: I88dff0271fef1cc6dd5888572bfe0f09f467b028
2016-03-08 23:33:21 -08:00
James Zern
71e856cf84 GetMBSSIM,cosmetics: fix alignment
Change-Id: I884b3361484b48917fa4cba33cd1217ac51685f9
2016-03-08 23:26:10 -08:00
Pascal Massimino
a90edffb7e fix missing 'extern' for SSIM function in dsp/
Change-Id: Id8143120f01065dc088f4e90bd930f8ea7c3ae5a
2016-03-08 10:27:46 -08:00
Pascal Massimino
423ecaf484 move some SSIM-accumulation function for dsp/
This is in preparation for some SSE2 code.

And generally speaking, the whole SSIM code needs some
revamp: we're not averaging the SSIM value at each pixels
but just computing the overall SSIM value once, for the whole
plane. The former might be better than the latter.

Change-Id: I935784a917f84a18ef08dc5ec9a7b528abea46a5
2016-03-08 07:50:09 +01:00
Pascal Massimino
f08e66245a Merge "Fix FindClosestDiscretized in near lossless:" 2016-03-04 09:34:16 +00:00
James Zern
0d40cc5ea3 enc_neon,Disto4x4: remove an unnecessary transpose
based on the sse2 change in:
9960c31 Remove an unnecessary transposition in TTransform.

~9-10.5% faster at the function-level, < 1% overall

Change-Id: I44413369b230b250fb0dbc51ff2f17cfeda609b7
2016-03-03 16:18:59 -08:00
Marcin Kowalczyk
e8feb20e39 Fix FindClosestDiscretized in near lossless:
- The result is now indeed closest among possible results for all inputs, which
  was not the case for bits>4, where the mapping was not even monotonic because
  GetValAndDistance was correct only if the significant part of initial fit in
  a byte at most twice.

- The set of results for a larger number of bits dropped is a subset of values
  for a smaller number of bits dropped. This implies that subsequent
  discretizations for a smaller number of bits dropped do not change already
  discretized pixels, which improves the quality (changes do not accumulate)
  and compression density (values tend to repeat more often).

- Errors are more fairly distributed between upwards and downwards thanks to
  bankers’ rounding, which avoids images getting darker or lighter in overall.

- Deltas between discretized values are more repetitive. This improves
  compression density if delta encoding is used.

Also, the implementation is much shorter now.

Change-Id: I0a98e7d5255e91a7b9c193a156cf5405d9701f16
2016-03-02 12:47:22 +01:00
James Zern
a6f23c49b2 Merge "AnimEncoder: Support progress hook and user data." 2016-02-20 04:25:57 +00:00
James Zern
a5193774b0 Merge "Near lossless feature: fix some comments." 2016-02-20 03:51:15 +00:00
Urvang Joshi
da98d31ced AnimEncoder: Support progress hook and user data.
Pass them along to internal 'pic' object, so that progress can be reported back
and user data can also be inspected.

Change-Id: Idb5d0d4a76d07283d704a86c5892e1ad7bda09fa
2016-02-19 19:50:30 -08:00
Urvang Joshi
3335713169 Near lossless feature: fix some comments.
Change-Id: I2c5fc2a3b3fe5123d66b42bf148e361b4862dfb9
2016-02-19 19:26:39 -08:00
James Zern
0beed01aa5 cosmetics: fix indent after 2f5e898
2f5e898 fix multiple allocation for transform buffer

Change-Id: Ied5c89c0040671e2eddf23c8b7a78e0d817dd18e
2016-02-19 19:22:34 -08:00
Pascal Massimino
6753f35cac Merge "FTransformWHT optimization." 2016-02-19 09:38:04 +00:00
Vincent Rabaud
6583bb1a42 Improve SSE4.1 implementation of TTransform.
SSE4.1 is slower than the SSE2 implementation and this seems to
be due to a slow _mm_loadl_epi64 implementation by gcc
(hence a bug with my gcc 4.8) and a very slow _mm_hadd_epi32. Both
got confirmed by IACA and experiments.

Change-Id: I05607f66b7ccd8f4f42e000693aea583ffd5768f
2016-02-19 09:11:53 +01:00
Vincent Rabaud
7561d0c338 FTransformWHT optimization.
Data is packed sooner in the functions.

Change-Id: I018cfeca43f015ac755c7f209f9a97984cc0517b
2016-02-18 17:44:05 +01:00
Pascal Massimino
7ccdb734c2 fix indentation after patch #328220
Change-Id: Iccfcf4deaed6b383b9f80ae84b5b0575a4e94b5f
2016-02-18 15:14:41 +01:00
Pascal Massimino
6ec0d2a946 clarify the logic of the error path when decoding fails.
Change-Id: I2f86751ddafa4708dac3ffc9d6ec1f156e027a83
2016-02-18 14:58:18 +01:00
Vincent Rabaud
8aa352b256 Merge "Remove an unnecessary transposition in TTransform." 2016-02-18 08:15:10 +00:00
James Zern
db86088426 Merge "remove useless #include" 2016-02-17 23:17:12 +00:00
Vincent Rabaud
9960c31685 Remove an unnecessary transposition in TTransform.
Change-Id: Ib715c2d5ba659cb2db9c6832875ba508cc2fca3e
2016-02-17 21:41:28 +01:00
Vincent Rabaud
6e36b51188 Small speedup in FTransform.
It removes two _mm_unpacklo_epi32 and two _mm_sub_epi16.

Change-Id: Icdf86259f796ba855d1cda5e9c0e99cb396cb351
2016-02-17 21:26:36 +01:00
Pascal Massimino
2b4fe33e00 Merge "fix multiple allocation for transform buffer" 2016-02-17 07:27:23 +00:00
Pascal Massimino
2f5e8986cf fix multiple allocation for transform buffer
We were not updating the current_width_, which is usually
not a problem, unless we use Delta Palette with small number
of colors
-> Addressed this re-entrancy problem by checking we have
enough capacity for transform buffer.

The problem is not currently visible, until we restrict
the number of gradient used in delta-palette to less than 16.
Then the buffers have different current_width_ and the problem
surfaces.

Change-Id: Icd84b919905d7789014bb6668bfb6813c93fb36e
2016-02-17 06:14:39 +01:00
Vincent Rabaud
bf2b4f114f Regroup common SSE code + optimization.
The transpose refactoring will help removing a transpose in a
later CL.

The horizontal add function helps removing a _mm_sad_epu8 in DC8uv
=> the latency/throughput went from 29/25 to 23/19

Change-Id: I5f3dfd4aad614eb079b1e83631e6a7cef49a3766
2016-02-16 18:34:34 +01:00
Nico Weber
3ef1ce98b9 yuv_sse2: fix -Wconstant-conversion warning
'implicit conversion from 'int' to 'short' changes value from 33050 to
-32486'

original patch:

https://codereview.chromium.org/1657313003/

Make libwebp build with -Wconstant-conversion from newer clangs.

After http://llvm.org/viewvc/llvm-project?rev=259271&view=rev, clang
points out that _mm_set1_epi16(33050) causes an overflow in the short
argument to _mm_set1_epi16().  Since there's no version that takes an
unsigned short, add an explicit cast to tell the compiler that this is
intentional.

No behavior change.

Change-Id: I6b4e3401b15cfbcc895f9e81b5c2dc59d43ffb9b
2016-02-02 14:52:11 -08:00
James Zern
ab3c2583aa anim_encode,DefaultEncoderOptions: init verbose
default to disabled

broken since:
c13245c AnimEncoder: Add a GetError() method.

Change-Id: I51ccb85d5df338570512ec1d7430ad3229f93a9f
2016-02-01 17:00:19 -08:00
Pascal Massimino
75f4af4d54 remove useless #include
Change-Id: Id34f12ec94d8be6853fabd67609a6006ac99f152
2016-01-25 09:34:10 -08:00
Pascal Massimino
6c1d763119 avoid Yoda style for comparison
Change-Id: I8ff9f96951e5e8a619f7132455dd281cbf91aa4d
2016-01-15 23:52:29 -08:00
Vincent Rabaud
8ce975ac82 SSE optimization for vector mismatch.
Change-Id: I564b822033b59d86635230f29ed6197e306a2c4f
2016-01-07 18:23:45 +01:00
Pascal Massimino
7e7b6ccc7f faster rgb565/rgb4444/argb output
SSE2 and NEON implementation.

Change-Id: I342a1c3d84937b8497f0aaecb7ce9bdb7f50296b
2015-12-17 23:38:58 -08:00
James Zern
71100500a8 bump version to 0.5.0
libwebp{,decoder} - 0.5.0
libwebp libtool - 6.0.0
libwebpdecoder libtool - 2.0.0

mux/demux - 0.3.0
libtool - 2.0.0

Change-Id: I5346d13eb827fb5890efbb63ff3f28cea9d0c55f
2015-12-17 19:45:14 -08:00
James Zern
d48e427b1d Merge "demux: accept raw bitstreams" 2015-12-17 22:52:10 +00:00
James Zern
99a01f4f8b Merge "Unify some entropy functions." 2015-12-17 22:35:29 +00:00
James Zern
4b025f10f7 Merge "configure: disable asserts by default" 2015-12-17 22:28:37 +00:00
James Zern
92cbddf89c Merge "fix PrintBlockInfo()" 2015-12-17 21:00:57 +00:00
Vincent Rabaud
ca509a3362 Unify some entropy functions.
The code and logic is unified when computing bit entropy + Huffman cost.

Speed-wise, we gain 8% for lossless encoding.
Logic-wise, the beginning/end of the distributions are handled properly
and the compression ratio does not change much.

Change-Id: Ifa91d7d3e667c9a9a421faec4e845ecb6479a633
2015-12-17 17:00:08 +01:00
Pascal Massimino
367bf903b3 fix PrintBlockInfo()
... which has gone out of sync since the last block-cache layout change.

Change-Id: Ic441ec07b0198b508ce3fd34ab582cb60b1daabc
2015-12-17 15:47:25 +01:00
Pascal Massimino
b0547ff0b4 move back common constants for lossless_enc*.c into the .h
Change-Id: I11bc979db691f6518d85e2e1c3ac7f05d69681b0
2015-12-17 15:11:56 +01:00
Lode Vandevenne
fb4c7832f1 lossless: simpler alpha cleanup preprocessing
setting all transparent pixels to black rather than the "flatten" method.

0.3% smaller filesize on the 1000 PNGs if alpha cleanup is used (before: 18685774, after: 18622472)

Change-Id: Ib0db9e7ccde55b36e82de07855f2dbb630fe62b1
2015-12-17 15:04:50 +01:00
Vincent Rabaud
47ddd5a4cc Move some codec logic out of ./dsp .
The functions containing magic constants are moved out of ./dsp .
VP8LPopulationCost got put back in ./enc
VP8LGetCombinedEntropy is now unrefined (refinement happening in ./enc)
VP8LBitsEntropy is now unrefined (refinement happening in ./enc)
VP8LHistogramEstimateBits got put back in ./enc
VP8LHistogramEstimateBitsBulk got deleted.

Change-Id: I09c4101eebbc6f174403157026fe4a23a5316beb
2015-12-17 07:03:25 +00:00
James Zern
357f455dec yuv_sse2: fix 32-bit visual studio build
src\dsp\yuv_sse2.c : C2719: 'in': formal parameter with
  __declspec(align('16')) won't be aligned
src\dsp\yuv_sse2.c : C2719: 'out': formal parameter with
  __declspec(align('16')) won't be aligned

Change-Id: Ifd79e33b35c70748faff19cd64eba4a8ffce5a5a
2015-12-16 15:04:36 -08:00