When re-initializing a bit writer, we could set invalid values because
the bit writer was not big enough.
Change-Id: Id25ab6712603245a5a12d5f4a86fe35a9a799a5d
This is to prepare for the inclusion of <windows.h>
FrameRect => FrameRectangle
CLIP_MASK => CLIP_8b_MASK
Change-Id: Ia4b1fa4ac06137b4102c91e232206a1fb7159ce0
The patch 21735e0 introduced a bug where a goto path was not testing
the eos_ state. If this happened just before a row_sync, a SaveState()
would be called that would store the eos_ state as '1' till the end
of the loop. This usually was not a problem, except for the very last
chunk where we disable the incremental decoding altogether (we have all
the data). The termination tests were then going wrong.
The fix is to add a proper eos_ test and avoid falling into this
inconsistent state.
(21735e06f7)
BUG=webp:332
Change-Id: Ib16773aee26bfd068fbf4e9db3d2313bd978b269
Before, the color cache size was chosen optimally for LZ77 and
the same value was used for RLE. Now, we optimize its value
taking both LZ77 and RLE into account.
Unfortunately, that comes with a small CPU hit.
Change-Id: I6261f04af78cf0784bb8e8fc4b4af5f566a0e071
Between iterations, we keep track of the previously found
potential merge, hence there is less work to do.
Change-Id: I2b6237447e79443516a6111727d96c24f10bd98a
It was a bad implementation of a Lehmer random number generator
(the saturation was done wrong and mostly & was used instead of % .....).
That led to "for" loops stuck with the same values given a specific seed,
hence wasted "for" loops (e.g. the seed getting to 374988608, where a later
modulo of 64 leads to 0 even when updating the seed with the old formula).
As the "for" loops now always return a proper pair of histograms, their
number can be greatly reduced, hence a speedup.
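For reference, here is a minimal sketch of a classic Park-Miller Lehmer
generator in C, using a proper modulo as described above; this is
illustrative only and not the exact generator used in libwebp:

    #include <stdint.h>

    /* Classic Park-Miller "minimal standard" Lehmer generator:
     *   seed_{k+1} = seed_k * 48271 mod (2^31 - 1), seed != 0.
     * Illustrative only. */
    static uint32_t LehmerNext(uint32_t seed) {
      return (uint32_t)(((uint64_t)seed * 48271u) % 2147483647u);
    }

    /* Picking an index in [0, range) must use a modulo, not a bitwise '&',
     * otherwise non-power-of-two ranges are biased or can get stuck at 0. */
    static uint32_t RandomIndex(uint32_t* const seed, uint32_t range) {
      *seed = LehmerNext(*seed);
      return *seed % range;
    }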
Change-Id: I9f5b44d66cc96fd4824189d92276c3756c8ead5b
This code is ultra-critical for lossless decoding, especially on ARM.
The extra call VP8LIsEndOfStream() was causing unnecessary slow-down.
Now, we check for bitstream-end separately in the main loop.
Change-Id: I739b5d74cc29578e2b712ba99b544fd995ef0e0d
Currently, none are available. If WEBP_HAVE_SSE2 eventually works,
we'll have to refine these conditionals.
BUG=webp:261
Change-Id: Ibc63ee1c013f2a4169eeb85cc8b6317b6420c2ad
Previously, the stochastic method for histogram combination
could finish with a greedy pass if doing so required fewer
iterations than were left to perform.
Except that another greedy combination was performed
afterwards anyway ... hence wasted CPU in some cases.
Change-Id: Ic0f26873e6dc746679486b91cb35d73efee91931
The initial re-writing of this part of the code with intervals
had to be done with complex logic (mostly intervals with a
lower and upper bound, not a constant value like now) to properly
deal with the inefficiencies of the LZ77 algorithm at the time.
The improvements made to LZ77 since then allow for simpler logic.
There were also small errors in the interval insertion logic
that led to small inefficiencies (hence a slightly better
compression rate now).
Change-Id: If079a0cafaae7be8e3f253485d9015a7177cf973
Documentation says: "if kmin == 0, then key-frame insertion is disabled;
and if kmax == 0, then all frames will be key-frames."
Reading this, you'd expect that if kmax == 0, then with any kmin <= 0
all frames will be key-frames. But actually the kmin <= 0 test is caught
first and you get the opposite (no keyframes except the first). To get the
documented kmax == 0 behavior you'd instead have to use some kmin > 0,
which is absolutely counter-intuitive (the precedence of the two rules is
reversed).
Moreover kmax == 1 has no valid kmin (kmin == 1 conflicts with the
`kmax > kmin` rule and kmin == 0 conflicts with `kmin >= kmax / 2 + 1`).
So it should be considered an exception too.
Instead I propose this new logic:
- kmax == 1 means that all frames are keyframes (you are explicitly
requesting a keyframe every 1 frame at most, i.e. all frames).
- kmax == 0 means no keyframes (you ask for a keyframe every 0 frames,
i.e. never).
This is more "logical" language-wise, and also avoids any conflict about
what happens if both kmax and kmin are 0, since a single property value is
now meaningful for each of the two exceptional cases.
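A minimal sketch of the proposed interpretation, with hypothetical names
(not the actual WebPAnimEncoder code):

    typedef enum {
      KEYFRAMES_NONE,      /* kmax == 0: a keyframe every 0 frames, i.e. never */
      KEYFRAMES_ALL,       /* kmax == 1: a keyframe every 1 frame, i.e. always */
      KEYFRAMES_INTERVAL   /* kmax >= 2: regular kmin/kmax handling            */
    } KeyframePolicy;

    /* Hypothetical helper illustrating the proposed kmax semantics. */
    static KeyframePolicy InterpretKmax(int kmax) {
      if (kmax == 0) return KEYFRAMES_NONE;
      if (kmax == 1) return KEYFRAMES_ALL;
      return KEYFRAMES_INTERVAL;
    }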
Change-Id: Ia90fb963bc26904ff078d2e4ef9f74b22b13a0fd
(cherry picked from commit 2dc0bdcaee)
Compiled with XCode, it appears quite a bit slower than the C version,
especially for arm64.
Change-Id: Ic46dba184a36be454fef674129d2f909003788fc
(cherry picked from commit 4f3e3bbd44)
Documentation says: "if kmin == 0, then key-frame insertion is disabled;
and if kmax == 0, then all frames will be key-frames."
Reading this, you'd expect that if kmax == 0, then with any kmin <= 0
all frames will be key-frames. But actually the kmin <= 0 test is caught
first and you get the opposite (no keyframes except the first). To get the
documented kmax == 0 behavior you'd instead have to use some kmin > 0,
which is absolutely counter-intuitive (the precedence of the two rules is
reversed).
Moreover kmax == 1 has no valid kmin (kmin == 1 conflicts with the
`kmax > kmin` rule and kmin == 0 conflicts with `kmin >= kmax / 2 + 1`).
So it should be considered an exception too.
Instead I propose this new logic:
- kmax == 1 means that all frames are keyframes (you are explicitly
requesting a keyframe every 1 frame at most, i.e. all frames).
- kmax == 0 means no keyframes (you ask for a keyframe every 0 frames,
i.e. never).
This is more "logical" language-wise, and also avoids any conflict about
what happens if both kmax and kmin are 0, since a single property value is
now meaningful for each of the two exceptional cases.
Change-Id: Ia90fb963bc26904ff078d2e4ef9f74b22b13a0fd
this avoids duplicates between these trees and dsp/ (e.g., enc/tree.c,
dec/tree.c), making it possible to pull the whole library source tree
into one target
BUG=webp:279
Change-Id: I060a614833c7c24ddd37bf641702ae6a5eef1775
We can now switch at run-time between the standard GetCoeffs() critical
function, which uses a fast variant of VP8GetBit(), and an alternative.
Some platforms have slow instructions that make the standard VP8GetBit()
slow, and GetCoeffs() is the right level of branching at which to switch
to GetCoeffsAlt(), which avoids these slow instructions in some infrequent
cases.
The next patch will upgrade VP8GetBit() to use clz, once this one is
proved to be speed-neutral.
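A hedged sketch of the function-pointer dispatch pattern this describes,
with hypothetical names and signatures (the real GetCoeffs()/GetCoeffsAlt()
prototypes differ):

    #include <stdint.h>

    typedef int (*GetCoeffsFunc)(const uint8_t* br, int ctx);

    static int GetCoeffsFast(const uint8_t* br, int ctx) {
      return br[ctx];   /* stand-in for the path built on the fast VP8GetBit() */
    }
    static int GetCoeffsAlt(const uint8_t* br, int ctx) {
      return br[ctx];   /* stand-in for the path avoiding the slow instructions */
    }

    static GetCoeffsFunc GetCoeffs = GetCoeffsFast;

    /* Hypothetical init hook: picked once at run-time, so the per-coefficient
     * inner loop stays free of any per-call platform check. */
    static void InitGetCoeffs(int slow_platform) {
      if (slow_platform) GetCoeffs = GetCoeffsAlt;
    }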
Change-Id: Ia6cef5de9de6131574d2202bbc0bea8559c9b693
vmlal_u8() is prone to overflow during the accumulation.
There was a mismatch happening mostly at low q: in that case the
distortion is large and the accumulated sum became larger than
16-bit unsigned.
Change-Id: I1a08a2f744bcdf0b26647e61b9ee92a0c2e28fe8
This makes the structure more generic, without the hard-coded
internal structure.
This is a borderline incompatible ABI change, even if the WebPIDecoder
structure is opaque.
Change-Id: I518765c3f76fc17a136cef045a5a8aa70ed70e85
30% faster on x86, 5% faster on N5.
New generic function: WebPLog2FloorC().
This function is called as a fallback for BitsLog2Floor() when there's
no clz() available.
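A minimal portable sketch of such a fallback (illustrative; the actual
WebPLog2FloorC() may use a lookup table):

    #include <stdint.h>

    /* floor(log2(n)) without clz(); returns 0 for n == 0. Illustrative only. */
    static int Log2FloorC(uint32_t n) {
      int log_value = 0;
      while (n >>= 1) ++log_value;
      return log_value;
    }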
Change-Id: Ica15c6092112e514c0e200fab89c434de48d4b19
This is meant to be used for run-time detection of slow platforms
regarding instructions like pshufb and bsr.
Adapted from libvpx patch: https://chromium-review.googlesource.com/#/c/367731
Change-Id: I2c22fbb9aae699d87a041393ba1ad5f1f21ff640
and 15% faster MultARGBRow()
by switching to formulae:
X / 255 = (X + 1 + (X >> 8)) >> 8 for any 16bit value X.
(X / 255 + .5) = (XX + (XX >> 8)) >> 8, with XX = X + 128
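In C, the two formulae translate to the following (valid for any 16-bit X):

    #include <stdint.h>

    static uint32_t DivBy255(uint32_t x) {        /* floor(x / 255), x < 65536 */
      return (x + 1 + (x >> 8)) >> 8;
    }

    static uint32_t DivBy255Round(uint32_t x) {   /* floor(x / 255 + .5), x < 65536 */
      const uint32_t xx = x + 128;
      return (xx + (xx >> 8)) >> 8;
    }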
Change-Id: Ia4a7408aee74d7f61b58f5dff304d05546c04e81
The previous optimization performed a dichotomic search on a function
that can be anything in practice, hence a bit of randomness in the result.
Also, two magic constants were used, one for an extra constant cost,
one for an extra linear cost. Both values/models were empirical.
A brute-force search for the best cache size is now performed.
To lessen the CPU impact, a speed optimization is also made by not
inserting the same value again and again.
This makes sense, and it is also the most common case where LZ77 is
useful, hence sometimes an overall improvement.
Change-Id: I57de5750ad2313b2feecbcd15cd6e4feeb98e5c8
- 12/13/2016: version 0.5.2
This is a binary compatible release.
This release covers CVE-2016-8888 and CVE-2016-9085.
* further security related hardening in the tools; fixes to
gif2webp/AnimEncoder (issues #310, #314, #316, #322), cwebp/libwebp (issue
#312)
* full libwebp (encoder & decoder) iOS framework; libwebpdecoder
WebP.framework renamed to WebPDecoder.framework (issue #307)
* CMake support for Android Studio (2.2)
* miscellaneous build related fixes (issue #306, #313)
* miscellaneous documentation improvements (issue #225)
* minor lossy encoder fixes and improvements
Merge tag 'v0.5.2'
libwebp-0.5.2
- 12/13/2016: version 0.5.2
This is a binary compatible release.
This release covers CVE-2016-8888 and CVE-2016-9085.
* further security related hardening in the tools; fixes to
gif2webp/AnimEncoder (issues #310, #314, #316, #322), cwebp/libwebp (issue
#312)
* full libwebp (encoder & decoder) iOS framework; libwebpdecoder
WebP.framework renamed to WebPDecoder.framework (issue #307)
* CMake support for Android Studio (2.2)
* miscellaneous build related fixes (issue #306, #313)
* miscellaneous documentation improvements (issue #225)
* minor lossy encoder fixes and improvements
* tag 'v0.5.2': (54 commits)
update ChangeLog
anim_util: quiet implicit conv warnings in 32-bit
jpegdec: correct ContextFill signature
Remove some errors when compiling the code as C++.
vwebp: clear canvas during resize w/o animation
tiffdec: restore libtiff 3.9.x compatibility
update NEWS
AnimEncoder: avoid freeing uninitialized memory pointer.
WebPAnimEncoder: If 'minimize_size' and 'allow_mixed' on, try lossy + lossless.
fix a potential overflow with MALLOC_LIMIT
bump version to 0.5.2
update AUTHORS & .mailmap
iosbuild.sh: add WebPDecoder.framework + encoder
AnimEncoder: Correctly skip a frame when sub-rectangle is empty.
Fix assertions in WebPRescalerExportRow()
fix a typo in WebPPictureYUVAToARGB's doc
systematically call WebPDemuxReleaseIterator() on dec->prev_iter_
doc: use two's complement explicitly for uint8->int8 conversion
Anim_encoder: correctly handle enc->prev_candidate_undecided_
WebPPictureDistortion(): free() -> WebPSafeFree()
...
Change-Id: I16bcf54af41ce8fad98d4fbc8aa1df58f338fc23
In GenerateCandidates(), when candidate_ll->evaluate_ and
candidate_lossy->evaluate_ are both true, if lossless encoding
exits on error, candidate_ll->evaluate_ would not be correctly
reset. This would cause freeing an uninitialized memory pointer in
SetFrame().
BUG=webp:322
Change-Id: I481b49a186e4fa3607ce71b4543a481083edf444
(cherry picked from commit 3ebe1c0003)
This improves compression by ~5% at default quality.
If only 'allow_mixed' is on (but 'minimize_size' isn't), we continue to
use a heuristic to try one of the two or both.
Change-Id: Ia573a73ea26ad25f9debff759eed69d2b0449e82
(cherry picked from commit 3f4042b52a)
In GenerateCandidates(), when candidate_ll->evaluate_ and
candidate_lossy->evaluate_ are both true, if lossless encoding
exits on error, candidate_ll->evaluate_ would not be correctly
reset. This would cause freeing an uninitialized memory pointer in
SetFrame().
BUG=webp:322
Change-Id: I481b49a186e4fa3607ce71b4543a481083edf444
after:
fbba5bc optimize predictor #1 in plain-C: "For some reason, gcc has a hard
time inlining this one..."
Change-Id: I2e2416593acd4c9d14958d8757bfd284d999100b
For some reason, gcc has a hard time inlining this one...
Also optimize predictors #0 and #1 for encoding, so we don't have to
call through the generic VP8LPredictors[...] pointers.
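For reference, a sketch of what predictors #0 and #1 compute according to
the lossless bitstream format (illustrative; the actual libwebp prototypes
and code differ):

    #include <stdint.h>

    /* Predictor #0: opaque black; predictor #1: the pixel to the left. */
    static uint32_t Predictor0(uint32_t left, const uint32_t* const top) {
      (void)left; (void)top;
      return 0xff000000u;
    }
    static uint32_t Predictor1(uint32_t left, const uint32_t* const top) {
      (void)top;
      return left;
    }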
Change-Id: I1ff31e3b83874b53f84fe23487f644619fd61db9
Set enc->prev_candidate_undecided_ to 0 when a frame is not chosen
as a possible keyframe, so that the dispose method can be
dispose-to-background.
Change-Id: If2899f5dbc06fb53705fb8240072ab6440a6de12
(cherry picked from commit 29fedbf58b)
When try_both_modes=0 (that is: -m 0 or -m 1), and the mode is i4,
we were still sometimes falling back to (unexplored, uninitialized) i16 mode,
which resulted in an enc/dec mismatch.
This was mainly occurring for large images (when bit_limit is low enough).
We disable the fall-back by disabling bit_limit, using a large MAX_COST
threshold.
Change-Id: I0c60257595812bd813b239ff4c86703ddf63cbf8
(cherry picked from commit 0a3838ca77)
the min-distortion was much too low. And we were also
considering the fully skipped macroblocks (nz=0) in the stats.
We need to have at least *some* non-zero dc coeffs (nz=0x100XXXX).
Also fix two typos in StoreMaxDelta: the v0/v1 comparison was wrong,
and the DCs[] coeffs are actually already in ZigZag order.
Change-Id: I602aaa74b36f7ce80017e506212c7d6fd9deba1f
(cherry picked from commit e4cd4daf74)
Some multiplies here and there needed extra checks
and error reporting. Even though width * height is guaranteed
to be < 2**32, we were multiplying by num_channels and
triggering a 32-bit overflow.
Additionally, some multiplies were not using size_t or uint64_t.
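A hedged sketch of the kind of guarded computation involved (hypothetical
helper, not the exact code that was changed):

    #include <stdint.h>
    #include <stddef.h>

    /* Compute width * height * num_channels in 64 bits and verify that the
     * result fits in size_t before it is used for an allocation. */
    static int ComputeBufferSize(uint32_t width, uint32_t height,
                                 uint32_t num_channels, size_t* const size) {
      const uint64_t total = (uint64_t)width * height * num_channels;
      if (total != (uint64_t)(size_t)total) return 0;   /* overflow: report error */
      *size = (size_t)total;
      return 1;
    }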
Change-Id: If2a35b94c8af204135f4b88a7fd63850aa381bbf
(cherry picked from commit 1c36440094)
max_i4_header_bits_ could drop to zero for a difficult image and trigger
an endless loop. Surprisingly, StatLoop() didn't have this bug.
Change-Id: Idc0f9eadef30a2b2f02041b994f25def30901e36
(cherry picked from commit 21e7537abe)
Pick the mode with the smallest alpha.
It only affects m0, in which case the mode decision is not re-examined
later in VP8Decimate(). Tests on some natural content png images show
PSNR increase as well as visual quality improvement.
Change-Id: Iea997e718cd7477160fa05eb7cfb35f4cec2fa9a
(cherry picked from commit 1377ac2ec1)
Average3 created a slowdown of 1-2% in lossless decoding.
Average4 created a slowdown of 2-3% in lossless decoding.
Change-Id: Ic2e62cdd83fc897887ec2bf41ea7cadbada84fe5
...instead of the pointers stored in the array.
Should be faster (inlined) and safer.
Also: explicitly suffix the functions with _SSE2
Change-Id: Ie7de4b8876caea15067fdbe44abfedd72b299a90
Before, a first thread could enter VP8LDspInitSSE2 and set
VP8LPredictorsAdd to an SSE2 version BEFORE another thread
did the memcpy from VP8LPredictorsAdd to VP8LPredictorsAdd_C,
thus leading to the C version actually being the SSE2 one (which
would then create an infinite recursion in the SSE2 predictors
at execution time).
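One way to avoid this hazard is to take the plain-C snapshot before
installing any SIMD override. A hedged sketch with hypothetical names (not
necessarily the fix that was applied):

    #include <string.h>

    typedef void (*PredictorAddFunc)(void);

    static void PredAdd_C(void) { /* plain-C implementation (stub) */ }
    static void PredAdd_SSE2(void) { /* SSE2 implementation (stub) */ }

    static PredictorAddFunc PredictorsAdd[1];
    static PredictorAddFunc PredictorsAdd_C[1];

    static void DspInit(int has_sse2) {
      PredictorsAdd[0] = PredAdd_C;
      /* Snapshot the C entry points BEFORE any SIMD override, so a reader can
       * never copy an SSE2 pointer into the "C" table. */
      memcpy(PredictorsAdd_C, PredictorsAdd, sizeof(PredictorsAdd));
      if (has_sse2) PredictorsAdd[0] = PredAdd_SSE2;
    }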
Change-Id: I224f4ceab31d38f77a1375a7e2636a6014080e3a
Benchmarks from vrabaud@:
8BIT/GRAY corpus speed: faster: -4.3 % , corpus size: unchanged
skal/sources_png_skal corpus speed: faster: -5.2 % , corpus size: unchanged
images/png_rgb corpus speed: faster: -5.1 % , corpus size: unchanged
images/lpcb corpus speed: unchanged, corpus size: unchanged
images/png_big corpus speed: faster: -1.7 % , corpus size: unchanged
images/png_doc corpus speed: unchanged, corpus size: unchanged
images/png_1bit corpus speed: faster: -1.2 % , corpus size: unchanged
images/jpeg_small corpus speed: unchanged, corpus size: unchanged
images/icip_core1 corpus speed: unchanged, corpus size: unchanged
images/png_gray corpus speed: faster: -2.5 % , corpus size: unchanged
images/jpeg_high_quality corpus speed: faster: -4.0 % , corpus size: unchanged
images/jpeg corpus speed: faster: -2.3 % , corpus size: unchanged
images/png_translucent corpus speed: faster: -2.8 % , corpus size: unchanged
images/gif corpus speed: faster: -1.4 % , corpus size: unchanged
images/png_opaque corpus speed: faster: -2.8 % , corpus size: unchanged
images/png_rgb_opaque corpus speed: unchanged, corpus size: unchanged
images/png_indexed corpus speed: faster: -2.0 % , corpus size: unchanged
images/all corpus speed: faster: -1.5 % , corpus size: unchanged
images/png_small corpus speed: unchanged, corpus size: unchanged
images/png corpus speed: unchanged, corpus size: unchanged
images/gif_still corpus speed: faster: -1.6 % , corpus size: unchanged
Change-Id: I69fe11baa188c5d32cbc77a84b8c0deae13d792b
avoiding triplets of data should make it easier to write SSE2 versions.
FilterRow() can now filter all input in a single pass
-> conversion is 15-20% faster (but still slow overall compared to -pre 0)
Change-Id: I14c3215e672fdecde7ec80394e814bdc7445019f
When try_both_modes=0 (that is: -m 0 or -m 1), and the mode is i4,
we were still sometimes falling back to (unexplored, uninitialized) i16 mode,
which resulted in an enc/dec mismatch.
This was mainly occurring for large images (when bit_limit is low enough).
We disable the fall-back by disabling bit_limit, using a large MAX_COST
threshold.
Change-Id: I0c60257595812bd813b239ff4c86703ddf63cbf8
avoids int rollover when working with large input
BUG=webp:312
Change-Id: I6ad9f93b6c4b665c559bff87716a7b847f66a20d
(cherry picked from commit 342e15f0ce)
avoids int rollover when working with large input
BUG=webp:312
Change-Id: I2881bec2884b550c966108beeff1bf0d8ef9f76b
(cherry picked from commit 1147ab4ee7)
avoids int rollover when working with large input
BUG=webp:312
Change-Id: I693cbb295df9cf94aa89294b19c0496bdbe84d18
(cherry picked from commit de9fa5074e)
avoids int rollover when working with large input
BUG=webp:312
Change-Id: I3d7b689be8d5751248a82d1021243d80d3f67203
(cherry picked from commit deb1b83199)
the min-distortion was much too low. And we were also
considering the fully skipped macroblocks (nz=0) in the stats.
We need to have at least *some* non-zero dc coeffs (nz=0x100XXXX).
Also fix two typos in StoreMaxDelta: the v0/v1 comparison was wrong,
and the DCs[] coeffs are actually already in ZigZag order.
Change-Id: I602aaa74b36f7ce80017e506212c7d6fd9deba1f
Roughly, if both the source and the reference areas are
too dark (R/G/B <= ~6), they are ignored.
One caveat: SSIM calculation won't work for U/V planes,
which are 128-centered and not related to luminance.
But WebPPlaneDistortion() enforces the conversion to RGB,
if needed.
Change-Id: I586c2579c475583b8c90c5baefd766b1d5aea591
Make WebPPictureDistortion() only compute distortion on A/R/G/B planes, not Y/U/V(A).
(not just for SSIM, but for PSNR too).
This is to avoid problems with using SSIM on U/V channels.
If Y/U/V distortion is needed, one can always use WebPPlaneDistortion() individually.
Change-Id: If8bc9c3ac12a8d2220f03224694fc389b16b7da9
When compiling as experimental, WEBP_EXPERIMENTAL_FEATURES
would not be defined because the header defining it would
not be included.
Hence runtime errors in debug mode when running:
./cwebp -lossless whatever
...
Error! Cannot encode picture as WebP
Error code: 4 (INVALID_CONFIGURATION: configuration is invalid)
(detail: WebPConfig would have a random value set for
delta_palettization as config.c does not consider
it to exist.)
Change-Id: I41761cffe81a971130ed514b195a73d1c6dac1b7
* prevent 64bit overflow by controlling the 32b->64b conversions
and preventively descaling by 8bit before the final multiply
* adjust the threshold constants C1 and C2 to de-emphasize the dark
areas (see the SSIM expression sketched below)
* use a hat-like filter instead of box-filtering to avoid blockiness
during averaging
SSIM distortion calc is actually *faster* now in SSE2, because of the
unrolling during the function rewrite.
The C version is quite a bit slower because it is still unoptimized.
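For context, the standard per-window SSIM expression that the constants C1
and C2 enter (illustrative; libwebp's fixed-point scaling differs):

    /* ssim = ((2*mu_x*mu_y + C1) * (2*cov_xy + C2)) /
     *        ((mu_x^2 + mu_y^2 + C1) * (var_x + var_y + C2)) */
    static double SSIMWindow(double mu_x, double mu_y,
                             double var_x, double var_y, double cov_xy,
                             double C1, double C2) {
      const double num = (2. * mu_x * mu_y + C1) * (2. * cov_xy + C2);
      const double den = (mu_x * mu_x + mu_y * mu_y + C1) *
                         (var_x + var_y + C2);
      return num / den;
    }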
Change-Id: I96e2715827f79d26faae354cc28c7406c6800c90
If a small hash map can be used, use it to avoid binary search.
The first hash function that is tried works with the previous
use case of having indexed data in the green channel.
Change-Id: I2f91cec5f3ca7e9c393fd829e69e09bab74f4e7c
Some multiplies here and there needed extra checks
and error reporting. Even though width * height is guaranteed
to be < 2**32, we were multiplying by num_channels and
triggering a 32-bit overflow.
Additionally, some multiplies were not using size_t or uint64_t.
Change-Id: If2a35b94c8af204135f4b88a7fd63850aa381bbf
The most common conditions are re-ordered and cached.
iter_min was recently introduced to make sure enough iterations
are made in cases where there are many matches (mostly uniform regions).
Now that those are properly analyzed, it is no longer needed.
Change-Id: Id3010ee4ec66b84d602fcb926f91eb9155ad27f4
-Skip examining quantized levels that are too high.
-Calculate last_pos_cost only when needed.
Encoding speed for m6 is increased by about 3%;
Compression performance is neutral.
Change-Id: I8af70b049587cca0375d9b3eb00479ec7c0c842a
max_i4_header_bits_ could drop to zero for a difficult image and trigger
an endless loop. Surprisingly, StatLoop() didn't have this bug.
Change-Id: Idc0f9eadef30a2b2f02041b994f25def30901e36
Pick the mode with the smallest alpha.
It only affects m0, in which case the mode decision is not re-examined
later in VP8Decimate(). Tests on some natural content png images show
PSNR increase as well as visual quality improvement.
Change-Id: Iea997e718cd7477160fa05eb7cfb35f4cec2fa9a
SSIM results are incompatible with the previous version!
We're now averaging the SSIM value for each pixel instead of
printing a frame-level global SSIM value.
* Got rid of some old code
* switched to uint32_t for accumulation
* refactoring
SSIM calculation is ~4x faster now.
Change-Id: I48d838e66aef5199b9b5cd5cddef6a98411f5673
Having it architecture-dependent resulted in an extra
call to an extern function, hence no inlining and
a 5-10% impact on performance.
Change-Id: I0ff40d2d881edc76d3594213a64ee53097d42450
-print_psnr is now much faster because it doesn't use the SSIM code.
The SSIM speed-up and re-write will come later.
Change-Id: Iabf565e0a8b41651d8164df1266cfeded4ab4823
we don't need to centralize best_uv[] since target_uv[] and best_rgb_uv[]
are already centralized. The diff 'W' was just in the ~[-2,2] range, so
we can ignore the correction.
The overall speed impact is not large, though: around ~4% faster conversion.
Output with -pre 4 is expected to be slightly different.
Change-Id: Ib59f033955577c49b084d0560108020f42d84102
also: remove the useless clipping in StoreGray()
For speed reasons, the 'gray' plane was initialized with the same
value for each 2x2 block. But in some cases (e.g., underlying camera noise),
it could lead to instability during iteration, noise amplification,
and visible banding.
Using a precise (but slower) initialization solves the issue, and
since the convergence is faster, we might actually gain some speed.
Change-Id: I81c42101497e7096a8f60289d710f5a3bcb0ddea
We usually need at least 2 iterations to converge
(and usually not much more after that). Only 1 was not enough.
Change-Id: Iaf802ea81afa2596f4ba045c92f5eaff61623b7b
Set enc->prev_candidate_undecided_ to 0 when a frame is not chosen
as a possible keyframe, so that the dispose method can be
dispose-to-background.
Change-Id: If2899f5dbc06fb53705fb8240072ab6440a6de12
No need to find backward references for pixels in uniform regions
by looking at all pixels.
Only pixels at the same distance from the end need to be compared.
Change-Id: I4f187e965f0667d3a929775726a412f7e69f6473
if src_{r,g,b} = 0, any value of src_a or dst_a such that
abs(src_a - dst_a) <= max_allowed_diff
was making the test pass, despite being very different-looking pixels.
The fix is to require the same value for src_a and dst_a before attempting
an r/g/b comparison.
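A hedged sketch of the fixed comparison, with a hypothetical helper (not
the actual AnimEncoder code):

    #include <stdint.h>
    #include <stdlib.h>

    /* Alpha must match exactly before the r/g/b channels are compared. */
    static int PixelsAreSimilar(uint32_t src, uint32_t dst, int max_diff) {
      const int src_a = (src >> 24) & 0xff;
      const int dst_a = (dst >> 24) & 0xff;
      if (src_a != dst_a) return 0;   /* the fix: different alpha -> not similar */
      if (src_a == 0) return 1;       /* both fully transparent */
      return abs((int)((src >> 16) & 0xff) - (int)((dst >> 16) & 0xff)) <= max_diff &&
             abs((int)((src >>  8) & 0xff) - (int)((dst >>  8) & 0xff)) <= max_diff &&
             abs((int)((src >>  0) & 0xff) - (int)((dst >>  0) & 0xff)) <= max_diff;
    }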
Change-Id: If3a55a229eab3904ed454f20065e49e35c39f25c
Constants are such that brute force is sometimes faster for some
data (mostly big images it seems).
Change-Id: I90aef536408683535e3b09ddfa2e77a9834038f6
Return key/index if the query is found, and -1 otherwise.
The benefit of this is to save a hashing computation.
Change-Id: Iff056be330f5fb8204011259ac814f7677dd40fe
reserve src/ for the code of the main libraries: libwebp, libwebpdemux
and libwebpmux
+ demote this library to internal only; i.e., don't install the header +
lib with make install
Change-Id: I8c9844db8f494be0fa0a2549a5b75b5cebcf666d
We add the following MSA optimized rescaling functions:
- RescalerExportRowExpand
- RescalerExportRowShrink
Change-Id: Ic1c76065423b02617db94cf0c22bb564219b36e6
We add the following MSA optimized color transform functions:
- TransformColor
- SubtractGreenFromBlueAndRed
Change-Id: Ib182d2b5faa7191f503ce70f0dfde0ac89402fd3
This improves compression by ~5% at default quality.
If only 'allow_mixed' is on (but 'minimize_size' isn't), we continue to
use a heuristic to try one of the two or both.
Change-Id: Ia573a73ea26ad25f9debff759eed69d2b0449e82
The decision is based on the variance of the DC values of the 4x4
sub-blocks. This heuristic works rather well for predicting whether
the 2nd transform (intra-16) is going to help or not.
The decision threshold varies with quality (=quantization).
It's only used for -m 0 and -m 1, where no full RD-opt is performed.
It actually makes these modes quite faster, with RD curve much
closer to the -m 2 mode.
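A hedged sketch of what such a variance test could look like (hypothetical
helper and threshold handling, not the actual libwebp code):

    #include <stdint.h>

    /* Decide whether intra-16 is likely to help, based on the variance of the
     * sixteen 4x4 sub-block DC values. The threshold would depend on quality. */
    static int PreferIntra16(const int16_t dc[16], int64_t variance_threshold) {
      int i;
      int64_t sum = 0, sum2 = 0;
      for (i = 0; i < 16; ++i) {
        sum += dc[i];
        sum2 += (int64_t)dc[i] * dc[i];
      }
      /* 256 * variance == 16 * sum2 - sum * sum */
      return (16 * sum2 - sum * sum) < 256 * variance_threshold;
    }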
Change-Id: I15f972db97ba4082cbd1dfd16bee3eb2eca701a8
We add the following MSA optimized encoder quantization functions:
- QuantizeBlock
- Quantize2Blocks
Change-Id: Ie32b442afa99eee62d2ef48942b41116a4e157d3
pre-allocating a sorted[] array for the most common cases of small
alphabet sizes cuts a lot of allocation traffic.
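A hedged sketch of the stack-buffer-with-heap-fallback pattern (hypothetical
names and threshold):

    #include <stdint.h>
    #include <stdlib.h>

    #define SORTED_SIZE_MAX 64   /* hypothetical "small alphabet" threshold */

    static int BuildWithSorted(int alphabet_size) {
      uint16_t sorted_stack[SORTED_SIZE_MAX];
      uint16_t* const sorted =
          (alphabet_size <= SORTED_SIZE_MAX)
              ? sorted_stack
              : (uint16_t*)malloc(alphabet_size * sizeof(uint16_t));
      if (sorted == NULL) return 0;
      /* ... fill and use sorted[] ... */
      if (sorted != sorted_stack) free(sorted);
      return 1;
    }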
Change-Id: I73ff2f6e507f81b0b0bb7d9801a344aa4bcb038a
+ s/src_a/dst_a/
+ remove unnecessary (void) as expected_num_lines_out is used within the
function
Change-Id: Ic45f798ef22bd19eaabf1a0512d1cf8a201bb4b5