DitherRow() only checks this value, not 'skip_' so previously it was
uninitialized for these blocks.
Change-Id: I0f698b81854ee9d91edacb51c1e3bdab9cba96f2
similar to:
1ba61b0 enable NEON intrinsics in aarch64 builds
vtbl1_u8 is available everywhere but Xcode-based iOS arm64 builds, use
vtbl1q_u8 there.
performance varies based on the input, 1-3% on encode was observed
Change-Id: Ifec35b37eb856acfcf69ed7f16fa078cd40b7034
AnalyzeSubtractGreen constitutes about 8-10% of the comression CPU cycles.
Statistically, subtract-green is proved to be useful for most of the
non-palette compression. So instead of evaluating the entropy (by calling
AnalyzeSubtractGreen) apply subtract-green transform for the low-effort
compression.
This changes speeds up the compression at m=0 by 8-10% (with very slight loss
of 0.07% in the compression density).
Change-Id: I9797dc39437ae089716acb14631bbc77d367acf4
Speed up AnalyzeSubtractGreen by looping through the image pixel once to
compute the two histograms.
AnalyzeEntropy code cleanup.
Removed some 'if' conditions and pointer indirections inside pixel iterate loop.
Change-Id: Ia65e3033988ff67df8e3ecce19d6e34cfc76358e
Enable the WebP near-lossless feature by pre-processing the image to smoothen
the pixels.
On a 1000 PNG image corpus, for which WebP lossless (default settings) gets
25% compression gains, following is the performance of near-lossless feature
at various '-near_lossless' levels:
-near_lossless 90: 30% (very very high PSNR 54-60dB)
-near_lossless 75: 38% (very high PSNR 48-54dB)
-near_lossless 50: 45% (high PSNR 42-48dB)
-near_lossless 25: 48% (moderate PSNR 36-42dB)
-near_lossless 10: 50% (PSNR 30-36dB)
WebP near-lossless is specifically useful for discrete-tone images like
line-art, icons etc.
Change-Id: I7d12a2c9362ccd076d09710ea05c85fa64664c38
Some frames that were previously selected as key-frames were incorrectly
being reset to sub-frames.
Change-Id: Iee342dbb9a9aec144b8185c3b54ca56aa7038bfb
The 'inverse' variants are harder to parallelize, since
the result of filtering is used for prediction.
The 'direct' way is relatively easier.
The heavy bottleneck left for optimization is still GradientUnfilter()
Change-Id: I358008f492a887e8fff6600cb27857b18dee86e9
Simplify and speedup backward references for low-effort settings by evaluating
LZ77 references only. This change speeds up compression by 10-25% at lower
(q <= 25) quality range with a slight drop (0.2%) in the compression density.
Change-Id: Ibd6f03b1a062d8ab9191786c2a425e9132e4779f
Cleaup Near-lossless code
- Simplified and refactored the code.
- Removed the requirement (TODO) to allocate the buffer of size WxH and work
with buffer of size 3xW.
- Disabled the Near-lossless prr-processing for small icon images (W < 64 and H < 64).
Change-Id: Id7ee90c90622368d5528de4dd14fd5ead593bb1b
Reported here: https://code.google.com/p/webp/issues/detail?id=239
At the beginning of method 'DecodeImageData', pixels up to
'dec->last_pixel_' are assumed to be already cached. So, at the end of
previous call to that method also, that assumption should hold true.
Hence, we should cache all pixels up to 'src' regardless of 'src_last'.
This affects lossless incremental decoding only, as that is when
src_last and src_end differ.
Note: alpha decoding is implicitly incremental, as alpha decoding of
only the rows 'y_end - y_start' happens during FinishRow() call. So, this bug
affects alpha decoding in non-incremental decoding flow as well.
This bug was introduced in: https://gerrit.chromium.org/gerrit/#/c/59716.
Change-Id: Ide6edfeb2609b02aff701e1bd9fd776da0a16be0
SanitizeEncoderOptions() was changing kmin to 2 too, which resulted in a
bad state with kmin == kmax.
Change-Id: Ie7273f1949bac469e7e6c8efbc98b154caf6de0f
* use the same TFIX == YFIX precision (2bits)
* use int instead of float in LinearToGammaF()
output is visually equivalent. Code is a little faster.
Change-Id: Ie3cfebca351dbcbd924b3d00801d6523dca6981f
add additional return checks and asserts to avoid:
C6102: Using 'XXX' from failed function call ...
Change-Id: I51f5fa630324e0cd7b2d9fceefecb4f4021474b1
width / height are unsigned; fixes a warning with msvs /analyze:
C6340: Mismatch on sign: 'const unsigned int' passed as _Param_(4) when
some signed type is required in call to 'fprintf'.
Change-Id: I5f1fad4c93745baf17d70178a5e66579ccd2b155
check enc->argb_ to quiet an msvs /analyze warning:
C6387: 'enc->argb_+y*width' could be '0': this does not adhere to the
specification for the function 'memcpy'.
Change-Id: I87544e92ee0d3ea38942a475c30c6d552f9877b7
quiets an msvs /analyze warning, the cast is due to an earlier msvs
build warning.
C6297: Arithmetic overflow: 32-bit value is shifted, then cast to
64-bit value. Results might not be an expected value.
Change-Id: I0bb6cda57879f2fbd1e3515f6753a11bc08d14ac
and only use it on x86 / x64 where it's available.
has the side-effect of quieting a msvs /analyze warning:
C6001: Using uninitialized memory 'cpu_info'.
Change-Id: Iae51be3b22b2ee949cfc473eeea9fd9fb6b3c2cb
Disable costly TraceBackwards heuristic for computing the backward references
for low_effort (method=0) compression.
The TraceBackwards heuristic is already disabled for lower (q < 25) quality
range. Following is the compression data for 1000 image corpus for q >= 25.
This speeds up compression (q >= 25) by a factor of 2.5-3X with slight loss of
compression density (0.7% for lower quality range and 1.2% for higher qualities).
Change-Id: I256c9e2137c7de4083f423ea32ee12d3b0f46253
added new function CollectColorRedTransforms to C, which calls
TransformColorRed and it is realized via pointer to function
Change-Id: Ia68d73bfcf1ca2cb443dc2825910946221f87835
explicitly add immintrin.h instead of transitively picking it up via
windows.h presumably. makes the code easier to move around.
Change-Id: If70d5143ac94fc331da763ce034358858e460e06