inline function MakeARGB32 calls changed to call
via pointers to functions which make (a)rgb for
entire row
Change-Id: Ia4bd4be171a46c1e1821e408b073ff5791c587a9
and apply Paeth predictor (predictor#11) for the low effort (m=0) mode.
For 1000 image PNG corpus (m=0), this change yields speedup of 25% at lower quality
range and about 10% for higher quality range.
Change-Id: I0f036b8ffe45c241e63a067cbf01527b13d8de93
check for __apple_build_version__ to distinguish the two; a version
check could work as Apple bumped Xcode's to 5.x/6.x, but it's unclear
how upstream will deal with their versioning as they go 3.6+, so avoid
it for now.
Change-Id: I67cda67c4f68e262a92d805a63cc1496374be063
we don't need to store the whole distribution in order to compute the alpha
Later, we can incorporate the max_value / last_non_zero bookkeeping
in SSE2 directly.
Change-Id: I748ccea4ac17965d7afcab91845ef01be3aa3e15
this is a first step to unifying encoding/decoding cache stride
and possibly sharing the prediction functions in dsp/
With this layout, there's a little (~7%) space lost with unused samples.
But no speed change was observed.
Change-Id: I016df8cad41bde5088df3579e6ad65d884ee711e
~68% faster
reuses TM4() adding support for the additional rows, the columns were
already being done.
Change-Id: I6eac17e58cd1c636082bf7281f70f884ec399a6b
Move all the Entropy evaluation methods to lossless.c (from histogram.c).
There's slight difference in the way entropy is computed for evaluating
entropy in prediction methods and histogram (literal) for huffman trees.
Plan (later) to merge few (static) methods and reduce the code size.
This change has no impact on the compression speed/density.
Change-Id: Ife3d96a3c4a8d78a91723d9e0a8d1b78c0256a15
set WEBP_EXTERN to visibility=default
+ explicitly mark VP8GetCPUInfo as it's referenced within the examples
Change-Id: Ie3d2b15088e888f0b55203b205993eba75899d99
move the attribute to the front of the function to quiet clang warning:
GCC does not allow no_sanitize_thread attribute in this position on a
function definition
Change-Id: Ie4cc6e35a07bd00eab67d9cd6801bd2be9cfe676
avoids an ICE with NDK r10b + NDK_TOOLCHAIN_VERSION=4.9
In function 'SSE16x16':
enc_mips32.c (684) internal compiler error: Segmentation fault
Change-Id: I1a3d33c0a9534c97633ab93bcdf9bf59d3a7e473
got rid of the |a-b|^|b-a| method and went back
to just (a-b)^2 instead.
quality | size(bytes) after/before | time (ms) after/before
Change-Id: Ia3e0e6507b3f903deb1e182f78dad6df07380fd0
SSE2 version is 2.1x faster
This is used to transfer the alpha plane to green channel before lossless compression.
Change-Id: I01d9df0051c183b1ff5d6eb69961d4f43e33141a
* We were re-doing most of the work in plain-C as 'left-over'.
* we were always returning has_alpha = true because of a bad mask all_0xff
These bugs were conservative and silent, in the sense that we were 'just' doing
more work than necessary.
Now, the SSE2 version is really 2x faster than the C version.
Change-Id: I6c8132a267fe3c7a3d1fa70e7a5fcd10719543fa