copy and paste error in the previous commit, change
no_sanitize("unsigned-integer-overflow") from WEBP_UBSAN_IGNORE_UNDEF ->
WEBP_UBSAN_IGNORE_UNSIGNED_OVERFLOW
Change-Id: Id178ee14df1f2c4923a91ce423241e26b60b5d32
include utils.h directly where needed to allow utils.h to rely on
defines from dsp.h in a follow-up.
Change-Id: I32e26aaeb0b04ba60b3332f685f9a2be5a0a8d3d
configure gets 2 new options:
--enable-neon / --enable-neon-rtcd
the NEON modules are split to their own convenience lib and built with
auto-detected flags if none are given via CFLAGS.
the /proc/cpuinfo check will only be used for armv7 targets whose
toolchain does not enable NEON by default or didn't have NEON forced by
the CFLAGS from the environment.
Change-Id: I2755bc1d065d5d6ee6143b44978c2082f8bef1c5
This will allow to work in-place on cropped area later.
Also sped up the inverse gradient filtering in SSE2 (~4%)
Change-Id: I463149eee95d36984328f163a1e17f8cabd87441
This is in preparation for some SSE2 code.
And generally speaking, the whole SSIM code needs some
revamp: we're not averaging the SSIM value at each pixels
but just computing the overall SSIM value once, for the whole
plane. The former might be better than the latter.
Change-Id: I935784a917f84a18ef08dc5ec9a7b528abea46a5
global effect is ~2% faster encoding from JPG source
and ~8% faster lossless-webp source decoding to PGM (e.g.)
Also revamped the YUVA case to first accumulate R/G/B value into 16b
temporary buffer, and then doing the UV conversion.
-> New function: WebPConvertRGBA32ToUV
Change-Id: I1d7d0c4003aa02966ad33490ce0fcdc7925cf9f5
Just for RGB24/BGR24 for now, which are the hard-to-optimize ones.
SSE2 implementation coming next.
ConvertRowToY() should go into dsp/ too, at some point.
Change-Id: Ibc705ede5cbf674deefd0d9332cd82f618bc2425
also switch to using ExtractAlpha() instead of hard-coding the loop.
The ARGBToY/UV functions are rather easy to port to SSE2 / NEON.
Change-Id: I8f1346a9ca427a36ce2d6c848369ca7964d8b3c7
* vertical expansion now uses bilinear interpolation
* heavily assumes that the alpha plane is decoded in full, not row-by-row
* split the RescalerExportRow and RescalerImportRow methods into Shrink
and Expand variants.
* MIPS implementation of ExportRowExpand is missing.
There's room for extra speed optim and code re-org, but let's keep that for later patches.
addresses https://code.google.com/p/webp/issues/detail?id=254
Change-Id: I8f12b855342bf07dd467fe85e4fde5fd814effdb
generates a stub function when the specific architecture is not enabled,
exposing a symbol in the module, avoiding a compiler warning
Change-Id: Ia9336e57466a9b5241b85c1c95838e91c9283147
removes circular dependency between dsp and enc.
since:
a987fae MIPS: dspr2: added optimization for function GetResidualCost
Change-Id: Ifeb8fc02de89e2ba982ed7ffacd925d649bfec3c
set/get residual C functions moved to new file in src/dsp
mips32 version of GetResidualCost moved to new file
Change-Id: I7cebb7933a89820ff28c187249a9181f281081d2
inline function MakeARGB32 calls changed to call
via pointers to functions which make (a)rgb for
entire row
Change-Id: Ia4bd4be171a46c1e1821e408b073ff5791c587a9
we don't need to store the whole distribution in order to compute the alpha
Later, we can incorporate the max_value / last_non_zero bookkeeping
in SSE2 directly.
Change-Id: I748ccea4ac17965d7afcab91845ef01be3aa3e15
set WEBP_EXTERN to visibility=default
+ explicitly mark VP8GetCPUInfo as it's referenced within the examples
Change-Id: Ie3d2b15088e888f0b55203b205993eba75899d99
move the attribute to the front of the function to quiet clang warning:
GCC does not allow no_sanitize_thread attribute in this position on a
function definition
Change-Id: Ie4cc6e35a07bd00eab67d9cd6801bd2be9cfe676
SSE2 version is 2.1x faster
This is used to transfer the alpha plane to green channel before lossless compression.
Change-Id: I01d9df0051c183b1ff5d6eb69961d4f43e33141a