avoids integer overflow in extreme cases:
src/dsp/rescaler.c:45:32: runtime error: signed integer overflow: 129 *
16777215 cannot be represented in type 'int'
#0 0x556bde3538e3 in WebPRescalerImportRowExpand_C src/dsp/rescaler.c:45:32
#1 0x556bde357465 in RescalerImportRowExpand_SSE2 src/dsp/rescaler_sse2.c:56:5
...
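The report above can be checked by hand: 129 * 16777215 is 2164260735, which is larger than the maximum value of a 32-bit 'int' (2147483647). A minimal sketch of one common remedy, widening to 64 bits before the multiply, follows. The helper name MulWide is hypothetical and this is not necessarily the change made in rescaler.c.

```c
#include <stdint.h>

// Hypothetical helper: multiply two 32-bit operands in 64-bit arithmetic so
// the product cannot overflow a signed 32-bit 'int'.
static uint64_t MulWide(uint32_t a, uint32_t b) {
  return (uint64_t)a * b;  // the cast applies to 'a' before the multiply
}
```

The cast must be applied to an operand, not to the finished product, otherwise the multiply still happens in 32 bits.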
Bug: chromium:1196850
Change-Id: I4f923807f106713e113f3eec62a1d1c346066345
VerticalUnfilter_SSE2 has long been disabled due to a crash in an
Android emulator that hasn't reproduced elsewhere (crbug.com/654974).
this synchronizes the code for now to avoid needing to locally edit the
file on import.
Bug: 1141126
Change-Id: Ib61aeab93caaff1759606566b9e499eaac1576cf
this function produces different results from the C code because it
uses double/float, so its output differs from that of a run with
-noasm.
Bug: webp:499
Change-Id: Ia039b168c0a66da723fb434656657ba1948db8ae
- Add `-msimd128` to flags to actually enable WebAssembly SIMD
when performing SIMD detection. It's currently required in
addition to `-msse*` / `-mfpu=neon` flags which only perform
translation of corresponding intrinsics to Wasm SIMD ones.
See the discussion at emscripten-core/emscripten#12714 about
automating this and making it easier in the future.
- Remove compilation branch that prevented definitions of
`WEBP_USE_SSE` and `WEBP_USE_NEON` on Emscripten even when
SIMD support was detected at compile-time.
- Add an implementation of `VP8GetCPUInfo` for Emscripten which
uses static `WEBP_USE_*` flags to determine if a corresponding
SIMD instruction set is supported. This is because Wasm doesn't
have proper feature detection (yet) and requires making a separate
build for the SIMD version anyway.
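A rough sketch of the compile-time approach described in the last item, assuming the build defines `WEBP_USE_SSE2` and/or `WEBP_USE_NEON` when the matching SIMD path is enabled. The `Feature` enum and the function name `StaticCPUInfo` are hypothetical stand-ins, not the library's actual declarations.

```c
// Hypothetical feature enum standing in for the library's own type.
typedef enum { kFeatureSSE2, kFeatureNEON, kFeatureOther } Feature;

// Reports a feature as available only if it was enabled at compile time;
// no runtime probing is performed.
static int StaticCPUInfo(Feature feature) {
  switch (feature) {
#if defined(WEBP_USE_SSE2)
    case kFeatureSSE2: return 1;
#endif
#if defined(WEBP_USE_NEON)
    case kFeatureNEON: return 1;
#endif
    default: break;
  }
  return 0;  // anything not compiled in is reported as unsupported
}
```

Since the answer is fixed at build time, a SIMD and a non-SIMD binary have to be produced separately, as noted above.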
Change-Id: I77592081b91fd0e4cbc9242f5600ce905184f506
PredictorSub0_SSE2 doesn't use 'upper' (neither does
VP8LPredictorsSub_C[0]); just pass NULL when dealing with trailing
pixels to avoid undefined behavior when offsetting a NULL pointer.
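Pointer arithmetic on a null pointer is undefined behavior in C even if the result is never dereferenced. A minimal sketch of the idea, using a hypothetical helper (OffsetUpper is not a libwebp function) that only offsets 'upper' when it is non-NULL:

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: advance 'upper' by 'offset' pixels only when it is
// non-NULL, otherwise keep passing NULL through unchanged.
static const uint32_t* OffsetUpper(const uint32_t* upper, int offset) {
  return (upper != NULL) ? upper + offset : NULL;
}
```

For a predictor that never reads 'upper', passing NULL directly at the call site has the same effect.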
BUG=chromium:1026858,oss-fuzz:19430
Change-Id: I08be8899ed2e34f26aaee34defe68dbd0fe216d3
some toolchains may implement vcreate_u64 as an assignment to a vector
causing a type mismatch:
invalid conversion between vector type 'uint64x1_t' (vector of 1
'uint64_t' value) and integer type 'unsigned int' of different size
const uint64x1_t LKJI____ = vcreate_u64(L | (K << 8) | (J << 16) | (I << 24));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Change-Id: I5c7b0076ad66d4b3fcdcb7ee9f59bbaa6f19b783
The workaround for GCC ARM must not be applied when another toolchain
(like MSVC) is used for the build.
Change-Id: I11ec4558902063ccb085d3f435e24b3a60739dd5
'upper' could be NULL and would still be incremented.
This only happens for predictor zero, which does not use 'upper'.
Change-Id: Icd4ae6792cc55ea021b4f828c3dbdb5f03e120d8
In some cases the over-quantization exactly compensated the
under-quantization, leading to resonance and strange patterns.
-> we special-handle the very flat blocks, hopefully for the
greater good (and not just the bad-resonance case).
For 'fast mode' (-m 3 or less), we just pay special attention
to the border of the image, where the oscillation / instability
usually starts. For the inner part of the image, since we're not
doing rd-opt, it's harder to fix anything.
Overall, on 'regular' images, the change is within the noise,
often leading to overall faster encoding (because of the short-cut).
BUG=webp:432
Change-Id: Ifaa8286499add80fd77daecf8e347abbff7c3a15
missed in a788b49
with clang7+ quiets conversion warnings like:
implicit conversion from type 'int' of value -114 (32-bit, signed) to
type 'uint8_t' (aka 'unsigned char') changed the value to 142 (8-bit,
unsigned)
Change-Id: I52dcd9cd613107f5424177c277785b92430bffb7
with clang7+ quiets conversion warnings like:
implicit conversion from type 'int' of value -114 (32-bit, signed) to
type 'uint8_t' (aka 'unsigned char') changed the value to 142 (8-bit,
unsigned)
Change-Id: I7f08a836ddcf777454dfd5b877a81b62b2abac86
with clang7+ quiets conversion warnings like:
implicit conversion from type 'int' of value -12 (32-bit, signed) to
type 'uint8_t' (aka 'unsigned char') changed the value to 244 (8-bit,
unsigned)
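The values in these reports are ordinary modulo-256 results: -12 wraps to 244 and -114 wraps to 142. A minimal sketch, assuming the wrapped value is what the code wants, is to make the narrowing explicit with a cast, since the check only reports implicit conversions. The helper name is hypothetical and the actual change may be written differently.

```c
#include <stdint.h>

// Hypothetical helper: narrow an 'int' to 8 bits on purpose. Conversion to an
// unsigned type is well defined in C (the value is reduced modulo 256).
static uint8_t NarrowToU8(int v) {
  return (uint8_t)v;
}
```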
Change-Id: I053c92301e55dcb0cae89a7733636283da942176
no change in object code
from clang-7 integer sanitizer:
implicit conversion from type 'uint32_t' (aka 'unsigned int') of value
1955895199 (32-bit, unsigned) to type 'uint8_t' (aka 'unsigned char')
changed the value to 159 (8-bit, unsigned)
Change-Id: I0c3022339e34b9c9af03167ab827ade677973644
_mm_set1_epi16 takes a short argument
from clang-7 integer sanitizer:
implicit conversion from type 'int' of value 65280 (32-bit, signed) to
type 'short' changed the value to -256 (16-bit, signed)
Change-Id: Iad64f6209a8c130a7df67515451ded45b3f91702
_mm_set1_epi8() takes a char argument
_mm_insert_epi16 takes a short argument
from clang-7 integer sanitizer:
implicit conversion from type 'int' of value 189 (32-bit, signed) to
type 'char' changed the value to -67 (8-bit, signed)
implicit conversion from type 'int' of value 128 (32-bit, signed) to
type 'char' changed the value to -128 (8-bit, signed)
implicit conversion from type 'int' of value 33909 (32-bit, signed) to
type 'short' changed the value to -31627 (16-bit, signed)
Change-Id: Id6b191b2c06881e27d447eeb1ff5bb2c1857b6ba
_mm_set1_epi8() takes a char argument
_mm_insert_epi16 takes a short argument
from clang-7 integer sanitizer:
implicit conversion from type 'int' of value 255 (32-bit, signed) to
type 'char' changed the value to -1 (8-bit, signed)
implicit conversion from type 'int' of value 33153 (32-bit, signed) to
type 'short' changed the value to -32383 (16-bit, signed)
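When a constant such as 255 or 33153 is meant as a bit pattern, it can be converted to the signed argument type explicitly before being handed to the intrinsic. A minimal sketch with hypothetical helper names, written so that the result does not rely on implementation-defined narrowing; 'signed char' is used here so the result is the same whether or not plain 'char' is signed:

```c
// Hypothetical helper: reinterpret v in [0, 255] as a signed 8-bit value.
static signed char ToInt8Bits(int v) {
  return (signed char)((v & 0x80) ? v - 0x100 : v);
}

// Hypothetical helper: reinterpret v in [0, 65535] as a signed 16-bit value.
static short ToInt16Bits(int v) {
  return (short)((v & 0x8000) ? v - 0x10000 : v);
}
```

The results match the values quoted by the sanitizer, so the object code is not expected to change.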
Change-Id: Ic88c8ef3d00146d34f53a560582db673f818370d
We saturate the result to [0..255]
It's the easiest and safest, given the wide variety of scaling
range we cover: we're not using floats, so precision is always
an issue at one end or the other of the scaling spectrum.
we also use:
round(a - floor(b))
instead of:
floor(a - round(b))
to handle difficult cases (ratio ~= .99, e.g.)
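The two orderings can give different results. A minimal fixed-point sketch, assuming 8 fractional bits and non-negative inputs; FRAC_BITS and the function names are hypothetical and do not reflect the rescaler's actual precision or code:

```c
#include <stdint.h>

#define FRAC_BITS 8                  // hypothetical precision
#define ONE (1 << FRAC_BITS)
#define HALF (ONE >> 1)

// Clamp an integer to the 8-bit sample range [0..255].
static uint8_t Saturate(int v) {
  return (v < 0) ? 0 : (v > 255) ? 255 : (uint8_t)v;
}

// round(a - floor(b)): drop the fraction of 'b', then round the difference.
static uint8_t RoundOfDiff(int a, int b) {
  const int diff = a - (b & ~(ONE - 1));
  return (diff < 0) ? 0 : Saturate((diff + HALF) >> FRAC_BITS);
}

// floor(a - round(b)): round 'b' first, then truncate the difference.
static uint8_t FloorOfDiff(int a, int b) {
  const int diff = a - ((b + HALF) & ~(ONE - 1));
  return (diff < 0) ? 0 : Saturate(diff >> FRAC_BITS);
}
```

With a = 2.75 (704) and b = 1.5 (384), the first form yields 2 and the second yields 0; in both, the saturation keeps out-of-range results inside [0..255].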
MIPS code is still disabled (and wrong)
Change-Id: I18d3f5ddc4c524879c257b928329b1c648fa7fb5
Move IsFlat to its own header. This allows it to continue to be
inlined. Using the RTCD and creating a distinct function slows down arm
builds.
      flower  mug
C     3.59    2.12
NEON  3.47    2.01
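A rough sketch of what a header-only flatness test could look like, so that each caller can inline it instead of going through a function pointer. The body is an assumption for illustration (it counts non-zero AC coefficients per 16-coefficient block against a threshold) and may differ from the actual IsFlat:

```c
#include <stdint.h>

// Sketch of a header-defined helper. Returns 1 if the blocks hold at most
// 'thresh' non-zero AC coefficients in total, 0 otherwise.
static inline int IsFlatSketch(const int16_t* levels, int num_blocks,
                               int thresh) {
  int score = 0;
  while (num_blocks-- > 0) {
    int i;
    for (i = 1; i < 16; ++i) {   // index 0 (DC) is ignored
      score += (levels[i] != 0);
      if (score > thresh) return 0;
    }
    levels += 16;
  }
  return 1;
}
```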
BUG=b/118740850
Change-Id: Id77e8f76d9e9790c498806e7070bbe37c10bc2e9
Direct copy of sse2. Slight improvement because neon has
abs().
flower.ppm had minimal improvement. Somewhat expected because
GetResidualCost_C is only ~3.6%
mug.ppm had a better improvement because GetResidualCost_C is
almost 9%.
C 2.150
NEON 2.130
BUG=b/118740850
Change-Id: Ibc0dd97a81596635f5599cf568205974b4fd2597
Much faster with aarch64. Still somewhat faster without vmaxv.
C: 3.700s
ArmV7: 3.675
aarch64: 3.600
BUG=b/118740850
Change-Id: I3be852da89633eca4bddce443c87f5e4a2f55868
When histograms are empty, it is easy to add them.
They should also not be considered when merging histograms
(it is a waste of CPU).
This does not change the compression performance,
just the speed.
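A minimal sketch of the shortcut, assuming each histogram carries an emptiness flag. The struct layout, NUM_BINS, and the function name are hypothetical and much simpler than the encoder's real histogram type:

```c
#define NUM_BINS 256  // hypothetical size

typedef struct {
  unsigned int counts[NUM_BINS];
  int is_empty;       // hypothetical flag: 1 if every count is zero
} Histogram;

// Adds 'a' and 'b' into 'out'. If either input is empty the other one is
// copied as-is, which skips the per-bin loop.
static void HistogramAddSketch(const Histogram* a, const Histogram* b,
                               Histogram* out) {
  int i;
  if (a->is_empty) { *out = *b; return; }
  if (b->is_empty) { *out = *a; return; }
  for (i = 0; i < NUM_BINS; ++i) {
    out->counts[i] = a->counts[i] + b->counts[i];
  }
  out->is_empty = 0;
}
```

The same flag can be used to drop empty histograms from the list of merge candidates before any cost is computed.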
Change-Id: I42c721ca0f9c5ea067e73b792aa3db6d5e71d01f