We saturate the result to [0..255]
It's the easiest and safest, given the wide variety of scaling
range we cover: we're not using floats, so precision is always
an issue at one end or the other of the scaling spectrum.
we also use:
round(a - floor(b))
instead of:
floor(a - round(b))
to handle difficult cases (ratio ~= .99, e.g.)
MIPS code is still disabled (and wrong)
Change-Id: I18d3f5ddc4c524879c257b928329b1c648fa7fb5
We should be using 'floor' when doing the final divide.
-> new MACRO is MULT_FIX_FLOOR()
XXX*** Mips code is DISABLED for now ***XXX
I'll update and re-enable it in a later
patch, since this code needs some refactoring first.
BUG=oss-fuzz:9179
Change-Id: Ic0693cdca4e71f5beab1029475e35c4d06b12d13
the 'accum' variable can be larger than 15b for large
rescale values.
Assert triggered:
src/dsp/rescaler_sse2.c:249: RescalerExportRowExpand_SSE2: Assertion `v >= 0 && v <= 255' failed.
src/dsp/rescaler_sse2.c:350: RescalerExportRowShrink_SSE2: Assertion `v >= 0 && v <= 255' failed.
-> fall back to C implementation in this case for now
Change-Id: I7ea1cb72301cafc1459be403f6a6f4e3cbc89bb1
this avoids duplicates between these trees and dsp/, e.g., enc/tree.c,
dec/tree.c, making pulling the whole library source tree into one target
possible
BUG=webp:279
Change-Id: I060a614833c7c24ddd37bf641702ae6a5eef1775
include utils.h directly where needed to allow utils.h to rely on
defines from dsp.h in a follow-up.
Change-Id: I32e26aaeb0b04ba60b3332f685f9a2be5a0a8d3d
some limitations: only for RGBA output,
and if reduction factor is not too small (dst_width > src_width / 128)
20-25% faster, ~4-6% global improvement total decoding.
Change-Id: I95366ddaa4a38e0a96bed754dfe790126f7bb84a
It's better to stay with a 32b fixed-point precision overall, otherwise
the C-version on ARM gets *slower*.
Actually, gcc ARM compiler optimizes some instructions pretty
well when WEBP_RESCALER_FIX is exactly 32, even in C.
Change-Id: I0eea97f7db5947470f5af355dee098eca81e178d
The rounding and arithmetic is not the same as previously, to prevent overflow cases for large upscale factors.
We still rely on 32b x 32b -> 64b multiplies. Raised the fixed-point precision to 32b
so that we have some nice shifts from epi64 to epi32.
Changed rescaler_t type to 'uint32_t' in order to squeeze in all the precision required.
The MIPS code has been disabled because it's now out-of-sync. Will be fixed in
a subsequent CL when the dust settles.
~30-35% faster
Change-Id: I32e4ddc00933f1b1aa3463403086199fd5dad07b