Commit Graph

863 Commits

Author SHA1 Message Date
James Zern
4074acf873 dsp.h: bump msvc arm64 version requirement to 16.6
there was a bug in 16.4 causing compile failures with vtbl4_u8():

src\dsp\lossless_neon.c(105): error C2143: syntax error: missing ')'
before '{'
src\dsp\lossless_neon.c(105): error C2168: 'neon_tbl2_q8': too few
actual parameters for intrinsic function

https://developercommunity.visualstudio.com/t/cannot-compile-arm64-neon-vtbl4-u8-function-in-c-f/859331

Change-Id: I87c21850b3c597aa5cb41a8105b81e2135a9f890
2022-01-19 21:49:59 -08:00
jzern@google.com
02ca04c348 add missing USE_{MSA,NEON} checks in headers
msa_macro.h
neon.h

allows the headers to be built / analyzed under different target
configurations

Change-Id: Ibbcfada210b54988aa5279674d53af8e21fd4a97
2021-12-14 22:21:57 +00:00
James Zern
b6f756e82b update http links
- prefer https

- metadataworkinggroup.org/com seem to be offline; the web archive link
  was obtained from exiftool: https://exiftool.org/TagNames/MWG.html

- fix kramdown link, rubyforge has been gone a long time

- fix png/zlib links

Bug: webp:544
Bug: b/202302177
Change-Id: Id69de4553e7baf00393f12a2c1acb262443a1a93
2021-11-23 10:13:40 -08:00
Pascal Massimino
8ea81561d2 change VP8LPredictorFunc signature to avoid reading 'left'
... when it's not available. Even if the value was discarded and
never used, some msan config were complaining about reading it
and passing it around.

Change-Id: Iab8d24676c5bb58e607a829121e36c2862da397c
2021-11-05 16:22:31 +01:00
James Zern
e23cd5481c dsp.h: enable NEON w/VS2019+ ARM64 targets
Visual Studio added ARM64 support, but requires arm64_neon.h to be
included rather than arm_neon.h. Visual Studio 2019 addressed this so
we'll start with that version and leave a local adapter include for a
follow up.

Bug: webp:539
Change-Id: If975c029dafffba99210b3bb2d670035a83e8105
2021-09-13 19:56:19 -07:00
James Zern
1fe3162541 dsp/*: use WEBP_HAVE_* to determine Init availability
after:
  ece18e55 dsp.h: respect --disable-sse2/sse4.1/neon
WEBP_USE_* will be set when a module is targeting a particular
instruction set, e.g., sse4.1, and not overridden if WEBP_HAVE_SSE41 is
set, as previously this would ignore the case where the instruction set
was disabled via config.h and the HAVE macro was unset.

dsp.h not ensures WEBP_HAVE_* are set when WEBP_USE_* to cover cases
where the files are built without config.h.

Change-Id: Ia1c2dcf4100cc1081d968acb6e085e2a1768ece6
2021-07-24 10:19:30 -07:00
James Zern
ece18e5520 dsp.h: respect --disable-sse2/sse4.1/neon
previously this would be overridden if the instruction set was enabled
via -msse4.1, __aarch64__, etc.

Change-Id: I51e87a7da7589c6093d260b848ab41d89ec7b990
2021-07-17 12:14:38 -07:00
James Zern
8f5946634e alpha_processing: fix visual studio warnings
similar to '* const', __restrict needs to be included in the
declaration to avoid warnings like:
src\dsp\alpha_processing.c(429): warning C4028: formal parameter 1
different from declaration

this change also moves WEBP_RESTRICT to dsp.h to avoid a circular
dependency between it and utils.h which already includes dsp.h

Change-Id: Ib070d358a372e76fae4be5445ab288940b9baac0
2021-07-13 23:41:45 +00:00
James Zern
a1e5dae0f0 alpha_processing*: use WEBP_RESTRICT qualifier
this helps both auto-vectorization in the C code and the optimized code
generation

Change-Id: Ide570d6be45125ffef7248bdc40e9eb08f00e832
2021-07-07 15:39:21 -07:00
James Zern
a2fce86744 WebPRescalerImportRowExpand_C: promote some vals before multiply
avoids integer overflow in extreme cases:
src/dsp/rescaler.c:45:32: runtime error: signed integer overflow: 129 *
16777215 cannot be represented in type 'int'
    #0 0x556bde3538e3 in WebPRescalerImportRowExpand_C src/dsp/rescaler.c:45:32
    #1 0x556bde357465 in RescalerImportRowExpand_SSE2 src/dsp/rescaler_sse2.c:56:5
    ...

Bug: chromium:1196850
Change-Id: I4f923807f106713e113f3eec62a1d1c346066345
2021-06-07 18:59:33 -07:00
James Zern
5d4ee4c3c0 cosmetics: remove use of the term 'dummy'
this is replaced with more inclusive / informative text

Bug: webp:507
Change-Id: Ib77f0c79dd548601bf2bc3169985af4b5edf0a62
2021-03-15 11:39:06 -07:00
Ilya Kurdyukov
01b38ee19a faster CollectColorXXXTransforms_SSE41
3/4% faster overall.

Change-Id: If555c5530238ca0342b8d97b0d708b1bdc888d3f
2021-02-19 20:45:07 +01:00
Ilya Kurdyukov
8886f620c0 Use BitCtz for FastSLog2Slow_C
Change-Id: Icc6068b8934e481e6f17efd30616392e68d504ad
2021-02-19 15:11:42 +01:00
Ilya Kurdyukov
fae416179e faster CombinedShannonEntropy_SSE2
optimized for sparse histograms

Change-Id: I54412f5f8fc53d2598964a5be91f6c54ece3f21b
2021-02-19 13:14:46 +01:00
James Zern
33ddb894b1 lossless_sse{2,41}: remove some unneeded includes
Change-Id: Icd2cffd32b39c6bf017eee353ac04a4b6d337a11
2021-02-18 10:54:09 -08:00
Pascal Massimino
b78494a933 Merge "Fix undefined signed shift." 2021-02-18 16:51:17 +00:00
Vincent Rabaud
e79974cd6a Fix undefined signed shift.
Using the fix from SSE2.

Change-Id: Ie53d0163d97322da5a722c3e49f9d5f057ee1d91
2021-02-18 16:56:22 +01:00
Ilya Kurdyukov
a885339448 SSE4.1 versions of BGRA to RGB/BGR color-space conversions
Change-Id: Iacafd2f6402080b02fcbf75831e69c488f447454
2021-02-18 15:32:30 +01:00
Ilya Kurdyukov
a09a647241 SSE4.1 version of TransformColorInverse
Change-Id: I6ba5cb35917eef7a52152c4924eca205b4af7220
2021-02-18 12:42:39 +01:00
James Zern
47f64f6edd filters_sse2: import Chromium change
VerticalUnfilter_SSE2 has long been disabled due to a crash in an
Android emulator that hasn't reproduced elsewhere (crbug.com/654974).
this synchronizes the code for now to avoid needing to locally edit the
file on import.

Bug: 1141126
Change-Id: Ib61aeab93caaff1759606566b9e499eaac1576cf
2021-01-30 11:44:07 -08:00
James Zern
8599571935 disable CombinedShannonEntropy_SSE2 on x86
this function produces different results from the C code due to
use of double/float resulting in output differences when compared to
-noasm.

Bug: webp:499
Change-Id: Ia039b168c0a66da723fb434656657ba1948db8ae
2021-01-18 16:41:44 -08:00
James Zern
ae54553461 dsp.h: allow config.h to override MSVC SIMD autodetection
this fixes builds with cmake targeting visual studio that set
-DWEBP_ENABLE_SIMD=0

BUG=webp:478

Change-Id: I21b61b112c79ff9cbab9e4502a25d3f1fa096c8b
2020-12-03 10:22:04 -08:00
Vincent Rabaud
fc14fc038b Have C encoding predictors use decoding predictors.
libwebp.a in Release mode with no symbols size in bytes:
986430 -> 975114  (-1.1%)

Change-Id: Ia96192a6be2911779e359b72132bdba60b60a13d
2020-12-02 11:54:59 +01:00
Ingvar Stepanyan
52273943c6 Couple of fixes to allow SIMD on Emscripten
- Add `-msimd128` to flags to actually enable WebAssembly SIMD
   when performing SIMD detection. It's currently required in
   addition to `-msse*` / `-mfpu=neon` flags which only perform
   translation of corresponding intrinsics to Wasm SIMD ones.
   See a discussion at emscripten-core/emscripten#12714 for
   automating this and making easier in the future.
 - Remove compilation branch that prevented definitions of
   `WEBP_USE_SSE` and `WEBP_USE_NEON` on Emscripten even when
   SIMD support was detected at compile-time.
 - Add an implementation of `VP8GetCPUInfo` for Emscripten which
   uses static `WEBP_USE_*` flags to determine if a corresponding
   SIMD instruction is supported. This is because Wasm doesn't
   have proper feature detection (yet) and requires making separate
   build for SIMD version anyway.

Change-Id: I77592081b91fd0e4cbc9242f5600ce905184f506
2020-11-18 21:51:41 +00:00
Skal
55a080e50a Add WebPReplaceTransparentPixels() in dsp
with SSE2 implementation.

(Extracted from side experiment)

Change-Id: I62d457fb6643645291cffd6d2d205d4a5ffa4517
2020-09-09 08:15:22 +02:00
Yannis Guyon
47309ef52d webp: WEBP_OFFSET_PTR()
Removes undefined behavior of offsetting NULL.

Change-Id: I7c83d0c913c631c091a5fb128f6d6b46b1d116db
2020-03-20 11:39:06 +01:00
James Zern
687ab00e6e DC{4,8,16}_NEON: replace vmovl w/vaddl
4/8/16 fewer instructions

Change-Id: I38fe08722e7b839e3f3e0bf4df7e0fa8e7a0138f
2020-03-05 09:41:14 -08:00
James Zern
1b92fe75a1 DC16_NEON,aarch64: use vaddlv
saves 3 instructions, neutral to mildly faster on a pixel 3a

Change-Id: I6ae57e8e38d4149167ea14e27cd2b32113b4f8e7
2020-03-04 23:12:20 -08:00
James Zern
53f3d8cf7e dec_neon,DC8_NEON: use vaddlv instead of movl+vaddv
one fewer instruction

Change-Id: I2f599fd6f9eebbb0cab81ae9855244fc401d4323
2020-03-04 15:46:38 -08:00
James Zern
c6b75a1966 lossless_(enc_|)sse2: avoid offsetting a NULL pointer
PredictorSub0_SSE2 doesn't use 'upper' (neither does
VP8LPredictorsSub_C[0]); just pass NULL when dealing with trailing
pixels to avoid undefined behavior when offsetting a NULL pointer

BUG=chromium:1026858,oss-fuzz:19430

Change-Id: I08be8899ed2e34f26aaee34defe68dbd0fe216d3
2019-12-13 18:33:10 +00:00
James Zern
e2575e05cb DC8_NEON,aarch64: use vaddv
results in one fewer instruction for both DC8uv_NEON and
DC8uvNoLeft_NEON

Change-Id: Ia4e6f4dbc070079cdc2496a698bd4b34198ea164
2019-12-06 09:38:48 -08:00
Cheng Yi
b0e09e346f dec_neon: Fix build failure under some toolchains
some toolchains may implement vcreate_u64 as an assignment to a vector
causing a type mismatch:
 invalid conversion between vector type 'uint64x1_t' (vector of 1
'uint64_t' value) and integer type 'unsigned int' of different size
  const uint64x1_t LKJI____ = vcreate_u64(L | (K << 8) | (J << 16) | (I << 24));
                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Change-Id: I5c7b0076ad66d4b3fcdcb7ee9f59bbaa6f19b783
2019-12-06 00:06:44 -08:00
Oliver Wolff
cf0e903c89 dsp/lossless: Fix non gcc ARM builds
The workaround for GCC ARM must not be applied when another toolchain
(like MSVC) is used for the build.

Change-Id: I11ec4558902063ccb085d3f435e24b3a60739dd5
2019-11-27 15:05:08 +01:00
Vincent Rabaud
bb7bc40b6d Remove ubsan errors.
'upper' could be NULL and it would be increased.
But that is for predictor zero that does not use 'upper'.

Change-Id: Icd4ae6792cc55ea021b4f828c3dbdb5f03e120d8
2019-11-06 14:08:14 +01:00
James Zern
fab8f9cfcf cosmetics: normalize '*' association
we associate '*' with types rather than variables

Change-Id: Id93ed65272a8a88e604278693e3850649639e9b6
2019-07-26 01:04:09 -07:00
Pascal Massimino
9d6988f44d Fix the oscillating prediction problem at low quality
For some exact resonance the over-quantization was exactly
compensating the under-quantization, leading to resonance
and strange patterns.

-> we special-handle the very flat blocks, hopefully for the
greater good (and not just the bad-resonance case).

For 'fast mode' (-m 3 or less), we just pay special attention
to the border of the image, where the oscillation / instability
usually starts. For the inner part of the image, since we're not
doing rd-opt, it's harder to fix anything.

Overall, on 'regular' images, the change is written the noise,
often leading to overall faster encoding (because of the short-cut).

BUG=webp:432

Change-Id: Ifaa8286499add80fd77daecf8e347abbff7c3a15
2019-07-03 08:40:41 -07:00
James Zern
92dbf23775 filters_sse2,cosmetics: shorten some long lines
Change-Id: Ifd8ddec50821aba175d41237df18e41b9ac6c7d4
2019-07-01 12:17:43 -07:00
James Zern
a277d197a2 filters_sse2.c: quiet integer sanitizer warnings
missed in a788b49

with clang7+ quiets conversion warnings like:
implicit conversion from type 'int' of value -114 (32-bit, signed) to
type 'uint8_t' (aka 'unsigned char') changed the value to 142 (8-bit,
unsigned)

Change-Id: I52dcd9cd613107f5424177c277785b92430bffb7
2019-07-01 11:16:50 -07:00
James Zern
a788b49897 filters_sse2.c: quiet integer sanitizer warnings
with clang7+ quiets conversion warnings like:
implicit conversion from type 'int' of value -114 (32-bit, signed) to
type 'uint8_t' (aka 'unsigned char') changed the value to 142 (8-bit,
unsigned)

Change-Id: I7f08a836ddcf777454dfd5b877a81b62b2abac86
2019-06-28 23:22:49 -07:00
James Zern
e6a92c5e15 filters.c: quiet integer sanitizer warnings
with clang7+ quiets conversion warnings like:
implicit conversion from type 'int' of value -12 (32-bit, signed) to
type 'uint8_t' (aka 'unsigned char') changed the value to 244 (8-bit,
unsigned)

Change-Id: I053c92301e55dcb0cae89a7733636283da942176
2019-06-28 23:16:28 -07:00
James Zern
ec1cc40a59 lossless.c: remove U32 -> S8 conversion warnings
Change-Id: Ica2664ea087254959391275654412141ed9472df
2019-06-28 01:34:55 -07:00
Pascal Massimino
1106478f42 remove conversion U32 -> S8 warnings
using an inline U32ToS8() function

Change-Id: I45f535c6c9b5de33d69acc17b466e183fcc19a63
2019-06-24 16:42:42 -07:00
Skal
812a6b49fc lossless_enc: fix some conversion warning
object code is unchanged.

Change-Id: I40fc16056c0ab44c5c57ef6b02af14be767abe87
2019-06-24 16:16:18 +02:00
James Zern
4627c1c91b lossless_enc,TransformColorBlue: quiet uint32_t conv warning
no change in object code

from clang-7 integer sanitizer:
implicit conversion from type 'uint32_t' (aka 'unsigned int') of value
1955895199 (32-bit, unsigned) to type 'uint8_t' (aka 'unsigned char')
changed the value to 159 (8-bit, unsigned)

Change-Id: I0c3022339e34b9c9af03167ab827ade677973644
2019-06-20 23:06:13 -07:00
James Zern
c84673a62f lossless_enc_sse{2,41}: quiet signed conv warnings
_mm_set1_epi16 takes a short argument

from clang-7 integer sanitizer:
implicit conversion from type 'int' of value 65280 (32-bit, signed) to
type 'short' changed the value to -256 (16-bit, signed)

Change-Id: Iad64f6209a8c130a7df67515451ded45b3f91702
2019-06-15 00:22:03 -07:00
James Zern
776a775709 dec_sse2: quiet signed conv warnings
_mm_set1_epi8() takes a char argument
_mm_insert_epi16 takes a short argument

from clang-7 integer sanitizer:
implicit conversion from type 'int' of value 189 (32-bit, signed) to
type 'char' changed the value to -67 (8-bit, signed)
implicit conversion from type 'int' of value 128 (32-bit, signed) to
type 'char' changed the value to -128 (8-bit, signed)
implicit conversion from type 'int' of value 33909 (32-bit, signed) to
type 'short' changed the value to -31627 (16-bit, signed)

Change-Id: Id6b191b2c06881e27d447eeb1ff5bb2c1857b6ba
2019-06-14 01:00:20 -07:00
James Zern
e78dea7587 (alpha_processing,enc}_sse2: quiet signed conv warnings
_mm_set1_epi8() takes a char argument
_mm_insert_epi16 takes a short argument

from clang-7 integer sanitizer:
implicit conversion from type 'int' of value 255 (32-bit, signed) to
type 'char' changed the value to -1 (8-bit, signed)
implicit conversion from type 'int' of value 33153 (32-bit, signed) to
type 'short' changed the value to -32383 (16-bit, signed)

Change-Id: Ic88c8ef3d00146d34f53a560582db673f818370d
2019-06-10 14:23:58 -07:00
Pascal Massimino
ab2dc8939f Rescaler: fix rounding error
We saturate the result to [0..255]
It's the easiest and safest, given the wide variety of scaling
range we cover: we're not using floats, so precision is always
an issue at one end or the other of the scaling spectrum.

we also use:
  round(a - floor(b))
instead of:
  floor(a - round(b))
to handle difficult cases (ratio ~= .99, e.g.)

MIPS code is still disabled (and wrong)

Change-Id: I18d3f5ddc4c524879c257b928329b1c648fa7fb5
2019-03-30 06:43:55 +00:00
James Zern
8c3f04febb AndroidCPUInfo: reorder terms in conditional
'var != constant' is the preferred style for the library

Change-Id: I226e6d5d80dddd0469808136605f49205d238341
2019-03-15 18:12:04 -07:00
Johann
5173d4ee6f neon IsFlat
Move IsFlat to its own header. This allows it to continue to be
inlined. Using the RTCD and creating a distinct function slows down arm
builds.

   flower   mug
C    3.59  2.12
NEON 3.47  2.01

BUG=b/118740850

Change-Id: Id77e8f76d9e9790c498806e7070bbe37c10bc2e9
2018-12-03 22:59:12 +00:00