903 Commits

Author SHA1 Message Date
James Zern
5d4ee4c3c0 cosmetics: remove use of the term 'dummy'
this is replaced with more inclusive / informative text

Bug: webp:507
Change-Id: Ib77f0c79dd548601bf2bc3169985af4b5edf0a62
2021-03-15 11:39:06 -07:00
Ilya Kurdyukov
01b38ee19a faster CollectColorXXXTransforms_SSE41
3/4% faster overall.

Change-Id: If555c5530238ca0342b8d97b0d708b1bdc888d3f
2021-02-19 20:45:07 +01:00
Ilya Kurdyukov
8886f620c0 Use BitCtz for FastSLog2Slow_C
Change-Id: Icc6068b8934e481e6f17efd30616392e68d504ad
2021-02-19 15:11:42 +01:00
Ilya Kurdyukov
fae416179e faster CombinedShannonEntropy_SSE2
optimized for sparse histograms

Change-Id: I54412f5f8fc53d2598964a5be91f6c54ece3f21b
2021-02-19 13:14:46 +01:00
James Zern
33ddb894b1 lossless_sse{2,41}: remove some unneeded includes
Change-Id: Icd2cffd32b39c6bf017eee353ac04a4b6d337a11
2021-02-18 10:54:09 -08:00
Pascal Massimino
b78494a933 Merge "Fix undefined signed shift." 2021-02-18 16:51:17 +00:00
Vincent Rabaud
e79974cd6a Fix undefined signed shift.
Using the fix from SSE2.

Change-Id: Ie53d0163d97322da5a722c3e49f9d5f057ee1d91
2021-02-18 16:56:22 +01:00
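The entry above does not show the change itself. As a rough illustration only, one common way to avoid an undefined left shift of a negative value in C is to shift the unsigned representation and map the result back. The helper name `ShiftLeftSigned` is hypothetical and is not taken from libwebp.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper (not from libwebp): left-shift a possibly negative
// value without invoking undefined behavior.
static int32_t ShiftLeftSigned(int32_t v, int bits) {
  uint32_t shifted;
  assert(bits >= 0 && bits < 32);
  // Shifting a negative signed value left is undefined in C; the same shift
  // on the unsigned representation is well defined (it wraps modulo 2^32).
  shifted = (uint32_t)v << bits;
  // Map back to signed without relying on implementation-defined conversion.
  if (shifted <= (uint32_t)INT32_MAX) return (int32_t)shifted;
  return (int32_t)(shifted - (uint32_t)INT32_MAX - 1u) + INT32_MIN;
}
```

With this sketch, `ShiftLeftSigned(-3, 8)` yields -768, the value two's-complement hardware would produce, without the sanitizer complaint.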
Ilya Kurdyukov
a885339448 SSE4.1 versions of BGRA to RGB/BGR color-space conversions
Change-Id: Iacafd2f6402080b02fcbf75831e69c488f447454
2021-02-18 15:32:30 +01:00
Ilya Kurdyukov
a09a647241 SSE4.1 version of TransformColorInverse
Change-Id: I6ba5cb35917eef7a52152c4924eca205b4af7220
2021-02-18 12:42:39 +01:00
James Zern
47f64f6edd filters_sse2: import Chromium change
VerticalUnfilter_SSE2 has long been disabled due to a crash in an
Android emulator that hasn't reproduced elsewhere (crbug.com/654974).
this synchronizes the code for now to avoid needing to locally edit the
file on import.

Bug: 1141126
Change-Id: Ib61aeab93caaff1759606566b9e499eaac1576cf
2021-01-30 11:44:07 -08:00
James Zern
8599571935 disable CombinedShannonEntropy_SSE2 on x86
this function produces different results from the C code due to
use of double/float resulting in output differences when compared to
-noasm.

Bug: webp:499
Change-Id: Ia039b168c0a66da723fb434656657ba1948db8ae
2021-01-18 16:41:44 -08:00
James Zern
ae54553461 dsp.h: allow config.h to override MSVC SIMD autodetection
this fixes builds with cmake targeting visual studio that set
-DWEBP_ENABLE_SIMD=0

BUG=webp:478

Change-Id: I21b61b112c79ff9cbab9e4502a25d3f1fa096c8b
2020-12-03 10:22:04 -08:00
Vincent Rabaud
fc14fc038b Have C encoding predictors use decoding predictors.
libwebp.a in Release mode with no symbols size in bytes:
986430 -> 975114  (-1.1%)

Change-Id: Ia96192a6be2911779e359b72132bdba60b60a13d
2020-12-02 11:54:59 +01:00
Ingvar Stepanyan
52273943c6 Couple of fixes to allow SIMD on Emscripten
- Add `-msimd128` to flags to actually enable WebAssembly SIMD
   when performing SIMD detection. It's currently required in
   addition to `-msse*` / `-mfpu=neon` flags which only perform
   translation of corresponding intrinsics to Wasm SIMD ones.
   See a discussion at emscripten-core/emscripten#12714 for
   automating this and making it easier in the future.
 - Remove compilation branch that prevented definitions of
   `WEBP_USE_SSE` and `WEBP_USE_NEON` on Emscripten even when
   SIMD support was detected at compile-time.
 - Add an implementation of `VP8GetCPUInfo` for Emscripten which
   uses static `WEBP_USE_*` flags to determine if a corresponding
   SIMD instruction is supported. This is because Wasm doesn't
   have proper feature detection (yet) and requires making a separate
   build for the SIMD version anyway.

Change-Id: I77592081b91fd0e4cbc9242f5600ce905184f506
2020-11-18 21:51:41 +00:00
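The third bullet of the entry above describes answering CPU feature queries from compile-time flags. A minimal sketch of that idea follows, assuming hypothetical `SKETCH_USE_*` macros and a hypothetical `SketchStaticCPUInfo` function in place of the real `WEBP_USE_*` flags and `VP8GetCPUInfo`, whose actual definitions are not shown here.

```c
#include <assert.h>

// Hypothetical stand-ins for build-time WEBP_USE_* style flags; in a real
// build they would come from compiler flags or a config header.
#define SKETCH_USE_SSE2 1
// SKETCH_USE_SSE41 and SKETCH_USE_NEON are intentionally left undefined.

typedef enum { kSketchSSE2, kSketchSSE4_1, kSketchNEON } SketchCPUFeature;

// Reports a feature as available only if support for it was compiled in.
// No runtime probing happens, which matches a target without feature
// detection.
static int SketchStaticCPUInfo(SketchCPUFeature feature) {
  assert(feature >= kSketchSSE2 && feature <= kSketchNEON);
#if defined(SKETCH_USE_SSE2)
  if (feature == kSketchSSE2) return 1;
#endif
#if defined(SKETCH_USE_SSE41)
  if (feature == kSketchSSE4_1) return 1;
#endif
#if defined(SKETCH_USE_NEON)
  if (feature == kSketchNEON) return 1;
#endif
  return 0;
}
```

Because the answer is fixed at build time, a SIMD build and a non-SIMD build have to be shipped separately, as the entry notes.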
Skal
55a080e50a Add WebPReplaceTransparentPixels() in dsp
with SSE2 implementation.

(Extracted from side experiment)

Change-Id: I62d457fb6643645291cffd6d2d205d4a5ffa4517
2020-09-09 08:15:22 +02:00
Yannis Guyon
47309ef52d webp: WEBP_OFFSET_PTR()
Removes undefined behavior of offsetting NULL.

Change-Id: I7c83d0c913c631c091a5fb128f6d6b46b1d116db
2020-03-20 11:39:06 +01:00
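Adding an offset to a NULL pointer is undefined behavior in C, even if the result is never dereferenced. The sketch below shows one plausible shape for a guard macro of the kind the entry above introduces. `SKETCH_OFFSET_PTR` and `SketchRowPtr` are illustrative names, and the real `WEBP_OFFSET_PTR()` definition may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Assumed shape of such a helper: only offset the pointer when it is
// non-NULL, otherwise propagate NULL unchanged.
#define SKETCH_OFFSET_PTR(ptr, off) (((ptr) == NULL) ? NULL : ((ptr) + (off)))

// Returns a pointer to row 'y' of an optional plane, or NULL if the plane
// is absent.
static const uint8_t* SketchRowPtr(const uint8_t* plane, int stride, int y) {
  assert(stride >= 0 && y >= 0);
  return SKETCH_OFFSET_PTR(plane, (size_t)y * (size_t)stride);
}
```

Callers can then pass optional planes (for example a missing alpha plane) through row arithmetic without special-casing NULL at every call site.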
James Zern
687ab00e6e DC{4,8,16}_NEON: replace vmovl w/vaddl
4/8/16 fewer instructions

Change-Id: I38fe08722e7b839e3f3e0bf4df7e0fa8e7a0138f
2020-03-05 09:41:14 -08:00
James Zern
1b92fe75a1 DC16_NEON,aarch64: use vaddlv
saves 3 instructions, neutral to mildly faster on a pixel 3a

Change-Id: I6ae57e8e38d4149167ea14e27cd2b32113b4f8e7
2020-03-04 23:12:20 -08:00
James Zern
53f3d8cf7e dec_neon,DC8_NEON: use vaddlv instead of movl+vaddv
one fewer instruction

Change-Id: I2f599fd6f9eebbb0cab81ae9855244fc401d4323
2020-03-04 15:46:38 -08:00
James Zern
c6b75a1966 lossless_(enc_|)sse2: avoid offsetting a NULL pointer
PredictorSub0_SSE2 doesn't use 'upper' (neither does
VP8LPredictorsSub_C[0]); just pass NULL when dealing with trailing
pixels to avoid undefined behavior when offsetting a NULL pointer

BUG=chromium:1026858,oss-fuzz:19430

Change-Id: I08be8899ed2e34f26aaee34defe68dbd0fe216d3
2019-12-13 18:33:10 +00:00
James Zern
e2575e05cb DC8_NEON,aarch64: use vaddv
results in one fewer instruction for both DC8uv_NEON and
DC8uvNoLeft_NEON

Change-Id: Ia4e6f4dbc070079cdc2496a698bd4b34198ea164
2019-12-06 09:38:48 -08:00
Cheng Yi
b0e09e346f dec_neon: Fix build failure under some toolchains
some toolchains may implement vcreate_u64 as an assignment to a vector
causing a type mismatch:
 invalid conversion between vector type 'uint64x1_t' (vector of 1
'uint64_t' value) and integer type 'unsigned int' of different size
  const uint64x1_t LKJI____ = vcreate_u64(L | (K << 8) | (J << 16) | (I << 24));
                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Change-Id: I5c7b0076ad66d4b3fcdcb7ee9f59bbaa6f19b783
2019-12-06 00:06:44 -08:00
Oliver Wolff
cf0e903c89 dsp/lossless: Fix non gcc ARM builds
The workaround for GCC ARM must not be applied when another toolchain
(like MSVC) is used for the build.

Change-Id: I11ec4558902063ccb085d3f435e24b3a60739dd5
2019-11-27 15:05:08 +01:00
Vincent Rabaud
bb7bc40b6d Remove ubsan errors.
'upper' could be NULL and it would be increased.
But that is for predictor zero that does not use 'upper'.

Change-Id: Icd4ae6792cc55ea021b4f828c3dbdb5f03e120d8
2019-11-06 14:08:14 +01:00
James Zern
fab8f9cfcf cosmetics: normalize '*' association
we associate '*' with types rather than variables

Change-Id: Id93ed65272a8a88e604278693e3850649639e9b6
2019-07-26 01:04:09 -07:00
Pascal Massimino
9d6988f44d Fix the oscillating prediction problem at low quality
For some exact resonance the over-quantization was exactly
compensating the under-quantization, leading to resonance
and strange patterns.

-> we special-handle the very flat blocks, hopefully for the
greater good (and not just the bad-resonance case).

For 'fast mode' (-m 3 or less), we just pay special attention
to the border of the image, where the oscillation / instability
usually starts. For the inner part of the image, since we're not
doing rd-opt, it's harder to fix anything.

Overall, on 'regular' images, the change is within the noise,
often leading to overall faster encoding (because of the short-cut).

BUG=webp:432

Change-Id: Ifaa8286499add80fd77daecf8e347abbff7c3a15
2019-07-03 08:40:41 -07:00
James Zern
92dbf23775 filters_sse2,cosmetics: shorten some long lines
Change-Id: Ifd8ddec50821aba175d41237df18e41b9ac6c7d4
2019-07-01 12:17:43 -07:00
James Zern
a277d197a2 filters_sse2.c: quiet integer sanitizer warnings
missed in a788b49

with clang7+ quiets conversion warnings like:
implicit conversion from type 'int' of value -114 (32-bit, signed) to
type 'uint8_t' (aka 'unsigned char') changed the value to 142 (8-bit,
unsigned)

Change-Id: I52dcd9cd613107f5424177c277785b92430bffb7
2019-07-01 11:16:50 -07:00
James Zern
a788b49897 filters_sse2.c: quiet integer sanitizer warnings
with clang7+ quiets conversion warnings like:
implicit conversion from type 'int' of value -114 (32-bit, signed) to
type 'uint8_t' (aka 'unsigned char') changed the value to 142 (8-bit,
unsigned)

Change-Id: I7f08a836ddcf777454dfd5b877a81b62b2abac86
2019-06-28 23:22:49 -07:00
James Zern
e6a92c5e15 filters.c: quiet integer sanitizer warnings
with clang7+ quiets conversion warnings like:
implicit conversion from type 'int' of value -12 (32-bit, signed) to
type 'uint8_t' (aka 'unsigned char') changed the value to 244 (8-bit,
unsigned)

Change-Id: I053c92301e55dcb0cae89a7733636283da942176
2019-06-28 23:16:28 -07:00
James Zern
ec1cc40a59 lossless.c: remove U32 -> S8 conversion warnings
Change-Id: Ica2664ea087254959391275654412141ed9472df
2019-06-28 01:34:55 -07:00
Pascal Massimino
1106478f42 remove conversion U32 -> S8 warnings
using an inline U32ToS8() function

Change-Id: I45f535c6c9b5de33d69acc17b466e183fcc19a63
2019-06-24 16:42:42 -07:00
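The entry above names an inline `U32ToS8()` function but does not show its body. A minimal sketch of what such a helper could look like is given below. The name `SketchU32ToS8` and the implementation are assumptions, written so that the narrowing is explicit and no implicit-conversion warning is triggered.

```c
#include <assert.h>
#include <stdint.h>

// Keeps the low 8 bits of 'v' and reinterprets them as a signed byte.
static int8_t SketchU32ToS8(uint32_t v) {
  const int low = (int)(v & 0xff);                    // 0..255, fits in int
  const int result = (low < 128) ? low : low - 256;   // map to -128..127
  assert(result >= INT8_MIN && result <= INT8_MAX);
  return (int8_t)result;                              // value-preserving cast
}
```

For instance, the value 1955895199 quoted in the sanitizer message for 4627c1c91b has low byte 159, which this sketch maps to -97.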
Skal
812a6b49fc lossless_enc: fix some conversion warning
object code is unchanged.

Change-Id: I40fc16056c0ab44c5c57ef6b02af14be767abe87
2019-06-24 16:16:18 +02:00
James Zern
4627c1c91b lossless_enc,TransformColorBlue: quiet uint32_t conv warning
no change in object code

from clang-7 integer sanitizer:
implicit conversion from type 'uint32_t' (aka 'unsigned int') of value
1955895199 (32-bit, unsigned) to type 'uint8_t' (aka 'unsigned char')
changed the value to 159 (8-bit, unsigned)

Change-Id: I0c3022339e34b9c9af03167ab827ade677973644
2019-06-20 23:06:13 -07:00
James Zern
c84673a62f lossless_enc_sse{2,41}: quiet signed conv warnings
_mm_set1_epi16 takes a short argument

from clang-7 integer sanitizer:
implicit conversion from type 'int' of value 65280 (32-bit, signed) to
type 'short' changed the value to -256 (16-bit, signed)

Change-Id: Iad64f6209a8c130a7df67515451ded45b3f91702
2019-06-15 00:22:03 -07:00
James Zern
776a775709 dec_sse2: quiet signed conv warnings
_mm_set1_epi8() takes a char argument
_mm_insert_epi16 takes a short argument

from clang-7 integer sanitizer:
implicit conversion from type 'int' of value 189 (32-bit, signed) to
type 'char' changed the value to -67 (8-bit, signed)
implicit conversion from type 'int' of value 128 (32-bit, signed) to
type 'char' changed the value to -128 (8-bit, signed)
implicit conversion from type 'int' of value 33909 (32-bit, signed) to
type 'short' changed the value to -31627 (16-bit, signed)

Change-Id: Id6b191b2c06881e27d447eeb1ff5bb2c1857b6ba
2019-06-14 01:00:20 -07:00
James Zern
e78dea7587 {alpha_processing,enc}_sse2: quiet signed conv warnings
_mm_set1_epi8() takes a char argument
_mm_insert_epi16 takes a short argument

from clang-7 integer sanitizer:
implicit conversion from type 'int' of value 255 (32-bit, signed) to
type 'char' changed the value to -1 (8-bit, signed)
implicit conversion from type 'int' of value 33153 (32-bit, signed) to
type 'short' changed the value to -32383 (16-bit, signed)

Change-Id: Ic88c8ef3d00146d34f53a560582db673f818370d
2019-06-10 14:23:58 -07:00
Pascal Massimino
ab2dc8939f Rescaler: fix rounding error
We saturate the result to [0..255]
It's the easiest and safest, given the wide variety of scaling
range we cover: we're not using floats, so precision is always
an issue at one end or the other of the scaling spectrum.

we also use:
  round(a - floor(b))
instead of:
  floor(a - round(b))
to handle difficult cases (ratio ~= .99, e.g.)

MIPS code is still disabled (and wrong)

Change-Id: I18d3f5ddc4c524879c257b928329b1c648fa7fb5
2019-03-30 06:43:55 +00:00
James Zern
8c3f04febb AndroidCPUInfo: reorder terms in conditional
'var != constant' is the preferred style for the library

Change-Id: I226e6d5d80dddd0469808136605f49205d238341
2019-03-15 18:12:04 -07:00
Johann
5173d4ee6f neon IsFlat
Move IsFlat to its own header. This allows it to continue to be
inlined. Using the RTCD and creating a distinct function slows down arm
builds.

   flower   mug
C    3.59  2.12
NEON 3.47  2.01

BUG=b/118740850

Change-Id: Id77e8f76d9e9790c498806e7070bbe37c10bc2e9
2018-12-03 22:59:12 +00:00
Johann
9f4d4a3f49 neon: GetResidualCost
Direct copy of sse2. Slight improvement because neon has
abs().

flower.ppm had minimal improvement. Somewhat expected because
GetResidualCost_C is only ~3.6%

mug.ppm had a better improvement because GetResidualCost_C is
almost 9%.

C    2.150
NEON 2.130

BUG=b/118740850

Change-Id: Ibc0dd97a81596635f5599cf568205974b4fd2597
2018-11-14 11:46:58 -08:00
Johann
0fd7514b55 neon: SetResidualCoeffs
Much faster with aarch64. Still somewhat faster without vmaxv.

C: 3.700s
ArmV7: 3.675
aarch64: 3.600

BUG=b/118740850

Change-Id: I3be852da89633eca4bddce443c87f5e4a2f55868
2018-11-14 11:46:40 -08:00
Vincent Rabaud
decf6f6b87 Speedups for empty histograms.
When histograms are empty, it is easy to add them.
They should also not be considered when merging histograms
(it is a waste of CPU).
This does not change the compression performance,
just the speed.

Change-Id: I42c721ca0f9c5ea067e73b792aa3db6d5e71d01f
2018-10-20 13:23:50 +02:00
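The shortcut described in the entry above can be pictured as follows: when one operand of an addition is known to be empty, the sum is just a copy of the other operand. This is only a sketch under assumed names (`SketchHistogram`, `is_used`, `SketchHistogramAdd`). libwebp's actual histogram structure and bookkeeping are more involved.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NUM_BINS 256

// Hypothetical histogram type; 'is_used == 0' means all counts are zero.
typedef struct {
  uint32_t counts[SKETCH_NUM_BINS];
  int is_used;
} SketchHistogram;

// out = a + b, skipping the per-bin loop when either input is empty.
static void SketchHistogramAdd(const SketchHistogram* a,
                               const SketchHistogram* b,
                               SketchHistogram* out) {
  int i;
  assert(a != NULL && b != NULL && out != NULL);
  if (!a->is_used || !b->is_used) {
    // At most one input holds data, so the sum is a plain copy of it.
    const SketchHistogram* const src = a->is_used ? a : b;
    if (src != out) memcpy(out, src, sizeof(*out));
    return;
  }
  for (i = 0; i < SKETCH_NUM_BINS; ++i) {
    out->counts[i] = a->counts[i] + b->counts[i];
  }
  out->is_used = 1;
}

// Small demonstration: adds an empty histogram to one holding a single bin.
static uint32_t SketchHistogramDemo(void) {
  SketchHistogram empty = { { 0 }, 0 };
  SketchHistogram one = { { 0 }, 1 };
  SketchHistogram sum;
  one.counts[7] = 3;
  SketchHistogramAdd(&empty, &one, &sum);
  return sum.counts[7];
}
```

The result is identical to the full loop, which is consistent with the entry's remark that only speed changes, not compression.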
Vincent Rabaud
dea3e89983 Split HistogramAdd to only have the high level logic in C.
Change-Id: Ic9eaebf7128ca0215b49d2a13bde1f5b94a28061
2018-10-19 14:03:28 +02:00
Vincent Rabaud
cbf82cc04d Remove AVX2 files.
There is only enc_avx2.c and we never managed to get
something fast enough.

Change-Id: I7465b5d8ccf47d9aa612173b8f80f96060cdb366
2018-10-16 14:12:03 +02:00
Vincent Rabaud
ac5433118a Remove a few more useless #defines
Change-Id: I211e9bcb1c37d0ebc108896f109b23ce915e22b4
2018-10-15 16:26:10 +02:00
Vincent Rabaud
3e13da7b4f Clean-up the common sources in dsp.
Change-Id: I1b995e6517e8437127a433dccbb5b2db63e7c3a3
2018-10-08 15:00:01 +02:00
James Zern
de08d72741 cosmetics: normalize include guard comment
Change-Id: I0e08ec604aad8412cfe3d3670d773f4ae5650375
2018-08-22 14:46:53 -07:00
Pascal Massimino
2563db4759 fix rescaling rounding inaccuracy
We should be using 'floor' when doing the final divide.

-> new MACRO is MULT_FIX_FLOOR()

     XXX*** Mips code is DISABLED for now ***XXX

I'll update and re-enable it in a later
patch, since this code needs some refactoring first.

BUG=oss-fuzz:9179

Change-Id: Ic0693cdca4e71f5beab1029475e35c4d06b12d13
2018-07-10 22:45:50 -07:00
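To make the difference between a rounding and a flooring fixed-point multiply concrete, here is a small sketch. It assumes 32 fractional bits and uses hypothetical names (`SKETCH_RFIX`, `SketchMultFixRound`, `SketchMultFixFloor`). The actual `MULT_FIX_FLOOR()` macro is not shown in the entry above and may be defined differently.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_RFIX 32                               // assumed fractional bits
#define SKETCH_ROUNDER (1ull << (SKETCH_RFIX - 1))   // 0.5 in fixed point

// x * y / 2^32, rounded to nearest.
static uint32_t SketchMultFixRound(uint32_t x, uint32_t y) {
  return (uint32_t)(((uint64_t)x * y + SKETCH_ROUNDER) >> SKETCH_RFIX);
}

// x * y / 2^32, rounded down (the 'floor' variant).
static uint32_t SketchMultFixFloor(uint32_t x, uint32_t y) {
  return (uint32_t)(((uint64_t)x * y) >> SKETCH_RFIX);
}
```

With `y = 0x80000000` (0.5 in this format) and `x = 3`, the rounding variant returns 2 while the floor variant returns 1, which is the kind of off-by-one the final divide has to get right.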
James Zern
0d5fad46cf add WEBP_DSP_INIT / WEBP_DSP_INIT_FUNC
this internalizes the init checks and provides stronger synchronization
with pthreads when available while still allowing VP8GetCPUInfo to be
modified (mostly for testing purposes). windows is left as is since a
critical section or mutex would cause a leak.

Change-Id: Ieb997e014f2805c0ae39c16f13337663521356f4
(cherry picked from commit d77bf512bd)
2018-04-17 18:01:34 -07:00
Pascal Massimino
c1cb86af5f fix 16b overflow in SSE2
the 'accum' variable can be larger than 15b for large
rescale values.

Assert triggered:
 src/dsp/rescaler_sse2.c:249: RescalerExportRowExpand_SSE2: Assertion `v >= 0 && v <= 255' failed.
 src/dsp/rescaler_sse2.c:350: RescalerExportRowShrink_SSE2: Assertion `v >= 0 && v <= 255' failed.

-> fall back to C implementation in this case for now

Change-Id: I7ea1cb72301cafc1459be403f6a6f4e3cbc89bb1
2018-04-11 21:25:06 +00:00
James Zern
120f58c3aa Merge "lossless*sse2: improve non-const 16-bit vector creation" 2018-02-20 19:56:07 +00:00
James Zern
8043504f95 lossless*sse2: improve non-const 16-bit vector creation
use _mm_set1_epi32 instead of _mm_set_epi16 with non-const values;
reduces shifts and ors.

Change-Id: Ie2cb2ab815f642855d03c6f3001223bcac4bd35c
2018-02-17 17:59:20 -08:00
Pascal Massimino
3b07d32712 Import,RGBA: fix for BigEndian import
+ simplification of the logic

Change-Id: Ia20ce844793ed35ea03a17cef45838f3d0ae4afa
2018-02-17 13:07:58 -08:00
James Zern
f4dd92565e remove WEBP_EXPERIMENTAL_FEATURES
the webp bitstream is considered stable at this point

Change-Id: I4b13f9ed4c45f63785474b097e96cb7bf651be7b
2018-02-09 10:25:11 -08:00
skal
6de58603b7 MIPS64: Fix defined-but-not-used errors with WEBP_REDUCE_CSP
BUG=webp:372

Change-Id: Ided3fae748face18138a8050eaced5e0f58120d4
2018-01-30 17:40:09 -08:00
Vincent Rabaud
cf1c5054c7 Add an SSE4 version of some lossless color transforms.
Change-Id: Ieac094f684116d1292793b2ca321f6f1a69565b5
2018-01-24 14:33:25 +01:00
James Zern
05f6fe24c3 upsampling: rm asserts w/REDUCE_CSP+OMIT_C_CODE
with WEBP_NEON_OMIT_C_CODE the default _C functions won't be set and
with WEBP_REDUCE_CSP the NEON functions won't be either triggering an
assert for an empty table member.

BUG=chromium:792627

Change-Id: I8d2d430eaa37bb92885b61a3dd39f961924a8def
2017-12-06 17:09:26 -08:00
Vincent Rabaud
55403a9a5a Upsampling SSE2/SSE4 speedup.
RGB to YUV conversion was not using SSE to finish up the row.
End data is now copied to a buffer big enough to fit in a
SSE register.
(UPSAMPLE_LAST_BLOCK was already using that trick).

Change-Id: Ie539bcbe570a643a774aa88263503c0d2c41890f
2017-12-05 23:37:06 +01:00
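The trick mentioned in the entry above, copying the end of a row into a register-sized buffer so the vector kernel can finish the row, can be sketched in portable C. Everything here is illustrative: `SketchKernel16` is a plain-C stand-in for a 16-byte SSE kernel, and the names are not from libwebp.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BLOCK 16  // bytes in one SSE register

// Stand-in for a SIMD kernel that always handles exactly 16 bytes.
static void SketchKernel16(const uint8_t* in, uint8_t* out) {
  int i;
  for (i = 0; i < SKETCH_BLOCK; ++i) out[i] = (uint8_t)(255 - in[i]);
}

// Processes 'len' bytes, using the 16-byte kernel for the tail as well.
static void SketchProcessRow(const uint8_t* in, uint8_t* out, int len) {
  int x = 0;
  assert(in != NULL && out != NULL && len >= 0);
  for (; x + SKETCH_BLOCK <= len; x += SKETCH_BLOCK) {
    SketchKernel16(in + x, out + x);
  }
  if (x < len) {
    // Copy the leftover pixels into padded scratch buffers so the kernel
    // never reads or writes past the end of the row.
    uint8_t tmp_in[SKETCH_BLOCK] = { 0 };
    uint8_t tmp_out[SKETCH_BLOCK];
    memcpy(tmp_in, in + x, (size_t)(len - x));
    SketchKernel16(tmp_in, tmp_out);
    memcpy(out + x, tmp_out, (size_t)(len - x));
  }
}

// Demonstration on a 21-byte row (one full block plus a 5-byte tail).
static int SketchRowDemo(void) {
  uint8_t in[21], out[21];
  int i;
  for (i = 0; i < 21; ++i) in[i] = (uint8_t)i;
  SketchProcessRow(in, out, 21);
  for (i = 0; i < 21; ++i) {
    if (out[i] != 255 - i) return 0;
  }
  return 1;
}
```

The cost is two small copies per row, which is typically cheaper than a scalar loop over the tail when the kernel is much faster than the C fallback.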
Vincent Rabaud
807b53c47e Implement the upsampling/yuv functions in SSE41
Change-Id: If122da22b74a974262063d232f6ca0ab902ff64e
2017-12-04 22:29:43 +01:00
Pascal Massimino
1af0df7662 Merge "WEBP_REDUCE_CSP: restrict colorspace support" 2017-11-27 20:08:55 +00:00
Pascal Massimino
6de20df02c WEBP_REDUCE_CSP: restrict colorspace support
only supported ones are: RGBA/BGRA/rgbA/bgrA (decoder)
as well as: WebPPictureImportRGB/RGBX/RGBA (encoder).

(note: extras/get_disto is affected too)

Change-Id: If6c4f95054ca15759c4e289fb3b4c352b3521c2c
2017-11-26 08:44:08 +00:00
Pascal Massimino
0df22b9eed WEBP_REDUCE_SIZE: disable all rescaler code
BUG=webp:355

Change-Id: Id87cb11902e3fb8544a214308526ea9665ce8440
2017-11-24 22:08:32 +00:00
Pascal Massimino
a80c46bd87 SSE2 implementation of HasAlphaXXX
Change-Id: I2548d9a0c252e20ee3cf5f4be736a3703671ecb4
HasAlpha32b: ~3-4x faster
HasAlpha8b: ~7-8x faster
2017-11-23 15:02:21 +01:00
James Zern
b299c47eac add WEBP_REDUCE_SIZE
remove auto-filter (-af) support and make WebPPictureCopy,
WebPPictureIsView, WebPPictureView, WebPPictureCrop, and
WebPPictureRescale noops.

Change-Id: If39d512cc268a0015298a1138dbc94feb86575e5
2017-11-22 17:35:39 -08:00
James Zern
eab5bab74f add WEBP_DISABLE_STATS
use to make WebPPictureDistortion & WebPPlaneDistortion noops and
clear some ssim code.

Change-Id: I9b50b2318b7a114632e5a237a4002f64e95afbbc
2017-11-22 12:41:17 -08:00
Pascal Massimino
c245343dcb move LOAD8x4 and STORE8x2 closer to their use location
Change-Id: I674821732d3e607123070e4bbba87d9359c9a4ec
2017-11-21 23:44:39 -08:00
James Zern
b9e734fd5c dec,cosmetics: normalize function naming style
Change-Id: I33a2d1b4133db7a6d56d506f5c19670f0268cecd
2017-11-21 14:31:34 -08:00
James Zern
c188d546b3 dec: harmonize function suffixes
BUG=webp:355

Change-Id: Iabdfd3fbde906c2e35a7d7c080a8512425eb8ccb
2017-11-21 13:00:25 -08:00
James Zern
28c5ac8104 dec_sse41: harmonize function suffixes
BUG=webp:355

Change-Id: Id55f7b2e6288d1d0885d8451fbc59771222073d6
2017-11-21 12:47:06 -08:00
Pascal Massimino
e65b72a368 Merge "introduce WebPHasAlpha8b and WebPHasAlpha32b" 2017-11-21 06:21:44 +00:00
James Zern
b94cee98fb dec_sse2: remove HE8uv_SSE2
with gcc-4.8, clang-4.0.1/5 this is no faster (actually up to 2x slower)
than the code generated for memset (0x01010... * dst[-1]). shuffles in
sse4 recover a bit, but performance is still down.

Change-Id: Ie85e8353f8ede559d0b05a1d388787fd18ecc80f
2017-11-20 20:34:05 -08:00
Pascal Massimino
44a0ee3fa7 introduce WebPHasAlpha8b and WebPHasAlpha32b
Rewrote WebPPictureHasTransparency() to use them (even for argb).
This is 10% faster, for some reason.

SSE2 version should be straightforward.
Removes a TODO.

Change-Id: I7ad5848fc5e355e2df505dbcd5a0f42fb6cbab41
2017-11-20 15:20:29 +01:00
Vincent Rabaud
c462cd0065 Remove useless code.
The casts are to the same type and the #define not used.

Change-Id: I8d69c3b9dde7a1c53c2ba5a026a653d8c2e1d2a7
2017-11-08 10:52:49 +01:00
James Zern
b7971d0e22 dsp: avoid defining _C functions w/NEON builds
when targeting NEON, C functions with NEON equivalents won't be used, but
will contribute to binary size. the same goes for sse2, etc., but this
change is primarily concerned with binary sizes for android arm targets.

note '-noasm' or otherwise modifying VP8GetCPUInfo will have no effect
on the use of NEON functions.

this decision can be overridden by defining WEBP_DSP_OMIT_C_CODE to 0.

Change-Id: I47bd453c84a3d341ca39bc986a39eb9c785aface
2017-10-27 10:54:56 -07:00
James Zern
8d033b14d7 {dec,enc}_neon: harmonize function suffixes x2
+ neon.h

BUG=webp:355

Change-Id: Ia17c7dfc7d61742a4758823675a2d556a739c389
2017-10-20 19:00:53 -07:00
James Zern
0295e9815d upsampling_neon: harmonize function suffixes
BUG=webp:355

Change-Id: I75423abbe0bcea3c98a42e412cc2116be81b5d08
2017-10-20 19:00:53 -07:00
James Zern
d572c4e52b yuv_neon: harmonize function suffixes
BUG=webp:355

Change-Id: Ia2f716b459950c18717b062175197d1e6419bf2a
2017-10-20 19:00:53 -07:00
James Zern
ab9c2500db rescaler_neon: harmonize function suffixes
BUG=webp:355

Change-Id: I161caa14f7ebbc3ae978b1722472625a77d0a4a4
2017-10-20 19:00:53 -07:00
James Zern
93e0ce27f4 lossless_neon: harmonize function suffixes
BUG=webp:355

Change-Id: I4210081a39800b5c2589c443da237269908af666
2017-10-20 19:00:53 -07:00
James Zern
22fbc50edd lossless_enc_neon: harmonize function suffixes
BUG=webp:355

Change-Id: I462facaeade4f0f4fc1e96895493306d095a6a9a
2017-10-20 19:00:53 -07:00
James Zern
447875b47b filters_neon,cosmetics: fix indent
BUG=webp:355

Change-Id: I9df1119f1ea94868f75253a92c2e878c9290f744
2017-10-20 19:00:29 -07:00
James Zern
785da7eadd enc_neon: harmonize function suffixes
BUG=webp:355

Change-Id: Ie59efd271d16f12d21f3c800667dfc0980dc2e68
2017-10-20 00:18:32 -07:00
James Zern
bc1a251fcf dec_neon: harmonize function suffixes
BUG=webp:355

Change-Id: I61c9a0c9e24515322955e04afd8c4ea6a44b9319
2017-10-20 00:14:18 -07:00
James Zern
61e535f1ac dsp/lossless: workaround gcc-4.8 bug on arm
and all older versions.
force Sub3() to not be inlined, otherwise the code in Select() will be
incorrect.

extends the check added previously in:
637b3888 dsp/lossless: workaround gcc-4.9 bug on arm

BUG=webp:363

Change-Id: I1403b558f8660b764f3a570a3326822d5ef0be29
2017-10-19 13:05:48 -07:00
Pascal Massimino
0a17f4712c Merge "WIP: list includes as descendants of the project dir" 2017-10-11 08:21:42 +00:00
James Zern
a439972175 WIP: list includes as descendants of the project dir
#include "(.|..)/..." -> #include "src/..."

Change-Id: I772880aa097a770722043c8a4393552ba38a89b6
2017-10-10 23:04:05 -07:00
James Zern
d361a6a733 yuv_sse2: harmonize function suffixes
BUG=webp:355

Change-Id: I02a66f7446c75a10c3ce4766235e5767617d0dce
2017-10-08 14:06:34 -07:00
James Zern
6921aa6f0c upsampling_sse2: harmonize function suffixes
BUG=webp:355

Change-Id: I3a02cc717eb7506bd87511d6a17ab1691e84f72c
2017-10-08 14:06:30 -07:00
James Zern
08c67d3ed1 ssim_sse2: harmonize function suffixes
BUG=webp:355

Change-Id: I1282559888118b8cb0a46b7f0aa627d26b8838f5
2017-10-08 14:06:24 -07:00
James Zern
582a1b572a rescaler_sse2: harmonize function suffixes
BUG=webp:355

Change-Id: I978fd826ff90149c0ffd9d7607dcc6f88082d3e6
2017-10-08 14:06:19 -07:00
James Zern
2c1b18ba2f lossless_sse2: harmonize function suffixes
BUG=webp:355

Change-Id: I59d828800c2ab2a36e0ea90f629b74bd57207411
2017-10-08 14:06:14 -07:00
James Zern
0ac46e818b lossless_enc_sse2: harmonize function suffixes
BUG=webp:355

Change-Id: I06c64416103c3f3fc0519dd46d64b0a35f9798e4
2017-10-08 14:06:05 -07:00
James Zern
bc634d57c2 enc_sse2: harmonize function suffixes
BUG=webp:355

Change-Id: Idd2f289fcf99f12bf36494111b07a8906c99c826
2017-10-08 14:05:59 -07:00
James Zern
bcb7347c2b dec_sse2: harmonize function suffixes
BUG=webp:355

Change-Id: Ic0390a4a24a5d8caff5b8af9fc9d59769ec533b1
2017-10-07 15:14:03 -07:00
James Zern
fb3daad604 cpu: fix ssse3 check
ssse3 is bit #9 in ecx, bit 1 is sse3. this only controls the check for
slow ssse3 and likely had no ill effect.

Change-Id: I84ce73dc480e1cdbd085e37be06f3f402116c201
2017-09-29 16:27:47 -07:00
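For reference, the feature bits involved can be tested as below. Intel's CPUID documentation places SSE3 at bit 0 and SSSE3 at bit 9 of ECX for leaf 1, counting from zero. The function and macro names are hypothetical, and the sketch takes an already-fetched ECX value rather than executing CPUID itself.

```c
#include <assert.h>
#include <stdint.h>

// Bit positions in ECX after CPUID leaf 1, counted from zero.
#define SKETCH_ECX_SSE3  (1u << 0)
#define SKETCH_ECX_SSSE3 (1u << 9)

// 'ecx' is the ECX register value returned by CPUID with EAX = 1.
static int SketchHasSSE3(uint32_t ecx) {
  return (ecx & SKETCH_ECX_SSE3) != 0;
}

static int SketchHasSSSE3(uint32_t ecx) {
  return (ecx & SKETCH_ECX_SSSE3) != 0;
}
```

Testing the SSE3 mask where SSSE3 was intended would report SSSE3 on any CPU with SSE3, which is the kind of mix-up the entry above corrects.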
Vincent Rabaud
a5216efc8c Fix integer overflow warning.
Though the overflow could happen, it does not change the
end results.

Change-Id: I1b84e022a0776d35eab5c5c4fb7d3563f5667bfa
2017-09-25 11:02:22 +02:00
James Zern
f78da3dea6 add LOCAL_CLANG_PREREQ and avoid WORK_AROUND_GCC w/3.8+
this results in a 15-20% speedup for lossy decoding on a N5/S6/CM1

BUG=webp:339

Change-Id: Icdeb84c3e0b8908147ac276b4d8f76c3d565b735
2017-09-19 20:59:49 -07:00
James Zern
01c426f1e7 define WEBP_USE_INTRINSICS w/gcc-4.9+
32-bit builds are neutral to slightly faster using ndk r15c on a
N5/S6/CM1

BUG=webp:339

Change-Id: I94b9442e0ceaf2f5edb2b4026bc8b99cd77c918b
2017-09-19 20:59:43 -07:00
Pascal Massimino
3822762a6c rationalize the Makefile.am
one library addition per line, etc...

BUG=webp:355

Change-Id: I95761dea598a382db5632c5187210937e129ff75
2017-08-29 00:00:14 -07:00