Commit Graph

809 Commits

Author SHA1 Message Date
Vincent Rabaud
cbf82cc04d Remove AVX2 files.
There is only enc_avx2.c and we never managed to get
something fast enough.

Change-Id: I7465b5d8ccf47d9aa612173b8f80f96060cdb366
2018-10-16 14:12:03 +02:00
Vincent Rabaud
ac5433118a Remove a few more useless #defines
Change-Id: I211e9bcb1c37d0ebc108896f109b23ce915e22b4
2018-10-15 16:26:10 +02:00
Vincent Rabaud
3e13da7b4f Clean-up the common sources in dsp.
Change-Id: I1b995e6517e8437127a433dccbb5b2db63e7c3a3
2018-10-08 15:00:01 +02:00
James Zern
de08d72741 cosmetics: normalize include guard comment
Change-Id: I0e08ec604aad8412cfe3d3670d773f4ae5650375
2018-08-22 14:46:53 -07:00
Pascal Massimino
2563db4759 fix rescaling rounding inaccuracy
We should be using 'floor' when doing the final divide.

-> new MACRO is MULT_FIX_FLOOR()

     XXX*** Mips code is DISABLED for now ***XXX

I'll update and re-enable it in a later
patch, since this code needs some refactoring first.

BUG=oss-fuzz:9179

Change-Id: Ic0693cdca4e71f5beab1029475e35c4d06b12d13
2018-07-10 22:45:50 -07:00
James Zern
0d5fad46cf add WEBP_DSP_INIT / WEBP_DSP_INIT_FUNC
this internalizes the init checks and provides stronger synchronization
with pthreads when available while still allowing VP8GetCPUInfo to be
modified (mostly for testing purposes). windows is left as is since a
critical section or mutex would cause a leak.

Change-Id: Ieb997e014f2805c0ae39c16f13337663521356f4
(cherry picked from commit d77bf512bd)
2018-04-17 18:01:34 -07:00
Pascal Massimino
c1cb86af5f fix 16b overflow in SSE2
the 'accum' variable can be larger than 15b for large
rescale values.

Assert triggered:
 src/dsp/rescaler_sse2.c:249: RescalerExportRowExpand_SSE2: Assertion `v >= 0 && v <= 255' failed.
 src/dsp/rescaler_sse2.c:350: RescalerExportRowShrink_SSE2: Assertion `v >= 0 && v <= 255' failed.

-> fall back to C implementation in this case for now

Change-Id: I7ea1cb72301cafc1459be403f6a6f4e3cbc89bb1
2018-04-11 21:25:06 +00:00
James Zern
120f58c3aa Merge "lossless*sse2: improve non-const 16-bit vector creation" 2018-02-20 19:56:07 +00:00
James Zern
8043504f95 lossless*sse2: improve non-const 16-bit vector creation
use _mm_set1_epi32 instead of _mm_set_epi16 with non-const values;
reduces shifts and ors.

Change-Id: Ie2cb2ab815f642855d03c6f3001223bcac4bd35c
2018-02-17 17:59:20 -08:00
Pascal Massimino
3b07d32712 Import,RGBA: fix for BigEndian import
+ simplification of the logic

Change-Id: Ia20ce844793ed35ea03a17cef45838f3d0ae4afa
2018-02-17 13:07:58 -08:00
James Zern
f4dd92565e remove WEBP_EXPERIMENTAL_FEATURES
the webp bitstream is considered stable at this point

Change-Id: I4b13f9ed4c45f63785474b097e96cb7bf651be7b
2018-02-09 10:25:11 -08:00
skal
6de58603b7 MIPS64: Fix defined-but-not-used errors with WEBP_REDUCE_CSP
BUG=webp:372

Change-Id: Ided3fae748face18138a8050eaced5e0f58120d4
2018-01-30 17:40:09 -08:00
Vincent Rabaud
cf1c5054c7 Add an SSE4 version of some lossless color transforms.
Change-Id: Ieac094f684116d1292793b2ca321f6f1a69565b5
2018-01-24 14:33:25 +01:00
James Zern
05f6fe24c3 upsampling: rm asserts w/REDUCE_CSP+OMIT_C_CODE
with WEBP_NEON_OMIT_C_CODE the default _C functions won't be set and
with WEBP_REDUCE_CSP the NEON functions won't be either triggering an
assert for an empty table member.

BUG=chromium:792627

Change-Id: I8d2d430eaa37bb92885b61a3dd39f961924a8def
2017-12-06 17:09:26 -08:00
Vincent Rabaud
55403a9a5a Upsampling SSE2/SSE4 speedup.
RGB to YUV conversion was not using SSE to finish up the row.
End data is now copied to a buffer big enough to fit in a
SSE register.
(UPSAMPLE_LAST_BLOCK was already using that trick).

Change-Id: Ie539bcbe570a643a774aa88263503c0d2c41890f
2017-12-05 23:37:06 +01:00
Vincent Rabaud
807b53c47e Implement the upsampling/yuv functions in SSE41
Change-Id: If122da22b74a974262063d232f6ca0ab902ff64e
2017-12-04 22:29:43 +01:00
Pascal Massimino
1af0df7662 Merge "WEBP_REDUCE_CSP: restrict colorspace support" 2017-11-27 20:08:55 +00:00
Pascal Massimino
6de20df02c WEBP_REDUCE_CSP: restrict colorspace support
only supported ones are: RGBA/BGRA/rgbA/bgrA (decoder)
as well as: WebPPictureImportRGB/RGBX/RGBA (encoder).

(note: extras/get_disto is affected too)

Change-Id: If6c4f95054ca15759c4e289fb3b4c352b3521c2c
2017-11-26 08:44:08 +00:00
Pascal Massimino
0df22b9eed WEBP_REDUCE_SIZE: disable all rescaler code
BUG=webp:355

Change-Id: Id87cb11902e3fb8544a214308526ea9665ce8440
2017-11-24 22:08:32 +00:00
Pascal Massimino
a80c46bd87 SSE2 implementation of HasAlphaXXX
Change-Id: I2548d9a0c252e20ee3cf5f4be736a3703671ecb4
HasAlpha32b: ~3-4x faster
HasAlpha8b: ~7-8x faster
2017-11-23 15:02:21 +01:00
James Zern
b299c47eac add WEBP_REDUCE_SIZE
remove auto-filter (-af) support and make WebPPictureCopy,
WebPPictureIsView, WebPPictureView, WebPPictureCrop, and
WebPPictureRescale noops.

Change-Id: If39d512cc268a0015298a1138dbc94feb86575e5
2017-11-22 17:35:39 -08:00
James Zern
eab5bab74f add WEBP_DISABLE_STATS
use to to make WebPPictureDistortion & WebPPlaneDistortion noops and
clear some ssim code.

Change-Id: I9b50b2318b7a114632e5a237a4002f64e95afbbc
2017-11-22 12:41:17 -08:00
Pascal Massimino
c245343dcb move LOAD8x4 and STORE8x2 closer to their use location
Change-Id: I674821732d3e607123070e4bbba87d9359c9a4ec
2017-11-21 23:44:39 -08:00
James Zern
b9e734fd5c dec,cosmetics: normalize function naming style
Change-Id: I33a2d1b4133db7a6d56d506f5c19670f0268cecd
2017-11-21 14:31:34 -08:00
James Zern
c188d546b3 dec: harmonize function suffixes
BUG=webp:355

Change-Id: Iabdfd3fbde906c2e35a7d7c080a8512425eb8ccb
2017-11-21 13:00:25 -08:00
James Zern
28c5ac8104 dec_sse41: harmonize function suffixes
BUG=webp:355

Change-Id: Id55f7b2e6288d1d0885d8451fbc59771222073d6
2017-11-21 12:47:06 -08:00
Pascal Massimino
e65b72a368 Merge "introduce WebPHasAlpha8b and WebPHasAlpha32b" 2017-11-21 06:21:44 +00:00
James Zern
b94cee98fb dec_sse2: remove HE8uv_SSE2
with gcc-4.8, clang-4.0.1/5 this is no faster (actually up to 2x slower)
than the code generated for memset (0x01010... * dst[-1]). shuffles in
sse4 recover a bit, but performance is still down.

Change-Id: Ie85e8353f8ede559d0b05a1d388787fd18ecc80f
2017-11-20 20:34:05 -08:00
Pascal Massimino
44a0ee3fa7 introduce WebPHasAlpha8b and WebPHasAlpha32b
Rewrote WebPPictureHasTransparency() to use them (even for argb).
This is 10% faster, for some reasons.

SSE2 version should be straightforward.
Removes a TODO.

Change-Id: I7ad5848fc5e355e2df505dbcd5a0f42fb6cbab41
2017-11-20 15:20:29 +01:00
Vincent Rabaud
c462cd0065 Remove useless code.
The casts are to the same type and the #define not used.

Change-Id: I8d69c3b9dde7a1c53c2ba5a026a653d8c2e1d2a7
2017-11-08 10:52:49 +01:00
James Zern
b7971d0e22 dsp: avoid defining _C functions w/NEON builds
when targeting NEON C functions with NEON equivalents won't be used, but
will contribute to binary size. the same goes for sse2, etc., but this
change is primarily concerned with binary sizes for android arm targets.

note '-noasm' or otherwise modifying VP8GetCPUInfo will have no effect
on the use of NEON functions.

this decision can be overridden by defining WEBP_DSP_OMIT_C_CODE to 0.

Change-Id: I47bd453c84a3d341ca39bc986a39eb9c785aface
2017-10-27 10:54:56 -07:00
James Zern
8d033b14d7 {dec,enc}_neon: harmonize function suffixes x2
+ neon.h

BUG=webp:355

Change-Id: Ia17c7dfc7d61742a4758823675a2d556a739c389
2017-10-20 19:00:53 -07:00
James Zern
0295e9815d upsampling_neon: harmonize function suffixes
BUG=webp:355

Change-Id: I75423abbe0bcea3c98a42e412cc2116be81b5d08
2017-10-20 19:00:53 -07:00
James Zern
d572c4e52b yuv_neon: harmonize function suffixes
BUG=webp:355

Change-Id: Ia2f716b459950c18717b062175197d1e6419bf2a
2017-10-20 19:00:53 -07:00
James Zern
ab9c2500db rescaler_neon: harmonize function suffixes
BUG=webp:355

Change-Id: I161caa14f7ebbc3ae978b1722472625a77d0a4a4
2017-10-20 19:00:53 -07:00
James Zern
93e0ce27f4 lossless_neon: harmonize function suffixes
BUG=webp:355

Change-Id: I4210081a39800b5c2589c443da237269908af666
2017-10-20 19:00:53 -07:00
James Zern
22fbc50edd lossless_enc_neon: harmonize function suffixes
BUG=webp:355

Change-Id: I462facaeade4f0f4fc1e96895493306d095a6a9a
2017-10-20 19:00:53 -07:00
James Zern
447875b47b filters_neon,cosmetics: fix indent
BUG=webp:355

Change-Id: I9df1119f1ea94868f75253a92c2e878c9290f744
2017-10-20 19:00:29 -07:00
James Zern
785da7eadd enc_neon: harmonize function suffixes
BUG=webp:355

Change-Id: Ie59efd271d16f12d21f3c800667dfc0980dc2e68
2017-10-20 00:18:32 -07:00
James Zern
bc1a251fcf dec_neon: harmonize function suffixes
BUG=webp:355

Change-Id: I61c9a0c9e24515322955e04afd8c4ea6a44b9319
2017-10-20 00:14:18 -07:00
James Zern
61e535f1ac dsp/lossless: workaround gcc-4.8 bug on arm
and all older versions.
force Sub3() to not be inlined, otherwise the code in Select() will be
incorrect.

extends the check add previously in:
637b3888 dsp/lossless: workaround gcc-4.9 bug on arm

BUG=webp:363

Change-Id: I1403b558f8660b764f3a570a3326822d5ef0be29
2017-10-19 13:05:48 -07:00
Pascal Massimino
0a17f4712c Merge "WIP: list includes as descendants of the project dir" 2017-10-11 08:21:42 +00:00
James Zern
a439972175 WIP: list includes as descendants of the project dir
#include "(.|..)/..." -> #include "src/..."

Change-Id: I772880aa097a770722043c8a4393552ba38a89b6
2017-10-10 23:04:05 -07:00
James Zern
d361a6a733 yuv_sse2: harmonize function suffixes
BUG=webp:355

Change-Id: I02a66f7446c75a10c3ce4766235e5767617d0dce
2017-10-08 14:06:34 -07:00
James Zern
6921aa6f0c upsampling_sse2: harmonize function suffixes
BUG=webp:355

Change-Id: I3a02cc717eb7506bd87511d6a17ab1691e84f72c
2017-10-08 14:06:30 -07:00
James Zern
08c67d3ed1 ssim_sse2: harmonize function suffixes
BUG=webp:355

Change-Id: I1282559888118b8cb0a46b7f0aa627d26b8838f5
2017-10-08 14:06:24 -07:00
James Zern
582a1b572a rescaler_sse2: harmonize function suffixes
BUG=webp:355

Change-Id: I978fd826ff90149c0ffd9d7607dcc6f88082d3e6
2017-10-08 14:06:19 -07:00
James Zern
2c1b18ba2f lossless_sse2: harmonize function suffixes
BUG=webp:355

Change-Id: I59d828800c2ab2a36e0ea90f629b74bd57207411
2017-10-08 14:06:14 -07:00
James Zern
0ac46e818b lossless_enc_sse2: harmonize function suffixes
BUG=webp:355

Change-Id: I06c64416103c3f3fc0519dd46d64b0a35f9798e4
2017-10-08 14:06:05 -07:00
James Zern
bc634d57c2 enc_sse2: harmonize function suffixes
BUG=webp:355

Change-Id: Idd2f289fcf99f12bf36494111b07a8906c99c826
2017-10-08 14:05:59 -07:00