James Zern
120f58c3aa
Merge "lossless*sse2: improve non-const 16-bit vector creation"
2018-02-20 19:56:07 +00:00
James Zern
8043504f95
lossless*sse2: improve non-const 16-bit vector creation
...
use _mm_set1_epi32 instead of _mm_set_epi16 with non-const values;
reduces shifts and ors.
Change-Id: Ie2cb2ab815f642855d03c6f3001223bcac4bd35c
2018-02-17 17:59:20 -08:00
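The idea in the entry above can be illustrated with a small portable sketch: two 16-bit values are packed into one 32-bit word, which a single 32-bit broadcast (such as _mm_set1_epi32) could then replicate across a vector. The helper name Pack16x2 is hypothetical and is not taken from the commit.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper: packs two 16-bit lanes into one 32-bit word.
// On SSE2 the result could be handed to _mm_set1_epi32((int)word), so the
// compiler emits one broadcast instead of assembling eight 16-bit lanes.
static uint32_t Pack16x2(int hi, int lo) {
  // Accept anything that fits in 16 bits, signed or unsigned.
  assert(hi >= -32768 && hi <= 65535);
  assert(lo >= -32768 && lo <= 65535);
  return ((uint32_t)hi << 16) | ((uint32_t)lo & 0xffffu);
}
```

Broadcasting the packed word yields the lane pattern lo, hi, lo, hi, ... when the register is viewed as 16-bit lanes on a little-endian target.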
Pascal Massimino
3b07d32712
Import,RGBA: fix for BigEndian import
...
+ simplification of the logic
Change-Id: Ia20ce844793ed35ea03a17cef45838f3d0ae4afa
2018-02-17 13:07:58 -08:00
James Zern
f4dd92565e
remove WEBP_EXPERIMENTAL_FEATURES
...
the webp bitstream is considered stable at this point
Change-Id: I4b13f9ed4c45f63785474b097e96cb7bf651be7b
2018-02-09 10:25:11 -08:00
skal
6de58603b7
MIPS64: Fix defined-but-not-used errors with WEBP_REDUCE_CSP
...
BUG=webp:372
Change-Id: Ided3fae748face18138a8050eaced5e0f58120d4
2018-01-30 17:40:09 -08:00
Vincent Rabaud
cf1c5054c7
Add an SSE4 version of some lossless color transforms.
...
Change-Id: Ieac094f684116d1292793b2ca321f6f1a69565b5
2018-01-24 14:33:25 +01:00
James Zern
05f6fe24c3
upsampling: rm asserts w/REDUCE_CSP+OMIT_C_CODE
...
with WEBP_NEON_OMIT_C_CODE the default _C functions won't be set, and
with WEBP_REDUCE_CSP the NEON functions won't be either, triggering an
assert for an empty table member.
BUG=chromium:792627
Change-Id: I8d2d430eaa37bb92885b61a3dd39f961924a8def
2017-12-06 17:09:26 -08:00
Vincent Rabaud
55403a9a5a
Upsampling SSE2/SSE4 speedup.
...
RGB to YUV conversion was not using SSE to finish up the row.
End data is now copied to a buffer big enough to fit in a
SSE register.
(UPSAMPLE_LAST_BLOCK was already using that trick).
Change-Id: Ie539bcbe570a643a774aa88263503c0d2c41890f
2017-12-05 23:37:06 +01:00
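A minimal portable sketch of the tail-buffer trick described in the entry above, assuming a kernel that always consumes a full 16-byte block. ProcessBlock is a plain-C stand-in for a SIMD routine, and all names here are hypothetical rather than taken from the library.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { kBlock = 16 };  // bytes in one SSE register

// Stand-in for a SIMD kernel that always handles exactly kBlock bytes.
static void ProcessBlock(const uint8_t* src, uint8_t* dst) {
  int i;
  for (i = 0; i < kBlock; ++i) dst[i] = (uint8_t)(255 - src[i]);
}

// Processes 'len' bytes: full blocks directly, then the leftover bytes
// through a padded temporary so the block kernel can finish the row.
static void ProcessRow(const uint8_t* src, uint8_t* dst, int len) {
  int i = 0;
  assert(len >= 0);
  for (; i + kBlock <= len; i += kBlock) ProcessBlock(src + i, dst + i);
  if (i < len) {
    uint8_t tmp_in[kBlock] = { 0 };
    uint8_t tmp_out[kBlock];
    memcpy(tmp_in, src + i, (size_t)(len - i));   // copy the end data
    ProcessBlock(tmp_in, tmp_out);                // full-width kernel
    memcpy(dst + i, tmp_out, (size_t)(len - i));  // keep only valid bytes
  }
}
```

The padding bytes are processed but never written back, so the output matches a scalar loop over the same row.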
Vincent Rabaud
807b53c47e
Implement the upsampling/yuv functions in SSE41
...
Change-Id: If122da22b74a974262063d232f6ca0ab902ff64e
2017-12-04 22:29:43 +01:00
Pascal Massimino
1af0df7662
Merge "WEBP_REDUCE_CSP: restrict colorspace support"
2017-11-27 20:08:55 +00:00
Pascal Massimino
6de20df02c
WEBP_REDUCE_CSP: restrict colorspace support
...
only supported ones are: RGBA/BGRA/rgbA/bgrA (decoder)
as well as: WebPPictureImportRGB/RGBX/RGBA (encoder).
(note: extras/get_disto is affected too)
Change-Id: If6c4f95054ca15759c4e289fb3b4c352b3521c2c
2017-11-26 08:44:08 +00:00
Pascal Massimino
0df22b9eed
WEBP_REDUCE_SIZE: disable all rescaler code
...
BUG=webp:355
Change-Id: Id87cb11902e3fb8544a214308526ea9665ce8440
2017-11-24 22:08:32 +00:00
Pascal Massimino
a80c46bd87
SSE2 implementation of HasAlphaXXX
...
Change-Id: I2548d9a0c252e20ee3cf5f4be736a3703671ecb4
HasAlpha32b: ~3-4x faster
HasAlpha8b: ~7-8x faster
2017-11-23 15:02:21 +01:00
James Zern
b299c47eac
add WEBP_REDUCE_SIZE
...
remove auto-filter (-af) support and make WebPPictureCopy,
WebPPictureIsView, WebPPictureView, WebPPictureCrop, and
WebPPictureRescale noops.
Change-Id: If39d512cc268a0015298a1138dbc94feb86575e5
2017-11-22 17:35:39 -08:00
James Zern
eab5bab74f
add WEBP_DISABLE_STATS
...
use to make WebPPictureDistortion & WebPPlaneDistortion noops and
clear some ssim code.
Change-Id: I9b50b2318b7a114632e5a237a4002f64e95afbbc
2017-11-22 12:41:17 -08:00
Pascal Massimino
c245343dcb
move LOAD8x4 and STORE8x2 closer to their use location
...
Change-Id: I674821732d3e607123070e4bbba87d9359c9a4ec
2017-11-21 23:44:39 -08:00
James Zern
b9e734fd5c
dec,cosmetics: normalize function naming style
...
Change-Id: I33a2d1b4133db7a6d56d506f5c19670f0268cecd
2017-11-21 14:31:34 -08:00
James Zern
c188d546b3
dec: harmonize function suffixes
...
BUG=webp:355
Change-Id: Iabdfd3fbde906c2e35a7d7c080a8512425eb8ccb
2017-11-21 13:00:25 -08:00
James Zern
28c5ac8104
dec_sse41: harmonize function suffixes
...
BUG=webp:355
Change-Id: Id55f7b2e6288d1d0885d8451fbc59771222073d6
2017-11-21 12:47:06 -08:00
Pascal Massimino
e65b72a368
Merge "introduce WebPHasAlpha8b and WebPHasAlpha32b"
2017-11-21 06:21:44 +00:00
James Zern
b94cee98fb
dec_sse2: remove HE8uv_SSE2
...
with gcc-4.8, clang-4.0.1/5 this is no faster (actually up to 2x slower)
than the code generated for memset (0x01010... * dst[-1]). shuffles in
sse4 recover a bit, but performance is still down.
Change-Id: Ie85e8353f8ede559d0b05a1d388787fd18ecc80f
2017-11-20 20:34:05 -08:00
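The "0x01010... * dst[-1]" expression in the entry above refers to broadcasting one byte across a wider word by multiplication. A minimal sketch of that pattern, written out by hand for an 8-pixel-wide horizontal fill, follows; the function name and its stride/rows parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

// Fills each 8-byte row with the pixel immediately to its left (dst[-1]).
// Multiplying a byte by 0x0101010101010101 copies it into all eight bytes
// of a 64-bit word, which is then stored in one go.
static void HorizontalFill8(uint8_t* dst, int stride, int rows) {
  int j;
  assert(dst != NULL && rows >= 0);
  for (j = 0; j < rows; ++j) {
    const uint64_t v = 0x0101010101010101ULL * dst[-1];
    memcpy(dst, &v, sizeof(v));
    dst += stride;
  }
}
```

Since all eight bytes of the word are identical, the result does not depend on byte order.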
Pascal Massimino
44a0ee3fa7
introduce WebPHasAlpha8b and WebPHasAlpha32b
...
Rewrote WebPPictureHasTransparency() to use them (even for argb).
This is 10% faster, for some reason.
SSE2 version should be straightforward.
Removes a TODO.
Change-Id: I7ad5848fc5e355e2df505dbcd5a0f42fb6cbab41
2017-11-20 15:20:29 +01:00
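A plain-C sketch of what such alpha checks could look like, under the assumption that "has alpha" means at least one alpha sample differs from 0xff. The function names carry a Sketch suffix to mark them as hypothetical; the 32-bit variant assumes src points at the first alpha byte and that alpha samples sit 4 bytes apart.

```c
#include <assert.h>
#include <stdint.h>

// Returns 1 if any of 'length' consecutive alpha bytes is not opaque.
static int HasAlpha8bSketch(const uint8_t* src, int length) {
  assert(length >= 0);
  while (length-- > 0) {
    if (*src++ != 0xff) return 1;
  }
  return 0;
}

// Same test for 'length' pixels of 4 bytes each; only the alpha byte of
// every pixel (the one 'src' is aligned to) is inspected.
static int HasAlpha32bSketch(const uint8_t* src, int length) {
  int x;
  assert(length >= 0);
  for (x = 0; length-- > 0; x += 4) {
    if (src[x] != 0xff) return 1;
  }
  return 0;
}
```

Both loops stop at the first non-opaque sample, so fully opaque input is the slowest case.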
Vincent Rabaud
c462cd0065
Remove useless code.
...
The casts are to the same type and the #define not used.
Change-Id: I8d69c3b9dde7a1c53c2ba5a026a653d8c2e1d2a7
2017-11-08 10:52:49 +01:00
James Zern
b7971d0e22
dsp: avoid defining _C functions w/NEON builds
...
when targeting NEON, C functions with NEON equivalents won't be used, but
will contribute to binary size. the same goes for sse2, etc., but this
change is primarily concerned with binary sizes for android arm targets.
note '-noasm' or otherwise modifying VP8GetCPUInfo will have no effect
on the use of NEON functions.
this decision can be overridden by defining WEBP_DSP_OMIT_C_CODE to 0.
Change-Id: I47bd453c84a3d341ca39bc986a39eb9c785aface
2017-10-27 10:54:56 -07:00
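The pattern described in the entry above can be sketched with the preprocessor: a C fallback is compiled only when no NEON twin will replace it, and an override macro set to 0 keeps the C code. All SKETCH_* macros and function names below are hypothetical stand-ins, not the library's own definitions, and the "NEON" routine is plain C so the sketch builds on any target.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical switches; build with -DSKETCH_HAVE_NEON=1 to mimic a NEON
// target, and with -DSKETCH_OMIT_C_CODE=0 to keep the C fallbacks anyway.
#ifndef SKETCH_HAVE_NEON
#define SKETCH_HAVE_NEON 0
#endif
#ifndef SKETCH_OMIT_C_CODE
#define SKETCH_OMIT_C_CODE 1
#endif
#define SKETCH_NEON_OMIT_C_CODE (SKETCH_HAVE_NEON && SKETCH_OMIT_C_CODE)

typedef int (*SketchAddFunc)(int a, int b);
static SketchAddFunc SketchAdd = NULL;

#if !SKETCH_NEON_OMIT_C_CODE
// Only compiled when it can actually end up in the function table.
static int Add_C(int a, int b) { return a + b; }
#endif

#if SKETCH_HAVE_NEON
// Stand-in for a real NEON routine.
static int Add_NEON(int a, int b) { return a + b; }
#endif

static void SketchDspInit(void) {
#if !SKETCH_NEON_OMIT_C_CODE
  SketchAdd = Add_C;
#endif
#if SKETCH_HAVE_NEON
  SketchAdd = Add_NEON;  // unconditional: no runtime CPU check in this sketch
#endif
  assert(SketchAdd != NULL);  // an empty table entry would be a build bug
}
```

Because the NEON assignment is unconditional in this sketch, a runtime switch that disables SIMD would have nothing to fall back to once the C version is compiled out.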
James Zern
8d033b14d7
{dec,enc}_neon: harmonize function suffixes x2
...
+ neon.h
BUG=webp:355
Change-Id: Ia17c7dfc7d61742a4758823675a2d556a739c389
2017-10-20 19:00:53 -07:00
James Zern
0295e9815d
upsampling_neon: harmonize function suffixes
...
BUG=webp:355
Change-Id: I75423abbe0bcea3c98a42e412cc2116be81b5d08
2017-10-20 19:00:53 -07:00
James Zern
d572c4e52b
yuv_neon: harmonize function suffixes
...
BUG=webp:355
Change-Id: Ia2f716b459950c18717b062175197d1e6419bf2a
2017-10-20 19:00:53 -07:00
James Zern
ab9c2500db
rescaler_neon: harmonize function suffixes
...
BUG=webp:355
Change-Id: I161caa14f7ebbc3ae978b1722472625a77d0a4a4
2017-10-20 19:00:53 -07:00
James Zern
93e0ce27f4
lossless_neon: harmonize function suffixes
...
BUG=webp:355
Change-Id: I4210081a39800b5c2589c443da237269908af666
2017-10-20 19:00:53 -07:00
James Zern
22fbc50edd
lossless_enc_neon: harmonize function suffixes
...
BUG=webp:355
Change-Id: I462facaeade4f0f4fc1e96895493306d095a6a9a
2017-10-20 19:00:53 -07:00
James Zern
447875b47b
filters_neon,cosmetics: fix indent
...
BUG=webp:355
Change-Id: I9df1119f1ea94868f75253a92c2e878c9290f744
2017-10-20 19:00:29 -07:00
James Zern
785da7eadd
enc_neon: harmonize function suffixes
...
BUG=webp:355
Change-Id: Ie59efd271d16f12d21f3c800667dfc0980dc2e68
2017-10-20 00:18:32 -07:00
James Zern
bc1a251fcf
dec_neon: harmonize function suffixes
...
BUG=webp:355
Change-Id: I61c9a0c9e24515322955e04afd8c4ea6a44b9319
2017-10-20 00:14:18 -07:00
James Zern
61e535f1ac
dsp/lossless: workaround gcc-4.8 bug on arm
...
and all older versions.
force Sub3() to not be inlined, otherwise the code in Select() will be
incorrect.
extends the check added previously in:
637b3888
dsp/lossless: workaround gcc-4.9 bug on arm
BUG=webp:363
Change-Id: I1403b558f8660b764f3a570a3326822d5ef0be29
2017-10-19 13:05:48 -07:00
Pascal Massimino
0a17f4712c
Merge "WIP: list includes as descendants of the project dir"
2017-10-11 08:21:42 +00:00
James Zern
a439972175
WIP: list includes as descendants of the project dir
...
#include "(.|..)/..." -> #include "src/..."
Change-Id: I772880aa097a770722043c8a4393552ba38a89b6
2017-10-10 23:04:05 -07:00
James Zern
d361a6a733
yuv_sse2: harmonize function suffixes
...
BUG=webp:355
Change-Id: I02a66f7446c75a10c3ce4766235e5767617d0dce
2017-10-08 14:06:34 -07:00
James Zern
6921aa6f0c
upsampling_sse2: harmonize function suffixes
...
BUG=webp:355
Change-Id: I3a02cc717eb7506bd87511d6a17ab1691e84f72c
2017-10-08 14:06:30 -07:00
James Zern
08c67d3ed1
ssim_sse2: harmonize function suffixes
...
BUG=webp:355
Change-Id: I1282559888118b8cb0a46b7f0aa627d26b8838f5
2017-10-08 14:06:24 -07:00
James Zern
582a1b572a
rescaler_sse2: harmonize function suffixes
...
BUG=webp:355
Change-Id: I978fd826ff90149c0ffd9d7607dcc6f88082d3e6
2017-10-08 14:06:19 -07:00
James Zern
2c1b18ba2f
lossless_sse2: harmonize function suffixes
...
BUG=webp:355
Change-Id: I59d828800c2ab2a36e0ea90f629b74bd57207411
2017-10-08 14:06:14 -07:00
James Zern
0ac46e818b
lossless_enc_sse2: harmonize function suffixes
...
BUG=webp:355
Change-Id: I06c64416103c3f3fc0519dd46d64b0a35f9798e4
2017-10-08 14:06:05 -07:00
James Zern
bc634d57c2
enc_sse2: harmonize function suffixes
...
BUG=webp:355
Change-Id: Idd2f289fcf99f12bf36494111b07a8906c99c826
2017-10-08 14:05:59 -07:00
James Zern
bcb7347c2b
dec_sse2: harmonize function suffixes
...
BUG=webp:355
Change-Id: Ic0390a4a24a5d8caff5b8af9fc9d59769ec533b1
2017-10-07 15:14:03 -07:00
James Zern
fb3daad604
cpu: fix ssse3 check
...
ssse3 is bit #9 in ecx, bit #0 is sse3. this only controls the check for
slow ssse3 and likely had no ill effect.
Change-Id: I84ce73dc480e1cdbd085e37be06f3f402116c201
2017-09-29 16:27:47 -07:00
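For reference, a small sketch of the feature-bit test involved. Intel's CPUID documentation places SSE3 at ECX bit 0 (mask 0x1) and SSSE3 at ECX bit 9 for leaf 1. The helper names are hypothetical, and the ecx value is assumed to have been read from CPUID by the caller.

```c
#include <assert.h>
#include <stdint.h>

// Bit positions in ECX after CPUID leaf 1.
enum { kEcxBitSSE3 = 0, kEcxBitSSSE3 = 9 };

// Returns 1 if the given bit is set in 'ecx', 0 otherwise.
static int EcxHasBit(uint32_t ecx, int bit) {
  assert(bit >= 0 && bit < 32);
  return (int)((ecx >> bit) & 1u);
}

static int EcxHasSSE3(uint32_t ecx) { return EcxHasBit(ecx, kEcxBitSSE3); }
static int EcxHasSSSE3(uint32_t ecx) { return EcxHasBit(ecx, kEcxBitSSSE3); }
```

Testing mask 0x1 where SSSE3 was intended would report SSE3 support instead, which matches the mix-up the entry describes.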
Vincent Rabaud
a5216efc8c
Fix integer overflow warning.
...
Though the overflow could happen, it does not change the
end results.
Change-Id: I1b84e022a0776d35eab5c5c4fb7d3563f5667bfa
2017-09-25 11:02:22 +02:00
James Zern
f78da3dea6
add LOCAL_CLANG_PREREQ and avoid WORK_AROUND_GCC w/3.8+
...
this results in a 15-20% speedup for lossy decoding on a N5/S6/CM1
BUG=webp:339
Change-Id: Icdeb84c3e0b8908147ac276b4d8f76c3d565b735
2017-09-19 20:59:49 -07:00
James Zern
01c426f1e7
define WEBP_USE_INTRINSICS w/gcc-4.9+
...
32-bit builds are neutral to slightly faster using ndk r15c on a
N5/S6/CM1
BUG=webp:339
Change-Id: I94b9442e0ceaf2f5edb2b4026bc8b99cd77c918b
2017-09-19 20:59:43 -07:00
Pascal Massimino
3822762a6c
rationalize the Makefile.am
...
one library addition per line, etc...
BUG=webp:355
Change-Id: I95761dea598a382db5632c5187210937e129ff75
2017-08-29 00:00:14 -07:00
Pascal Massimino
42c79aa66b
Merge "Encoder: harmonize function suffixes"
2017-08-09 18:13:57 +00:00