libwebp

mirror of https://github.com/webmproject/libwebp.git synced 2025-08-16 17:08:08 +02:00

Author	SHA1	Message	Date
James Zern	8599571935	disable CombinedShannonEntropy_SSE2 on x86 this function produces different results from the C code due to use of double/float resulting in output differences when compared to -noasm. Bug: webp:499 Change-Id: Ia039b168c0a66da723fb434656657ba1948db8ae	2021-01-18 16:41:44 -08:00
James Zern	ae54553461	dsp.h: allow config.h to override MSVC SIMD autodetection this fixes builds with cmake targeting visual studio that set -DWEBP_ENABLE_SIMD=0 BUG=webp:478 Change-Id: I21b61b112c79ff9cbab9e4502a25d3f1fa096c8b	2020-12-03 10:22:04 -08:00
Vincent Rabaud	fc14fc038b	Have C encoding predictors use decoding predictors. libwebp.a in Release mode with no symbols size in bytes: 986430 -> 975114 (-1.1%) Change-Id: Ia96192a6be2911779e359b72132bdba60b60a13d	2020-12-02 11:54:59 +01:00
Ingvar Stepanyan	52273943c6	Couple of fixes to allow SIMD on Emscripten - Add `-msimd128` to flags to actually enable WebAssembly SIMD when performing SIMD detection. It's currently required in addition to `-msse` / `-mfpu=neon` flags which only perform translation of corresponding intrinsics to Wasm SIMD ones. See a discussion at emscripten-core/emscripten#12714 for automating this and making easier in the future. - Remove compilation branch that prevented definitions of `WEBP_USE_SSE` and `WEBP_USE_NEON` on Emscripten even when SIMD support was detected at compile-time. - Add an implementation of `VP8GetCPUInfo` for Emscripten which uses static `WEBP_USE_` flags to determine if a corresponding SIMD instruction is supported. This is because Wasm doesn't have proper feature detection (yet) and requires making separate build for SIMD version anyway. Change-Id: I77592081b91fd0e4cbc9242f5600ce905184f506	2020-11-18 21:51:41 +00:00
Skal	55a080e50a	Add WebPReplaceTransparentPixels() in dsp with SSE2 implementation. (Extracted from side experiment) Change-Id: I62d457fb6643645291cffd6d2d205d4a5ffa4517	2020-09-09 08:15:22 +02:00
Yannis Guyon	47309ef52d	webp: WEBP_OFFSET_PTR() Removes undefined behavior of offsetting NULL. Change-Id: I7c83d0c913c631c091a5fb128f6d6b46b1d116db	2020-03-20 11:39:06 +01:00
James Zern	687ab00e6e	DC{4,8,16}_NEON: replace vmovl w/vaddl 4/8/16 fewer instructions Change-Id: I38fe08722e7b839e3f3e0bf4df7e0fa8e7a0138f	2020-03-05 09:41:14 -08:00
James Zern	1b92fe75a1	DC16_NEON,aarch64: use vaddlv saves 3 instructions, neutral to mildly faster on a pixel 3a Change-Id: I6ae57e8e38d4149167ea14e27cd2b32113b4f8e7	2020-03-04 23:12:20 -08:00
James Zern	53f3d8cf7e	dec_neon,DC8_NEON: use vaddlv instead of movl+vaddv one fewer instruction Change-Id: I2f599fd6f9eebbb0cab81ae9855244fc401d4323	2020-03-04 15:46:38 -08:00
James Zern	c6b75a1966	lossless_(enc_|)sse2: avoid offsetting a NULL pointer PredictorSub0_SSE2 doesn't use 'upper' (neither does VP8LPredictorsSub_C[0]); just pass NULL when dealing with trailing pixels to avoid undefined behavior when offsetting a NULL pointer BUG=chromium:1026858,oss-fuzz:19430 Change-Id: I08be8899ed2e34f26aaee34defe68dbd0fe216d3	2019-12-13 18:33:10 +00:00
James Zern	e2575e05cb	DC8_NEON,aarch64: use vaddv results in one fewer instruction for both DC8uv_NEON and DC8uvNoLeft_NEON Change-Id: Ia4e6f4dbc070079cdc2496a698bd4b34198ea164	2019-12-06 09:38:48 -08:00
Cheng Yi	b0e09e346f	dec_neon: Fix build failure under some toolchains some toolchains may implement vcreate_u64 as an assignment to a vector causing a type mismatch: invalid conversion between vector type 'uint64x1_t' (vector of 1 'uint64_t' value) and integer type 'unsigned int' of different size const uint64x1_t LKJI____ = vcreate_u64(L | (K << 8) | (J << 16) | (I << 24)); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Change-Id: I5c7b0076ad66d4b3fcdcb7ee9f59bbaa6f19b783	2019-12-06 00:06:44 -08:00
Oliver Wolff	cf0e903c89	dsp/lossless: Fix non gcc ARM builds The workaround for GCC ARM must not be applied when another toolchain (like MSVC) is used for the build. Change-Id: I11ec4558902063ccb085d3f435e24b3a60739dd5	2019-11-27 15:05:08 +01:00
Vincent Rabaud	bb7bc40b6d	Remove ubsan errors. 'upper' could be NULL and it would be increased. But that is for predictor zero that does not use 'upper'. Change-Id: Icd4ae6792cc55ea021b4f828c3dbdb5f03e120d8	2019-11-06 14:08:14 +01:00
James Zern	fab8f9cfcf	cosmetics: normalize '*' association we associate '*' with types rather than variables Change-Id: Id93ed65272a8a88e604278693e3850649639e9b6	2019-07-26 01:04:09 -07:00
Pascal Massimino	9d6988f44d	Fix the oscillating prediction problem at low quality For some exact resonance the over-quantization was exactly compensating the under-quantization, leading to resonance and strange patterns. -> we special-handle the very flat blocks, hopefully for the greater good (and not just the bad-resonance case). For 'fast mode' (-m 3 or less), we just pay special attention to the border of the image, where the oscillation / instability usually starts. For the inner part of the image, since we're not doing rd-opt, it's harder to fix anything. Overall, on 'regular' images, the change is within the noise, often leading to overall faster encoding (because of the short-cut). BUG=webp:432 Change-Id: Ifaa8286499add80fd77daecf8e347abbff7c3a15	2019-07-03 08:40:41 -07:00
James Zern	92dbf23775	filters_sse2,cosmetics: shorten some long lines Change-Id: Ifd8ddec50821aba175d41237df18e41b9ac6c7d4	2019-07-01 12:17:43 -07:00
James Zern	a277d197a2	filters_sse2.c: quiet integer sanitizer warnings missed in `a788b49` with clang7+ quiets conversion warnings like: implicit conversion from type 'int' of value -114 (32-bit, signed) to type 'uint8_t' (aka 'unsigned char') changed the value to 142 (8-bit, unsigned) Change-Id: I52dcd9cd613107f5424177c277785b92430bffb7	2019-07-01 11:16:50 -07:00
James Zern	a788b49897	filters_sse2.c: quiet integer sanitizer warnings with clang7+ quiets conversion warnings like: implicit conversion from type 'int' of value -114 (32-bit, signed) to type 'uint8_t' (aka 'unsigned char') changed the value to 142 (8-bit, unsigned) Change-Id: I7f08a836ddcf777454dfd5b877a81b62b2abac86	2019-06-28 23:22:49 -07:00
James Zern	e6a92c5e15	filters.c: quiet integer sanitizer warnings with clang7+ quiets conversion warnings like: implicit conversion from type 'int' of value -12 (32-bit, signed) to type 'uint8_t' (aka 'unsigned char') changed the value to 244 (8-bit, unsigned) Change-Id: I053c92301e55dcb0cae89a7733636283da942176	2019-06-28 23:16:28 -07:00
James Zern	ec1cc40a59	lossless.c: remove U32 -> S8 conversion warnings Change-Id: Ica2664ea087254959391275654412141ed9472df	2019-06-28 01:34:55 -07:00
Pascal Massimino	1106478f42	remove conversion U32 -> S8 warnings using an inline U32ToS8() function Change-Id: I45f535c6c9b5de33d69acc17b466e183fcc19a63	2019-06-24 16:42:42 -07:00
Skal	812a6b49fc	lossless_enc: fix some conversion warning object code is unchanged. Change-Id: I40fc16056c0ab44c5c57ef6b02af14be767abe87	2019-06-24 16:16:18 +02:00
James Zern	4627c1c91b	lossless_enc,TransformColorBlue: quiet uint32_t conv warning no change in object code from clang-7 integer sanitizer: implicit conversion from type 'uint32_t' (aka 'unsigned int') of value 1955895199 (32-bit, unsigned) to type 'uint8_t' (aka 'unsigned char') changed the value to 159 (8-bit, unsigned) Change-Id: I0c3022339e34b9c9af03167ab827ade677973644	2019-06-20 23:06:13 -07:00
James Zern	c84673a62f	lossless_enc_sse{2,41}: quiet signed conv warnings _mm_set1_epi16 takes a short argument from clang-7 integer sanitizer: implicit conversion from type 'int' of value 65280 (32-bit, signed) to type 'short' changed the value to -256 (16-bit, signed) Change-Id: Iad64f6209a8c130a7df67515451ded45b3f91702	2019-06-15 00:22:03 -07:00
James Zern	776a775709	dec_sse2: quiet signed conv warnings _mm_set1_epi8() takes a char argument _mm_insert_epi16 takes a short argument from clang-7 integer sanitizer: implicit conversion from type 'int' of value 189 (32-bit, signed) to type 'char' changed the value to -67 (8-bit, signed) implicit conversion from type 'int' of value 128 (32-bit, signed) to type 'char' changed the value to -128 (8-bit, signed) implicit conversion from type 'int' of value 33909 (32-bit, signed) to type 'short' changed the value to -31627 (16-bit, signed) Change-Id: Id6b191b2c06881e27d447eeb1ff5bb2c1857b6ba	2019-06-14 01:00:20 -07:00
James Zern	e78dea7587	{alpha_processing,enc}_sse2: quiet signed conv warnings _mm_set1_epi8() takes a char argument _mm_insert_epi16 takes a short argument from clang-7 integer sanitizer: implicit conversion from type 'int' of value 255 (32-bit, signed) to type 'char' changed the value to -1 (8-bit, signed) implicit conversion from type 'int' of value 33153 (32-bit, signed) to type 'short' changed the value to -32383 (16-bit, signed) Change-Id: Ic88c8ef3d00146d34f53a560582db673f818370d	2019-06-10 14:23:58 -07:00
Pascal Massimino	ab2dc8939f	Rescaler: fix rounding error We saturate the result to [0..255] It's the easiest and safest, given the wide variety of scaling range we cover: we're not using floats, so precision is always an issue at one end or the other of the scaling spectrum. we also use: round(a - floor(b)) instead of: floor(a - round(b)) to handle difficult cases (ratio ~= .99, e.g.) MIPS code is still disabled (and wrong) Change-Id: I18d3f5ddc4c524879c257b928329b1c648fa7fb5	2019-03-30 06:43:55 +00:00
James Zern	8c3f04febb	AndroidCPUInfo: reorder terms in conditional 'var != constant' is the preferred style for the library Change-Id: I226e6d5d80dddd0469808136605f49205d238341	2019-03-15 18:12:04 -07:00
Johann	5173d4ee6f	neon IsFlat Move IsFlat to its own header. This allows it to continue to be inlined. Using the RTCD and creating a distinct function slows down arm builds. flower mug C 3.59 2.12 NEON 3.47 2.01 BUG=b/118740850 Change-Id: Id77e8f76d9e9790c498806e7070bbe37c10bc2e9	2018-12-03 22:59:12 +00:00
Johann	9f4d4a3f49	neon: GetResidualCost Direct copy of sse2. Slight improvement because neon has abs(). flower.ppm had minimal improvement. Somewhat expected because GetResidualCost_C is only ~3.6% mug.ppm had a better improvement because GetResidualCost_C is almost 9%. C 2.150 NEON 2.130 BUG=b/118740850 Change-Id: Ibc0dd97a81596635f5599cf568205974b4fd2597	2018-11-14 11:46:58 -08:00
Johann	0fd7514b55	neon: SetResidualCoeffs Much faster with aarch64. Still somewhat faster without vmaxv. C: 3.700s ArmV7: 3.675 aarch64: 3.600 BUG=b/118740850 Change-Id: I3be852da89633eca4bddce443c87f5e4a2f55868	2018-11-14 11:46:40 -08:00
Vincent Rabaud	decf6f6b87	Speedups for empty histograms. When histograms are empty, it is easy to add them. They should also not be considered when merging histograms (it is a waste of CPU). This does not change the compression performance, just the speed. Change-Id: I42c721ca0f9c5ea067e73b792aa3db6d5e71d01f	2018-10-20 13:23:50 +02:00
Vincent Rabaud	dea3e89983	Split HistogramAdd to only have the high level logic in C. Change-Id: Ic9eaebf7128ca0215b49d2a13bde1f5b94a28061	2018-10-19 14:03:28 +02:00
Vincent Rabaud	cbf82cc04d	Remove AVX2 files. There is only enc_avx2.c and we never managed to get something fast enough. Change-Id: I7465b5d8ccf47d9aa612173b8f80f96060cdb366	2018-10-16 14:12:03 +02:00
Vincent Rabaud	ac5433118a	Remove a few more useless #defines Change-Id: I211e9bcb1c37d0ebc108896f109b23ce915e22b4	2018-10-15 16:26:10 +02:00
Vincent Rabaud	3e13da7b4f	Clean-up the common sources in dsp. Change-Id: I1b995e6517e8437127a433dccbb5b2db63e7c3a3	2018-10-08 15:00:01 +02:00
James Zern	de08d72741	cosmetics: normalize include guard comment Change-Id: I0e08ec604aad8412cfe3d3670d773f4ae5650375	2018-08-22 14:46:53 -07:00
Pascal Massimino	2563db4759	fix rescaling rounding inaccuracy We should be using 'floor' when doing the final divide. -> new MACRO is MULT_FIX_FLOOR() XXX* Mips code is DISABLED for now *XXX I'll update and re-enable it in a later patch, since this code needs some refactoring first. BUG=oss-fuzz:9179 Change-Id: Ic0693cdca4e71f5beab1029475e35c4d06b12d13	2018-07-10 22:45:50 -07:00
James Zern	0d5fad46cf	add WEBP_DSP_INIT / WEBP_DSP_INIT_FUNC this internalizes the init checks and provides stronger synchronization with pthreads when available while still allowing VP8GetCPUInfo to be modified (mostly for testing purposes). windows is left as is since a critical section or mutex would cause a leak. Change-Id: Ieb997e014f2805c0ae39c16f13337663521356f4 (cherry picked from commit `d77bf512bd`)	2018-04-17 18:01:34 -07:00
Pascal Massimino	c1cb86af5f	fix 16b overflow in SSE2 the 'accum' variable can be larger than 15b for large rescale values. Assert triggered: src/dsp/rescaler_sse2.c:249: RescalerExportRowExpand_SSE2: Assertion `v >= 0 && v <= 255' failed. src/dsp/rescaler_sse2.c:350: RescalerExportRowShrink_SSE2: Assertion `v >= 0 && v <= 255' failed. -> fall back to C implementation in this case for now Change-Id: I7ea1cb72301cafc1459be403f6a6f4e3cbc89bb1	2018-04-11 21:25:06 +00:00
James Zern	120f58c3aa	Merge "lossless*sse2: improve non-const 16-bit vector creation"	2018-02-20 19:56:07 +00:00
James Zern	8043504f95	lossless*sse2: improve non-const 16-bit vector creation use _mm_set1_epi32 instead of _mm_set_epi16 with non-const values; reduces shifts and ors. Change-Id: Ie2cb2ab815f642855d03c6f3001223bcac4bd35c	2018-02-17 17:59:20 -08:00
Pascal Massimino	3b07d32712	Import,RGBA: fix for BigEndian import + simplification of the logic Change-Id: Ia20ce844793ed35ea03a17cef45838f3d0ae4afa	2018-02-17 13:07:58 -08:00
James Zern	f4dd92565e	remove WEBP_EXPERIMENTAL_FEATURES the webp bitstream is considered stable at this point Change-Id: I4b13f9ed4c45f63785474b097e96cb7bf651be7b	2018-02-09 10:25:11 -08:00
skal	6de58603b7	MIPS64: Fix defined-but-not-used errors with WEBP_REDUCE_CSP BUG=webp:372 Change-Id: Ided3fae748face18138a8050eaced5e0f58120d4	2018-01-30 17:40:09 -08:00
Vincent Rabaud	cf1c5054c7	Add an SSE4 version of some lossless color transforms. Change-Id: Ieac094f684116d1292793b2ca321f6f1a69565b5	2018-01-24 14:33:25 +01:00
James Zern	05f6fe24c3	upsampling: rm asserts w/REDUCE_CSP+OMIT_C_CODE with WEBP_NEON_OMIT_C_CODE the default _C functions won't be set and with WEBP_REDUCE_CSP the NEON functions won't be either triggering an assert for an empty table member. BUG=chromium:792627 Change-Id: I8d2d430eaa37bb92885b61a3dd39f961924a8def	2017-12-06 17:09:26 -08:00
Vincent Rabaud	55403a9a5a	Upsampling SSE2/SSE4 speedup. RGB to YUV conversion was not using SSE to finish up the row. End data is now copied to a buffer big enough to fit in a SSE register. (UPSAMPLE_LAST_BLOCK was already using that trick). Change-Id: Ie539bcbe570a643a774aa88263503c0d2c41890f	2017-12-05 23:37:06 +01:00
Vincent Rabaud	807b53c47e	Implement the upsampling/yuv functions in SSE41 Change-Id: If122da22b74a974262063d232f6ca0ab902ff64e	2017-12-04 22:29:43 +01:00
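
Several of the 2019 entries above quiet clang integer sanitizer reports of the form "implicit conversion from type 'int' of value 189 ... changed the value to -67". The sanitizer flags only implicit narrowing, so spelling the truncation out with a mask and a cast is one common way to express the same result intentionally. The following is a minimal sketch of that idea, not the library's actual code: the helper names `SketchU32ToS8`, `SketchU32ToU8` and `SketchIntToS16` are hypothetical, and the signed results assume the usual two's-complement behavior of mainstream compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper names, for illustration only. */

/* Keep the low 8 bits of v and reinterpret them as a signed byte.
   The explicit mask and cast mark the narrowing as intentional. */
static inline int8_t SketchU32ToS8(uint32_t v) {
  return (int8_t)(v & 0xff);
}

/* Keep the low 8 bits of v as an unsigned byte. */
static inline uint8_t SketchU32ToU8(uint32_t v) {
  return (uint8_t)(v & 0xff);
}

/* Narrow a 16-bit pattern held in an int (e.g. 0xff00) to int16_t, as one
   might before passing it where a 'short' argument is expected. */
static inline int16_t SketchIntToS16(int v) {
  assert(v >= 0 && v <= 0xffff);  /* only 16-bit patterns are meaningful here */
  return (int16_t)(uint16_t)v;
}
```

With the values quoted in the messages above, `SketchU32ToS8(189u)` gives -67, `SketchU32ToU8(1955895199u)` gives 159, and `SketchIntToS16(65280)` gives -256, matching the converted values the sanitizer reported.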


843 Commits