libwebp

mirror of https://github.com/webmproject/libwebp.git synced 2025-08-10 18:10:57 +02:00

Author	SHA1	Message	Date
James Zern	306335198d	muxread: fix reading of buffers > riff size After: `2c70ad76` muxread,CreateInternal: fix riff size checks (cl/200674839) `SizeWithPadding()` adds `CHUNK_HEADER_SIZE` (plus additional 1 byte padding if needed). A later check included `CHUNK_HEADER_SIZE` before capping the value of the size passed to `WebPMuxCreateInternal()`, missing cases with a small amount of extra data after the RIFF chunk (like a newline when the file is opened and saved in a text editor) and setting size to an incorrect value, so larger sizes would also fail. Another check of `riff_size < CHUNK_HEADER_SIZE` after the call to `SizeWithPadding()` is removed because 1) it could not fail given `SizeWithPadding()` adds `CHUNK_HEADER_SIZE` to the value; and 2) it is redundant as `size < RIFF_HEADER_SIZE + CHUNK_HEADER_SIZE` is checked earlier in the function. Bug: webp:42340561 Change-Id: I58dc4f071b27c2841001b4012aabdb1869f64f97	2024-11-22 12:40:34 -08:00
James Zern	4c85d860ea	yuv.h: update RGB<->YUV coefficients in comment The values for the R/G/B floating point formulas resembled https://fourcc.org/fccyvrgb.php and Video Demystified, but the fixed point values are more closely aligned to rounded values from https://en.wikipedia.org/wiki/YCbCr and BT.601. The R/G/B formulas with the values prior to this change are added to sharpyuv_csp.c as they align with the fixed values. The origin of those coefficients is unclear. For consistency between library versions we'll leave them as is. Bug: webp:375011696 Change-Id: Id3f2a57530eee700cc52a899b32b25b5c015e89b	2024-11-21 16:21:45 -08:00
James Zern	61e2cfdadd	rework AddVectorEq_SSE2 Take advantage of the known sizes used by VP8LHistogramAdd() and remove loop for the remainder. The loop was being auto-vectorized making the code larger and slower than the vectorized C code. For larger sizes the new code is ~3-4.5% faster than the old code with about the same improvement against the vectorized C code. For the minimal size (40), the new code is ~30% faster than the C and old SSE2 code. The LINE_SIZE==8 option is removed with this change. It had been set to 16 for its entire life and clang-16 was unrolling the LINE_SIZE==8 case by 2 in any case; they both profile similarly. Change-Id: I6dfedfd57474f44d15e2ce510a48e5252221077a	2024-11-14 12:21:39 -08:00
James Zern	7bda3deb89	rework AddVector_SSE2 Take advantage of the known sizes used by VP8LHistogramAdd() and remove loop for the remainder. The loop was being auto-vectorized making the code larger and slower than the vectorized C code. For larger sizes the new code is ~4-7% faster than the old code with about the same improvement against the vectorized C code. For the minimal size (40), the new code is ~30% faster than the C and old SSE2 code. The LINE_SIZE==8 option is removed with this change. It had been set to 16 for its entire life and clang-16 was unrolling the LINE_SIZE==8 case by 2 in any case; they both profile similarly. Change-Id: I2376e2dca3bffa38477b4a432f4c533419e3be0e	2024-11-14 12:21:33 -08:00
James Zern	dfdcb7f95c	Merge "lossless.h: fix function declaration mismatches" into main	2024-10-09 22:30:49 +00:00
James Zern	78ed683978	fix overread in Intra4Preds_NEON Extend VP8EncIterator::i4_boundary_ by 3 bytes to avoid Intra4Preds_NEON reading deeper into the struct (likely padding) when top is positioned at offset 29. This data is memset with MSan to prevent a warning due to its incorrect modeling of tbl instructions. Prior to: `169dfbf9` disable Intra4Preds_NEON there was a mismatch in the preprocessor checks for enabling the function in NEON and removing the C version; NEON used `BPS == 32` while the C code was removed unconditionally when building for aarch64. This patch also normalizes those checks to look for `BPS == 32` and `BPS != 32` as appropriate. Bug: b:366668849,webp:372109644 Change-Id: Ic9e6ad4b2d844cb446decd63aec0b2676a89c8d0	2024-10-08 16:55:12 -07:00
James Zern	d516a68e54	lossless.h: fix function declaration mismatches These appear as warnings under VS15 (16 and 17 are silent) and were missed in: `a32b436b` dsp/lossless*: use WEBP_RESTRICT qualifier Change-Id: Ia7cffafc166f2da93b51714363558798cda71b67	2024-10-08 13:41:16 -07:00
James Zern	fdb229ea3a	Merge changes I07a7e36a,Ib29980f7,I2316122d,I2356e314,I32b53dd3, ... into main * changes: dsp/yuv: use WEBP_RESTRICT qualifier dsp/upsampling: use WEBP_RESTRICT qualifier dsp/rescaler: use WEBP_RESTRICT qualifier dsp/lossless: use WEBP_RESTRICT qualifier dsp/filters: use WEBP_RESTRICT qualifier dsp/enc: use WEBP_RESTRICT qualifier dsp/dec: use WEBP_RESTRICT qualifier dsp/cost: use WEBP_RESTRICT qualifier	2024-10-03 17:01:02 +00:00
James Zern	169dfbf931	disable Intra4Preds_NEON The load of the `top` parameter may over read causing MSan errors: ==7373==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0xfff891d52ad4 in Intra4Preds_NEON src/dsp/enc_neon.c:1003:12 #1 0xfff892d87618 in MakeIntra4Preds src/enc/quant_enc.c:484:3 Bug: b:366668849 Change-Id: I29cf3b2f402ee79ea93c1ee2a4fdd95083aeed68	2024-10-02 15:42:19 -07:00
James Zern	2dd5eb9862	dsp/yuv*: use WEBP_RESTRICT qualifier Better vectorization in the C code, fewer instructions / comparisons in NEON, and fewer reloads in SSE2/SSE4 w/ndk r27/gcc-13/clang-16. This only affects non-vector pointers; any vector pointers are left as a follow up. Change-Id: I07a7e36a2dce8632c71c0fbbeef94dc51453eaf7	2024-10-02 14:55:15 -07:00
James Zern	23bbafbeb8	dsp/upsampling*: use WEBP_RESTRICT qualifier Better vectorization in the C code, fewer instructions in NEON, and some code reordering / better register usage in SSE2/SSE4 w/ndk r27/gcc-13/clang-16. This only affects non-vector pointers; any vector pointers are left as a follow up. Change-Id: Ib29980f778ad3dbb952178ad8dee39b8673c4ff8	2024-10-02 14:55:15 -07:00
James Zern	35915b389e	dsp/rescaler*: use WEBP_RESTRICT qualifier Some improvement in the C code. No changes in NEON or SSE2 w/ndk r27/gcc-13/clang-16. This only affects non-vector pointers; any vector pointers are left as a follow up. Change-Id: I2316122db893f48f0afda90a147c83cac7f07526	2024-10-02 14:55:14 -07:00
James Zern	a32b436bd5	dsp/lossless*: use WEBP_RESTRICT qualifier lossless_enc: better vectorization, most benefits seen in AddVector/Eq w/ndk r27/gcc-13/clang-16 lossless: minor reordering and some improvement to PredictorAdd5_SSE2 w/gcc-13 This only affects non-vector pointers; any vector pointers are left as a follow up. Change-Id: I2356e314f391ee2f2c71f00bc6ee10097d3881e7	2024-10-02 14:55:14 -07:00
James Zern	04d4b4f387	dsp/filters*: use WEBP_RESTRICT qualifier Better stack/register usage in SSE2/NEON code and improved vectorization of the C code with ndk r27/gcc-13/clang-16. This only affects non-vector pointers; any vector pointers are left as a follow up. Change-Id: I32b53dd38bfc7e2231d875409e7dfda7c513cfb6	2024-10-02 14:55:14 -07:00
James Zern	b1cb37e659	dsp/enc*: use WEBP_RESTRICT qualifier This allows for better vectorization of the C code, inlining of TrueMotion_SSE2, better load usage in aarch64 and other minor reordering with ndk r27/gcc-13/clang-16. This only affects non-vector pointers; any vector pointers are left as a follow up. Change-Id: I07e9944d5c0aa5a079b22883ac5a2d649695e4a0	2024-10-02 14:55:14 -07:00
James Zern	201894ef24	dsp/dec*: use WEBP_RESTRICT qualifier A minor improvement for arm targets with ndk r27/gcc-13 in H/VFilter8 (a couple fewer moves w/aarch64) and much better vectorization of DitherCombine8x8_C in most targets. This only affects non-vector pointers; any vector pointers are left as a follow up. Change-Id: I03e73e6d6404261bb8408a9ae76a4b6ef142f8f0	2024-10-02 14:55:14 -07:00
James Zern	02eac8a741	dsp/cost: use WEBP_RESTRICT qualifier on SetResidualCoeffs_. This results in some minor code reordering when targeting armv7 with ndk r27 and other recent versions of clang. No changes in the x86 compilations with clang-16 / gcc-13. This only affects non-vector pointers; any vector pointers are left as a follow up. Change-Id: I7c3554ece848fafbc5ac9c4944f1dc85129f6fd8	2024-10-02 14:55:14 -07:00
Vincent Rabaud	220ee52967	Search for best predictor transform bits This is useful in cruncher mode. Change-Id: I8586bdbf464daf85db381ab77a18bf63dd48f323	2024-09-24 10:44:22 +02:00
Vincent Rabaud	7861947813	Try to reduce the sampling for the entropy image This offers minor compression improvements. Change-Id: I4b3b1bb11ee83273c0e4c9f47e53b21cf7cd5f76	2024-09-24 10:28:43 +02:00
Vincent Rabaud	a78c5356ba	Remove a useless malloc for entropy image histogram_symbols is converted to uint32_t and <<8 into histogram_argb. Using a uint32_t buffer from the start prevents copying and converting the data. Change-Id: I245003a6a0f048c31519afa25a600d4479e762e3	2024-09-18 22:38:11 +02:00
Vincent Rabaud	367ca938f1	Refactor predictor finding This is useful for a forward change that will improve compression. It splits the residual computation and the best predictor selection. The only downside is that more memory is allocated: we had 2 histograms before, we now have 14, but this is necessary for the later change. Still, this is nothing compared to what is done later in the pipeline in HistogramSetTotalSize where the number of histograms created is the number of pixels in the subsampled image. Change-Id: If03501a26f00462dd1809daa6e9314abd180945d	2024-09-17 09:49:43 +02:00
James Zern	f888291359	anim_encode.c: fix function ref in comment WebPCleanupTransparentAreaLossless() was renamed to WebPReplaceTransparentPixels() in: `55a080e5` Add WebPReplaceTransparentPixels() in dsp Change-Id: I91e32574e6add2748c0655146f100eb2b40498b2	2024-09-09 19:28:12 -07:00
Vincent Rabaud	2e81017c7a	Convert predictor_enc.c to fixed point Also remove the last float in histogram_enc.c Change-Id: I6f647a5fc6dd34a19292820817472b4462c94f49	2024-08-30 09:22:48 +02:00
Vincent Rabaud	8e0cc14c3e	Fix static overflow warning. In practice, this can never happen because: - 'streak' is at most as long as a histogram - 'count' counts the number of streaks 'streak' and 'count' are therefore at most as big as the histogram length which is at most the max of VP8LHistogramNumCodes, which is 256+24+(1<<10). Change-Id: I31c8834543479c8a9260732313ea26b045519515	2024-08-28 10:23:54 +02:00
James Zern	615e58744f	Merge "make VP8LPredictor[01]_C() static" into main	2024-08-22 17:35:52 +00:00
James Zern	233e86b91f	Merge changes Ie43dc5ef,I94cd8bab into main * changes: DoFilter_: remove row & num_rows parameters Do*Filter_C: remove dead 'inverse' code paths	2024-08-19 18:51:06 +00:00
James Zern	1a29fd2fc3	make VP8LPredictor[01]_C() static Only predictors 2-13 are reused in lossless_enc.c. Change-Id: Ia3a7342fccfb44b9ad5297f48d6be2d96af68ec8	2024-08-16 10:58:45 -07:00
James Zern	dd9d3770d7	DoFilter_: remove row & num_rows parameters The row parameter became a constant in: `2102ccd` update the Unfilter API in dsp to process one row independently num_rows is always equal to height. Change-Id: Ie43dc5ef222e442ce8c92766da0b9824ccbca236	2024-08-12 19:36:31 -07:00
James Zern	ab451a495c	Do*Filter_C: remove dead 'inverse' code paths The inverse parameter became a constant in: `2102ccd` update the Unfilter API in dsp to process one row independently The row parameter to these functions is in a similar state; it will be removed in a follow up. Change-Id: I94cd8babe0e42474ff794ba5fa29dd48039de5f8	2024-08-08 18:13:48 -07:00
James Zern	f9a480f7c3	{TrueMotion,TM16}_NEON: remove zero extension Replace vmovl_u8 -> s16 + signed vaddq with unsigned vaddw. No change in assembly with clang-16 (armv7 & aarch64) and gcc-13 (aarch64). armv7 gcc-13 had kept the vmovl instructions, those are now gone. Change-Id: Ibb4fbdd5680d3e9dd06933c100528a6f363de472	2024-08-07 16:43:14 -07:00
James Zern	04834acae7	Merge changes I25c30a9e,I0a192fc6,I4cf89575 into main * changes: WASM: Enable VP8L_USE_FAST_LOAD WASM: don't use USE_GENERIC_TREE WASM: Enable 64-bit BITS caching	2024-08-01 18:36:34 +00:00
Vincent Rabaud	74be8e22d9	Fix implicit conversion issues Change-Id: If2cc8a137371ef365cf4a9c55f1b6ab131fba564	2024-07-25 22:30:15 +02:00
Vincent Rabaud	f2d6dc1eef	Increase the transform bits if possible. This brings minor size improvements because repetitive values in the transform images are easily explainable through LZ77. Still, it makes an upcoming pull request a bit more stable. This is a rollforward of `7ec51c5916` `ee26766a89` Change-Id: I254ab3ccd5053344f89099280e8d994ecd55aee0	2024-07-19 23:22:27 +02:00
wrv	8a7c8dc662	WASM: Enable VP8L_USE_FAST_LOAD It is 2-5% faster to use VP8L fast load on WASM Bug: webp:643 Change-Id: I25c30a9e6bcfc7cadd640122579eeebcb37e6fc0	2024-07-15 14:41:36 -05:00
wrv	f0c53cd966	WASM: don't use USE_GENERIC_TREE It is 2-4% faster to use hard-coded tree on WASM Bug: webp:643 Change-Id: I0a192fc6af210c79814a81084cd1f199714bf46c	2024-07-15 14:41:14 -05:00
wrv	eef903d04a	WASM: Enable 64-bit BITS caching Bug: webp:643 Change-Id: I4cf89575e0ebcfeaf9d84be8e188863657893a07	2024-07-15 14:40:45 -05:00
James Zern	6296cc8d0d	iterator_enc: make VP8IteratorReset() static This function is unused outside of iterator_enc.c. Change-Id: I0f1ecedeb9ed4d9f51d0135f04b8ef00424f24cc	2024-07-12 15:23:10 -07:00
James Zern	fbd93896a6	histogram_enc: make VP8LGetHistogramSize static This function is unused outside of histogram_enc.c. Change-Id: I527f54408383d0bc9d04878ca397a3d044b350de	2024-07-12 15:23:10 -07:00
James Zern	cc7ff5459a	cost_enc: make VP8CalculateLevelCosts[] static This table is unused outside of cost_enc.c. Change-Id: I0aa46554b8470fb09a7ffeae0e98d2356b40b671	2024-07-12 15:23:10 -07:00
James Zern	4e2828bae8	vp8l_dec: make VP8LClear() static This function is unused outside of vp8l_dec.c. Change-Id: I16733a44ea024ca9601c098641a3cd464bed2b53	2024-07-12 15:22:20 -07:00
James Zern	d742b24a88	Intra16Preds_NEON: fix truemotion saturation This needs to be done with signed saturation as the sum may be negative. fixes mismatch with C code after: `3bfb05e3` Add AArch64 Neon implementation of Intra16Preds Change-Id: I017e939d7155cc3489ceb76fc8ad50ac9917f23d	2024-07-11 13:37:06 -07:00
James Zern	c7bb4cb585	Intra4Preds_NEON: fix truemotion saturation This needs to be done with signed saturation as the sum may be negative. fixes mismatch with C code after: `baa93808` Add AArch64 Neon implementation of Intra4Preds Change-Id: I190c3d7f78cfd2c7ae83fb7059de41e307abda36	2024-07-11 13:37:06 -07:00
Vincent Rabaud	dde11574b0	Remove TODO now that log is using fixed point. Bug: webp:499 Change-Id: I39ab340ec6b5932db7535c6b7f31843c28de8415	2024-07-11 20:11:03 +00:00
James Zern	3bd9420289	Merge changes Iff6e47ed,I24c67cd5,Id781e761 into main * changes: Use QuantizeBlock_NEON for VP8EncQuantizeBlockWHT on Arm Add AArch64 Neon implementation of Intra16Preds Add AArch64 Neon implementation of Intra4Preds	2024-07-11 02:04:42 +00:00
Vincent Rabaud	d27d246e42	Merge "Convert VP8LFastSLog2 to fixed point" into main	2024-07-10 21:52:39 +00:00
Istvan Stefan	314a142a34	Use QuantizeBlock_NEON for VP8EncQuantizeBlockWHT on Arm Use the Neon implementation instead of falling back to QuantizeBlock_C. Change-Id: Iff6e47eda353cbaa9766f75040fa63aa34607816	2024-07-10 14:48:38 +01:00
Istvan Stefan	3bfb05e38c	Add AArch64 Neon implementation of Intra16Preds Add a Neon implementation of Intra16Preds for use on 64-bit Arm platforms. (This implementation cannot be used on 32-bit Arm platforms as it makes use of a number of AArch64-only Neon instructions.) Change-Id: I24c67cd54b66307e3924fd332c2795fd7422f082	2024-07-10 14:48:38 +01:00
Istvan Stefan	baa93808d9	Add AArch64 Neon implementation of Intra4Preds Add Neon implementation of Intra4Preds for use on 64-bit Arm platforms. (The same implementation cannot be used for 32-bit Arm platforms as it uses a number of AArch64-only Neon instructions.) Change-Id: Id781e7614f4e8e876dfeecd95cfc85e04611d8c6	2024-07-10 14:48:26 +01:00
Vincent Rabaud	41a5e582c2	Fix errors when compiling code as C++ Change-Id: Iba94e24e764038640f39d61fb2bc9cfb3434cc8f	2024-07-10 10:30:48 +02:00
Vincent Rabaud	fb444b692b	Convert VP8LFastSLog2 to fixed point Speedups: 1% with '-lossless', 2% with '-lossless -q 100 -m6' Change-Id: I1d79ea8e3e9e4bac7bcea4d7cbcc1bd56273988e	2024-07-09 16:42:21 +02:00
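
The size handling described in commit 306335198d (muxread: fix reading of buffers > riff size) can be illustrated with a small sketch. This is not the library's code: `size_with_padding()` and `capped_read_size()` are hypothetical stand-ins, and the value 8 for `CHUNK_HEADER_SIZE` is assumed from the RIFF layout (4-byte tag plus 4-byte size field). The point it shows is that once the padded size already includes the chunk header, the cap on the caller's buffer size should use that value directly, so a few trailing bytes after the RIFF chunk (such as a newline) are ignored rather than rejected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed from the RIFF layout: 4-byte tag + 4-byte size field. */
#define CHUNK_HEADER_SIZE 8u

/* Hypothetical helper: payload size rounded up to an even number of bytes,
 * plus the chunk header. */
static uint32_t size_with_padding(uint32_t payload_size) {
  return CHUNK_HEADER_SIZE + ((payload_size + 1u) & ~1u);
}

/* Hypothetical helper: returns how many bytes of the buffer should be parsed,
 * or 0 if the buffer is shorter than the declared RIFF chunk.
 *   buffer_size:  number of bytes the caller supplied
 *   riff_payload: size field as it would be read from the RIFF header */
static size_t capped_read_size(size_t buffer_size, uint32_t riff_payload) {
  uint32_t riff_size;
  /* Guard against wrap-around in the padding arithmetic. */
  assert(riff_payload <= UINT32_MAX - CHUNK_HEADER_SIZE - 1u);
  riff_size = size_with_padding(riff_payload);
  if (riff_size > buffer_size) return 0;  /* truncated input */
  /* The header is already counted in riff_size, so cap to it directly;
   * any extra bytes after the RIFF chunk are simply not read. */
  return riff_size;
}
```

With a declared payload of 10 bytes, a 19-byte buffer (one stray trailing byte) and a 100-byte buffer both yield a read size of 18 in this sketch, while a 17-byte buffer is reported as truncated.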


2949 Commits
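
Commits d742b24a88 and c7bb4cb585 note that TrueMotion prediction must saturate a signed sum because the intermediate value may be negative. A scalar sketch of that arithmetic follows, assuming the usual VP8 TrueMotion formula (top + left - top_left, clamped to 0..255). The function names and the flat 4x4 output layout are hypothetical and chosen only for illustration; the NEON code in the library is organized differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp a signed value to the 8-bit pixel range. */
static uint8_t clip_to_u8(int v) {
  return (uint8_t)((v < 0) ? 0 : (v > 255) ? 255 : v);
}

/* Hypothetical scalar TrueMotion predictor for a 4x4 block.
 * dst is written row by row: dst[y * 4 + x]. */
static void true_motion_4x4(const uint8_t top[4], const uint8_t left[4],
                            uint8_t top_left, uint8_t dst[16]) {
  int x, y;
  assert(top != NULL && left != NULL && dst != NULL);
  for (y = 0; y < 4; ++y) {
    for (x = 0; x < 4; ++x) {
      /* The sum is computed in a signed type: top + left - top_left can
       * fall below 0 or exceed 255, and both cases must clamp correctly.
       * Treating the intermediate as unsigned would turn a negative sum
       * into a large positive value. */
      const int sum = (int)top[x] + (int)left[y] - (int)top_left;
      dst[y * 4 + x] = clip_to_u8(sum);
    }
  }
}
```

For example, with `top[0] = 10`, `left[0] = 5` and `top_left = 50` the sum is -35, which must clamp to 0.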