4898 Commits

Author SHA1 Message Date
James Zern
0ab789e067 Merge changes I6dfedfd5,I2376e2dc into main
* changes:
  rework AddVectorEq_SSE2
  rework AddVector_SSE2
2024-11-15 02:58:10 +00:00
James Zern
0323645066 {ios,xcframework}build.sh: fix compilation w/Xcode 16
Don't use `-fembed-bitcode`, fixes:
ld: warning: -bitcode_bundle is no longer supported and will be ignored
ld: -mllvm and -bitcode_bundle (Xcode setting ENABLE_BITCODE=YES) cannot
    be used together

Change-Id: I4ead0fc71da39bb5ec92c1f5ba467b95ad8b7461
2024-11-14 20:26:57 +00:00
James Zern
61e2cfdadd rework AddVectorEq_SSE2
Take advantage of the known sizes used by VP8LHistogramAdd() and
remove loop for the remainder. The loop was being auto-vectorized making
the code larger and slower than the vectorized C code.

For larger sizes the new code is ~3-4.5% faster than the old code with
about the same improvement against the vectorized C code. For the
minimal size (40), the new code is ~30% faster than the C and old SSE2
code.

The LINE_SIZE==8 option is removed with this change. It had been set
to 16 for its entire life and clang-16 was unrolling the LINE_SIZE==8
case by 2 in any case; they both profile similarly.

Change-Id: I6dfedfd57474f44d15e2ce510a48e5252221077a
2024-11-14 12:21:39 -08:00
James Zern
7bda3deb89 rework AddVector_SSE2
Take advantage of the known sizes used by VP8LHistogramAdd() and remove
loop for the remainder. The loop was being auto-vectorized making the
code larger and slower than the vectorized C code.

For larger sizes the new code is ~4-7% faster than the old code with
about the same improvement against the vectorized C code. For the
minimal size (40), the new code is ~30% faster than the C and old SSE2
code.

The LINE_SIZE==8 option is removed with this change. It had been set to
16 for its entire life and clang-16 was unrolling the LINE_SIZE==8 case
by 2 in any case; they both profile similarly.

Change-Id: I2376e2dca3bffa38477b4a432f4c533419e3be0e
2024-11-14 12:21:33 -08:00
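The idea in the two "rework AddVector" commits above, relying on known sizes so that no remainder loop is needed, can be sketched in portable C. This is not the SSE2 code from the commits: the function name and the multiple-of-8 precondition are assumptions made for illustration (the minimal size of 40 quoted above satisfies it).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not libwebp's code: out[i] = a[i] + b[i].
 * Assumption: size is a positive multiple of 8, so the work splits into
 * whole blocks and no scalar loop for a remainder is required. */
static void SketchAddVectorFixed(const uint32_t* a, const uint32_t* b,
                                 uint32_t* out, int size) {
  int i;
  assert(size > 0 && size % 8 == 0);
  for (i = 0; i < size; i += 8) {
    int j;
    /* One fixed-width block; a compiler can map this to vector adds. */
    for (j = 0; j < 8; ++j) out[i + j] = a[i + j] + b[i + j];
  }
}
```

Turning the size assumption into an assertion lets the remainder path disappear instead of being auto-vectorized into extra code.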
Maryla
2ddaaf0aa5 Fix variable names in SharpYuvComputeConversionMatrix
Change-Id: Ia07e71aae42396100a4f50dc104e828239522d77
2024-11-07 09:37:40 +01:00
James Zern
a3ba6f19e9 Makefile.vc: fix gif2webp link error
Add missing dependency on libsharpyuv.

needed after:
f999d94f gif2webp: add -sharp_yuv/-near_lossless

Change-Id: I8bdd5c0fd4622f9c8ec6ffdf4ac11399f86350da
2024-11-06 10:14:05 -08:00
James Zern
f999d94f4a gif2webp: add -sharp_yuv/-near_lossless
This change is the same as the one that introduced the options to
img2webp:
0825faa4 img2webp: add -sharp_yuv/-near_lossless

Change-Id: Id380d159299c38dd6440f833d487e00c0976afec
2024-11-04 12:29:24 -08:00
James Zern
dfdcb7f95c Merge "lossless.h: fix function declaration mismatches" into main
2024-10-09 22:30:49 +00:00
James Zern
78ed683978 fix overread in Intra4Preds_NEON
Extend VP8EncIterator::i4_boundary_ by 3 bytes to avoid Intra4Preds_NEON
reading deeper into the struct (likely padding) when top is positioned
at offset 29. This data is memset with MSan to prevent a warning due to
its incorrect modeling of tbl instructions.

Prior to:
  169dfbf9 disable Intra4Preds_NEON
there was a mismatch in the preprocessor checks for enabling the
function in NEON and removing the C version; NEON used `BPS == 32` while
the C code was removed unconditionally when building for aarch64. This
patch also normalizes those checks to look for `BPS == 32` and `BPS !=
32` as appropriate.

Bug: b:366668849,webp:372109644
Change-Id: Ic9e6ad4b2d844cb446decd63aec0b2676a89c8d0
2024-10-08 16:55:12 -07:00
James Zern
d516a68e54 lossless.h: fix function declaration mismatches
These appear as warnings under VS15 (16 and 17 are silent) and were
missed in:
a32b436b dsp/lossless*: use WEBP_RESTRICT qualifier

Change-Id: Ia7cffafc166f2da93b51714363558798cda71b67
2024-10-08 13:41:16 -07:00
Maryla Ustarroz-Calonge
874069042e Merge "Improve documentation of SharpYuvConversionMatrix." into main
2024-10-04 11:59:25 +00:00
James Zern
fdb229ea3a Merge changes I07a7e36a,Ib29980f7,I2316122d,I2356e314,I32b53dd3, ... into main
* changes:
  dsp/yuv*: use WEBP_RESTRICT qualifier
  dsp/upsampling*: use WEBP_RESTRICT qualifier
  dsp/rescaler*: use WEBP_RESTRICT qualifier
  dsp/lossless*: use WEBP_RESTRICT qualifier
  dsp/filters*: use WEBP_RESTRICT qualifier
  dsp/enc*: use WEBP_RESTRICT qualifier
  dsp/dec*: use WEBP_RESTRICT qualifier
  dsp/cost*: use WEBP_RESTRICT qualifier
2024-10-03 17:01:02 +00:00
Maryla
0c3cd9cc2c Improve documentation of SharpYuvConversionMatrix.
Change-Id: I39898bf53db759b68c86c9005c11ded20de4eb3e
2024-10-03 10:35:58 +02:00
James Zern
169dfbf931 disable Intra4Preds_NEON
The load of the `top` parameter may over read causing MSan errors:

==7373==WARNING: MemorySanitizer: use-of-uninitialized-value
  #0 0xfff891d52ad4 in Intra4Preds_NEON src/dsp/enc_neon.c:1003:12
  #1 0xfff892d87618 in MakeIntra4Preds src/enc/quant_enc.c:484:3

Bug: b:366668849
Change-Id: I29cf3b2f402ee79ea93c1ee2a4fdd95083aeed68
2024-10-02 15:42:19 -07:00
James Zern
2dd5eb9862 dsp/yuv*: use WEBP_RESTRICT qualifier
Better vectorization in the C code, fewer instructions / comparisons in
NEON, and fewer reloads in SSE2/SSE4 w/ndk r27/gcc-13/clang-16.

This only affects non-vector pointers; any vector pointers are left as a
follow up.

Change-Id: I07a7e36a2dce8632c71c0fbbeef94dc51453eaf7
2024-10-02 14:55:15 -07:00
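As a rough illustration of what a restrict qualifier on non-vector pointers buys in this series of commits, here is a minimal sketch. `SKETCH_RESTRICT` is a hypothetical stand-in, since the definition of `WEBP_RESTRICT` is not shown in this log, and `SketchAverageRows` is an invented function, not one from libwebp.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for WEBP_RESTRICT (real definition not shown here). */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#define SKETCH_RESTRICT restrict
#elif defined(__GNUC__) || defined(_MSC_VER)
#define SKETCH_RESTRICT __restrict
#else
#define SKETCH_RESTRICT
#endif

/* Invented example: rounded average of two rows into dst.
 * The qualifiers promise that dst does not overlap top or bottom, so the
 * compiler need not reload inputs after each store and can vectorize. */
static void SketchAverageRows(const uint8_t* SKETCH_RESTRICT top,
                              const uint8_t* SKETCH_RESTRICT bottom,
                              uint8_t* SKETCH_RESTRICT dst, size_t len) {
  size_t i;
  for (i = 0; i < len; ++i) {
    dst[i] = (uint8_t)((top[i] + bottom[i] + 1) >> 1);
  }
}
```

The promise is the caller's responsibility: passing overlapping buffers to such a function is undefined behavior.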
James Zern
23bbafbeb8 dsp/upsampling*: use WEBP_RESTRICT qualifier
Better vectorization in the C code, fewer instructions in NEON, and some
code reordering / better register usage in SSE2/SSE4 w/ndk
r27/gcc-13/clang-16.

This only affects non-vector pointers; any vector pointers are left as a
follow up.

Change-Id: Ib29980f778ad3dbb952178ad8dee39b8673c4ff8
2024-10-02 14:55:15 -07:00
James Zern
35915b389e dsp/rescaler*: use WEBP_RESTRICT qualifier
Some improvement in the C code. No changes in NEON or SSE2 w/ndk
r27/gcc-13/clang-16.

This only affects non-vector pointers; any vector pointers are left as a
follow up.

Change-Id: I2316122db893f48f0afda90a147c83cac7f07526
2024-10-02 14:55:14 -07:00
James Zern
a32b436bd5 dsp/lossless*: use WEBP_RESTRICT qualifier
lossless_enc: better vectorization, most benefits seen in AddVector/Eq
              w/ndk r27/gcc-13/clang-16
lossless: minor reordering and some improvement to PredictorAdd5_SSE2
          w/gcc-13

This only affects non-vector pointers; any vector pointers are left as a
follow up.

Change-Id: I2356e314f391ee2f2c71f00bc6ee10097d3881e7
2024-10-02 14:55:14 -07:00
James Zern
04d4b4f387 dsp/filters*: use WEBP_RESTRICT qualifier
Better stack/register usage in SSE2/NEON code and improved vectorization
of the C code with ndk r27/gcc-13/clang-16.

This only affects non-vector pointers; any vector pointers are left as a
follow up.

Change-Id: I32b53dd38bfc7e2231d875409e7dfda7c513cfb6
2024-10-02 14:55:14 -07:00
James Zern
b1cb37e659 dsp/enc*: use WEBP_RESTRICT qualifier
This allows for better vectorization of the C code, inlining of
TrueMotion_SSE2, better load usage in aarch64 and other minor
reordering with ndk r27/gcc-13/clang-16.

This only affects non-vector pointers; any vector pointers are left as a
follow up.

Change-Id: I07e9944d5c0aa5a079b22883ac5a2d649695e4a0
2024-10-02 14:55:14 -07:00
James Zern
201894ef24 dsp/dec*: use WEBP_RESTRICT qualifier
A minor improvement for arm targets with ndk r27/gcc-13 in H/VFilter8 (a
couple fewer moves w/aarch64) and much better vectorization of
DitherCombine8x8_C in most targets.

This only affects non-vector pointers; any vector pointers are left as a
follow up.

Change-Id: I03e73e6d6404261bb8408a9ae76a4b6ef142f8f0
2024-10-02 14:55:14 -07:00
James Zern
02eac8a741 dsp/cost*: use WEBP_RESTRICT qualifier
on SetResidualCoeffs_*. This results in some minor code reordering when
targeting armv7 with ndk r27 and other recent versions of clang. No
changes in the x86 compilations with clang-16 / gcc-13.

This only affects non-vector pointers; any vector pointers are left as a
follow up.

Change-Id: I7c3554ece848fafbc5ac9c4944f1dc85129f6fd8
2024-10-02 14:55:14 -07:00
James Zern
84b118c9c3 Merge "webp-container-spec: normalize notes & unknown chunk link" into main
2024-09-30 18:16:05 +00:00
James Zern
052cf42f1a webp-container-spec: normalize notes & unknown chunk link
- use plural 'Notes' in the description of 'Background Color' to match
  the formatting of the notes describing 'Disposal method';
- fix one unknown chunks link to contain both words to match others

+ add a couple of missing commas

These changes are based on editor changes in AUTH48:
https://datatracker.ietf.org/doc/draft-zern-webp/

Change-Id: Ibbed0459d42944099e295f492dc21bde4e107658
2024-09-27 11:07:10 -07:00
Vincent Rabaud
220ee52967 Search for best predictor transform bits
This is useful in cruncher mode.

Change-Id: I8586bdbf464daf85db381ab77a18bf63dd48f323
2024-09-24 10:44:22 +02:00
Vincent Rabaud
7861947813 Try to reduce the sampling for the entropy image
This offers minor compression improvements.

Change-Id: I4b3b1bb11ee83273c0e4c9f47e53b21cf7cd5f76
2024-09-24 10:28:43 +02:00
James Zern
14f09ab75b webp-container-spec: reorder chunk size - N text
Use 'Chunk Size bytes - N' to avoid singular/plural confusion in the
'Chunk Size - 1 bytes' case.

These changes are based on editor comments in AUTH48:
https://datatracker.ietf.org/doc/draft-zern-webp/

Change-Id: I898113033fd53d744fe9289f971887b8cfe278b9
2024-09-19 11:54:42 -07:00
Vincent Rabaud
a78c5356ba Remove a useless malloc for entropy image
histogram_symbols is converted to uint32_t and <<8 into
histogram_argb.
Using a uint32_t buffer from the start prevents copying and
converting the data.

Change-Id: I245003a6a0f048c31519afa25a600d4479e762e3
2024-09-18 22:38:11 +02:00
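The change described in "Remove a useless malloc for entropy image" can be pictured as follows. This is only a sketch under assumptions: the function name is hypothetical, and the buffer is assumed to already hold one symbol per `uint32_t` entry, so the `<<8` can be applied in place with no second allocation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not libwebp's code. The symbols are stored in a
 * uint32_t buffer from the start, so shifting them left by 8 can be done
 * in place instead of copying them into a separately allocated buffer. */
static void SketchShiftSymbolsInPlace(uint32_t* buf, size_t n) {
  size_t i;
  for (i = 0; i < n; ++i) buf[i] <<= 8;
}
```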
Vincent Rabaud
bc49176355 Merge "Refactor predictor finding" into main
2024-09-18 08:38:57 +00:00
James Zern
34f9223829 man/{cwebp,img2webp}.1: rm 'if needed' from -sharp_yuv
The wording might have implied that the library would optionally use
sharpyuv, though this option forces its use. The riskiness score
computed by SharpYuvEstimate420Risk() (extras/extras.c) is not used by
the library.

Change-Id: I56ea3262d7985215570809a4a629a2a7760e936a
2024-09-17 11:03:04 -07:00
Vincent Rabaud
367ca938f1 Refactor predictor finding
This is useful for a forward change that will improve compression.
It splits the residual computation and the best predictor
selection.

The only downside is that more memory is allocated: we had 2
histograms before, we now have 14, but this is necessary for the
later change. Still, this is nothing compared to what is done
later in the pipeline in HistogramSetTotalSize where the number of
histograms created is the number of pixels in the subsampled image.

Change-Id: If03501a26f00462dd1809daa6e9314abd180945d
2024-09-17 09:49:43 +02:00
James Zern
a582b53b74 webp-lossless-bitstream-spec: clarify some text
These changes are based on editor comments in AUTH48:
https://datatracker.ietf.org/doc/draft-zern-webp/

Change-Id: I21f18bce43fde0e396b2cbc935d0ff90448f96c4
2024-09-10 18:04:24 -07:00
James Zern
0fd25d8406 Merge "anim_encode.c: fix function ref in comment" into main
2024-09-10 18:53:23 +00:00
James Zern
f888291359 anim_encode.c: fix function ref in comment
WebPCleanupTransparentAreaLossless() was renamed to
WebPReplaceTransparentPixels() in:
55a080e5 Add WebPReplaceTransparentPixels() in dsp

Change-Id: I91e32574e6add2748c0655146f100eb2b40498b2
2024-09-09 19:28:12 -07:00
James Zern
40e4ca60ea specs_generation.md: update kramdown command line
coderay was extracted from the core and the options removed in 2.0.0.
See:
49e1b12f52

Change-Id: I5191dcec296ba4bcde5f0bcbc46d1e1135d40ec2
2024-09-06 15:32:26 -07:00
James Zern
57883c78ed img2webp: add -exact/-noexact per-frame options
Bug: b:363409354
Change-Id: I4e7282ed2df091dbef6d79743be1c8c868c0d44a
2024-09-03 18:58:13 -07:00
James Zern
1c8eba978b img2webp,cosmetics: add missing '.' spacers to help
Change-Id: I98e853a8caa091c182d41ea9d95499021c8deb3a
2024-09-03 18:27:45 -07:00
Vincent Rabaud
2e81017c7a Convert predictor_enc.c to fixed point
Also remove the last float in histogram_enc.c

Change-Id: I6f647a5fc6dd34a19292820817472b4462c94f49
2024-08-30 09:22:48 +02:00
Vincent Rabaud
94de6c7fed Merge "Fix fuzztest link errors w/-DBUILD_SHARED_LIBS=1" into main
2024-08-30 07:15:24 +00:00
Vincent Rabaud
51d9832a36 Fix fuzztest link errors w/-DBUILD_SHARED_LIBS=1
Change-Id: I089a59baa3275f7a62483da0bc1d5269e51af74e
2024-08-28 11:39:11 +02:00
Vincent Rabaud
7bcb36b884 Merge "Fix static overflow warning." into main
2024-08-28 09:17:08 +00:00
Vincent Rabaud
8e0cc14c3e Fix static overflow warning.
In practice, this can never happen because:
- 'streak' is at most as long as a histogram
- 'count' counts the number of streaks

'streak' and 'count' are therefore at most as big as the histogram
length which is at most the max of VP8LHistogramNumCodes,
which is 256+24+(1<<10).

Change-Id: I31c8834543479c8a9260732313ea26b045519515
2024-08-28 10:23:54 +02:00
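The bound quoted in "Fix static overflow warning." works out to 256 + 24 + 1024 = 1304. A small sketch of that arithmetic is given below. The constant names are hypothetical, and reading the three terms as literal codes, length codes and a 10-bit cache is an assumption not stated in the commit message.

```c
#include <assert.h>

/* Hypothetical names; the commit message only gives 256+24+(1<<10). */
enum {
  kSketchLiteralCodes = 256,
  kSketchLengthCodes = 24,
  kSketchMaxCacheBits = 10
};

/* Upper bound on a histogram length, and therefore on both the length
 * of a streak and the number of streaks counted in it. */
static int SketchMaxHistogramLength(void) {
  return kSketchLiteralCodes + kSketchLengthCodes + (1 << kSketchMaxCacheBits);
}
```

Since 'streak' and 'count' never exceed this value, the overflow flagged by the static analysis cannot occur in practice.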
James Zern
cea684626d README.md: add security report note
The default template for https://issues.webmproject.org/ is a public bug
report. Security issues can be reported securely using the 'Security
report' template.

Change-Id: Id489253c0def8a4d6d26327ea93ef4c796703ff1
2024-08-26 18:34:04 -07:00
James Zern
615e58744f Merge "make VP8LPredictor[01]_C() static" into main
2024-08-22 17:35:52 +00:00
James Zern
233e86b91f Merge changes Ie43dc5ef,I94cd8bab into main
* changes:
  Do*Filter_*: remove row & num_rows parameters
  Do*Filter_C: remove dead 'inverse' code paths
2024-08-19 18:51:06 +00:00
James Zern
1a29fd2fc3 make VP8LPredictor[01]_C() static
Only predictors 2-13 are reused in lossless_enc.c.

Change-Id: Ia3a7342fccfb44b9ad5297f48d6be2d96af68ec8
2024-08-16 10:58:45 -07:00
James Zern
dd9d3770d7 Do*Filter_*: remove row & num_rows parameters
The row parameter became a constant in:
2102ccd update the Unfilter API in dsp to process one row independently

num_rows is always equal to height.

Change-Id: Ie43dc5ef222e442ce8c92766da0b9824ccbca236
2024-08-12 19:36:31 -07:00
James Zern
ab451a495c Do*Filter_C: remove dead 'inverse' code paths
The inverse parameter became a constant in:
2102ccd update the Unfilter API in dsp to process one row independently

The row parameter to these functions is in a similar state; it will be
removed in a follow up.

Change-Id: I94cd8babe0e42474ff794ba5fa29dd48039de5f8
2024-08-08 18:13:48 -07:00
James Zern
f9a480f7c3 {TrueMotion,TM16}_NEON: remove zero extension
Replace vmovl_u8 -> s16 + signed vaddq with unsigned vaddw.
No change in assembly with clang-16 (armv7 & aarch64) and gcc-13
(aarch64). armv7 gcc-13 had kept the vmovl instructions; those are now
gone.

Change-Id: Ibb4fbdd5680d3e9dd06933c100528a6f363de472
2024-08-07 16:43:14 -07:00
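Why the replacement in "{TrueMotion,TM16}_NEON: remove zero extension" yields the same bits can be checked with a scalar model of one lane. This is not NEON code: both function names are hypothetical, and each models a single 16-bit lane with arithmetic that wraps modulo 2^16.

```c
#include <stdint.h>

/* Models one lane of vmovl_u8 -> s16 followed by a signed vaddq:
 * zero-extend the pixel, add as signed, keep the low 16 bits. */
static uint16_t SketchAddViaZeroExtend(int16_t acc, uint8_t px) {
  const int16_t wide = (int16_t)px; /* 0..255 always fits in int16_t */
  return (uint16_t)(acc + wide);
}

/* Models one lane of an unsigned widening add (vaddw): the 8-bit value
 * is widened as part of the add itself. */
static uint16_t SketchAddWide(uint16_t acc, uint8_t px) {
  return (uint16_t)(acc + px);
}
```

Both functions return identical low 16 bits for every input, which is why dropping the separate zero extension is safe.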
James Zern
04834acae7 Merge changes I25c30a9e,I0a192fc6,I4cf89575 into main
* changes:
  WASM: Enable VP8L_USE_FAST_LOAD
  WASM: don't use USE_GENERIC_TREE
  WASM: Enable 64-bit BITS caching
2024-08-01 18:36:34 +00:00