62 Commits

Author SHA1 Message Date
Vincent Rabaud
0a9f1c19f8 Convert VP8LFastLog2 to fixed point
The lossless encoding speed-ups are:
- up to 1% with default parameters
- up to 4% in cruncher mode: -q 100 -m 6

Change-Id: Id92d4bad0b0a2c28c8aa9ff5280eea5717017f30
2024-07-02 10:29:38 +02:00
Vincent Rabaud
a90160e11a Refactor histograms in predictors.
Replace the 2d histograms with uint32_t 1d versions (to avoid
pointer casting and to use the optimized VP8LAddVectorEq).

Change-Id: I90b0fe98390b49e3fd03e3484289571cf7ae6eca
2024-05-03 22:09:38 +02:00
Vincent Rabaud
828b4ce062 Switch ExtraCost to ints and implement it in SSE.
The histograms count the occurrences of len/dist in entropy images.
Those (at most (1<<14) by (1<<14)) are sub-sampled by at least
MIN_HUFFMAN_BITS == 2, hence at most 24 bits in a histogram value.
At most, we multiply by 19 (because the longest histogram is of
size 40 and we do 40>>1, cf code) for the bit cost. So it all fits
in 32 bits.

Change-Id: Ife24b035f54794851ff31f2fac07901f724c6d7f
2023-06-01 10:17:13 +02:00
James Zern
8151f388eb move VP8GetCPUInfo declaration to cpu.c
This avoids defining a version in each translation unit when using
__declspec(dllexport) which causes failures due to multiply defined
symbols with clang-cl:

lld-link: error: duplicate symbol: VP8GetCPUInfo
>>> defined at CMakeFiles\webpdecode.dir\Debug\src\dec\alpha_dec.c.obj
>>> defined at CMakeFiles\webpdsp.dir\Debug\src\dsp\dec_sse41.c.obj
...

Bug: webp:607
Change-Id: I6cd1ee75b3db984aa513263a05516e867a64925d
2023-04-27 12:39:13 -07:00
James Zern
e2fecc22e1 dsp/lossless_enc.c: clear int sanitizer warnings
in TransformColorBlue; make new_blue an int to avoid:

implicit conversion from type 'int' of value 264 (32-bit, signed) to
type 'uint8_t' (aka 'unsigned char') changed the value to 8 (8-bit,
unsigned)

Bug: b/229626362
Change-Id: Ife276a59231075788396204e1a192f3b0c6d9e21
2022-08-08 17:34:01 -07:00
James Zern
ad7d1753c5 dsp/lossless_enc.c: clear int sanitizer warnings
add explicit casts in calls to ColorTransformDelta()

clears warnings of the form:
implicit conversion from type 'uint8_t' (aka 'unsigned char') of value
254 (8-bit, unsigned) to type 'int8_t' (aka 'signed char') changed the
value to -2 (8-bit, signed)

Bug: b/229626362
Change-Id: I40618209509508f56d8053f9daa29cf2e6999766
2022-08-08 17:34:00 -07:00
James Zern
5037220e55 VP8LSubtractGreenFromBlueAndRed_C: clear int sanitizer warnings
previously the types were changed to int to prevent unsigned overflow
warnings:
6ab496ed fix some 'unsigned integer overflow' warnings in ubsan

clears warnings of the form:
implicit conversion from type 'uint32_t' (aka 'unsigned int') of value
3724541952 (32-bit, unsigned) to type 'int' changed the value to
-570425344 (32-bit, signed)

implicit conversion from type 'int' of value -3361661 (32-bit, signed)
to type 'unsigned int' changed the value to 4291605635 (32-bit,
unsigned)

Bug: b/229626362
Change-Id: If1eb39c5dd7218d686c3c47fb7df72431b873be4
2022-08-08 17:34:00 -07:00
Vincent Rabaud
a19a25bb03 Replace doubles by floats in lossless misc cost estimations.
Doubles are slower and use more RAM for no benefit.

Change-Id: I05b313576f9b33388c7c39d7fed8de84170c3753
2022-04-17 21:07:54 +02:00
Pascal Massimino
8ea81561d2 change VP8LPredictorFunc signature to avoid reading 'left'
... when it's not available. Even if the value was discarded and
never used, some msan config were complaining about reading it
and passing it around.

Change-Id: Iab8d24676c5bb58e607a829121e36c2862da397c
2021-11-05 16:22:31 +01:00
James Zern
1fe3162541 dsp/*: use WEBP_HAVE_* to determine Init availability
after:
  ece18e55 dsp.h: respect --disable-sse2/sse4.1/neon
WEBP_USE_* will be set when a module is targeting a particular
instruction set, e.g., sse4.1, and not overridden if WEBP_HAVE_SSE41 is
set, as previously this would ignore the case where the instruction set
was disabled via config.h and the HAVE macro was unset.

dsp.h not ensures WEBP_HAVE_* are set when WEBP_USE_* to cover cases
where the files are built without config.h.

Change-Id: Ia1c2dcf4100cc1081d968acb6e085e2a1768ece6
2021-07-24 10:19:30 -07:00
Ilya Kurdyukov
8886f620c0 Use BitCtz for FastSLog2Slow_C
Change-Id: Icc6068b8934e481e6f17efd30616392e68d504ad
2021-02-19 15:11:42 +01:00
Vincent Rabaud
fc14fc038b Have C encoding predictors use decoding predictors.
libwebp.a in Release mode with no symbols size in bytes:
986430 -> 975114  (-1.1%)

Change-Id: Ia96192a6be2911779e359b72132bdba60b60a13d
2020-12-02 11:54:59 +01:00
Pascal Massimino
1106478f42 remove conversion U32 -> S8 warnings
using an inline U32ToS8() function

Change-Id: I45f535c6c9b5de33d69acc17b466e183fcc19a63
2019-06-24 16:42:42 -07:00
Skal
812a6b49fc lossless_enc: fix some conversion warning
object code is unchanged.

Change-Id: I40fc16056c0ab44c5c57ef6b02af14be767abe87
2019-06-24 16:16:18 +02:00
James Zern
4627c1c91b lossless_enc,TransformColorBlue: quiet uint32_t conv warning
no change in object code

from clang-7 integer sanitizer:
implicit conversion from type 'uint32_t' (aka 'unsigned int') of value
1955895199 (32-bit, unsigned) to type 'uint8_t' (aka 'unsigned char')
changed the value to 159 (8-bit, unsigned)

Change-Id: I0c3022339e34b9c9af03167ab827ade677973644
2019-06-20 23:06:13 -07:00
Vincent Rabaud
decf6f6b87 Speedups for empty histograms.
When histograms are empty, it is easy to add them.
They should also not be considered when merging histograms
(it is a waste of CPU).
This does not change the compression performance,
just the speed.

Change-Id: I42c721ca0f9c5ea067e73b792aa3db6d5e71d01f
2018-10-20 13:23:50 +02:00
Vincent Rabaud
dea3e89983 Split HistogramAdd to only have the high level logic in C.
Change-Id: Ic9eaebf7128ca0215b49d2a13bde1f5b94a28061
2018-10-19 14:03:28 +02:00
James Zern
d77bf512bd add WEBP_DSP_INIT / WEBP_DSP_INIT_FUNC
this internalizes the init checks and provides stronger synchronization
with pthreads when available while still allowing VP8GetCPUInfo to be
modified (mostly for testing purposes). windows is left as is since a
critical section or mutex would cause a leak.

Change-Id: Ieb997e014f2805c0ae39c16f13337663521356f4
2018-04-17 11:45:34 +00:00
James Zern
b7971d0e22 dsp: avoid defining _C functions w/NEON builds
when targeting NEON C functions with NEON equivalents won't be used, but
will contribute to binary size. the same goes for sse2, etc., but this
change is primarily concerned with binary sizes for android arm targets.

note '-noasm' or otherwise modifying VP8GetCPUInfo will have no effect
on the use of NEON functions.

this decision can be overridden by defining WEBP_DSP_OMIT_C_CODE to 0.

Change-Id: I47bd453c84a3d341ca39bc986a39eb9c785aface
2017-10-27 10:54:56 -07:00
James Zern
a439972175 WIP: list includes as descendants of the project dir
#include "(.|..)/..." -> #include "src/..."

Change-Id: I772880aa097a770722043c8a4393552ba38a89b6
2017-10-10 23:04:05 -07:00
skal
1411f02761 Lossless Enc: harmonize the function suffixes
BUG=webp:355

Change-Id: I8baf506bd2a27095b956ef22a862b071f60c0d72
2017-08-07 18:02:07 -07:00
Vincent Rabaud
e4eb458741 lossless, VP8LTransformColor_C: make sure no overflow happens with colors.
Change-Id: Iec0d07cf1188ba96391cdb1b62131fc1469dfac6
2017-05-24 11:34:40 +02:00
James Zern
668e1dd44f src/{dec,enc,utils}: give filenames a unique suffix
this avoids duplicates between these trees and dsp/, e.g., enc/tree.c,
dec/tree.c, making pulling the whole library source tree into one target
possible

BUG=webp:279

Change-Id: I060a614833c7c24ddd37bf641702ae6a5eef1775
2017-01-19 19:09:48 -08:00
Vincent Rabaud
875fafc191 Implement BundleColorMap in SSE2.
Change-Id: I44cd23647bd0a49330b6b2b3ed08050a5500e58e
2016-12-21 10:44:31 +01:00
Pascal Massimino
9cc421675b PredictorSub: implement fully-SSE2 version
and inline the C-version too.

Predictor #13 is still a hard one.

Change-Id: Iedecfb5cbf216da4e28ccfdd0810286133f42331
2016-12-13 02:19:35 -08:00
Pascal Massimino
fbba5bc2c1 optimize predictor #1 in plain-C
For some reason, gcc has hard time inlining this one...

Also optimize predictor #0 and #1 for encoding, so we don't have to
call the generic pointers VP8LPredictors[...]

Change-Id: I1ff31e3b83874b53f84fe23487f644619fd61db9
2016-12-12 17:41:36 +01:00
Vincent Rabaud
2e6cb6f34e Give more flexibility to the predictor generating macro.
Change-Id: Ia651afa8322cb5c5ae87128340d05245c0f6a900
2016-12-02 12:33:12 -08:00
Vincent Rabaud
4239a1489c Make the lossless predictors work on a batch of pixels.
Change-Id: Ieaee34f1f97c375b9e97ef7e9df60aed353dffa1
2016-11-28 17:12:10 +01:00
Vincent Rabaud
71e2f5cadf Remove memcpy in lossless decoding.
Change-Id: Iba694b306486d67764e2fc5576c98a974c9b886c
2016-11-24 17:45:24 +01:00
Vincent Rabaud
64577de8ae De-VP8L-ize GetEntropUnrefinedHelper.
Having it architecture dependent resulted in an extra
function call of an extern function, hence no inlining and
a 5-10% impact on performance.

Change-Id: I0ff40d2d881edc76d3594213a64ee53097d42450
2016-09-14 13:55:24 +02:00
Vincent Rabaud
6cc48b1728 Move some lossless logic out of dsp.
Change-Id: I4cfd60cd5497666a2e1c188ceada2e71b05f1505
2016-09-13 15:37:32 +02:00
Vincent Rabaud
c9b45863e2 Split off common lossless dsp inline functions.
Change-Id: I64f96897b11d1c21f033c7e47b21edccb5c68738
2016-09-12 17:35:08 +02:00
skal
6ab496ed22 fix some 'unsigned integer overflow' warnings in ubsan
I couldn't find a safe way of fixing VP8GetSigned() so i just
used the big-hammer.

Change-Id: I1039bc00307d1c90c85909a458a4bc70670e48b7
2016-08-16 23:18:27 -07:00
James Zern
8a4ebc6ab0 Revert "fix 'unsigned integer overflow' warnings in ubsan"
This reverts commit e44f5248ff4b9d27d76edaff93128046a517b5e8.

contains unintentional changes in quant.c

Change-Id: I1928f072566788b0c9ea80f6fbc9e571061f9b3e
2016-08-16 16:55:56 -07:00
skal
e44f5248ff fix 'unsigned integer overflow' warnings in ubsan
I couldn't find a safe way of fixing VP8GetSigned() so i just
used the big-hammer.

Change-Id: I1039bc00307d1c90c85909a458a4bc70670e48b7
2016-08-16 15:04:41 -07:00
Parag Salasakar
cb19dbc1a4 Add MSA optimized color transform functions
We add the following MSA optimized color transform functions:
- TransformColor
- SubtractGreenFromBlueAndRed

Change-Id: Ib182d2b5faa7191f503ce70f0dfde0ac89402fd3
2016-07-18 13:49:24 +00:00
Vincent Rabaud
9e8e1b7b2a Inline GetResidual for speed.
Change-Id: Ib4228e87dc448866229c0795ca68dabe777ef31c
2016-06-21 16:04:53 +02:00
Marcin Kowalczyk
f2e1efbeb7 Improve near lossless compression when a prediction filter is used.
The old implementation in enc/near_lossless.c performing a separate
preprocessing step is used only when a prediction filter is not used,
otherwise a new implementation integrated into lossless_enc.c is used.

It retains the same logic for converting near lossless quality into max
number of bits dropped, and for adjusting the number of bits based on
the smoothness of the image at a given pixel. As before, borders are not
changed.

Then, instead of quantizing raw component values, the residual after
subtract green and after prediction is quantized according to the
resulting number of bits, taking care to not cross the boundary between
255 and 0 after decoding. Ties are resolved by moving closer to the
prediction instead of by bankers’ rounding.

This results in about 15% size decrease for the same quality.

Change-Id: If3e9c388158c2e3e75ef88876703f40b932f671f
2016-05-18 20:59:02 +00:00
Vincent Rabaud
8ce975ac82 SSE optimization for vector mismatch.
Change-Id: I564b822033b59d86635230f29ed6197e306a2c4f
2016-01-07 18:23:45 +01:00
James Zern
99a01f4f8b Merge "Unify some entropy functions." 2015-12-17 22:35:29 +00:00
Vincent Rabaud
ca509a3362 Unify some entropy functions.
The code and logic is unified when computing bit entropy + Huffman cost.

Speed-wise, we gain 8% for lossless encoding.
Logic-wise, the beginning/end of the distributions are handled properly
and the compression ratio does not change much.

Change-Id: Ifa91d7d3e667c9a9a421faec4e845ecb6479a633
2015-12-17 17:00:08 +01:00
Pascal Massimino
b0547ff0b4 move back common constants for lossless_enc*.c into the .h
Change-Id: I11bc979db691f6518d85e2e1c3ac7f05d69681b0
2015-12-17 15:11:56 +01:00
Vincent Rabaud
47ddd5a4cc Move some codec logic out of ./dsp .
The functions containing magic constants are moved out of ./dsp .
VP8LPopulationCost got put back in ./enc
VP8LGetCombinedEntropy is now unrefined (refinement happening in ./enc)
VP8LBitsEntropy is now unrefined (refinement happening in ./enc)
VP8LHistogramEstimateBits got put back in ./enc
VP8LHistogramEstimateBitsBulk got deleted.

Change-Id: I09c4101eebbc6f174403157026fe4a23a5316beb
2015-12-17 07:03:25 +00:00
Vincent Rabaud
2835089d6a Provide an SSE2 implementation of CombinedShannonEntropy.
CombinedShannonEntropy takes 30% for lossless compression.
This implementation speeds up the overall process by 2 to 3 %.

Change-Id: I04a71743284c38814fd0726034d51a02b1b6ba8f
2015-12-11 15:12:19 +01:00
Lode Vandevenne
6938111357 Improved alpha cleanup for the webp encoder when prediction transform is used.
Gives 0.9% smaller (2.4% compared to before alpha cleanup) size on the 1000 PNGs dataset:
Alpha cleanup before: 18856614
Alpha cleanup after: 18685802
For reference, with no alpha cleanup: 19159992

Note: WebPCleanupTransparentArea is still also called in WebPEncode. This cleanup still helps
preprocessing in the encoder, and the cases when the prediction transform is not used.

Change-Id: I63e69f48af6ddeb9804e2e603c59dde2718c6c28
2015-12-04 13:50:56 +00:00
James Zern
0837512964 Merge "Make a separate case for low_effort in CopyImageWithPrediction" 2015-12-03 08:46:31 +00:00
James Zern
aa2eb2d4a1 Merge "cosmetics: fix indent" 2015-12-03 08:44:54 +00:00
James Zern
b7551e90e1 cosmetics: fix indent
Change-Id: I67e5a0308a964bc37b2314d96f3691fc0550e9bc
2015-12-03 00:34:15 -08:00
Lode Vandevenne
5bda52d4e8 Make a separate case for low_effort in CopyImageWithPrediction
for more speed.

This gives a roughly a 1% speedup for low_effort. But actually this is a
preparation for the upcoming CL that changes RGB values of transparent pixels
based on prediction, which should not be done for low_effort because that would
slightly hurt its performance.

On 1000 PNGs, with quality 0, method 0:
Before:
Compression (output/input): 2.9120/3.2667 bpp, Encode rate (raw data): 36.034 MP/s
After:
Compression (output/input): 2.9120/3.2667 bpp, Encode rate (raw data): 36.428 MP/s

Change-Id: I5ed9f599bbf908a917723f3c780551ceb7fd724d
2015-12-03 00:22:50 -08:00
Vincent Rabaud
829bd14145 Combine Huffman cost and bit entropy into one loop
The same computation was done for both values: go over two buffers,
sum them up, and take a decision on the sum at each iteration.

MIPS32 code has been disabled for now, pending a code update.

Change-Id: I997984326f7092b3dbb8cfa1e524bd8132b2ab9d
2015-11-30 13:57:25 +01:00