Commit Graph

157 Commits

Author SHA1 Message Date
Pascal Massimino
2f5e8986cf fix multiple allocation for transform buffer
We were not updating the current_width_, which is usually
not a problem, unless we use Delta Palette with small number
of colors
-> Addressed this re-entrancy problem by checking we have
enough capacity for transform buffer.

The problem is not currently visible, until we restrict
the number of gradient used in delta-palette to less than 16.
Then the buffers have different current_width_ and the problem
surfaces.

Change-Id: Icd84b919905d7789014bb6668bfb6813c93fb36e
2016-02-17 06:14:39 +01:00
Vincent Rabaud
47ddd5a4cc Move some codec logic out of ./dsp .
The functions containing magic constants are moved out of ./dsp .
VP8LPopulationCost got put back in ./enc
VP8LGetCombinedEntropy is now unrefined (refinement happening in ./enc)
VP8LBitsEntropy is now unrefined (refinement happening in ./enc)
VP8LHistogramEstimateBits got put back in ./enc
VP8LHistogramEstimateBitsBulk got deleted.

Change-Id: I09c4101eebbc6f174403157026fe4a23a5316beb
2015-12-17 07:03:25 +00:00
Lode Vandevenne
6938111357 Improved alpha cleanup for the webp encoder when prediction transform is used.
Gives 0.9% smaller (2.4% compared to before alpha cleanup) size on the 1000 PNGs dataset:
Alpha cleanup before: 18856614
Alpha cleanup after: 18685802
For reference, with no alpha cleanup: 19159992

Note: WebPCleanupTransparentArea is still also called in WebPEncode. This cleanup still helps
preprocessing in the encoder, and the cases when the prediction transform is not used.

Change-Id: I63e69f48af6ddeb9804e2e603c59dde2718c6c28
2015-12-04 13:50:56 +00:00
Pascal Massimino
2c08aac81a introduce WebPMemToUint32 and WebPUint32ToMem for memory access
it uses memcpy() when unaligned memory write is tricky

Change-Id: I5d966ca9d19e9b43ac90140fa487824116982874
2015-12-04 13:43:01 +00:00
Lode Vandevenne
239421c5ef lossless: make prediction in encoder work per scanline
instead of per block. This prepares for a next CL that can make the
predictors alter RGB value behind transparent pixels for denser
encoding. Some predictors depend on the top-right pixel, and it must
have been already processed to know its new RGB value, so requires per
scanline instead of per block.

Running the encode speed test on 1000 PNGs 10 times with default
settings:
Before:
Compression (output/input): 2.3745/3.2667 bpp, Encode rate (raw data): 1.497 MP/s
After:
Compression (output/input): 2.3745/3.2667 bpp, Encode rate (raw data): 1.501 MP/s

Same but with quality 0, method 0 and 30 iterations:
Before:
Compression (output/input): 2.9120/3.2667 bpp, Encode rate (raw data): 36.379 MP/s
After:
Compression (output/input): 2.9120/3.2667 bpp, Encode rate (raw data): 36.462 MP/s

No effect on compressed size, this produces exactly same files. No
significant measured effect on speed. Expected faster speed from better
memory layout with scanline processing but slower speed due to needing
to get predictor mode per pixel, may compensate each other.

Change-Id: I40f766f1c1c19f87b62c1e2a1c4cd7627a2c3334
2015-11-25 00:38:27 -08:00
Urvang Joshi
6ecd72f845 Re-enable encoding of alpha plane with color cache for next release.
This is a revert of: https://chromium-review.googlesource.com/#/c/73607/

Change-Id: I7ec45277d73608d77d5e873290c6c185caa30c32
2015-11-13 07:15:19 +00:00
James Zern
f717b82864 vp8l.c, cosmetics: fix indent after 95509f9
95509f9 large re-organization of the delta-palettization code

Change-Id: I9d27f15cb6072a2bd1dd593d53db5b2dd3c30133
2015-10-19 12:28:57 -07:00
Pascal Massimino
fea94b2b36 fix alignment of allocated memory in AllocateTransformBuffer
likely to avoid unaligned reads in the future

Change-Id: I434ba17c139ad6e190ebd9b909b241c6c6f1e7f8
2015-10-18 13:09:22 -07:00
Pascal Massimino
95509f9914 large re-organization of the delta-palettization code
same functionality, but better code layout.

What changed:
  * don't trash the palette_[] in EncodePalette(), so it can be re-used
  * split generation of image from bit-stream coding
  * move all the delta-palette code to delta_palettization.c, and only have 1 entry point there WebPSearchOptimalDeltaPalette()
  * minimize the number of "#ifdef WEBP_EXPERIMENTAL_FEATURES" in vp8l.c
  * clarify the TransformBuffer stuff. more clean-up to come here...

This should make experimenting with delta-palettization easier and more compartimentalized.

Change-Id: Iadaa90e6c5b9dabc7791aec2530e18c973a94610
2015-10-14 00:25:42 +02:00
Mislav Bradac
48f66b6687 Add delta_palettization feature to WebP
Change-Id: Ibaf4e49aa67d63d0eb11848cca4fd0c60815864a
2015-10-02 14:29:54 -07:00
Jyrki Alakuijala
b969f888ab Reduce magic in palette reordering
Slightly faster on -m 0 -q 0, particularly for small images (50 x 75
image was 0.1 % faster on callgrind measurement).

Increases compression density by 0.005 % for the 1000 images, but small
images can improve even 0.5 % (about 4 bytes, depending on the
characteristics of the palette).

Change-Id: I94f568d396ac62a054a829abeeef3eb0af6b3f94
2015-08-10 19:06:07 -07:00
Jyrki Alakuijala
52931fd548 lossless: combine the Huffman code with extra bits
gives 2 % speedup
24.9 -> 25.5 MP/s for a photo with -q 0 -m 0

Change-Id: If9ae04683a86dd7b1fced2183cf79b9349a24a9e
2015-07-07 20:24:28 -07:00
Jyrki Alakuijala
7b23b19808 lossless: Add zeroes into the predicted histograms.
Increases compression density by 0.03 % for lossy.

Speeds up at least one of the lossy alpha images by 20 %.

Palette entropy 'kludge' seems to save 1-2 % on alpha images.

Change-Id: I2116b8d81593ac8173bfba54a7c833997fca0804
2015-07-07 20:24:27 -07:00
Jyrki Alakuijala
85b44d8a69 lossless: encoding, don't compute unnecessary histo
share the computation between different modes

3-5 % speedup for lossless alpha
1 % for lossy alpha

no change in compression density

Change-Id: I5e31413b3efcd4319121587da8320ac4f14550b2
2015-07-07 20:24:26 -07:00
Jyrki Alakuijala
d92453f381 lossless: Remove about 25 % of the speed degradation
introduced in:
"lossless: 0.37 % compression density improvement"

Uses the statistics of red and blue histograms to decide if to run
cross color correction at all.

Improves compression density by 0.02 % or so.

Change-Id: I47429557e9cdbd9fa90c584696f241b17427d73f
2015-07-07 20:24:26 -07:00
Jyrki Alakuijala
2cce031704 Faster alpha coding for webp
No significant size degradation (+0.001 %) for 1000 image corpus

Fixes the 8 ms vs 2 ms degradation from:
"lossless: 0.37 % compression density improvement"

Change-Id: Id540169a305d9d5c6213a82b46c879761b3ca608
2015-07-07 20:24:25 -07:00
Jyrki Alakuijala
84326e4ab0 lossless: Less code for the entropy selection
Tested:
	1000 png corpus gives same results

Change-Id: Ief5ea7727290743b9bd893b08af7aa7951f556cb
2015-07-07 20:24:25 -07:00
Jyrki Alakuijala
16ab951abf lossless: 0.37 % compression density improvement
counting the entropy expectation for five different configurations:
palette
non-predicted
non-predicted with subtract green
predicted
predicted with subtract green

and choose the strategy with the smallest expected entropy

Change-Id: Iaaf209c0d565660a54a4f9b3959067afb9951960
2015-07-07 20:24:24 -07:00
James Zern
553051f741 dsp/lossless: split enc/dec functions
adds lossless_enc*.c; reduces the size of the decode-only so: ~78K
w/gcc-4.8.2 on x86_64.

Change-Id: If5e4610b67d05eba5896bc64bab79e9df92b2092
2015-03-23 22:57:50 -07:00
Vikas Arora
ef98750027 Speedup method StoreImageToBitMask by 5%.
Speedup method StoreImageToBitMask by replacing the code to find histogram
index and Huffman tree codes at every iteration to a more optimal code that
updates these only when the current pixel (to write) crosses the histogram
tile-row boundary.

This change speeds up the StoreImageToBitMask method by 5%.

Change-Id: If01a1ccd7820f9a3a3e5bc449d070defa51be14b
2015-02-20 09:46:19 -08:00
Vikas Arora
4c82284d2e Updated the near-lossless level mapping.
Updated the near-lossless level mapping and make it correlated to lossy
quality i.e 100 => minimum loss (in-fact no-loss) and the visual-quality loss
increases with decrease in near-lossless level (quality) till value 0.

The new mapping implies following (PSNR) loss-metric:
-near_lossless 100: No-loss (bit-stream same as -lossless).
-near_lossless  80: Very very high PSNR (around 54dB).
-near_lossless  60: Very high PSNR (around 48dB).
-near_lossless  40: High PSNR (around 42dB).
-near_lossless  20: Moderate PSNR (around 36dB).
-near_lossless   0: Low PSNR (around 30dB).

Change-Id: I930de4b18950faf2868c97d42e9e49ba0b642960
2015-02-05 11:17:14 -08:00
Vikas Arora
72831f6b28 Speedup AnalyzeAndInit for low effort compression.
AnalyzeSubtractGreen constitutes about 8-10% of the comression CPU cycles.
Statistically, subtract-green is proved to be useful for most of the
non-palette compression. So instead of evaluating the entropy (by calling
AnalyzeSubtractGreen) apply subtract-green transform for the low-effort
compression.

This changes speeds up the compression at m=0 by 8-10% (with very slight loss
of 0.07% in the compression density).

Change-Id: I9797dc39437ae089716acb14631bbc77d367acf4
2015-01-30 10:37:31 -08:00
Vikas Arora
a6597483af Speedup Analyze methods for lossless compression.
Speed up AnalyzeSubtractGreen by looping through the image pixel once to
compute the two histograms.

AnalyzeEntropy code cleanup.
Removed some 'if' conditions and pointer indirections inside pixel iterate loop.

Change-Id: Ia65e3033988ff67df8e3ecce19d6e34cfc76358e
2015-01-30 09:16:31 -08:00
Urvang Joshi
2db15a9583 Temporarily disable encoding of alpha plane with color cache.
This is to avoid triggering the related decoder bug.

Change-Id: I8fa074a5393bcd62aa4a2232cd4e02935e927a89
2015-01-28 15:28:02 -08:00
Vikas Arora
9814ddb601 Remove the post-transform near-lossless heuristic.
Remove the post-transform (prediction, subtract green & cross-color)
near-lossless heuristic, that's not ready yet and produces unacceptable visual
(banding) artifacts.

Change-Id: I9b606a790ce0344c588f2ef83a09c57ac19c2fc1
2015-01-26 14:19:00 -08:00
Pascal Massimino
f480d1a7ef Fix to near lossless artefacts on palettized images.
Don't rely on palette not being there before the palette colors are counted.

Change-Id: I988286675d3398f2da8f6d2fb6462db08af8028d
2015-01-22 17:47:50 +01:00
James Zern
f0e0677b87 VP8LEncodeStream: add an assert
check enc->argb_ to quiet an msvs /analyze warning:
C6387: 'enc->argb_+y*width' could be '0':  this does not adhere to the
specification for the function 'memcpy'.

Change-Id: I87544e92ee0d3ea38942a475c30c6d552f9877b7
2015-01-16 18:16:40 -08:00
Vikas Arora
ea08466d34 Tune BackwardReferencesLz77 for low_effort (m=0).
- Lower the threshold parameters for HashChainFindCopy.

For 1000 image PNG corpus (m=0), this change yields speedup of 15-20% at
lower quality range (0.25% drop in compression density) and about 10%
for higher quality range without any drop in the compression density.
Following is the compression stats (before/after) for method = 0:
         Before           After
         bpp/MPs          bpp/MPs
q=0      2.8615/18.000    2.8651/18.631
q=5      2.8615/18.216    2.8650/20.517
q=10     2.8572/18.070    2.8650/21.992
q=15     2.8519/18.371    2.8584/21.747
q=20     2.8454/18.975    2.8515/20.448
q=25     2.8230/8.531     2.8253/9.585
// Compression density remains same for q-range [30-100]
q=30     2.7310/7.706     2.7310/8.028
q=35     2.7253/6.855     2.7253/7.184
q=40     2.7231/6.364     2.7231/6.604
q=45     2.7216/5.844     2.7216/6.223
q=50     2.7196/5.210     2.7196/5.731
q=55     2.7208/4.766     2.7208/4.970
q=60     2.7195/4.495     2.7195/4.602
q=65     2.7185/4.024     2.7185/4.236
q=70     2.7174/3.699     2.7174/3.861
q=75     2.7164/3.449     2.7164/3.605
q=80     2.7161/3.222     2.7161/3.038
q=85     2.7153/2.919     2.7153/2.946
q=90     2.7145/2.766     2.7145/2.771
q=95     2.7124/2.548     2.7124/2.575
q=100    2.6873/2.253     2.6873/2.335

Change-Id: I0e17581fb71f6094032ad06c6203350bd502f9a1
2015-01-08 00:30:21 -08:00
Vikas Arora
b0b973c39b Speedup VP8LGetHistoImageSymbols for low effort (m=0) mode.
- Do light weight entropy based histogram combine and leave out CPU
  intensive stochastic and greedy heuristics for combining the
  histograms.

For 1000 image PNG corpus (m=0), this change yields speedup of 10% at
lower quality range (1% drop in compression density) and about 5% for
higher quality range (1% drop in compression density). Following is the
compression stats (before/after) for method = 0:
         Before           After
         bpp/MPs          bpp/MPs
q=0      2.8336/16.577    2.8615/18.000
q=5      2.8336/16.504    2.8615/18.216
q=10     2.8293/16.419    2.8572/18.070
q=15     2.8242/17.582    2.8519/18.371
q=20     2.8182/16.131    2.8454/18.975
q=25     2.7924/7.670     2.8230/8.531
q=30     2.7078/6.635     2.7310/7.706
q=35     2.7028/6.203     2.7253/6.855
q=40     2.7005/6.198     2.7231/6.364
q=45     2.6989/5.570     2.7216/5.844
q=50     2.6970/5.087     2.7196/5.210
q=55     2.6963/4.589     2.7208/4.766
q=60     2.6949/4.292     2.7195/4.495
q=65     2.6940/3.970     2.7185/4.024
q=70     2.6929/3.698     2.7174/3.699
q=75     2.6919/3.427     2.7164/3.449
q=80     2.6918/3.106     2.7161/3.222
q=85     2.6909/2.856     2.7153/2.919
q=90     2.6902/2.695     2.7145/2.766
q=95     2.6881/2.499     2.7124/2.548
q=100    2.6873/2.253     2.6873/2.285

Change-Id: I0567945068f8dc7888041e93d872f9def91f50ba
2015-01-08 00:29:57 -08:00
Pascal Massimino
24284459c7 replace unneeded calls to HistogramCopy() by swaps
most of the time, we don't need to actually move the
data.

Compression is randomly slightly different, because HistogramCompactBins() changed.
Timing is about the same.

Change-Id: Ia6af8e9780581014d6860f2b546189ac817cfad1
2014-12-17 15:32:36 +01:00
Pascal Massimino
31a9cf6417 Speedup WebP lossless compression for low effort (m=0) mode with following:
- Disable Cross-Color transform.
- Evaluate predictors #11 (paeth), #12 and #13 only.

Change-Id: I857264c85c61c3957d4fb45ae32d261d947c8bed
2014-12-17 11:52:11 +01:00
Vikas Arora
a0df55104e Remove handling for WEBP_HINT_GRAPH
Remove handling for WEBP_HINT_GRAPH w.r.t use_palette flag.

The WEBP_HINT_GRAPH is now used at one place, to set the initial size of the
Bit Writer as bpp for photo images are generally larger than the graphical
images.

Change-Id: I1b9c4436c85a8f69da74c0dbcd292397323f2696
2014-11-13 15:49:23 -08:00
Vikas Arora
95a9bd85c4 Updated VP8LGetBackwardReferences and color cache.
- The optimal cache bits is evaluated inside the method 'VP8LGetBackwardReferences'.
- The input cache_bits to 'VP8LGetBackwardReferences' sets the maximum cache
  bits to use (passing 0 implies disabling the local color cache).
- The local color cache is disabled for lowerf (<= 25) quality levels (as before).
- Enabled local color cache for palette images as well. This saves additional
  0.017% bytes with a slight (2-3%) improvement in the compression speed.
- Removed 'use_2d_locality' parameter from method VP8LGetBackwardReferences, as
  this option is not an option now (after we freeze the lossless bit-stream).

Change-Id: I33430401e465474fa1be899f330387cd2b466280
2014-11-06 13:14:05 -08:00
Vikas Arora
919220c7e6 Change the logic adjusting the Histogram bits.
Updated the logic to limit the Histogram size to a constant, instead of
computing the same based on the Histogram size (that's variable size based on
the cache bits) for the maximum possible cache bits. The actual cache bits may
be lower than the maximum.
Note: The constant 2600 is 16MB/Sizeof(HistogramSize(MAX_COLOR_CACHE_BITS)).

The compression density remains the same with this change, with little faster
compression speed.

Change-Id: I3149894962852e9dad2501b9aa16bb847a20fd86
2014-10-27 09:57:17 -07:00
Vikas Arora
e912bd55be Fix bug in VP8LCalculateEstimateForCacheSize.
The method VP8LCalculateEstimateForCacheSize is not evaluating the all possible
range for cache_bits.
Also added a small penality for choosing the larger cache-size. This is done to
strike a balance between additional memory/CPU cost (with larger cache-size) and
byte savings from smaller WebP lossless files.

This change saves about 0.07% bytes and speeds up compression by 8% (default
settings). There's small speedup at Q=50 along with byte savings as well.
Compression at Quality=25 is not effected by this change.

Change-Id: Id8f87dee6b5bccb2baa6dbdee479ee9cda8f4f77
2014-10-26 20:05:48 -07:00
Pascal Massimino
5b90d8fe42 Unify the API between VP8BitWriter and VP8LBitWriter
BitReader will be next...

Change-Id: Icd9e7ab2e3890131e664c0523627d9b8c5399a74
2014-10-23 15:35:16 +02:00
Pascal Massimino
6c6736816c Improved near-lossless mode.
Compared to previous mode it gives another 10-30% improvement in compression keeping comparable PSNR on corresponding quality settings.

Still protected by the WEBP_EXPERIMENTAL_FEATURES flag.

Change-Id: I4821815b9a508f4f38c98821acaddb74c73c60ac
2014-10-15 10:57:21 -07:00
James Zern
aca1b98f52 enc/vp8l.c: fix indent
reindent after ca00502

Change-Id: I8c88dbc11dc96c117531b17682b764a235ef23bb
2014-10-13 11:33:23 +02:00
Vikas Arora
ca00502788 Evaluate non-palette compression for palette image
Evaluate if for Palette images (num_colors <= 256), non-palette
compression path (Subtract green, predictor transform etc) yield an
optimal compression density.

This change reduces the WebP file (for palette images) size by 0.4% with
drop of 3-5% in compression speed.

Change-Id: I1ad66fa94db4fd7ba7bc215763791ef662cd4f42
2014-10-10 11:55:45 -07:00
Pascal Massimino
5f81391263 Merge "Fix return code of EncodeImageInternal()" 2014-10-07 23:49:29 -07:00
Pascal Massimino
e321abe43d Fix return code of EncodeImageInternal()
It was returning 'VP8_ENC_OK' in case of memory error.

Change-Id: I184a3e29c9f1b863637cacbe389b058d75c3dbf8
2014-10-08 08:48:53 +02:00
Pascal Massimino
f82cb06afb optimize palette ordering
We compact the palette by weighted distance, favoring the green channel.

Average gain on paletted file is ~0.5%, with gain up to 6-7% on some favorable cases.
Encoding speed is unaffected.

Disabled for alpha (or any single-channel input)

Also: always use quality=20 for EncodePalette() since it
doesn't make any real difference.

Change-Id: I19fb14316a366f139a941b45aef5663a33c905e1
2014-10-08 08:42:36 +02:00
Pascal Massimino
f545feee64 don't set the alpha value for histogram index image
This leads to tiny extra compression (~few bytes per file) for free

Change-Id: Ia4d8cef3de4365e32eacefd69a57689c80042a23
2014-10-08 08:24:19 +02:00
Vikas Arora
b901416b90 Record the lossless size stats.
Record and show the lossless header and image data sizes in the cwebp.

Change-Id: I08f19693cb7a756b6fdce5b55d71f5367b5f02fc
2014-09-17 15:16:05 -07:00
skal
77bf4410f7 make error-code reporting consistent upon malloc failure
Sometimes, the error-code was not set correctly.
We now return OUT_OF_MEMORY everytimes it's appropriate
(tested using MALLOC_FAIL_AT mechanism)

Took the opportunity to clean-up the code and dust the error
code returned (some were erroneously set to INVALID_CONFIGURATION)

Change-Id: I56f7331e2447557b3dd038e245daace4fc82214c
2014-06-13 08:45:12 +02:00
skal
ca3d746e39 use block-based allocation for backward refs storage, and free-lists
Non-photo source produce far less literal reference and their
buffer is usually much smaller than the picture size if its compresses
well. Hence, use a block-base allocation (and recycling) to avoid
pre-allocating a buffer with maximal size.

This can reduce memory consumption up to 50% for non-photographic
content. Encode speed is also a little better (1-2%)

Change-Id: Icbc229e1e5a08976348e600c8906beaa26954a11
2014-05-05 11:11:55 -07:00
skal
d3bcf72bf5 Don't allocate VP8LHashChain, but treat like automatic object
the unique instance of VP8LHashChain (1MB size corresponding to hash_to_first_index_)
is now wholy part of VP8LEncoder, instead of maintaining the pointer to VP8LHashChain
in the encoder.

Change-Id: Ib6fe52019fdd211fbbc78dc0ba731a4af0728677
2014-04-30 14:10:48 -07:00
Pascal Massimino
cf5eb8ad19 remove some uint64_t casts and use.
We use automatic int->uint64_t promotion where applicable.

(uint64_t should be kept only for overflow checking and memory alloc).

Change-Id: I1f41b0f73e2e6380e7d65cc15c1f730696862125
2014-04-29 09:08:25 -07:00
Pascal Massimino
b3a616b356 make HistogramAdd() a pointer in dsp
* merged the two HistogramAdd/AddEval() into a single call
  (with detection of special case when b==out)
* added a SSE2 variant
* harmonize the histogram type to 'uint32_t' instead
  of just 'int'. This has a lot of ripples on signatures.
* 1-2% faster

Change-Id: I10299ff300f36cdbca5a560df1ae4d4df149d306
2014-04-28 10:09:34 -07:00
Vikas Arora
0b896101b4 Reduce memory footprint for encoding WebP lossless.
Reduce calls to Malloc (WebPSafeMalloc/WebPSafeCalloc) for:
- Building HashChain data-structure used in creating the backward references.
- Creating Backward references for LZ77 or RLE coding.
- Creating Huffman tree for encoding the image.
For the above mentioned code-paths, allocate memory once and re-use it
subsequently.

Reduce the foorprint of VP8LHistogram struct by changing the Struct
field 'literal_' from an array of constant size to dynamically allocated
buffer based on the input parameter cache_bits.

Initialize BitWriter buffer corresponding to 16bpp (2*W*H).
There are some hard-files that are compressed at 12 bpp or more. The
realloc is costly and can be avoided for most of the WebP lossless
images by allocating some extra memory at the encoder initializaiton.

Change-Id: I1ea8cf60df727b8eb41547901f376c9a585e6095
2014-04-26 01:14:33 -07:00