- Do light weight entropy based histogram combine and leave out CPU
intensive stochastic and greedy heuristics for combining the
histograms.
For 1000 image PNG corpus (m=0), this change yields speedup of 10% at
lower quality range (1% drop in compression density) and about 5% for
higher quality range (1% drop in compression density). Following is the
compression stats (before/after) for method = 0:
Before After
bpp/MPs bpp/MPs
q=0 2.8336/16.577 2.8615/18.000
q=5 2.8336/16.504 2.8615/18.216
q=10 2.8293/16.419 2.8572/18.070
q=15 2.8242/17.582 2.8519/18.371
q=20 2.8182/16.131 2.8454/18.975
q=25 2.7924/7.670 2.8230/8.531
q=30 2.7078/6.635 2.7310/7.706
q=35 2.7028/6.203 2.7253/6.855
q=40 2.7005/6.198 2.7231/6.364
q=45 2.6989/5.570 2.7216/5.844
q=50 2.6970/5.087 2.7196/5.210
q=55 2.6963/4.589 2.7208/4.766
q=60 2.6949/4.292 2.7195/4.495
q=65 2.6940/3.970 2.7185/4.024
q=70 2.6929/3.698 2.7174/3.699
q=75 2.6919/3.427 2.7164/3.449
q=80 2.6918/3.106 2.7161/3.222
q=85 2.6909/2.856 2.7153/2.919
q=90 2.6902/2.695 2.7145/2.766
q=95 2.6881/2.499 2.7124/2.548
q=100 2.6873/2.253 2.6873/2.285
Change-Id: I0567945068f8dc7888041e93d872f9def91f50ba
most of the time, we don't need to actually move the
data.
Compression is randomly slightly different, because HistogramCompactBins() changed.
Timing is about the same.
Change-Id: Ia6af8e9780581014d6860f2b546189ac817cfad1
Move all the Entropy evaluation methods to lossless.c (from histogram.c).
There's slight difference in the way entropy is computed for evaluating
entropy in prediction methods and histogram (literal) for huffman trees.
Plan (later) to merge few (static) methods and reduce the code size.
This change has no impact on the compression speed/density.
Change-Id: Ife3d96a3c4a8d78a91723d9e0a8d1b78c0256a15
Don't combine the Histograms that have trivial (single valued A, R & B)
symbols.
Following is the compression savings data along with compression time (before
& after) per image.
Before After
bpp, rate(MP/s) bpp, rate(MP/s)
Q=25, method = 4 2.508, 1.807 2.499, 1.916
Q=50, method = 4 2.460, 1.488 2.456, 1.512
Q=75, method = 4 2.452, 1.078 2.450, 1.092
Q=25, method = 5 2.505, 1.398 2.496, 1.383
Q=50, method = 5 2.458, 1.170 2.453, 1.143
Q=75, method = 5 2.453, 0.886 2.450, 0.855
This change provides 0.1-0.4% compression gains and speeds up the lossless
compression for the default method=4 (the drop in compression speed is between 1-3.5% for method=5).
Change-Id: Idfd88c2092f37afacd26a97097b3053f8183953a
We use automatic int->uint64_t promotion where applicable.
(uint64_t should be kept only for overflow checking and memory alloc).
Change-Id: I1f41b0f73e2e6380e7d65cc15c1f730696862125
* merged the two HistogramAdd/AddEval() into a single call
(with detection of special case when b==out)
* added a SSE2 variant
* harmonize the histogram type to 'uint32_t' instead
of just 'int'. This has a lot of ripples on signatures.
* 1-2% faster
Change-Id: I10299ff300f36cdbca5a560df1ae4d4df149d306
Reduce calls to Malloc (WebPSafeMalloc/WebPSafeCalloc) for:
- Building HashChain data-structure used in creating the backward references.
- Creating Backward references for LZ77 or RLE coding.
- Creating Huffman tree for encoding the image.
For the above mentioned code-paths, allocate memory once and re-use it
subsequently.
Reduce the foorprint of VP8LHistogram struct by changing the Struct
field 'literal_' from an array of constant size to dynamically allocated
buffer based on the input parameter cache_bits.
Initialize BitWriter buffer corresponding to 16bpp (2*W*H).
There are some hard-files that are compressed at 12 bpp or more. The
realloc is costly and can be avoided for most of the WebP lossless
images by allocating some extra memory at the encoder initializaiton.
Change-Id: I1ea8cf60df727b8eb41547901f376c9a585e6095
This is to help further optimizations.
(like in https://gerrit.chromium.org/gerrit/#/c/69787/)
There's a small slowdown (~0.5% at -z 9 quality) due to
function pointer usage. Note that, for speed, it's important
to return VP8LStreaks by value, and not pass a pointer.
Change-Id: Id4167366765fb7fc5dff89c1fd75dee456737000
there's still some malloc/free in the external example
This is an encoder API change because of the introduction
of WebPMemoryWriterClear() for symmetry reasons.
The MemoryWriter object should probably go in examples/ instead
of being in the main lib, though.
mux_types.h stil contain some inlined free()/malloc() that are
harder to remove (we need to put them in the libwebputils lib
and make sure link is ok). Left as a TODO for now.
Also: WebPDecodeRGB*() function are still returning a pointer
that needs to be free()'d. We should call WebPSafeFree() on
these, but it means exposing the whole mechanism. TODO(later).
Change-Id: Iad2c9060f7fa6040e3ba489c8b07f4caadfab77b
Optimize and re-structured VP8LGetHistoImageSymbols method, by using the bin-hash
for merging the Histograms more efficiently, instead of the randomized
heuristic of existing method HistogramCombine.
This change speeds up the Lossless encoding by 40-50% (for method=4 and Q > 50)
with 0.8% penalty in compression density. For lower method, the speed up is 25-30%,
with 0.4% penalty in the compression density.
Change-Id: If61adadb1a041b95def6405aa1fe3b83c3cb25ce
Refactor code for HistogramCombine and optimize the code by calculating
the combined entropy and avoid un-necessary Histogram merges.
This speeds up lossless encoding by 1-2% and almost no impact on compression
density.
Change-Id: Iedfcf4c1f3e88077bc77fc7b8c780c4cd5d6362b
rather than symlink the webm/vpx terms, use the same header as libvpx to
reference in-tree files
based on the discussion in:
https://codereview.chromium.org/12771026/
Change-Id: Ia3067ecddefaa7ee01550136e00f7b3f086d4af4
* merge cost calculation functions (BitsEntropy() and HuffmanCost())
* have HistogramAdd() specialized into separate functions
* use threshold to bail-out early
* revamp code a bit
* also: save memory by freeing free(histogram_image)
Change-Id: I8ee5d2cfa1462d5d6ea6361f5c89925a3720ef55
Order-by-cost mostly unchanged (up to a scaling constant 1/log(2))
(except for few minor diff in < 2% of cases)
+ remove unused field cost_mode->cache_bits_
Change-Id: I714f8ab12f49a23f5d499a64c741382c9b489a3e
Used when we don't code a Huffman Image.
-> Simplify the code quite some because we don't have to
deal with special cases of histo bits
Change-Id: I0c3f46cbf3b501e021c093e07253e7404c01ff4f
-> lot of simplifications ensue and we should be able to get rid of
ClearHuffmanTreeIfOnlyOneSymbol() too, in a subsequent patch.
Change-Id: Ic4c51d05e4b1970e37f94ffd85fae6a02e4a6422
VP8LHistogramSet is container for pointers to histograms that
we can shuffle around. Allocation is one big chunk of memory.
Downside is that we don't de-allocate memory on-the-go during
HistogramRefine().
+ renamed HistogramRefine() into HistogramRemap(), so we don't
confuse with "HistogramCombine"
+ made VP8LHistogramClear() static.
Change-Id: Idf1a748a871c3b942cca5c8050072ccd82c7511d
* de-inline some function
* make VP8LBackwardRefs be more like a vectorwith max capacity
* add bit_cost_ field to VP8LHistogram
* general code simplifications
* remove some memmov() from HistogramRefine
* simplify HistogramDistance()
...
Change-Id: I16904d9fa2380e1cf4a3fdddf56ed1fcadfa25dc
- VP8LEncAnalyze, EvalAndApplySubtractGreen, ApplyPredictFilter,
ApplyCrossColorFilter
- Added palette handling and transform buffer management in VP8LEncodeImage()
- Add Transforms (subtract Green, Predict, cross_color) to dsp/lossless.c.
These are more-or-less copied from src/lossless code.
After this Change, will implement the EncodeImageInternal() method.
Change-Id: Idf71f803c24b3b5ae3b5079b15e019721784611d