added new function CollectColorRedTransforms to C, which calls
TransformColorRed and it is realized via pointer to function
Change-Id: Ia68d73bfcf1ca2cb443dc2825910946221f87835
added new function CollectColorBlueTransforms to C, which calls
TransformColorBlue and it is realized via pointer to function
Change-Id: Ia488b7a7a689223b5d33aae9724afab89b97fced
Move all the Entropy evaluation methods to lossless.c (from histogram.c).
There's slight difference in the way entropy is computed for evaluating
entropy in prediction methods and histogram (literal) for huffman trees.
Plan (later) to merge few (static) methods and reduce the code size.
This change has no impact on the compression speed/density.
Change-Id: Ife3d96a3c4a8d78a91723d9e0a8d1b78c0256a15
* merged the two HistogramAdd/AddEval() into a single call
(with detection of special case when b==out)
* added a SSE2 variant
* harmonize the histogram type to 'uint32_t' instead
of just 'int'. This has a lot of ripples on signatures.
* 1-2% faster
Change-Id: I10299ff300f36cdbca5a560df1ae4d4df149d306
This is to help further optimizations.
(like in https://gerrit.chromium.org/gerrit/#/c/69787/)
There's a small slowdown (~0.5% at -z 9 quality) due to
function pointer usage. Note that, for speed, it's important
to return VP8LStreaks by value, and not pass a pointer.
Change-Id: Id4167366765fb7fc5dff89c1fd75dee456737000
Functions VP8LFastLog2Slow and VP8LFastSLog2Slow
also: replaced some "% y" by "& (y-1)" in the C-version
(since y is a power-of-two)
Change-Id: I875170384e3c333812ca42d6ce7278aecabd60f0
expose the predictor array as function pointers instead
of each individual sub-function
+ merged Average2() into ClampedAddSubtractHalf directly
+ unified the signature as "VP8LProcessBlueAndRedFunc"
no speed diff observed
Change-Id: Ic3c45dff11884a8330a9ad38c2c8e82491c6e044
Speedup lossless encoder by 20-25% by optimizing:
- GetBestColorTransformForTile: Use techniques like binary search and
local minima search to reduce the search space.
- VP8LFastSLog2Slow & VP8LFastLog2Slow: Adding the correction factor for
log(1 + x) and increase the threshold for calling the approximate
version of log_2 (compared to costly call to log()).
Change-Id: Ia2444c914521ac298492aafa458e617028fc2f9d
Also created variant VP8LPrefixEncodeBits that returns the
code & extra_bits only.
There's no impact on compression density and compression speed.
Change-Id: I2cafdd3438ac9270cd72ad9d57b383cdddfdfa4c
This speeds up WebP lossless decoding by 20%. In particular, the
photographic images get 35% speedup.
Change-Id: Idb94750342a140ec05df52c07e12be4bba335adc
rather than symlink the webm/vpx terms, use the same header as libvpx to
reference in-tree files
based on the discussion in:
https://codereview.chromium.org/12771026/
Change-Id: Ia3067ecddefaa7ee01550136e00f7b3f086d4af4
Earlier such images were using roughly 9 * width * height bytes for
decoding. Now, they take 6 * width * height memory.
Change-Id: Ie4a681ca5074d96d64f30b2597fafdca648dd8f7
larger values are still dealt with in the .cc
~5% faster encoding
Output size is slightly different (variably), because of
different floating-point calculation ordering.
Change-Id: I6ede18b09c753997cf78aa1199a807d9ddb5d4b4
Order-by-cost mostly unchanged (up to a scaling constant 1/log(2))
(except for few minor diff in < 2% of cases)
+ remove unused field cost_mode->cache_bits_
Change-Id: I714f8ab12f49a23f5d499a64c741382c9b489a3e
Profiled data: Profiled few images and found that in the function VP8LFastLog,
90% of time table lookup is performed, while rest of time (10%) call to log
function is made. Typical lookup accounts for 10 CPU instructions and call to
log 200 instruction counts. The weighted average comes out to be 30
instructions per call. For mid qualities (25-75), this function (VP8LFastLog)
accounts for 30-50% of total CPU cycles (via call path: VP8LCOlorSpaceTransform
-> PredictionCostCrossColor -> ShannonEntropy). After this change, the log is
called less that 1% of time, with average instructions being 15 per call.
Measured the performance over 1000 files for various qualities and found
overall compression speedup between 10-15% (in quality range [0, 75]). The
compression density loss is around 0.5% (though at some qualities, compression
is little better as well).
Change-Id: I247bc6a8d4351819c871f19d65455dc23aea8650
so that it uses original values of left, top etc for prediction rather than the
predicted values of the same. Also, do some renaming in the same to make it
more readable.
Change-Id: I2fe94e35a6700bd437f5c601e2af12323bf32445
- VP8LEncAnalyze, EvalAndApplySubtractGreen, ApplyPredictFilter,
ApplyCrossColorFilter
- Added palette handling and transform buffer management in VP8LEncodeImage()
- Add Transforms (subtract Green, Predict, cross_color) to dsp/lossless.c.
These are more-or-less copied from src/lossless code.
After this Change, will implement the EncodeImageInternal() method.
Change-Id: Idf71f803c24b3b5ae3b5079b15e019721784611d
Pulled from the current HEAD (218c32e).
The history of this and related files is a bit entangled so rather
trying to split the changes and introduce some noise in master's history
we'll start with a fresh snapshot.
The file progression is still available in the experimental branch.
Change-Id: I40538799dbf999abb9408ac83f55b897d8e22498