and only use it on x86 / x64 where it's available.
has the side-effect of quieting a msvs /analyze warning:
C6001: Using uninitialized memory 'cpu_info'.
Change-Id: Iae51be3b22b2ee949cfc473eeea9fd9fb6b3c2cb
Disable costly TraceBackwards heuristic for computing the backward references
for low_effort (method=0) compression.
The TraceBackwards heuristic is already disabled for lower (q < 25) quality
range. Following is the compression data for 1000 image corpus for q >= 25.
This speeds up compression (q >= 25) by a factor of 2.5-3X with slight loss of
compression density (0.7% for lower quality range and 1.2% for higher qualities).
Change-Id: I256c9e2137c7de4083f423ea32ee12d3b0f46253
added new function CollectColorRedTransforms to C, which calls
TransformColorRed and it is realized via pointer to function
Change-Id: Ia68d73bfcf1ca2cb443dc2825910946221f87835
explicitly add immintrin.h instead of transitively picking it up via
windows.h presumably. makes the code easier to move around.
Change-Id: If70d5143ac94fc331da763ce034358858e460e06
added new function CollectColorBlueTransforms to C, which calls
TransformColorBlue and it is realized via pointer to function
Change-Id: Ia488b7a7a689223b5d33aae9724afab89b97fced
reported in https://code.google.com/p/webp/issues/detail?id=237
An empty partition #0 should be indicative of a bitstream error.
The previous code was correct, only an assert was triggered in debug mode.
But we might as well handle the case properly right away...
Change-Id: I4dc31a46191fa9e65659c9a5bf5de9605e93f2f5
- Lower the threshold parameters for HashChainFindCopy.
For 1000 image PNG corpus (m=0), this change yields speedup of 15-20% at
lower quality range (0.25% drop in compression density) and about 10%
for higher quality range without any drop in the compression density.
Following is the compression stats (before/after) for method = 0:
Before After
bpp/MPs bpp/MPs
q=0 2.8615/18.000 2.8651/18.631
q=5 2.8615/18.216 2.8650/20.517
q=10 2.8572/18.070 2.8650/21.992
q=15 2.8519/18.371 2.8584/21.747
q=20 2.8454/18.975 2.8515/20.448
q=25 2.8230/8.531 2.8253/9.585
// Compression density remains same for q-range [30-100]
q=30 2.7310/7.706 2.7310/8.028
q=35 2.7253/6.855 2.7253/7.184
q=40 2.7231/6.364 2.7231/6.604
q=45 2.7216/5.844 2.7216/6.223
q=50 2.7196/5.210 2.7196/5.731
q=55 2.7208/4.766 2.7208/4.970
q=60 2.7195/4.495 2.7195/4.602
q=65 2.7185/4.024 2.7185/4.236
q=70 2.7174/3.699 2.7174/3.861
q=75 2.7164/3.449 2.7164/3.605
q=80 2.7161/3.222 2.7161/3.038
q=85 2.7153/2.919 2.7153/2.946
q=90 2.7145/2.766 2.7145/2.771
q=95 2.7124/2.548 2.7124/2.575
q=100 2.6873/2.253 2.6873/2.335
Change-Id: I0e17581fb71f6094032ad06c6203350bd502f9a1
- Do light weight entropy based histogram combine and leave out CPU
intensive stochastic and greedy heuristics for combining the
histograms.
For 1000 image PNG corpus (m=0), this change yields speedup of 10% at
lower quality range (1% drop in compression density) and about 5% for
higher quality range (1% drop in compression density). Following is the
compression stats (before/after) for method = 0:
Before After
bpp/MPs bpp/MPs
q=0 2.8336/16.577 2.8615/18.000
q=5 2.8336/16.504 2.8615/18.216
q=10 2.8293/16.419 2.8572/18.070
q=15 2.8242/17.582 2.8519/18.371
q=20 2.8182/16.131 2.8454/18.975
q=25 2.7924/7.670 2.8230/8.531
q=30 2.7078/6.635 2.7310/7.706
q=35 2.7028/6.203 2.7253/6.855
q=40 2.7005/6.198 2.7231/6.364
q=45 2.6989/5.570 2.7216/5.844
q=50 2.6970/5.087 2.7196/5.210
q=55 2.6963/4.589 2.7208/4.766
q=60 2.6949/4.292 2.7195/4.495
q=65 2.6940/3.970 2.7185/4.024
q=70 2.6929/3.698 2.7174/3.699
q=75 2.6919/3.427 2.7164/3.449
q=80 2.6918/3.106 2.7161/3.222
q=85 2.6909/2.856 2.7153/2.919
q=90 2.6902/2.695 2.7145/2.766
q=95 2.6881/2.499 2.7124/2.548
q=100 2.6873/2.253 2.6873/2.285
Change-Id: I0567945068f8dc7888041e93d872f9def91f50ba
As 'curr_canvas_mod' is being modified during calls to IncreaseTransparency()
and FlattenSimilarBlocks(), GetSubRect() should get the sub-frame from
'curr_canvas_mod' as well.
Earlier, GetSubRect() was computed from 'curr_canvas', so modifying
'curr_canvas_mod' had no effect on encoding.
Change-Id: Ia847503007b66364817fe57def5a9e3c37d1b3cc
We set the parameters even if there is just one frame. This is to make sure
assembly is correct even if single frame animation is NOT converted to a full
frame later.
Change-Id: If79e6aa5e2575cb0f3cd229f16c655b3663c35b0
Given that we decided to only handle frame disposal for output WebP
internally,
only current and previous canvas need to be maintained.
Change-Id: I625293bed5aeb5aabf4eca779f6ec3ee84c9ff2a
A separate API to generate animated WebP images.
It will eventually replace the internal gif2webp_util methods.
Also: update makefiles.
Change-Id: Idf61dfc1016c10b24fea70425d1a2323cffba515
we compare the current VP8GetCPUInfo pointer to the last used.
This is less code overall and each implementation is still
testable separately (by just changing VP8GetCPUInfo, but not
a separate threads!)
Change-Id: Ia13fa8ffc4561a884508f6ab71ed0d1b9f1ce59b
inline function MakeARGB32 calls changed to call
via pointers to functions which make (a)rgb for
entire row
Change-Id: Ia4bd4be171a46c1e1821e408b073ff5791c587a9
references to fragments remain, along with some superfluous checks; these
will be removed in a future commit.
Change-Id: I39fe9314900ecbc5d60e5065b65fa1b4c668af63
and apply Paeth predictor (predictor#11) for the low effort (m=0) mode.
For 1000 image PNG corpus (m=0), this change yields speedup of 25% at lower quality
range and about 10% for higher quality range.
Change-Id: I0f036b8ffe45c241e63a067cbf01527b13d8de93
most of the time, we don't need to actually move the
data.
Compression is randomly slightly different, because HistogramCompactBins() changed.
Timing is about the same.
Change-Id: Ia6af8e9780581014d6860f2b546189ac817cfad1
check for __apple_build_version__ to distinguish the two; a version
check could work as Apple bumped Xcode's to 5.x/6.x, but it's unclear
how upstream will deal with their versioning as they go 3.6+, so avoid
it for now.
Change-Id: I67cda67c4f68e262a92d805a63cc1496374be063
we don't need to store the whole distribution in order to compute the alpha
Later, we can incorporate the max_value / last_non_zero bookkeeping
in SSE2 directly.
Change-Id: I748ccea4ac17965d7afcab91845ef01be3aa3e15
this is a first step to unifying encoding/decoding cache stride
and possibly sharing the prediction functions in dsp/
With this layout, there's a little (~7%) space lost with unused samples.
But no speed change was observed.
Change-Id: I016df8cad41bde5088df3579e6ad65d884ee711e
~68% faster
reuses TM4() adding support for the additional rows, the columns were
already being done.
Change-Id: I6eac17e58cd1c636082bf7281f70f884ec399a6b
Move all the Entropy evaluation methods to lossless.c (from histogram.c).
There's slight difference in the way entropy is computed for evaluating
entropy in prediction methods and histogram (literal) for huffman trees.
Plan (later) to merge few (static) methods and reduce the code size.
This change has no impact on the compression speed/density.
Change-Id: Ife3d96a3c4a8d78a91723d9e0a8d1b78c0256a15
Remove handling for WEBP_HINT_GRAPH w.r.t use_palette flag.
The WEBP_HINT_GRAPH is now used at one place, to set the initial size of the
Bit Writer as bpp for photo images are generally larger than the graphical
images.
Change-Id: I1b9c4436c85a8f69da74c0dbcd292397323f2696
Update BackwardRefsWithLocalCache to do in-place update of backward
references w.r.t local color cache index.
No impact on the compression density or compression speed.
Change-Id: Ie066251464c3928c044e037b43df3af28b48ca30
histogram.c:
- Verified (earlier) that there's low correlation between Red & Blue colors
(particularly after applying Cross-color transform). The Bin based histogram
merge, bins on three entropies viz literal, red & blue symbols. Removing
either of blue or red increases the compression density. So keeping the bins
for red & blue sybmols.
- Keeping the compact bins method as-is. This way it's simpler to read.
huffman_encode.h: Added field comments for struct HuffmanTree and removed the TODO.
Change-Id: Ia76f7bc730079d1b3b644038c5d9931db3797f0e
Use the refs_lz77 computed (with cache_bits=0) in the method 'CalculateBestCacheSize'
to regenerate the LZ77 references corresponding to the optimum cache_bits and avoid
calling costly 'BackwardReferencesLz77' one extra time.
This change leaves the compression density unchanged and speeds up compression
by 10-15%.
Change-Id: I5a92e11788d3c3f656aa7e1fba54fb5d96ee0027