Return key/index if the query is found, and -1 otherwise.
The benefit of this is to save a hashing computation.
Change-Id: Iff056be330f5fb8204011259ac814f7677dd40fe
This is essentially a revert of a3611513d2bf465fd282d9dc45b3f72c79c232ad
and cfbcc5ece022fc74ae9b987e05c2807df0d82ec5.
Here is what happened: there was a corruption bug that eventually
got fixed by 0174d18d8b51f6c9228c70066a987c30a8132995.
But before finding the root, a3611513d2bf465fd282d9dc45b3f72c79c232ad
and cfbcc5ece022fc74ae9b987e05c2807df0d82ec5 hid the bug
by not imposing length of 1 when it was actually 2 or 3 (which does help
compression as a litteral is more efficient than an offset and a length
of size 2 or 3).
Change-Id: I6f18fc1f583a51ac9d8aab2508458264047cd493
The optimization for (len != MIN_LENGTH) actually only holds for
(len > MIN_LENGTH) but (len < MIN_LENGTH) can now happen as len can
be changed in the loop before.
Change-Id: I3f9f91a540206c80385c5fba96c3d64ab9536752
This is getting back to the old behavior which is actually better for
compression and speed with the latest patches.
Change-Id: I35884bab02589297c25d6e1e66dc5f13e05f7aa7
In case where the same offset is found in consecutive pixels,
the cost computation from one pixel can be re-used for the next.
Change-Id: Ic03c7d4ab95f3612eafc703349cfefd75273c3d7
and also recycle the malloc'd intervals
This avoids quite some malloc/free cycles during interval managment.
Change-Id: Ic2892e7c0260d0fca0e455d4728f261fb4c3800e
In a lot of cases, only one interval is used. This can cause
a lot of malloc/free cycles for only 56 bytes. By caching this
single interval and re-using it, we remove this cycle in most
frequent cases.
Change-Id: Ia22d583f60ae438c216612062316b20ecb34f029
In some cases, the hash chain for a function is filled several
times:
- GetBackwardReferences -> CalculateBestCacheSize ->
BackwardReferencesLz77 that computes the hash chain
- GetBackwardReferences ->
(not always) BackwardReferencesTraceBackwards ->
BackwardReferencesHashChainDistanceOnly that computes the hash
chain in a slightly different way
Speed and compression performance are slightly changed (+ or -)
but will be homogneized in a later patch.
Change-Id: I43f0ecc7a9312c2ed6cdba1c0fabc6c5ad91c953
Instead of comparing all the following pixels over len (which can
frequently reach the maximum MAX_LENGTH=4096 for some images),
intervals are stored and compared.
Change-Id: I0dafef6cc988dde3c1c03ae07305ac48901d60ee
The 32-bit buffers are actually rarely 64-bit aligned.
The new solution uses memcmp and is alignment agnostic.
It is also slightly faster.
Change-Id: I863003e9ee4ee8a3eed25b7b2478cb82a0ddbb20
Arrays were compared 32 bits at a time, it is now done 64 bits at a time.
Overall encoding speed-up is only of 0.2% on @skal's small PNG corpus.
It is of 3% on my initial 1.3 Mp desktop screenshot image.
Change-Id: I1acb32b437397a7bf3dcffbecbcd4b06d29c05e1
This makes the chains more efficient and a larger variety of data is tested.
0.02 % compression gain at q 100, 0.05 % at default quality. 0.8 % speedup by
callgrind.
0.16 % compression gain for lossy alpha ?!
Change-Id: I888120133352799eb14f5f602c7f40ab404bd665
a total impact of 1 % on encoding speed
This allows for performance neutral removal of the binary search
in cache bits selection. This will give a small improvement in
compression density.
Change-Id: If5d4d59460fa1924ce71af977320834a47c2054a
0.21 % compression density improvement for 1000 png corpus in
lossless mode
0.50 % compression density improvement for 1000 png corpus in
lossy mode
Change-Id: I14ee8c427ae5d3e116b0ee6695fcdea3321a319d
do not do length 2 matches far away
speedup for non compressible data by inserting two literals at a time
when no matches are found
Change-Id: Ia8e033071f4186bb8148bb2bf13ca37586734aa3
Speed-wise equivalent on x86 and ARM (maybe a tad faster, hard to tell).
Note that the two 32-bit multiples are not strictly equivalent
to the 64-bit one, since we're missing one carry propagation.
In practice, no observable difference was seen because of this
slightly different hashing result.
Change-Id: I8f2381175eae1cb20dabf149e6b27e1768fba6ab
Simplify and speedup backward references for low-effort settings by evaluating
LZ77 references only. This change speeds up compression by 10-25% at lower
(q <= 25) quality range with a slight drop (0.2%) in the compression density.
Change-Id: Ibd6f03b1a062d8ab9191786c2a425e9132e4779f
Disable costly TraceBackwards heuristic for computing the backward references
for low_effort (method=0) compression.
The TraceBackwards heuristic is already disabled for lower (q < 25) quality
range. Following is the compression data for 1000 image corpus for q >= 25.
This speeds up compression (q >= 25) by a factor of 2.5-3X with slight loss of
compression density (0.7% for lower quality range and 1.2% for higher qualities).
Change-Id: I256c9e2137c7de4083f423ea32ee12d3b0f46253
- Lower the threshold parameters for HashChainFindCopy.
For 1000 image PNG corpus (m=0), this change yields speedup of 15-20% at
lower quality range (0.25% drop in compression density) and about 10%
for higher quality range without any drop in the compression density.
Following is the compression stats (before/after) for method = 0:
Before After
bpp/MPs bpp/MPs
q=0 2.8615/18.000 2.8651/18.631
q=5 2.8615/18.216 2.8650/20.517
q=10 2.8572/18.070 2.8650/21.992
q=15 2.8519/18.371 2.8584/21.747
q=20 2.8454/18.975 2.8515/20.448
q=25 2.8230/8.531 2.8253/9.585
// Compression density remains same for q-range [30-100]
q=30 2.7310/7.706 2.7310/8.028
q=35 2.7253/6.855 2.7253/7.184
q=40 2.7231/6.364 2.7231/6.604
q=45 2.7216/5.844 2.7216/6.223
q=50 2.7196/5.210 2.7196/5.731
q=55 2.7208/4.766 2.7208/4.970
q=60 2.7195/4.495 2.7195/4.602
q=65 2.7185/4.024 2.7185/4.236
q=70 2.7174/3.699 2.7174/3.861
q=75 2.7164/3.449 2.7164/3.605
q=80 2.7161/3.222 2.7161/3.038
q=85 2.7153/2.919 2.7153/2.946
q=90 2.7145/2.766 2.7145/2.771
q=95 2.7124/2.548 2.7124/2.575
q=100 2.6873/2.253 2.6873/2.335
Change-Id: I0e17581fb71f6094032ad06c6203350bd502f9a1
Update BackwardRefsWithLocalCache to do in-place update of backward
references w.r.t local color cache index.
No impact on the compression density or compression speed.
Change-Id: Ie066251464c3928c044e037b43df3af28b48ca30
Use the refs_lz77 computed (with cache_bits=0) in the method 'CalculateBestCacheSize'
to regenerate the LZ77 references corresponding to the optimum cache_bits and avoid
calling costly 'BackwardReferencesLz77' one extra time.
This change leaves the compression density unchanged and speeds up compression
by 10-15%.
Change-Id: I5a92e11788d3c3f656aa7e1fba54fb5d96ee0027
- The optimal cache bits is evaluated inside the method 'VP8LGetBackwardReferences'.
- The input cache_bits to 'VP8LGetBackwardReferences' sets the maximum cache
bits to use (passing 0 implies disabling the local color cache).
- The local color cache is disabled for lowerf (<= 25) quality levels (as before).
- Enabled local color cache for palette images as well. This saves additional
0.017% bytes with a slight (2-3%) improvement in the compression speed.
- Removed 'use_2d_locality' parameter from method VP8LGetBackwardReferences, as
this option is not an option now (after we freeze the lossless bit-stream).
Change-Id: I33430401e465474fa1be899f330387cd2b466280