Commit Graph

616 Commits

Author SHA1 Message Date
Pascal Massimino
7fa67c9b9e change GetPixPairHash64() return type to uint32_t
Change-Id: Ibb61c1631d7a4bcda5417b5a85864d5e2c3f3858
2015-04-16 00:55:25 -07:00
Pascal Massimino
7073bfb3ee Merge "split 64-mult hashing into two 32-bit multiplies" 2015-04-15 23:04:47 -07:00
Pascal Massimino
7fe357b8c0 split 64-mult hashing into two 32-bit multiplies
Speed-wise equivalent on x86 and ARM (maybe a tad faster, hard to tell).

Note that the two 32-bit multiples are not strictly equivalent
to the 64-bit one, since we're missing one carry propagation.
In practice, no observable difference was seen because of this
slightly different hashing result.

Change-Id: I8f2381175eae1cb20dabf149e6b27e1768fba6ab
2015-04-15 17:45:19 +02:00
Pascal Massimino
6121413415 remove VP8Residual::cost unused field
Change-Id: Id494475b05c540b40fd104594acbcaa783b88d77
2015-04-15 01:56:31 -07:00
James Zern
db12250fd1 cosmetics: vp8enci.h: break long line
Change-Id: Ib7c7ef6171506e826ed5f7df20c5644f240fd645
2015-04-06 16:11:02 -07:00
Pascal Massimino
82d980209b add a dec/common.h header to collect common enc/dec #defines
had to rename few structs.

-> we can now include both vp8i.h and vp8enci.h without naming
conflicts.

Change-Id: Ib41b498f1b57aab3d6b796361afc45210ec75174
2015-03-31 22:17:58 -07:00
James Zern
553051f741 dsp/lossless: split enc/dec functions
adds lossless_enc*.c; reduces the size of the decode-only so: ~78K
w/gcc-4.8.2 on x86_64.

Change-Id: If5e4610b67d05eba5896bc64bab79e9df92b2092
2015-03-23 22:57:50 -07:00
James Zern
92a5da9c8c sync versions with 0.4.3
libwebp{,decoder} - 0.4.3
libwebp libtool - 5.3.0
libwebpdecoder libtool - 1.3.0

mux/demux - 0.2.2 (unchanged)
libtool - 1.2.0 (unchanged)

(cherry picked from commit bd852f5d81)

Change-Id: Ie8c35ffc20c1bfd782bdafd99da6c6b1373022c1
2015-03-11 17:29:23 -07:00
Pascal Massimino
b1bdbbabfb ~30% faster smart-yuv (-pre 4) with early-out criterion
we look at average global improvement and stop when things are
moving slow, or when we had a quite good first iteration already
(means: the picture is "not difficult")

Change-Id: I8ab7d100353039b5b32bb5fac3fe03c8440c78d5
2015-03-11 00:42:12 -07:00
Pascal Massimino
44bd95612e fix signature for VP8RecordCoeffTokens()
Change-Id: Ia2fe764b7280931335237ced8190604129fae565
2015-03-02 23:38:20 -08:00
Pascal Massimino
c9b8ea0eef small cosmetics on TokenBuffer.
Change-Id: I7c33651ed8e3a151aef44247db5fb1e8bf41f8ba
2015-03-03 00:48:28 +01:00
Vikas Arora
ef98750027 Speedup method StoreImageToBitMask by 5%.
Speedup method StoreImageToBitMask by replacing the code to find histogram
index and Huffman tree codes at every iteration to a more optimal code that
updates these only when the current pixel (to write) crosses the histogram
tile-row boundary.

This change speeds up the StoreImageToBitMask method by 5%.

Change-Id: If01a1ccd7820f9a3a3e5bc449d070defa51be14b
2015-02-20 09:46:19 -08:00
Pascal Massimino
2382050748 1-2% faster encoding by removing an indirection in GetResidualCost()
The MIPS code for cost is not updated yet, that's why i keep Residual::*cost
around for now. Should be removed in favor of *costs later.

Change-Id: Id1d09a8c37ea8c5b34ad5eb8811d6a3ec6c4d89f
2015-02-19 08:44:35 +01:00
James Zern
9bc0f922aa ApplyFiltersAndEncode: only copy lossless stats
this avoids a race with multi-threaded lossy + alpha compression

Change-Id: Ie437105f5a899ed28b9c8885b6ca5431092ce8f5
2015-02-12 19:44:25 -08:00
James Zern
e15560107c move some cost tables from enc/ to dsp/
removes circular dependency between dsp and enc.

since:
a987fae MIPS: dspr2: added optimization for function GetResidualCost

Change-Id: Ifeb8fc02de89e2ba982ed7ffacd925d649bfec3c
2015-02-11 16:10:06 -08:00
pascal massimino
c3a031686a Merge "picture_csp: fix build w/USE_GAMMA_COMPRESSION undefined" 2015-02-10 00:09:03 -08:00
James Zern
1dd419ced5 picture_csp: fix build w/USE_GAMMA_COMPRESSION undefined
kGammaFix is now only defined with USE_GAMMA_COMPRESSION;

fixes:
use of undeclared identifier 'kGammaFix'

Change-Id: Ib1e2f410eff9b83be065894f88181f91dd2776e1
2015-02-09 23:57:14 -08:00
James Zern
0ec4da960d picture_csp::InitGammaTables*: add missing TSan annotations
Change-Id: I66ca5b3e7b1614f861a9b68bd437f58b24cb1ebb
2015-02-09 23:44:47 -08:00
Pascal Massimino
a987faedfa MIPS: dspr2: added optimization for function GetResidualCost
set/get residual C functions moved to new file in src/dsp
mips32 version of GetResidualCost moved to new file

Change-Id: I7cebb7933a89820ff28c187249a9181f281081d2
2015-02-07 02:13:26 -08:00
James Zern
3b77e5a735 VP8TBufferClear: remove some misleading const's
the input to the function is non-const and the pointer being operated is
being free'd; removes an unnecessary cast in the process

Change-Id: Ic515ed672ddf7f8e4e36eeac696ff7aa8a3652f7
2015-02-05 23:56:26 -08:00
James Zern
aa139c8f1a VP8EmitTokens: remove unnecessary param void cast
'final_pass' is used within the function

Change-Id: I81be1a6e18cafaa6ae685ed8ad2b107fa7ed29cf
2015-02-05 23:56:26 -08:00
Vikas Arora
4c82284d2e Updated the near-lossless level mapping.
Updated the near-lossless level mapping and make it correlated to lossy
quality i.e 100 => minimum loss (in-fact no-loss) and the visual-quality loss
increases with decrease in near-lossless level (quality) till value 0.

The new mapping implies following (PSNR) loss-metric:
-near_lossless 100: No-loss (bit-stream same as -lossless).
-near_lossless  80: Very very high PSNR (around 54dB).
-near_lossless  60: Very high PSNR (around 48dB).
-near_lossless  40: High PSNR (around 42dB).
-near_lossless  20: Moderate PSNR (around 36dB).
-near_lossless   0: Low PSNR (around 30dB).

Change-Id: I930de4b18950faf2868c97d42e9e49ba0b642960
2015-02-05 11:17:14 -08:00
James Zern
c86b40cca0 enc/near_lossless.c: fix alignment
Change-Id: Ifd1b1b88c375abf655d94e2ba7d52087110294a5
2015-02-02 19:35:12 -08:00
Vikas Arora
72831f6b28 Speedup AnalyzeAndInit for low effort compression.
AnalyzeSubtractGreen constitutes about 8-10% of the comression CPU cycles.
Statistically, subtract-green is proved to be useful for most of the
non-palette compression. So instead of evaluating the entropy (by calling
AnalyzeSubtractGreen) apply subtract-green transform for the low-effort
compression.

This changes speeds up the compression at m=0 by 8-10% (with very slight loss
of 0.07% in the compression density).

Change-Id: I9797dc39437ae089716acb14631bbc77d367acf4
2015-01-30 10:37:31 -08:00
Vikas Arora
a6597483af Speedup Analyze methods for lossless compression.
Speed up AnalyzeSubtractGreen by looping through the image pixel once to
compute the two histograms.

AnalyzeEntropy code cleanup.
Removed some 'if' conditions and pointer indirections inside pixel iterate loop.

Change-Id: Ia65e3033988ff67df8e3ecce19d6e34cfc76358e
2015-01-30 09:16:31 -08:00
Vikas Arora
98c8138663 Enable Near-lossless feature.
Enable the WebP near-lossless feature by pre-processing the image to smoothen
the pixels.

On a 1000 PNG image corpus, for which WebP lossless (default settings) gets
25% compression gains, following is the performance of near-lossless feature
at various '-near_lossless' levels:
-near_lossless 90: 30% (very very high PSNR 54-60dB)
-near_lossless 75: 38% (very high PSNR 48-54dB)
-near_lossless 50: 45% (high PSNR 42-48dB)
-near_lossless 25: 48% (moderate PSNR 36-42dB)
-near_lossless 10: 50% (PSNR 30-36dB)

WebP near-lossless is specifically useful for discrete-tone images like
line-art, icons etc.

Change-Id: I7d12a2c9362ccd076d09710ea05c85fa64664c38
2015-01-29 16:10:20 -08:00
Urvang Joshi
2db15a9583 Temporarily disable encoding of alpha plane with color cache.
This is to avoid triggering the related decoder bug.

Change-Id: I8fa074a5393bcd62aa4a2232cd4e02935e927a89
2015-01-28 15:28:02 -08:00
James Zern
cafa1d882f Merge "Simplify backward refs calculation for low-effort." 2015-01-27 23:32:21 -08:00
Pascal Massimino
7afdaf8496 Alpha coding: reorganize the filter/unfiltering code
Move the filtering code to their own dsp/ spot
New function: VP8FiltersInit()

Change-Id: I0b2041eab42346c59b972f2575b05509e6a8f7b1
2015-01-28 08:02:41 +01:00
Vikas Arora
4d6d7285b0 Simplify backward refs calculation for low-effort.
Simplify and speedup backward references for low-effort settings by evaluating
LZ77 references only. This change speeds up compression by 10-25% at lower
(q <= 25) quality range with a slight drop (0.2%) in the compression density.

Change-Id: Ibd6f03b1a062d8ab9191786c2a425e9132e4779f
2015-01-27 09:36:14 -08:00
Vikas Arora
ec0d1be577 Cleaup Near-lossless code.
Cleaup Near-lossless code
- Simplified and refactored the code.
- Removed the requirement (TODO) to allocate the buffer of size WxH and work
  with buffer of size 3xW.
- Disabled the Near-lossless prr-processing for small icon images (W < 64 and H < 64).

Change-Id: Id7ee90c90622368d5528de4dd14fd5ead593bb1b
2015-01-26 15:29:59 -08:00
Vikas Arora
9814ddb601 Remove the post-transform near-lossless heuristic.
Remove the post-transform (prediction, subtract green & cross-color)
near-lossless heuristic, that's not ready yet and produces unacceptable visual
(banding) artifacts.

Change-Id: I9b606a790ce0344c588f2ef83a09c57ac19c2fc1
2015-01-26 14:19:00 -08:00
Pascal Massimino
0f027a72bf simplify smart RGB->YUV conversion code
* use the same TFIX == YFIX precision (2bits)
* use int instead of float in LinearToGammaF()

output is visually equivalent. Code is a little faster.

Change-Id: Ie3cfebca351dbcbd924b3d00801d6523dca6981f
2015-01-23 14:42:32 +01:00
Pascal Massimino
0d5b334ee8 BackwardReferencesHashChainFollowChosenPath: remove unused variable
Change-Id: I8dc4622dbacca03a7876f8856a0db5b9b9ec2fbd
2015-01-22 23:22:58 -08:00
Pascal Massimino
f480d1a7ef Fix to near lossless artefacts on palettized images.
Don't rely on palette not being there before the palette colors are counted.

Change-Id: I988286675d3398f2da8f6d2fb6462db08af8028d
2015-01-22 17:47:50 +01:00
Pascal Massimino
cb4a18a7ba rename HashChainInit into HashChainReset
this avoids the confusion with "VP8LHashChainInit"

Change-Id: Ia1686828c138729e5bda3cc5c8246d99c80915ef
2015-01-20 00:38:07 -08:00
Pascal Massimino
f079e487ae use uint16_t for chosen_path[]
len is MAX_LENGTH (4096) at max. This reduce memory for path by a half.

Change-Id: I399fda4093d93b1e9d956397b7b210956c5b948f
2015-01-20 00:34:09 -08:00
James Zern
f0e0677b87 VP8LEncodeStream: add an assert
check enc->argb_ to quiet an msvs /analyze warning:
C6387: 'enc->argb_+y*width' could be '0':  this does not adhere to the
specification for the function 'memcpy'.

Change-Id: I87544e92ee0d3ea38942a475c30c6d552f9877b7
2015-01-16 18:16:40 -08:00
Vikas Arora
b9e356b998 Disable costly TraceBackwards for method=0.
Disable costly TraceBackwards heuristic for computing the backward references
for low_effort (method=0) compression.

The TraceBackwards heuristic is already disabled for lower (q < 25) quality
range. Following is the compression data for 1000 image corpus for q >= 25.

This speeds up compression (q >= 25) by a factor of 2.5-3X with slight loss of
compression density (0.7% for lower quality range and 1.2% for higher qualities).

Change-Id: I256c9e2137c7de4083f423ea32ee12d3b0f46253
2015-01-15 09:01:40 -08:00
Vikas Arora
ea08466d34 Tune BackwardReferencesLz77 for low_effort (m=0).
- Lower the threshold parameters for HashChainFindCopy.

For 1000 image PNG corpus (m=0), this change yields speedup of 15-20% at
lower quality range (0.25% drop in compression density) and about 10%
for higher quality range without any drop in the compression density.
Following is the compression stats (before/after) for method = 0:
         Before           After
         bpp/MPs          bpp/MPs
q=0      2.8615/18.000    2.8651/18.631
q=5      2.8615/18.216    2.8650/20.517
q=10     2.8572/18.070    2.8650/21.992
q=15     2.8519/18.371    2.8584/21.747
q=20     2.8454/18.975    2.8515/20.448
q=25     2.8230/8.531     2.8253/9.585
// Compression density remains same for q-range [30-100]
q=30     2.7310/7.706     2.7310/8.028
q=35     2.7253/6.855     2.7253/7.184
q=40     2.7231/6.364     2.7231/6.604
q=45     2.7216/5.844     2.7216/6.223
q=50     2.7196/5.210     2.7196/5.731
q=55     2.7208/4.766     2.7208/4.970
q=60     2.7195/4.495     2.7195/4.602
q=65     2.7185/4.024     2.7185/4.236
q=70     2.7174/3.699     2.7174/3.861
q=75     2.7164/3.449     2.7164/3.605
q=80     2.7161/3.222     2.7161/3.038
q=85     2.7153/2.919     2.7153/2.946
q=90     2.7145/2.766     2.7145/2.771
q=95     2.7124/2.548     2.7124/2.575
q=100    2.6873/2.253     2.6873/2.335

Change-Id: I0e17581fb71f6094032ad06c6203350bd502f9a1
2015-01-08 00:30:21 -08:00
Vikas Arora
b0b973c39b Speedup VP8LGetHistoImageSymbols for low effort (m=0) mode.
- Do light weight entropy based histogram combine and leave out CPU
  intensive stochastic and greedy heuristics for combining the
  histograms.

For 1000 image PNG corpus (m=0), this change yields speedup of 10% at
lower quality range (1% drop in compression density) and about 5% for
higher quality range (1% drop in compression density). Following is the
compression stats (before/after) for method = 0:
         Before           After
         bpp/MPs          bpp/MPs
q=0      2.8336/16.577    2.8615/18.000
q=5      2.8336/16.504    2.8615/18.216
q=10     2.8293/16.419    2.8572/18.070
q=15     2.8242/17.582    2.8519/18.371
q=20     2.8182/16.131    2.8454/18.975
q=25     2.7924/7.670     2.8230/8.531
q=30     2.7078/6.635     2.7310/7.706
q=35     2.7028/6.203     2.7253/6.855
q=40     2.7005/6.198     2.7231/6.364
q=45     2.6989/5.570     2.7216/5.844
q=50     2.6970/5.087     2.7196/5.210
q=55     2.6963/4.589     2.7208/4.766
q=60     2.6949/4.292     2.7195/4.495
q=65     2.6940/3.970     2.7185/4.024
q=70     2.6929/3.698     2.7174/3.699
q=75     2.6919/3.427     2.7164/3.449
q=80     2.6918/3.106     2.7161/3.222
q=85     2.6909/2.856     2.7153/2.919
q=90     2.6902/2.695     2.7145/2.766
q=95     2.6881/2.499     2.7124/2.548
q=100    2.6873/2.253     2.6873/2.285

Change-Id: I0567945068f8dc7888041e93d872f9def91f50ba
2015-01-08 00:29:57 -08:00
Pascal Massimino
72d573f693 simplify the PackARGB signature
Change-Id: I51570e362126b2681f93211a4f59a3fedb5fd4b5
2015-01-05 02:10:04 -08:00
Pascal Massimino
1f4b8642e8 move VP8EncDspARGBInit() call closer to where it's needed
Change-Id: I0d5121b456918f0ee6646903a8d71d4384deafe3
2014-12-23 16:04:14 +01:00
Djordje Pesut
7ce8788b06 MIPS: dspr2: added optimization for function MakeARGB32
inline function MakeARGB32 calls changed to call
via pointers to functions which make (a)rgb for
entire row

Change-Id: Ia4bd4be171a46c1e1821e408b073ff5791c587a9
2014-12-22 12:31:36 +01:00
Pascal Massimino
24284459c7 replace unneeded calls to HistogramCopy() by swaps
most of the time, we don't need to actually move the
data.

Compression is randomly slightly different, because HistogramCompactBins() changed.
Timing is about the same.

Change-Id: Ia6af8e9780581014d6860f2b546189ac817cfad1
2014-12-17 15:32:36 +01:00
Pascal Massimino
31a9cf6417 Speedup WebP lossless compression for low effort (m=0) mode with following:
- Disable Cross-Color transform.
- Evaluate predictors #11 (paeth), #12 and #13 only.

Change-Id: I857264c85c61c3957d4fb45ae32d261d947c8bed
2014-12-17 11:52:11 +01:00
Pascal Massimino
bad775715a simplify the Histogram struct, to only store max_value and last_nz
we don't need to store the whole distribution in order to compute the alpha

Later, we can incorporate the max_value / last_non_zero bookkeeping
in SSE2 directly.

Change-Id: I748ccea4ac17965d7afcab91845ef01be3aa3e15
2014-12-10 10:44:57 +01:00
James Zern
9475bef4d7 PickBestUV: fix VP8Copy16x8 invocation
param order is src, dst

broken in:
66ad372 factorize BPS definition in dsp.h and add VP8Copy16x8

Change-Id: I761f618e3fe31ae7f58953256381f4f16bdb238e
2014-12-04 23:12:30 -08:00
Pascal Massimino
66ad372500 factorize BPS definition in dsp.h and add VP8Copy16x8
Change-Id: Id73a1e968c96455808755df4d131d74e3e2e135d
2014-12-04 13:45:14 +01:00
Pascal Massimino
57606047ec encoder: switch BPS to 32 instead of 16
this is a first step to unifying encoding/decoding cache stride
and possibly sharing the prediction functions in dsp/

With this layout, there's a little (~7%) space lost with unused samples.
But no speed change was observed.

Change-Id: I016df8cad41bde5088df3579e6ad65d884ee711e
2014-12-04 09:17:18 +01:00
James Zern
c6d0f9e758 histogram: cosmetics
fix indent + other minor spelling / whitespace changes

Change-Id: I6e4462b75c98994e3c53c115de07047dbe71ce3c
2014-11-25 15:53:19 -08:00
Vikas Arora
e0c809ad23 Move Entropy methods to lossless.c
Move all the Entropy evaluation methods to lossless.c (from histogram.c).
There's slight difference in the way entropy is computed for evaluating
entropy in prediction methods and histogram (literal) for huffman trees.
Plan (later) to merge few (static) methods and reduce the code size.

This change has no impact on the compression speed/density.

Change-Id: Ife3d96a3c4a8d78a91723d9e0a8d1b78c0256a15
2014-11-20 13:48:05 -08:00
Vikas Arora
a0df55104e Remove handling for WEBP_HINT_GRAPH
Remove handling for WEBP_HINT_GRAPH w.r.t use_palette flag.

The WEBP_HINT_GRAPH is now used at one place, to set the initial size of the
Bit Writer as bpp for photo images are generally larger than the graphical
images.

Change-Id: I1b9c4436c85a8f69da74c0dbcd292397323f2696
2014-11-13 15:49:23 -08:00
Vikas Arora
413dfc0c4b Move static method definition before its usage.
Change-Id: Id766c2bea92e7ebf0de65046f73429b74b4fdda4
2014-11-13 13:18:30 -08:00
Vikas Arora
0f23566558 Update BackwardRefsWithLocalCache.
Update BackwardRefsWithLocalCache to do in-place update of backward
references w.r.t local color cache index.

No impact on the compression density or compression speed.

Change-Id: Ie066251464c3928c044e037b43df3af28b48ca30
2014-11-13 11:54:26 -08:00
Vikas Arora
d69e36ec59 Remove TODOs from lossless encoder code.
histogram.c:
 - Verified (earlier) that there's low correlation between Red & Blue colors
   (particularly after applying Cross-color transform). The Bin based histogram
   merge, bins on three entropies viz literal, red & blue symbols. Removing
   either of blue or red increases the compression density. So keeping the bins
   for red & blue sybmols.
 - Keeping the compact bins method as-is. This way it's simpler to read.
huffman_encode.h: Added field comments for struct HuffmanTree and removed the TODO.

Change-Id: Ia76f7bc730079d1b3b644038c5d9931db3797f0e
2014-11-12 16:10:16 -08:00
Vikas Arora
fdaac8e0ca Optmize VP8LGetBackwardReferences LZ77 references.
Use the refs_lz77 computed (with cache_bits=0) in the method 'CalculateBestCacheSize'
to regenerate the LZ77 references corresponding to the optimum cache_bits and avoid
calling costly 'BackwardReferencesLz77' one extra time.

This change leaves the compression density unchanged and speeds up compression
by 10-15%.

Change-Id: I5a92e11788d3c3f656aa7e1fba54fb5d96ee0027
2014-11-12 14:50:04 -08:00
pascal massimino
a3e79a46f6 Merge "WebPEncode: Support encoding same pic twice (even if modified)" 2014-11-06 22:20:01 -08:00
Urvang Joshi
e4f4dddba3 WebPEncode: Support encoding same pic twice (even if modified)
This wasn't working for this specific scenario:
- Encode an RGBA 'pic' (with trivial alpha) using lossy encoding.
(so that pic->a == NULL after import happens).
- Modify the 'pic->argb' so that it has non-trivial alpha.
- Encode the same 'pic' again.
This used to fail to encode alpha data as pic->a == NULL.

Change-Id: Ieaaa7bd09825c42f54fbd99e6781d98f0b19cc0c
2014-11-06 13:52:48 -08:00
Vikas Arora
95a9bd85c4 Updated VP8LGetBackwardReferences and color cache.
- The optimal cache bits is evaluated inside the method 'VP8LGetBackwardReferences'.
- The input cache_bits to 'VP8LGetBackwardReferences' sets the maximum cache
  bits to use (passing 0 implies disabling the local color cache).
- The local color cache is disabled for lowerf (<= 25) quality levels (as before).
- Enabled local color cache for palette images as well. This saves additional
  0.017% bytes with a slight (2-3%) improvement in the compression speed.
- Removed 'use_2d_locality' parameter from method VP8LGetBackwardReferences, as
  this option is not an option now (after we freeze the lossless bit-stream).

Change-Id: I33430401e465474fa1be899f330387cd2b466280
2014-11-06 13:14:05 -08:00
James Zern
4171b6724e backward_references.c: reindent after c8581b0
Change-Id: Icfc0fe8e266c0f67a70b8cb095e5aaee155290b6
2014-11-04 17:40:04 +01:00
Vikas Arora
c8581b06e1 Optimize BackwardReferences for RLE encoding.
Updated BackwardReferencesRle method by utilizing the local color cache.
Also changed the name of method BackwardReferencesHashChain to
BackwardReferencesLz77 to reflect the LZ77 coding.

For the 1000 image corpus, this change saves 0.2% bytes
(at default settings) and is 2-5% faster to encode.

Change-Id: Ic3f288253b3bbb101a69945a80994c3fd0917f8b
2014-11-04 08:12:07 -08:00
Vikas Arora
4167a3f5f7 Optimize backwardreferences
Optimize backwardreferences (about 0.1% byte savings) with almost same
compression speed (3% faster on defaut compression settings).
1.) Simplified iteration logic for HashChainFindCopy.
    - Remapped the iter_max constant.
2.) Simplified main for loop for BackwardReferencesHashChain
    - Removed 'if' conditions for corner cases in the main loop.
    - Refactored the method(AddSingleLiteral) for adding one pixel.

Change-Id: I1bc44832fd81f11e714868a13e606c8f83157e64
2014-10-31 18:08:38 -07:00
Vikas Arora
77bdddf016 Speed up BackwardReferences
Speed up BackwardReferencesHashChainDistanceOnly method by:
1.) Remove for loop for shortmax code path.
2.) Execute the shortmax code path after regular call to
HashChainFindCopy, only if HashChainFindCopy() returns length > 2 (MIN_LENGTH).
3.) Also for shortmax, call method HashChainFindOffset (for length = 2),
instead of expensive method HashChainFindCopy().
4.) Handling first pixel (i==0) outside main loop and removing one if
condition (i > 0) per pixel.
5.) Handle the last pixel outside the main 'for' loop.

Overall compression speedup observed is around 5% (+/- noise).

Change-Id: Ifa30c4035f8d26e6e43e3c4881244d777961c22b
2014-10-30 10:58:24 -07:00
Vikas Arora
abf04205b3 Enable entropy based merge histo for (q<100)
Enable bin-partition entropy based heuristic for merging histograms
for higher (q >= 90) qualities as well. Keep the old behavior at the
maximum quality level (q==100).

This speeds up the compression between Q=90-99 (method=4) by factor 5-7X
and with loss of 0.5-0.8% in the compression density.

Change-Id: I011182cb8ae5403c565a150362bc302630b3f330
2014-10-30 03:59:36 -07:00
Vikas Arora
653ace55c3 Increase the MAX_COLOR_CACHE_BITS from 9 to 10.
The Maximum allowed limit is 11.
The Q=25 and below is not impacted as cache bits are forced to 0.
This saves 0.05% - 0.1% bytes for other quality with almost same compression
speed (+/- 2-3%, that's more of a noise).

Change-Id: Icf972a98f298c89e140e37a627baf709134be9a0
2014-10-27 14:19:04 -07:00
Vikas Arora
919220c7e6 Change the logic adjusting the Histogram bits.
Updated the logic to limit the Histogram size to a constant, instead of
computing the same based on the Histogram size (that's variable size based on
the cache bits) for the maximum possible cache bits. The actual cache bits may
be lower than the maximum.
Note: The constant 2600 is 16MB/Sizeof(HistogramSize(MAX_COLOR_CACHE_BITS)).

The compression density remains the same with this change, with little faster
compression speed.

Change-Id: I3149894962852e9dad2501b9aa16bb847a20fd86
2014-10-27 09:57:17 -07:00
Vikas Arora
e912bd55be Fix bug in VP8LCalculateEstimateForCacheSize.
The method VP8LCalculateEstimateForCacheSize is not evaluating the all possible
range for cache_bits.
Also added a small penality for choosing the larger cache-size. This is done to
strike a balance between additional memory/CPU cost (with larger cache-size) and
byte savings from smaller WebP lossless files.

This change saves about 0.07% bytes and speeds up compression by 8% (default
settings). There's small speedup at Q=50 along with byte savings as well.
Compression at Quality=25 is not effected by this change.

Change-Id: Id8f87dee6b5bccb2baa6dbdee479ee9cda8f4f77
2014-10-26 20:05:48 -07:00
Pascal Massimino
5b90d8fe42 Unify the API between VP8BitWriter and VP8LBitWriter
BitReader will be next...

Change-Id: Icd9e7ab2e3890131e664c0523627d9b8c5399a74
2014-10-23 15:35:16 +02:00
Vikas Arora
c2b5a0396a Modify CostModel to allocate optimal memory.
Change-Id: I7d52675d28bfc109d4e901581fc24cd36fcb79ee
2014-10-22 13:30:33 -07:00
Vikas Arora
139142e440 Optimize BackwardReferenceHashChainFollowPath.
Instead of calling HashChainFindMethod, call a new (subset) method
HashChainFindOffset to get the offset/distance for a given length.

The encoding is tad faster at default compression

                       Before              After
                     bpp/rate            bpp/rate
442 Palette     0.2720/5.270 MP/s      0.2720/5.790 MP/s
558 non-palette 3.7607/0.797 MP/s      3.7607/0.816 MP/s

Change-Id: If4041a9c18f7e972f49fcbab8c3e2f013d8bf1cf
2014-10-21 10:04:27 -07:00
James Zern
5f36b68d22 enc/backward_references.c: fix indent
reindent after c24f895

Change-Id: I55adcbef21ea3fdaded84b138745515596191a09
2014-10-20 11:35:20 +02:00
James Zern
e0e9960dd1 Merge "sync version numbers to 0.4.2 release" 2014-10-17 11:47:30 -07:00
James Zern
64ac51446d sync version numbers to 0.4.2 release
libwebp{,decoder} - 0.4.2
libwebp libtool - 5.2.0
libwebpdecoder libtool - 1.2.0

mux/demux - 0.2.2
libtool - 1.2.0

(cherry picked from commit eec5f5f121)
(cherry picked from commit 857578a811)

Change-Id: Ie9d10c68e28083674a8865ad8447b1a70dcea95d
2014-10-17 19:50:21 +02:00
Vikas Arora
c24f8954be Simplify and speedup Backward refs computation.
Updated VP8LGetBackwardReferences and HashChainFindCopy method with following:
- Remove the recursive CostModelBuild.
- Reuse the lz77 backward refs in CostModelBuild, instead of evaluating it
  again (as it was done for recursion_level=0).
- Consolidated the Match-length logic inside FindMatchLength method.
- Removed the logic for altering best_length/val based on the 2D distance.
  The additional 162 value (+= 9 * 9 + 9 * 9 - y * y - x * x) can't change the
  best_val eval computation to choose a different curr_length, as best_val was
  set to 'curr_length << 16'.

  Following is the impact on the compression speed/density at default & max
  quality, overall this speeds up compression by 5-15% (q=100 -> 75) with a tad
  drop (0.02-0.03%) in compression density for the non-palette images.

                  Before                After
                bpp/Rate(MP/s)        bpp/Rate(MP/s)
q=75 (def)
All 1000        2.4492/1.049 MP/s     2.4498/1.230 MP/s
Palette         0.2719/5.060 MP/s     0.2719/6.110 MP/s
non-Palette     3.7597/0.732 MP/s     3.7607/0.840 MP/s

q=100
All 1000        2.4134/0.125 MP/s     2.4142/0.131 MP/s
Palette         0.2692/2.585 MP/s     0.2692/2.885 MP/s
non-Palette     3.7040/0.079 MP/s     3.7053/0.083 MP/s

Change-Id: I27a5eff3356d876c3e949fd32262244b25678b7a
2014-10-17 09:21:30 -07:00
Pascal Massimino
6c6736816c Improved near-lossless mode.
Compared to previous mode it gives another 10-30% improvement in compression keeping comparable PSNR on corresponding quality settings.

Still protected by the WEBP_EXPERIMENTAL_FEATURES flag.

Change-Id: I4821815b9a508f4f38c98821acaddb74c73c60ac
2014-10-15 10:57:21 -07:00
James Zern
aca1b98f52 enc/vp8l.c: fix indent
reindent after ca00502

Change-Id: I8c88dbc11dc96c117531b17682b764a235ef23bb
2014-10-13 11:33:23 +02:00
Vikas Arora
ca00502788 Evaluate non-palette compression for palette image
Evaluate if for Palette images (num_colors <= 256), non-palette
compression path (Subtract green, predictor transform etc) yield an
optimal compression density.

This change reduces the WebP file (for palette images) size by 0.4% with
drop of 3-5% in compression speed.

Change-Id: I1ad66fa94db4fd7ba7bc215763791ef662cd4f42
2014-10-10 11:55:45 -07:00
James Zern
c8a87bb62d AssignSegments: quiet -Warray-bounds warning
the number of segments are previously validated, but an explicit check
is needed to avoid a warning under gcc-4.9

Change-Id: Ifa7c0dd7f3f075b3860fa8ec176d2c98ff54fcea
2014-10-10 17:18:39 +02:00
Pascal Massimino
5f81391263 Merge "Fix return code of EncodeImageInternal()" 2014-10-07 23:49:29 -07:00
Pascal Massimino
e321abe43d Fix return code of EncodeImageInternal()
It was returning 'VP8_ENC_OK' in case of memory error.

Change-Id: I184a3e29c9f1b863637cacbe389b058d75c3dbf8
2014-10-08 08:48:53 +02:00
Pascal Massimino
f82cb06afb optimize palette ordering
We compact the palette by weighted distance, favoring the green channel.

Average gain on paletted file is ~0.5%, with gain up to 6-7% on some favorable cases.
Encoding speed is unaffected.

Disabled for alpha (or any single-channel input)

Also: always use quality=20 for EncodePalette() since it
doesn't make any real difference.

Change-Id: I19fb14316a366f139a941b45aef5663a33c905e1
2014-10-08 08:42:36 +02:00
Pascal Massimino
f545feee64 don't set the alpha value for histogram index image
This leads to tiny extra compression (~few bytes per file) for free

Change-Id: Ia4d8cef3de4365e32eacefd69a57689c80042a23
2014-10-08 08:24:19 +02:00
Pascal Massimino
2d9b0a4472 add WebPDispatchAlphaToGreen() to dsp
SSE2 version is 2.1x faster

This is used to transfer the alpha plane to green channel before lossless compression.

Change-Id: I01d9df0051c183b1ff5d6eb69961d4f43e33141a
2014-10-06 23:15:44 +02:00
Vikas Arora
d5e498d47f Change Entropy based Histogram Combine heuristic.
Don't combine the Histograms that have trivial (single valued A, R & B)
  symbols.
Following is the compression savings data along with compression time (before
& after) per image.
                     Before             After
                     bpp, rate(MP/s)    bpp, rate(MP/s)
Q=25, method = 4     2.508, 1.807       2.499, 1.916
Q=50, method = 4     2.460, 1.488       2.456, 1.512
Q=75, method = 4     2.452, 1.078       2.450, 1.092
Q=25, method = 5     2.505, 1.398       2.496, 1.383
Q=50, method = 5     2.458, 1.170       2.453, 1.143
Q=75, method = 5     2.453, 0.886       2.450, 0.855

This change provides 0.1-0.4% compression gains and speeds up the lossless
compression for the default method=4 (the drop in compression speed is between 1-3.5% for method=5).

Change-Id: Idfd88c2092f37afacd26a97097b3053f8183953a
2014-09-30 13:41:39 -07:00
Pascal Massimino
47a2d8e1d9 fix MSVC float->int conversion warning
+ add a clarifying comment

Change-Id: I8ac1df1de2e5277f2d968dec489546e680bb5e0c
2014-09-27 00:36:01 -07:00
James Zern
35ad48b848 HistoHeapInit: correct positions allocation size
Change-Id: I1879fd48bee3aea6f0504926d7030b504dd9be07
2014-09-26 11:21:19 -07:00
Pascal Massimino
45d9635fd3 lossless: entropy clustering for high qualities.
Tested on 1000 pngs corpus with quality 90-100 it gives ~0.15% improvement
in compression density and ~7% speed up.

Change-Id: I460f56c96707edb3c1f0b51a024e5122e10458df
2014-09-26 15:26:56 +02:00
skal
c792d4129a Premultiply with alpha during U/V downsampling
This prevents the 'alpha-leak' reported in issue #220

Speed-diff is kept minimal.

Change-Id: I1976de5e6de7cfcec89a54df9233c1a6586a5846
2014-09-18 23:40:34 -07:00
Vikas Arora
b901416b90 Record the lossless size stats.
Record and show the lossless header and image data sizes in the cwebp.

Change-Id: I08f19693cb7a756b6fdce5b55d71f5367b5f02fc
2014-09-17 15:16:05 -07:00
James Zern
854509fec0 enc/histogram.c: reindent after f4059d0
fixes indent in HistogramRemap after:
f4059d0 Code cleanup for HistogramRemap.

Change-Id: I9f53a088749e9100a70331bda1662488666c5156
2014-09-03 16:58:49 -07:00
skal
344219645b Merge "~3-5% faster encoding optimizing PickBestIntra*()" 2014-09-03 15:53:32 -07:00
skal
187d379db6 add a fallback to ALPHA_NO_COMPRESSION
if ALPHA_LOSSLESS_COMPRESSION produces a too big file (very rare!),
we fall-back to no-compression automatically.

Change-Id: I5f3f509c635ce43a5e7c23f5d0f0c8329a5f24b7
2014-09-02 21:55:13 +02:00
skal
a48a2d7635 ~3-5% faster encoding optimizing PickBestIntra*()
* Add early-out check for Intra16
* replace some memcpy() by pointer swap

Change-Id: I5edc5f7fbc8e39984deb48e6c045c97c61418589
2014-09-01 14:40:25 +02:00
skal
49911d4df2 Merge "fix indentation" 2014-08-27 07:52:36 -07:00
Vikas Arora
f4059d0c7d Code cleanup for HistogramRemap.
Avoid call to HistogramAddThresh when there's only one Histogram image.
Change-Id: I43b09e8e2d218c95969567034224777dcce37ab9
2014-08-26 15:45:22 -07:00
skal
e632b0929b fix indentation
Change-Id: I2294a6c83e5f345f64bd5120b91532e00ed6c543
2014-08-25 23:52:09 -07:00
skal
73d361dd5f introduce VP8EncQuantize2Blocks to quantize two blocks at a time
No speed diff for now. We might reorder better the instructions later,
to speed things up.

Change-Id: I1949525a0b329c7fd861b8dbea7db4b23d37709c
2014-08-25 20:21:42 -07:00
skal
2523aa73cb SmartRGBYUV: fix odd-width problem with pixel replication
rightmost pixel was missing a copy, which could lead to invalid read.

Also added a lower dimension of 4, below which we use the regular conversion.
This is to prevent corner cases, in addition to not being overkill.

Change-Id: Iac12e7a3d74590f12fe8eeb1830b9891e61439f6
2014-08-18 15:58:36 -07:00
Pascal Massimino
ee52dc4e54 fix some MSVC64 warning about float conversion
Change-Id: I27ab27fc15033d27d0505729f6275fb542c8d473
2014-08-16 00:15:29 -07:00