libwebp

mirror of https://github.com/webmproject/libwebp.git synced 2025-01-31 17:12:53 +01:00

Author	SHA1	Message	Date
Pascal Massimino	f079e487ae	use uint16_t for chosen_path[] len is MAX_LENGTH (4096) at max. This reduce memory for path by a half. Change-Id: I399fda4093d93b1e9d956397b7b210956c5b948f	2015-01-20 00:34:09 -08:00
Vikas Arora	b9e356b998	Disable costly TraceBackwards for method=0. Disable costly TraceBackwards heuristic for computing the backward references for low_effort (method=0) compression. The TraceBackwards heuristic is already disabled for lower (q < 25) quality range. Following is the compression data for 1000 image corpus for q >= 25. This speeds up compression (q >= 25) by a factor of 2.5-3X with slight loss of compression density (0.7% for lower quality range and 1.2% for higher qualities). Change-Id: I256c9e2137c7de4083f423ea32ee12d3b0f46253	2015-01-15 09:01:40 -08:00
Vikas Arora	ea08466d34	Tune BackwardReferencesLz77 for low_effort (m=0). - Lower the threshold parameters for HashChainFindCopy. For 1000 image PNG corpus (m=0), this change yields speedup of 15-20% at lower quality range (0.25% drop in compression density) and about 10% for higher quality range without any drop in the compression density. Following is the compression stats (before/after) for method = 0: Before After bpp/MPs bpp/MPs q=0 2.8615/18.000 2.8651/18.631 q=5 2.8615/18.216 2.8650/20.517 q=10 2.8572/18.070 2.8650/21.992 q=15 2.8519/18.371 2.8584/21.747 q=20 2.8454/18.975 2.8515/20.448 q=25 2.8230/8.531 2.8253/9.585 // Compression density remains same for q-range [30-100] q=30 2.7310/7.706 2.7310/8.028 q=35 2.7253/6.855 2.7253/7.184 q=40 2.7231/6.364 2.7231/6.604 q=45 2.7216/5.844 2.7216/6.223 q=50 2.7196/5.210 2.7196/5.731 q=55 2.7208/4.766 2.7208/4.970 q=60 2.7195/4.495 2.7195/4.602 q=65 2.7185/4.024 2.7185/4.236 q=70 2.7174/3.699 2.7174/3.861 q=75 2.7164/3.449 2.7164/3.605 q=80 2.7161/3.222 2.7161/3.038 q=85 2.7153/2.919 2.7153/2.946 q=90 2.7145/2.766 2.7145/2.771 q=95 2.7124/2.548 2.7124/2.575 q=100 2.6873/2.253 2.6873/2.335 Change-Id: I0e17581fb71f6094032ad06c6203350bd502f9a1	2015-01-08 00:30:21 -08:00
Vikas Arora	413dfc0c4b	Move static method definition before its usage. Change-Id: Id766c2bea92e7ebf0de65046f73429b74b4fdda4	2014-11-13 13:18:30 -08:00
Vikas Arora	0f23566558	Update BackwardRefsWithLocalCache. Update BackwardRefsWithLocalCache to do in-place update of backward references w.r.t local color cache index. No impact on the compression density or compression speed. Change-Id: Ie066251464c3928c044e037b43df3af28b48ca30	2014-11-13 11:54:26 -08:00
Vikas Arora	fdaac8e0ca	Optmize VP8LGetBackwardReferences LZ77 references. Use the refs_lz77 computed (with cache_bits=0) in the method 'CalculateBestCacheSize' to regenerate the LZ77 references corresponding to the optimum cache_bits and avoid calling costly 'BackwardReferencesLz77' one extra time. This change leaves the compression density unchanged and speeds up compression by 10-15%. Change-Id: I5a92e11788d3c3f656aa7e1fba54fb5d96ee0027	2014-11-12 14:50:04 -08:00
Vikas Arora	95a9bd85c4	Updated VP8LGetBackwardReferences and color cache. - The optimal cache bits is evaluated inside the method 'VP8LGetBackwardReferences'. - The input cache_bits to 'VP8LGetBackwardReferences' sets the maximum cache bits to use (passing 0 implies disabling the local color cache). - The local color cache is disabled for lowerf (<= 25) quality levels (as before). - Enabled local color cache for palette images as well. This saves additional 0.017% bytes with a slight (2-3%) improvement in the compression speed. - Removed 'use_2d_locality' parameter from method VP8LGetBackwardReferences, as this option is not an option now (after we freeze the lossless bit-stream). Change-Id: I33430401e465474fa1be899f330387cd2b466280	2014-11-06 13:14:05 -08:00
James Zern	4171b6724e	backward_references.c: reindent after c8581b0 Change-Id: Icfc0fe8e266c0f67a70b8cb095e5aaee155290b6	2014-11-04 17:40:04 +01:00
Vikas Arora	c8581b06e1	Optimize BackwardReferences for RLE encoding. Updated BackwardReferencesRle method by utilizing the local color cache. Also changed the name of method BackwardReferencesHashChain to BackwardReferencesLz77 to reflect the LZ77 coding. For the 1000 image corpus, this change saves 0.2% bytes (at default settings) and is 2-5% faster to encode. Change-Id: Ic3f288253b3bbb101a69945a80994c3fd0917f8b	2014-11-04 08:12:07 -08:00
Vikas Arora	4167a3f5f7	Optimize backwardreferences Optimize backwardreferences (about 0.1% byte savings) with almost same compression speed (3% faster on defaut compression settings). 1.) Simplified iteration logic for HashChainFindCopy. - Remapped the iter_max constant. 2.) Simplified main for loop for BackwardReferencesHashChain - Removed 'if' conditions for corner cases in the main loop. - Refactored the method(AddSingleLiteral) for adding one pixel. Change-Id: I1bc44832fd81f11e714868a13e606c8f83157e64	2014-10-31 18:08:38 -07:00
Vikas Arora	77bdddf016	Speed up BackwardReferences Speed up BackwardReferencesHashChainDistanceOnly method by: 1.) Remove for loop for shortmax code path. 2.) Execute the shortmax code path after regular call to HashChainFindCopy, only if HashChainFindCopy() returns length > 2 (MIN_LENGTH). 3.) Also for shortmax, call method HashChainFindOffset (for length = 2), instead of expensive method HashChainFindCopy(). 4.) Handling first pixel (i==0) outside main loop and removing one if condition (i > 0) per pixel. 5.) Handle the last pixel outside the main 'for' loop. Overall compression speedup observed is around 5% (+/- noise). Change-Id: Ifa30c4035f8d26e6e43e3c4881244d777961c22b	2014-10-30 10:58:24 -07:00
Vikas Arora	e912bd55be	Fix bug in VP8LCalculateEstimateForCacheSize. The method VP8LCalculateEstimateForCacheSize is not evaluating the all possible range for cache_bits. Also added a small penality for choosing the larger cache-size. This is done to strike a balance between additional memory/CPU cost (with larger cache-size) and byte savings from smaller WebP lossless files. This change saves about 0.07% bytes and speeds up compression by 8% (default settings). There's small speedup at Q=50 along with byte savings as well. Compression at Quality=25 is not effected by this change. Change-Id: Id8f87dee6b5bccb2baa6dbdee479ee9cda8f4f77	2014-10-26 20:05:48 -07:00
Vikas Arora	c2b5a0396a	Modify CostModel to allocate optimal memory. Change-Id: I7d52675d28bfc109d4e901581fc24cd36fcb79ee	2014-10-22 13:30:33 -07:00
Vikas Arora	139142e440	Optimize BackwardReferenceHashChainFollowPath. Instead of calling HashChainFindMethod, call a new (subset) method HashChainFindOffset to get the offset/distance for a given length. The encoding is tad faster at default compression Before After bpp/rate bpp/rate 442 Palette 0.2720/5.270 MP/s 0.2720/5.790 MP/s 558 non-palette 3.7607/0.797 MP/s 3.7607/0.816 MP/s Change-Id: If4041a9c18f7e972f49fcbab8c3e2f013d8bf1cf	2014-10-21 10:04:27 -07:00
James Zern	5f36b68d22	enc/backward_references.c: fix indent reindent after c24f895 Change-Id: I55adcbef21ea3fdaded84b138745515596191a09	2014-10-20 11:35:20 +02:00
Vikas Arora	c24f8954be	Simplify and speedup Backward refs computation. Updated VP8LGetBackwardReferences and HashChainFindCopy method with following: - Remove the recursive CostModelBuild. - Reuse the lz77 backward refs in CostModelBuild, instead of evaluating it again (as it was done for recursion_level=0). - Consolidated the Match-length logic inside FindMatchLength method. - Removed the logic for altering best_length/val based on the 2D distance. The additional 162 value (+= 9 * 9 + 9 * 9 - y * y - x * x) can't change the best_val eval computation to choose a different curr_length, as best_val was set to 'curr_length << 16'. Following is the impact on the compression speed/density at default & max quality, overall this speeds up compression by 5-15% (q=100 -> 75) with a tad drop (0.02-0.03%) in compression density for the non-palette images. Before After bpp/Rate(MP/s) bpp/Rate(MP/s) q=75 (def) All 1000 2.4492/1.049 MP/s 2.4498/1.230 MP/s Palette 0.2719/5.060 MP/s 0.2719/6.110 MP/s non-Palette 3.7597/0.732 MP/s 3.7607/0.840 MP/s q=100 All 1000 2.4134/0.125 MP/s 2.4142/0.131 MP/s Palette 0.2692/2.585 MP/s 0.2692/2.885 MP/s non-Palette 3.7040/0.079 MP/s 3.7053/0.083 MP/s Change-Id: I27a5eff3356d876c3e949fd32262244b25678b7a	2014-10-17 09:21:30 -07:00
Vikas Arora	516971b136	lossless: Remove unaligned read warning (typecast uint32 pointer to uint64). The proposed change is little (0.05%) slower but avoids uint32 to uint64 pointer conversion. Change-Id: I6b8828077ea1324fabd04bfa7e7439e324776250	2014-07-02 20:55:27 -07:00
skal	77bf4410f7	make error-code reporting consistent upon malloc failure Sometimes, the error-code was not set correctly. We now return OUT_OF_MEMORY everytimes it's appropriate (tested using MALLOC_FAIL_AT mechanism) Took the opportunity to clean-up the code and dust the error code returned (some were erroneously set to INVALID_CONFIGURATION) Change-Id: I56f7331e2447557b3dd038e245daace4fc82214c	2014-06-13 08:45:12 +02:00
skal	ca3d746e39	use block-based allocation for backward refs storage, and free-lists Non-photo source produce far less literal reference and their buffer is usually much smaller than the picture size if its compresses well. Hence, use a block-base allocation (and recycling) to avoid pre-allocating a buffer with maximal size. This can reduce memory consumption up to 50% for non-photographic content. Encode speed is also a little better (1-2%) Change-Id: Icbc229e1e5a08976348e600c8906beaa26954a11	2014-05-05 11:11:55 -07:00
skal	d3bcf72bf5	Don't allocate VP8LHashChain, but treat like automatic object the unique instance of VP8LHashChain (1MB size corresponding to hash_to_first_index_) is now wholy part of VP8LEncoder, instead of maintaining the pointer to VP8LHashChain in the encoder. Change-Id: Ib6fe52019fdd211fbbc78dc0ba731a4af0728677	2014-04-30 14:10:48 -07:00
Pascal Massimino	cf5eb8ad19	remove some uint64_t casts and use. We use automatic int->uint64_t promotion where applicable. (uint64_t should be kept only for overflow checking and memory alloc). Change-Id: I1f41b0f73e2e6380e7d65cc15c1f730696862125	2014-04-29 09:08:25 -07:00
Pascal Massimino	b3a616b356	make HistogramAdd() a pointer in dsp * merged the two HistogramAdd/AddEval() into a single call (with detection of special case when b==out) * added a SSE2 variant * harmonize the histogram type to 'uint32_t' instead of just 'int'. This has a lot of ripples on signatures. * 1-2% faster Change-Id: I10299ff300f36cdbca5a560df1ae4d4df149d306	2014-04-28 10:09:34 -07:00
Vikas Arora	0b896101b4	Reduce memory footprint for encoding WebP lossless. Reduce calls to Malloc (WebPSafeMalloc/WebPSafeCalloc) for: - Building HashChain data-structure used in creating the backward references. - Creating Backward references for LZ77 or RLE coding. - Creating Huffman tree for encoding the image. For the above mentioned code-paths, allocate memory once and re-use it subsequently. Reduce the foorprint of VP8LHistogram struct by changing the Struct field 'literal_' from an array of constant size to dynamically allocated buffer based on the input parameter cache_bits. Initialize BitWriter buffer corresponding to 16bpp (2WH). There are some hard-files that are compressed at 12 bpp or more. The realloc is costly and can be avoided for most of the WebP lossless images by allocating some extra memory at the encoder initializaiton. Change-Id: I1ea8cf60df727b8eb41547901f376c9a585e6095	2014-04-26 01:14:33 -07:00
skal	af93bdd6bc	use WebPSafe[CM]alloc/WebPSafeFree instead of [cm]alloc/free there's still some malloc/free in the external example This is an encoder API change because of the introduction of WebPMemoryWriterClear() for symmetry reasons. The MemoryWriter object should probably go in examples/ instead of being in the main lib, though. mux_types.h stil contain some inlined free()/malloc() that are harder to remove (we need to put them in the libwebputils lib and make sure link is ok). Left as a TODO for now. Also: WebPDecodeRGB*() function are still returning a pointer that needs to be free()'d. We should call WebPSafeFree() on these, but it means exposing the whole mechanism. TODO(later). Change-Id: Iad2c9060f7fa6040e3ba489c8b07f4caadfab77b	2014-03-27 15:50:59 -07:00
Vikas Arora	5f0cfa80ff	Do a binary search to get the optimum cache bits. This speeds up the lossless encoder by a bit (1-2%), without impacting the compression density. Change-Id: Ied6fb38fab58eef9ded078697e0463fe7c560b26	2014-03-13 10:30:32 -07:00
Vikas Arora	b33e8a05ee	Refactor code for HistogramCombine. Refactor code for HistogramCombine and optimize the code by calculating the combined entropy and avoid un-necessary Histogram merges. This speeds up lossless encoding by 1-2% and almost no impact on compression density. Change-Id: Iedfcf4c1f3e88077bc77fc7b8c780c4cd5d6362b	2014-03-03 13:50:42 -08:00
James Zern	a42ea9742a	cosmetics: backward_references.c: reindent after a7d2ee3 a7d2ee3 Optimize cache estimate logic. Change-Id: I81dd1eea49f603465dc5f3afae8a101e5205e963	2014-02-11 15:52:22 -08:00
Vikas Arora	a7d2ee39be	Optimize cache estimate logic. Optimize 'VP8LCalculateEstimateForCacheSize' for lower quality ranges (Q < 50). The entropy is generally lower for higher cache_bits, so start searching from higher cache_bits and settle for a local minima, instead of evaluating all values. This speeds up the lossless encoding at lower qualities by 10-15%. Change-Id: I33c1e958515a2549f2e6f64b1aab3f128660dcec	2014-02-11 10:59:01 -08:00
Scott Talbot	391316fee2	Don't dereference NULL, ensure HashChain fully initialized Found by clang's static analyzer, they look validly uninitialized to me. Change-Id: I650250f516cdf6081b35cdfe92288c20a3036ac8	2014-02-03 21:16:59 -08:00
James Zern	4931c3294b	cosmetics: fix some typos Change-Id: I0d6efebd817815139db5ae87236fd8911df4d53c	2013-11-26 19:21:14 -08:00
skal	ff379db317	few % speedup of lossless encoding mostly visible for method 4 and up Change-Id: I1561d871bc055ec5f7998eb193d927927d3f2add	2013-11-12 00:09:45 +01:00
Vikas Arora	e081f2f359	Pack code & extra_bits to Struct (VP8LPrefixCode). Also created variant VP8LPrefixEncodeBits that returns the code & extra_bits only. There's no impact on compression density and compression speed. Change-Id: I2cafdd3438ac9270cd72ad9d57b383cdddfdfa4c	2013-08-12 11:56:42 -07:00
Vikas Arora	69257f70df	Create LUT for PrefixEncode. This speeds up lossless compression by 5%. Change-Id: Ifd114b1d9850dc3aac74593809e7d48529d35e3d	2013-08-05 10:20:18 -07:00
Vikas Arora	7d60bbc6d9	Speed up HashChainFindCopy function. Speed up HashChainFindCopy by optimizing on number of calls to FindMatchLength method. This change speeds up the lossless & lossy (Alpha) encoding by 20% without loss of compression density. At method=3, lossy (Alpha) compression speed (and density) remains unchanged, as at that settings, the costly Backward Refs method is not called Change-Id: Ia1797148e9e4ee2787011837fa248afbae2242cb	2013-07-16 19:58:18 -07:00
Vikas Arora	6674014077	Speedup Alpha plane encoding. Disable costly 'BackwardReferencesTraceBackwards' for encoding Alpha plane. Increase the threshold for triggering 'BackwardReferencesTraceBackwards' to quality 25 and above. Also lower the Alpha quality (at method 3) to be lesser than this threshold (25). Change-Id: Ic29fb2e6943472c564223df9fe099b19ccda0f31	2013-07-12 11:02:13 -07:00
James Zern	d640614d54	update copyright text rather than symlink the webm/vpx terms, use the same header as libvpx to reference in-tree files based on the discussion in: https://codereview.chromium.org/12771026/ Change-Id: Ia3067ecddefaa7ee01550136e00f7b3f086d4af4	2013-06-06 23:09:14 -07:00
Vikas Arora	8eae188a62	WebP-Lossless encoding improvements. Lossy (with Alpha) image compression gets 2.3X speedup. Compressing lossless images is 20%-40% faster now. Change-Id: I41f0225838b48ae5c60b1effd1b0de72fecb3ae6	2013-05-08 17:22:11 -07:00
Vikas Arora	0478b5d214	Tune Lossless compression for lower qualities. This is required for WebP lossy+Alpha images, where Alpha channel is taking 60-70% of the compression (CPU) cycles. Also evaluated on 1000 PNG corpus and overall compression speed is 15-40% better for lossy (PNG+Alpha) compression. The pure lossless compression numbers are almost same (or little better) with this change. Change-Id: I9e5ae7372ed6227a9a5b64cd9cff84c747195a57	2013-03-12 14:17:28 -07:00
skal	23c0f354a6	fix missing intptr_t->int cast for MSVC Change-Id: I39acdfbe287ef7b7f615d59e0729aab161189bf9	2013-02-06 17:51:10 +01:00
skal	043076e2ef	Merge "speed-up lossless in BackwardTrace"	2013-02-05 10:46:06 -08:00
skal	f3a44dcd83	remove one malloc from TraceBackwards() Change-Id: I4f77c0240ca2ece86e8beab11d02c74277409921	2013-02-05 19:43:43 +01:00
skal	0fc1a3a072	speed-up lossless in BackwardTrace we special-case code=2 (with a later TODO to adapt this on quality) Change-Id: I93d43f5b3f8f1ef9f211cce253bb4b415918ee57	2013-02-05 19:42:23 +01:00
skal	0d19fbff51	remove some -Wshadow warnings these are quite noisy, but it's not a big deal to remove them. Change-Id: I5deb08f10263feb77e2cc8a70be44ad4f725febd	2013-01-22 23:06:28 +01:00
vikas arora	ab5ea217f7	Tune Lossless encoder - Changed the dynamic range where more aggressive (BackwardReferencesTraceBackward) heuristic is run from quality > 10 (instead of quality > 25). - Limit the backward-ref Window size to 16width & 256width for lower qualities ([0, 25[ & [25, 50[) respectively, instead of 1M window. - Evaluate the params for HashChainFindCopy outside this function call and pass it, instead of recomputing them for every call. Change-Id: If9eedfc14b978e7632d7cf69c96186e2910b0554	2012-10-30 17:30:17 -07:00
James Zern	e855208c16	fix double to float conversion warning introduced in: a792b91 fix the -g/O3 discrepancy for 32bit compile Change-Id: I6a77223f237527eda4ee1d6eaa993351bd74f1d6	2012-10-08 18:27:27 -07:00
Vikas Arora	7b3eb372ad	Tune lossless compression to get better gains. Tune compression heuristics to get better gains across wide quality range. Change-Id: Ic342d4dbcf83fe2086a34e5c184aef0714109430	2012-10-05 09:58:29 -07:00
skal	a792b913bd	fix the -g/O3 discrepancy for 32bit compile in debug mode, some float operations see their intermediate values stored in memory rather than staying in the FPU (which is 80bit precision). Several fixes are possible (breaking long calculations into atomic steps for instance), but simpler of all is just about turning the cost[] array into float* instead of double*. The code is a tad faster, and i didn't see any major output size difference. Change-Id: Icf1f833e15f8ee4ecc7f9a521d07fdc96ef711aa	2012-10-03 15:15:58 +02:00
Pascal Massimino	323dc4d9b9	remove use of log2(). Use VP8LFastLog2() instead. Order-by-cost mostly unchanged (up to a scaling constant 1/log(2)) (except for few minor diff in < 2% of cases) + remove unused field cost_mode->cache_bits_ Change-Id: I714f8ab12f49a23f5d499a64c741382c9b489a3e	2012-08-02 00:08:58 -07:00
Pascal Massimino	bff34ac1ca	harness some malloc/calloc to use WebPSafeMalloc and WebPSafeCalloc quite a large security sweep. Change-Id: If150dfbb46e6e9b56210473a109c8ad6ccd0cea4	2012-08-01 12:06:04 -07:00
Pascal Massimino	906be65744	rationalize use of color-cache * ~1-4% faster * if it's not used, don't use it * remove the special handling of cache_bits = 0 * remove some tests in the loops Change-Id: I19d87c3ca731052ff532ea8b2d8e89816507b75f	2012-08-01 00:32:12 -07:00

1 2 3

116 Commits