libwebp

mirror of https://github.com/webmproject/libwebp.git synced 2025-06-08 06:54:22 +02:00

Author	SHA1	Message	Date
skal	ca3d746e39	use block-based allocation for backward refs storage, and free-lists Non-photo sources produce far fewer literal references and their buffer is usually much smaller than the picture size if it compresses well. Hence, use a block-based allocation (and recycling) to avoid pre-allocating a buffer with maximal size. This can reduce memory consumption up to 50% for non-photographic content. Encode speed is also a little better (1-2%) Change-Id: Icbc229e1e5a08976348e600c8906beaa26954a11	2014-05-05 11:11:55 -07:00
skal	d3bcf72bf5	Don't allocate VP8LHashChain, but treat like automatic object the unique instance of VP8LHashChain (1MB size corresponding to hash_to_first_index_) is now wholly part of VP8LEncoder, instead of maintaining the pointer to VP8LHashChain in the encoder. Change-Id: Ib6fe52019fdd211fbbc78dc0ba731a4af0728677	2014-04-30 14:10:48 -07:00
Pascal Massimino	cf5eb8ad19	remove some uint64_t casts and use. We use automatic int->uint64_t promotion where applicable. (uint64_t should be kept only for overflow checking and memory alloc). Change-Id: I1f41b0f73e2e6380e7d65cc15c1f730696862125	2014-04-29 09:08:25 -07:00
Pascal Massimino	b3a616b356	make HistogramAdd() a pointer in dsp * merged the two HistogramAdd/AddEval() into a single call (with detection of special case when b==out) * added a SSE2 variant * harmonize the histogram type to 'uint32_t' instead of just 'int'. This has a lot of ripples on signatures. * 1-2% faster Change-Id: I10299ff300f36cdbca5a560df1ae4d4df149d306	2014-04-28 10:09:34 -07:00
Pascal Massimino	4825b4360d	fix warning about size_t -> int conversion + re-order and add some const Change-Id: I3746520b75699e56e20835d10d1dd9cd9fd6d85d	2014-04-27 00:50:07 -07:00
Vikas Arora	0b896101b4	Reduce memory footprint for encoding WebP lossless. Reduce calls to Malloc (WebPSafeMalloc/WebPSafeCalloc) for: - Building HashChain data-structure used in creating the backward references. - Creating Backward references for LZ77 or RLE coding. - Creating Huffman tree for encoding the image. For the above mentioned code-paths, allocate memory once and re-use it subsequently. Reduce the footprint of VP8LHistogram struct by changing the Struct field 'literal_' from an array of constant size to dynamically allocated buffer based on the input parameter cache_bits. Initialize BitWriter buffer corresponding to 16bpp (2WH). There are some hard-files that are compressed at 12 bpp or more. The realloc is costly and can be avoided for most of the WebP lossless images by allocating some extra memory at the encoder initialization. Change-Id: I1ea8cf60df727b8eb41547901f376c9a585e6095	2014-04-26 01:14:33 -07:00
skal	75b12006e3	Move the HuffmanCost() function to dsp lib This is to help further optimizations. (like in https://gerrit.chromium.org/gerrit/#/c/69787/) There's a small slowdown (~0.5% at -z 9 quality) due to function pointer usage. Note that, for speed, it's important to return VP8LStreaks by value, and not pass a pointer. Change-Id: Id4167366765fb7fc5dff89c1fd75dee456737000	2014-04-18 11:59:48 -07:00
skal	a9fc697cb6	Merge "WIP: extract the float-calculation of HuffmanCost from loop"	2014-04-15 11:33:11 -07:00
Djordje Pesut	4ae0533f39	MIPS: MIPS32r1: Added optimizations for ExtraCost functions. ExtraCost and ExtraCostCombined Change-Id: I7eceb9ce2807296c6b43b974e4216879ddcd79f2	2014-04-15 15:37:06 +02:00
skal	b30a04cf11	WIP: extract the float-calculation of HuffmanCost from loop new function: VP8FinalHuffmanCost() Change-Id: I42102f8e5ef6d7a7af66490af77b7dc2048a9cb9	2014-04-15 14:52:52 +02:00
Slobodan Prijic	2b1b4d5ae9	MIPS: MIPS32r1: Add optimization for GetResidualCost + reorganize the cost-evaluation code by moving some functions to cost.h/cost.c and exposing VP8Residual Change-Id: Id976299b5d4484e65da8bed31b3d2eb9cb4c1f7d	2014-04-08 15:28:49 +02:00
skal	869eaf6c60	~30% encoding speedup: use NEON for QuantizeBlock() also revamped the signature to avoid having to pass the 'first' parameter Change-Id: Ief9af1747dcfb5db0700b595d0073cebd57542a5	2014-04-08 03:08:22 -07:00
Vikas Arora	bc374ff39e	Use histogram_bits to initialize transform_bits. This change gains back 1% in compression density for method=3 and 0.5% for method=4, at the expense of 10% slower compression speed. Change-Id: I491aa1c726def934161d4a4377e009737fbeff82	2014-04-02 11:46:40 -07:00
Vikas Arora	6af6b8e1b6	Tune HistogramCombineBin for large images. Tune HistogramCombineBin for hard images that are larger than 1-2 Mega pixel and represent photographic images. This speeds up lossless encoding on 1000 image corpus by 10-12% and compression penalty of 0.1-0.2%. Change-Id: Ifd03b75c503b9e886098e5fe6f86be0391ca8e81	2014-03-28 07:09:59 -07:00
skal	af93bdd6bc	use WebPSafe[CM]alloc/WebPSafeFree instead of [cm]alloc/free there's still some malloc/free in the external example This is an encoder API change because of the introduction of WebPMemoryWriterClear() for symmetry reasons. The MemoryWriter object should probably go in examples/ instead of being in the main lib, though. mux_types.h still contains some inlined free()/malloc() that are harder to remove (we need to put them in the libwebputils lib and make sure link is ok). Left as a TODO for now. Also: WebPDecodeRGB*() functions are still returning a pointer that needs to be free()'d. We should call WebPSafeFree() on these, but it means exposing the whole mechanism. TODO(later). Change-Id: Iad2c9060f7fa6040e3ba489c8b07f4caadfab77b	2014-03-27 15:50:59 -07:00
James Zern	fbed36433d	Merge "dsp: reuse wht transform from dec in encoder"	2014-03-26 15:13:07 -07:00
skal	d1b33ad58b	2-5% faster trellis with clang/MacOS (and ~2-3% on ARM) We don't need to store cost/score for each node, but only for the current and previous one -> simplify code and save some memory. Also made the 'Node' structure tighter. Change-Id: Ie3ad7d3b678992b396242f56e2ac387fe43852e6	2014-03-26 22:33:01 +01:00
James Zern	df230f2723	dsp: reuse wht transform from dec in encoder Change-Id: Ide663db9eaecb7a37fe0e6ad4cd5f37de190c717	2014-03-22 13:25:08 -07:00
Pascal Massimino	59daf08362	Merge "cosmetics:"	2014-03-18 04:02:33 -07:00
Pascal Massimino	536220084c	cosmetics: - use VP8ScanUV, separate from VP8Scan[] (for luma) - fix indentation - few missing consts - change TrellisQuantizeBlock() signature Change-Id: I94b437d791cbf887015772b5923feb83dd145530	2014-03-18 03:34:56 -07:00
James Zern	3e7f34a3fb	AssignSegments: quiet array-bounds warning nb (enc->segment_hdr_.num_segments_) will be in the range [1, NUM_MB_SEGMENTS]. Change-Id: I5c2bd0bb82b17c99aff39c98b6b1747fc040dc16	2014-03-14 18:47:52 -07:00
James Zern	cf821c821f	UpdateHistogramCost: avoid implicit double->float all the functions involved return double and later these locals are used in double calculations. fixes a vs build warning Change-Id: Idb547104ef00b48c71c124a774ef6f2ec5f30f14	2014-03-14 11:18:52 -07:00
Vikas Arora	1c58526fe1	Fix few nits Add/remove few casts, fixed indentation. Change-Id: Icd141694201843c04e476f09142ce4be6e502dff	2014-03-13 13:57:39 -07:00
Vikas Arora	fef22704ec	Optimize and re-structure VP8LGetHistoImageSymbols Optimize and re-structured VP8LGetHistoImageSymbols method, by using the bin-hash for merging the Histograms more efficiently, instead of the randomized heuristic of existing method HistogramCombine. This change speeds up the Lossless encoding by 40-50% (for method=4 and Q > 50) with 0.8% penalty in compression density. For lower method, the speed up is 25-30%, with 0.4% penalty in the compression density. Change-Id: If61adadb1a041b95def6405aa1fe3b83c3cb25ce	2014-03-13 11:48:37 -07:00
Vikas Arora	5f0cfa80ff	Do a binary search to get the optimum cache bits. This speeds up the lossless encoder by a bit (1-2%), without impacting the compression density. Change-Id: Ied6fb38fab58eef9ded078697e0463fe7c560b26	2014-03-13 10:30:32 -07:00
skal	65b99f1c92	add a -z option to cwebp, and WebPConfigLosslessPreset() function These are presets for lossless coding, similar to zlib. The shortcut for lossless coding is now, e.g.: cwebp -z 5 in.png -o out_lossless.webp There are 10 possible values for -z parameter: 0 (fastest, lowest compression) to 9 (slowest, best compression) A reasonable tradeoff is -z 6, e.g. -z 9 can be quite slow, so use with care. This -z option is just a shortcut for some pre-defined '-lossless -m xx -q yy' combinations. Change-Id: I6ae716456456aea065469c916c2d5ca4d6c6cf04	2014-03-11 23:25:35 +01:00
skal	30176619c6	4-5% faster trellis by removing some unneeded calculations. (We didn't need the exact value of the max_error properly. We can work with relative values instead of absolute) Output is bitwise the same as before. Change-Id: I67aeaaea5f81bfd9ca8e1158387a5083a2b6c649	2014-03-06 15:57:25 +01:00
James Zern	687a58ecc3	histogram.c: reindent after b33e8a0 b33e8a0 Refactor code for HistogramCombine. Change-Id: Ia1b4b545c5f4e29cc897339df2b58f18f83c15b3	2014-03-04 00:38:14 -08:00
James Zern	42eb06fc0e	Merge "few cosmetics after patch #69079 "	2014-03-03 15:13:25 -08:00
skal	82af82644b	few cosmetics after patch #69079 Change-Id: Ifa758420421b5a05825a593f6b43504887603ee7	2014-03-03 23:53:08 +01:00
Vikas Arora	b33e8a05ee	Refactor code for HistogramCombine. Refactor code for HistogramCombine and optimize the code by calculating the combined entropy and avoid un-necessary Histogram merges. This speeds up lossless encoding by 1-2% and almost no impact on compression density. Change-Id: Iedfcf4c1f3e88077bc77fc7b8c780c4cd5d6362b	2014-03-03 13:50:42 -08:00
skal	5aeeb087d6	5-10% encoding speedup with faster trellis (-m 6) mostly by: - storing a single rd-score instead of cost / distortion separately - evaluating terminal cost only once - getting some invariants out of the loops - more consts behind fewer variables Change-Id: I79451f3fd1143d6537200fb8b90d0ba252809f8c	2014-03-03 22:07:06 +01:00
Pascal Massimino	4287d0d49b	speed-up trellis quant (~5-10% overall speed-up) store costs[] in node instead of context Change-Id: I6aeb0fd94af9e48580106c41408900fe3467cc54 also: various cosmetics	2014-02-27 00:06:00 -08:00
Pascal Massimino	390c8b316d	lossy encoding: ~3% speed-up incorporate non-last cost in per-level cost table also: correct trellis-quant cost evaluation at nodes (output a little bit different now). Method 6 is ~4% faster. Change-Id: Ic48bd6d33f9193838216e7dc3a9f9c5508a1fbe8	2014-02-26 05:52:24 -08:00
Vikas Arora	c16cd99aba	Speed up lossless encoder. Speedup lossless encoder by 20-25% by optimizing: - GetBestColorTransformForTile: Use techniques like binary search and local minima search to reduce the search space. - VP8LFastSLog2Slow & VP8LFastLog2Slow: Adding the correction factor for log(1 + x) and increase the threshold for calling the approximate version of log_2 (compared to costly call to log()). Change-Id: Ia2444c914521ac298492aafa458e617028fc2f9d	2014-02-21 22:13:50 -08:00
skal	0235d5e44b	1-2% faster quantization in SSE2 C-version is a bit faster too (sub-1% faster on ARM) Change-Id: I077262042f1d0937aba1ecf15174f2c51bf6cd97	2014-02-13 15:55:30 -08:00
James Zern	a42ea9742a	cosmetics: backward_references.c: reindent after a7d2ee3 a7d2ee3 Optimize cache estimate logic. Change-Id: I81dd1eea49f603465dc5f3afae8a101e5205e963	2014-02-11 15:52:22 -08:00
Vikas Arora	fde2904b8a	Increase initial buffer size for VP8L Bit Writer. Increase the initial buffer size for VP8L Bit Writer from 4bpp to 8bpp. The resize buffer is expensive (requires realloc and copy) and this additional memory (0.5 * W * H) doesn't add much overhead on the lossless encoder. Change-Id: Ic1fe55cd7bc3d1afadc799e4c2c8786ec848ee66	2014-02-11 11:13:21 -08:00
Vikas Arora	a7d2ee39be	Optimize cache estimate logic. Optimize 'VP8LCalculateEstimateForCacheSize' for lower quality ranges (Q < 50). The entropy is generally lower for higher cache_bits, so start searching from higher cache_bits and settle for a local minima, instead of evaluating all values. This speeds up the lossless encoding at lower qualities by 10-15%. Change-Id: I33c1e958515a2549f2e6f64b1aab3f128660dcec	2014-02-11 10:59:01 -08:00
Scott Talbot	391316fee2	Don't dereference NULL, ensure HashChain fully initialized Found by clang's static analyzer, they look validly uninitialized to me. Change-Id: I650250f516cdf6081b35cdfe92288c20a3036ac8	2014-02-03 21:16:59 -08:00
skal	9882b2f953	always use fast-analysis for all methods. This makes the segmentation overall less prone to local-optimum or boundary effect. (and overall, encoding is a little faster) Change-Id: I35688098b0f43c28b5cb81c4a92e1575bb0eddb9	2014-01-24 18:17:04 +01:00
skal	c741183c10	make WebPCleanupTransparentArea work with argb picture the -alpha_cleanup flag was ineffective since we switched cwebp to using ARGB input always. Original idea by David Eckel (dvdckl at gmail dot com) Change-Id: I0917a8b91ce15a43199728ff4ee2a163be443bab	2014-01-17 20:19:16 +01:00
James Zern	ea59a8e980	Merge "Merge tag 'v0.4.0'"	2014-01-10 17:09:19 -08:00
skal	7574bed43d	fix comments related to array sizes thanks to foivos_g4 at hotmail dot com for spotting this. Change-Id: I8cd301bae58a6edbc3ef47607654bc21321721ca	2014-01-09 17:30:25 +01:00
James Zern	8c524db84c	bump version to 0.4.0 libwebp{,decoder} - 0.4.0 libwebp libtool - 5.0.0 libwebpdecoder libtool - 1.0.0 mux/demux - 0.2.0 libtool - 1.0.0 Change-Id: Idbd067f95a6af2f0057d6a63ab43176fcdbb767d	2013-12-18 19:20:00 -08:00
Pascal Massimino	495bef413d	fix bug in TrellisQuantize the quantized level should be clipped to 2047, not the original coeff. (similar problem was fixed in the regular quantize function quite some time ago) Change-Id: I2fd2f8d94561ff0204e60535321ab41a565e8f85	2013-12-17 11:08:01 -08:00
James Zern	605a712701	simplify __cplusplus ifdef drop c_plusplus which is from a quite ancient pre-standard compiler Change-Id: I9e357b3292a6b52b14c2641ba11f4f872c04b7fb	2013-12-16 20:16:02 -08:00
James Zern	5227d99146	drop: ifdef __cplusplus checks from C files the prototypes are already marked in the headers Change-Id: I172fe742200c939ca32a70a2299809b8baf9b094	2013-12-13 11:42:13 -08:00
skal	73b731fb42	introduce a special quantization function for WHT WHT is somewhat a special case: no sharpen[] bias, etc. Will be useful in a later CL when precision of input is changed. Change-Id: I851b06deb94abdfc1ef00acafb8aa731801b4299	2013-12-10 14:21:47 +01:00
skal	a3359f5d2c	Only compute quantization params once (all quantization params #1..#15 are the same) Change-Id: If04058bd89fe2677b5b118ee4e1bcce88f0e4bf5	2013-12-10 05:36:23 +01:00
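
Commit 495bef413d in the log above says the quantized level should be clipped to 2047, not the original coefficient. The following is a minimal sketch of that distinction only. The function names, the plain rounding division, and the step size `q` are hypothetical simplifications for illustration; this is not libwebp's TrellisQuantize code.

```c
#include <assert.h>

#define MAX_LEVEL 2047  /* clipping bound named in the commit message */

/* Hypothetical simplified quantizer: divide the coefficient magnitude by the
 * step size q (round to nearest), then clip the resulting level. */
int QuantizeClipLevel(int coeff, int q) {
  const int sign = (coeff < 0);
  const int abs_coeff = sign ? -coeff : coeff;
  int level;
  assert(q > 0);
  level = (abs_coeff + q / 2) / q;
  if (level > MAX_LEVEL) level = MAX_LEVEL;  /* clip the quantized level */
  return sign ? -level : level;
}

/* For contrast, the ordering the commit describes as a bug: the input
 * coefficient is clipped before the division, so large coefficients end up
 * with a level that is too small. */
int QuantizeClipCoeff(int coeff, int q) {
  const int sign = (coeff < 0);
  int abs_coeff = sign ? -coeff : coeff;
  int level;
  assert(q > 0);
  if (abs_coeff > MAX_LEVEL) abs_coeff = MAX_LEVEL;  /* clip the input */
  level = (abs_coeff + q / 2) / q;
  return sign ? -level : level;
}
```

With a coefficient of 10000 and a step of 4, the first function saturates at 2047 while the second yields 512, which is why the order of clipping and division matters in this sketch.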


582 Commits