libwebp

mirror of https://github.com/webmproject/libwebp.git synced 2026-04-09 22:30:02 +02:00

Author	SHA1	Message	Date
James Zern	eba6ce06c3	dec_neon: add DC4 intra predictor ~70% faster Change-Id: I2e06907b8d69be71a8c5581832c931923c24bab0	2014-10-23 14:21:08 +02:00
James Zern	79abfbd9df	dec_neon: add TM4 intra predictor ~21% faster Change-Id: Ia9ed4ca650f9d544821fa1faf3173611806a272a	2014-10-23 14:21:08 +02:00
James Zern	fe395f0e4d	dec_neon: add LD4 intra predictor based on SSE2 version, ~55% faster Change-Id: I782282ffc31dcf238890b3ba0decccf1d793dad0	2014-10-23 14:20:47 +02:00
James Zern	32de385eca	dec_neon: add VE4 intra predictor based on SSE2 version, ~59% faster Change-Id: Iaa2181eb51bd975de0e9fe5c7b66ed18188f0e3b	2014-10-23 11:46:08 +02:00
Pascal Massimino	b7a33d7e91	implement VE4/HE4/RD4/... in SSE2 (30% faster prediction functions, but overall speed-up is ~1% only) Change-Id: I2c6e7074aa26a2359c9198a9015e5cbe143c2765	2014-10-22 18:25:36 +02:00
Pascal Massimino	97c76f1f30	make VP8PredLuma4[] non-const and initialize array in VP8DspInit() also convert 'type dst' to 'type dst' Change-Id: I41ab66ad15b548cc45d1cb8b10bbca4fe1528cae	2014-10-22 18:14:20 +02:00
pascal massimino	0ea8c6c219	Merge "PrintReg: output to stderr"	2014-10-22 08:55:10 -07:00
James Zern	f85ec712b0	PrintReg: output to stderr allows use of '-o -' while testing Change-Id: Ibc02d7cede2df4eb8be0a28c0ca4bf5e91864191	2014-10-22 17:28:19 +02:00
Vikas Arora	139142e440	Optimize BackwardReferenceHashChainFollowPath. Instead of calling HashChainFindMethod, call a new (subset) method HashChainFindOffset to get the offset/distance for a given length. The encoding is tad faster at default compression Before After bpp/rate bpp/rate 442 Palette 0.2720/5.270 MP/s 0.2720/5.790 MP/s 558 non-palette 3.7607/0.797 MP/s 3.7607/0.816 MP/s Change-Id: If4041a9c18f7e972f49fcbab8c3e2f013d8bf1cf	2014-10-21 10:04:27 -07:00
James Zern	5f36b68d22	enc/backward_references.c: fix indent reindent after `c24f895` Change-Id: I55adcbef21ea3fdaded84b138745515596191a09	2014-10-20 11:35:20 +02:00
James Zern	e0e9960dd1	Merge "sync version numbers to 0.4.2 release"	2014-10-17 11:47:30 -07:00
James Zern	64ac51446d	sync version numbers to 0.4.2 release libwebp{,decoder} - 0.4.2 libwebp libtool - 5.2.0 libwebpdecoder libtool - 1.2.0 mux/demux - 0.2.2 libtool - 1.2.0 (cherry picked from commit `eec5f5f121`) (cherry picked from commit `857578a811`) Change-Id: Ie9d10c68e28083674a8865ad8447b1a70dcea95d	2014-10-17 19:50:21 +02:00
Vikas Arora	c24f8954be	Simplify and speedup Backward refs computation. Updated VP8LGetBackwardReferences and HashChainFindCopy method with following: - Remove the recursive CostModelBuild. - Reuse the lz77 backward refs in CostModelBuild, instead of evaluating it again (as it was done for recursion_level=0). - Consolidated the Match-length logic inside FindMatchLength method. - Removed the logic for altering best_length/val based on the 2D distance. The additional 162 value (+= 9 * 9 + 9 * 9 - y * y - x * x) can't change the best_val eval computation to choose a different curr_length, as best_val was set to 'curr_length << 16'. Following is the impact on the compression speed/density at default & max quality, overall this speeds up compression by 5-15% (q=100 -> 75) with a tad drop (0.02-0.03%) in compression density for the non-palette images. Before After bpp/Rate(MP/s) bpp/Rate(MP/s) q=75 (def) All 1000 2.4492/1.049 MP/s 2.4498/1.230 MP/s Palette 0.2719/5.060 MP/s 0.2719/6.110 MP/s non-Palette 3.7597/0.732 MP/s 3.7607/0.840 MP/s q=100 All 1000 2.4134/0.125 MP/s 2.4142/0.131 MP/s Palette 0.2692/2.585 MP/s 0.2692/2.885 MP/s non-Palette 3.7040/0.079 MP/s 3.7053/0.083 MP/s Change-Id: I27a5eff3356d876c3e949fd32262244b25678b7a	2014-10-17 09:21:30 -07:00
James Zern	d1c359ef29	fix shared object build with -fvisibility=hidden set WEBP_EXTERN to visibility=default + explicitly mark VP8GetCPUInfo as it's referenced within the examples Change-Id: Ie3d2b15088e888f0b55203b205993eba75899d99	2014-10-17 11:50:52 +02:00
James Zern	a4c3a31b8f	WEBP_TSAN_IGNORE_FUNCTION: fix gcc compat warning move the attribute to the front of the function to quiet clang warning: GCC does not allow no_sanitize_thread attribute in this position on a function definition Change-Id: Ie4cc6e35a07bd00eab67d9cd6801bd2be9cfe676	2014-10-16 18:06:43 +02:00
Pascal Massimino	80247291c6	mark some init function as being safe for thread_sanitizer. introduces the macro WEBP_TSAN_IGNORE_FUNCTION Change-Id: I3de2b6c1a2076fba4da7ae50322551e026b2082b	2014-10-16 16:34:07 +02:00
James Zern	79b5bdbfde	bit_reader.h: cosmetics: fix a typo Change-Id: I1ba09124700b3120f18eb3705eb5ba805feb2ca0	2014-10-16 10:52:47 +02:00
Pascal Massimino	6c6736816c	Improved near-lossless mode. Compared to previous mode it gives another 10-30% improvement in compression keeping comparable PSNR on corresponding quality settings. Still protected by the WEBP_EXPERIMENTAL_FEATURES flag. Change-Id: I4821815b9a508f4f38c98821acaddb74c73c60ac	2014-10-15 10:57:21 -07:00
James Zern	0ce27e715e	enc_mips32: workaround gcc-4.9 bug avoids an ICE with NDK r10b + NDK_TOOLCHAIN_VERSION=4.9 In function 'SSE16x16': enc_mips32.c (684) internal compiler error: Segmentation fault Change-Id: I1a3d33c0a9534c97633ab93bcdf9bf59d3a7e473	2014-10-15 19:14:04 +02:00
James Zern	aca1b98f52	enc/vp8l.c: fix indent reindent after `ca00502` Change-Id: I8c88dbc11dc96c117531b17682b764a235ef23bb	2014-10-13 11:33:23 +02:00
Vikas Arora	ca00502788	Evaluate non-palette compression for palette image Evaluate if for Palette images (num_colors <= 256), non-palette compression path (Subtract green, predictor transform etc) yield an optimal compression density. This change reduces the WebP file (for palette images) size by 0.4% with drop of 3-5% in compression speed. Change-Id: I1ad66fa94db4fd7ba7bc215763791ef662cd4f42	2014-10-10 11:55:45 -07:00
James Zern	c8a87bb62d	AssignSegments: quiet -Warray-bounds warning the number of segments are previously validated, but an explicit check is needed to avoid a warning under gcc-4.9 Change-Id: Ifa7c0dd7f3f075b3860fa8ec176d2c98ff54fcea	2014-10-10 17:18:39 +02:00
pascal massimino	32f67e309f	Merge "enc_neon: initialize vectors w/vdup_n_u32"	2014-10-09 12:23:18 -07:00
Pascal Massimino	fabc65da32	1-3% faster encoding optimizing SSE_NxN functions got rid of the |a-b|^|b-a| method and went back to just (a-b)^2 instead. quality | size(bytes) after/before | time (ms) after/before Change-Id: Ia3e0e6507b3f903deb1e182f78dad6df07380fd0	2014-10-09 07:20:00 -07:00
James Zern	7534d71640	enc_neon: initialize vectors w/vdup_n_u32 replaces {} initialization gnu-ism Change-Id: I5a7b2d4246f0205e4bfb7f4b77d720c47d8674ec	2014-10-09 12:35:41 +02:00
Pascal Massimino	5f81391263	Merge "Fix return code of EncodeImageInternal()"	2014-10-07 23:49:29 -07:00
Pascal Massimino	e321abe43d	Fix return code of EncodeImageInternal() It was returning 'VP8_ENC_OK' in case of memory error. Change-Id: I184a3e29c9f1b863637cacbe389b058d75c3dbf8	2014-10-08 08:48:53 +02:00
Pascal Massimino	f82cb06afb	optimize palette ordering We compact the palette by weighted distance, favoring the green channel. Average gain on paletted file is ~0.5%, with gain up to 6-7% on some favorable cases. Encoding speed is unaffected. Disabled for alpha (or any single-channel input) Also: always use quality=20 for EncodePalette() since it doesn't make any real difference. Change-Id: I19fb14316a366f139a941b45aef5663a33c905e1	2014-10-08 08:42:36 +02:00
Pascal Massimino	f545feee64	don't set the alpha value for histogram index image This leads to tiny extra compression (~few bytes per file) for free Change-Id: Ia4d8cef3de4365e32eacefd69a57689c80042a23	2014-10-08 08:24:19 +02:00
Pascal Massimino	2d9b0a4472	add WebPDispatchAlphaToGreen() to dsp SSE2 version is 2.1x faster This is used to transfer the alpha plane to green channel before lossless compression. Change-Id: I01d9df0051c183b1ff5d6eb69961d4f43e33141a	2014-10-06 23:15:44 +02:00
Vikas Arora	d5e498d47f	Change Entropy based Histogram Combine heuristic. Don't combine the Histograms that have trivial (single valued A, R & B) symbols. Following is the compression savings data along with compression time (before & after) per image. Before After bpp, rate(MP/s) bpp, rate(MP/s) Q=25, method = 4 2.508, 1.807 2.499, 1.916 Q=50, method = 4 2.460, 1.488 2.456, 1.512 Q=75, method = 4 2.452, 1.078 2.450, 1.092 Q=25, method = 5 2.505, 1.398 2.496, 1.383 Q=50, method = 5 2.458, 1.170 2.453, 1.143 Q=75, method = 5 2.453, 0.886 2.450, 0.855 This change provides 0.1-0.4% compression gains and speeds up the lossless compression for the default method=4 (the drop in compression speed is between 1-3.5% for method=5). Change-Id: Idfd88c2092f37afacd26a97097b3053f8183953a	2014-09-30 13:41:39 -07:00
Pascal Massimino	47a2d8e1d9	fix MSVC float->int conversion warning + add a clarifying comment Change-Id: I8ac1df1de2e5277f2d968dec489546e680bb5e0c	2014-09-27 00:36:01 -07:00
James Zern	35ad48b848	HistoHeapInit: correct positions allocation size Change-Id: I1879fd48bee3aea6f0504926d7030b504dd9be07	2014-09-26 11:21:19 -07:00
Pascal Massimino	45d9635fd3	lossless: entropy clustering for high qualities. Tested on 1000 pngs corpus with quality 90-100 it gives ~0.15% improvement in compression density and ~7% speed up. Change-Id: I460f56c96707edb3c1f0b51a024e5122e10458df	2014-09-26 15:26:56 +02:00
Pascal Massimino	dc37df8c7a	fix type warning for VS9_x64 Error report was: src\utils\color_cache.c(48) : warning C4334: '<<' : result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?) Change-Id: I93463ba7cd94faf1cf04986acbfaa06b62700d26	2014-09-25 23:47:06 -07:00
Vikas Arora	fdd6528ba2	Remove unused VP8LDecoder member variable Remove the unused VP8LDecoder member variable (last_cached_) Change-Id: I4a7d2f1b72d166efb978850e061dc69c8509e224	2014-09-24 11:59:51 -07:00
James Zern	ea3bba5a66	Merge "rewrite Disto4x4 in enc_neon.c with intrinsic"	2014-09-24 10:51:47 -07:00
Pascal Massimino	f060dfc422	add lossless incremental decoding support * We don't need to change DecodeAlpha, since incremental decoding is not useful for Alpha (we already decode progressively along the RGB) * Similarly, we don't do incremental decoding for level>0 planes: the metadata don't turn into visible pixel (only the ones in level0), so... (No visible speed change) Change-Id: I2fd4b9ba227561a7dbede647686584b752be7baa	2014-09-24 09:55:01 +02:00
Yang Zhang	ab70794ddb	rewrite Disto4x4 in enc_neon.c with intrinsic Performance test: Platform: A9 Input data: bryce.yuv 11158x2156 performance of assembly is the base. Less ratio is better. |toolchain |assembly |intrinsic | |gcc4.6 |100% |97.15% | |gcc4.8 |100% |95.51 | Change-Id: Idc2446685acdeb58a4dbdcdae533c68a83a1b879	2014-09-23 18:28:36 -07:00
Djordje Pesut	d4471637ef	MIPS: dspr2: added optimization for function FilterLoop24 affected functions: VFilter16i, HFilter16i, VFilter8i and HFilter8i Change-Id: I5d2bc7716e60e048a33d630fe4a86011bfb6d42e	2014-09-23 10:32:55 +02:00
skal	2aef54d429	Merge "prepare VP8LDecodeImage for incremental decode"	2014-09-23 00:31:27 -07:00
pascal massimino	aed0f5a231	Merge "MIPS: dspr2: added optimization for function FilterLoop26"	2014-09-23 00:17:25 -07:00
skal	286306853e	prepare VP8LDecodeImage for incremental decode - don't call VP8LClear() when there's no error (and let the caller do it) - only initialize output once if state_ is not READ_DATA - don't over-set dec->status_ = READ_DATA - don't re-set dec->status_ if DecodeImageStream() fails - remove unneeded dec->action_ field - make ReadImageInfo() check br->eos_ - use ErrorStatusLossless() more consistently Change-Id: Ica6e4b1c82e3fce8b1ce0274def551a886b73b0b	2014-09-23 00:13:52 -07:00
skal	248f3aed22	remove br->error_ field it's somewhat redundant with br->eos_ also make the status-check coherent. Change-Id: I98e755e037d45acb0760baf2344bf11fb5fb5cda	2014-09-23 00:04:58 -07:00
Djordje Pesut	49e15044ef	MIPS: dspr2: added optimization for function FilterLoop26 affected functions: VFilter16, HFilter16, VFilter8 and HFilter8 Change-Id: Ib2fc41aaa00b10c2906d689bdc5a10f4568e70a8	2014-09-23 08:46:05 +02:00
skal	c792d4129a	Premultiply with alpha during U/V downsampling This prevents the 'alpha-leak' reported in issue #220 Speed-diff is kept minimal. Change-Id: I1976de5e6de7cfcec89a54df9233c1a6586a5846	2014-09-18 23:40:34 -07:00
Vikas Arora	b901416b90	Record the lossless size stats. Record and show the lossless header and image data sizes in the cwebp. Change-Id: I08f19693cb7a756b6fdce5b55d71f5367b5f02fc	2014-09-17 15:16:05 -07:00
Pascal Massimino	cddd334050	Add a WebPExtractAlpha function to dsp This is the opposite of WebPDispatchAlpha + Implement the SSE2 version Change-Id: I0c297309255f508c5261da8aad01f7e57f924d6c	2014-09-15 08:12:03 +02:00
Pascal Massimino	0716a98eb3	fix indent after I0204949917836f74c0eb4ba5a7f4052a4797833b Change-Id: I5d9e5d0a2ad2cefd8c539571d2eaee948da60ad5	2014-09-12 19:59:53 +02:00
Vikas Arora	f9ced95a9b	Optimize lossless decoding for trivial(ARB) codes. Optimize the decoding for region that have trivial literal codes. The trivial literal is defined as huffman image with Red, Blue and Alpha huffman trees with only single code values. This speeds up lossless decoding by 3% Change-Id: I0204949917836f74c0eb4ba5a7f4052a4797833b	2014-09-12 09:08:08 -07:00

1471 Commits