libwebp

mirror of https://github.com/webmproject/libwebp.git synced 2025-08-11 02:20:33 +02:00

Author	SHA1	Message	Date
Vikas Arora	e912bd55be	Fix bug in VP8LCalculateEstimateForCacheSize. The method VP8LCalculateEstimateForCacheSize was not evaluating the full range of possible cache_bits values. Also added a small penalty for choosing a larger cache size. This strikes a balance between the additional memory/CPU cost (with a larger cache size) and the byte savings from smaller WebP lossless files. This change saves about 0.07% bytes and speeds up compression by 8% (default settings). There is a small speedup at Q=50 along with byte savings as well. Compression at Quality=25 is not affected by this change. Change-Id: Id8f87dee6b5bccb2baa6dbdee479ee9cda8f4f77	2014-10-26 20:05:48 -07:00
James Zern	22881c999e	dec_neon: add RD4 intra predictor based on the SSE2 version; a bit rough around the loads, but still ~38% faster. Change-Id: I22426d939a7354cbc9a85ca8c68235d6081b882f	2014-10-24 21:22:07 +02:00
James Zern	1304eb3418	Merge "dec_neon: DC4: use pair-wise adds for top row"	2014-10-23 08:08:34 -07:00
pascal massimino	7083006b61	Merge "dsp/dec_{neon,sse2}: VE4: normalize variable names"	2014-10-23 07:29:27 -07:00
James Zern	0db9031c79	dsp/dec_{neon,sse2}: VE4: normalize variable names use '0' rather than '_' when dealing with variables that result from a shift Change-Id: I29280c0dead645ce39dc4bb42c3e19929b302fd4	2014-10-23 16:04:13 +02:00
James Zern	b5bc15305b	dec_neon: DC4: use pair-wise adds for top row reduces load count, slightly faster Change-Id: I880340ef8ef75ce4ce321c330f56f86b758bda08	2014-10-23 15:48:49 +02:00
Pascal Massimino	5b90d8fe42	Unify the API between VP8BitWriter and VP8LBitWriter BitReader will be next... Change-Id: Icd9e7ab2e3890131e664c0523627d9b8c5399a74	2014-10-23 15:35:16 +02:00
pascal massimino	f7ada560ce	Merge changes I2e06907b,Ia9ed4ca6,I782282ff * changes: dec_neon: add DC4 intra predictor dec_neon: add TM4 intra predictor dec_neon: add LD4 intra predictor	2014-10-23 06:31:54 -07:00
pascal massimino	5beb6bf070	Merge "dec_neon: add VE4 intra predictor"	2014-10-23 05:38:41 -07:00
James Zern	eba6ce06c3	dec_neon: add DC4 intra predictor ~70% faster Change-Id: I2e06907b8d69be71a8c5581832c931923c24bab0	2014-10-23 14:21:08 +02:00
James Zern	79abfbd9df	dec_neon: add TM4 intra predictor ~21% faster Change-Id: Ia9ed4ca650f9d544821fa1faf3173611806a272a	2014-10-23 14:21:08 +02:00
James Zern	fe395f0e4d	dec_neon: add LD4 intra predictor based on SSE2 version, ~55% faster Change-Id: I782282ffc31dcf238890b3ba0decccf1d793dad0	2014-10-23 14:20:47 +02:00
James Zern	32de385eca	dec_neon: add VE4 intra predictor based on SSE2 version, ~59% faster Change-Id: Iaa2181eb51bd975de0e9fe5c7b66ed18188f0e3b	2014-10-23 11:46:08 +02:00
Vikas Arora	c2b5a0396a	Modify CostModel to allocate optimal memory. Change-Id: I7d52675d28bfc109d4e901581fc24cd36fcb79ee	2014-10-22 13:30:33 -07:00
Pascal Massimino	b7a33d7e91	implement VE4/HE4/RD4/... in SSE2 (30% faster prediction functions, but overall speed-up is ~1% only) Change-Id: I2c6e7074aa26a2359c9198a9015e5cbe143c2765	2014-10-22 18:25:36 +02:00
Pascal Massimino	97c76f1f30	make VP8PredLuma4[] non-const and initialize array in VP8DspInit() also convert 'type dst' to 'type dst' Change-Id: I41ab66ad15b548cc45d1cb8b10bbca4fe1528cae	2014-10-22 18:14:20 +02:00
pascal massimino	0ea8c6c219	Merge "PrintReg: output to stderr"	2014-10-22 08:55:10 -07:00
James Zern	f85ec712b0	PrintReg: output to stderr allows use of '-o -' while testing Change-Id: Ibc02d7cede2df4eb8be0a28c0ca4bf5e91864191	2014-10-22 17:28:19 +02:00
Vikas Arora	139142e440	Optimize BackwardReferenceHashChainFollowPath. Instead of calling HashChainFindMethod, call a new (subset) method HashChainFindOffset to get the offset/distance for a given length. The encoding is a tad faster at default compression (bpp / rate, before -> after): 442 Palette: 0.2720 / 5.270 MP/s -> 0.2720 / 5.790 MP/s; 558 non-palette: 3.7607 / 0.797 MP/s -> 3.7607 / 0.816 MP/s. Change-Id: If4041a9c18f7e972f49fcbab8c3e2f013d8bf1cf	2014-10-21 10:04:27 -07:00
James Zern	5f36b68d22	enc/backward_references.c: fix indent reindent after `c24f895` Change-Id: I55adcbef21ea3fdaded84b138745515596191a09	2014-10-20 11:35:20 +02:00
James Zern	e0e9960dd1	Merge "sync version numbers to 0.4.2 release"	2014-10-17 11:47:30 -07:00
James Zern	64ac51446d	sync version numbers to 0.4.2 release libwebp{,decoder} - 0.4.2 libwebp libtool - 5.2.0 libwebpdecoder libtool - 1.2.0 mux/demux - 0.2.2 libtool - 1.2.0 (cherry picked from commit `eec5f5f121`) (cherry picked from commit `857578a811`) Change-Id: Ie9d10c68e28083674a8865ad8447b1a70dcea95d	2014-10-17 19:50:21 +02:00
Vikas Arora	c24f8954be	Simplify and speedup Backward refs computation. Updated VP8LGetBackwardReferences and HashChainFindCopy method with following: - Remove the recursive CostModelBuild. - Reuse the lz77 backward refs in CostModelBuild, instead of evaluating it again (as it was done for recursion_level=0). - Consolidated the Match-length logic inside FindMatchLength method. - Removed the logic for altering best_length/val based on the 2D distance. The additional 162 value (+= 9 * 9 + 9 * 9 - y * y - x * x) can't change the best_val eval computation to choose a different curr_length, as best_val was set to 'curr_length << 16'. Following is the impact on the compression speed/density at default & max quality. Overall this speeds up compression by 5-15% (q=100 -> 75) with a slight drop (0.02-0.03%) in compression density for the non-palette images. Results (bpp / rate, before -> after): q=75 (default): All 1000: 2.4492 / 1.049 MP/s -> 2.4498 / 1.230 MP/s; Palette: 0.2719 / 5.060 MP/s -> 0.2719 / 6.110 MP/s; non-Palette: 3.7597 / 0.732 MP/s -> 3.7607 / 0.840 MP/s. q=100: All 1000: 2.4134 / 0.125 MP/s -> 2.4142 / 0.131 MP/s; Palette: 0.2692 / 2.585 MP/s -> 0.2692 / 2.885 MP/s; non-Palette: 3.7040 / 0.079 MP/s -> 3.7053 / 0.083 MP/s. Change-Id: I27a5eff3356d876c3e949fd32262244b25678b7a	2014-10-17 09:21:30 -07:00
James Zern	d1c359ef29	fix shared object build with -fvisibility=hidden set WEBP_EXTERN to visibility=default + explicitly mark VP8GetCPUInfo as it's referenced within the examples Change-Id: Ie3d2b15088e888f0b55203b205993eba75899d99	2014-10-17 11:50:52 +02:00
James Zern	a4c3a31b8f	WEBP_TSAN_IGNORE_FUNCTION: fix gcc compat warning move the attribute to the front of the function to quiet clang warning: GCC does not allow no_sanitize_thread attribute in this position on a function definition Change-Id: Ie4cc6e35a07bd00eab67d9cd6801bd2be9cfe676	2014-10-16 18:06:43 +02:00
Pascal Massimino	80247291c6	mark some init function as being safe for thread_sanitizer. introduces the macro WEBP_TSAN_IGNORE_FUNCTION Change-Id: I3de2b6c1a2076fba4da7ae50322551e026b2082b	2014-10-16 16:34:07 +02:00
James Zern	79b5bdbfde	bit_reader.h: cosmetics: fix a typo Change-Id: I1ba09124700b3120f18eb3705eb5ba805feb2ca0	2014-10-16 10:52:47 +02:00
Pascal Massimino	6c6736816c	Improved near-lossless mode. Compared to previous mode it gives another 10-30% improvement in compression keeping comparable PSNR on corresponding quality settings. Still protected by the WEBP_EXPERIMENTAL_FEATURES flag. Change-Id: I4821815b9a508f4f38c98821acaddb74c73c60ac	2014-10-15 10:57:21 -07:00
James Zern	0ce27e715e	enc_mips32: workaround gcc-4.9 bug avoids an ICE with NDK r10b + NDK_TOOLCHAIN_VERSION=4.9 In function 'SSE16x16': enc_mips32.c (684) internal compiler error: Segmentation fault Change-Id: I1a3d33c0a9534c97633ab93bcdf9bf59d3a7e473	2014-10-15 19:14:04 +02:00
James Zern	aca1b98f52	enc/vp8l.c: fix indent reindent after `ca00502` Change-Id: I8c88dbc11dc96c117531b17682b764a235ef23bb	2014-10-13 11:33:23 +02:00
Vikas Arora	ca00502788	Evaluate non-palette compression for palette image For palette images (num_colors <= 256), evaluate whether the non-palette compression path (subtract green, predictor transform, etc.) yields a better compression density. This change reduces the WebP file size (for palette images) by 0.4%, with a drop of 3-5% in compression speed. Change-Id: I1ad66fa94db4fd7ba7bc215763791ef662cd4f42	2014-10-10 11:55:45 -07:00
James Zern	c8a87bb62d	AssignSegments: quiet -Warray-bounds warning the number of segments are previously validated, but an explicit check is needed to avoid a warning under gcc-4.9 Change-Id: Ifa7c0dd7f3f075b3860fa8ec176d2c98ff54fcea	2014-10-10 17:18:39 +02:00
pascal massimino	32f67e309f	Merge "enc_neon: initialize vectors w/vdup_n_u32"	2014-10-09 12:23:18 -07:00
Pascal Massimino	fabc65da32	1-3% faster encoding optimizing SSE_NxN functions got rid of the |a-b|^|b-a| method and went back to just (a-b)^2 instead. quality | size(bytes) after/before | time (ms) after/before Change-Id: Ia3e0e6507b3f903deb1e182f78dad6df07380fd0	2014-10-09 07:20:00 -07:00
James Zern	7534d71640	enc_neon: initialize vectors w/vdup_n_u32 replaces {} initialization gnu-ism Change-Id: I5a7b2d4246f0205e4bfb7f4b77d720c47d8674ec	2014-10-09 12:35:41 +02:00
Pascal Massimino	5f81391263	Merge "Fix return code of EncodeImageInternal()"	2014-10-07 23:49:29 -07:00
Pascal Massimino	e321abe43d	Fix return code of EncodeImageInternal() It was returning 'VP8_ENC_OK' in case of memory error. Change-Id: I184a3e29c9f1b863637cacbe389b058d75c3dbf8	2014-10-08 08:48:53 +02:00
Pascal Massimino	f82cb06afb	optimize palette ordering We compact the palette by weighted distance, favoring the green channel. Average gain on paletted file is ~0.5%, with gain up to 6-7% on some favorable cases. Encoding speed is unaffected. Disabled for alpha (or any single-channel input) Also: always use quality=20 for EncodePalette() since it doesn't make any real difference. Change-Id: I19fb14316a366f139a941b45aef5663a33c905e1	2014-10-08 08:42:36 +02:00
Pascal Massimino	f545feee64	don't set the alpha value for histogram index image This leads to tiny extra compression (~few bytes per file) for free Change-Id: Ia4d8cef3de4365e32eacefd69a57689c80042a23	2014-10-08 08:24:19 +02:00
Pascal Massimino	2d9b0a4472	add WebPDispatchAlphaToGreen() to dsp SSE2 version is 2.1x faster This is used to transfer the alpha plane to green channel before lossless compression. Change-Id: I01d9df0051c183b1ff5d6eb69961d4f43e33141a	2014-10-06 23:15:44 +02:00
Vikas Arora	d5e498d47f	Change Entropy based Histogram Combine heuristic. Don't combine the Histograms that have trivial (single valued A, R & B) symbols. Following is the compression savings data along with compression time per image (bpp, rate in MP/s, before -> after): Q=25, method=4: 2.508, 1.807 -> 2.499, 1.916; Q=50, method=4: 2.460, 1.488 -> 2.456, 1.512; Q=75, method=4: 2.452, 1.078 -> 2.450, 1.092; Q=25, method=5: 2.505, 1.398 -> 2.496, 1.383; Q=50, method=5: 2.458, 1.170 -> 2.453, 1.143; Q=75, method=5: 2.453, 0.886 -> 2.450, 0.855. This change provides 0.1-0.4% compression gains and speeds up the lossless compression for the default method=4 (the drop in compression speed is between 1-3.5% for method=5). Change-Id: Idfd88c2092f37afacd26a97097b3053f8183953a	2014-09-30 13:41:39 -07:00
Pascal Massimino	47a2d8e1d9	fix MSVC float->int conversion warning + add a clarifying comment Change-Id: I8ac1df1de2e5277f2d968dec489546e680bb5e0c	2014-09-27 00:36:01 -07:00
James Zern	35ad48b848	HistoHeapInit: correct positions allocation size Change-Id: I1879fd48bee3aea6f0504926d7030b504dd9be07	2014-09-26 11:21:19 -07:00
Pascal Massimino	45d9635fd3	lossless: entropy clustering for high qualities. Tested on 1000 pngs corpus with quality 90-100 it gives ~0.15% improvement in compression density and ~7% speed up. Change-Id: I460f56c96707edb3c1f0b51a024e5122e10458df	2014-09-26 15:26:56 +02:00
Pascal Massimino	dc37df8c7a	fix type warning for VS9_x64 Error report was: src\utils\color_cache.c(48) : warning C4334: '<<' : result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?) Change-Id: I93463ba7cd94faf1cf04986acbfaa06b62700d26	2014-09-25 23:47:06 -07:00
Vikas Arora	fdd6528ba2	Remove unused VP8LDecoder member variable Remove the unused VP8LDecoder member variable (last_cached_) Change-Id: I4a7d2f1b72d166efb978850e061dc69c8509e224	2014-09-24 11:59:51 -07:00
James Zern	ea3bba5a66	Merge "rewrite Disto4x4 in enc_neon.c with intrinsic"	2014-09-24 10:51:47 -07:00
Pascal Massimino	f060dfc422	add lossless incremental decoding support * We don't need to change DecodeAlpha, since incremental decoding is not useful for Alpha (we already decode progressively along the RGB) * Similarly, we don't do incremental decoding for level>0 planes: the metadata don't turn into visible pixel (only the ones in level0), so... (No visible speed change) Change-Id: I2fd4b9ba227561a7dbede647686584b752be7baa	2014-09-24 09:55:01 +02:00
Yang Zhang	ab70794ddb	rewrite Disto4x4 in enc_neon.c with intrinsic Performance test: Platform: A9; input data: bryce.yuv 11158x2156. Performance of assembly is the base; a lower ratio is better. gcc4.6: assembly 100%, intrinsic 97.15%; gcc4.8: assembly 100%, intrinsic 95.51%. Change-Id: Idc2446685acdeb58a4dbdcdae533c68a83a1b879	2014-09-23 18:28:36 -07:00
Djordje Pesut	d4471637ef	MIPS: dspr2: added optimization for function FilterLoop24 affected functions: VFilter16i, HFilter16i, VFilter8i and HFilter8i Change-Id: I5d2bc7716e60e048a33d630fe4a86011bfb6d42e	2014-09-23 10:32:55 +02:00
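
Commit fabc65da32 above mentions going back to a plain (a-b)^2 computation in the SSE_NxN functions. As a rough illustration of that idea only, here is a minimal scalar sketch in C. The helper name `BlockSSE`, its parameters, and the stride handling are hypothetical and are not taken from the libwebp sources, which use their own fixed-size and SIMD variants.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum of squared errors between two w x h blocks of 8-bit samples.
 * Hypothetical helper for illustration; not libwebp's actual function.
 * Each stride is the distance in bytes between consecutive rows. */
static uint32_t BlockSSE(const uint8_t* a, size_t a_stride,
                         const uint8_t* b, size_t b_stride,
                         int w, int h) {
  uint32_t sum = 0;
  assert(a != NULL && b != NULL && w > 0 && h > 0);
  for (int y = 0; y < h; ++y) {
    for (int x = 0; x < w; ++x) {
      /* Widen to int first so the difference can be negative. */
      const int diff = (int)a[x] - (int)b[x];
      sum += (uint32_t)(diff * diff);  /* plain (a-b)^2 */
    }
    a += a_stride;
    b += b_stride;
  }
  return sum;
}
```

For a 16x16 block the largest possible sum is 256 * 255 * 255, which fits comfortably in a `uint32_t`; larger blocks would need a wider accumulator.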


1531 Commits