libwebp

mirror of https://github.com/webmproject/libwebp.git synced 2025-08-11 02:20:33 +02:00

Author	SHA1	Message	Date
Vikas Arora	c24f8954be	Simplify and speedup Backward refs computation. Updated VP8LGetBackwardReferences and HashChainFindCopy method with following: - Remove the recursive CostModelBuild. - Reuse the lz77 backward refs in CostModelBuild, instead of evaluating it again (as it was done for recursion_level=0). - Consolidated the Match-length logic inside FindMatchLength method. - Removed the logic for altering best_length/val based on the 2D distance. The additional 162 value (+= 9 * 9 + 9 * 9 - y * y - x * x) can't change the best_val eval computation to choose a different curr_length, as best_val was set to 'curr_length << 16'. Following is the impact on the compression speed/density at default & max quality, overall this speeds up compression by 5-15% (q=100 -> 75) with a tad drop (0.02-0.03%) in compression density for the non-palette images. bpp/Rate (MP/s), before -> after: q=75 (def): All 1000: 2.4492/1.049 -> 2.4498/1.230; Palette: 0.2719/5.060 -> 0.2719/6.110; non-Palette: 3.7597/0.732 -> 3.7607/0.840. q=100: All 1000: 2.4134/0.125 -> 2.4142/0.131; Palette: 0.2692/2.585 -> 0.2692/2.885; non-Palette: 3.7040/0.079 -> 3.7053/0.083. Change-Id: I27a5eff3356d876c3e949fd32262244b25678b7a	2014-10-17 09:21:30 -07:00
James Zern	d1c359ef29	fix shared object build with -fvisibility=hidden set WEBP_EXTERN to visibility=default + explicitly mark VP8GetCPUInfo as it's referenced within the examples Change-Id: Ie3d2b15088e888f0b55203b205993eba75899d99	2014-10-17 11:50:52 +02:00
James Zern	a4c3a31b8f	WEBP_TSAN_IGNORE_FUNCTION: fix gcc compat warning move the attribute to the front of the function to quiet clang warning: GCC does not allow no_sanitize_thread attribute in this position on a function definition Change-Id: Ie4cc6e35a07bd00eab67d9cd6801bd2be9cfe676	2014-10-16 18:06:43 +02:00
Pascal Massimino	80247291c6	mark some init function as being safe for thread_sanitizer. introduces the macro WEBP_TSAN_IGNORE_FUNCTION Change-Id: I3de2b6c1a2076fba4da7ae50322551e026b2082b	2014-10-16 16:34:07 +02:00
James Zern	79b5bdbfde	bit_reader.h: cosmetics: fix a typo Change-Id: I1ba09124700b3120f18eb3705eb5ba805feb2ca0	2014-10-16 10:52:47 +02:00
Pascal Massimino	6c6736816c	Improved near-lossless mode. Compared to previous mode it gives another 10-30% improvement in compression keeping comparable PSNR on corresponding quality settings. Still protected by the WEBP_EXPERIMENTAL_FEATURES flag. Change-Id: I4821815b9a508f4f38c98821acaddb74c73c60ac	2014-10-15 10:57:21 -07:00
James Zern	0ce27e715e	enc_mips32: workaround gcc-4.9 bug avoids an ICE with NDK r10b + NDK_TOOLCHAIN_VERSION=4.9 In function 'SSE16x16': enc_mips32.c (684) internal compiler error: Segmentation fault Change-Id: I1a3d33c0a9534c97633ab93bcdf9bf59d3a7e473	2014-10-15 19:14:04 +02:00
James Zern	aca1b98f52	enc/vp8l.c: fix indent reindent after `ca00502` Change-Id: I8c88dbc11dc96c117531b17682b764a235ef23bb	2014-10-13 11:33:23 +02:00
Vikas Arora	ca00502788	Evaluate non-palette compression for palette image Evaluate if for Palette images (num_colors <= 256), non-palette compression path (Subtract green, predictor transform etc) yield an optimal compression density. This change reduces the WebP file (for palette images) size by 0.4% with drop of 3-5% in compression speed. Change-Id: I1ad66fa94db4fd7ba7bc215763791ef662cd4f42	2014-10-10 11:55:45 -07:00
James Zern	c8a87bb62d	AssignSegments: quiet -Warray-bounds warning the number of segments are previously validated, but an explicit check is needed to avoid a warning under gcc-4.9 Change-Id: Ifa7c0dd7f3f075b3860fa8ec176d2c98ff54fcea	2014-10-10 17:18:39 +02:00
pascal massimino	32f67e309f	Merge "enc_neon: initialize vectors w/vdup_n_u32"	2014-10-09 12:23:18 -07:00
Pascal Massimino	fabc65da32	1-3% faster encoding optimizing SSE_NxN functions got rid of the |a-b|^|b-a| method and went back to just (a-b)^2 instead. quality | size(bytes) after/before | time (ms) after/before Change-Id: Ia3e0e6507b3f903deb1e182f78dad6df07380fd0	2014-10-09 07:20:00 -07:00
James Zern	7534d71640	enc_neon: initialize vectors w/vdup_n_u32 replaces {} initialization gnu-ism Change-Id: I5a7b2d4246f0205e4bfb7f4b77d720c47d8674ec	2014-10-09 12:35:41 +02:00
Pascal Massimino	5f81391263	Merge "Fix return code of EncodeImageInternal()"	2014-10-07 23:49:29 -07:00
Pascal Massimino	e321abe43d	Fix return code of EncodeImageInternal() It was returning 'VP8_ENC_OK' in case of memory error. Change-Id: I184a3e29c9f1b863637cacbe389b058d75c3dbf8	2014-10-08 08:48:53 +02:00
Pascal Massimino	f82cb06afb	optimize palette ordering We compact the palette by weighted distance, favoring the green channel. Average gain on paletted file is ~0.5%, with gain up to 6-7% on some favorable cases. Encoding speed is unaffected. Disabled for alpha (or any single-channel input) Also: always use quality=20 for EncodePalette() since it doesn't make any real difference. Change-Id: I19fb14316a366f139a941b45aef5663a33c905e1	2014-10-08 08:42:36 +02:00
Pascal Massimino	f545feee64	don't set the alpha value for histogram index image This leads to tiny extra compression (~few bytes per file) for free Change-Id: Ia4d8cef3de4365e32eacefd69a57689c80042a23	2014-10-08 08:24:19 +02:00
Pascal Massimino	2d9b0a4472	add WebPDispatchAlphaToGreen() to dsp SSE2 version is 2.1x faster This is used to transfer the alpha plane to green channel before lossless compression. Change-Id: I01d9df0051c183b1ff5d6eb69961d4f43e33141a	2014-10-06 23:15:44 +02:00
Vikas Arora	d5e498d47f	Change Entropy based Histogram Combine heuristic. Don't combine the Histograms that have trivial (single valued A, R & B) symbols. Following is the compression savings data along with compression time (before & after) per image. bpp, rate (MP/s), before -> after: Q=25, method=4: 2.508, 1.807 -> 2.499, 1.916; Q=50, method=4: 2.460, 1.488 -> 2.456, 1.512; Q=75, method=4: 2.452, 1.078 -> 2.450, 1.092; Q=25, method=5: 2.505, 1.398 -> 2.496, 1.383; Q=50, method=5: 2.458, 1.170 -> 2.453, 1.143; Q=75, method=5: 2.453, 0.886 -> 2.450, 0.855. This change provides 0.1-0.4% compression gains and speeds up the lossless compression for the default method=4 (the drop in compression speed is between 1-3.5% for method=5). Change-Id: Idfd88c2092f37afacd26a97097b3053f8183953a	2014-09-30 13:41:39 -07:00
Pascal Massimino	47a2d8e1d9	fix MSVC float->int conversion warning + add a clarifying comment Change-Id: I8ac1df1de2e5277f2d968dec489546e680bb5e0c	2014-09-27 00:36:01 -07:00
James Zern	35ad48b848	HistoHeapInit: correct positions allocation size Change-Id: I1879fd48bee3aea6f0504926d7030b504dd9be07	2014-09-26 11:21:19 -07:00
Pascal Massimino	45d9635fd3	lossless: entropy clustering for high qualities. Tested on 1000 pngs corpus with quality 90-100 it gives ~0.15% improvement in compression density and ~7% speed up. Change-Id: I460f56c96707edb3c1f0b51a024e5122e10458df	2014-09-26 15:26:56 +02:00
Pascal Massimino	dc37df8c7a	fix type warning for VS9_x64 Error report was: src\utils\color_cache.c(48) : warning C4334: '<<' : result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?) Change-Id: I93463ba7cd94faf1cf04986acbfaa06b62700d26	2014-09-25 23:47:06 -07:00
Vikas Arora	fdd6528ba2	Remove unused VP8LDecoder member variable Remove the unused VP8LDecoder member variable (last_cached_) Change-Id: I4a7d2f1b72d166efb978850e061dc69c8509e224	2014-09-24 11:59:51 -07:00
James Zern	ea3bba5a66	Merge "rewrite Disto4x4 in enc_neon.c with intrinsic"	2014-09-24 10:51:47 -07:00
Pascal Massimino	f060dfc422	add lossless incremental decoding support * We don't need to change DecodeAlpha, since incremental decoding is not useful for Alpha (we already decode progressively along the RGB) * Similarly, we don't do incremental decoding for level>0 planes: the metadata don't turn into visible pixel (only the ones in level0), so... (No visible speed change) Change-Id: I2fd4b9ba227561a7dbede647686584b752be7baa	2014-09-24 09:55:01 +02:00
Yang Zhang	ab70794ddb	rewrite Disto4x4 in enc_neon.c with intrinsic Performance test: Platform: A9 Input data: bryce.yuv 11158x2156 performance of assembly is the base. Lower ratio is better. toolchain: assembly / intrinsic; gcc4.6: 100% / 97.15%; gcc4.8: 100% / 95.51% Change-Id: Idc2446685acdeb58a4dbdcdae533c68a83a1b879	2014-09-23 18:28:36 -07:00
Djordje Pesut	d4471637ef	MIPS: dspr2: added optimization for function FilterLoop24 affected functions: VFilter16i, HFilter16i, VFilter8i and HFilter8i Change-Id: I5d2bc7716e60e048a33d630fe4a86011bfb6d42e	2014-09-23 10:32:55 +02:00
skal	2aef54d429	Merge "prepare VP8LDecodeImage for incremental decode"	2014-09-23 00:31:27 -07:00
pascal massimino	aed0f5a231	Merge "MIPS: dspr2: added optimization for function FilterLoop26"	2014-09-23 00:17:25 -07:00
skal	286306853e	prepare VP8LDecodeImage for incremental decode - don't call VP8LClear() when there's no error (and let the caller do it) - only initialize output once if state_ is not READ_DATA - don't over-set dec->status_ = READ_DATA - don't re-set dec->status_ if DecodeImageStream() fails - remove unneeded dec->action_ field - make ReadImageInfo() check br->eos_ - use ErrorStatusLossless() more consistently Change-Id: Ica6e4b1c82e3fce8b1ce0274def551a886b73b0b	2014-09-23 00:13:52 -07:00
skal	248f3aed22	remove br->error_ field it's somewhat redundant with br->eos_ also make the status-check coherent. Change-Id: I98e755e037d45acb0760baf2344bf11fb5fb5cda	2014-09-23 00:04:58 -07:00
Djordje Pesut	49e15044ef	MIPS: dspr2: added optimization for function FilterLoop26 affected functions: VFilter16, HFilter16, VFilter8 and HFilter8 Change-Id: Ib2fc41aaa00b10c2906d689bdc5a10f4568e70a8	2014-09-23 08:46:05 +02:00
skal	c792d4129a	Premultiply with alpha during U/V downsampling This prevents the 'alpha-leak' reported in issue #220 Speed-diff is kept minimal. Change-Id: I1976de5e6de7cfcec89a54df9233c1a6586a5846	2014-09-18 23:40:34 -07:00
Vikas Arora	b901416b90	Record the lossless size stats. Record and show the lossless header and image data sizes in the cwebp. Change-Id: I08f19693cb7a756b6fdce5b55d71f5367b5f02fc	2014-09-17 15:16:05 -07:00
Pascal Massimino	cddd334050	Add a WebPExtractAlpha function to dsp This is the opposite of WebPDispatchAlpha + Implement the SSE2 version Change-Id: I0c297309255f508c5261da8aad01f7e57f924d6c	2014-09-15 08:12:03 +02:00
Pascal Massimino	0716a98eb3	fix indent after I0204949917836f74c0eb4ba5a7f4052a4797833b Change-Id: I5d9e5d0a2ad2cefd8c539571d2eaee948da60ad5	2014-09-12 19:59:53 +02:00
Vikas Arora	f9ced95a9b	Optimize lossless decoding for trivial(ARB) codes. Optimize the decoding for region that have trivial literal codes. The trivial literal is defined as huffman image with Red, Blue and Alpha huffman trees with only single code values. This speeds up lossless decoding by 3% Change-Id: I0204949917836f74c0eb4ba5a7f4052a4797833b	2014-09-12 09:08:08 -07:00
Pascal Massimino	690b491af1	fix loop bug in DispatchAlpha() * We were re-doing most of the work in plain-C as 'left-over'. * we were always returning has_alpha = true because of a bad mask all_0xff These bugs were conservative and silent, in the sense that we were 'just' doing more work than necessary. Now, the SSE2 version is really 2x faster than the C version. Change-Id: I6c8132a267fe3c7a3d1fa70e7a5fcd10719543fa	2014-09-11 22:35:08 +02:00
Djordje Pesut	3101f53720	MIPS: dspr2: added optimization for TransformOne added macros for TransformOne, TransformAC3 and TransformDC Change-Id: I4341450f443cf46dcf91c0db17bde63c8fb8afee	2014-09-11 17:02:02 +02:00
Pascal Massimino	a6bb9b17d8	SSE2 for inverse Mult(ARGB)Row and ApplyAlphaMultiply Change-Id: Iab5c0e4a4d2b31f86736a9b277e62b6e28c3d2b4 WebPMultRow: ~7x faster WebPMultARGBRow: ~3x faster ApplyAlphaMultiply: 60% faster	2014-09-11 07:58:42 +02:00
Vikas Arora	d84a8ffdf7	Remove default initialization of decoder status. Remove the default initialization of decoder status in the method VP8LDecodeImage(). Change-Id: Ie6b949606349f4e937c4c1dd2c02ff2a4f86870f	2014-09-10 14:55:46 -07:00
Vikas Arora	e0a9932161	Rectify bug in lossless incremental decoding. Handle the corner case when VP8LDecodeImage() method is called with an invalid header data. The lossless decoding doesn't support incremental mode yet. Return the error status as BITSTREAM error in case not all pixels are decoded with the provided bit-stream. Also added asserts in the VP8LDecodeImage() method to validate the decoder header with appropriate/valid data for huffman trees (htree_groups_ etc). Change-Id: Ibac9fcfc4bd0a2c5f624bb9d4a2b9f6459aa19ea	2014-09-09 15:34:16 -07:00
Djordje Pesut	e2502a97c1	MIPS: dspr2: added optimization for TransformAC3 Change-Id: Icd789ee5f6d764297e7dc0a0f8a3bc47ab92ac65	2014-09-09 14:53:36 +02:00
Djordje Pesut	24e1072aac	MIPS: dspr2: added optimization for TransformDC Change-Id: Iee69758f6442ea9c80ddaa32cea8d00dda4c6252	2014-09-09 14:15:04 +02:00
Pascal Massimino	c0e84df8e8	Merge "Slightly faster lossless decoding (1%)"	2014-09-09 03:55:00 -07:00
Pascal Massimino	8dd28bb560	Slightly faster lossless decoding (1%) -> introduce special case 64b pattern-copy, similar to the 8b one for alpha. -> use memcpy() for non-overlapping areas + cosmetics and homogenization of the code Change-Id: I0e65e04b96fec94c009a4614137dfba2a0f98561	2014-09-09 11:18:30 +02:00
Djordje Pesut	f0103595dd	MIPS: dspr2: added optimization for ColorIndexInverseTransforms Change-Id: I5b6094ce489d4f896bc4b8f575142eb3c5054beb	2014-09-08 17:22:59 +02:00
Pascal Massimino	d3242aee16	make VP8LSetBitPos() set br->eos_ flag ReadSymbol() finishes with a VP8LSetBitPos() call only and could miss an eos_ during the decode loop. Things are faster because of inlining too. Change-Id: I2d2a275f38834ba005bc767d45c5de72d032103e	2014-09-06 08:40:20 +02:00
Pascal Massimino	a9decb5584	Lossless decoding: fix eos_ flag condition eos_ needs to be set only when superfluous bits have actually been requested. Earlier, we were assuming pre-mature end-of-stream to be an error. Now, more precisely, we mark error when we have encountered end-of-stream and we attempt to read more bits after that. This handles cases where image data requires no bits to be read Change-Id: I628e2c39c64f10c443fb51f86b1f5919cc9fd299	2014-09-05 20:21:50 +02:00
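
Commit fabc65da32 in the log above reports going back to a plain (a-b)^2 for the SSE_NxN functions. The following is a minimal sketch of such a squared-error sum in C. The function name SumSquaredError, its signature, and the assumption that both blocks share one stride are hypothetical and chosen for illustration; this is not libwebp's actual code.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper, not taken from libwebp: returns the sum of squared
// differences between two w x h blocks of 8-bit samples. Both blocks are
// assumed to use the same row stride (in bytes).
static int SumSquaredError(const uint8_t* a, const uint8_t* b,
                           int w, int h, int stride) {
  int sum = 0;
  for (int y = 0; y < h; ++y) {
    for (int x = 0; x < w; ++x) {
      // Widen to int before subtracting so the difference can be negative.
      const int diff = (int)a[x] - (int)b[x];
      sum += diff * diff;
    }
    a += stride;
    b += stride;
  }
  return sum;
}
```

For a 16x16 block the largest possible result is 256 * 255 * 255, which fits comfortably in a 32-bit int, so no wider accumulator is needed at these block sizes.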


1659 Commits