libwebp

mirror of https://github.com/webmproject/libwebp.git synced 2025-10-21 18:01:59 +02:00

Author	SHA1	Message	Date
Vincent Rabaud	a5e3b22574	Lossless decoder SSE2 improvements. Change-Id: Ia901014ac63156a2e278b81e035256c30bdf8706	2016-12-06 13:45:09 +01:00
Pascal Massimino	58a1f124c2	~2% faster predictor #12 in NEON. Change-Id: I6772bb865d0f72720a65561eb55028e538df236d	2016-12-06 10:24:27 +01:00
Pascal Massimino	906c3b6392	Merge "Implement lossless transforms in NEON."	2016-12-03 16:55:14 +00:00
Vincent Rabaud	d23abe4e9f	Implement lossless transforms in NEON. Change-Id: I2172b1a763eb9dfe25d2b9bf1fb6501d7e192e55	2016-12-03 11:20:22 +00:00
Vincent Rabaud	2e6cb6f34e	Give more flexibility to the predictor generating macro. Change-Id: Ia651afa8322cb5c5ae87128340d05245c0f6a900	2016-12-02 12:33:12 -08:00
Vincent Rabaud	28e0bb7088	Merge "Fix race condition in multi-threading initialization."	2016-12-02 17:45:10 +00:00
Vincent Rabaud	647045305a	Fix race condition in multi-threading initialization. Before, a first thread could enter VP8LDspInitSSE2, set VP8LPredictorsAdd to an SSE2 version BEFORE another thread would do the memcpy from VP8LPredictorsAdd to VP8LPredictorsAdd_C thus leading to a C version actually being the SSE2 one (which would then create an infinite recursion in the SSE2 predictors at execution). Change-Id: I224f4ceab31d38f77a1375a7e2636a6014080e3a	2016-12-02 18:28:57 +01:00
Hui Su	1cc79e92ac	AnimEncoder: Correctly skip a frame when sub-rectangle is empty. Change-Id: I0d288bd9561b48cf5a1eae92a1b7106ba44c664e	2016-12-02 11:50:13 +01:00
Pascal Massimino	ea72cd60cb	add missing 'extern' keyword for predictor dcl Change-Id: Ibf3db9b6dae91e53524c31cdfccf4678b3fa1135	2016-12-01 08:15:14 +01:00
Vincent Rabaud	67879e6d48	SSE implementation of decoding predictors. Change-Id: I5c9ae63afc98013cb45ce8a91f051203ac68402c	2016-11-30 12:00:07 +01:00
Vincent Rabaud	a41296aef5	Fix potentially uninitialized value. Change-Id: I721695e22474992db3094942b1ad4754ae7c0a02	2016-11-29 13:19:32 +01:00
Vincent Rabaud	4239a1489c	Make the lossless predictors work on a batch of pixels. Change-Id: Ieaee34f1f97c375b9e97ef7e9df60aed353dffa1	2016-11-28 17:12:10 +01:00
Pascal Massimino	bc18ebad2e	fix extra 'const's in signatures Change-Id: Ie433d0defbc0c6feae2eb2f11e70082f1affada8	2016-11-25 09:45:52 +01:00
Vincent Rabaud	71e2f5cadf	Remove memcpy in lossless decoding. Change-Id: Iba694b306486d67764e2fc5576c98a974c9b886c	2016-11-24 17:45:24 +01:00
Vincent Rabaud	7474d46e45	Do not use a register array in SSE. Change-Id: I79cf95bdac1164fc4de899828e9380c23df8d141	2016-11-24 13:06:44 +01:00
Owen Rodley	67748b41db	Improve latency of FTransform2. Benchmarks from vrabaud@: 8BIT/GRAY corpus speed: faster: -4.3 % , corpus size: unchanged skal/sources_png_skal corpus speed: faster: -5.2 % , corpus size: unchanged images/png_rgb corpus speed: faster: -5.1 % , corpus size: unchanged images/lpcb corpus speed: unchanged, corpus size: unchanged images/png_big corpus speed: faster: -1.7 % , corpus size: unchanged images/png_doc corpus speed: unchanged, corpus size: unchanged images/png_1bit corpus speed: faster: -1.2 % , corpus size: unchanged images/jpeg_small corpus speed: unchanged, corpus size: unchanged images/icip_core1 corpus speed: unchanged, corpus size: unchanged images/png_gray corpus speed: faster: -2.5 % , corpus size: unchanged images/jpeg_high_quality corpus speed: faster: -4.0 % , corpus size: unchanged images/jpeg corpus speed: faster: -2.3 % , corpus size: unchanged images/png_translucent corpus speed: faster: -2.8 % , corpus size: unchanged images/gif corpus speed: faster: -1.4 % , corpus size: unchanged images/png_opaque corpus speed: faster: -2.8 % , corpus size: unchanged images/png_rgb_opaque corpus speed: unchanged, corpus size: unchanged images/png_indexed corpus speed: faster: -2.0 % , corpus size: unchanged images/all corpus speed: faster: -1.5 % , corpus size: unchanged images/png_small corpus speed: unchanged, corpus size: unchanged images/png corpus speed: unchanged, corpus size: unchanged images/gif_still corpus speed: faster: -1.6 % , corpus size: unchanged Change-Id: I69fe11baa188c5d32cbc77a84b8c0deae13d792b	2016-11-24 07:09:50 +00:00
Vincent Rabaud	6540cd0eeb	Provide an SSE implementation of ConvertBGRAToRGB Change-Id: Ida11b079077a47fe3b92754f08aa30d81c301fcf	2016-11-23 16:25:51 +01:00
Pascal Massimino	3c2a61b099	remove some unneeded casts Change-Id: Ie68788c77f016ed11446a55142b1bd8d96261452	2016-11-16 22:54:40 -08:00
Pascal Massimino	9ac063c37f	add dsp functions for SmartYUV + SSE2 implementation Change-Id: I5cfdb62d68b5a95899241a097d3a2f697fbc590e	2016-11-16 14:23:06 +00:00
Pascal Massimino	22efabddb4	Merge "smart_yuv: switch to planar instead of packed r/g/b processing"	2016-11-15 14:55:17 +00:00
Pascal Massimino	1d6e7bf39f	smart_yuv: switch to planar instead of packed r/g/b processing avoiding triplets of data should make it easier to write SSE2 versions. FilterRow() can now filter all input in one single pass -> conversion is 15-20% faster (but still overall slow compared to -pre 0) Change-Id: I14c3215e672fdecde7ec80394e814bdc7445019f	2016-11-15 14:51:34 +01:00
Pascal Massimino	0a3838ca77	fix bug in RefineUsingDistortion() When try_both_modes=0 (that is: -m 0 or -m 1), and the mode is i4, we were still sometimes falling back to (unexplored, uninitialized) i16 mode, which resulted in an enc/dec mismatch. This was mainly occurring for large images (when bit_limit is low enough). We disable the fall-back by disabling bit_limit using a large MAX_COST threshold. Change-Id: I0c60257595812bd813b239ff4c86703ddf63cbf8	2016-11-12 02:15:28 -08:00
James Zern	342e15f0ce	Import: use relative pointer offsets avoids int rollover when working with large input BUG=webp:312 Change-Id: I6ad9f93b6c4b665c559bff87716a7b847f66a20d	2016-11-07 17:08:13 -08:00
James Zern	1147ab4ee7	PreprocessARGB: use relative pointer offsets avoids int rollover when working with large input BUG=webp:312 Change-Id: I2881bec2884b550c966108beeff1bf0d8ef9f76b	2016-11-07 17:08:06 -08:00
Pascal Massimino	e4cd4daf74	fix filtering auto-adjustment the min-distortion was quite too low. And we were also considering the fully skipped macroblocks (nz=0) in the stats. We need to have at least some non-zero dc coeffs (nz=0x100XXXX). Fix also two typos in StoreMaxDelta: the v0/v1 comparison was wrong, and the DCs[] coeffs are actually already in ZigZag order. Change-Id: I602aaa74b36f7ce80017e506212c7d6fd9deba1f	2016-11-07 06:43:51 -08:00
Pascal Massimino	e715285611	fix doc and code snippet for WebPINewDecoder() doc Change-Id: I1a75fdf60f0b9f1816be28f22613438bfe21752b	2016-11-04 12:07:54 +01:00
James Zern	de9fa5074e	ConvertWRGBToYUV: use relative pointer offsets avoids int rollover when working with large input BUG=webp:312 Change-Id: I693cbb295df9cf94aa89294b19c0496bdbe84d18	2016-11-04 00:35:04 -07:00
James Zern	deb1b83199	ImportYUVAFromRGBA: use relative pointer offsets avoids int rollover when working with large input BUG=webp:312 Change-Id: I3d7b689be8d5751248a82d1021243d80d3f67203	2016-11-04 00:34:58 -07:00
Pascal Massimino	31b1e34342	fix SSIM metric ... by ignoring too-dark area Roughly, if both the source and the reference areas are too dark (R/G/B <= ~6), they are ignored. One caveat: SSIM calculation won't work for U/V planes, which are 128-centered and not related to luminance. But WebPPlaneDistortion() enforces the conversion to RGB, if needed. Change-Id: I586c2579c475583b8c90c5baefd766b1d5aea591	2016-10-20 15:17:55 +02:00
Pascal Massimino	2f51b614b0	introduce WebPPlaneDistortion to compute plane distortion Make WebPPictureDistortion() only compute distortion on A/R/G/B planes, not Y/U/V(A). (not just for SSIM, but PSNR too). This is to avoid problems with using SSIM on U/V channels. If Y/U/V distortion is needed, one can always use WebPPlaneDistortion() individually. Change-Id: If8bc9c3ac12a8d2220f03224694fc389b16b7da9	2016-10-19 09:12:13 +02:00
Pascal Massimino	4eb5df28d1	remove unused stride fields from VP8Iterator Change-Id: I242aaa746dc53c456eb8f1a71a5a2378f26fa843	2016-10-10 18:08:47 +02:00
Vincent Rabaud	11bc423ae5	MIN_LENGTH cleanups. No change in logic so no change in speed or compression. Change-Id: I744161978c7d058c9b58450f330cba11731530c6	2016-10-10 15:37:45 +02:00
Pascal Massimino	273d035a44	Merge "fix a typo in WebPPictureYUVAToARGB's doc"	2016-10-10 13:30:20 +00:00
Pascal Massimino	dc789ada44	fix a typo in WebPPictureYUVAToARGB's doc method -> colorspace Change-Id: I5c9a2ccc909c967a936758dde2cfce92eb95462a	2016-10-10 04:50:10 -07:00
Vincent Rabaud	539f5a688f	Fix non-included header in config.c. When compiling as experimental, WEBP_EXPERIMENTAL_FEATURES would not be defined because the header defining it would not be included. Hence runtime errors in debug mode when running: ./cwebp -lossless whatever ... Error! Cannot encode picture as WebP Error code: 4 (INVALID_CONFIGURATION: configuration is invalid) (detail: WebPConfig would have a random value set for delta_palettization as config.c does not consider it to exist.) Change-Id: I41761cffe81a971130ed514b195a73d1c6dac1b7	2016-10-10 13:39:17 +02:00
Pascal Massimino	aaf2a6a698	systematically call WebPDemuxReleaseIterator() on dec->prev_iter_ Change-Id: I4a767134dcc52a7ee7c3bc5deb91012eaf7b6512	2016-10-07 17:30:58 -07:00
hui su	68ae5b671f	Add libwebp/src/mux/animi.h Change-Id: I80ca2070d419acf6e8355a295ee965d2df5a4d8f	2016-10-05 10:33:29 -07:00
Vincent Rabaud	28ce304344	Remove some errors when compiling the code as C++. This fixes some cases from https://bugs.chromium.org/p/webp/issues/detail?id=137 Change-Id: I58f3a617bf973dbe4c5794004a01e2aea39ba53a	2016-10-05 09:39:08 +02:00
hui su	b34abcb8b1	Favor keeping the areas locally similar in spatial prediction mode selection About 0.1% compression improvement. Change-Id: If106ab209cc2671ef282b726e09ff2971c3e4abf	2016-10-04 16:28:24 -07:00
Pascal Massimino	ba843a92e7	fix some SSIM calculations * prevent 64bit overflow by controlling the 32b->64b conversions and preventively descaling by 8bit before the final multiply * adjust the threshold constants C1 and C2 to de-emphasize the dark areas * use a hat-like filter instead of box-filtering to avoid blockiness during averaging SSIM distortion calc is actually faster now in SSE2, because of the unrolling during the function rewrite. The C-version is quite slower because it is still un-optimized. Change-Id: I96e2715827f79d26faae354cc28c7406c6800c90	2016-10-04 01:09:07 -07:00
Vincent Rabaud	f79450ca02	Speedup ApplyMap. If a small hash map can be used, use it to avoid binary search. The first hash function that is tried works with the previous use case of having indexed data in green. Change-Id: I2f91cec5f3ca7e9c393fd829e69e09bab74f4e7c	2016-09-28 17:18:08 +02:00
Pascal Massimino	cfdda7c6bf	Merge "prevent 32b overflow for very large canvas_width / height"	2016-09-28 15:09:58 +00:00
Vincent Rabaud	30d43706d3	Speed-up Combined entropy for palettized histograms. Change-Id: Ie9bdebb26c726e5b44c2dbcc84d453f85a03f419	2016-09-28 13:22:13 +02:00
Pascal Massimino	86a84b3598	2x faster SSE2 implementation of SSIMGet Change-Id: I53705d7ddfa595389ff2d542e5088f96f948d351	2016-09-23 23:23:06 -07:00
James Zern	b8384b53d6	lower WEBP_MAX_ALLOCABLE_MEMORY default restrict to 2^34 for 64-bit targets, < 2^32 for 32-bit Change-Id: Iff4ce40ae2c3c7fc119f018c2128dbe8f744341f	2016-09-22 23:13:33 -07:00
Pascal Massimino	1c36440094	prevent 32b overflow for very large canvas_width / height some multiplies here and there needed some extra checks and error reporting. Even if width * height is guaranteed to be < 2**32, we were multiplying by num_channels and triggering a 32b overflow. Some multiplies were not using size_t or uint64_t, additionally. Change-Id: If2a35b94c8af204135f4b88a7fd63850aa381bbf	2016-09-23 05:19:32 +00:00
Vincent Rabaud	5f1caf2987	Small LZ77 speedups. The most common conditions are re-ordered and cached. iter_min was recently introduced to make sure enough iterations are made in cases where there are many matches (mostly uniform regions). Now that those are properly analyzed, it becomes useless. Change-Id: Id3010ee4ec66b84d602fcb926f91eb9155ad27f4	2016-09-22 14:03:25 +02:00
hui su	a2fe9bf404	Speedup TrellisQuantizeBlock(). -Skip examining quantized levels that are too high. -Calculate last_pos_cost only when needed. Encoding speed for m6 is increased by about 3%; Compression performance is neutral. Change-Id: I8af70b049587cca0375d9b3eb00479ec7c0c842a	2016-09-20 14:54:15 -07:00
Pascal Massimino	573cce270e	smartYUV improvements * switch to Rec709 transfer function in SmartYUV * use Rec709 for Gray evaluation too. * stop iterations if error is going up See paragraph 1.2 and 3.2: https://www.itu.int/dms_pubrec/itu-r/rec/bt/R-REC-BT.709-6-201506-I!!PDF-E.pdf (digest: https://en.wikipedia.org/wiki/Rec._709#Transfer_characteristics) suggested by pdknsk@gmail.com on the mailing list Change-Id: I12b5f4d3e318dd5134984e1c0a4b244a620a57d7	2016-09-20 06:14:31 +00:00
Pascal Massimino	21e7537abe	fix infinite loop in case of PARTITION0 overflow max_i4_header_bits_ could drop to zero for difficult image and trigger a loop. Surprisingly, StatLoop() didn't have this bug. Change-Id: Idc0f9eadef30a2b2f02041b994f25def30901e36	2016-09-15 02:38:28 -07:00


2253 Commits