libwebp

mirror of https://github.com/webmproject/libwebp.git synced 2025-08-11 02:20:33 +02:00

Author	SHA1	Message	Date
James Zern	77fb41c2f1	dec/vp8l/DecodeAlphaData: remove redundant cast 'pos' has been an int since: `c34307a` fix some VS9 warnings about type conversion Change-Id: I56195d4f15278fa268be52a7bfe24b94554890c4	2015-08-18 18:52:43 -07:00
Jyrki Alakuijala	90fcfcd905	Insert less hash chain entries from the beginnings of long copies. This makes the chains more efficient and a larger variety of data is tested. 0.02 % compression gain at q 100, 0.05 % at default quality. 0.8 % speedup by callgrind. 0.16 % compression gain for lossy alpha ?! Change-Id: I888120133352799eb14f5f602c7f40ab404bd665	2015-08-18 18:44:03 -07:00
skal	bd55604d1b	SSE2: add yuv444 converters, re-using yuv_sse2.c Change-Id: I4d5c9df8a4c8e8cb8b5daa537af07382894503a8	2015-08-17 21:15:37 -07:00
skal	3ec1182768	use the DispatchAlpha() call from dsp it's used in YUVA->RGBA case (quite frequent). Change-Id: Ie88f8c7f74cd274b3c6cbe81506f4425c164c7b3	2015-08-17 18:54:39 -07:00
skal	c5f00621c7	incorporate bzero() into WebPRescalerInit() instead of call site Change-Id: I9ebb83e643e24bc685a1a1cb6836cb54e34a0ec8	2015-08-14 19:37:22 -07:00
Pascal Massimino	3ebcdd4133	remove duplicate "#include <stdlib.h>" Change-Id: I01b23efb1229e7dd96c6e15c4385064ad10a575a	2015-08-14 12:33:45 -07:00
James Zern	24a9693223	dec: allow 0 as a scaling dimension this allows scaling to a particular width/height while preserving the source aspect ratio using WebPRescalerGetScaledDimensions(). Change-Id: I77b11528753290c1e9bb942ac761c215ccfb8701	2015-08-13 20:58:17 -07:00
James Zern	b918724280	utils/rescaler: add WebPRescalerGetScaledDimensions + use it in WebPPictureRescale() Change-Id: I491bea8cd56f0eb1ac8bf0829b9f36c77804219a	2015-08-13 20:50:38 -07:00
Pascal Massimino	020fd099f6	Merge "WebPPictureDistortion: support ARGB format for 'pic' when computing distortion."	2015-08-12 15:37:27 +00:00
skal	56a2e9f5e7	WebPPictureDistortion: support ARGB format for 'pic' when computing distortion. using a *tmp_plane buffer to split a/r/g/b planes up appeared to be the easiest route, compared to copy-pasting the whole code and making it x_stride aware... Change-Id: I0898ef1df62bd3e1713b77187b31b5eeef3832fe	2015-08-11 17:28:29 -07:00
James Zern	c2f9dc06cf	bit_writer: convert VP8L macro values to immediates allows the values to be used in preproc checks, fixing a -Wunreachable-code warning in 64-bit builds where VP8L_WRITER_BITS != 16 Change-Id: Ie98dff4e8ef896436557c64d5da2c5d70228a730	2015-08-10 20:35:22 -07:00
Jyrki Alakuijala	b969f888ab	Reduce magic in palette reordering Slightly faster on -m 0 -q 0, particularly for small images (50 x 75 image was 0.1 % faster on callgrind measurement). Increases compression density by 0.005 % for the 1000 images, but small images can improve even 0.5 % (about 4 bytes, depending on the characteristics of the palette). Change-Id: I94f568d396ac62a054a829abeeef3eb0af6b3f94	2015-08-10 19:06:07 -07:00
James Zern	155c1b222b	Merge changes I76f4d6fe,I45434639 * changes: lossless_enc_neon: add VP8LTransformColor lossless_neon: add VP8LTransformColorInverse	2015-08-06 23:00:03 +00:00
Djordje Pesut	717e4d5a7c	mips32/mipsDSPr2: function ImportRow rebased Change-Id: Id58d266040fdb5fe1e507cd0f6370ea625156e4d	2015-08-06 17:09:10 +02:00
Pascal Massimino	7df93893dc	fix rescaling bug (uninitialized read, see bug #254). the x_add/x_sub increments were wrong for u/v in the upscaling case. They shouldn't be left to the caller's discretion, but set up by WebPRescalerInit to their exact necessary values. -> Cleaned-up WebPRescalerInit() param list. -> added safety asserts -> removed the mips32/mips_r2 variant of "ImportRow" which were buggy prior Change-Id: I347c75804d835811e7025de92a0758d7929dfc09	2015-08-05 23:00:00 -07:00
James Zern	5cdcd561e2	lossless_enc_neon: add VP8LTransformColor based on SSE2, ~32% faster Change-Id: I76f4d6fe456baceba46ffebf2f699e98691eefdf	2015-08-05 00:15:13 -07:00
James Zern	a53c336919	lossless_neon: add VP8LTransformColorInverse based on SSE2, only ~11% faster Change-Id: I45434639d81e153f01f77c1f5d2da510b542170e	2015-08-04 23:22:36 -07:00
James Zern	99131e7f8c	Merge changes I9fb25a89,Ibc648e9e * changes: lossless_neon: remove predictors 5-13 ll_enc_neon: enable VP8LSubtractGreenFromBlueAndRed	2015-08-04 02:24:15 +00:00
Pascal Massimino	c455676680	simplify the main loop for downscaling (part of bug #254 investigation) no speed change observed. Change-Id: Ie21b33171def367f37643fef6a0bd378e49468c7	2015-08-03 16:57:35 +02:00
James Zern	2a010f992a	lossless_neon: remove predictors 5-13 operating on single uint32's isn't helped by NEON. this improves aarch64 performance by ~4% Change-Id: I9fb25a8962de7b80e893e756ee7c76393cfd40c7	2015-07-28 19:44:58 -07:00
James Zern	ca221bbc48	ll_enc_neon: enable VP8LSubtractGreenFromBlueAndRed this moves the function outside the WEBP_USE_INTRINSICS check. there's no alternative version and it's ~54% faster at the function level and mildly faster overall Change-Id: Ibc648e9ee35021d48901e05aa596aa01067796a2	2015-07-28 19:44:45 -07:00
Jyrki Alakuijala	01d61fd9c6	lossless: ~20 % speedup 0.28 % byte size increase on lossless, 0.18 % increase on lossy alpha Change-Id: I1e001a56831a8f996ac522aa646f9ae587c80d12	2015-07-20 17:13:44 -07:00
Jyrki Alakuijala	f722c8f0bd	lossless: Speed up ComputeCacheEntropy by 40 % a total impact of 1 % on encoding speed This allows for performance neutral removal of the binary search in cache bits selection. This will give a small improvement in compression density. Change-Id: If5d4d59460fa1924ce71af977320834a47c2054a	2015-07-20 17:13:44 -07:00
Pascal Massimino	1ceecdc871	add a VP8LColorCacheSet() method for color cache Change-Id: Iebdc0383474fc3b8fbb0d7da4a35a0a7061bb9b5	2015-07-20 17:13:43 -07:00
Jyrki Alakuijala	17eb609916	lossless: Allow copying from prev row in rle-mode. 0.21 % compression density improvement for 1000 png corpus in lossless mode 0.50 % compression density improvement for 1000 png corpus in lossy mode Change-Id: I14ee8c427ae5d3e116b0ee6695fcdea3321a319d	2015-07-20 17:13:43 -07:00
Jyrki Alakuijala	f3a7a5bf76	lossless: bit writer optimization valgrind --tool=callgrind shows a 9 % speedup: 1021201984 ticks before vs. 927917709 after -q 0 -m 0 -lossless ~/alpi/1.png 22.040 MP/s before 24.796 MP/s after Change-Id: Iaab928167b3e20fb0d9401c6f8317a26c5a610b4	2015-07-20 16:18:40 -07:00
James Zern	d97b9ff755	Merge changes from topic 'lossless-enc-improvements' * changes: lossless: combine the Huffman code with extra bits lossless: Inlining add literal lossless: simplify HashChainFindCopy heuristics lossless: 0.5 % compression density improvement lossless: Add zeroes into the predicted histograms. lossless: encoding, don't compute unnecessary histo lossless: Remove about 25 % of the speed degradation Faster alpha coding for webp lossless: rle mode not to accept lengths smaller than 4. lossless: Less code for the entropy selection lossless: 0.37 % compression density improvement	2015-07-20 19:38:42 +00:00
James Zern	0250dfcc19	msvc: fix pointer type warning in BitsLog2Floor _BitScanReverse() takes an unsigned long* http://msdn.microsoft.com/en-us/library/fbxyd7zd.aspx fixes: C4057: 'function': 'unsigned long *' differs in indirection to slightly different base types from 'uint32_t *' fixes issue #253 Change-Id: I0101ef7be18c7ed188b35e9b17e7f71290953786	2015-07-18 11:12:21 -07:00
Jyrki Alakuijala	52931fd548	lossless: combine the Huffman code with extra bits gives 2 % speedup 24.9 -> 25.5 MP/s for a photo with -q 0 -m 0 Change-Id: If9ae04683a86dd7b1fced2183cf79b9349a24a9e	2015-07-07 20:24:28 -07:00
Jyrki Alakuijala	c4855ca249	lossless: Inlining add literal this is a simple speedup of about 1-2 % Change-Id: I0c7b01c0a69f4aeaf363ffda05a28871f1def696	2015-07-07 20:24:28 -07:00
Jyrki Alakuijala	8e9c94dedb	lossless: simplify HashChainFindCopy heuristics for small speedup 0.0003 % worse compression Change-Id: Ic4b6b21e5279231c6321f2cec1c79f7e17e56afa	2015-07-07 20:24:27 -07:00
Jyrki Alakuijala	888429f409	lossless: 0.5 % compression density improvement do not do length 2 matches far away speedup for non compressible data by inserting two literals at a time when no matches are found Change-Id: Ia8e033071f4186bb8148bb2bf13ca37586734aa3	2015-07-07 20:24:27 -07:00
Jyrki Alakuijala	7b23b19808	lossless: Add zeroes into the predicted histograms. Increases compression density by 0.03 % for lossy. Speeds up at least one of the lossy alpha images by 20 %. Palette entropy 'kludge' seems to save 1-2 % on alpha images. Change-Id: I2116b8d81593ac8173bfba54a7c833997fca0804	2015-07-07 20:24:27 -07:00
Jyrki Alakuijala	85b44d8a69	lossless: encoding, don't compute unnecessary histo share the computation between different modes 3-5 % speedup for lossless alpha 1 % for lossy alpha no change in compression density Change-Id: I5e31413b3efcd4319121587da8320ac4f14550b2	2015-07-07 20:24:26 -07:00
Jyrki Alakuijala	d92453f381	lossless: Remove about 25 % of the speed degradation introduced in: "lossless: 0.37 % compression density improvement" Uses the statistics of red and blue histograms to decide if to run cross color correction at all. Improves compression density by 0.02 % or so. Change-Id: I47429557e9cdbd9fa90c584696f241b17427d73f	2015-07-07 20:24:26 -07:00
Jyrki Alakuijala	2cce031704	Faster alpha coding for webp No significant size degradation (+0.001 %) for 1000 image corpus Fixes the 8 ms vs 2 ms degradation from: "lossless: 0.37 % compression density improvement" Change-Id: Id540169a305d9d5c6213a82b46c879761b3ca608	2015-07-07 20:24:25 -07:00
Jyrki Alakuijala	5e75642efd	lossless: rle mode not to accept lengths smaller than 4. Gives a compression gain of 0.22 % Change-Id: I0f3b8dad6b4c1bfb16eab095a467f34466b9e3b7	2015-07-07 20:24:25 -07:00
Jyrki Alakuijala	84326e4ab0	lossless: Less code for the entropy selection Tested: 1000 png corpus gives same results Change-Id: Ief5ea7727290743b9bd893b08af7aa7951f556cb	2015-07-07 20:24:25 -07:00
Jyrki Alakuijala	16ab951abf	lossless: 0.37 % compression density improvement counting the entropy expectation for five different configurations: palette non-predicted non-predicted with subtract green predicted predicted with subtract green and choose the strategy with the smallest expected entropy Change-Id: Iaaf209c0d565660a54a4f9b3959067afb9951960	2015-07-07 20:24:24 -07:00
James Zern	822f113ebb	add WebPFree() to the API this should be used in preference to free() for releasing memory returned from WebPDecode() / WebPEncode(). this simplifies memory management when working through language bindings Change-Id: I15eb538a45390efc552fda8e5c251a3fbdc13c29	2015-07-06 23:27:51 -07:00
Pascal Massimino	0ae2c2e4b2	SSE2/SSE41: optimize SSE_16xN loops After several trials at re-organizing the main loop and accumulation scheme, this is apparently the faster variant. removed the SSE41 version, which is no longer faster now. For some reason, the AVX variant seems to benefit most for the change. Change-Id: Ib11ee18dbb69596cee1a3a289af8e2b4253de7b5	2015-07-02 20:55:04 +02:00
James Zern	39216e59d9	cosmetics: fix indent after `32462a07` Change-Id: If9a5d91c25e981bc4cd81adb476244e63fc7c3c8	2015-07-01 23:49:20 -07:00
James Zern	559e54ca60	Merge "SSE2: slightly faster FTransformWHT"	2015-07-02 06:36:33 +00:00
Pascal Massimino	8ef9a63b45	SSE2: slightly faster FTransformWHT goes from 0.3% to 0.1% overall CPU time, but... Change-Id: I4c9a92b1e1d6b58ed57c6b890366f1dbeaf84f84	2015-07-01 23:03:17 -07:00
James Zern	f27f773576	lossless_neon: enable VP8LAddGreenToBlueAndRed this moves the function outside the WEBP_USE_INTRINSICS check. there's no alternative version and it's ~70% faster at the function level and 1-2% faster overall Change-Id: I59fb4918ec86b1ac3a47cbd5d05ce62f007461cb	2015-07-01 22:50:54 -07:00
Pascal Massimino	36e9c4bc50	SSE2: minor cosmetics on in-loop filter code Change-Id: Ic0e6502081d7063bb2841df74e05c450d708aaf2	2015-06-28 11:59:22 +02:00
James Zern	4741fac42e	dsp/lossless_*sse2: remove some unnecessary inlines TransformColor / TransformColorInverse are the top-level function pointer calls Change-Id: Ieabdb4005ff3e4f9bb3ebcb140ccb6bef5d28f8b	2015-06-25 21:02:01 -07:00
Pascal Massimino	1819965e0a	fix warning ("left shift of negative value") using a cast Change-Id: Ie99e8ff87924a1d15e2c5d83bd9adf07dab04e94	2015-06-24 23:46:09 -07:00
Pascal Massimino	7017001462	SSE2: speed-up some lossless-encoding functions optimized: CollectColorRedTransforms, CollectColorBlueTransforms, SubtractGreenFromBlueAndRed overall effect is sub-1% speed-up, though. Change-Id: I9cb49af5c56e4c03db417929b0a2cf575d60a5c6	2015-06-24 20:09:13 -07:00
Pascal Massimino	abcb012841	Merge "SSE2: slightly faster (~5%) AddGreenToBlueAndRed()"	2015-06-24 09:37:46 +00:00
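
Commits 24a9693223 and b918724280 above describe letting a caller pass 0 for one scaling dimension so that it is derived from the other while preserving the source aspect ratio. The sketch below illustrates that idea only; it is not libwebp's code. The function name `scaled_dimensions`, its parameter layout, and the round-to-nearest rule are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not the libwebp implementation): on entry,
 * *scaled_width / *scaled_height hold the requested size, where 0 means
 * "derive this dimension from the other one, keeping the aspect ratio".
 * Returns 1 and updates both values on success, 0 on invalid input. */
int scaled_dimensions(int src_width, int src_height,
                      int* scaled_width, int* scaled_height) {
  assert(scaled_width != NULL && scaled_height != NULL);
  if (src_width <= 0 || src_height <= 0) return 0;

  /* 64-bit intermediates avoid overflow in the products below. */
  int64_t width = *scaled_width;
  int64_t height = *scaled_height;
  if (width < 0 || height < 0) return 0;

  /* Width unspecified: scale it by the height ratio, rounding to nearest. */
  if (width == 0) {
    width = ((int64_t)src_width * height + src_height / 2) / src_height;
  }
  /* Height unspecified: scale it by the width ratio, rounding to nearest. */
  if (height == 0) {
    height = ((int64_t)src_height * width + src_width / 2) / src_width;
  }
  /* Both were 0, or the result collapsed to nothing or is out of range. */
  if (width <= 0 || height <= 0 || width > INT32_MAX || height > INT32_MAX) {
    return 0;
  }

  *scaled_width = (int)width;
  *scaled_height = (int)height;
  return 1;
}
```

For an 800x600 source, requesting width 0 and height 300 yields 400x300, and requesting width 200 and height 0 yields 200x150. Passing 0 for both dimensions is rejected in this sketch.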


2137 Commits