libwebp

mirror of https://github.com/webmproject/libwebp.git synced 2025-08-11 02:20:33 +02:00

Author	SHA1	Message	Date
Pascal Massimino	cced974bb2	remove _mm_set_epi64x(), which is too specific Change-Id: I4b1035f9c548b804f31c68a00b0a1aa8e13550bb	2015-09-25 14:35:33 -07:00
Pascal Massimino	56668c9fc5	fix warnings about uint64_t -> uint32_t conversion Change-Id: Iee027979b404d4b7edda506b844d354aa1026dae	2015-09-25 17:36:11 +02:00
Pascal Massimino	76a7dc39e5	rescaler: add some SSE2 code The rounding and arithmetic is not the same as previously, to prevent overflow cases for large upscale factors. We still rely on 32b x 32b -> 64b multiplies. Raised the fixed-point precision to 32b so that we have some nice shifts from epi64 to epi32. Changed rescaler_t type to 'uint32_t' in order to squeeze in all the precision required. The MIPS code has been disabled because it's now out-of-sync. Will be fixed in a subsequent CL when the dust settles. ~30-35% faster Change-Id: I32e4ddc00933f1b1aa3463403086199fd5dad07b	2015-09-25 15:07:13 +02:00
James Zern	1df1d0eedb	rescaler: harmonize function protos Change-Id: I13b5f9add83c1225c82a650f3ef717582b057247	2015-09-19 22:57:25 -07:00
Pascal Massimino	9ba1894b9b	rescaler: simplify ImportRow logic incorporates the loop over 'channel' and removes one parameter Change-Id: I4e3b33c111ca825fe96461583420413b17326409	2015-09-19 10:07:26 -07:00
Pascal Massimino	5ff0079ece	fix rescaler vertical interpolation * vertical expansion now uses bilinear interpolation * heavily assumes that the alpha plane is decoded in full, not row-by-row * split the RescalerExportRow and RescalerImportRow methods into Shrink and Expand variants. * MIPS implementation of ExportRowExpand is missing. There's room for extra speed optim and code re-org, but let's keep that for later patches. addresses https://code.google.com/p/webp/issues/detail?id=254 Change-Id: I8f12b855342bf07dd467fe85e4fde5fd814effdb	2015-09-18 17:32:11 -07:00
James Zern	cd82440ec7	VP8LAllocateHistogramSet: align histogram[] entries fixes issue #262: a SIGBUS when accessing a misaligned double in VP8LHistogram Change-Id: Ic78cc5366d7e43d892c375b6a69dce2379db931b	2015-09-17 22:59:01 -07:00
Pascal Massimino	a406b1dda8	Merge "fix memory over-allocation in lossless rescaler init"	2015-09-15 18:52:06 +00:00
Pascal Massimino	0fde33e322	add missing const in VP8InitFrame signature Change-Id: Ibed259ac8e794bd98960f65ba6544d480e7a1806	2015-09-14 23:55:02 -07:00
Pascal Massimino	ac7d5e8d76	fix memory over-allocation in lossless rescaler init num_channels was not needed in sizeof(*scaled_data) Change-Id: Ie9ff31d7c1a262520fe1aac81dc57b53cb07bace	2015-09-14 02:11:10 -07:00
Pascal Massimino	017f8cccec	Loosen the buffer size checks for Y/U/V/A too. (follow-up to `15ca5014`) Change-Id: Ia122e96f616bd6317c24b69c9534cb7919b8a4a4	2015-09-11 15:10:07 +02:00
Pascal Massimino	15ca5014f1	loosen the padding check on buffer size Strictly speaking, the last (or first) row doesn't require padding. cf https://code.google.com/p/webp/issues/detail?id=258 Change-Id: Ie9ec8eb776fec1f5cea4cf9e21e81901fd79bf33	2015-09-09 00:01:26 -07:00
James Zern	d623a8706f	dec_neon: add whitespace around stringizing operator prevents unintentional side-effects (though unlikely in this case) with future compilers, cf: `eebaf97` dsp/mips: add whitespace around stringizing operator Change-Id: I0537091fcc97b4f54d0a156c3c83a28c51456b17	2015-09-03 23:13:56 -07:00
James Zern	29377d55b6	dsp/mips: cosmetics: add whitespace around XSTR macro normalizes formatting after: `eebaf97` dsp/mips: add whitespace around stringizing operator Change-Id: I1e3986b6d08195d79072747eb99d7e0549aece72	2015-09-03 23:09:13 -07:00
James Zern	eebaf97f5a	dsp/mips: add whitespace around stringizing operator fixes compile with gcc 5.1 BUG=259 Change-Id: Ideb39c6290ab8569b1b6cc835bea11c822d0286c	2015-09-02 23:21:13 -07:00
Urvang Joshi	d39dc8f3cc	Create a WebPAnimDecoder API. This is designed for the simple use-case where one wants to decode all frames one-by-one in order. Also, use this API in anim_util library, which is in turn used by anim_diff tool. Change-Id: Ie8b653c04e867d40fd23321b3dd41b87689656c7	2015-09-02 16:23:10 -07:00
James Zern	14efabbf1c	Android: limit use of cpufeatures cpufeatures is only used with armeabi-v7a.* Change-Id: I80284061d71d9defa50d139c7f1bda67c00f567e	2015-08-19 18:44:33 -07:00
Pascal Massimino	7b83adbee6	preparatory cosmetics for Rescaler code fix and clean-up Change-Id: I1278837c8d7813192e8099d6fceaede75f38755b	2015-08-19 18:44:29 -07:00
James Zern	77fb41c2f1	dec/vp8l/DecodeAlphaData: remove redundant cast 'pos' has been an int since: `c34307a` fix some VS9 warnings about type conversion Change-Id: I56195d4f15278fa268be52a7bfe24b94554890c4	2015-08-18 18:52:43 -07:00
Jyrki Alakuijala	90fcfcd905	Insert less hash chain entries from the beginnings of long copies. This makes the chains more efficient and a larger variety of data is tested. 0.02 % compression gain at q 100, 0.05 % at default quality. 0.8 % speedup by callgrind. 0.16 % compression gain for lossy alpha ?! Change-Id: I888120133352799eb14f5f602c7f40ab404bd665	2015-08-18 18:44:03 -07:00
skal	bd55604d1b	SSE2: add yuv444 converters, re-using yuv_sse2.c Change-Id: I4d5c9df8a4c8e8cb8b5daa537af07382894503a8	2015-08-17 21:15:37 -07:00
skal	3ec1182768	use the DispatchAlpha() call from dsp it's used in YUVA->RGBA case (quite frequent). Change-Id: Ie88f8c7f74cd274b3c6cbe81506f4425c164c7b3	2015-08-17 18:54:39 -07:00
skal	c5f00621c7	incorporate bzero() into WebPRescalerInit() instead of call site Change-Id: I9ebb83e643e24bc685a1a1cb6836cb54e34a0ec8	2015-08-14 19:37:22 -07:00
Pascal Massimino	3ebcdd4133	remove duplicate "#include <stdlib.h>" Change-Id: I01b23efb1229e7dd96c6e15c4385064ad10a575a	2015-08-14 12:33:45 -07:00
James Zern	24a9693223	dec: allow 0 as a scaling dimension this allows scaling to a particular width/height while preserving the source aspect ratio using WebPRescalerGetScaledDimensions(). Change-Id: I77b11528753290c1e9bb942ac761c215ccfb8701	2015-08-13 20:58:17 -07:00
James Zern	b918724280	utils/rescaler: add WebPRescalerGetScaledDimensions + use it in WebPPictureRescale() Change-Id: I491bea8cd56f0eb1ac8bf0829b9f36c77804219a	2015-08-13 20:50:38 -07:00
Pascal Massimino	020fd099f6	Merge "WebPPictureDistortion: support ARGB format for 'pic' when computing distortion."	2015-08-12 15:37:27 +00:00
skal	56a2e9f5e7	WebPPictureDistortion: support ARGB format for 'pic' when computing distortion. using a *tmp_plane buffer to split a/r/g/b planes up appeared to be the easiest route, compared to copy-pasting the whole code and making it x_stride aware... Change-Id: I0898ef1df62bd3e1713b77187b31b5eeef3832fe	2015-08-11 17:28:29 -07:00
James Zern	c2f9dc06cf	bit_writer: convert VP8L macro values to immediates allows the values to be used in preproc checks, fixing a -Wunreachable-code warning in 64-bit builds where VP8L_WRITER_BITS != 16 Change-Id: Ie98dff4e8ef896436557c64d5da2c5d70228a730	2015-08-10 20:35:22 -07:00
Jyrki Alakuijala	b969f888ab	Reduce magic in palette reordering Slightly faster on -m 0 -q 0, particularly for small images (50 x 75 image was 0.1 % faster on callgrind measurement). Increases compression density by 0.005 % for the 1000 images, but small images can improve even 0.5 % (about 4 bytes, depending on the characteristics of the palette). Change-Id: I94f568d396ac62a054a829abeeef3eb0af6b3f94	2015-08-10 19:06:07 -07:00
James Zern	155c1b222b	Merge changes I76f4d6fe,I45434639 * changes: lossless_enc_neon: add VP8LTransformColor lossless_neon: add VP8LTransformColorInverse	2015-08-06 23:00:03 +00:00
Djordje Pesut	717e4d5a7c	mips32/mipsDSPr2: function ImportRow rebased Change-Id: Id58d266040fdb5fe1e507cd0f6370ea625156e4d	2015-08-06 17:09:10 +02:00
Pascal Massimino	7df93893dc	fix rescaling bug (uninitialized read, see bug #254). the x_add/x_sub increments were wrong for u/v in the upscaling case. They shouldn't be left to the caller's discretion, but set up by WebPRescalerInit to their exact necessary values. -> Cleaned-up WebPRescalerInit() param list. -> added safety asserts -> removed the mips32/mips_r2 variant of "ImportRow" which were buggy prior Change-Id: I347c75804d835811e7025de92a0758d7929dfc09	2015-08-05 23:00:00 -07:00
James Zern	5cdcd561e2	lossless_enc_neon: add VP8LTransformColor based on SSE2, ~32% faster Change-Id: I76f4d6fe456baceba46ffebf2f699e98691eefdf	2015-08-05 00:15:13 -07:00
James Zern	a53c336919	lossless_neon: add VP8LTransformColorInverse based on SSE2, only ~11% faster Change-Id: I45434639d81e153f01f77c1f5d2da510b542170e	2015-08-04 23:22:36 -07:00
James Zern	99131e7f8c	Merge changes I9fb25a89,Ibc648e9e * changes: lossless_neon: remove predictors 5-13 ll_enc_neon: enable VP8LSubtractGreenFromBlueAndRed	2015-08-04 02:24:15 +00:00
Pascal Massimino	c455676680	simplify the main loop for downscaling (part of bug #254 investigation) no speed change observed. Change-Id: Ie21b33171def367f37643fef6a0bd378e49468c7	2015-08-03 16:57:35 +02:00
James Zern	2a010f992a	lossless_neon: remove predictors 5-13 operating on single uint32's isn't helped by NEON. this improves aarch64 performance by ~4% Change-Id: I9fb25a8962de7b80e893e756ee7c76393cfd40c7	2015-07-28 19:44:58 -07:00
James Zern	ca221bbc48	ll_enc_neon: enable VP8LSubtractGreenFromBlueAndRed this moves the function outside the WEBP_USE_INTRINSICS check. there's no alternative version and it's ~54% faster at the function level and mildly faster overall Change-Id: Ibc648e9ee35021d48901e05aa596aa01067796a2	2015-07-28 19:44:45 -07:00
Jyrki Alakuijala	01d61fd9c6	lossless: ~20 % speedup 0.28 % byte size increase on lossless, 0.18 % increase on lossy alpha Change-Id: I1e001a56831a8f996ac522aa646f9ae587c80d12	2015-07-20 17:13:44 -07:00
Jyrki Alakuijala	f722c8f0bd	lossless: Speed up ComputeCacheEntropy by 40 % a total impact of 1 % on encoding speed This allows for performance neutral removal of the binary search in cache bits selection. This will give a small improvement in compression density. Change-Id: If5d4d59460fa1924ce71af977320834a47c2054a	2015-07-20 17:13:44 -07:00
Pascal Massimino	1ceecdc871	add a VP8LColorCacheSet() method for color cache Change-Id: Iebdc0383474fc3b8fbb0d7da4a35a0a7061bb9b5	2015-07-20 17:13:43 -07:00
Jyrki Alakuijala	17eb609916	lossless: Allow copying from prev row in rle-mode. 0.21 % compression density improvement for 1000 png corpus in lossless mode 0.50 % compression density improvement for 1000 png corpus in lossy mode Change-Id: I14ee8c427ae5d3e116b0ee6695fcdea3321a319d	2015-07-20 17:13:43 -07:00
Jyrki Alakuijala	f3a7a5bf76	lossless: bit writer optimization valgrind --tool=callgrind shows a 9 % speedup: 1021201984 ticks before vs. 927917709 after -q 0 -m 0 -lossless ~/alpi/1.png 22.040 MP/s before 24.796 MP/s after Change-Id: Iaab928167b3e20fb0d9401c6f8317a26c5a610b4	2015-07-20 16:18:40 -07:00
James Zern	d97b9ff755	Merge changes from topic 'lossless-enc-improvements' * changes: lossless: combine the Huffman code with extra bits lossless: Inlining add literal lossless: simplify HashChainFindCopy heuristics lossless: 0.5 % compression density improvement lossless: Add zeroes into the predicted histograms. lossless: encoding, don't compute unnecessary histo lossless: Remove about 25 % of the speed degradation Faster alpha coding for webp lossless: rle mode not to accept lengths smaller than 4. lossless: Less code for the entropy selection lossless: 0.37 % compression density improvement	2015-07-20 19:38:42 +00:00
James Zern	0250dfcc19	msvc: fix pointer type warning in BitsLog2Floor _BitScanReverse() takes an unsigned long* http://msdn.microsoft.com/en-us/library/fbxyd7zd.aspx fixes: C4057: 'function': 'unsigned long *' differs in indirection to slightly different base types from 'uint32_t *' fixes issue #253 Change-Id: I0101ef7be18c7ed188b35e9b17e7f71290953786	2015-07-18 11:12:21 -07:00
Jyrki Alakuijala	52931fd548	lossless: combine the Huffman code with extra bits gives 2 % speedup 24.9 -> 25.5 MP/s for a photo with -q 0 -m 0 Change-Id: If9ae04683a86dd7b1fced2183cf79b9349a24a9e	2015-07-07 20:24:28 -07:00
Jyrki Alakuijala	c4855ca249	lossless: Inlining add literal this is a simple speedup of about 1-2 % Change-Id: I0c7b01c0a69f4aeaf363ffda05a28871f1def696	2015-07-07 20:24:28 -07:00
Jyrki Alakuijala	8e9c94dedb	lossless: simplify HashChainFindCopy heuristics for small speedup 0.0003 % worse compression Change-Id: Ic4b6b21e5279231c6321f2cec1c79f7e17e56afa	2015-07-07 20:24:27 -07:00
Jyrki Alakuijala	888429f409	lossless: 0.5 % compression density improvement do not do length 2 matches far away speedup for non compressible data by inserting two literals at a time when no matches are found Change-Id: Ia8e033071f4186bb8148bb2bf13ca37586734aa3	2015-07-07 20:24:27 -07:00
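
Commits 24a9693223 and b918724280 above describe letting a caller pass 0 for one scaling dimension so that it is derived from the other while preserving the source aspect ratio. The C sketch below illustrates that idea only. It is not copied from libwebp: the function name GetScaledDimensionsSketch, its signature, the round-to-nearest behaviour, and the rejection of negative inputs are all assumptions made for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not libwebp's implementation).
 * If *scaled_width or *scaled_height is 0, it is replaced by the value that
 * keeps the src_width:src_height aspect ratio, rounded to nearest.
 * Returns 1 on success, 0 if no usable dimensions can be produced. */
int GetScaledDimensionsSketch(int src_width, int src_height,
                              int* scaled_width, int* scaled_height) {
  uint64_t width, height;
  assert(scaled_width != NULL && scaled_height != NULL);

  /* Assumption: non-positive sources and negative targets are rejected. */
  if (src_width <= 0 || src_height <= 0) return 0;
  if (*scaled_width < 0 || *scaled_height < 0) return 0;

  width = (uint64_t)*scaled_width;
  height = (uint64_t)*scaled_height;

  /* Width unspecified: scale the source width by the height ratio.
   * The product is formed in 64 bits so it cannot overflow. */
  if (width == 0) {
    width = ((uint64_t)src_width * height + (uint64_t)src_height / 2) /
            (uint64_t)src_height;
  }
  /* Height unspecified: scale the source height by the width ratio. */
  if (height == 0) {
    height = ((uint64_t)src_height * width + (uint64_t)src_width / 2) /
             (uint64_t)src_width;
  }

  /* Both were 0 (or the result rounded down to 0), or the result no
   * longer fits in an int: report failure and leave the outputs alone. */
  if (width == 0 || height == 0) return 0;
  if (width > (uint64_t)INT_MAX || height > (uint64_t)INT_MAX) return 0;

  *scaled_width = (int)width;
  *scaled_height = (int)height;
  return 1;
}
```

With this sketch, a caller that wants a 320-pixel-wide result from a 640x480 source would set the width to 320, leave the height at 0, and read the derived height back from the same variable after a successful call.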


2105 Commits