libwebp

mirror of https://github.com/webmproject/libwebp.git synced 2025-07-05 18:44:31 +02:00

Author	SHA1	Message	Date
Vincent Rabaud	6c702b81ac	Speed up hash chain initialization using memset. That gains 1% on lossy compression. Change-Id: Ib9aa210194ed2f17eaff85b499b55cc4eb99ff11	2015-12-07 11:54:50 +01:00
Lode Vandevenne	6938111357	Improved alpha cleanup for the webp encoder when prediction transform is used. Gives 0.9% smaller (2.4% compared to before alpha cleanup) size on the 1000 PNGs dataset: Alpha cleanup before: 18856614 Alpha cleanup after: 18685802 For reference, with no alpha cleanup: 19159992 Note: WebPCleanupTransparentArea is still also called in WebPEncode. This cleanup still helps preprocessing in the encoder, and the cases when the prediction transform is not used. Change-Id: I63e69f48af6ddeb9804e2e603c59dde2718c6c28	2015-12-04 13:50:56 +00:00
Pascal Massimino	2c08aac81a	introduce WebPMemToUint32 and WebPUint32ToMem for memory access it uses memcpy() when unaligned memory write is tricky Change-Id: I5d966ca9d19e9b43ac90140fa487824116982874	2015-12-04 13:43:01 +00:00
Vincent Rabaud	010ca3d10d	Fix FindMatchLength with non-aligned buffers. The 32-bit buffers are actually rarely 64-bit aligned. The new solution uses memcmp and is alignment agnostic. It is also slightly faster. Change-Id: I863003e9ee4ee8a3eed25b7b2478cb82a0ddbb20	2015-12-04 10:19:58 +01:00
Scott Hancher	5ae220bef6	backward_references.c: Fixed compiler warning "Implicit conversion loses integer precision: 'long' to 'int'." Change-Id: I1aec7431f84123e5280447883eb80b84a3821d91	2015-12-02 23:51:06 -08:00
Vincent Rabaud	a141178255	Optimization in hash chain comparison for 64 bit Arrays were compared 32 bits at a time, it is now done 64 bits at a time. Overall encoding speed-up is only of 0.2% on @skal's small PNG corpus. It is of 3% on my initial 1.3 Mp desktop screenshot image. Change-Id: I1acb32b437397a7bf3dcffbecbcd4b06d29c05e1	2015-12-01 13:01:57 +01:00
Lode Vandevenne	239421c5ef	lossless: make prediction in encoder work per scanline instead of per block. This prepares for a next CL that can make the predictors alter RGB value behind transparent pixels for denser encoding. Some predictors depend on the top-right pixel, and it must have been already processed to know its new RGB value, so requires per scanline instead of per block. Running the encode speed test on 1000 PNGs 10 times with default settings: Before: Compression (output/input): 2.3745/3.2667 bpp, Encode rate (raw data): 1.497 MP/s After: Compression (output/input): 2.3745/3.2667 bpp, Encode rate (raw data): 1.501 MP/s Same but with quality 0, method 0 and 30 iterations: Before: Compression (output/input): 2.9120/3.2667 bpp, Encode rate (raw data): 36.379 MP/s After: Compression (output/input): 2.9120/3.2667 bpp, Encode rate (raw data): 36.462 MP/s No effect on compressed size, this produces exactly same files. No significant measured effect on speed. Expected faster speed from better memory layout with scanline processing but slower speed due to needing to get predictor mode per pixel, may compensate each other. Change-Id: I40f766f1c1c19f87b62c1e2a1c4cd7627a2c3334	2015-11-25 00:38:27 -08:00
Pascal Massimino	3770f3bbb6	Merge "cleanup the YFIX/TFIX difference by removing some code and #define"	2015-11-23 20:47:42 +00:00
Pascal Massimino	997e103871	cleanup the YFIX/TFIX difference by removing some code and #define no speed or output difference Change-Id: I50bfb44f357e19431457b1cf9504a5a6bcce1945	2015-11-21 23:51:58 -08:00
Lode Vandevenne	1f9be97c22	Make discarding invisible RGB values (cleanup alpha) the default. Rename the flag to exact instead of the opposite cleanup_alpha. Add the flag to WebPConfig. Do the cleanup in the webp encoder library rather than the cwebp binary, this will be needed for the next stage: smarter alpha cleanup for better compression which cannot be done as a preprocessing due to depending on predictor choices in the encoder. Change-Id: I2fbf57f918a35f2da6186ef0b5d85e5fd0020eef	2015-11-21 12:32:32 -08:00
Urvang Joshi	397863bd66	Refactor CopyPlane() and CopyPixels() methods: put them in utils. Change-Id: I0e1533df557a0fa42c670e3b826fc0675c36e0a5	2015-11-13 11:39:22 -08:00
Urvang Joshi	6ecd72f845	Re-enable encoding of alpha plane with color cache for next release. This is a revert of: https://chromium-review.googlesource.com/#/c/73607/ Change-Id: I7ec45277d73608d77d5e873290c6c185caa30c32	2015-11-13 07:15:19 +00:00
Pascal Massimino	bfd3fc02df	~2x faster SSE2 RGB24toY, BGR24toY, ARGBToY|UV global effect is ~2% faster encoding from JPG source and ~8% faster lossless-webp source decoding to PGM (e.g.) Also revamped the YUVA case to first accumulate R/G/B value into 16b temporary buffer, and then doing the UV conversion. -> New function: WebPConvertRGBA32ToUV Change-Id: I1d7d0c4003aa02966ad33490ce0fcdc7925cf9f5	2015-11-06 15:02:01 -08:00
Pascal Massimino	52fdbdfe66	extract some RGB24 to Luma conversion function from enc/ to dsp/ Just for RGB24/BGR24 for now, which are the hard-to-optimize ones. SSE2 implementation coming next. ConvertRowToY() should go into dsp/ too, at some point. Change-Id: Ibc705ede5cbf674deefd0d9332cd82f618bc2425	2015-10-30 00:28:11 -07:00
James Zern	5bd04a087c	sync versions with 0.4.4 libwebp{,decoder} - 0.4.4 libwebp libtool - 5.4.0 libwebpdecoder libtool - 1.4.0 mux/demux - 0.2.2 (unchanged) libtool - 1.2.0 (unchanged) (cherry picked from commit 62864042c053da4482a18252c9b7c28e45af9dc4) Change-Id: I7d421dc47ad4d25a17450ce1b04562c5d58c596b	2015-10-28 23:43:40 -07:00
James Zern	f717b82864	vp8l.c, cosmetics: fix indent after 95509f9 95509f9 large re-organization of the delta-palettization code Change-Id: I9d27f15cb6072a2bd1dd593d53db5b2dd3c30133	2015-10-19 12:28:57 -07:00
Pascal Massimino	fea94b2b36	fix alignment of allocated memory in AllocateTransformBuffer likely to avoid unaligned reads in the future Change-Id: I434ba17c139ad6e190ebd9b909b241c6c6f1e7f8	2015-10-18 13:09:22 -07:00
Pascal Massimino	12ec204ec7	moved ALIGN_CST into util/utils.h and renamed WEBP_ALIGN_xxx Note that ALIGN_CST is still kept different in dec/frame.c for now, because the value is 31 there, not 15. We might re-unite these two later. Change-Id: Ibbee607fac4eef02f175b56f0bb0ba359fda3b87	2015-10-14 00:03:14 -07:00
Pascal Massimino	95509f9914	large re-organization of the delta-palettization code same functionality, but better code layout. What changed: * don't trash the palette_[] in EncodePalette(), so it can be re-used * split generation of image from bit-stream coding * move all the delta-palette code to delta_palettization.c, and only have 1 entry point there WebPSearchOptimalDeltaPalette() * minimize the number of "#ifdef WEBP_EXPERIMENTAL_FEATURES" in vp8l.c * clarify the TransformBuffer stuff. more clean-up to come here... This should make experimenting with delta-palettization easier and more compartmentalized. Change-Id: Iadaa90e6c5b9dabc7791aec2530e18c973a94610	2015-10-14 00:25:42 +02:00
James Zern	f7b8f90740	delta_palettization.*: add copyright Change-Id: I5dc0ae0de88968d2c73b7025ce18319897219630	2015-10-03 10:05:09 -07:00
Mislav Bradac	c1e1b7104c	Changed delta palette to compress better New palette compresses more than 20% better with minimum quality loss. Tested on set of wikipedia images with command line: cwebp -delta_palettization Change-Id: I82ec7d513136599cd70386f607f634502eb9095d	2015-10-03 08:48:42 +00:00
Mislav Bradac	48f66b6687	Add delta_palettization feature to WebP Change-Id: Ibaf4e49aa67d63d0eb11848cca4fd0c60815864a	2015-10-02 14:29:54 -07:00
Pascal Massimino	5ff0079ece	fix rescaler vertical interpolation * vertical expansion now uses bilinear interpolation * heavily assumes that the alpha plane is decoded in full, not row-by-row * split the RescalerExportRow and RescalerImportRow methods into Shrink and Expand variants. * MIPS implementation of ExportRowExpand is missing. There's room for extra speed optim and code re-org, but let's keep that for later patches. addresses https://code.google.com/p/webp/issues/detail?id=254 Change-Id: I8f12b855342bf07dd467fe85e4fde5fd814effdb	2015-09-18 17:32:11 -07:00
James Zern	cd82440ec7	VP8LAllocateHistogramSet: align histogram[] entries fixes issue #262: a SIGBUS when accessing a misaligned double in VP8LHistogram Change-Id: Ic78cc5366d7e43d892c375b6a69dce2379db931b	2015-09-17 22:59:01 -07:00
Jyrki Alakuijala	90fcfcd905	Insert less hash chain entries from the beginnings of long copies. This makes the chains more efficient and a larger variety of data is tested. 0.02 % compression gain at q 100, 0.05 % at default quality. 0.8 % speedup by callgrind. 0.16 % compression gain for lossy alpha ?! Change-Id: I888120133352799eb14f5f602c7f40ab404bd665	2015-08-18 18:44:03 -07:00
skal	c5f00621c7	incorporate bzero() into WebPRescalerInit() instead of call site Change-Id: I9ebb83e643e24bc685a1a1cb6836cb54e34a0ec8	2015-08-14 19:37:22 -07:00
James Zern	b918724280	utils/rescaler: add WebPRescalerGetScaledDimensions + use it in WebPPictureRescale() Change-Id: I491bea8cd56f0eb1ac8bf0829b9f36c77804219a	2015-08-13 20:50:38 -07:00
skal	56a2e9f5e7	WebPPictureDistortion: support ARGB format for 'pic' when computing distortion. using a *tmp_plane buffer to split a/r/g/b planes up appeared to be the easiest route, compared to copy-pasting the whole code and making it x_stride aware... Change-Id: I0898ef1df62bd3e1713b77187b31b5eeef3832fe	2015-08-11 17:28:29 -07:00
Jyrki Alakuijala	b969f888ab	Reduce magic in palette reordering Slightly faster on -m 0 -q 0, particularly for small images (50 x 75 image was 0.1 % faster on callgrind measurement). Increases compression density by 0.005 % for the 1000 images, but small images can improve even 0.5 % (about 4 bytes, depending on the characteristics of the palette). Change-Id: I94f568d396ac62a054a829abeeef3eb0af6b3f94	2015-08-10 19:06:07 -07:00
Pascal Massimino	7df93893dc	fix rescaling bug (uninitialized read, see bug #254 ). the x_add/x_sub increments were wrong for u/v in the upscaling case. They shouldn't be left to the caller's discretion, but set up by WebPRescalerInit to their exact necessary values. -> Cleaned-up WebPRescalerInit() param list. -> added safety asserts -> removed the mips32/mips_r2 variant of "ImportRow" which were buggy prior Change-Id: I347c75804d835811e7025de92a0758d7929dfc09	2015-08-05 23:00:00 -07:00
Jyrki Alakuijala	01d61fd9c6	lossless: ~20 % speedup 0.28 % byte size increase on lossless, 0.18 % increase on lossy alpha Change-Id: I1e001a56831a8f996ac522aa646f9ae587c80d12	2015-07-20 17:13:44 -07:00
Jyrki Alakuijala	f722c8f0bd	lossless: Speed up ComputeCacheEntropy by 40 % a total impact of 1 % on encoding speed This allows for performance neutral removal of the binary search in cache bits selection. This will give a small improvement in compression density. Change-Id: If5d4d59460fa1924ce71af977320834a47c2054a	2015-07-20 17:13:44 -07:00
Jyrki Alakuijala	17eb609916	lossless: Allow copying from prev row in rle-mode. 0.21 % compression density improvement for 1000 png corpus in lossless mode 0.50 % compression density improvement for 1000 png corpus in lossy mode Change-Id: I14ee8c427ae5d3e116b0ee6695fcdea3321a319d	2015-07-20 17:13:43 -07:00
Jyrki Alakuijala	52931fd548	lossless: combine the Huffman code with extra bits gives 2 % speedup 24.9 -> 25.5 MP/s for a photo with -q 0 -m 0 Change-Id: If9ae04683a86dd7b1fced2183cf79b9349a24a9e	2015-07-07 20:24:28 -07:00
Jyrki Alakuijala	c4855ca249	lossless: Inlining add literal this is a simple speedup of about 1-2 % Change-Id: I0c7b01c0a69f4aeaf363ffda05a28871f1def696	2015-07-07 20:24:28 -07:00
Jyrki Alakuijala	8e9c94dedb	lossless: simplify HashChainFindCopy heuristics for small speedup 0.0003 % worse compression Change-Id: Ic4b6b21e5279231c6321f2cec1c79f7e17e56afa	2015-07-07 20:24:27 -07:00
Jyrki Alakuijala	888429f409	lossless: 0.5 % compression density improvement do not do length 2 matches far away speedup for non compressible data by inserting two literals at a time when no matches are found Change-Id: Ia8e033071f4186bb8148bb2bf13ca37586734aa3	2015-07-07 20:24:27 -07:00
Jyrki Alakuijala	7b23b19808	lossless: Add zeroes into the predicted histograms. Increases compression density by 0.03 % for lossy. Speeds up at least one of the lossy alpha images by 20 %. Palette entropy 'kludge' seems to save 1-2 % on alpha images. Change-Id: I2116b8d81593ac8173bfba54a7c833997fca0804	2015-07-07 20:24:27 -07:00
Jyrki Alakuijala	85b44d8a69	lossless: encoding, don't compute unnecessary histo share the computation between different modes 3-5 % speedup for lossless alpha 1 % for lossy alpha no change in compression density Change-Id: I5e31413b3efcd4319121587da8320ac4f14550b2	2015-07-07 20:24:26 -07:00
Jyrki Alakuijala	d92453f381	lossless: Remove about 25 % of the speed degradation introduced in: "lossless: 0.37 % compression density improvement" Uses the statistics of red and blue histograms to decide if to run cross color correction at all. Improves compression density by 0.02 % or so. Change-Id: I47429557e9cdbd9fa90c584696f241b17427d73f	2015-07-07 20:24:26 -07:00
Jyrki Alakuijala	2cce031704	Faster alpha coding for webp No significant size degradation (+0.001 %) for 1000 image corpus Fixes the 8 ms vs 2 ms degradation from: "lossless: 0.37 % compression density improvement" Change-Id: Id540169a305d9d5c6213a82b46c879761b3ca608	2015-07-07 20:24:25 -07:00
Jyrki Alakuijala	5e75642efd	lossless: rle mode not to accept lengths smaller than 4. Gives a compression gain of 0.22 % Change-Id: I0f3b8dad6b4c1bfb16eab095a467f34466b9e3b7	2015-07-07 20:24:25 -07:00
Jyrki Alakuijala	84326e4ab0	lossless: Less code for the entropy selection Tested: 1000 png corpus gives same results Change-Id: Ief5ea7727290743b9bd893b08af7aa7951f556cb	2015-07-07 20:24:25 -07:00
Jyrki Alakuijala	16ab951abf	lossless: 0.37 % compression density improvement counting the entropy expectation for five different configurations: palette non-predicted non-predicted with subtract green predicted predicted with subtract green and choose the strategy with the smallest expected entropy Change-Id: Iaaf209c0d565660a54a4f9b3959067afb9951960	2015-07-07 20:24:24 -07:00
skal	ac76801159	introduce FTransform2 to perform two transforms at a time. FTransform goes from ~12.0% to 11.5% total CPU time. Change-Id: Ibcb23155324f4fd8b235563f80668531c781f624	2015-05-18 21:06:15 -07:00
James Zern	dbba67d1e7	histogram.h: cosmetics: remove unnecessary includes Change-Id: Ia8277d3587534c2a1af05d3df57a6973a68be16d	2015-04-17 12:23:06 -07:00
Pascal Massimino	7fa67c9b9e	change GetPixPairHash64() return type to uint32_t Change-Id: Ibb61c1631d7a4bcda5417b5a85864d5e2c3f3858	2015-04-16 00:55:25 -07:00
Pascal Massimino	7073bfb3ee	Merge "split 64-mult hashing into two 32-bit multiplies"	2015-04-15 23:04:47 -07:00
Pascal Massimino	7fe357b8c0	split 64-mult hashing into two 32-bit multiplies Speed-wise equivalent on x86 and ARM (maybe a tad faster, hard to tell). Note that the two 32-bit multiplies are not strictly equivalent to the 64-bit one, since we're missing one carry propagation. In practice, no observable difference was seen because of this slightly different hashing result. Change-Id: I8f2381175eae1cb20dabf149e6b27e1768fba6ab	2015-04-15 17:45:19 +02:00
Pascal Massimino	6121413415	remove VP8Residual::cost unused field Change-Id: Id494475b05c540b40fd104594acbcaa783b88d77	2015-04-15 01:56:31 -07:00
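
Two entries in the log above (2c08aac81a and 010ca3d10d) move away from direct word-sized pointer access so that the code works on buffers that are not naturally aligned. The sketch below illustrates that general idea in C. The names load_u32, store_u32 and match_length are hypothetical and chosen for illustration only; they are not the functions added in those commits, and the real libwebp code may differ. Note also that 010ca3d10d describes a memcmp-based solution, whereas this sketch compares word by word through memcpy-based loads.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value from an address that may be
 * unaligned. memcpy avoids the undefined behavior of casting the pointer
 * to uint32_t* and dereferencing it. */
static uint32_t load_u32(const uint8_t* ptr) {
  uint32_t v;
  memcpy(&v, ptr, sizeof(v));
  return v;
}

/* Hypothetical helper: write a 32-bit value to a possibly unaligned address. */
static void store_u32(uint8_t* ptr, uint32_t v) {
  memcpy(ptr, &v, sizeof(v));
}

/* Hypothetical helper: count how many leading 32-bit words two buffers have
 * in common, up to max_len words. Both buffers must hold at least
 * 4 * max_len bytes. No alignment is assumed for either pointer. */
static int match_length(const uint8_t* a, const uint8_t* b, int max_len) {
  int n = 0;
  assert(max_len >= 0);
  while (n < max_len && load_u32(a + 4 * n) == load_u32(b + 4 * n)) {
    ++n;
  }
  return n;
}
```

Compilers typically turn a fixed-size memcpy like this into a single load or store on targets where unaligned access is cheap, so the portable form usually costs little.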

962 Commits