libwebp

mirror of https://github.com/webmproject/libwebp.git synced 2025-08-11 02:20:33 +02:00

Author	SHA1	Message	Date
Vikas Arora	498d4dd634	WebP-Lossless encoding improvements. Lossy (with Alpha) image compression gets 2.3X speedup. Compressing lossless images is 20%-40% faster now. Change-Id: I41f0225838b48ae5c60b1effd1b0de72fecb3ae6 (cherry picked from commit `8eae188a62`)	2013-06-11 15:00:45 -07:00
Urvang Joshi	d52b405dbd	Cosmetic fixes Change-Id: Ia878115086edc3fdfee3f0ca76e5e74ea5906f21 (cherry picked from commit `e9a7990bc5`)	2013-03-29 15:49:15 -07:00
Pascal Massimino	6cb4a61825	misc style fix (cherry picked from commit `142c46291e`) Conflicts: src/webp/format_constants.h Change-Id: Ib764cb09bd78ab6e72c60f495d55b752ad4dbe4d	2013-03-29 15:49:05 -07:00
Pascal Massimino	3c8eb9a806	fix bad saturation order in QuantizeBlock Saturation was done on input coeff, not quantized one. This saturation is not absolutely needed: output of FTransformWHT is in range [-16320, 16321]. At quality 100, max quantization steps is 8, so the maximal range used by QuantizeBlock() is [-2040, 2040]. But there's some extra bias (mtx->bias_[] and mtx->sharpen_[]) so it's better to leave this saturation check for now. addresses issue #145 Change-Id: I4b14f71cdc80c46f9eaadb2a4e8e03d396879d28	2013-03-25 14:53:29 -07:00
James Zern	9048494df6	build: fix install race on shared headers subdirectories with more than one target can have the install targets run in parallel with make -jN. group the shared headers in one place to produce a common install target. Change-Id: I1f3aa338a8ee6d681de1e5d0b2c6244d2c3d5451	2013-03-16 13:29:49 -07:00
skal	126974b45b	add LUT-free reference code for YUV->RGB conversion. Reported to eventually be 4% on ARM (see https://code.google.com/p/webp/issues/detail?id=134 for details) We might activate it selectively later... Output values is not bitwise the same as the LUT-based version, but difference is only +/-1 at max. Change-Id: I1cc790ff4459885ed2ae2e72f31c5f3740095f07	2013-03-15 01:37:55 +01:00
skal	b7eaa85d6a	inline VP8LFastLog2() and VP8LFastSLog2 for small values larger values are still dealt with in the .cc ~5% faster encoding Output size is slightly different (variably), because of different floating-point calculation ordering. Change-Id: I6ede18b09c753997cf78aa1199a807d9ddb5d4b4	2013-02-25 22:46:52 +01:00
skal	943386db4b	disable SSE2 for now (until proper run-time detection is ready) Change-Id: I7b8eee52b23fce2f1612ad7d4ed603ffb02620a2	2013-02-20 08:20:47 +01:00
skal	9479fb7d2d	lossless encoding speedup * add SSE2 variant for lossless * speed-up TransformColor calls using specialized TransformColorBlue/Red * Fuse the Shannon Entropy calls to compute it for X and X+Y simultaneously. This latter changes the output size a little bit. Change-Id: Ie5df94da78bf51a58da859c9099b56340da9ec89	2013-02-20 08:13:12 +01:00
skal	b7490f8553	introduce WEBP_REFERENCE_IMPLEMENTATION compile option This flag will make the code use no uint64, no asm, and no fancy trick, but instead aim at being as simple and straightforward as possible. Main use is to help emscripten generate proper JS code. More code needs to be simplified later. Also: tune the BITS values to be 24 and make use of WEBP_RIGHT_JUSTIFY Here are the typical timing for decoding a large image: ARM7-a: dwebp_justify_32_neon Time to decode picture: 3.280s; dwebp_justify_24_neon Time to decode picture: 2.640s; dwebp_justify_16_neon Time to decode picture: 2.723s; dwebp_justify_8_neon Time to decode picture: 2.802s; dwebp_justify_32 Time to decode picture: 4.264s; dwebp_justify_24 Time to decode picture: 3.696s; dwebp_justify_16 Time to decode picture: 3.779s; dwebp_justify_8 Time to decode picture: 3.834s; dwebp_32_neon Time to decode picture: 4.010s; dwebp_24_neon Time to decode picture: 2.725s; dwebp_16_neon Time to decode picture: 2.852s; dwebp_8_neon Time to decode picture: 2.778s; dwebp_32 Time to decode picture: 4.587s; dwebp_24 Time to decode picture: 3.800s; dwebp_16 Time to decode picture: 3.902s; dwebp_8 Time to decode picture: 3.815s; REFERENCE (HEAD) Time to decode picture: 3.818s. x86_64: dwebp_justify_32 Time to decode picture: 0.473s; dwebp_justify_24 Time to decode picture: 0.434s; dwebp_justify_16 Time to decode picture: 0.450s; dwebp_justify_8 Time to decode picture: 0.467s; dwebp_32 Time to decode picture: 0.474s; dwebp_24 Time to decode picture: 0.468s; dwebp_16 Time to decode picture: 0.468s; dwebp_8 Time to decode picture: 0.481s; REFERENCE (HEAD) Time to decode picture: 0.436s. i386: dwebp_justify_32 Time to decode picture: 0.723s; dwebp_justify_24 Time to decode picture: 0.618s; dwebp_justify_16 Time to decode picture: 0.626s; dwebp_justify_8 Time to decode picture: 0.651s; dwebp_32 Time to decode picture: 0.744s; dwebp_24 Time to decode picture: 0.627s; dwebp_16 Time to decode picture: 0.642s; dwebp_8 Time to decode picture: 0.642s. Change-Id: Ie56c7235733a24f94fbfc2e4351aae36ec39c225	2013-02-14 15:46:12 +01:00
pascal massimino	841a3ba5da	Merge "Remove -Wshadow warnings."	2013-01-28 13:15:54 -08:00
Johann	6efed26865	Remove -Wshadow warnings. Accidentally carried some bad habits from SSE code. Copy over fixes from `0d19fbf` Change-Id: I763312c9d176c434ba41f95602bada1aeffebfb2	2013-01-28 12:29:12 -08:00
James Zern	27f8f7420e	upsampling_neon.c: fix build store values to a temporary variable before calling functions that take vector types. removes non-standard constructs such as: (uint8x8x2_t){{ a, b }} fixing: src/dsp/upsampling_neon.c:69:32: error: macro "vst2_u8" passed 3 arguments, but takes just 2 Change-Id: Ib4368e16e3a3efac18024f02be94e76243ade2dc Fixes: https://code.google.com/p/webp/issues/detail?id=140	2013-01-25 19:42:50 -08:00
Mans Rullgard	090b708a00	NEON optimised yuv to rgb conversion - along the lines of the SSE chroma upsampling. Total speedup is ~30%. 4% speed loss on YuvToRgbXX conversion using tables instead of 14-bit fixed precision. TODO(later): investigate, and compare to x86. see http://code.google.com/p/webp/issues/detail?id=134 Change-Id: Idc2261037cd13b4553ca20ecc4c4007099c37009	2013-01-25 15:46:40 -08:00
James Zern	be7c96b069	cosmetics: break a few long lines Change-Id: I785763b974b4e7664ad8e9884251aa2d5274b456	2013-01-23 14:50:19 -08:00
Vikas Arora	0aeba52852	Provide an option to build decoder library. When the config option '--enable-libwebpdecoder' is specified, the lean decoder library 'libwebpdecoder' will be created in addition to libwebp. Also dwebp binary will be linked to libwebpdecoder, if this config option is specified. Change-Id: I9de3e149b59c9a8390fae2ba660941749640e54a	2013-01-23 11:43:36 -08:00
James Zern	2b252a53a8	Merge "Provide option to swap bytes for 16 bit colormodes"	2013-01-22 15:00:39 -08:00
Vikas Arora	94a48b4bc3	Provide option to swap bytes for 16 bit colormodes Color modes: RGB_565 & RGBA_4444 Change-Id: I571b6832b9848e5c4109272978f68623ca373383	2013-01-22 14:51:20 -08:00
skal	0d19fbff51	remove some -Wshadow warnings these are quite noisy, but it's not a big deal to remove them. Change-Id: I5deb08f10263feb77e2cc8a70be44ad4f725febd	2013-01-22 23:06:28 +01:00
skal	a556cb1ab4	Add details and reference about the YUV->RGB conversion Originated from the discussion at http://code.google.com/p/webp/issues/detail?id=134 Change-Id: I24384e2d2f5cf262d8632fc98303cba5e2d27224	2013-01-18 23:26:55 +01:00
pascal massimino	f4a97970de	Merge "Disto4x4 and Disto16x16 in NEON"	2013-01-17 11:07:20 -08:00
Johann	47b7b0ba47	Disto4x4 and Disto16x16 in NEON Change-Id: Ic6d9dbbc97b5025ce359332c33ae306d5d8925a5	2013-01-16 16:57:33 -08:00
vikas arora	e6409adc2e	Remove redundant include from dsp/lossless code. Change-Id: Ie8a497a486653f907c2a27f4027640a3308c6cc8	2013-01-10 15:09:19 -08:00
skal	d5838cd598	faster non-transposing SSE2 4x4 FTransform 1-2% faster. uses pmaddwd instead of transpose + pmullw. Can possibly be simplified further. Change-Id: I420e148816c4c6ab5e2080c9b1719dbbe6762d4e	2012-11-27 08:38:24 +01:00
skal	42c3b550ba	simplify the fwd transform -> remove two shifts Change-Id: Ibc55bca98588da30553a7870224ffd0e13d57f52	2012-11-15 09:51:35 +01:00
skal	118cb31270	Merge "add SSE2 version of Sum of Square error for 16x16, 16x8 and 8x8 case"	2012-11-15 00:07:44 -08:00
skal	e5c3b3f554	Simplify the texture evaluation Disto4x4() We don't need to use the exact forward transform, since it's only a rough evaluation. -> Removed some shifts and rounding constants. Change-Id: I3fdf8b4fe9720473894155e1ad0345f4d1fd9a33	2012-11-14 07:49:31 +01:00
skal	35bfd4c08f	add SSE2 version of Sum of Square error for 16x16, 16x8 and 8x8 case + replace mm_set1_ps(0) by _mm_setzero_si128() Change-Id: I4601033c27466532373f5dabfaf349ce5e5039da	2012-11-14 06:16:49 +01:00
Urvang Joshi	7caab1d8f6	Some cosmetic/comment fixes. Change-Id: Id0613f84cc53fcbeceb913c835a262451687e27b	2012-11-09 10:46:38 -08:00
Pascal Massimino	22a0fd9d01	Add NEON version of FTransformWHT Contributed by Wayne Chen (datoudatou at gmail dot com) Change-Id: I007c21db4eeadbf82b89f0963256f965deda7d90	2012-11-08 08:28:51 -08:00
Pascal Massimino	e8b41ad136	add NEON asm version for WHT inverse transform Contributed by Wayne Chen (datoudatou at gmail dot com) + some header cleanup + remove the NEON suffix in static functions Change-Id: I75bf5e9b54cf5e1acc53764c6f081d61690f8e3d	2012-11-01 16:31:01 -07:00
Pascal Massimino	75e5f17e3b	ARM/NEON: 30% encoding speed-up (implements the backward and forward transforms in the encoder) original patch by Wayne Chen (datoudatou at gmail dot com) Change-Id: Ic00f3bffcdf7a924f043006728735c810ee47a57	2012-10-31 14:00:20 -07:00
skal	f0360b4fcf	add EXPERIMENTAL code for YUV-JPEG colorspace This is mostly for experimentation! Need to define USE_YUVj flag in the code for that. suggested by benwreder at hotmail dot com Change-Id: If0b8e2c1863efc08ce097de6de20f4c7efc3f7e8	2012-10-19 20:15:58 +02:00
skal	5725cabac0	new segmentation algorithm fixes the 'blocky sky problem' (saturation problem: when luma was flat, chroma noise was taking over, resulting in random segment id assigned. When just using a common uniform segment was better). + side clean-up and readibility/experimentability MACRO'ization + added '-map 7' option Change-Id: I35982a9e43c0fecbfdd7b05e4813e8ba8c121d71	2012-09-04 23:09:15 +02:00
Pascal Massimino	5c3a7231ca	Make InitSSE2() functions be empty on non-SSE2 platform this avoids the '.o has no symbols' warning messages Change-Id: I00cf527a9041a810d896bd24b993112af6276323	2012-08-28 11:02:38 -07:00
Pascal Massimino	7c6e60f4bd	make InitSSE2() functions be empty on non-SSE2 platform this avoids the '.o has no symbols' warning messages Change-Id: Idbaa02f5c2f7c632997a26f9507926922d191b6e	2012-08-27 23:40:47 -07:00
Pascal Massimino	c7eb45764f	make VP8DspInitNEON() public this will avoid the "dec_neon.o has no symbol" warning no change in binary size observed on linux. Change-Id: Ia27ae2bc5a03d714afa7e46671fdcf4cb630784d	2012-08-27 00:28:13 -07:00
James Zern	fe1958f17d	RGBA4444: harmonize lossless/lossy alpha values lossy was rounding with a bias toward opaque: [232+, 8] -> [15, 1] now both paths use the range: [240+, 16] -> [15, 1] Change-Id: I3da2063b4959b9e9f45bae09e640acc1f43470c5	2012-08-14 14:02:30 -07:00
James Zern	f06c1d8f7b	Merge "Alignment fix" into 0.2.0	2012-08-09 16:09:58 -07:00
Urvang Joshi	f56e98fd11	Alignment fix Change-Id: Ia5475247f03456b01571ae7531da90f74c068045	2012-08-10 02:10:32 +05:30
Pascal Massimino	528a11af35	fix the ARGB4444 premultiply arithmetic * green was not descaled properly * alpha was over-dithered, making the value '0x0f' not be a fixed point * alpha value was not restored ok. Change-Id: Ia4a4d75bdad41257f7c07ef76a487065ac36fede	2012-08-09 11:32:30 -07:00
Urvang Joshi	a0a488554d	Lossless decoder fix for a special transform order Fix the lossless decoder for the case when it has to apply other inverse transforms before applying Color indexing inverse transform. The main idea is to make ColorIndexingInverse virtually in-place: we use the fact that the argb_cache is allocated to accommodate all unpacked pixels of a macro-row, not just packed pixels. Change-Id: I27f11f3043f863dfd753cc2580bc5b36376800c4	2012-08-08 23:52:08 -07:00
Pascal Massimino	f94b04f045	move some RGB->YUV functions to yuv.h will be needed later Change-Id: I6b9e460db2d398b9fecd5d3c1bbdb3f2f3d4f5db	2012-08-02 17:23:02 -07:00
Pascal Massimino	4af3f6c4d3	fix indentation Change-Id: Ib00b3cdc21ac336a56390f1e71c169e7fd4767a6	2012-08-02 11:55:55 -07:00
Pascal Massimino	323dc4d9b9	remove use of log2(). Use VP8LFastLog2() instead. Order-by-cost mostly unchanged (up to a scaling constant 1/log(2)) (except for few minor diff in < 2% of cases) + remove unused field cost_mode->cache_bits_ Change-Id: I714f8ab12f49a23f5d499a64c741382c9b489a3e	2012-08-02 00:08:58 -07:00
Pascal Massimino	2fc1301577	harmonize authors as "Name (mail@address)" Change-Id: I85bfae61a37de75a5ed945a906002de2ef75149f	2012-07-19 16:09:47 -07:00
James Zern	d5e5ad6356	move decode_vp8.h from webp/ to dec/ the functions contained in it are now private Change-Id: Ief6c81b32ae3f6d97052edac625716e5b909e66e	2012-07-16 22:12:59 -07:00
Pascal Massimino	fcc69923b9	add automatic YUVA/ARGB conversion during WebPEncode() Adds new methods WebPPictureARGBToYUVA() and WebPPictureYUVAToARGB() Depending on the value of picture->use_argb_input, the main call WebPEncode() will convert appropriately. Note that both conversions are lossy, so it's recommended to: * use YUVA input for lossy compression (picture->use_argb_input=0) * use ARGB input for lossless compression (picture->use_argb_input=1) Change-Id: I8269d607723ee8a1136b9f4999f7ff4e657bbb04	2012-06-28 00:34:23 -07:00
Pascal Massimino	802e012a18	fix compilation in non-FANCY_UPSAMPLING mode Change-Id: Id0b1fad3a4888b6e9563a227412b2e6a656d9a2a	2012-06-28 00:26:35 -07:00
Pascal Massimino	637a314f97	remove the now unused *KeepA variants Change-Id: I65217f3075e30bc9a7f38a49d09f01c9d7271d6a	2012-06-27 10:00:48 -07:00
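
The RGBA4444 entry above (fe1958f17d) describes two ways of reducing an 8-bit alpha value to 4 bits: the earlier lossy path mapped [232+, 8] to [15, 1], while both paths now map [240+, 16] to [15, 1]. The sketch below is one reading of those ranges, assuming the earlier behavior was round-to-nearest with clamping and the harmonized behavior is plain truncation to the top four bits. The function names are hypothetical and are not taken from the libwebp sources.

```c
#include <stdint.h>

/* Assumed earlier lossy behavior: round to the nearest 4-bit level and clamp.
 * With this rule 232..255 map to 15 and 8 is the smallest value mapping to 1,
 * which matches the "[232+, 8] -> [15, 1]" range quoted in the log. */
uint8_t alpha4_rounded(uint8_t a) {
  unsigned v = ((unsigned)a + 8u) >> 4;
  return (uint8_t)(v > 15u ? 15u : v);
}

/* Assumed harmonized behavior: keep only the top four bits.
 * With this rule 240..255 map to 15 and 16 is the smallest value mapping to 1,
 * which matches the "[240+, 16] -> [15, 1]" range quoted in the log. */
uint8_t alpha4_truncated(uint8_t a) {
  return (uint8_t)(a >> 4);
}
```

Under these assumptions the two rules disagree only for inputs whose low nibble is 8 or more, which is where the rounding variant is biased toward opaque.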

100 Commits