libwebp

mirror of https://github.com/webmproject/libwebp.git synced 2026-04-09 22:30:02 +02:00

Author	SHA1	Message	Date
skal	69fce2ea78	remove the special casing for res->first in VP8SetResidualCoeffs if res->first = 1, coeffs[0]=0 because of quant.c:749 and line added at quant.c:744 So, no need for the extra case. Going forward, TrellisQuantizeBlock() should also be calling a variant of VP8SetResidualCoeffs() to set the 'last' field. also: fixes a warning for win64 + slight speed-up Change-Id: Ib24b611f7396d24aeb5b56dc74d5c39160f048f0	2014-06-08 06:40:22 +02:00
James Zern	bdfeebaa01	dsp/yuv: move sse2 functions to yuv_sse2.c Change-Id: I2f037ff18e7cf07e8801f49b3a89c1e36ef73000	2014-06-05 23:52:54 -07:00
pascal massimino	46b32e861a	Merge "configure: set WEBP_HAVE_AVX2 when available"	2014-06-05 02:57:42 -07:00
pascal massimino	88305db4fc	Merge "VP8RandomBits2: prevent signed int overflow"	2014-06-05 01:46:42 -07:00
James Zern	73fee88c4a	VP8RandomBits2: prevent signed int overflow 'diff' at its largest may be INT_MAX; << 1 of anything at or above 1 << 30 will overflow. Change-Id: Idb2b5a9b55acc2f6d5e32be8baaebee3f89919ad	2014-06-04 23:19:03 -07:00
James Zern	db4860b355	enc_sse2: prevent signed int overflow _mm_movemask_epi8 returns a 16-bit mask; << 16 can overflow a signed int. Change-Id: Ia0bb0804fe548fb9b0edb3695e82727506066cda	2014-06-04 23:18:22 -07:00
skal	3fdaf4d28c	Merge "real fix for longjmp warning"	2014-06-04 03:01:40 -07:00
skal	385e334019	real fix for longjmp warning the 'volatile' qualifier was at the wrong place Patch by Paul Pluzhnikov Change-Id: I26e6f311a0ccd145de640b3505fe92965389c1d9	2014-06-04 11:02:42 +02:00
James Zern	230a055501	configure: set WEBP_HAVE_AVX2 when available this is used to set WEBP_USE_AVX2 in files where the build flag won't be used, i.e., dsp/enc.c, which enables VP8EncDspInitAVX2() to be called Change-Id: I362f4ba39ca40d3e07a081292d5f743c649d9d7f	2014-06-03 23:29:23 -07:00
skal	a2ac8a420e	restore original value_/range_ field order no speed change, just for coherency Change-Id: Iaa395bca24f33a14b68ba6920b838ef87d0d0db6	2014-06-03 09:36:56 +02:00
James Zern	5e2ee56fdd	Merge "remove libwebpdspdecode dep on libwebpdsp_avx2"	2014-06-03 00:28:54 -07:00
James Zern	61362db57c	remove libwebpdspdecode dep on libwebpdsp_avx2 it's encode only, libwebpdecoder doesn't need the symbols Change-Id: I5633dd2017a96e60068ae5384f1ba27898d29f83	2014-06-03 00:05:56 -07:00
Pascal Massimino	42c447aeb0	Merge "lossy bit-reader clean-up:"	2014-06-02 23:53:00 -07:00
James Zern	479ffd8b5d	Merge "remove unused #include's"	2014-06-02 23:07:20 -07:00
James Zern	9754d39a4e	Merge "strong filtering speed-up (~2-3% x86, ~1-2% for NEON)"	2014-06-02 23:06:18 -07:00
skal	158aff9bb9	remove unused #include's Change-Id: Icd91a4b6a0bde49145f57e3e74a997822c45792c	2014-06-03 08:01:05 +02:00
Pascal Massimino	09545eeadc	lossy bit-reader clean-up: * remove LEFT/RIGHT_JUSTIFY distinction. It's all RIGHT_JUSTIFY now. * simplify VP8GetSigned(), and add some masking branch-less code. Much faster on ARM (~13% speed-up). 8% on x86-64, 5% on MacBook. * split critical implementation into separate bit_reader_inl.h file that is only included where needed (vp8.c / tree.c / bit_reader.c) * bumped BITS value from 16 to 24 for x86-32b too, since it's a bit faster. Change-Id: If41ca1da3e5c3dadacf2379d1ba419b151e7fce8	2014-06-03 07:46:55 +02:00
skal	ea8b0a171d	strong filtering speed-up (~2-3% x86, ~1-2% for NEON) Extract loop invariant and avoid storing/loading samples if they can be re-used. This is particularly interesting when a transpose is involved (HFilter16i). Change-Id: I93274620f6da220a35025ff8708ff0c9ee8c4139	2014-06-03 07:14:23 +02:00
skal	6679f8996f	Optimize VP8SetResidualCoeffs. Brings down WebP lossy encoding timings by 5% Change-Id: Ia4a2fab0a887aaaf7841ce6d9ee16270d3e15489	2014-06-03 06:44:04 +02:00
skal	ac591cf22e	fix for gcc-4.9 warnings about longjmp + local variables Needed to add 'volatile' and some casts. Relevant excerpt from the 'man longjmp': =============== The values of automatic variables are unspecified after a call to longjmp() if they meet all the following criteria: · they are local to the function that made the corresponding setjmp(3) call; · their values are changed between the calls to setjmp(3) and longjmp(); and · they are not declared as volatile. =============== Change-Id: Ic72dc92669513a820369ca52a038afa9ec88091f	2014-05-30 10:19:10 -07:00
James Zern	4dfa86b29c	dsp/cpu: NaCl has no support for xgetbv or the raw opcode; fixes: 934ed4: unrecognized instruction Change-Id: I981870baf0e8b03bf40144ea8ec25eff140d5bc3	2014-05-29 23:02:23 -07:00
James Zern	4c398699ef	Merge "cwebp: fallback to native webp decode in WIC builds"	2014-05-28 15:03:34 -07:00
James Zern	33aa497e1a	Merge "cwebp: add some missing newlines in longhelp output"	2014-05-28 12:52:22 -07:00
skal	c9b340a279	fix missing WebPInitAlphaProcessing call for premultiplied colorspace output (lossless only) Change-Id: Ic2d01c8cf9bc1082f07f348733461eb2ee30288a	2014-05-28 10:44:05 +02:00
pascal massimino	57897bae09	Merge "lossless_neon: use vcreate_*() where appropriate"	2014-05-28 01:36:13 -07:00
pascal massimino	6aa4777b39	Merge "(enc|dec)_neon: use vcreate_*() where appropriate"	2014-05-28 01:34:56 -07:00
skal	0d346e418d	Always reinit VP8TransformWHT instead of hard-coding Change-Id: I2012749ed29bd166d2a96555372f0d9baa784385	2014-05-28 10:21:07 +02:00
James Zern	7d039fc32d	cwebp: fallback to native webp decode in WIC builds this gives precedence to WIC, but attempts to decode the file as WebP if it fails Change-Id: I3d894f39a26aea88897a8ebd345139b82f74f312	2014-05-27 16:28:37 -07:00
James Zern	d471f424da	cwebp: add some missing newlines in longhelp output + update README Change-Id: Ia84d8857d575bc29ab3ce9c0f10264c042067e78	2014-05-27 16:28:02 -07:00
James Zern	bf0e003067	lossless_neon: use vcreate_*() where appropriate this is more portable than {} initialization. more involved cases are left for a follow-up. Change-Id: If7e111864f287ea0a5de6311454aeda37afbb52a	2014-05-27 16:27:46 -07:00
James Zern	9251c2f6d2	(enc|dec)_neon: use vcreate_*() where appropriate this is more portable than {} initialization. more involved cases are left for a follow-up. Change-Id: If8783423d17e90694b168a64ba313ed62ce2cc17	2014-05-27 16:26:56 -07:00
skal	399b916d27	lossy decoding: correct alpha-rescaling for YUVA format The luminance needs to be pre- and post- multiplied by the alpha value in case of rescaling, for proper averaging. Also: - removed util/alpha_processing and moved it to dsp/ - removed WebPInitPremultiply() which was mostly useless and merged it with the new function WebPInitAlphaProcessing() Change-Id: If089cefd4ec53f6880a791c476fb1c7f7c5a8e60	2014-05-27 15:27:13 -07:00
James Zern	78c12ed8e6	Merge "Makefile.vc: add rudimentary avx2 support"	2014-05-27 11:13:40 -07:00
skal	dc5b122f23	try to remove the spurious warning for static analysis Change-Id: Ib81f16c70a0bfad05021401c1cf6788c974b63bd	2014-05-26 18:31:00 +02:00
James Zern	ddfefd624c	Makefile.vc: add rudimentary avx2 support similar to makefile.unix: > nmake /f Makefile.vc CFG=release-static HAVE_AVX2=1 from the msdn: The /arch:AVX2 option and __AVX2__ macro were introduced in Visual Studio 2013 Update 2, version 12.0.34567.1 (Update 2, version 12.0.30501.00 seems to work) Change-Id: I649ee47c9fdc399fc71a8ac8464728608d9b6412	2014-05-23 20:52:02 -07:00
Pascal Massimino	a891164398	Merge "simplify VP8LInitBitReader()"	2014-05-22 22:36:41 -07:00
Pascal Massimino	fdbcd44dd3	simplify VP8LInitBitReader() gcc was generating very complex code, one for each case of br->len_ values! also, pretty-fy the mask constants Change-Id: If62b1e8266f3fe5334517305113038d2ea8a6b42	2014-05-22 21:44:16 -07:00
James Zern	7c004287af	makefile.unix: add rudimentary avx2 support $ make -f makefile.unix HAVE_AVX2=1 will define -mavx2 for src/dsp/*_dsp.c Change-Id: Id9651bda54da057cb051dc70f7dcd008a3f803f4	2014-05-22 18:38:40 -07:00
James Zern	515e35cfb1	Merge "add stub dsp/enc_avx2.c"	2014-05-22 18:28:38 -07:00
skal	a05dc1402c	SSE2: yuv->rgb speed-up for point-sampling - use statically initialized tables (if WEBP_YUV_USE_SSE2_TABLES is defined) - use SSE2 row conversion for yuv->ARGB / RGBA / ABGR / RGB / BGR - clean-up and harmonize the WebpUpsamplers[] usage. Change-Id: Ic5f3659a995927bd7363defac99c1fc03a85a47d	2014-05-22 09:56:47 +02:00
James Zern	178e9a69ae	add stub dsp/enc_avx2.c VP8EncDspInitAVX2 is included in sse2 builds for now, later a configure flag should be added to avoid the stub when avx2 is unavailable/disabled Change-Id: I6127b687c273f46f41652aaf8e3b86ae3cfb8108	2014-05-22 00:31:46 -07:00
James Zern	1b99c09cdc	Merge "configure: add a test for -mavx2"	2014-05-22 00:30:10 -07:00
James Zern	fe72807112	configure: add a test for -mavx2 sets AVX2_FLAGS; currently unused Change-Id: Ie07ee6c2fa7c1f0748430010a9f207b1723b6def	2014-05-21 23:17:21 -07:00
James Zern	e46a247c87	cpu: fix check for __cpuidex availability __cpuidex was added in VS2008 /SP1/ Change-Id: Ie49b00b0246bd6537c0ed583412f17d6fd135baa	2014-05-21 22:59:47 -07:00
skal	176fda2650	fix the bit-writer for lossless in 32bit mode Sometimes, we can write 18bit or more at time, and it would overflow the 32bit accumulator. Also clarified the num-bits limitations (and exposed VP8L_MAX_NUM_BIT_READ in bit_reader.h) fixes http://code.google.com/p/webp/issues/detail?id=200 Seems a bit faster (use of local fields for bits_ / used_) also: added the __QNX__ bswap while at it. Change-Id: I876db93a931db15b083cf1d838c70105effa7167	2014-05-22 07:19:22 +02:00
James Zern	541784c710	dsp.h: add a check for AVX2 / define WEBP_USE_AVX2 Change-Id: I90cc870f0bb4426af701779c367587dc2ae79c8b	2014-05-21 20:46:28 -07:00
James Zern	bdb151ee80	dsp/cpu: add AVX2 detection currently unused. https://software.intel.com/en-us/articles/how-to-detect-new-instruction-support-in-the-4th-generation-intel-core-processor-family http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf Change-Id: I314200f890c58b9a587b902b214f90deb95f0579	2014-05-20 22:48:54 -07:00
Pascal Massimino	ab9f2f8685	Merge "revamp the point-sampling functions by processing a full plane"	2014-05-20 15:21:31 -07:00
Pascal Massimino	a2f8b28905	revamp the point-sampling functions by processing a full plane -nofancy is slower than fancy upsampler, because the latter has SSE2 optim. Change-Id: Ibf22e5a8ea1de86a54248d4a4ecc63d514c01b88	2014-05-20 15:13:44 -07:00
Pascal Massimino	ef076026af	use decoder's DSP functions for autofilter -af is now faster (6-7%), since we're using the SSE2 variant Output is binary the same as before. Change-Id: If75694594c9501cd486b8f237a810ddcc145cadd	2014-05-20 14:55:05 -07:00
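
Commits 73fee88c4a and db4860b355 both address the same class of bug: left-shifting a signed int far enough that the result no longer fits, which is undefined behavior in C. The sketch below is not the libwebp code. It only illustrates the usual remedy of casting to an unsigned type before shifting, and the helper name CombineMasks16 is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of libwebp: joins two 16-bit masks (for
 * example two results of _mm_movemask_epi8(), which are returned as int)
 * into a single 32-bit value. */
static uint32_t CombineMasks16(int lo, int hi) {
  /* Both inputs are expected to be 16-bit masks. */
  assert(lo >= 0 && lo <= 0xffff);
  assert(hi >= 0 && hi <= 0xffff);
  /* Cast before shifting so the arithmetic is unsigned. With a 32-bit signed
   * int, 'hi << 16' is undefined whenever bit 15 of 'hi' is set. */
  return ((uint32_t)hi << 16) | (uint32_t)lo;
}
```

Calling CombineMasks16(0xffff, 0x8000) yields 0x8000ffff, a value that an unguarded signed shift could not be relied on to produce.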

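The longjmp fixes in ac591cf22e and 385e334019 rest on the rule quoted from the man page: a local variable that is changed between setjmp() and longjmp() has an unspecified value after the jump unless it is declared volatile. The following is a minimal sketch of that rule, not the libwebp code, and all names in it are hypothetical. For a pointer local, the qualifier has to apply to the pointer itself (T* volatile p) rather than to the pointed-to type, which is the sort of misplacement the second commit describes.

```c
#include <assert.h>
#include <setjmp.h>

/* Hypothetical names, for illustration only. */
static jmp_buf g_env;

/* Simulates an error path that unwinds with longjmp(). */
static void FailingStep(void) {
  longjmp(g_env, 1);
}

/* Returns the number of steps completed before the simulated error. */
static int CountStepsBeforeError(void) {
  /* 'steps' is modified after setjmp() and read after longjmp(), so it must
   * be volatile for its value to be well defined on the error path. */
  volatile int steps = 0;
  if (setjmp(g_env) != 0) {
    return steps;  /* reached through longjmp() */
  }
  steps = 1;
  steps = 2;
  assert(steps == 2);
  FailingStep();
  return -1;  /* not reached */
}
```

With the qualifier in place, CountStepsBeforeError() returns 2. Without it, a compiler is free to keep 'steps' in a register that the jump does not restore, so the returned value would be unspecified.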

2040 Commits