libwebp

mirror of https://github.com/webmproject/libwebp.git synced 2025-08-11 02:20:33 +02:00

Author	SHA1	Message	Date
James Zern	8e42ba4c80	simplify WEBP_EXTERN macro including the type in the macro doesn't bring much benefit to ordering, current platforms work with a prefix, this would be insufficient if the attribute needed to follow the function prototype. this form makes it easier to override on the command line. BUG=webp:355 Change-Id: Iba41ec0bb319403054be0e899c4cc472dd932fd9	2017-07-31 18:27:52 -07:00
James Zern	92982609bc	dsp.h: fix -Wundef w/__mips_dsp_rev Change-Id: I552a543c7b039774041b43ace75b0cbea566b119	2017-07-11 16:12:32 -07:00
James Zern	4ea49f6b82	rescaler_sse2.c: fix WEBP_RESCALER_FIX -> _RFIX typo quiets -Wundef Change-Id: I8f1facf401b6f1ab393005c93086ac3e2ae354d5	2017-07-11 15:35:27 -07:00
James Zern	b34a9db1a1	cosmetics,dec_sse2: remove some redundant comments Change-Id: I5a59d6dde9b6638b318f36d51d0d53870a3de273	2017-07-06 23:19:18 -07:00
Vincent Rabaud	8acb4942f7	Remove the argb* files. Half of the functionality was duplicated. The rest is about the alpha channel handling so we might as well put it in the appropriate file. Change-Id: I8d5ef0afce82cc4842ab7132fd97995c42e6140a	2017-06-25 14:44:33 +02:00
Vincent Rabaud	7ca0df1363	Have the SSE2 version of PackARGB use common code. The common code actually got sped-up by 25% by using the code from PackARGB. Change-Id: I94be6ccff2bfe02fff13c8e2698669e6a0d8fc74	2017-06-20 17:41:14 +02:00
Vincent Rabaud	8f6df1d0b9	Unroll Predictors 10, 11 and 12. We see the following speed-ups: 10 -> 13% 11 -> 13% 12 -> 13% Change-Id: I4734fd388d0f4e508884d0b123976bf2cbe69d2f	2017-06-08 20:37:47 +02:00
Vincent Rabaud	e4eb458741	lossless, VP8LTransformColor_C: make sure no overflow happens with colors. Change-Id: Iec0d07cf1188ba96391cdb1b62131fc1469dfac6	2017-05-24 11:34:40 +02:00
Pascal Massimino	faf42213f4	NEON: implement ConvertRGB24ToY/BGR24/ARGB/RGBA32ToUV/ARGBToUV Change-Id: Ie68aaed36d17f56d998c1b284514860cf5d28b8a	2017-05-09 15:57:20 +02:00
Pascal Massimino	f768218966	yuv: rationalize the C/SSE2 function naming + implement some easy missing targets in SSE2 (565/4444) Change-Id: Ib575f7ada2a0ed7309cddd238f8bfc0e8999f145	2017-04-21 13:52:25 +02:00
Pascal Massimino	52245424b0	NEON implementation of some Sharp-YUV420 functions Change-Id: I449ef9c76b06f971f6e2ad7f9db96bf906d8fe1f new-file: dsp/yuv_neon.c	2017-04-18 19:22:37 +02:00
Pascal Massimino	28c37ebd5a	VP8LEnc: remove use of BitsLog2Ceiling() was only used once. Better fall back for Log2Floor. Change-Id: Ibcc26505440971bffe62ba6aca3d179ca85791d4	2017-03-20 02:58:16 -07:00
James Zern	80a2218668	ssim.c: remove dead include Change-Id: Ia4be534b3b95d5d9f712ff53e530c98b942df860	2017-02-21 20:17:19 -08:00
Pascal Massimino	693bf74ec0	move the SSIM calculation code in ssim.c / ssim_sse2.c Change-Id: I63a63fa7f44f257f2e17e45358b206c23069c448	2017-02-21 12:53:35 +01:00
Pascal Massimino	4105d565d3	disable WEBP_USE_XXX optimisations when EMSCRIPTEN is defined Currently, none are available. If WEBP_HAVE_SSE2 eventually works, we'll have to refine these conditionals. BUG=webp:261 Change-Id: Ibc63ee1c013f2a4169eeb85cc8b6317b6420c2ad	2017-02-08 15:44:20 +00:00
Parag Salasakar	aa893914fc	Add clang build fix for MSA Change-Id: If139f4ecbdce756c69ba4ae032a70f81179683f8	2017-02-01 17:45:17 +05:30
Pascal Massimino	4f3e3bbd44	disable GradientUnfilter_NEON Compile with XCode, it appears quite slower than the C-version, especially for arm64. Change-Id: Ic46dba184a36be454fef674129d2f909003788fc	2017-01-25 16:33:26 -08:00
Pascal Massimino	79bf46f120	rename the pretentious SmartYUV into SharpYUV Change-Id: Ifeeb9cb85896c5f3ba0cc1c2c821f8d00295f69e	2017-01-20 14:36:21 +01:00
James Zern	668e1dd44f	src/{dec,enc,utils}: give filenames a unique suffix this avoids duplicates between these trees and dsp/, e.g., enc/tree.c, dec/tree.c, making pulling the whole library source tree into one target possible BUG=webp:279 Change-Id: I060a614833c7c24ddd37bf641702ae6a5eef1775	2017-01-19 19:09:48 -08:00
Pascal Massimino	71c53f1aeb	NEON: speed-up strong filtering The sub-expression trick removes two constants and two vmlal_s8 instructions. Change-Id: I200022573b4880871b528b13a11a8f3d95def113	2017-01-19 20:46:48 +00:00
Pascal Massimino	749a45a520	Merge "NEON: implement alpha-filters (horizontal/vertical/gradient)"	2017-01-17 15:13:08 +00:00
Pascal Massimino	74c053b57d	Merge "NEON: fix overflow in SSE NxN calculation"	2017-01-17 15:10:54 +00:00
Pascal Massimino	1de931c669	NEON: implement alpha-filters (horizontal/vertical/gradient) gradient-filter code is not much faster, but maybe improvable in the future. Change-Id: Ia16070e409fe8703b02276166f19526917df6b35	2017-01-17 15:44:46 +01:00
Pascal Massimino	9b3aca404d	NEON: fix overflow in SSE NxN calculation vmlal_u8() is prone to overflow during the accumulation. There was a mismatch happening at low q mostly, because in this case the distortion is important and the accumulated sum was larger than 16bit-unsigned. Change-Id: I1a08a2f744bcdf0b26647e61b9ee92a0c2e28fe8	2017-01-17 11:47:36 +01:00
Pascal Massimino	1c07a3c639	dsp: WebPExtractGreen function for alpha decompression + NEON implementation Change-Id: I67204f99d6e4c5974718bdf21dad30381978f72c	2017-01-17 09:33:25 +00:00
Pascal Massimino	8fda56126e	Merge "add a kSlowSSSE3 feature for CPUInfo"	2017-01-13 07:01:48 +00:00
Pascal Massimino	86bbd24552	add a kSlowSSSE3 feature for CPUInfo This is meant to be used for run-time detection of slow platforms regarding instructions like pshufb and bsr. Adapted from libvpx patch: https://chromium-review.googlesource.com/#/c/367731 Change-Id: I2c22fbb9aae699d87a041393ba1ad5f1f21ff640	2017-01-13 06:19:27 +00:00
Vincent Rabaud	7c2779e95a	Get code to fully compile in C++. Change-Id: I6d8490c8c9b955d90dcc89ee8a9cf29ca0f93b08	2017-01-12 18:03:55 +01:00
Vincent Rabaud	250c358662	Merge "When compiling as C++, avoid narrowing warnings."	2017-01-12 13:00:56 +00:00
Vincent Rabaud	c0648ac2ae	When compiling as C++, avoid narrowing warnings. The gcc compilation warning was: narrowing conversion from ‘int’ to ‘int8_t’ Change-Id: I4803dd60ad04060cdb5d61a1aa98b25215b9d4eb	2017-01-12 13:39:22 +01:00
Pascal Massimino	0d55f60c91	40% faster ApplyAlphaMultiply_SSE2 process four pixels at a time Change-Id: I1dee7f70772be4915654fc6638ef4729a1a239d4	2017-01-12 02:33:09 -08:00
Pascal Massimino	49d0280df1	NEON: implement several alpha-processing functions - ApplyAlphaMultiply - DispatchAlpha - DispatchAlphaToGreen - ExtractAlpha Decoding to Argb / rgbA / ... is 10-15% faster (measured on N4) new file: alpha_processing_neon.c Change-Id: I40f1a809e9885d1031ff0bc886d8d001efa66bca	2017-01-11 17:39:29 +01:00
Pascal Massimino	48b1e85fbe	SSE2: 15% faster alpha-processing functions ApplyAlphaMultiply / MultARGBRow / MultRow we use now: x/255 = (x * 0x8081) >> (16 + 7) and x/255 + .5 = ((x + 128) * 0x0101) >> 16 Change-Id: I8931091316ffc8bbf65aa3402f2e7d2b800e1971	2017-01-11 15:35:16 +01:00
Pascal Massimino	28fe054e73	SSE2: 30% faster ApplyAlphaMultiply() and 15% faster MultARGBRow() by switching to formulae: X / 255 = (X + 1 + (X >> 8)) >> 8 for any 16bit value X. (X / 255 + .5) = (XX + (XX >> 8)) >> 8, with XX = X + 128 Change-Id: Ia4a7408aee74d7f61b58f5dff304d05546c04e81	2017-01-10 23:34:22 +01:00
Pascal Massimino	be0ef6395f	fix a comment typo Change-Id: I0fabd08cd8abd3cea7ddfd2e498507adb0d3c67e	2017-01-10 21:17:13 +01:00
Pascal Massimino	00b08c88c0	Merge "NEON: 5% faster conversion to RGB565 and RGBA4444"	2016-12-22 08:39:01 +00:00
Pascal Massimino	0e7f444702	Merge "NEON: faster fancy upsampling"	2016-12-21 14:53:24 +00:00
Pascal Massimino	b016cb91c5	NEON: faster fancy upsampling 2-3% faster decoding overall Change-Id: I2c53e50dc7e0ade5245cff8cc5d7b96a14062955	2016-12-21 15:23:54 +01:00
Vincent Rabaud	1cb638010c	Call the C function to finish off lossless SSE loops only when necessary. Change-Id: I4e221d80879dc9c90c24d69a40bc5811d73787ad	2016-12-21 14:25:54 +01:00
Vincent Rabaud	875fafc191	Implement BundleColorMap in SSE2. Change-Id: I44cd23647bd0a49330b6b2b3ed08050a5500e58e	2016-12-21 10:44:31 +01:00
Pascal Massimino	341d711c43	NEON: 5% faster conversion to RGB565 and RGBA4444 We use the magic 'shift and insert' instruction instead of the multiple shifts and or's. Change-Id: I48df0320668b502a91792defc0423a9441669d19	2016-12-20 17:01:48 +01:00
Pascal Massimino	a4bbe4b38b	fix indentation Change-Id: I5593fb2441f253c6b8cc43949c11909f19184b55	2016-12-13 22:50:29 -08:00
Pascal Massimino	58fc507842	Merge "PredictorSub: implement fully-SSE2 version"	2016-12-13 11:03:13 +00:00
Pascal Massimino	9cc421675b	PredictorSub: implement fully-SSE2 version and inline the C-version too. Predictor #13 is still a hard one. Change-Id: Iedecfb5cbf216da4e28ccfdd0810286133f42331	2016-12-13 02:19:35 -08:00
James Zern	2423017a28	dsp/lossless.c,cosmetics: fix indent after: `fbba5bc` optimize predictor #1 in plain-C For some reason, gcc has hard time inlining this one... Change-Id: I2e2416593acd4c9d14958d8757bfd284d999100b	2016-12-12 12:53:23 -08:00
Pascal Massimino	fbba5bc2c1	optimize predictor #1 in plain-C For some reason, gcc has hard time inlining this one... Also optimize predictor #0 and #1 for encoding, so we don't have to call the generic pointers VP8LPredictors[...] Change-Id: I1ff31e3b83874b53f84fe23487f644619fd61db9	2016-12-12 17:41:36 +01:00
Pascal Massimino	9ae0b3f65a	Merge "SSE2: slightly (~2%) faster Predictor #1 "	2016-12-12 14:46:21 +00:00
Pascal Massimino	c1f97bd758	SSE2: slightly (~2%) faster Predictor #1 by removing a load from memory Change-Id: If6c4aa7fb99309d09f943393ec772891449971f0	2016-12-12 02:24:38 -08:00
Pascal Massimino	ea664b8995	SSE2: 10% faster Predictor #11 Change-Id: I14ae5f6603071b86dfdbe8e6f7dfdbe5d8510185	2016-12-12 02:20:41 -08:00
Pascal Massimino	b3fb8bb602	slightly faster Predictor #11 in NEON (+some slight modifications on Predictor #12) Change-Id: Ic2132dcd83d961cd069fa01ca1670e35e35274e2	2016-12-08 07:32:51 -08:00
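
Note on the divide-by-255 formulae quoted in commits 48b1e85fbe and 28fe054e73: the scalar sketch below restates the four identities from those commit messages and compares them against plain integer division. It is an illustration only, not the SSE2 code from libwebp. All function names are hypothetical, and the checked range (0 to 255 * 255, the largest product of two 8-bit values) is an assumption about how such formulae are typically used in alpha multiplication.

```c
#include <assert.h>
#include <stdint.h>

/* All names below are hypothetical; none of them come from libwebp. */

/* Truncating x / 255, using x/255 = (x * 0x8081) >> (16 + 7). */
static uint32_t Div255FloorMul(uint32_t x) {
  return (x * 0x8081u) >> 23;
}

/* Rounding x / 255, using x/255 + .5 = ((x + 128) * 0x0101) >> 16. */
static uint32_t Div255RoundMul(uint32_t x) {
  return ((x + 128u) * 0x0101u) >> 16;
}

/* Truncating x / 255, using X / 255 = (X + 1 + (X >> 8)) >> 8. */
static uint32_t Div255FloorShift(uint32_t x) {
  return (x + 1u + (x >> 8)) >> 8;
}

/* Rounding x / 255, using (XX + (XX >> 8)) >> 8 with XX = X + 128. */
static uint32_t Div255RoundShift(uint32_t x) {
  const uint32_t xx = x + 128u;
  return (xx + (xx >> 8)) >> 8;
}

/* Counts inputs in [0, 255 * 255] where any formula disagrees with
 * ordinary integer division. The rounding reference is
 * floor(x / 255 + 0.5), written as (2x + 255) / 510 to stay in integers. */
static int CountDiv255Mismatches(void) {
  int bad = 0;
  uint32_t x;
  for (x = 0; x <= 255u * 255u; ++x) {
    const uint32_t floor_ref = x / 255u;
    const uint32_t round_ref = (2u * x + 255u) / 510u;
    if (Div255FloorMul(x) != floor_ref) ++bad;
    if (Div255FloorShift(x) != floor_ref) ++bad;
    if (Div255RoundMul(x) != round_ref) ++bad;
    if (Div255RoundShift(x) != round_ref) ++bad;
  }
  return bad;
}

/* Aborts if any of the four formulae deviates on the checked range. */
static void SelfCheckDiv255(void) {
  assert(CountDiv255Mismatches() == 0);
}
```

Calling SelfCheckDiv255() from a test harness exercises every input in the assumed range. The shift-only variants avoid a multiply, which is presumably why they suit 16-bit SIMD lanes, but that reading is an inference from the commit messages rather than something they state.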


731 Commits