libwebp

mirror of https://github.com/webmproject/libwebp.git synced 2025-07-13 06:24:27 +02:00

Author	SHA1	Message	Date
James Zern	b44eda3f60	dsp: add DSP_INIT_STUB generates a stub function when the specific architecture is not enabled, exposing a symbol in the module, avoiding a compiler warning Change-Id: Ia9336e57466a9b5241b85c1c95838e91c9283147	2015-04-02 23:55:35 -07:00
James Zern	182497993b	dsp: s/VP8LSetHistogramData/VP8SetHistogramData/ this function is for lossy encoding; the VP8L prefix is used by lossless Change-Id: I147590a91477a77af51ed79cc640546dfe53abdb	2015-03-24 18:27:41 -07:00
James Zern	fbdcef2401	dsp/enc*.c: rework WEBP_USE_<arch> ifdef add a dummy init rather than repeating the '#ifdef WEBP_USE_...' pattern. Change-Id: I0cf40b500f9b3eed55a3211213db180c7c0dd43b	2015-03-20 19:19:46 -07:00
James Zern	602a00f93f	fix iOS arm64 build with Xcode 6.3 the standard vtbl functions are available there [1][2]. based on a patch from: aaroncrespo fixes issue #243. [1] http://adcdownload.apple.com//Developer_Tools/Xcode_6.3_beta/Xcode_6.3_beta_Release_Notes.pdf [2] Apple LLVM Compiler Version 6.1 - Xcode 6.3 updates the Apple LLVM compiler to version 6.1.0. [...] Support for the arm64 architecture has been significantly revised to align with ARM's implementation, where the most visible impact is that a few of the vector intrinsics have changed to match ARM's specifications. Change-Id: I79a0016f44b9dbe36d0373f7f00a50ab3c2ca447	2015-02-19 12:16:58 -08:00
James Zern	b969f5dfac	dsp: normalize WEBP_TSAN_IGNORE_FUNCTION usage the attribute is only necessary in one location; remove it from the prototypes. Change-Id: I3820a3c34fbb029fd7ac69a1b0a9b76091bdbde2	2015-02-13 15:23:40 -08:00
James Zern	f8740f0d6c	dsp: s/USE_INTRINSICS/WEBP_USE_INTRINSICS/ for consistency with other defines shared across modules Change-Id: I30cdb9f892e9ea48265883f560500ffb1d6799ee	2015-01-12 14:27:36 -08:00
James Zern	a3946b8956	enc_neon: fix building with non-Xcode clang (iOS) check for __apple_build_version__ to distinguish the two; a version check could work as Apple bumped Xcode's to 5.x/6.x, but it's unclear how upstream will deal with their versioning as they go 3.6+, so avoid it for now. Change-Id: I67cda67c4f68e262a92d805a63cc1496374be063	2014-12-10 15:50:26 -08:00
Pascal Massimino	bad775715a	simplify the Histogram struct, to only store max_value and last_nz we don't need to store the whole distribution in order to compute the alpha Later, we can incorporate the max_value / last_non_zero bookkeeping in SSE2 directly. Change-Id: I748ccea4ac17965d7afcab91845ef01be3aa3e15	2014-12-10 10:44:57 +01:00
James Zern	a4c3a31b8f	WEBP_TSAN_IGNORE_FUNCTION: fix gcc compat warning move the attribute to the front of the function to quiet clang warning: GCC does not allow no_sanitize_thread attribute in this position on a function definition Change-Id: Ie4cc6e35a07bd00eab67d9cd6801bd2be9cfe676	2014-10-16 18:06:43 +02:00
Pascal Massimino	80247291c6	mark some init function as being safe for thread_sanitizer. introduces the macro WEBP_TSAN_IGNORE_FUNCTION Change-Id: I3de2b6c1a2076fba4da7ae50322551e026b2082b	2014-10-16 16:34:07 +02:00
James Zern	7534d71640	enc_neon: initialize vectors w/vdup_n_u32 replaces {} initialization gnu-ism Change-Id: I5a7b2d4246f0205e4bfb7f4b77d720c47d8674ec	2014-10-09 12:35:41 +02:00
Yang Zhang	ab70794ddb	rewrite Disto4x4 in enc_neon.c with intrinsic. Performance test: Platform: A9; Input data: bryce.yuv 11158x2156; performance of assembly is the base (a lower ratio is better). toolchain / assembly / intrinsic: gcc4.6 / 100% / 97.15%; gcc4.8 / 100% / 95.51%. Change-Id: Idc2446685acdeb58a4dbdcdae533c68a83a1b879	2014-09-23 18:28:36 -07:00
skal	73d361dd5f	introduce VP8EncQuantize2Blocks to quantize two blocks at a time No speed diff for now. We might reorder better the instructions later, to speed things up. Change-Id: I1949525a0b329c7fd861b8dbea7db4b23d37709c	2014-08-25 20:21:42 -07:00
James Zern	953acd56a4	enc_neon: enable QuantizeBlock for aarch64 vtbl4_u8 is available everywhere except iOS arm64: use vtbl2q_u8 there with a corresponding change in the load. Change-Id: Ib84212dda3c7875348282726c29e3b79b78b0eac	2014-08-20 11:48:25 -07:00
James Zern	e300c9d819	cosmetics fix some indent/whitespace, remove a few duplicate includes, extra semi-colons Change-Id: If937182b40a21e0f2028496e7b4b06c6e8a41352	2014-08-06 12:10:59 -07:00
James Zern	e59f53600f	neon: normalize vdup_n_* usage with constants, prefer this over vmov_n_* or vcreate_* Change-Id: Ia84b2a82faea58e2626211a7e2257e0ba4af358a	2014-07-01 00:55:05 -07:00
James Zern	bc03670f01	neon: add INIT_VECTOR4 used to initialize NxMx4 vector types replaces initialization via '{{ }}' gnu-ism. Change-Id: I0da7b3d321f3d48579b7863fb2e4d3f449ae7f5e	2014-07-01 00:18:23 -07:00
James Zern	dc7687e51b	neon: add INIT_VECTOR2 used to initialize NxMx2 vector types replaces initialization via '{{ }}' gnu-ism. Change-Id: I4accc305c7dd4c886b63c22e38890b629bffb139	2014-06-30 23:52:42 -07:00
James Zern	9251c2f6d2	(enc|dec)_neon: use vcreate_*() where appropriate this is more portable than {} initialization. more involved cases are left for a follow-up. Change-Id: If8783423d17e90694b168a64ba313ed62ce2cc17	2014-05-27 16:26:56 -07:00
James Zern	1ba61b09f9	enable NEON intrinsics in aarch64 builds avoids functions that use vtbl? as in iOS builds these are marked unavailable Change-Id: I17aedc3c7dc8f1d5be0941205de0b22c3772ef1b	2014-05-03 12:37:42 -07:00
James Zern	b9d2bb67d6	dsp/neon.h: coalesce intrinsics-related defines Change-Id: Ifadd41a5bbf7f99eeb6d75d2b67daa25e0544946	2014-05-03 11:34:07 -07:00
pascal massimino	3f3d717a6c	Merge "enc_neon: enable intrinsics-only functions"	2014-04-27 02:05:53 -07:00
James Zern	42b35e086b	enc_neon: enable intrinsics-only functions CollectHistogram / SSE* / QuantizeBlock have no inline equivalents, enable them where possible and use USE_INTRINSICS to control borderline cases: it's left undefined for now. Change-Id: I62235bc4ddb8aa0769d1ce18a90e0d7da1e18155	2014-04-26 19:09:04 -07:00
James Zern	5e1a17ef4b	enc_neon: move Transpose4x4 to dsp/neon.h + reuse it in TransformWHT() Change-Id: Idfbd0f9b58d6253ac3d65ba55b58989c427ee989	2014-04-26 14:06:04 -07:00
James Zern	98519dd5c1	enc_neon: convert Disto4x4 to intrinsics Change-Id: I0f00d5af2de2301e8237c2a38a9612d3645abad6	2014-04-17 18:29:31 -07:00
Pascal Massimino	fe9317c9bf	cosmetics: * remove MIPS32 suffix from static function names * fix a long line in enc_neon.c Change-Id: Ia1294ae46f471b3eb1e9ba43c6aa1b29a7aeb447	2014-04-16 00:36:19 -07:00
James Zern	953b074677	enc_neon: cosmetics fix/remove incorrect comments + whitespace Change-Id: Id1b86beb23e5bf946e73c34ab7066b6ca177f33b	2014-04-15 23:57:03 -07:00
skal	3f84b5219d	Merge "replace some mult-long (vmull_u8) with mult-long-accumulate (vmlal_u8)"	2014-04-15 07:09:12 -07:00
skal	95203d2d1b	NEON intrinsics version of CollectHistogram apparently faster, but we might save some load/store to/from memory once we settle for the intrinsics-based FTransform() (also: fixed some #ifdef USE_INTRINSICS problems) Change-Id: I426dea299cea0c64eb21c4d81a04a960e0c263c7	2014-04-14 16:47:20 +02:00
skal	7ca2e74bb4	replace some mult-long (vmull_u8) with mult-long-accumulate (vmlal_u8) saves few instructions Change-Id: If8f464bb2894a209bba94825a4db9267df126d47	2014-04-14 15:14:45 +02:00
skal	8ff96a027a	NEON intrinsics version of FTransform: a little bit slower than inlined asm, it seems, so disabled for now. Change-Id: I8c942846f9bedaed57275675ea9dbbcb8dfd9ccd	2014-04-14 09:58:35 +02:00
skal	869eaf6c60	~30% encoding speedup: use NEON for QuantizeBlock() also revamped the signature to avoid having to pass the 'first' parameter Change-Id: Ief9af1747dcfb5db0700b595d0073cebd57542a5	2014-04-08 03:08:22 -07:00
James Zern	f758af6b73	enc_neon: convert FTransformWHT to intrinsics slightly faster than the inline asm in practice not much faster than the C-code in a full NEON build, but still better overall in an Android-like one that only enables NEON for certain files. Change-Id: I69534016186064fd92476d5eabc0f53462d53146	2014-04-08 00:20:19 -07:00
skal	4143332b22	NEON intrinsics for encoding: (1) inverse transform is actually slower with intrinsics + gcc-4.6, so is left disabled for now. With gcc-4.8, it's a bit faster than inlined assembly. (2) Sum of Square error functions provide a 2-3% speed-up. They're enabled by default (since there's no inlined-asm equivalent). Change-Id: I361b3f0497bc935da4cf5b35e330e379e71f498a	2014-04-04 15:02:56 -07:00
James Zern	df230f2723	dsp: reuse wht transform from dec in encoder Change-Id: Ide663db9eaecb7a37fe0e6ad4cd5f37de190c717	2014-03-22 13:25:08 -07:00
James Zern	5227d99146	drop: ifdef __cplusplus checks from C files the prototypes are already marked in the headers Change-Id: I172fe742200c939ca32a70a2299809b8baf9b094	2013-12-13 11:42:13 -08:00
James Zern	4931c3294b	cosmetics: fix some typos Change-Id: I0d6efebd817815139db5ae87236fd8911df4d53c	2013-11-26 19:21:14 -08:00
James Zern	d640614d54	update copyright text rather than symlink the webm/vpx terms, use the same header as libvpx to reference in-tree files based on the discussion in: https://codereview.chromium.org/12771026/ Change-Id: Ia3067ecddefaa7ee01550136e00f7b3f086d4af4	2013-06-06 23:09:14 -07:00
skal	3fe91635df	remove datatype qualifier for vmnv this fix is for clang (LLVM v4.2). gcc was fine. Change-Id: Id4076cda84813f6f9548a01775b094cff22b4be9	2013-05-23 13:52:24 +02:00
skal	9c4ce971a8	Simplify forward-WHT + SSE2 version no precision loss observed speed is not really faster (0.5% at max), as forward-WHT isn't called often. also: replaced a "int << 3" (undefined by C-spec) by a "int * 8" ( supersedes https://gerrit.chromium.org/gerrit/#/c/48739/ ) Change-Id: I2d980ec2f20f4ff6be5636105ff4f1c70ffde401	2013-04-26 08:57:18 +02:00
Pascal Massimino	142c46291e	misc style fix Change-Id: Ib764cb09bd78ab6e72c60f495d55b752ad4dbe4d	2013-03-29 03:13:43 -07:00
Johann	47b7b0ba47	Disto4x4 and Disto16x16 in NEON Change-Id: Ic6d9dbbc97b5025ce359332c33ae306d5d8925a5	2013-01-16 16:57:33 -08:00
skal	42c3b550ba	simplify the fwd transform -> remove two shifts Change-Id: Ibc55bca98588da30553a7870224ffd0e13d57f52	2012-11-15 09:51:35 +01:00
Pascal Massimino	22a0fd9d01	Add NEON version of FTransformWHT Contributed by Wayne Chen (datoudatou at gmail dot com) Change-Id: I007c21db4eeadbf82b89f0963256f965deda7d90	2012-11-08 08:28:51 -08:00
Pascal Massimino	e8b41ad136	add NEON asm version for WHT inverse transform Contributed by Wayne Chen (datoudatou at gmail dot com) + some header cleanup + remove the NEON suffix in static functions Change-Id: I75bf5e9b54cf5e1acc53764c6f081d61690f8e3d	2012-11-01 16:31:01 -07:00
Pascal Massimino	75e5f17e3b	ARM/NEON: 30% encoding speed-up (implements the backward and forward transforms in the encoder) original patch by Wayne Chen (datoudatou at gmail dot com) Change-Id: Ic00f3bffcdf7a924f043006728735c810ee47a57	2012-10-31 14:00:20 -07:00

46 Commits
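
Commit 9c4ce971a8 above mentions replacing an "int << 3" with an "int * 8" because the shift is undefined by the C spec. A minimal sketch of that point follows. It is not code from libwebp: the helper names times8 and shift3_is_defined are hypothetical and exist only for illustration. In C, left-shifting a signed value is undefined when the value is negative or when the result does not fit in the type, whereas the multiplication is well defined for any input whose product fits in an int.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper (not from libwebp): returns non-zero when `v << 3`
 * has defined behavior for a signed int, i.e. v is non-negative and the
 * shifted result is still representable. */
static inline int shift3_is_defined(int v) {
  return v >= 0 && v <= (INT_MAX >> 3);
}

/* Hypothetical helper (not from libwebp): scale by 8 with a multiplication.
 * Unlike the shift, this is well defined for negative inputs as long as the
 * product fits in an int, which the assert documents. */
static inline int times8(int v) {
  assert(v >= INT_MIN / 8 && v <= INT_MAX / 8);
  return v * 8;
}
```

For a negative coefficient such as -5, times8(-5) yields -40, while shift3_is_defined(-5) reports that the shift form would not be safe to use. For non-negative inputs in range, both forms give the same value, and compilers typically emit the same instruction for either.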