Sam Clegg
ac4f5784a0
Disable NEON code on Native Client
...
The NEON assembler in libwebp has not yet been ported
to Native Client. This change disables it.
Related issue:
https://code.google.com/p/nativeclient/issues/detail?id=3205
Change-Id: I200291db7aa79d40c1f10cff7622c9b8599e6886
2015-03-10 16:17:25 -07:00
James Zern
b969f5dfac
dsp: normalize WEBP_TSAN_IGNORE_FUNCTION usage
...
the attribute is only necessary in one location; remove it from the
prototypes.
Change-Id: I3820a3c34fbb029fd7ac69a1b0a9b76091bdbde2
2015-02-13 15:23:40 -08:00
James Zern
e15560107c
move some cost tables from enc/ to dsp/
...
removes circular dependency between dsp and enc.
since:
a987fae MIPS: dspr2: added optimization for function GetResidualCost
Change-Id: Ifeb8fc02de89e2ba982ed7ffacd925d649bfec3c
2015-02-11 16:10:06 -08:00
James Zern
e3d9771aa1
VP8EncDspCostInit*: add missing TSan annotations
...
Change-Id: I4cdb84bc8c9a8c6aa34b5773c8fb69e5810a9809
2015-02-09 22:39:14 -08:00
Pascal Massimino
a987faedfa
MIPS: dspr2: added optimization for function GetResidualCost
...
set/get residual C functions moved to new file in src/dsp
mips32 version of GetResidualCost moved to new file
Change-Id: I7cebb7933a89820ff28c187249a9181f281081d2
2015-02-07 02:13:26 -08:00
Pascal Massimino
774d4cb758
make VP8PredLuma16[] array non-const
...
Change-Id: I0ce7e4e847f9fffefb6544db9636068442a2d264
2015-02-04 17:00:22 +01:00
Pascal Massimino
7afdaf8496
Alpha coding: reorganize the filter/unfiltering code
...
Move the filtering code to its own dsp/ spot
New function: VP8FiltersInit()
Change-Id: I0b2041eab42346c59b972f2575b05509e6a8f7b1
2015-01-28 08:02:41 +01:00
Pascal Massimino
d581ba40ba
follow-up: clean up WebPRescalerXXX dsp function
...
by removing redundant RFIX macros and using a plain-C fallback.
Change-Id: I52436c672bf20780b6fe3bcf43fe73e1abac10ff
2015-01-12 15:26:55 -08:00
James Zern
f8740f0d6c
dsp: s/USE_INTRINSICS/WEBP_USE_INTRINSICS/
...
for consistency with other defines shared across modules
Change-Id: I30cdb9f892e9ea48265883f560500ffb1d6799ee
2015-01-12 14:27:36 -08:00
Pascal Massimino
ab66becaae
introduce a separate WebPRescalerDspInit to initialize pointers
...
so that we keep the details of WebPRescaler in utils/rescaler.c
when possible.
Change-Id: Ib6c1029a09b84cbc7a7d2f70dafa4d4d9132cecc
2015-01-12 13:58:30 -08:00
James Zern
4f43d38ca8
enable NEON for Windows ARM builds
...
Change-Id: I230b353214ce44ab29ffd2df6ccd14345d6578e8
2015-01-09 19:11:55 -08:00
pascal massimino
c6d3292738
argb_sse2: cosmetics
...
clarify some variable names in PackARGB() + add some comments
Change-Id: I2bb91d6c52dcbcdebe0f92d5f2136c2d7d11af2a
2015-01-08 00:18:54 -08:00
Pascal Massimino
72d573f693
simplify the PackARGB signature
...
Change-Id: I51570e362126b2681f93211a4f59a3fedb5fd4b5
2015-01-05 02:10:04 -08:00
Djordje Pesut
7ce8788b06
MIPS: dspr2: added optimization for function MakeARGB32
...
inline function MakeARGB32 calls changed to call
via pointers to functions which make (a)rgb for
entire row
Change-Id: Ia4bd4be171a46c1e1821e408b073ff5791c587a9
2014-12-22 12:31:36 +01:00
Pascal Massimino
bad775715a
simplify the Histogram struct, to only store max_value and last_nz
...
we don't need to store the whole distribution in order to compute the alpha
Later, we can incorporate the max_value / last_non_zero bookkeeping
in SSE2 directly.
Change-Id: I748ccea4ac17965d7afcab91845ef01be3aa3e15
2014-12-10 10:44:57 +01:00
Pascal Massimino
66ad372500
factorize BPS definition in dsp.h and add VP8Copy16x8
...
Change-Id: Id73a1e968c96455808755df4d131d74e3e2e135d
2014-12-04 13:45:14 +01:00
James Zern
e18571393d
dsp: initialize VP8PredChroma8 in VP8DspInit()
...
the table becomes non-const to allow for platform-specific optimizations
Change-Id: I32d2b51480020dc653ecfafd20b6b0f096af349f
2014-11-24 22:12:42 -08:00
Pascal Massimino
97c76f1f30
make VP8PredLuma4[] non-const and initialize array in VP8DspInit()
...
also convert 'type *dst' to 'type* dst'
Change-Id: I41ab66ad15b548cc45d1cb8b10bbca4fe1528cae
2014-10-22 18:14:20 +02:00
James Zern
d1c359ef29
fix shared object build with -fvisibility=hidden
...
set WEBP_EXTERN to visibility=default
+ explicitly mark VP8GetCPUInfo as it's referenced within the examples
Change-Id: Ie3d2b15088e888f0b55203b205993eba75899d99
2014-10-17 11:50:52 +02:00
James Zern
a4c3a31b8f
WEBP_TSAN_IGNORE_FUNCTION: fix gcc compat warning
...
move the attribute to the front of the function to quiet clang warning:
GCC does not allow no_sanitize_thread attribute in this position on a
function definition
Change-Id: Ie4cc6e35a07bd00eab67d9cd6801bd2be9cfe676
2014-10-16 18:06:43 +02:00
Pascal Massimino
80247291c6
mark some init functions as being safe for thread_sanitizer.
...
introduces the macro WEBP_TSAN_IGNORE_FUNCTION
Change-Id: I3de2b6c1a2076fba4da7ae50322551e026b2082b
2014-10-16 16:34:07 +02:00
Pascal Massimino
2d9b0a4472
add WebPDispatchAlphaToGreen() to dsp
...
SSE2 version is 2.1x faster
This is used to transfer the alpha plane to green channel before lossless compression.
Change-Id: I01d9df0051c183b1ff5d6eb69961d4f43e33141a
2014-10-06 23:15:44 +02:00
Pascal Massimino
cddd334050
Add a WebPExtractAlpha function to dsp
...
This is the opposite of WebPDispatchAlpha
+ Implement the SSE2 version
Change-Id: I0c297309255f508c5261da8aad01f7e57f924d6c
2014-09-15 08:12:03 +02:00
Pascal Massimino
a6bb9b17d8
SSE2 for inverse Mult(ARGB)Row and ApplyAlphaMultiply
...
Change-Id: Iab5c0e4a4d2b31f86736a9b277e62b6e28c3d2b4
WebPMultRow: ~7x faster
WebPMultARGBRow: ~3x faster
ApplyAlphaMultiply: 60% faster
2014-09-11 07:58:42 +02:00
James Zern
8323a9038d
dsp.h: collect gcc/clang version test macros
...
endian_inl.h already relies on dsp.h, grab the definitions from there.
Change-Id: I445f7d0631723043c55da1070498f89965bec7b1
2014-08-27 19:33:09 -07:00
skal
e6c4b52f28
move static initialization of WebPYUV444Converters[] to the Init function.
...
Split initialization of YUV444Converters[] out of Upsamplers init.
update test for NULL function pointers
Change-Id: I9603f54250f90c85a12ffbecfd6c59e9b06c47e0
2014-08-27 11:36:37 -07:00
skal
73d361dd5f
introduce VP8EncQuantize2Blocks to quantize two blocks at a time
...
No speed diff for now. We might reorder the instructions better later,
to speed things up.
Change-Id: I1949525a0b329c7fd861b8dbea7db4b23d37709c
2014-08-25 20:21:42 -07:00
Djordje Pesut
0b21c30b1a
MIPS: dspr2: added optimization for EmitAlphaRGB
...
New dsp function: WebPDispatchAlpha()
Change-Id: I48e539d22471279ec75185759bc68d18b127f716
2014-08-21 20:39:35 -07:00
Djordje Pesut
569771549a
MIPS: dspr2: added optimizations for VP8YuvTo*
...
VP8YuvToRgb
VP8YuvToBgr
VP8YuvToRgb565
VP8YuvToRgba4444
VP8YuvToArgb
VP8YuvToBgra
VP8YuvToRgba
Change-Id: I22212a125d890e1fd28388fec906a1a5c07ff386
2014-08-19 14:29:32 +02:00
Djordje Pesut
b61c9ceca8
MIPS: dspr2: Optimization of some simple point-sampling functions
...
Change-Id: I6a4ab29bd0cc5a2951a8882cf9997032dc38bd79
2014-08-13 17:18:49 +02:00
Djordje Pesut
98c54107df
MIPS: mips32r2: added optimization for BSwap32
...
gcc < 4.8.3 doesn't translate bswap optimally.
use optimized version always
Change-Id: I979ea26ad6dc0166d3d2f39c4148eb8adfb7ddec
2014-08-12 09:29:13 +02:00
Djordje Pesut
b7e5a5c451
MIPS: detect mips32r6 and disable mips32r1 code
...
Change-Id: Id1325c789a990c9a8704e84e99a22d580303eb8a
2014-08-08 17:29:31 +02:00
James Zern
0524d9e5e8
dsp: detect mips64 & disable mips32 code
...
Change-Id: Icf68dafd5cf0614ca25b36a0252caa1784ac8059
2014-08-01 21:18:53 -07:00
James Zern
32b3137936
configure: move config.h to src/webp/config.h
...
this change has the side-effect of using directory names in the
include, silencing a lint warning.
Change-Id: Ib91cf63a90534e32fadfa5c2372bfdb29f854d02
2014-06-10 23:42:00 -07:00
James Zern
6e61a3a905
configure: test for -msse2
...
+ add a WEBP_HAVE_SSE2 to dsp.h
not all 32-bit toolchain configurations will have sse2 enabled by
default
Change-Id: I7c675e511581f93cf55c79f960fa7efa2df4987e
2014-06-07 19:44:08 -07:00
James Zern
230a055501
configure: set WEBP_HAVE_AVX2 when available
...
this is used to set WEBP_USE_AVX2 in files where the build flag won't be
used, i.e., dsp/enc.c, which enables VP8EncDspInitAVX2() to be called
Change-Id: I362f4ba39ca40d3e07a081292d5f743c649d9d7f
2014-06-03 23:29:23 -07:00
skal
399b916d27
lossy decoding: correct alpha-rescaling for YUVA format
...
The luminance needs to be pre- and post-multiplied by
the alpha value in case of rescaling, for proper averaging.
Also:
- removed util/alpha_processing and moved it to dsp/
- removed WebPInitPremultiply() which was mostly useless
and merged it with the new function WebPInitAlphaProcessing()
Change-Id: If089cefd4ec53f6880a791c476fb1c7f7c5a8e60
2014-05-27 15:27:13 -07:00
skal
a05dc1402c
SSE2: yuv->rgb speed-up for point-sampling
...
- use statically initialized tables (if WEBP_YUV_USE_SSE2_TABLES is defined)
- use SSE2 row conversion for yuv->ARGB / RGBA / ABGR / RGB / BGR
- clean-up and harmonize the WebpUpsamplers[] usage.
Change-Id: Ic5f3659a995927bd7363defac99c1fc03a85a47d
2014-05-22 09:56:47 +02:00
James Zern
541784c710
dsp.h: add a check for AVX2 / define WEBP_USE_AVX2
...
Change-Id: I90cc870f0bb4426af701779c367587dc2ae79c8b
2014-05-21 20:46:28 -07:00
James Zern
bdb151ee80
dsp/cpu: add AVX2 detection
...
currently unused.
https://software.intel.com/en-us/articles/how-to-detect-new-instruction-support-in-the-4th-generation-intel-core-processor-family
http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-optimization-manual.pdf
Change-Id: I314200f890c58b9a587b902b214f90deb95f0579
2014-05-20 22:48:54 -07:00
Pascal Massimino
a2f8b28905
revamp the point-sampling functions by processing a full plane
...
-nofancy is slower than fancy upsampler, because the latter has SSE2 optim.
Change-Id: Ibf22e5a8ea1de86a54248d4a4ecc63d514c01b88
2014-05-20 15:13:44 -07:00
James Zern
df08e67e06
dsp/cpu: add AVX detection
...
currently unused.
https://software.intel.com/en-us/articles/introduction-to-intel-advanced-vector-extensions
similar checks exist in ffmpeg, libyuv. the visual studio inline asm is
based off of libyuv.
Change-Id: I3e233de3492172434e482607a94b99c617f11aad
2014-05-20 00:25:12 -07:00
James Zern
a577b23a0a
dsp/WEBP_USE_NEON: test for __aarch64__
...
__ARM_NEON__ is unset by current linux gcc/clang + android toolchains
for aarch64/arm64 builds.
Change-Id: Ib2ca172ea6fcf046e4ced19a431088674c99b7f6
2014-05-14 00:07:13 -07:00
James Zern
1ba61b09f9
enable NEON intrinsics in aarch64 builds
...
avoids functions that use vtbl?, as in iOS builds these are marked
unavailable
Change-Id: I17aedc3c7dc8f1d5be0941205de0b22c3772ef1b
2014-05-03 12:37:42 -07:00
James Zern
f937e01261
move LOCAL_GCC_VERSION def to dsp.h
...
+ add LOCAL_GCC_PREREQ and use it in lossless_neon.c
Change-Id: Ic9fd99540bc3e19c027d1598e4530dfdc9b9de00
2014-04-26 19:09:04 -07:00
skal
869eaf6c60
~30% encoding speedup: use NEON for QuantizeBlock()
...
also revamped the signature to avoid having to pass the 'first' parameter
Change-Id: Ief9af1747dcfb5db0700b595d0073cebd57542a5
2014-04-08 03:08:22 -07:00
James Zern
df230f2723
dsp: reuse wht transform from dec in encoder
...
Change-Id: Ide663db9eaecb7a37fe0e6ad4cd5f37de190c717
2014-03-22 13:25:08 -07:00
skal
8992ddb756
use static clipping tables
...
(shared with mips32)
removed abs1[] table along the way
sub-1% speed-up, but still...
Change-Id: I8c29a8a0285076cb3423b01ffae9fcc465da6a81
2014-02-13 19:32:59 -08:00
Djordje Pesut
dd438c9a7d
MIPS: MIPS32r1: Optimization of some simple point-sampling functions. PATCH [6/6]
...
Change-Id: I2020e71e9be5d17d4bf67cabf6c470ca43d5d838
2014-01-29 15:37:31 +01:00
Djordje Pesut
53520911c3
Added support for calling sampling functions via pointers.
...
Change-Id: Ic4d72e6b175a6b27bcdcc8cd97828e44ea93e743
2014-01-29 15:32:35 +01:00