The luminance needs to be pre- and post-multiplied by
the alpha value when rescaling, so that averaging is done properly.
Also:
- removed util/alpha_processing and moved it to dsp/
- removed WebPInitPremultiply(), which was mostly useless,
and merged it into the new function WebPInitAlphaProcessing()
Change-Id: If089cefd4ec53f6880a791c476fb1c7f7c5a8e60
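To illustrate why the alpha multiplication matters when averaging, here is a minimal sketch. It is not the library's rescaler: AverageLumaWithAlpha and AverageLumaPlain are hypothetical helpers, and the two-sample case stands in for a full rescaling kernel.

```c
#include <stdint.h>

// Hypothetical helper (not a libwebp function): averages two luma samples,
// weighting each one by its alpha so transparent pixels do not bleed in.
static uint8_t AverageLumaWithAlpha(uint8_t y0, uint8_t a0,
                                    uint8_t y1, uint8_t a1) {
  const uint32_t alpha_sum = (uint32_t)a0 + a1;
  // Pre-multiply: each luma value contributes in proportion to its alpha.
  const uint32_t weighted = (uint32_t)y0 * a0 + (uint32_t)y1 * a1;
  if (alpha_sum == 0) return 0;  // fully transparent: luma is irrelevant
  // Post-multiply by the inverse: divide the alpha back out, with rounding.
  return (uint8_t)((weighted + alpha_sum / 2) / alpha_sum);
}

// Plain average, for comparison: ignores alpha entirely.
static uint8_t AverageLumaPlain(uint8_t y0, uint8_t y1) {
  return (uint8_t)((y0 + y1 + 1) >> 1);
}
```

With an opaque sample of luma 200 next to a fully transparent sample of luma 0, the plain average gives 100, while the alpha-weighted version keeps 200.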
VP8EncDspInitAVX2 is included in sse2 builds for now; later, a configure
flag should be added to avoid the stub when avx2 is unavailable/disabled.
Change-Id: I6127b687c273f46f41652aaf8e3b86ae3cfb8108
the inclusion of the files is harmless when NEON is not enabled and will
allow them to be built with NEON for APP_ABI=arm64-v8a which currently
does not use the '.neon' suffix
Change-Id: I39377876b1b68822c38f4e2396da93c56145fc0f
Functions VP8LFastLog2Slow and VP8LFastSLog2Slow
also: replaced some "% y" by "& (y-1)" in the C-version
(since y is a power-of-two)
Change-Id: I875170384e3c333812ca42d6ce7278aecabd60f0
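The "% y" to "& (y-1)" replacement mentioned above relies on y being a power of two and on the operand being unsigned (or non-negative). A minimal sketch with hypothetical helper names:

```c
#include <assert.h>
#include <stdint.h>

// Returns non-zero if y is a power of two (exactly one bit set).
static int IsPowerOfTwo(uint32_t y) { return y != 0 && (y & (y - 1)) == 0; }

// Hypothetical helper: same result as x % y, valid only for power-of-two y.
// y - 1 is then a mask of the low bits, which are exactly the remainder.
static uint32_t ModPowerOfTwo(uint32_t x, uint32_t y) {
  assert(IsPowerOfTwo(y));
  return x & (y - 1);
}
```

For example, ModPowerOfTwo(37, 8) gives 5, the same as 37 % 8, without a division.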
new file: lossless_neon.c
speedup is ~5%
gcc 4.6.3 seems to be doing some sub-optimal things here,
storing registers on the stack using 'vstmia' and such.
Looks similar to gcc.gnu.org/bugzilla/show_bug.cgi?id=51509
I've tried adding -fno-split-wide-types and it does help
the generated assembly. But the overall speed gets worse with
this flag. We should only compile lossless_neon.c with it -> urk.
Change-Id: I2ccc0929f5ef9dfb0105960e65c0b79b5f18d3b0
expose the predictor array as function pointers instead
of each individual sub-function
+ merged Average2() into ClampedAddSubtractHalf directly
+ unified the signature as "VP8LProcessBlueAndRedFunc"
no speed diff observed
Change-Id: Ic3c45dff11884a8330a9ad38c2c8e82491c6e044
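As a rough illustration of exposing predictors through an array of function pointers, consider the sketch below. The type name, the three predictors and their signature are assumptions made for this example only; they do not reproduce the library's actual table.

```c
#include <stdint.h>

// Hypothetical signature: predicts an ARGB pixel from its left neighbor and
// a pointer into the row above.
typedef uint32_t (*PredictorFunc)(uint32_t left, const uint32_t* upper);

static uint32_t PredictBlack(uint32_t left, const uint32_t* upper) {
  (void)left;
  (void)upper;
  return 0xff000000u;  // opaque black
}

static uint32_t PredictLeft(uint32_t left, const uint32_t* upper) {
  (void)upper;
  return left;
}

static uint32_t PredictTop(uint32_t left, const uint32_t* upper) {
  (void)left;
  return upper[0];
}

// Callers index one table instead of referencing each sub-function by name.
static const PredictorFunc kPredictors[] = {
  PredictBlack, PredictLeft, PredictTop
};

static uint32_t Predict(int mode, uint32_t left, const uint32_t* upper) {
  return kPredictors[mode](left, upper);
}
```

A platform-specific init could then overwrite individual table entries with optimized variants, provided the table is made non-const.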
-ffunction-sections / -fdata-sections
can improve final binary size when used with --gc-sections, speed impact
untested
Change-Id: I37f4b5da2f34acede7965c2da2e1b97125473adc
-> split libraries further into decoder / encoder
-> add libwebpdecoder.a in Makefile.unix
-> make dwebp link against libwebpdecoder.a in Makefile.unix
also: in makefile.unix, pass EXTRA_FLAGS to LDFLAGS too
(otherwise, -m32 wouldn't work, e.g.)
Change-Id: Ief3da02a729dd86bbaf949ed048836716941657f
- along the lines of the SSE chroma upsampling.
Total speedup is ~30%.
4% speed loss on YuvToRgbXX conversion using tables instead
of 14-bit fixed precision. TODO(later): investigate, and compare
to x86.
see http://code.google.com/p/webp/issues/detail?id=134
Change-Id: Idc2261037cd13b4553ca20ecc4c4007099c37009
(implements the backward and forward transforms in the encoder)
original patch by Wayne Chen (datoudatou at gmail dot com)
Change-Id: Ic00f3bffcdf7a924f043006728735c810ee47a57
the extended file format is still under development and related
libs/binaries are not fit for release
configure:
--enable/disable-libwebpmux; default is disabled.
makefile.unix:
src/mux/libwebpmux.a and examples/webpmux must be explicitly specified
Makefile.vc:
$(DIRLIB)\libwebpmux.lib and $(DIRBIN)\webpmux.exe must be explicitly
specified
Change-Id: I8246746b256010dd2a2e4de58291222d7eaf0457
Defining LOCAL_ARM_NEON = true can result in neon instructions being
used in portions unprotected by the cpu check.
This change defines a WEBP_USE_NEON/WEBP_ANDROID_NEON pair similar to
the SSE2 code and MSVC.
Change-Id: Ifac010b06e42c73d5aca529baa2198c6796674bd
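The idea of pairing a compile-time define with a runtime cpu check can be sketched as follows. This is only an illustration: SumBytesC, SumBytesNEON, SelectSumBytes and CpuSupportsNEON are hypothetical names, and the assumption is that the build defines WEBP_ANDROID_NEON only for the files it compiles with NEON enabled.

```c
#include <stdint.h>

// Assumption: WEBP_ANDROID_NEON is set by the build for NEON-enabled files
// only, never globally, so plain C code is not compiled with NEON.
#if defined(WEBP_ANDROID_NEON)
#define WEBP_USE_NEON
#endif

typedef int (*SumBytesFunc)(const uint8_t* data, int n);

// Portable fallback, safe on every CPU.
static int SumBytesC(const uint8_t* data, int n) {
  int sum = 0;
  int i;
  for (i = 0; i < n; ++i) sum += data[i];
  return sum;
}

#if defined(WEBP_USE_NEON)
#include <arm_neon.h>

// Hypothetical stand-in for a real runtime query; replace with the
// platform's cpu-feature detection.
static int CpuSupportsNEON(void) { return 0; }

static int SumBytesNEON(const uint8_t* data, int n) {
  uint32x4_t acc = vdupq_n_u32(0);
  uint32_t lanes[4];
  int sum;
  int i = 0;
  for (; i + 8 <= n; i += 8) {
    // Widen 8 bytes to 16 bits, then pairwise-accumulate into 32-bit lanes.
    acc = vpadalq_u16(acc, vmovl_u8(vld1_u8(data + i)));
  }
  vst1q_u32(lanes, acc);
  sum = (int)(lanes[0] + lanes[1] + lanes[2] + lanes[3]);
  for (; i < n; ++i) sum += data[i];  // leftover bytes
  return sum;
}
#endif

// NEON code is reachable only when it was compiled in AND the cpu check
// passes; everything else stays plain C.
static SumBytesFunc SelectSumBytes(void) {
#if defined(WEBP_USE_NEON)
  if (CpuSupportsNEON()) return SumBytesNEON;
#endif
  return SumBytesC;
}
```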
* Method #1 is now calling the lossless encoder on the alpha plane.
The format is not final; it's just a first draft. We need ad-hoc functions.
* removed the now-useless utils/alpha.*
* added utils/quant_levels.h instead
* removed the TCoder code altogether
Change-Id: I636840b6129a43171b74860e0a0fc5bb1bcffc6a
add proper cpu-detection for Android targets
Fixes issue #118 (and is a better solution for #117).
based on patch by pepijn vaneeckhoudt
Change-Id: I6b00ea6d51ca658ccf6a3d55b87b99c01c6805be
patch by pepijn vaneeckhoudt
- Android.mk should include dec/enc/upsampling sse2 variants. This
provides sse2 optimizations when compiling for Android/x86
- LOCAL_ARM_NEON should be set to true when compiling for armeabi-v7a.
Otherwise __ARM_NEON__ is not defined and all neon code is removed by
the preprocessor.
Change-Id: I54f3505757fc5d2d63cca4b64d61be34a0b34eb8
Add predictive filtering option for Alpha plane.
The valid range for the filter option is [0, 5]. Values 0 to 4 correspond
to the prediction methods none, horizontal, vertical, gradient and paeth.
Method 5 tries all the prediction methods (0 to 4) and picks the one
that gives the best compression.
Change-Id: I9244d4a9c5017501a9696c7cec5045f04c16d49b
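The filter selection described above can be sketched roughly as below. Several things here are assumptions made for illustration: the enum and function names, the treatment of missing neighbors as 0, and the use of the sum of absolute residuals as a cheap proxy for compressed size. The actual code may handle borders and cost differently.

```c
#include <stdint.h>
#include <stdlib.h>

enum { FILTER_NONE = 0, FILTER_HORIZONTAL, FILTER_VERTICAL,
       FILTER_GRADIENT, FILTER_PAETH, FILTER_BEST };

static int Clip255(int v) { return v < 0 ? 0 : v > 255 ? 255 : v; }

// Paeth: pick the neighbor closest to left + up - upleft.
static int Paeth(int left, int up, int upleft) {
  const int p = left + up - upleft;
  const int pa = abs(p - left), pb = abs(p - up), pc = abs(p - upleft);
  if (pa <= pb && pa <= pc) return left;
  return (pb <= pc) ? up : upleft;
}

static int Predict(int mode, int left, int up, int upleft) {
  switch (mode) {
    case FILTER_HORIZONTAL: return left;
    case FILTER_VERTICAL:   return up;
    case FILTER_GRADIENT:   return Clip255(left + up - upleft);
    case FILTER_PAETH:      return Paeth(left, up, upleft);
    default:                return 0;  // FILTER_NONE
  }
}

// Cost proxy (assumption): sum of residual magnitudes, residuals taken
// modulo 256. Neighbors outside the plane are treated as 0.
static long FilterCost(const uint8_t* data, int w, int h, int mode) {
  long cost = 0;
  int x, y;
  for (y = 0; y < h; ++y) {
    for (x = 0; x < w; ++x) {
      const int left = (x > 0) ? data[y * w + x - 1] : 0;
      const int up = (y > 0) ? data[(y - 1) * w + x] : 0;
      const int upleft = (x > 0 && y > 0) ? data[(y - 1) * w + x - 1] : 0;
      const uint8_t r =
          (uint8_t)(data[y * w + x] - Predict(mode, left, up, upleft));
      cost += (r < 128) ? r : 256 - r;
    }
  }
  return cost;
}

// Equivalent of option 5: try methods 0 to 4 and keep the cheapest one
// (the lowest method number wins ties).
static int PickBestFilter(const uint8_t* data, int w, int h) {
  int best = FILTER_NONE;
  long best_cost = FilterCost(data, w, h, FILTER_NONE);
  int mode;
  for (mode = FILTER_HORIZONTAL; mode <= FILTER_PAETH; ++mode) {
    const long cost = FilterCost(data, w, h, mode);
    if (cost < best_cost) {
      best_cost = cost;
      best = mode;
    }
  }
  return best;
}
```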
- add check for native log2 to configure
- use a common define (NOT_HAVE_LOG2) to enable use of a local
version on non-autoconf platforms that lack their own,
currently msvc and android.
This uses a negative (NOT_HAVE_) to simplify the ifdef.
Change-Id: Id0610eed507f8bb9c5da338918112853d5c8127a
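A negative define of this kind might be used along the following lines. Only NOT_HAVE_LOG2 comes from the text above; the LOG2 macro, LocalLog2 and the constant name are hypothetical, chosen for this sketch.

```c
#include <math.h>

#if defined(NOT_HAVE_LOG2)
// Local fallback for platforms without a native log2():
// log2(x) = ln(x) / ln(2).
#define LOCAL_LN_2 0.69314718055994530942
static double LocalLog2(double x) { return log(x) / LOCAL_LN_2; }
#define LOG2(x) LocalLog2(x)
#else
// Default path: nothing needs to be defined when log2() is available.
#define LOG2(x) log2(x)
#endif
```

With the negative form, autoconf builds that find log2() and any platform that already has it need no define at all; only the exceptions set NOT_HAVE_LOG2.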