add some colorspace conversion functions in NEON

new file: lossless_neon.c
speedup is ~5%

gcc 4.6.3 seems to be doing some sub-optimal things here,
storing register on stack using 'vstmia' and such.
Looks similar to gcc.gnu.org/bugzilla/show_bug.cgi?id=51509

I've tried adding  -fno-split-wide-types and it does help
the generated assembly. But the overall speed gets worse with
this flag. We should only compile lossless_neon.c with it -> urk.

Change-Id: I2ccc0929f5ef9dfb0105960e65c0b79b5f18d3b0
This commit is contained in:
skal
2014-03-31 16:36:33 +02:00
parent daccbf400d
commit 97e5fac389
6 changed files with 96 additions and 1 deletions

View File

@ -79,8 +79,9 @@ ifneq ($(findstring armeabi-v7a, $(TARGET_ARCH_ABI)),)
# instructions to be generated for armv7a code. Instead target the neon code
# specifically.
LOCAL_SRC_FILES += src/dsp/dec_neon.c.neon
LOCAL_SRC_FILES += src/dsp/upsampling_neon.c.neon
LOCAL_SRC_FILES += src/dsp/enc_neon.c.neon
LOCAL_SRC_FILES += src/dsp/lossless_neon.c.neon
LOCAL_SRC_FILES += src/dsp/upsampling_neon.c.neon
endif
LOCAL_STATIC_LIBRARIES := cpufeatures