rescaler: add some SSE2 code

The rounding and arithmetic is not the same as previously, to prevent overflow cases for large upscale factors.

We still rely on 32b x 32b -> 64b multiplies. Raised the fixed-point precision to 32b
so that we have some nice shifts from epi64 to epi32.
Changed rescaler_t type to 'uint32_t' in order to squeeze in all the precision required.

The MIPS code has been disabled because it's now out-of-sync. Will be fixed in
a subsequent CL when the dust settles.
~30-35% faster

Change-Id: I32e4ddc00933f1b1aa3463403086199fd5dad07b
This commit is contained in:
Pascal Massimino
2015-09-25 14:34:02 +02:00
parent 1df1d0eedb
commit 76a7dc39e5
10 changed files with 320 additions and 44 deletions

View File

@ -69,6 +69,7 @@ libwebpdspdecode_sse2_la_SOURCES += alpha_processing_sse2.c
libwebpdspdecode_sse2_la_SOURCES += dec_sse2.c
libwebpdspdecode_sse2_la_SOURCES += filters_sse2.c
libwebpdspdecode_sse2_la_SOURCES += lossless_sse2.c
libwebpdspdecode_sse2_la_SOURCES += rescaler_sse2.c
libwebpdspdecode_sse2_la_SOURCES += upsampling_sse2.c
libwebpdspdecode_sse2_la_SOURCES += yuv_sse2.c
libwebpdspdecode_sse2_la_SOURCES += yuv_tables_sse2.h