rescaler: add some SSE2 code

The rounding and arithmetic are not the same as previously,
to prevent overflow cases for large upscale factors.
We still rely on 32b x 32b -> 64b multiplies. Raised the
fixed-point precision to 32b so that we have some nice shifts
from epi64 to epi32.
Changed rescaler_t type to 'uint32_t' in order to squeeze in
all the precision required.

The MIPS code has been disabled because it's now out-of-sync.
Will be fixed in a subsequent CL when the dust settles.

~30-35% faster

Change-Id: I32e4ddc00933f1b1aa3463403086199fd5dad07b
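
For illustration only, here is a minimal scalar sketch of the kind of arithmetic the message describes: a 32b x 32b -> 64b multiply against a fraction stored with 32 bits of fixed-point precision, followed by a rounding shift back down to 32 bits. The names FIX_BITS, fix_frac and mult_fix are hypothetical and are not taken from this commit; the actual macros and rounding in the patch may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, for illustration only (not from the commit). */
#define FIX_BITS 32                          /* fractional bits */
#define FIX_ONE  ((uint64_t)1 << FIX_BITS)   /* 1.0 in 0.32 fixed point */
#define FIX_HALF (FIX_ONE >> 1)              /* 0.5, used for rounding */

/* Returns x / y as a 0.32 fixed-point fraction.
 * Requires 0 <= x < y so that the result fits in 32 bits. */
static uint32_t fix_frac(uint32_t x, uint32_t y) {
  assert(y != 0 && x < y);
  return (uint32_t)(((uint64_t)x << FIX_BITS) / y);
}

/* Multiplies x by a 0.32 fraction using a 32b x 32b -> 64b product,
 * adds 0.5 and shifts right by 32 to round to the nearest integer.
 * The 64-bit sum cannot overflow: (2^32 - 1)^2 + 2^31 < 2^64. */
static uint32_t mult_fix(uint32_t x, uint32_t frac) {
  return (uint32_t)(((uint64_t)x * frac + FIX_HALF) >> FIX_BITS);
}
```

With these helpers, mult_fix(1000, fix_frac(1, 4)) evaluates to 250. Because the shift is exactly 32 bits, the result is simply the upper half of the 64-bit product, which is presumably what makes the epi64 to epi32 step cheap in a vectorized version; in SSE2 the widening multiply itself could be done with _mm_mul_epu32, which multiplies the even 32-bit lanes into 64-bit results.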
2025-07-12 22:14:29 +02:00 · 2015-09-25 14:34:02 +02:00
parent 1df1d0eedb
commit 76a7dc39e5
10 changed files with 320 additions and 44 deletions
--- a/Makefile.vc
+++ b/Makefile.vc
@@ -207,6 +207,7 @@ DSP_DEC_OBJS = \
    $(DIROBJ)\dsp\rescaler.obj \
    $(DIROBJ)\dsp\rescaler_mips32.obj \
    $(DIROBJ)\dsp\rescaler_mips_dsp_r2.obj \
+    $(DIROBJ)\dsp\rescaler_sse2.obj \
    $(DIROBJ)\dsp\upsampling.obj \
    $(DIROBJ)\dsp\upsampling_mips_dsp_r2.obj \
    $(DIROBJ)\dsp\upsampling_neon.obj \