The rounding and arithmetic is not the same as previously, to prevent overflow cases for large upscale factors.
We still rely on 32b x 32b -> 64b multiplies. Raised the fixed-point precision to 32b
so that we have some nice shifts from epi64 to epi32.
Changed rescaler_t type to 'uint32_t' in order to squeeze in all the precision required.
The MIPS code has been disabled because it's now out-of-sync. Will be fixed in
a subsequent CL when the dust settles.
~30-35% faster
Change-Id: I32e4ddc00933f1b1aa3463403086199fd5dad07b
This is designed for the simple use-case where one wants to decode all
frames one-by-one in order.
Also, use this API in anim_util library, which is in turn used by
anim_diff tool.
Change-Id: Ie8b653c04e867d40fd23321b3dd41b87689656c7
New implementations: SubtractGreenFromBlueAndRed and TransformColor
around 1-2% faster lossless encoding.
Change-Id: I1668e36fdc316ba55b3b798b91b4a3e36ce62861
DispatchAlpha* functions are hard to speed up, compared to SSE2.
ExtractAlpha sees a ~15% speed-up though.
Change-Id: I8715c2defecbc832f469eed7e6ffd012146b52de
It can be used to test if given pair of animated images (GIF and/or
WebP) are identical in terms of pixel match and other animation
properties.
Change-Id: I84adea145e9d062be6ad06a0d4fcdc9658cf52d4
this target is out of date and there are better ways to make a clean
tree (the first 2 versioned, the last one not):
make distclean (when using autoconf)
git clean -fdx
git archive
Change-Id: I766b75e0adf566c6f7db1a087ff486020b031b3a
update makefile.unix to provide 'make extras' building instructions.
note: input ordering depends on WEBP_SWAP_16BIT_CSP for rgb565 and rgb4444
Change-Id: I6f22d32189d9ba2619146a9714cedabfe28e2ad0
had to rename few structs.
-> we can now include both vp8i.h and vp8enci.h without naming
conflicts.
Change-Id: Ib41b498f1b57aab3d6b796361afc45210ec75174
Visible speed-up, thanks to pshufb and pabsw and psignw use.
had to tweak configure.ac to make "smmintri.h" presence correctly
detected (we need to set the CPPFLAGS instead of the CFLAGS!)
Change-Id: I2ab99e16a27a64fdf1f09b2b4e30a5e74ccca080
set/get residual C functions moved to new file in src/dsp
mips32 version of GetResidualCost moved to new file
Change-Id: I7cebb7933a89820ff28c187249a9181f281081d2
The 'inverse' variants are harder to parallelize, since
the result of filtering is used for prediction.
The 'direct' way is relatively easier.
The heavy bottleneck left for optimization is still GradientUnfilter()
Change-Id: I358008f492a887e8fff6600cb27857b18dee86e9
A separate API to generate animated WebP images.
It will eventually replace the internal gif2webp_util methods.
Also: update makefiles.
Change-Id: Idf61dfc1016c10b24fea70425d1a2323cffba515
inline function MakeARGB32 calls changed to call
via pointers to functions which make (a)rgb for
entire row
Change-Id: Ia4bd4be171a46c1e1821e408b073ff5791c587a9
This compresses the uimage using lossless compression and controlable
decimating pre-process.
Code is under WEBP_EXPERIMENTAL_FEATURE while it's being experimented with.
Change-Id: I8b7f4cfcc3c6afc52a556102842bdbb045ed5ee8
moves the following to this header:
- htole*() definitions from bit_writer.c
- __BIG_ENDIAN__ fallback define from bit_reader_inl.h
Change-Id: I7fff59543f08a70bf8f9ddac849b72ed290471b1
* remove LEFT/RIGHT_JUSTIFY distinction. It's all RIGHT_JUSTIFY now.
* simplify VP8GetSigned(), and add some masking branch-less code. Much
faster on ARM (~13% speed-up). 8% on x86-64, 5% on MacBook.
* split critical implementation into separate bit_reader_inl.h file that
is only included where needed (vp8.c / tree.c / bit_reader.c)
* bumped BITS value from 16 to 24 for x86-32b too, since it's a bit faster.
Change-Id: If41ca1da3e5c3dadacf2379d1ba419b151e7fce8
The luminance needs to be pre- and post- multiplied by
the alpha value in case of rescaling, for proper averaging.
Also:
- removed util/alpha_processing and moved it to dsp/
- removed WebPInitPremultiply() which was mostly useless
and merged it with the new function WebPInitAlphaProcessing()
Change-Id: If089cefd4ec53f6880a791c476fb1c7f7c5a8e60
VP8EncDspInitAVX2 is included in sse2 builds for now, later a configure
flag should be added to avoid the stub when avx2 is unavailable/disabled
Change-Id: I6127b687c273f46f41652aaf8e3b86ae3cfb8108
Functions VP8LFastLog2Slow and VP8LFastSLog2Slow
also: replaced some "% y" by "& (y-1)" in the C-version
(since y is a power-of-two)
Change-Id: I875170384e3c333812ca42d6ce7278aecabd60f0
new file: lossless_neon.c
speedup is ~5%
gcc 4.6.3 seems to be doing some sub-optimal things here,
storing register on stack using 'vstmia' and such.
Looks similar to gcc.gnu.org/bugzilla/show_bug.cgi?id=51509
I've tried adding -fno-split-wide-types and it does help
the generated assembly. But the overall speed gets worse with
this flag. We should only compile lossless_neon.c with it -> urk.
Change-Id: I2ccc0929f5ef9dfb0105960e65c0b79b5f18d3b0
expose the predictor array as function pointers instead
of each individual sub-function
+ merged Average2() into ClampedAddSubtractHalf directly
+ unified the signature as "VP8LProcessBlueAndRedFunc"
no speed diff observed
Change-Id: Ic3c45dff11884a8330a9ad38c2c8e82491c6e044
...and make gif2webp and vwebp compile without them.
Makefile.vc still to be updated...
Meanwhile the CL environment variable can be supplemented with
set CL=/DWEBP_HAVE_GL /IC:\opt\freeglut\include
Change-Id: I37a60b8c32aafd125bffa98b6cc9f57c022ebbd0
* make the 'all' target really build everything (default is still the
core examples).
* add demux/mux.h to HDRS_INSTALLED, install the corresponding libs
too
* install vwebp, webpmux, gif2webp and related manpages
Change-Id: Ib6036f2a1a05e40f106914c4bdbe9e3ad7336464
-> split libraries further into decoder / encoder
-> add libwebpdecoder.a in Makefile.unix
-> make dwebp link against libwebpdecoder.a in Makefile.unix
also: in makefile.unix, pass EXTRA_FLAGS to LDFLAGS too
(otherwise, -m32 wouldn't work, e.g.)
Change-Id: Ief3da02a729dd86bbaf949ed048836716941657f
- along the lines of the SSE chroma upsampling.
Total speedup is ~30%.
4% speed loss on YuvToRgbXX conversion using tables instead
of 14-bit fixed precision. TODO(later): investigate, and compare
to x86.
see http://code.google.com/p/webp/issues/detail?id=134
Change-Id: Idc2261037cd13b4553ca20ecc4c4007099c37009
The binaries should have dependency on related libraries only.
For example, 'vwebp' sould not depend on libpng, libjpeg etc.
Change-Id: I517b7ae6a9092d5de891b46d0e75d07d91548e93
- Separate out mux.h and demux.h
- muxtypes.h: new header for data types common to mux/demux
- Move some misc read/write utilities to utils/utils.h
- Remove some duplicate methods.
- Separate out mux/demux libraries
Change-Id: If9b9569b10d55d922ad9317ef51710544315d6de
(implements the backward and forward transforms in the encoder)
original patch by Wayne Chen (datoudatou at gmail dot com)
Change-Id: Ic00f3bffcdf7a924f043006728735c810ee47a57
this target is experimental so must be explicitly set
missing since:
e41a759 build: remove libwebpmux from default targets/config
Change-Id: I128d9e6a06ba3e7102ae3175dc3db52c6f4b1439
the extended file format is still under development and related
libs/binaries are not fit for release
configure:
--enable/disable-libwebpmux; default is disabled.
makefile.unix:
src/mux/libwebpmux.a and examples/webpmux must be explicitly specified
Makefile.vc:
$(DIRLIB)\libwebpmux.lib and $(DIRBIN)\webpmux.exe must be explicitly
specified
Change-Id: I8246746b256010dd2a2e4de58291222d7eaf0457
this is more a workaround than a fix. this change forces the tables
from yuv.h into the data section rather than generating them as common
blocks. prior to this the linkage would fail with e.g.,
Undefined symbols:
...
"_VP8kClip4Bits", referenced from:
_Yuv444ToRgba4444 in libwebp.a(libwebpdsp_la-upsampling.o)
broken since:
48f8275 add colorspace for premultiplied alpha
Change-Id: I69c36758c31cd3a4b3ea5c412674d1a47b45447e
* Method #1 is now calling the lossless encoder on the alpha plane.
Format is not final, it's just a first draft. We need ad-hoc functions.
* removed now useless utils/alpha.*
* added utils/quant_levels.h instead
* removed the TCoder code altogether
Change-Id: I636840b6129a43171b74860e0a0fc5bb1bcffc6a
standalone from the normal examples due to the additional OpenGL
dependencies.
$ make -f makefile.unix examples/vwebp
Change-Id: I7e3f0b5e0328230cb32acdd1e2a011cbbb80e55d