Commit Graph

124 Commits

Author SHA1 Message Date
Pascal Massimino
a987faedfa MIPS: dspr2: added optimization for function GetResidualCost
set/get residual C functions moved to new file in src/dsp
mips32 version of GetResidualCost moved to new file

Change-Id: I7cebb7933a89820ff28c187249a9181f281081d2
2015-02-07 02:13:26 -08:00
Pascal Massimino
022d2f886c add SSE2 variants for alpha filtering functions
The 'inverse' variants are harder to parallelize, since
the result of filtering is used for prediction.
The 'direct' way is relatively easier.

The heavy bottleneck left for optimization is still GradientUnfilter()

Change-Id: I358008f492a887e8fff6600cb27857b18dee86e9
2015-01-29 08:46:22 +01:00
Pascal Massimino
7afdaf8496 Alpha coding: reorganize the filter/unfiltering code
Move the filtering code to their own dsp/ spot
New function: VP8FiltersInit()

Change-Id: I0b2041eab42346c59b972f2575b05509e6a8f7b1
2015-01-28 08:02:41 +01:00
Djordje Pesut
cbcbedd0de move rescaler functions to rescaler* files in src/dsp/
Change-Id: I906add1b1010a59ebfcc2dd81e15745433cc206b
2015-01-09 16:47:09 +01:00
Pascal Massimino
ca7f60db5f SSE2 implementation of VP8PackARGB
Change-Id: I40c0e26a6a2701216e4ddebcf793aa535677f437
2015-01-05 05:17:51 -08:00
Djordje Pesut
7ce8788b06 MIPS: dspr2: added optimization for function MakeARGB32
inline function MakeARGB32 calls changed to call
via pointers to functions which make (a)rgb for
entire row

Change-Id: Ia4bd4be171a46c1e1821e408b073ff5791c587a9
2014-12-22 12:31:36 +01:00
Djordje Pesut
829a8c19a0 MIPS: dspr2: added optimization for ITransform
Change-Id: I3534fca143535c53d18a3749b3a1b0c8a7563463
2014-10-28 14:28:14 +01:00
Djordje Pesut
24e1072aac MIPS: dspr2: added optimization for TransformDC
Change-Id: Iee69758f6442ea9c80ddaa32cea8d00dda4c6252
2014-09-09 14:15:04 +02:00
Djordje Pesut
f0103595dd MIPS: dspr2: added optimization for ColorIndexInverseTransforms
Change-Id: I5b6094ce489d4f896bc4b8f575142eb3c5054beb
2014-09-08 17:22:59 +02:00
skal
fc98edd936 add a DispatchAlpha() for SSE2 that handles 8 pixels at a time
Only slightly faster.

Change-Id: Ie2e57e6a0950166124cf1075c6c9b45b7abdad8c
2014-08-25 21:03:03 -07:00
Djordje Pesut
0b21c30b1a MIPS: dspr2: added optimization for EmitAlphaRGB
New dsp function: WebPDispatchAlpha()

Change-Id: I48e539d22471279ec75185759bc68d18b127f716
2014-08-21 20:39:35 -07:00
Djordje Pesut
569771549a MIPS: dspr2: added optimizations for VP8YuvTo*
VP8YuvToRgb
VP8YuvToBgr
VP8YuvToRgb565
VP8YuvToRgba4444
VP8YuvToArgb
VP8YuvToBgra
VP8YuvToRgba

Change-Id: I22212a125d890e1fd28388fec906a1a5c07ff386
2014-08-19 14:29:32 +02:00
Djordje Pesut
b4dc4069a2 MIPS: dspr2: added optimization for (un)filters
HorizontalFilter
VerticalFilter
GradientFilter
HorizontalUnfilter
VerticalUnfilter
GradientUnfilter

Change-Id: I54055b4767c37719691811072e95bf79c1f627b1
2014-08-14 11:55:19 -07:00
Djordje Pesut
b61c9ceca8 MIPS: dspr2: Optimization of some simple point-sampling functions
Change-Id: I6a4ab29bd0cc5a2951a8882cf9997032dc38bd79
2014-08-13 17:18:49 +02:00
skal
b5a36cc9ad add -near_lossless [0..100] experimental option
This compresses the uimage using lossless compression and controlable
decimating pre-process.
Code is under WEBP_EXPERIMENTAL_FEATURE while it's being experimented with.

Change-Id: I8b7f4cfcc3c6afc52a556102842bdbb045ed5ee8
2014-08-05 19:17:10 +02:00
Pascal Massimino
736f2a175e extract colorspace code from picture.c into picture_csp.c
had to refactor few functions here and there.

Change-Id: I86fde6fec7c2fc7eb48f0ecf327dbbd2bd40b9d4
2014-07-16 16:37:26 -07:00
Pascal Massimino
fbadb48026 split monolithic picture.c into picture_{tools,psnr,rescale}.c
Change-Id: Ia5eb5496e4337e5bac8203872c5b014cad21c4f9
2014-07-12 09:13:33 -07:00
James Zern
ca0fa7c7a5 Android.mk: move dwebp to examples/Android.mk
this depends on the top-level Android.mk for shared flags

Change-Id: Id418eb9639e839518a921ffcb6a376ce10aafbd2
2014-06-17 23:41:01 -07:00
James Zern
73d8fca01e Android.mk: add ENABLE_SHARED flag
builds libwebp.so instead of libwebp.a
$ ndk-build ENABLE_SHARED=1

Change-Id: Ide05e3be4f9848852e6d7e9d99abe11344419241
2014-06-17 23:30:20 -07:00
James Zern
b9d2efc629 rename upsampling_mips32.c to yuv_mips32.c
matches yuv_sse2 added in;
bdfeeba dsp/yuv: move sse2 functions to yuv_sse2.c

Change-Id: I84f7d7858ca6851c956e8366a7c76b45070dcbc3
2014-06-07 12:35:47 -07:00
James Zern
bdfeebaa01 dsp/yuv: move sse2 functions to yuv_sse2.c
Change-Id: I2f037ff18e7cf07e8801f49b3a89c1e36ef73000
2014-06-05 23:52:54 -07:00
skal
399b916d27 lossy decoding: correct alpha-rescaling for YUVA format
The luminance needs to be pre- and post- multiplied by
the alpha value in case of rescaling, for proper averaging.

Also:
- removed util/alpha_processing and moved it to dsp/
- removed WebPInitPremultiply() which was mostly useless
and merged it with the new function WebPInitAlphaProcessing()

Change-Id: If089cefd4ec53f6880a791c476fb1c7f7c5a8e60
2014-05-27 15:27:13 -07:00
James Zern
178e9a69ae add stub dsp/enc_avx2.c
VP8EncDspInitAVX2 is included in sse2 builds for now, later a configure
flag should be added to avoid the stub when avx2 is unavailable/disabled

Change-Id: I6127b687c273f46f41652aaf8e3b86ae3cfb8108
2014-05-22 00:31:46 -07:00
James Zern
c7164490da Android.mk: always include *_neon.c in the build
the inclusion of the files is harmless when NEON is not enabled and will
allow them to be built with NEON for APP_ABI=arm64-v8a which currently
does not use the '.neon' suffix

Change-Id: I39377876b1b68822c38f4e2396da93c56145fc0f
2014-05-14 00:11:46 -07:00
Pascal Massimino
f1e771735a remove all unused layer code
Change-Id: I220590162b24c70f404fe3087f19dd3e6cac3608
2014-05-08 22:37:38 -07:00
Jovan Zelincevic
baabf1ea3a MIPS: MIPS32r1: Added optimizations for FastLog2
Functions VP8LFastLog2Slow and VP8LFastSLog2Slow

also: replaced some "% y" by "& (y-1)" in the C-version
(since y is a power-of-two)

Change-Id: I875170384e3c333812ca42d6ce7278aecabd60f0
2014-04-10 08:32:51 -07:00
Djordje Pesut
0ca2914b23 MIPS: MIPS32r1: Add optimization for ITransform
Change-Id: Ie4c8b9bc3a7826bd443cdebf05386786fafe8c56
2014-04-04 10:50:35 +02:00
skal
97e5fac389 add some colorspace conversion functions in NEON
new file: lossless_neon.c
speedup is ~5%

gcc 4.6.3 seems to be doing some sub-optimal things here,
storing register on stack using 'vstmia' and such.
Looks similar to gcc.gnu.org/bugzilla/show_bug.cgi?id=51509

I've tried adding  -fno-split-wide-types and it does help
the generated assembly. But the overall speed gets worse with
this flag. We should only compile lossless_neon.c with it -> urk.

Change-Id: I2ccc0929f5ef9dfb0105960e65c0b79b5f18d3b0
2014-03-31 17:47:46 +02:00
skal
0f4f721b12 separate SSE2 lossless functions into its own file
expose the predictor array as function pointers instead
of each individual sub-function

+ merged Average2() into ClampedAddSubtractHalf directly
+ unified the signature as "VP8LProcessBlueAndRedFunc"

no speed diff observed

Change-Id: Ic3c45dff11884a8330a9ad38c2c8e82491c6e044
2014-03-27 21:43:55 +01:00
James Zern
80e218d43a Android.mk: fix build with APP_ABI=armeabi-v7a-hard
added in r9d; relax the check to build neon code

Change-Id: Ic52b3fbd3bf53617ee52b07a55b0ed05f6f9b26f
2014-03-20 23:24:39 -07:00
skal
8992ddb756 use static clipping tables
(shared with mips32)
removed abs1[] table along the way

sub-1% speed-up, but still...

Change-Id: I8c29a8a0285076cb3423b01ffae9fcc465da6a81
2014-02-13 19:32:59 -08:00
James Zern
393f89b763 Android.mk: avoid gcc-specific flags with clang
Change-Id: Idb1ed2bb1dd5d9f65ca07185ef9838e587dc4e64
2014-02-07 20:31:44 -08:00
Djordje Pesut
dd438c9a7d MIPS: MIPS32r1: Optimization of some simple point-sampling functions. PATCH [6/6]
Change-Id: I2020e71e9be5d17d4bf67cabf6c470ca43d5d838
2014-01-29 15:37:31 +01:00
Djordje Pesut
04336fc7f8 MIPS: MIPS32r1: Optimization of function TransformOne. PATCH [4/6]
Change-Id: I5b98e2de940977538cf91bfa2128f4d1daa5c170
2014-01-28 20:10:43 -08:00
Charles Munger
4ad7d33510 Android.mk: add some release compile flags
-ffunction-sections / -fdata-sections
can improve final binary size when used with --gc-sections, speed impact
untested

Change-Id: I37f4b5da2f34acede7965c2da2e1b97125473adc
2013-12-03 23:00:49 -08:00
Pascal Massimino
98aa33cf1e extract random utils to their own file util/random.[ch]
they'll be used for decoding too, probably.

Change-Id: Id9cbb250c74fc0e876d4ea46b1b3dbf8356d6725
2013-10-30 02:00:33 -07:00
James Zern
5b2e6bd3e8 Android.mk: add a dwebp target
Change-Id: Ic697660c1a5f7185d2ad00934b314b44870cec00
2013-10-04 11:26:37 +02:00
James Zern
f910a84ea5 Android.mk: update build flags
- split out release specific flags
- set LOCAL_ARM_MODE to arm

Change-Id: I272855216583d6c8d0a4106e8b3fde46aa59dfa9
2013-10-04 11:18:09 +02:00
Pascal Massimino
8dcae8b3cf fix rescaling-with-alpha inaccuracy
(still missing YUVA decoding case for now)

https://code.google.com/p/webp/issues/detail?id=160

Change-Id: If723b4a5c0a303d0853ec9d839f995adce056095
2013-07-26 12:10:26 -07:00
skal
5cf7792e40 split quant_levels.c into decoder and encoder version
-> split libraries further into decoder / encoder
-> add libwebpdecoder.a in Makefile.unix
-> make dwebp link against libwebpdecoder.a in Makefile.unix

also: in makefile.unix, pass EXTRA_FLAGS to LDFLAGS too
(otherwise, -m32 wouldn't work, e.g.)

Change-Id: Ief3da02a729dd86bbaf949ed048836716941657f
2013-02-24 21:40:39 +01:00
Mans Rullgard
090b708a00 NEON optimised yuv to rgb conversion
- along the lines of the SSE chroma upsampling.
Total speedup is ~30%.

4% speed loss on YuvToRgbXX conversion using tables instead
of 14-bit fixed precision. TODO(later): investigate, and compare
to x86.

see http://code.google.com/p/webp/issues/detail?id=134

Change-Id: Idc2261037cd13b4553ca20ecc4c4007099c37009
2013-01-25 15:46:40 -08:00
skal
657f5c91b1 move token buffer to its own file (token.c)
Change-Id: Ib9791c52f48d98fad5ed3830f36894ef5ac362fa
2012-12-03 13:50:14 +01:00
Pascal Massimino
75e5f17e3b ARM/NEON: 30% encoding speed-up
(implements the backward and forward transforms in the encoder)

original patch by Wayne Chen (datoudatou at gmail dot com)

Change-Id: Ic00f3bffcdf7a924f043006728735c810ee47a57
2012-10-31 14:00:20 -07:00
Pascal Massimino
9b261bf521 remove the last NOT_HAVE_LOG2 instances
Change-Id: I193ecf82316cd1d5d7ddeebebf8fc98afccf0ede
2012-08-02 06:42:41 -07:00
Pascal Massimino
80cc7303ab WebPCheckMalloc() and WebPCheckCalloc():
safe size-checking versions of malloc() and calloc()

Change-Id: Iffa3138c48b9b254b3d7eaad913e1f852d9dafba
2012-07-31 16:56:39 -07:00
James Zern
c3b014db28 Android.mk: add missing lossless files
Change-Id: I5d97e22392ed3847e97f3b31c928d3828fa6732a
2012-07-16 18:43:07 -07:00
James Zern
e41a759615 build: remove libwebpmux from default targets/config
the extended file format is still under development and related
libs/binaries are not fit for release

configure:
--enable/disable-libwebpmux; default is disabled.

makefile.unix:
src/mux/libwebpmux.a and examples/webpmux must be explicitly specified

Makefile.vc:
$(DIRLIB)\libwebpmux.lib and $(DIRBIN)\webpmux.exe must be explicitly
specified

Change-Id: I8246746b256010dd2a2e4de58291222d7eaf0457
2012-07-11 14:25:41 -07:00
James Zern
e5f467420d add demux.c to the makefiles
Change-Id: If98c0ea859c845261218196df35e905589f09e76
2012-06-21 23:21:41 -07:00
James Zern
4f067fb254 Merge "Android: only build dec_neon with NEON support" 2012-05-23 22:26:03 -07:00
James Zern
255c66b48f Android: only build dec_neon with NEON support
Defining LOCAL_ARM_NEON = true can result in neon instructions being
used in portions unprotected by the cpu check.
This changes defines a WEBP_USE_NEON/WEBP_ANDROID_NEON pair similar to
the SSE2 code and MSVC.

Change-Id: Ifac010b06e42c73d5aca529baa2198c6796674bd
2012-05-23 22:21:10 -07:00
Pascal Massimino
3e863dda61 remove tcoder, switch alpha-plane compression to lossless
* Method #1 is now calling the lossless encoder on the alpha plane.
Format is not final, it's just a first draft. We need ad-hoc functions.
* removed now useless utils/alpha.*
* added utils/quant_levels.h instead
* removed the TCoder code altogether

Change-Id: I636840b6129a43171b74860e0a0fc5bb1bcffc6a
2012-05-21 06:24:48 -07:00
James Zern
834f937f3c dsp/cpu.c: Android: fix crash on non-neon arm builds
add proper cpu-detection for Android targets

Fixes issue #118 (and is a better solution for #117).

based on patch by pepijn vaneeckhoudt

Change-Id: I6b00ea6d51ca658ccf6a3d55b87b99c01c6805be
2012-05-07 17:52:15 -07:00
James Zern
7ae225218d Android.mk: SSE2 & NEON updates
patch by pepijn vaneeckhoudt
- Android.mk should include dec/enc/upsampling sse2 variants. This
  provides sse2 optimizations when compiling for Android/x86
- LOCAL_ARM_NEON should be set to true when compiling for armeabi-v7a.
  Otherwise __ARM_NEON__ is not defined and all neon code is removed by
  the preprocessor.

Change-Id: I54f3505757fc5d2d63cca4b64d61be34a0b34eb8
2012-04-20 12:07:03 -07:00
James Zern
3f8ec1c21f makefile.unix & Android.mk: cosmetics
split object/source list to single line & sort
remove unnecessary tabs

Change-Id: I1db03555f9578e128c746853084438da2b9186eb
2012-04-12 12:29:57 -07:00
James Zern
0df04b9e19 Android.mk & Makefile.vc: add new files
from the lossless additions, fixes the build.

Change-Id: I3243c48f21cfb4e2244614bb5ab12ff94dd684e5
2012-04-11 15:47:04 -07:00
Pascal Massimino
61c2d51fd7 move the rescaling code into its own file and make enc/ and dec/ use it.
(cherry picked from commit 8e92d9e380a89b7443a2e2c3d16ce5a222e8c1e8)

Conflicts:

	Android.mk
	makefile.unix
	src/dec/vp8l.c
	src/utils/Makefile.am
2012-03-30 12:16:25 -07:00
pascal massimino
d8a47e66f7 Merge "Add predictive filtering option for Alpha." 2012-01-05 00:33:32 -08:00
Vikas Arora
252028aaac Add predictive filtering option for Alpha.
Add predictive filtering option for Alpha plane.
Valid range for filter option is [0, 5] corresponding to prediction
methods none, horizontal, vertical, gradient & paeth filter.
The prediction method 5 will try all the prediction methods (0 to 4)
and pick the prediction method that gives best compression.

Change-Id: I9244d4a9c5017501a9696c7cec5045f04c16d49b
2012-01-05 13:55:35 +05:30
pascal massimino
9b69be1cdc Merge "Simplify mux library code" 2012-01-05 00:15:48 -08:00
Urvang Joshi
a056170eca Simplify mux library code
Refactor mux code into Read APIs, Set/Delete APIs and internal
objects/utils.

Change-Id: Ia4ce32ec18cf0c1c75de9084fbb28840d46892b4
2012-01-05 12:23:08 +05:30
James Zern
992187a380 improve log2 test
- add check for native log2 to configure
- use a common define (NOT_HAVE_LOG2) to enable use of local library
  version for non-autoconf platforms without their own version,
  currently msvc and android

This uses a negative (NOT_HAVE_) to simplify the ifdef

Change-Id: Id0610eed507f8bb9c5da338918112853d5c8127a
2012-01-04 18:49:37 -08:00
Pascal Massimino
e852f83205 update Android.mk file list
Change-Id: Idffc497931c85582725e65f1d7b7384bcee0fdd5
2011-12-26 02:39:56 -08:00
James Zern
8962030f52 remove orphan source file
enc/enc.c became a part of dsp/ w/e06ac08.

Change-Id: I9e99aa6eac7b28162f58f3f0dc180a68a279d136
2011-11-14 19:31:02 -08:00
James Zern
d8329d41e3 Android.mk: add missing source files
Fell out of date with some of the dsp.c moves.
Fixes Issue #93.

Change-Id: I2fb4424dbb0bc31af236cef2bed5ee542622067b
2011-10-19 20:02:23 -07:00
Somnath Banerjee
d4e9f5598d NEON decode support in WebP
Change-Id: I0d6fa456ca68468353adcd64669f1737d1446f65
2011-09-13 16:00:47 -07:00
Pascal Massimino
b112e83647 create a libwebputils under src/utils
with bit_reader bit_writer and thread for now.

Change-Id: If961933fcfc43e60220913fe4d527230ba8f46bb
2011-09-13 15:34:15 -07:00
Pascal Massimino
e06ac0887f create a separate libwebpdsp under src/dsp
Gathers all DSP-related function (and SSE2 implementations).
Clean-up some unwanted symbolic dependencies so that webp_encode,
webp_decode and webp_dsp are truly independent libraries.

+ opportunistic clean-up:
  * remove unneeded VP8DspInitTables(), now integrated in VP8DspInit()
  * make consistent use of VP8GetCPUInfo() in the various DspInit() funcs
  * change OUT macro to DST
2011-09-13 12:29:44 -07:00
Pascal Massimino
fc7815d692 multi-thread decoding: ~25-30% faster
To be enabled with the flag WEBP_USE_THREAD.
For now it's only available on unix (pthread), when using Makefile.unix
Will be switched on more generally later.

In-loop filtering and output (=rescaling/yuv->rgb conversion)
is done in parallel to bitstream decoding, lagging 1 row behind.

Example:
examples/dwebp bryce.webp -v
Time to decode picture: 0.680s

examples/dwebp bryce.webp -v -mt
Time to decode picture: 0.515s

Change-Id: Ic30a897423137a3bdace9c4e30465ef758fe53f2
2011-07-22 15:14:18 -07:00
Pascal Massimino
d260310511 add Advanced Decoding Interface
You can now use WebPDecBuffer, WebPBitstreamFeatures and WebPDecoderOptions
to have better control over the decoding process (and the speed/quality tradeoff).

WebPDecoderOptions allow to:
 - turn fancy upsampler on/off
 - turn in-loop filter on/off
 - perform on-the-fly cropping
 - perform on the-fly rescale
(and more to come. Not all features are implemented yet).

On-the-fly cropping and scaling allow to save quite some memory
(as the decoding operation will now scale with the output's size, not
the input's one). It saves some CPU too (since for instance,
in-loop filtering is partially turned off where it doesn't matter,
and some YUV->RGB conversion operations are ommitted too).

The scaler uses summed area, so is mainly meant to be used for
downscaling (like: for generating thumbnails or previews).

Incremental decoding works with these new options.
More doc to come soon.

dwebp is now using the new decoding interface, with the new flags:
  -nofancy
  -nofilter
  -crop top left width height
  -scale width height

Change-Id: I08baf2fa291941686f4ef70a9cc2e4137874e85e
2011-06-20 15:30:52 -07:00
Pascal Massimino
6d0e66c23e prepare experimentation with yuv444 / 422
+ add a simple rescaling function: WebPPictureRescale() for encoding
+ clean-up the memory managment around the alpha plane
+ fix some includes path by using "../webp/xxx.h" instead of "webp/xxx.h"

New flags for 'cwebp':
 -resize <width> <height>
 -444  (no effect)
 -422  (no effect)
 -400

Change-Id: I25a95f901493f939c2dd789e658493b83bd1abfa
2011-05-04 15:41:08 -07:00
Pascal Massimino
efbc6c41fc update Android.mk
idec.c was missing

Change-Id: I18fe8eb6576c52b56519567b9905ad5ac2db60c0
2011-03-26 07:33:19 -07:00
Pascal Massimino
f61d14aabf a WebP encoder
converts PNG & JPEG to WebP

This is an experimental early version, with lot of room
of later optimizations in both speed and quality.

Compile with the usual `./configure && make`
Command line example is examples/cwebp

Usage:

   cwebp [options] -q quality input.png -o output.webp

where 'quality' is between 0 (poor) to 100 (very good).
Typical value is around 80.

More encoding options with 'cwebp -longhelp'

Change-Id: I577a94f6f622a0c44bdfa9daf1086ace89d45539
2011-02-18 23:54:59 -08:00
Pierre Joye
c8d15efa12 convert to ANSI-C 2010-10-06 14:37:28 -07:00
Pascal Massimino
c3f41cb47e Initial commit
Change-Id: I4712afb3912625e7aaccfa5160dcf78ee252f159
2010-09-30 09:55:07 -04:00