Commit Graph

198 Commits

Author SHA1 Message Date
James Zern
2719bb7e98 dec_neon: TransformAC3: work on packed vectors
pack 2 rows in 1 vector similar to TransformDC

Change-Id: I3b240ffb4f51a632b5c8c2daf54d938333ed4b0d
2014-02-18 19:47:20 -08:00
James Zern
b7b60ca16c dec_neon: add SaturateAndStore4x4
converts 2 s16 vectors to 2 u8 and store to uint8_t destination;
TransformAC3 can reuse this after a rework

Change-Id: Ia9370283ee3d9bfbc8c008fa883412100ff483d0
2014-02-18 19:42:35 -08:00
James Zern
e02f16ef45 dec_neon.c: convert TransformDC to intrinsics
no noticeable difference in performance

Change-Id: Ia2d287289c3865ddd0fc99edaf7a030778aa7025
2014-02-14 12:11:58 -08:00
skal
9cba963f9a add missing file
Change-Id: I17eab2fedc64ee3bba941a592ecef765fcd2b402
2014-02-13 21:56:19 -08:00
skal
8992ddb756 use static clipping tables
(shared with mips32)
removed abs1[] table along the way

sub-1% speed-up, but still...

Change-Id: I8c29a8a0285076cb3423b01ffae9fcc465da6a81
2014-02-13 19:32:59 -08:00
skal
0235d5e44b 1-2% faster quantization in SSE2
C-version is a bit faster too (sub-1% faster on ARM)

Change-Id: I077262042f1d0937aba1ecf15174f2c51bf6cd97
2014-02-13 15:55:30 -08:00
James Zern
228e4877ab dec_neon.c: add TransformAC3
based on SSE2 version

Change-Id: Icc6782955253c98e83d5984153b596ef5f1c0d34
2014-02-08 12:47:54 -08:00
skal
32aeaf115a revamp VP8LColorSpaceTransform() a bit
-> remove the 'color_transform' multiplier, use more constants, etc.

This function is particularly critical, mostly because of
GetBestColorTransformForTile().
Loop is a bit faster (maybe ~1%)

Change-Id: I90c96a3437cafb184773acef55c77e40c224388f
2014-02-05 10:37:06 +01:00
skal
926ff40229 WEBP_SWAP_16BIT_CSP: remove code dup
and prepare for potentially supporting both RGBA4444 and BARG4444

Change-Id: If5200289bc6338757a2ceb2df1a19de732595052
2014-02-03 13:24:33 -08:00
Vikas Arora
1d1cd3bbd6 Fix decode bug for rgbA_4444/RGBA_4444 color-modes.
The WEBP_SWAP_16BIT_CSP flag needs to be honored while filling the Alpha (4 bits)
data in the destination buffer and while pre-multiplying the alpha to RGB colors.

Change-Id: I3b07307d60963db8d09c3b078888a839cefb35ba
2014-02-03 09:20:54 -08:00
James Zern
8934a622ac cosmetics: *_mips32.c
indent, comments, unused includes

Change-Id: Id0aabc52d05bb633f62aec022155ec27699cf5a0
2014-01-30 18:03:48 -08:00
Djordje Pesut
dd438c9a7d MIPS: MIPS32r1: Optimization of some simple point-sampling functions. PATCH [6/6]
Change-Id: I2020e71e9be5d17d4bf67cabf6c470ca43d5d838
2014-01-29 15:37:31 +01:00
Djordje Pesut
53520911c3 Added support for calling sampling functions via pointers.
Change-Id: Ic4d72e6b175a6b27bcdcc8cd97828e44ea93e743
2014-01-29 15:32:35 +01:00
Jovan Zelincevic
d16c69749b MIPS: MIPS32r1: Optimization of filter functions. PATCH [5/6]
Change-Id: Ifbd305e0514f09a587db02c3970f22190808503a
2014-01-29 15:03:45 +01:00
Djordje Pesut
04336fc7f8 MIPS: MIPS32r1: Optimization of function TransformOne. PATCH [4/6]
Change-Id: I5b98e2de940977538cf91bfa2128f4d1daa5c170
2014-01-28 20:10:43 -08:00
Pascal Massimino
c1cb1933d5 disable NEON for arm64 platform
The registers and instructions are quite different to 32bit
and the assembly code needs a rewrite.

more info: http://people.linaro.org/~rikuvoipio/aarch64-talk/

Change-Id: Id75dbc1b7bf47f43a426ba2831f25bb8fa252c4f
2014-01-23 12:35:01 -08:00
skal
66a32af5e1 Merge "NEON speed up" 2013-12-18 14:17:19 -08:00
skal
26d842eb8f NEON speed up
add TransformDC special case, and make the switch function inlined.
Recovers a few of the CPU lost during the addition of TransformAC3
(only on ARM)

Change-Id: I21c1f0c6a9cb9d1dfc1e307b4f473a2791273bd6
2013-12-18 22:32:58 +01:00
James Zern
605a712701 simplify __cplusplus ifdef
drop c_plusplus which is from a quite ancient pre-standard compiler

Change-Id: I9e357b3292a6b52b14c2641ba11f4f872c04b7fb
2013-12-16 20:16:02 -08:00
James Zern
5227d99146 drop: ifdef __cplusplus checks from C files
the prototypes are already marked in the headers

Change-Id: I172fe742200c939ca32a70a2299809b8baf9b094
2013-12-13 11:42:13 -08:00
skal
73b731fb42 introduce a special quantization function for WHT
WHT is somewhat a special case: no sharpen[] bias, etc.
Will be useful in a later CL when precision of input is changed.

Change-Id: I851b06deb94abdfc1ef00acafb8aa731801b4299
2013-12-10 14:21:47 +01:00
skal
41c0cc4b9a Make Forward WHT transform use 32bit fixed-point calculation
This is in preparation for a future change where input will
be 16bit instead of 12bit

No speed diff observed.

Note that the NEON implementation was using 32bit calc already.

Change-Id: If06935db5c56a77fc9cefcb2dec617483f5f62b4
2013-12-10 06:10:52 +01:00
skal
d513bb62bc * fix off-by-one zthresh calculation
* remove the sharpening for non luma-AC coeffs
* adjust the bias a little bit to compensate for this

Using the multiply-by-reciprocal doesn't always give the same result
as the exact divide, given the QFIX fixed-point precision we use.
-> removed few now-unneeded SSE2 instructions (and checked for
bit-exactness using -noasm)

Change-Id: Ib68057cbdd69c4e589af56a01a8e7085db762c24
2013-12-09 13:56:04 +01:00
James Zern
4931c3294b cosmetics: fix some typos
Change-Id: I0d6efebd817815139db5ae87236fd8911df4d53c
2013-11-26 19:21:14 -08:00
Pascal Massimino
596a6d73ce make use of 'extern' consistent in function declarations
Change-Id: I18e050db3111e52acfe97da09cdf1860f3e15936
2013-10-30 03:23:21 -07:00
skal
0b2b05049f Use deterministic random-dithering during RGB->YUV conversion
-> helps debanding (sky, gradients, etc.)

This dithering can only be triggered when using -preset photo
or -pre 2 (as a preprocessing). Everything is unchanged otherwise.

Note that this change is likely to make the perceived PSNR/SSIM drop
since we're altering the input internally.

Change-Id: Id8d4326245d9b828141de162c94ba381b1fa5813
2013-10-17 22:36:49 +02:00
James Zern
dca8a4d315 Merge "NEON/simple loopfilter: avoid q4-q7 registers" 2013-10-10 01:58:41 -07:00
pascal massimino
9e84d901d2 Merge "NEON/TransformWHT: avoid q4-q7 registers" 2013-10-09 09:32:59 -07:00
James Zern
fc10249b36 NEON/simple loopfilter: avoid q4-q7 registers
very tiny speed improvement

Change-Id: I3024f120feb7275ce20bfff21af31ea8650a5a03
2013-10-09 18:17:31 +02:00
James Zern
2f09d63e30 NEON/TransformWHT: avoid q4-q7 registers
very tiny speed improvement

Change-Id: Iace78b9038af412d0a794845ff19f54afa88ccdc
2013-10-09 18:17:23 +02:00
skal
f9bbc2a034 Special-case sparse transform
If the number of non-zero coeffs is <= 3, use a
simplified transform for luma.

Change-Id: I78a1252704228d21720d4bc1221252c84338d9c8
2013-10-08 22:05:38 +02:00
Pascal Massimino
f8398c9dab fix compile error on ARM/gcc
use of uint8_t type was causing error like:
src/dsp/upsampling.c:223:1: internal compiler error: in vect_determine_vectorization_factor, at tree-vect-loop.c:349

with gcc 4.6.3

Change-Id: Ieb6189a1375c47fc4ff992e6c09b34a7f1f605da
2013-09-06 03:07:28 -07:00
James Zern
b25a6fbfdc yuv.h: fix indent
Change-Id: I0c0bd5f7f71bc44e10134bd4f788769ec25cec1f
2013-08-19 18:06:15 -07:00
James Zern
388a7249c9 cosmetics: fix indent
Change-Id: Iad0fce79886bed0d61ddf2510ce133a5355ebc1f
2013-08-19 17:51:04 -07:00
James Zern
4c7322c86f Merge "dsp: msvc compatibility" 2013-08-19 17:42:16 -07:00
skal
df6cebfa9e 5-7% faster SSE2 versions of YUV->RGB conversion functions
The C-version gets ~7-8% slower in order to match the SSE2
output exactly. The old (now off-by-1) code is kept under
the WEBP_YUV_USE_TABLE flag for reference.

(note that calc rounding precision is slightly better ~= +0.02dB)

on ARM-neon, we somehow recover the ~4% speed that was lost by mimicking
the initial C-version (see https://gerrit.chromium.org/gerrit/#/c/41610)

Change-Id: Ia4363c5ed9b4c9edff5d932b002e57bb7814bf6f
2013-08-19 17:05:58 -07:00
skal
ad6ac32d7c simplify upsampler calls: only allow 'bottom' to be NULL
If 'top' was meant to be NULL, then bottom and top can be
swapped. Logic is simpler.

+ fix compilation in non-FANCY_UPSAMPLING mode

Change-Id: I7c62bbb59454017f072c0945d1ff2d24d89286ff
2013-08-19 16:47:51 -07:00
James Zern
f358450feb dsp: msvc compatibility
intrin.h is available after VS2003

patch from the FreeImage project

Change-Id: I58a18a0db00e247f871d05e3ba99772704f0e079
2013-08-16 20:46:16 -07:00
Vikas Arora
e081f2f359 Pack code & extra_bits to Struct (VP8LPrefixCode).
Also created variant VP8LPrefixEncodeBits that returns the
code & extra_bits only.
There's no impact on compression density and compression speed.

Change-Id: I2cafdd3438ac9270cd72ad9d57b383cdddfdfa4c
2013-08-12 11:56:42 -07:00
Vikas Arora
69257f70df Create LUT for PrefixEncode.
This speeds up lossless compression by 5%.
Change-Id: Ifd114b1d9850dc3aac74593809e7d48529d35e3d
2013-08-05 10:20:18 -07:00
Vikas Arora
8967b9f37e SSE2 for lossless decoding (critical) functions.
This speeds up WebP lossless decoding by 20%. In particular, the
photographic images get 35% speedup.

Change-Id: Idb94750342a140ec05df52c07e12be4bba335adc
2013-06-27 11:42:45 -07:00
James Zern
d640614d54 update copyright text
rather than symlink the webm/vpx terms, use the same header as libvpx to
reference in-tree files

based on the discussion in:
https://codereview.chromium.org/12771026/

Change-Id: Ia3067ecddefaa7ee01550136e00f7b3f086d4af4
2013-06-06 23:09:14 -07:00
skal
af358e68ed Merge "remove datatype qualifier for vmnv" 2013-05-23 06:12:06 -07:00
skal
3fe91635df remove datatype qualifier for vmnv
this fix is for clang (LLVM v4.2). gcc was fine.

Change-Id: Id4076cda84813f6f9548a01775b094cff22b4be9
2013-05-23 13:52:24 +02:00
James Zern
2ca83968ae webp/lossless: fix big endian BGRA output
Change-Id: I3d4b3d21f561cb526dbe7697a31ea847d3e8b2c1
2013-05-17 00:32:01 -07:00
skal
87a4fca25f remove some warnings:
* "declaration of ‘index’ shadows a global declaration [-Wshadow]"
* "signed and unsigned type in conditional expression [-Wsign-compare]"

Change-Id: I891182d919b18b6c84048486e0385027bd93b57d
2013-05-14 22:28:32 +02:00
Urvang Joshi
64c844863a Further reduce memory to decode lossy+alpha images
Earlier such images were using roughly 9 * width * height bytes for
decoding. Now, they take 6 * width * height memory.

Change-Id: Ie4a681ca5074d96d64f30b2597fafdca648dd8f7
2013-05-13 16:24:49 -07:00
Vikas Arora
8eae188a62 WebP-Lossless encoding improvements.
Lossy (with Alpha) image compression gets 2.3X speedup.
Compressing lossless images is 20%-40% faster now.

Change-Id: I41f0225838b48ae5c60b1effd1b0de72fecb3ae6
2013-05-08 17:22:11 -07:00
skal
9c4ce971a8 Simplify forward-WHT + SSE2 version
no precision loss observed
speed is not really faster (0.5% at max), as forward-WHT isn't called often.

also: replaced a "int << 3" (undefined by C-spec) by a "int * 8"
( supersedes https://gerrit.chromium.org/gerrit/#/c/48739/ )

Change-Id: I2d980ec2f20f4ff6be5636105ff4f1c70ffde401
2013-04-26 08:57:18 +02:00
Urvang Joshi
d52b405dbd Cosmetic fixes
Change-Id: Ia878115086edc3fdfee3f0ca76e5e74ea5906f21
(cherry picked from commit e9a7990bc5)
2013-03-29 15:49:15 -07:00
Pascal Massimino
6cb4a61825 misc style fix
(cherry picked from commit 142c46291e)

Conflicts:
	src/webp/format_constants.h

Change-Id: Ib764cb09bd78ab6e72c60f495d55b752ad4dbe4d
2013-03-29 15:49:05 -07:00
Pascal Massimino
3c8eb9a806 fix bad saturation order in QuantizeBlock
Saturation was done on input coeff, not quantized one.

This saturation is not absolutely needed: output of FTransformWHT
is in range [-16320, 16321]. At quality 100, max quantization steps is 8,
so the maximal range used by QuantizeBlock() is [-2040, 2040].
But there's some extra bias (mtx->bias_[] and mtx->sharpen_[]) so
it's better to leave this saturation check for now.

addresses issue #145

Change-Id: I4b14f71cdc80c46f9eaadb2a4e8e03d396879d28
2013-03-25 14:53:29 -07:00
James Zern
9048494df6 build: fix install race on shared headers
subdirectories with more than one target can have the install targets
run in parallel with make -jN. group the shared headers in one place to
produce a common install target.

Change-Id: I1f3aa338a8ee6d681de1e5d0b2c6244d2c3d5451
2013-03-16 13:29:49 -07:00
skal
126974b45b add LUT-free reference code for YUV->RGB conversion.
Reported to eventually be 4% on ARM
(see https://code.google.com/p/webp/issues/detail?id=134 for details)
We might activate it selectively later...

Output values is not bitwise the same as the LUT-based
version, but difference is only +/-1 at max.

Change-Id: I1cc790ff4459885ed2ae2e72f31c5f3740095f07
2013-03-15 01:37:55 +01:00
skal
b7eaa85d6a inline VP8LFastLog2() and VP8LFastSLog2 for small values
larger values are still dealt with in the .cc

~5% faster encoding
Output size is slightly different (variably), because of
different floating-point calculation ordering.

Change-Id: I6ede18b09c753997cf78aa1199a807d9ddb5d4b4
2013-02-25 22:46:52 +01:00
skal
943386db4b disable SSE2 for now
(until proper run-time detection is ready)

Change-Id: I7b8eee52b23fce2f1612ad7d4ed603ffb02620a2
2013-02-20 08:20:47 +01:00
skal
9479fb7d2d lossless encoding speedup
* add SSE2 variant for lossless
* speed-up TransformColor calls using specialized TransformColorBlue/Red
* Fuse the Shannon Entropy calls to compute it for X and X+Y simultaneously.

This latter changes the output size a little bit.

Change-Id: Ie5df94da78bf51a58da859c9099b56340da9ec89
2013-02-20 08:13:12 +01:00
skal
b7490f8553 introduce WEBP_REFERENCE_IMPLEMENTATION compile option
This flag will make the code use no uint64, no asm, and no fancy
trick, but instead aim at being as simple and straightforward as
possible.
Main use is to help emscripten generate proper JS code.
More code needs to be simplified later.

Also: tune the BITS values to be 24 and make use of WEBP_RIGHT_JUSTIFY
Here are the typical timing for decoding a large image:

        ARM7-a:
        dwebp_justify_32_neon Time to decode picture: 3.280s
        dwebp_justify_24_neon Time to decode picture: 2.640s
        dwebp_justify_16_neon Time to decode picture: 2.723s
        dwebp_justify_8_neon Time to decode picture: 2.802s
        dwebp_justify_32 Time to decode picture: 4.264s
        dwebp_justify_24 Time to decode picture: 3.696s
        dwebp_justify_16 Time to decode picture: 3.779s
        dwebp_justify_8 Time to decode picture: 3.834s
        dwebp_32_neon Time to decode picture: 4.010s
        dwebp_24_neon Time to decode picture: 2.725s
        dwebp_16_neon Time to decode picture: 2.852s
        dwebp_8_neon Time to decode picture: 2.778s
        dwebp_32 Time to decode picture: 4.587s
        dwebp_24 Time to decode picture: 3.800s
        dwebp_16 Time to decode picture: 3.902s
        dwebp_8 Time to decode picture: 3.815s
        REFERENCE (HEAD) Time to decode picture: 3.818s

        x86_64:
        dwebp_justify_32 Time to decode picture: 0.473s
        dwebp_justify_24 Time to decode picture: 0.434s
        dwebp_justify_16 Time to decode picture: 0.450s
        dwebp_justify_8 Time to decode picture: 0.467s
        dwebp_32 Time to decode picture: 0.474s
        dwebp_24 Time to decode picture: 0.468s
        dwebp_16 Time to decode picture: 0.468s
        dwebp_8 Time to decode picture: 0.481s
        REFERENCE (HEAD) Time to decode picture: 0.436s

        i386:
        dwebp_justify_32 Time to decode picture: 0.723s
        dwebp_justify_24 Time to decode picture: 0.618s
        dwebp_justify_16 Time to decode picture: 0.626s
        dwebp_justify_8 Time to decode picture: 0.651s
        dwebp_32 Time to decode picture: 0.744s
        dwebp_24 Time to decode picture: 0.627s
        dwebp_16 Time to decode picture: 0.642s
        dwebp_8 Time to decode picture: 0.642s

Change-Id: Ie56c7235733a24f94fbfc2e4351aae36ec39c225
2013-02-14 15:46:12 +01:00
pascal massimino
841a3ba5da Merge "Remove -Wshadow warnings." 2013-01-28 13:15:54 -08:00
Johann
6efed26865 Remove -Wshadow warnings.
Accidentally carried some bad habits from SSE code. Copy over fixes
from 0d19fbf

Change-Id: I763312c9d176c434ba41f95602bada1aeffebfb2
2013-01-28 12:29:12 -08:00
James Zern
27f8f7420e upsampling_neon.c: fix build
store values to a temporary variable before calling functions that take
vector types.
removes non-standard constructs such as:
  (uint8x8x2_t){{ a, b }}
fixing:
  src/dsp/upsampling_neon.c:69:32: error: macro "vst2_u8" passed 3
arguments, but takes just 2

Change-Id: Ib4368e16e3a3efac18024f02be94e76243ade2dc
Fixes: https://code.google.com/p/webp/issues/detail?id=140
2013-01-25 19:42:50 -08:00
Mans Rullgard
090b708a00 NEON optimised yuv to rgb conversion
- along the lines of the SSE chroma upsampling.
Total speedup is ~30%.

4% speed loss on YuvToRgbXX conversion using tables instead
of 14-bit fixed precision. TODO(later): investigate, and compare
to x86.

see http://code.google.com/p/webp/issues/detail?id=134

Change-Id: Idc2261037cd13b4553ca20ecc4c4007099c37009
2013-01-25 15:46:40 -08:00
James Zern
be7c96b069 cosmetics: break a few long lines
Change-Id: I785763b974b4e7664ad8e9884251aa2d5274b456
2013-01-23 14:50:19 -08:00
Vikas Arora
0aeba52852 Provide an option to build decoder library.
When the config option '--enable-libwebpdecoder' is specified, the
lean decoder library 'libwebpdecoder' will be created in addition to
libwebp. Also dwebp binary will be linked to libwebpdecoder, if this
config option is specified.

Change-Id: I9de3e149b59c9a8390fae2ba660941749640e54a
2013-01-23 11:43:36 -08:00
James Zern
2b252a53a8 Merge "Provide option to swap bytes for 16 bit colormodes" 2013-01-22 15:00:39 -08:00
Vikas Arora
94a48b4bc3 Provide option to swap bytes for 16 bit colormodes
Color modes: RGB_565 & RGBA_4444
Change-Id: I571b6832b9848e5c4109272978f68623ca373383
2013-01-22 14:51:20 -08:00
skal
0d19fbff51 remove some -Wshadow warnings
these are quite noisy, but it's not a big deal to remove
them.

Change-Id: I5deb08f10263feb77e2cc8a70be44ad4f725febd
2013-01-22 23:06:28 +01:00
skal
a556cb1ab4 Add details and reference about the YUV->RGB conversion
Originated from the discussion at
   http://code.google.com/p/webp/issues/detail?id=134

Change-Id: I24384e2d2f5cf262d8632fc98303cba5e2d27224
2013-01-18 23:26:55 +01:00
pascal massimino
f4a97970de Merge "Disto4x4 and Disto16x16 in NEON" 2013-01-17 11:07:20 -08:00
Johann
47b7b0ba47 Disto4x4 and Disto16x16 in NEON
Change-Id: Ic6d9dbbc97b5025ce359332c33ae306d5d8925a5
2013-01-16 16:57:33 -08:00
vikas arora
e6409adc2e Remove redundant include from dsp/lossless code.
Change-Id: Ie8a497a486653f907c2a27f4027640a3308c6cc8
2013-01-10 15:09:19 -08:00
skal
d5838cd598 faster non-transposing SSE2 4x4 FTransform
1-2% faster.
uses pmaddwd instead of transpose + pmullw.
Can possibly be simplified further.

Change-Id: I420e148816c4c6ab5e2080c9b1719dbbe6762d4e
2012-11-27 08:38:24 +01:00
skal
42c3b550ba simplify the fwd transform
-> remove two shifts

Change-Id: Ibc55bca98588da30553a7870224ffd0e13d57f52
2012-11-15 09:51:35 +01:00
skal
118cb31270 Merge "add SSE2 version of Sum of Square error for 16x16, 16x8 and 8x8 case" 2012-11-15 00:07:44 -08:00
skal
e5c3b3f554 Simplify the texture evaluation Disto4x4()
We don't need to use the exact forward transform,
since it's only a rough evaluation.
-> Removed some shifts and rounding constants.

Change-Id: I3fdf8b4fe9720473894155e1ad0345f4d1fd9a33
2012-11-14 07:49:31 +01:00
skal
35bfd4c08f add SSE2 version of Sum of Square error for 16x16, 16x8 and 8x8 case
+ replace mm_set1_ps(0) by _mm_setzero_si128()

Change-Id: I4601033c27466532373f5dabfaf349ce5e5039da
2012-11-14 06:16:49 +01:00
Urvang Joshi
7caab1d8f6 Some cosmetic/comment fixes.
Change-Id: Id0613f84cc53fcbeceb913c835a262451687e27b
2012-11-09 10:46:38 -08:00
Pascal Massimino
22a0fd9d01 Add NEON version of FTransformWHT
Contributed by Wayne Chen (datoudatou at gmail dot com)

Change-Id: I007c21db4eeadbf82b89f0963256f965deda7d90
2012-11-08 08:28:51 -08:00
Pascal Massimino
e8b41ad136 add NEON asm version for WHT inverse transform
Contributed by Wayne Chen (datoudatou at gmail dot com)

+ some header cleanup
+ remove the NEON suffix in static functions

Change-Id: I75bf5e9b54cf5e1acc53764c6f081d61690f8e3d
2012-11-01 16:31:01 -07:00
Pascal Massimino
75e5f17e3b ARM/NEON: 30% encoding speed-up
(implements the backward and forward transforms in the encoder)

original patch by Wayne Chen (datoudatou at gmail dot com)

Change-Id: Ic00f3bffcdf7a924f043006728735c810ee47a57
2012-10-31 14:00:20 -07:00
skal
f0360b4fcf add EXPERIMENTAL code for YUV-JPEG colorspace
This is mostly for experimentation!
Need to define USE_YUVj flag in the code for that.

suggested by benwreder at hotmail dot com

Change-Id: If0b8e2c1863efc08ce097de6de20f4c7efc3f7e8
2012-10-19 20:15:58 +02:00
skal
5725cabac0 new segmentation algorithm
fixes the 'blocky sky problem' (saturation problem: when luma was flat,
chroma noise was taking over, resulting in random segment id assigned.
When just using a common uniform segment was better).

+ side clean-up and readibility/experimentability MACRO'ization
+ added '-map 7' option

Change-Id: I35982a9e43c0fecbfdd7b05e4813e8ba8c121d71
2012-09-04 23:09:15 +02:00
Pascal Massimino
5c3a7231ca Make *InitSSE2() functions be empty on non-SSE2 platform
this avoids the '*.o has no symbols' warning messages

Change-Id: I00cf527a9041a810d896bd24b993112af6276323
2012-08-28 11:02:38 -07:00
Pascal Massimino
7c6e60f4bd make *InitSSE2() functions be empty on non-SSE2 platform
this avoids the '*.o has no symbols' warning messages

Change-Id: Idbaa02f5c2f7c632997a26f9507926922d191b6e
2012-08-27 23:40:47 -07:00
Pascal Massimino
c7eb45764f make VP8DspInitNEON() public
this will avoid the "dec_neon.o has no symbol" warning

no change in binary size observed on linux.

Change-Id: Ia27ae2bc5a03d714afa7e46671fdcf4cb630784d
2012-08-27 00:28:13 -07:00
James Zern
fe1958f17d RGBA4444: harmonize lossless/lossy alpha values
lossy was rounding with a bias toward opaque:
[232+, 8] -> [15, 1]
now both paths use the range:
[240+, 16] -> [15, 1]

Change-Id: I3da2063b4959b9e9f45bae09e640acc1f43470c5
2012-08-14 14:02:30 -07:00
James Zern
f06c1d8f7b Merge "Alignment fix" into 0.2.0 2012-08-09 16:09:58 -07:00
Urvang Joshi
f56e98fd11 Alignment fix
Change-Id: Ia5475247f03456b01571ae7531da90f74c068045
2012-08-10 02:10:32 +05:30
Pascal Massimino
528a11af35 fix the ARGB4444 premultiply arithmetic
* green was not descaled properly
* alpha was over-dithered, making the value '0x0f' not be a fixed point
* alpha value was not restored ok.

Change-Id: Ia4a4d75bdad41257f7c07ef76a487065ac36fede
2012-08-09 11:32:30 -07:00
Urvang Joshi
a0a488554d Lossless decoder fix for a special transform order
Fix the lossless decoder for the case when it has to apply other
inverse transforms before applying Color indexing inverse transform.

The main idea is to make ColorIndexingInverse virtually in-place: we
use the fact that the argb_cache is allocated to accommodate all
*unpacked* pixels of a macro-row, not just *packed* pixels.

Change-Id: I27f11f3043f863dfd753cc2580bc5b36376800c4
2012-08-08 23:52:08 -07:00
Pascal Massimino
f94b04f045 move some RGB->YUV functions to yuv.h
will be needed later

Change-Id: I6b9e460db2d398b9fecd5d3c1bbdb3f2f3d4f5db
2012-08-02 17:23:02 -07:00
Pascal Massimino
4af3f6c4d3 fix indentation
Change-Id: Ib00b3cdc21ac336a56390f1e71c169e7fd4767a6
2012-08-02 11:55:55 -07:00
Pascal Massimino
323dc4d9b9 remove use of log2(). Use VP8LFastLog2() instead.
Order-by-cost mostly unchanged (up to a scaling constant 1/log(2))
(except for few minor diff in < 2% of cases)

+ remove unused field cost_mode->cache_bits_

Change-Id: I714f8ab12f49a23f5d499a64c741382c9b489a3e
2012-08-02 00:08:58 -07:00
Pascal Massimino
2fc1301577 harmonize authors as "Name (mail@address)"
Change-Id: I85bfae61a37de75a5ed945a906002de2ef75149f
2012-07-19 16:09:47 -07:00
James Zern
d5e5ad6356 move decode_vp8.h from webp/ to dec/
the functions contained in it are now private

Change-Id: Ief6c81b32ae3f6d97052edac625716e5b909e66e
2012-07-16 22:12:59 -07:00
Pascal Massimino
fcc69923b9 add automatic YUVA/ARGB conversion during WebPEncode()
Adds new methods WebPPictureARGBToYUVA() and WebPPictureYUVAToARGB()
Depending on the value of picture->use_argb_input,
the main call WebPEncode() will convert appropriately.
Note that both conversions are lossy, so it's recommended to:
* use YUVA input for lossy compression (picture->use_argb_input=0)
* use ARGB input for lossless compression (picture->use_argb_input=1)

Change-Id: I8269d607723ee8a1136b9f4999f7ff4e657bbb04
2012-06-28 00:34:23 -07:00
Pascal Massimino
802e012a18 fix compilation in non-FANCY_UPSAMPLING mode
Change-Id: Id0b1fad3a4888b6e9563a227412b2e6a656d9a2a
2012-06-28 00:26:35 -07:00
Pascal Massimino
637a314f97 remove the now unused *KeepA variants
Change-Id: I65217f3075e30bc9a7f38a49d09f01c9d7271d6a
2012-06-27 10:00:48 -07:00
Pascal Massimino
78f3e34504 Enable lossless encoder code
Remove USE_LOSSLESS_ENCODER compile flag
Update Makefile.am and makefile.unix

Change-Id: If7080c4d8f37994c7c784730c5e547bb0a851455
2012-06-13 00:26:58 -07:00
James Zern
42f6df9da3 fix some implicit type conversion warnings
Change-Id: I0653d10410c0d46f91fedad4c4dffa9c1de402cb
2012-06-04 22:33:32 -07:00