
579 Commits

Author SHA1 Message Date
Johann
47b7b0ba47 Disto4x4 and Disto16x16 in NEON
Change-Id: Ic6d9dbbc97b5025ce359332c33ae306d5d8925a5
2013-01-16 16:57:33 -08:00
vikas arora
e6409adc2e Remove redundant include from dsp/lossless code.
Change-Id: Ie8a497a486653f907c2a27f4027640a3308c6cc8
2013-01-10 15:09:19 -08:00
skal
d5838cd598 faster non-transposing SSE2 4x4 FTransform
1-2% faster.
uses pmaddwd instead of transpose + pmullw.
Can possibly be simplified further.

Change-Id: I420e148816c4c6ab5e2080c9b1719dbbe6762d4e
2012-11-27 08:38:24 +01:00
skal
42c3b550ba simplify the fwd transform
-> remove two shifts

Change-Id: Ibc55bca98588da30553a7870224ffd0e13d57f52
2012-11-15 09:51:35 +01:00
skal
118cb31270 Merge "add SSE2 version of Sum of Square error for 16x16, 16x8 and 8x8 case" 2012-11-15 00:07:44 -08:00
skal
e5c3b3f554 Simplify the texture evaluation Disto4x4()
We don't need to use the exact forward transform,
since it's only a rough evaluation.
-> Removed some shifts and rounding constants.

Change-Id: I3fdf8b4fe9720473894155e1ad0345f4d1fd9a33
2012-11-14 07:49:31 +01:00
skal
35bfd4c08f add SSE2 version of Sum of Square error for 16x16, 16x8 and 8x8 case
+ replace _mm_set1_ps(0) by _mm_setzero_si128()

Change-Id: I4601033c27466532373f5dabfaf349ce5e5039da
2012-11-14 06:16:49 +01:00
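
For reference, the quantity that such an SSE2 routine speeds up can be written as a plain scalar loop. This is only a sketch: the name SketchSSE and the explicit stride/width/height parameters are assumptions made for illustration, not the library's actual signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference: sum of squared differences between two
 * w x h blocks of 8-bit samples that share the same row stride. */
int SketchSSE(const uint8_t* a, const uint8_t* b, int stride, int w, int h) {
  int sum = 0;
  int x, y;
  assert(a != NULL && b != NULL);
  for (y = 0; y < h; ++y) {
    for (x = 0; x < w; ++x) {
      const int diff = (int)a[x] - (int)b[x];
      sum += diff * diff;
    }
    a += stride;  /* advance both blocks to the next row */
    b += stride;
  }
  return sum;
}
```

Under these assumptions the 16x16, 16x8 and 8x8 cases would correspond to calls with (w, h) set to (16, 16), (16, 8) and (8, 8).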
Urvang Joshi
7caab1d8f6 Some cosmetic/comment fixes.
Change-Id: Id0613f84cc53fcbeceb913c835a262451687e27b
2012-11-09 10:46:38 -08:00
Pascal Massimino
22a0fd9d01 Add NEON version of FTransformWHT
Contributed by Wayne Chen (datoudatou at gmail dot com)

Change-Id: I007c21db4eeadbf82b89f0963256f965deda7d90
2012-11-08 08:28:51 -08:00
Pascal Massimino
e8b41ad136 add NEON asm version for WHT inverse transform
Contributed by Wayne Chen (datoudatou at gmail dot com)

+ some header cleanup
+ remove the NEON suffix in static functions

Change-Id: I75bf5e9b54cf5e1acc53764c6f081d61690f8e3d
2012-11-01 16:31:01 -07:00
Pascal Massimino
75e5f17e3b ARM/NEON: 30% encoding speed-up
(implements the backward and forward transforms in the encoder)

original patch by Wayne Chen (datoudatou at gmail dot com)

Change-Id: Ic00f3bffcdf7a924f043006728735c810ee47a57
2012-10-31 14:00:20 -07:00
skal
f0360b4fcf add EXPERIMENTAL code for YUV-JPEG colorspace
This is mostly for experimentation!
Need to define USE_YUVj flag in the code for that.

suggested by benwreder at hotmail dot com

Change-Id: If0b8e2c1863efc08ce097de6de20f4c7efc3f7e8
2012-10-19 20:15:58 +02:00
skal
5725cabac0 new segmentation algorithm
fixes the 'blocky sky problem' (a saturation problem: when luma was flat,
chroma noise took over and segment ids were assigned at random, where
a single uniform segment would have been better).

+ side clean-up and readability/experimentability MACRO'ization
+ added '-map 7' option

Change-Id: I35982a9e43c0fecbfdd7b05e4813e8ba8c121d71
2012-09-04 23:09:15 +02:00
Pascal Massimino
5c3a7231ca Make *InitSSE2() functions be empty on non-SSE2 platform
this avoids the '*.o has no symbols' warning messages

Change-Id: I00cf527a9041a810d896bd24b993112af6276323
2012-08-28 11:02:38 -07:00
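
The pattern described here can be sketched as follows: the init function is always defined, and only its body is conditional, so the object file still exports a symbol on non-SSE2 targets. All names below (SketchAdd, SketchDspInit, SketchDspInitSSE2) are hypothetical, and the __SSE2__ test is an assumption about how SSE2 availability might be detected.

```c
#include <stddef.h>

/* Hypothetical function-pointer slot standing in for the library's DSP hooks. */
typedef int (*SketchAddFunc)(int, int);
SketchAddFunc SketchAdd = NULL;

/* Plain-C fallback implementation. */
static int AddC(int a, int b) { return a + b; }

/* Always defined, so the object file is never empty; the body compiles
 * to nothing when SSE2 is unavailable. */
void SketchDspInitSSE2(void) {
#if defined(__SSE2__)
  /* An SSE2 build would install its optimized routines here. */
#endif
}

/* Install the C fallback first, then let the SSE2 init override it. */
void SketchDspInit(void) {
  SketchAdd = AddC;
  SketchDspInitSSE2();
}
```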
Pascal Massimino
7c6e60f4bd make *InitSSE2() functions be empty on non-SSE2 platform
this avoids the '*.o has no symbols' warning messages

Change-Id: Idbaa02f5c2f7c632997a26f9507926922d191b6e
2012-08-27 23:40:47 -07:00
Pascal Massimino
c7eb45764f make VP8DspInitNEON() public
this will avoid the "dec_neon.o has no symbol" warning

no change in binary size observed on linux.

Change-Id: Ia27ae2bc5a03d714afa7e46671fdcf4cb630784d
2012-08-27 00:28:13 -07:00
James Zern
fe1958f17d RGBA4444: harmonize lossless/lossy alpha values
lossy was rounding with a bias toward opaque:
[232+, 8] -> [15, 1]
now both paths use the range:
[240+, 16] -> [15, 1]

Change-Id: I3da2063b4959b9e9f45bae09e640acc1f43470c5
2012-08-14 14:02:30 -07:00
James Zern
f06c1d8f7b Merge "Alignment fix" into 0.2.0 2012-08-09 16:09:58 -07:00
Urvang Joshi
f56e98fd11 Alignment fix
Change-Id: Ia5475247f03456b01571ae7531da90f74c068045
2012-08-10 02:10:32 +05:30
Pascal Massimino
528a11af35 fix the ARGB4444 premultiply arithmetic
* green was not descaled properly
* alpha was over-dithered, making the value '0x0f' not be a fixed point
* alpha value was not restored correctly.

Change-Id: Ia4a4d75bdad41257f7c07ef76a487065ac36fede
2012-08-09 11:32:30 -07:00
Urvang Joshi
a0a488554d Lossless decoder fix for a special transform order
Fix the lossless decoder for the case when it has to apply other
inverse transforms before applying Color indexing inverse transform.

The main idea is to make ColorIndexingInverse virtually in-place: we
use the fact that the argb_cache is allocated to accommodate all
*unpacked* pixels of a macro-row, not just *packed* pixels.

Change-Id: I27f11f3043f863dfd753cc2580bc5b36376800c4
2012-08-08 23:52:08 -07:00
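
A minimal sketch of how an unpacking step can run in place when the buffer is already sized for the unpacked pixels. It is not the decoder's actual code: the 4-bit packing, the byte-sized pixels and the name SketchUnpackInPlace are all assumptions chosen to keep the example short. Walking from the last pixel to the first guarantees that no packed byte is overwritten before it has been read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: buf has room for num_pixels bytes, and its first
 * (num_pixels + 1) / 2 bytes hold packed 4-bit indices, low nibble first.
 * On return, buf[i] holds the unpacked index of pixel i. */
void SketchUnpackInPlace(uint8_t* buf, size_t num_pixels) {
  size_t i = num_pixels;
  assert(buf != NULL);
  while (i-- > 0) {
    /* Read the packed byte before writing buf[i]; every later read uses
     * an index strictly below i, so this write cannot clobber it. */
    const uint8_t packed = buf[i >> 1];
    buf[i] = (i & 1) ? (uint8_t)(packed >> 4) : (uint8_t)(packed & 0x0f);
  }
}
```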
Pascal Massimino
f94b04f045 move some RGB->YUV functions to yuv.h
will be needed later

Change-Id: I6b9e460db2d398b9fecd5d3c1bbdb3f2f3d4f5db
2012-08-02 17:23:02 -07:00
Pascal Massimino
4af3f6c4d3 fix indentation
Change-Id: Ib00b3cdc21ac336a56390f1e71c169e7fd4767a6
2012-08-02 11:55:55 -07:00
Pascal Massimino
323dc4d9b9 remove use of log2(). Use VP8LFastLog2() instead.
Order-by-cost mostly unchanged (up to a scaling constant 1/log(2))
(except for a few minor diffs in < 2% of cases)

+ remove unused field cost_mode->cache_bits_

Change-Id: I714f8ab12f49a23f5d499a64c741382c9b489a3e
2012-08-02 00:08:58 -07:00
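
The scaling remark can be spelled out, assuming the message's log(2) means the natural logarithm of 2. Costs measured in base-2 and natural logs differ only by a positive constant factor, so comparing two candidate costs A and B gives the same order either way:

```latex
\log_2 x = \frac{\ln x}{\ln 2}
\quad\Longrightarrow\quad
\mathrm{cost}_2 = \frac{1}{\ln 2}\,\mathrm{cost}_e ,
\qquad
\mathrm{cost}_e(A) < \mathrm{cost}_e(B)
\iff
\mathrm{cost}_2(A) < \mathrm{cost}_2(B)
```

The few differing cases mentioned in the message would then presumably come from the approximation inside the fast log, not from the change of base.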
Pascal Massimino
2fc1301577 harmonize authors as "Name (mail@address)"
Change-Id: I85bfae61a37de75a5ed945a906002de2ef75149f
2012-07-19 16:09:47 -07:00
James Zern
d5e5ad6356 move decode_vp8.h from webp/ to dec/
the functions contained in it are now private

Change-Id: Ief6c81b32ae3f6d97052edac625716e5b909e66e
2012-07-16 22:12:59 -07:00
Pascal Massimino
fcc69923b9 add automatic YUVA/ARGB conversion during WebPEncode()
Adds new methods WebPPictureARGBToYUVA() and WebPPictureYUVAToARGB()
Depending on the value of picture->use_argb_input,
the main call WebPEncode() will convert appropriately.
Note that both conversions are lossy, so it's recommended to:
* use YUVA input for lossy compression (picture->use_argb_input=0)
* use ARGB input for lossless compression (picture->use_argb_input=1)

Change-Id: I8269d607723ee8a1136b9f4999f7ff4e657bbb04
2012-06-28 00:34:23 -07:00
Pascal Massimino
802e012a18 fix compilation in non-FANCY_UPSAMPLING mode
Change-Id: Id0b1fad3a4888b6e9563a227412b2e6a656d9a2a
2012-06-28 00:26:35 -07:00
Pascal Massimino
637a314f97 remove the now unused *KeepA variants
Change-Id: I65217f3075e30bc9a7f38a49d09f01c9d7271d6a
2012-06-27 10:00:48 -07:00
Pascal Massimino
78f3e34504 Enable lossless encoder code
Remove USE_LOSSLESS_ENCODER compile flag
Update Makefile.am and makefile.unix

Change-Id: If7080c4d8f37994c7c784730c5e547bb0a851455
2012-06-13 00:26:58 -07:00
James Zern
42f6df9da3 fix some implicit type conversion warnings
Change-Id: I0653d10410c0d46f91fedad4c4dffa9c1de402cb
2012-06-04 22:33:32 -07:00
Pascal Massimino
d2b6c6c03b cosmetic fixes after Idaba281a
Change-Id: I275a3dee5696fe1a3e2db0976f8241f2044be512
2012-06-04 13:19:28 -07:00
Pascal Massimino
48f827574e add colorspace for premultiplied alpha
The new modes are
       MODE_rgbA
       MODE_bgrA
       MODE_Argb
       MODE_rgbA_4444
It's binary incompatible, since the enums changed.

While at it, I removed the now unneeded KeepAlpha methods.
-> Saved ~12k of code!

* made explicit mention that alpha_plane is persistent,
so we have access to the full alpha plane data at all time.
Incremental decoding of alpha was planned for, but not
implemented. So it's better not to drag this constraint along for now
and to keep the code simpler until we revisit that.

Change-Id: Idaba281a6ca819965ca062d1c23329f36d90c7ff
2012-06-04 07:50:41 -07:00
Urvang Joshi
c13f663261 Move consts to internal header format_constants.h
Change-Id: Ic6180c16d0b4245680738992925e4608c593fbe8
2012-05-24 15:02:02 +05:30
James Zern
4f067fb254 Merge "Android: only build dec_neon with NEON support" 2012-05-23 22:26:03 -07:00
James Zern
255c66b48f Android: only build dec_neon with NEON support
Defining LOCAL_ARM_NEON = true can result in neon instructions being
used in portions unprotected by the cpu check.
This change defines a WEBP_USE_NEON/WEBP_ANDROID_NEON pair similar to
the SSE2 code and MSVC.

Change-Id: Ifac010b06e42c73d5aca529baa2198c6796674bd
2012-05-23 22:21:10 -07:00
Pascal Massimino
75d7f3b222 Merge "make input data be 'const' for VP8LInverseTransform()" 2012-05-23 07:54:12 -07:00
Pascal Massimino
9a721c6d24 make input data be 'const' for VP8LInverseTransform()
Change-Id: I5b5b1e29bca6c42704df141b21632a0d0fcb07cf
2012-05-23 07:21:53 -07:00
Pascal Massimino
f7ae5e370a cosmetics: join line
Change-Id: Ib27ed202fff439b94431360c8b0654d88962fb9a
2012-05-23 01:56:29 -07:00
Vikas Arora
237eab6764 Add two more color-spaces for lossless decoding.
Added the color-spaces RGBA_4444 and RGB_565, which Android devices
require for lossless decoding.

Change-Id: I229832edd4deca59e066f463e7454f77457c5bcd
2012-05-23 12:10:13 +05:30
James Zern
37a77a6bf4 remove some variable shadowing
Change-Id: I4348253ec6b50639095b22c4745dc26da0904466
2012-05-15 14:04:24 -07:00
James Zern
3926b5be3b Merge "dsp/cpu.c: Android: fix crash on non-neon arm builds" 2012-05-07 17:53:03 -07:00
James Zern
834f937f3c dsp/cpu.c: Android: fix crash on non-neon arm builds
add proper cpu-detection for Android targets

Fixes issue #118 (and is a better solution for #117).

based on patch by pepijn vaneeckhoudt

Change-Id: I6b00ea6d51ca658ccf6a3d55b87b99c01c6805be
2012-05-07 17:52:15 -07:00
James Zern
e38602d2ad Merge branch 'lossless_encoder'
* lossless_encoder: (46 commits)
  split StoreHuffmanCode() into smaller functions
  more consolidation: introduce VP8LHistogramSet
  big code clean-up and refactoring and optimization
  Some cosmetics in histogram.c
  Approximate FastLog between value range [256, 8192]
  Forgot to update out_bit_costs to symbol_bit_costs at one instance.
  Evaluate output cluster's bit_costs once in HistogramRefine.
  Simple Huffman code changes.
  Lossless decoder: remove an unneeded param in ReadHuffmanCodeLengths().
  Reducing emerging palette size from 11 to 9 bits.
  Move GetHistImageSymbols to histogram.c
  Improve predict vs no-predict heuristic.
  code-moving and clean-up
  reduce memory usage by allocating only one histo
  Restrict histo_bits to ensure histo_image size is under 32MB
  further simplification for the meta-Huffman coding
  A quick pass of cleanup in backward reference code
  Make transform bits a function of encode method (-m).
  introduce -lossless option, protected by USE_LOSSLESS_ENCODER
  Run TraceBackwards for higher qualities.
  ...

Conflicts:
	src/enc/webpenc.c

Change-Id: I9a5d98cba0889ea91d10699466939cc283da345a
2012-05-07 14:27:17 -07:00
Vikas Arora
ada6ff77df Approximate FastLog between value range [256, 8192]
Profiling data: profiled a few images and found that in the function VP8LFastLog,
a table lookup is performed 90% of the time, while the remaining 10% of calls
fall through to the log function. A typical lookup costs about 10 CPU instructions
and a call to log about 200, so the weighted average comes out to about 30
instructions per call. For mid qualities (25-75), this function (VP8LFastLog)
accounts for 30-50% of total CPU cycles (via the call path VP8LColorSpaceTransform
-> PredictionCostCrossColor -> ShannonEntropy). After this change, log is
called less than 1% of the time, with an average of 15 instructions per call.
Measured the performance over 1000 files at various qualities and found an
overall compression speedup of 10-15% (in the quality range [0, 75]). The
compression density loss is around 0.5% (though at some qualities, compression
is slightly better).

Change-Id: I247bc6a8d4351819c871f19d65455dc23aea8650
2012-05-07 14:25:26 -07:00
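
The weighted average quoted for the state before this change follows directly from the profiled figures (90% lookups at about 10 instructions, 10% log calls at about 200 instructions):

```latex
0.9 \times 10 + 0.1 \times 200 = 9 + 20 = 29 \approx 30 \ \text{instructions per call}
```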
Urvang Joshi
0993a611cd Full and final fix for prediction transform
use (tile_size + 1) rows of scratch area.

Change-Id: I06d612fff1794fc045ba76275e94e7210802c332
2012-05-07 14:24:43 -07:00
Urvang Joshi
afd2102f43 Fix cross-color transform in lossless encoder
make elements of "Multiplier" struct unsigned, so that any negative values are
automatically converted to "mod 256" values.

Change-Id: Iab4f9bacc50dcd94a557944727d9338dbb0982f7
2012-05-07 14:24:41 -07:00
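
The effect of making the fields unsigned can be illustrated with a small sketch. The struct and field names below are hypothetical stand-ins for the "Multiplier" struct, and the 8-bit field width is an assumption; what the sketch relies on is only that C converts an out-of-range integer to an unsigned 8-bit type by reducing it modulo 256.

```c
#include <stdint.h>

/* Hypothetical stand-in for the "Multiplier" struct, with unsigned fields. */
typedef struct {
  uint8_t green_to_red;
  uint8_t green_to_blue;
  uint8_t red_to_blue;
} SketchMultiplier;

/* Storing an int into a uint8_t field wraps it modulo 256, so negative
 * inputs need no explicit masking. */
SketchMultiplier SketchMakeMultiplier(int g2r, int g2b, int r2b) {
  SketchMultiplier m;
  m.green_to_red = (uint8_t)g2r;
  m.green_to_blue = (uint8_t)g2b;
  m.red_to_blue = (uint8_t)r2b;
  return m;
}
```

For example, SketchMakeMultiplier(-1, -128, 300) stores 255, 128 and 44.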
Urvang Joshi
4f0c5caf67 Fix prediction transform in lossless encoder.
(Keep one tile as a scratch buffer).

Change-Id: If112ada29bfd0bdc81b82e849a566b30dd331d2f
2012-05-07 14:24:35 -07:00
Vikas Arora
d673b6b9a0 Change the predictor function to pass left pixel
instead of pointer to the source.

Change-Id: Ia2c8e17c3140709a825c2f85a88c5e31bd6e462f
2012-05-07 14:24:29 -07:00
Urvang Joshi
b2f99465a7 Fix CopyTileWithPrediction()
so that it uses original values of left, top etc for prediction rather than the
predicted values of the same. Also, do some renaming in the same to make it
more readable.

Change-Id: I2fe94e35a6700bd437f5c601e2af12323bf32445
2012-05-07 14:24:27 -07:00
Urvang Joshi
6b38378acb Guard the lossless encoder (in flux) under a flag
Change-Id: I6dd8fd17089c199001c06b1afde14233dc3e3234
2012-05-07 14:24:23 -07:00
Vikas Arora
09f7532cce Fix a few nits (const qualifiers)
Change-Id: I527e82af49956b695ab18625d34e143854067421
2012-05-07 14:24:21 -07:00
Vikas Arora
648be3939f Added implementation for various lossless functions
- VP8LEncAnalyze, EvalAndApplySubtractGreen, ApplyPredictFilter,
  ApplyCrossColorFilter
- Added palette handling and transform buffer management in VP8LEncodeImage()
- Add Transforms (subtract Green, Predict, cross_color) to dsp/lossless.c.

These are more-or-less copied from src/lossless code.

After this Change, will implement the EncodeImageInternal() method.

Change-Id: Idf71f803c24b3b5ae3b5079b15e019721784611d
2012-05-07 14:24:19 -07:00
Pascal Massimino
b38dfccf8d remove unneeded reference to NUM_LITERAL_CODES
Change-Id: I3e98acce3a69fa45054ffcf77644fcbbc04bd366
2012-05-04 19:01:09 -07:00
James Zern
532020f24a lossless: remove some size_t -> int conversions
Sizes are given as ints in the documentation and used as such elsewhere.

Change-Id: I51ecd9e501cf9b4e3948aa0e947d2c9b5c85a30f
2012-04-24 16:00:00 -07:00
James Zern
39a57dae22 Makefile.am: header file maintenance
src/dec/Makefile.am: add missing reference to vp8li.h
src/{dec,dsp,enc}/Makefile.am: move some headers to noinst_

Change-Id: I0e2bc69980bd8175d99ad0ab63f537ef9e425b77
2012-04-23 18:53:48 -07:00
James Zern
b08819a624 dsp/lossless: silence some build warnings
src/dsp/lossless.c: In function 'VP8LInverseTransform':
src/dsp/lossless.c:312:23: warning: 'packed_pixels' may be used
uninitialized in this function [-Wuninitialized]
src/dsp/lossless.c:304:16: note: 'packed_pixels' was declared here
src/dsp/lossless.c:258:34: warning: 'm.red_to_blue_' may be used
uninitialized in this function [-Wuninitialized]
src/dsp/lossless.c:275:17: note: 'm.red_to_blue_' was declared here
src/dsp/lossless.c:257:34: warning: 'm.green_to_blue_' may be used
uninitialized in this function [-Wuninitialized]
src/dsp/lossless.c:275:17: note: 'm.green_to_blue_' was declared here
src/dsp/lossless.c:255:33: warning: 'm.green_to_red_' may be used
uninitialized in this function [-Wuninitialized]
src/dsp/lossless.c:275:17: note: 'm.green_to_red_' was declared here

patch by pepijn vaneeckhoudt

Change-Id: Iffa4764487a75479df45e772169325cd9ee60d94
2012-04-20 12:35:35 -07:00
James Zern
4cce137ebf Merge "enc_sse2 add missing stdlib.h include" 2012-04-19 15:51:53 -07:00
James Zern
80256b8567 enc_sse2 add missing stdlib.h include
lost in fbd82b5; most platforms were getting it indirectly through
emmintrin.h.

Change-Id: I310f8bc8e82d63cfbde74c34cd21b72514a16a01
2012-04-19 15:47:58 -07:00
pascal massimino
64083d3c89 Merge "Makefile.am: cosmetics" 2012-04-19 13:51:33 -07:00
James Zern
fbd82b5a39 types.h: centralize use of stddef.h
for size_t / NULL

Change-Id: If1331d3cf44296ed0ba9e838eae2f5b1bcaeb61b
2012-04-12 17:14:58 -07:00
James Zern
2154835f73 Makefile.am: cosmetics
- use common file organization across subdir makefiles
- append lib/source/header list variables and sort

Change-Id: I0653e1c73a4552b0c43d21f321b22b4972d6e87b
2012-04-12 15:53:06 -07:00
James Zern
f2623dbe58 enable lossless decoder
import changes from experimental 5529a2e^
and enable build in autoconf and makefile.unix; windows will be treated
separately.

Change-Id: Ie2e177a99db63190b4cd647b3edee3b4e13719e9
2012-04-10 23:06:36 -07:00
James Zern
514d008921 add dsp/lossless.[hc] from experimental
Pulled from the current HEAD (218c32e).
The history of this and related files is a bit entangled, so rather
than trying to split the changes and introduce some noise in master's
history, we'll start with a fresh snapshot.
The file progression is still available in the experimental branch.

Change-Id: I40538799dbf999abb9408ac83f55b897d8e22498
2012-04-10 17:37:44 -07:00
James Zern
5081db78be configure/automake: no -version-info for convenience libs
Silences:
libtool: link: warning: `-version-info/-version-number' is ignored for
convenience libraries

Change-Id: I5705383b58f529fb06c2bf0932976b5a202446b6
2012-02-07 18:06:20 -08:00
James Zern
a0b2736d79 cosmetics & warnings
- remove some unused functions
- move global arrays from data to read only section
- explicitly cast malloc returns; not specifically necessary, but helps
  show intent
- miscellaneous formatting

Change-Id: Ib15fe5b37fe6c29c369ad928bdc3a7290cd13c84
2012-01-30 17:19:53 -08:00
Johann
ba503fdac6 NEON TransformOne
As with the loop filter, implementing this with intrinsics is difficult
because we require subscript access for reading and writing 32 bits at a
time.

Approximately 5% decode speed improvement. This could be increased by
exposing TransformOne and rewriting TransformTwo to only handle dual
IDCTs.

Change-Id: Idd409264ab5d154a537107a1d54b419a48f7c1a8
2012-01-26 11:37:32 -08:00
James Zern
e4e3ec19ad fix gcc-4.0 apple 32-bit build
gcc-4.0 defines __PIC__ but not __pic__. This leaves the test for
__pic__ should the inverse case exist.
Fixes issue #103; build failing with:
"error: can't find a register in class 'BREG' while reloading 'asm'"

Change-Id: Ia767a733de6ce0294146f9477ff9c46f0ebe13b0
2012-01-18 13:12:45 -08:00
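
The check described in this message, accepting either spelling of the macro, might look like the sketch below. SKETCH_PIC_BUILD and SketchIsPicBuild are hypothetical names introduced for illustration; the actual guarded code (the inline asm that triggered the 'BREG' error) is not reproduced here.

```c
/* Treat the build as position-independent if the compiler defines either
 * spelling; per the message, gcc-4.0 on Apple defines only __PIC__. */
#if defined(__pic__) || defined(__PIC__)
#define SKETCH_PIC_BUILD 1
#else
#define SKETCH_PIC_BUILD 0
#endif

/* Returns 1 when compiled as position-independent code, 0 otherwise. */
int SketchIsPicBuild(void) {
  return SKETCH_PIC_BUILD;
}
```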
James Zern
ad1e163a0d cosmetics: normalize copyright headers
Change-Id: I5e2462b101e0447a4f15a1455c07131bc97a52dd
2012-01-06 14:49:06 -08:00
James Zern
f06817aaea simplify checks for enabling SSE2 code
also fixes build issues under vs11 which has a native arm compiler for
windows 8 targets

Change-Id: Id76c2deae9fc9de147d13ad0d34edffcb5a726c4
2011-12-20 17:41:55 -08:00
Pascal Massimino
91e27f4573 better fitting names for upsampling functions
Change-Id: I816e81586c9e1a74ebc5516598dbd4ae0ddf48d8
2011-12-08 06:42:53 -08:00
Urvang Joshi
8666a93aae Some bug-fixes for images with alpha.
- Fix the off-by-one diff when cropping with simple-filter.
- Fix a bug in incremental decoding in case of alpha.
- In VP8FinishRow(), do not decode alpha when y_start > y_end.
- Correct output of alpha channel for MODE_ARGB.
- Correct output of alpha channel for MODE_RGBA_4444.

Change-Id: I785763a2a704b973cc742ad93ffbb53699d1fc0a
2011-12-07 15:12:50 +05:30
James Zern
469d6eb974 Merge "Makefile.am: remove redundant noinst_HEADERS" 2011-11-16 18:41:55 -08:00
James Zern
ced3e3f4e0 Makefile.am: remove redundant noinst_HEADERS
When they appear in _SOURCES they won't be installed [1].

[1]: http://www.gnu.org/software/automake/manual/automake.html#Headers

Change-Id: I14fd816294682e7bd0fccefd6428e1526c9470d8
2011-11-11 16:29:32 -08:00
James Zern
964387ed19 use WEBP_INLINE for inline function declarations
removes a #define inline, objectionable in certain projects

Change-Id: Iebe0ce0b25a030756304d402679ef769e5f854d1
2011-11-11 10:53:58 -08:00
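
A project-specific inline macro of this kind could be sketched as below. The macro name SKETCH_INLINE, the helper SketchClip8 and the exact compiler conditions are assumptions, not the definition this commit introduced; the point is that the keyword is chosen per compiler without redefining `inline` itself.

```c
/* Pick an inline keyword per compiler instead of doing '#define inline'. */
#if defined(_MSC_VER)
#define SKETCH_INLINE __forceinline
#elif defined(__cplusplus) || \
    (defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L)
#define SKETCH_INLINE inline
#else
#define SKETCH_INLINE /* pre-C99 compiler: no inline keyword available */
#endif

/* Example use: clamp a value to the 8-bit range [0, 255]. */
static SKETCH_INLINE int SketchClip8(int v) {
  return (v < 0) ? 0 : (v > 255) ? 255 : v;
}
```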
Somnath Banerjee
d4e9f5598d NEON decode support in WebP
Change-Id: I0d6fa456ca68468353adcd64669f1737d1446f65
2011-09-13 16:00:47 -07:00
Pascal Massimino
ee697d9fc9 harmonize the include guards and #endif comments 2011-09-13 15:31:52 -07:00
Somnath Banerjee
a1ec07a618 Fixing compiler error in non x86 arch.
Compiler is not getting the definition of NULL.

Change-Id: I521a99c715bb43e633abd4a26d73ad25bbbafc94
2011-09-13 15:27:58 -07:00
Pascal Massimino
e06ac0887f create a separate libwebpdsp under src/dsp
Gathers all DSP-related functions (and SSE2 implementations).
Clean-up some unwanted symbolic dependencies so that webp_encode,
webp_decode and webp_dsp are truly independent libraries.

+ opportunistic clean-up:
  * remove unneeded VP8DspInitTables(), now integrated in VP8DspInit()
  * make consistent use of VP8GetCPUInfo() in the various DspInit() funcs
  * change OUT macro to DST
2011-09-13 12:29:44 -07:00