Commit Graph

719 Commits

Author SHA1 Message Date
pascal massimino
f4a97970de Merge "Disto4x4 and Disto16x16 in NEON" 2013-01-17 11:07:20 -08:00
Johann
47b7b0ba47 Disto4x4 and Disto16x16 in NEON
Change-Id: Ic6d9dbbc97b5025ce359332c33ae306d5d8925a5
2013-01-16 16:57:33 -08:00
vikas arora
e6409adc2e Remove redundant include from dsp/lossless code.
Change-Id: Ie8a497a486653f907c2a27f4027640a3308c6cc8
2013-01-10 15:09:19 -08:00
Pascal Massimino
0f57dcc31f decoding speed-up (~1%)
- precompute filtering strength once for all at the beginning
  instead of per-macroblock
- reduce size of VP8MB struct from 8 bytes to 4.
- removed VP8StoreBlock() accordingly

Change-Id: Icf3d329473e21c464770be3d72a04c9ee4c321f2
2012-12-14 10:22:54 -08:00
James Zern
d6b88b7694 cosmetics: use '== 0' in size checks
Change-Id: I8ac18e2e570e4c6a8569a3955afa11fc943bee28
2012-12-10 23:27:34 -08:00
James Zern
d65ec6786a fix build, move token.c to src/enc/
broken in:
  657f5c9 move token buffer to its own file (token.c)

Change-Id: I8944a0b5760979bd43008c501b55df1d22d32180
2012-12-03 11:16:08 -08:00
skal
657f5c91b1 move token buffer to its own file (token.c)
Change-Id: Ib9791c52f48d98fad5ed3830f36894ef5ac362fa
2012-12-03 13:50:14 +01:00
skal
c34a3758ad introduce GetLargeValue() to slim-fast GetCoeffs().
GetCoeffs is (by far) the most consuming function of the decoder.
No speed change (unfortunately), but the main loop is somehow clearer.

Change-Id: I78f1c10cadc2c8696c041f5cbda86cab92cc6598
2012-11-28 08:24:23 +01:00
skal
d5838cd598 faster non-transposing SSE2 4x4 FTransform
1-2% faster.
uses pmaddwd instead of transpose + pmullw.
Can possibly be simplified further.

Change-Id: I420e148816c4c6ab5e2080c9b1719dbbe6762d4e
2012-11-27 08:38:24 +01:00
skal
f76191f9db speed up GetResidualCost()
* treat the last coeff as a special case
* re-arrange the inner code to be shorter
* replace some VP8EncBands[n] by n, for n = 0 or 1

Change-Id: I71e17b014cffad7b073e787fde06260905a6953f
2012-11-26 23:50:37 +01:00
skal
ba2aa0fdda Add support for BITS=24 case
The main advantage is that you can avoid the use of uint64_t
some times, sticking to 32bit only.
Default still is BITS=32, this is mainly "in case".

Change-Id: Id694028793117ba822c37d46ef6c52fa0afed4ac
2012-11-26 23:47:08 +01:00
Urvang Joshi
23782f95b4 Separate out mux and demux code and libraries:
- Separate out mux.h and demux.h
- muxtypes.h: new header for data types common to mux/demux
- Move some misc read/write utilities to utils/utils.h
- Remove some duplicate methods.
- Separate out mux/demux libraries

Change-Id: If9b9569b10d55d922ad9317ef51710544315d6de
2012-11-19 11:40:18 -08:00
skal
42c3b550ba simplify the fwd transform
-> remove two shifts

Change-Id: Ibc55bca98588da30553a7870224ffd0e13d57f52
2012-11-15 09:51:35 +01:00
skal
118cb31270 Merge "add SSE2 version of Sum of Square error for 16x16, 16x8 and 8x8 case" 2012-11-15 00:07:44 -08:00
skal
99e0a707da Merge "Simplify the texture evaluation Disto4x4()" 2012-11-15 00:07:11 -08:00
skal
0f923c3ffd make the bundling work in a tmp buffer
This avoids modifying the source picture.

Change-Id: I5b472859cda17fd3236a9e0fbedbb68977e09f85
2012-11-15 09:05:25 +01:00
skal
e5c3b3f554 Simplify the texture evaluation Disto4x4()
We don't need to use the exact forward transform,
since it's only a rough evaluation.
-> Removed some shifts and rounding constants.

Change-Id: I3fdf8b4fe9720473894155e1ad0345f4d1fd9a33
2012-11-14 07:49:31 +01:00
skal
35bfd4c08f add SSE2 version of Sum of Square error for 16x16, 16x8 and 8x8 case
+ replace mm_set1_ps(0) by _mm_setzero_si128()

Change-Id: I4601033c27466532373f5dabfaf349ce5e5039da
2012-11-14 06:16:49 +01:00
Urvang Joshi
2ca642e02a Rectify WebPMuxGetFeatures:
It should return ALPHA_FLAG for lossless bit-stream

Change-Id: I900bd5b58bf75bc25fca1abf4ecc12aea26eac1c
2012-11-09 14:37:20 -08:00
Urvang Joshi
7caab1d8f6 Some cosmetic/comment fixes.
Change-Id: Id0613f84cc53fcbeceb913c835a262451687e27b
2012-11-09 10:46:38 -08:00
Pascal Massimino
c7127a4dec Merge "Add NEON version of FTransformWHT" 2012-11-09 06:54:28 -08:00
Urvang Joshi
74356eb558 Add a simple cleanup step in mux assembly:
In particular, this removes any unnecessary FRGM/ANMF/ANIM chunks, and
indirectly leads to removal of unnecessary VP8X chunks as well.
This is especially useful for GIF to WebP conversion - it saves 56 bytes
(ANMF: 16+8 bytes, ANIM: 6+8 bytes, VP8X: 10+8 bytes) for non-animated GIFs.

Change-Id: I3b50a96ca585844c421b0fa4cd8593e52c3f95c5
2012-11-08 11:15:22 -08:00
Urvang Joshi
51bb1e5de7 mux.h: correct WebPDemuxSelectFragment() prototype
This is a correction to the following change:
a00a3daf5b Use 'frgm' instead of 'tile' in
webpmux parameters

Change-Id: I8fa0bce98efdde38827fd25712017a98a6ea7388
2012-11-08 11:04:39 -08:00
Pascal Massimino
22a0fd9d01 Add NEON version of FTransformWHT
Contributed by Wayne Chen (datoudatou at gmail dot com)

Change-Id: I007c21db4eeadbf82b89f0963256f965deda7d90
2012-11-08 08:28:51 -08:00
Urvang Joshi
fa30c86323 Update mux code to match the spec wrt animation
- Allow a duration of 0
- Rename LOOP chunk to ANIM and add the background color field to it.
- Add a disposal method field for each animation frame.
- Modify webpmux.c binary interface to allow the input of background color
  and disposal methods. Also make '-loop' and '-bgcolor' arguments optional
  with some default values.

Change-Id: I807372a61cdb8a0d3080ae3552caf2848070bf4d
2012-11-07 11:43:06 -08:00
Pascal Massimino
d9c5fbefa4 by-pass Analysis pass in case segments=1
10-15% faster encoding.

Almost same output, binary wise. The main difference is
that we can't compute uv_alpha susceptibility, means there
can be subtle differences with different -sns values.

Change-Id: Id1b1a50929bf125b6372212fee1ed75a3bed975f
2012-11-06 22:53:13 -08:00
pascal massimino
d2ad4450ce Merge changes Ibeccffc3,Id1585b16
* changes:
  Use 'frgm' instead of 'tile' in webpmux parameters
  Design change in ANMF and FRGM chunks:
2012-11-06 22:43:43 -08:00
James Zern
5c8be2515d Merge "Chunk fourCCs for XMP/EXIF" 2012-11-06 16:17:32 -08:00
Urvang Joshi
a00a3daf5b Use 'frgm' instead of 'tile' in webpmux parameters
- Also, use the term 'fragments' instead of 'tiling' in code
- This makes code consistent with the spec.

Change-Id: Ibeccffc35db23bbedb88cc5e18e29e51621931f8
2012-11-06 16:09:10 -08:00
Urvang Joshi
81b8a741ed Design change in ANMF and FRGM chunks:
- Make ANMF and FRGM chunks hierarchical so that they encompass all chunks of
  that frame.
- Use this in demuxer: stop parsing a frame if all image data for it isn't
  available yet. Thus, we have a frame-level incremental support; that is,
  all frames that are fully available can be parsed.
- Note: We still keep incremental support for single images - so that they can
  be decoded with incremental decoding.

Change-Id: Id1585b16b06caee1d84009c42a25d2de29fa6135
2012-11-06 16:04:33 -08:00
Urvang Joshi
f903cbab9a Chunk fourCCs for XMP/EXIF
Use separate fourCCs "XMP " and "EXIF" instead of a common "META"
Also, some refactorization in webpmux.c

Change-Id: Iad3337e5c1b81e785c60670ce28b1f536dd7ee31
2012-11-06 14:53:21 -08:00
vikas arora
812933d6ba Tune performance of HistogramCombine
Number of pairs selected are limited between 25% of histogram
images (at start) and number of histogram images left at any iteration.
Increase the range of iter_mult.
Removed min_cluster_size as parameter for tuning HistogramCombine.

Change-Id: Ia4068cd7af4d0f63c5af9001aceda8a40b9de740
2012-11-05 16:45:39 -08:00
James Zern
1c4609b1f8 Merge commit 'v0.2.1'
* commit 'v0.2.1':
  Update ChangeLog
  update NEWS
  bump version to 0.2.1
  libwebp: validate chunk size in ParseOptionalChunks
  cwebp (windows): fix alpha image import on XP
  autoconf/libwebp: enable dll builds for mingw
  [cd]webp: always output windows errors
  fix double to float conversion warning
  cwebp: fix jpg encodes on XP
  VP8LAllocateHistogramSet: fix overflow in size calculation
  GetHistoBits: fix integer overflow
  EncodeImageInternal: fix uninitialized free
  fix the -g/O3 discrepancy for 32bit compile
  fix the BITS=8 case
  Make *InitSSE2() functions be empty on non-SSE2 platform
  make *InitSSE2() functions be empty on non-SSE2 platform
  make VP8DspInitNEON() public

Conflicts:
	src/Makefile.am
	src/dsp/dec_neon.c

Change-Id: Iddc5152e4a6892db96c12d7c3f74adbc85fe6178
2012-11-02 12:20:19 -07:00
Pascal Massimino
e8b41ad136 add NEON asm version for WHT inverse transform
Contributed by Wayne Chen (datoudatou at gmail dot com)

+ some header cleanup
+ remove the NEON suffix in static functions

Change-Id: I75bf5e9b54cf5e1acc53764c6f081d61690f8e3d
2012-11-01 16:31:01 -07:00
pascal massimino
a61a824b3a Merge "Add NULL check in chunk APIs" 2012-10-31 16:07:24 -07:00
Pascal Massimino
0e8b7eedaa fix WebPPictureView() unassigned strides
y_stride/uv_stride/argb_stride were not set properly.

Change-Id: I001b8d46f873ca04b5c68eccd6f232061020f9ec
2012-10-31 16:01:34 -07:00
Pascal Massimino
75e5f17e3b ARM/NEON: 30% encoding speed-up
(implements the backward and forward transforms in the encoder)

original patch by Wayne Chen (datoudatou at gmail dot com)

Change-Id: Ic00f3bffcdf7a924f043006728735c810ee47a57
2012-10-31 14:00:20 -07:00
Urvang Joshi
02b4356875 Add NULL check in chunk APIs
Change-Id: I173ff6c9259111762580c1963ff60e34fd1e9b6b
2012-10-31 13:33:20 -07:00
Urvang Joshi
a077072777 mux struct naming
members of public structs should not have a trailing underscore.

Change-Id: Ieef42e1da115bf42b0ea42159701e32bed7b9f60
2012-10-31 11:37:49 -07:00
pascal massimino
6c66dde80f Merge "Tune Lossless encoder" 2012-10-30 18:27:14 -07:00
vikas arora
ab5ea217f7 Tune Lossless encoder
- Changed the dynamic range where more aggressive
  (BackwardReferencesTraceBackward) heuristic is run from quality > 10
  (instead of quality > 25).
- Limit the backward-ref Window size to 16*width & 256*width for lower
  qualities ([0, 25[ & [25, 50[) respectively, instead of 1M window.
- Evaluate the params for HashChainFindCopy outside this function call
  and pass it, instead of recomputing them for every call.

Change-Id: If9eedfc14b978e7632d7cf69c96186e2910b0554
2012-10-30 17:30:17 -07:00
Urvang Joshi
92f8059ce4 Rename some chunks:
TILE --> FRGM and FRM --> ANMF

Change-Id: I752f90b950413501aecb021a8f57882da0e01484
2012-10-30 15:02:15 -07:00
James Zern
3bb4bbeb60 Merge "Mux API change:" 2012-10-30 14:55:39 -07:00
Urvang Joshi
d0c79f0552 Mux API change:
Create common APIs for image, frame and tile.

Change-Id: I709ad752133094bd5bc89dd9c832ff79802aac68
2012-10-30 14:16:29 -07:00
James Zern
25f585c4f2 bump version to 0.2.1
lib - 0.2.1
libtool - 4.1.0 (compatible release)

Change-Id: Ib6ca47f2008d5c97d818422816ac60d0d0d8cffa
2012-10-29 19:17:17 -07:00
James Zern
fed7c0485a libwebp: validate chunk size in ParseOptionalChunks
the max wasn't checked leading to a rollover case, possibly exploitable.
additionally check the RIFF size early, to avoid similar issues.

pulled from chromium:
 http://codereview.chromium.org/11229048/

Change-Id: I4050b13a7e61ec023c0ef50958c45f651cf34c49
2012-10-29 14:00:33 -07:00
James Zern
b14fea993a autoconf/libwebp: enable dll builds for mingw
Change-Id: Ic1ace62aad1d8de95bc370c6729ded83b7b71f0b
2012-10-29 14:00:24 -07:00
James Zern
d662158010 fix double to float conversion warning
introduced in:
 a792b91 fix the -g/O3 discrepancy for 32bit compile

Change-Id: Ic77d6170a5a91cf58ec10c68656ac61a7c0ee41d
2012-10-29 14:00:14 -07:00
James Zern
734f762a08 VP8LAllocateHistogramSet: fix overflow in size calculation
the multiplications done for total_size would be done with integers,
possibly overflowing, before being promoted to 64-bit for the addition

Change-Id: I32c3a6400fc2ef120c38e01a8693f4cb1727234d
2012-10-29 14:00:05 -07:00
James Zern
f9cb58fbce GetHistoBits: fix integer overflow
huff_image_size was a size_t (=32 bits with 32-bit builds) which could
rollover causing an incorrectly sized allocation and a crash in lossless
encoding.
fixes issue #128

Change-Id: I0f20cee98c29b2b40b02607930b6b7a7ca56996d
2012-10-29 14:00:01 -07:00