Commit Graph

1029 Commits

Author SHA1 Message Date
Pascal Massimino
241d11f141 add missing const
Change-Id: Id1c767d21d52197ed2e4497005eb9c4795c602f0
2013-10-25 20:34:14 +02:00
Pascal Massimino
86c0031eb2 add a 'format' field to WebPBitstreamFeatures
Change-Id: I79a688e4c34fb77527127bbdf4bc844efa6aa9a4
2013-10-25 20:34:06 +02:00
Urvang Joshi
dde91fde96 Demux: Correct the extended format validation
Earlier "f = f->next_" was executing for both inner and outer loop, thus
skipping validation of some frames.

Change-Id: Ice5cdb4ff5da78384aa0573addd3a5e5efa0b10c
2013-10-23 17:25:23 -07:00
skal
7c098d1814 Use some gamma-curve range compression when computing U/V average
This helps for discolorated chroma-subsampled edges.

Change-Id: I1d8ce87b66cb7e8b3572e6722905beabf0f50554
2013-10-18 21:28:05 +02:00
skal
0b2b05049f Use deterministic random-dithering during RGB->YUV conversion
-> helps debanding (sky, gradients, etc.)

This dithering can only be triggered when using -preset photo
or -pre 2 (as a preprocessing). Everything is unchanged otherwise.

Note that this change is likely to make the perceived PSNR/SSIM drop
since we're altering the input internally.

Change-Id: Id8d4326245d9b828141de162c94ba381b1fa5813
2013-10-17 22:36:49 +02:00
skal
8a2fa099cc Add a second multi-thread method
method 1 grouping: [parse + reconstruction] // [filtering + output]
method 2 grouping: [parse] // [reconstruction+filtering + output]

Depending on some heuristics (see VP8ThreadMethod()), we
can pick one of the other when -mt flag (or option.use_threads)
is selected.

Conservatively, we always use method #2 for now until the heuristic
is refined (so, timing should be the same the before this patch)

+ replace 'use_threads' by 'mt_method'
+ define MIN_WIDTH_FOR_THREADS constant
+ fix comment alignment

Change-Id: I11a756dea9070d6e21b1a9481d357a1e8aa0663e
2013-10-15 23:58:31 +02:00
skal
0532149c8a up to 20% faster multi-threaded decoding
Mostly visible for large images.
Reconstruction+filtering is now done in parallel to bitstream-parsing.

Change-Id: I4cc4483d803b255f4d97a2fcd9158b1c291dd900
2013-10-15 00:25:21 +02:00
skal
cb22155201 Decode a full row of bitstream before reconstructing
Needs more memory but allows for future parallelization.
Noticeably faster on ARM, slightly faster on x86

also: remove dec->filter_row_ unnecessary field

Change-Id: I044a808839b4e000c838a477e3e8688820436d9a
2013-10-10 21:29:58 +02:00
James Zern
dca8a4d315 Merge "NEON/simple loopfilter: avoid q4-q7 registers" 2013-10-10 01:58:41 -07:00
pascal massimino
9e84d901d2 Merge "NEON/TransformWHT: avoid q4-q7 registers" 2013-10-09 09:32:59 -07:00
James Zern
fc10249b36 NEON/simple loopfilter: avoid q4-q7 registers
very tiny speed improvement

Change-Id: I3024f120feb7275ce20bfff21af31ea8650a5a03
2013-10-09 18:17:31 +02:00
James Zern
2f09d63e30 NEON/TransformWHT: avoid q4-q7 registers
very tiny speed improvement

Change-Id: Iace78b9038af412d0a794845ff19f54afa88ccdc
2013-10-09 18:17:23 +02:00
skal
0b28d7ab08 use a macrofunc for setting NzCoeffs bits
(avoids code dup)

Change-Id: I776f065538e562673ca08f3bc43c7167d13254d9
2013-10-09 11:46:32 +02:00
skal
f9bbc2a034 Special-case sparse transform
If the number of non-zero coeffs is <= 3, use a
simplified transform for luma.

Change-Id: I78a1252704228d21720d4bc1221252c84338d9c8
2013-10-08 22:05:38 +02:00
Pascal Massimino
6a8c0eb718 Merge "small optimization in segment-smoothing loop" 2013-10-08 04:13:37 -07:00
Pascal Massimino
f7146bc1e6 small optimization in segment-smoothing loop
probably not much of a speed difference

Change-Id: I08c41d82c3c2eb5ff9ec9ca9d81af2bb09b362de
2013-10-07 07:44:51 -07:00
skal
63f9aba4b3 special-case WHT transform when there's only DC
happens surprisingly often at low quality, so we might
as well hard-code a simplified TransformWHT() directly.

Change-Id: Ib7a858ef74e8f334bd59d6512bf5bd3e455c5459
2013-10-02 14:27:36 +02:00
skal
2a98136667 7-8% faster decoding by rewriting GetCoeffs()
Change-Id: Ib7c27e985d3b5222e8fa1f98cec462458caa9541
2013-09-30 23:17:34 +02:00
skal
92d47e4ca9 improve VP8L signature detection by checking the version bits too
Change-Id: I20bea00b9582d7ea8c7b643616c78f717ce1bdf2
2013-09-27 18:17:23 +02:00
Pascal Massimino
40ae3520b1 fix memleak in WebPIDelete()
happens when decoding is partial (past Partition0), without error and
interrupted by calling WebPIDelete()

WebPIDelete() needs to call VP8ExitCritical() to free in-flight resources

Change-Id: Id4faef1b92f7edd8c17d642c58860e70dd570506
2013-09-17 15:45:43 -07:00
Urvang Joshi
d9662658d9 mux.h doc: WebPMuxGetFrame() can return WEBP_MUX_MEMORY_ERROR too.
Change-Id: Icc0305f162d1f5d15ad1b4713ac08e210c85c306
2013-09-17 14:37:12 -07:00
Pascal Massimino
3ebe175781 Merge "break down the proba 4D-array into some handy structs" 2013-09-14 03:14:30 -07:00
Pascal Massimino
6a44550a8c break down the proba 4D-array into some handy structs
Makes it easy to later add derived satellite fields...

Change-Id: I445767ea78cc788d11aec367479e74e485fdabe5
2013-09-14 03:12:09 -07:00
skal
2b0a759335 Merge "fix some warnings from static analysis" 2013-09-13 13:00:12 -07:00
Urvang Joshi
22dd07cee9 mux.h: Some doc corrections
Change-Id: Ia2840a489d87001de241a9a8fbc88a127af55ff9
2013-09-13 12:51:36 -07:00
skal
d51f45f047 fix some warnings from static analysis
http://code.google.com/p/webp/issues/detail?id=138

Change-Id: I21470e965357cc14eab356e2c477c7846ff76ef2
2013-09-13 11:33:30 +02:00
Pascal Massimino
d134307b7f fix conversion warning on MSVC
"src\enc\frame.c(88) : warning C4244: '=' : conversion from 'const double' to 'float', possible loss of data"

Change-Id: I143cb0bb6b69e1b8befe9b4f24b71adbc28095c2
2013-09-12 18:57:51 -07:00
skal
80b54e1c69 allow search with token buffer loop and fix PARTITION0 problem
The convergence algo is noticeably faster and more accurate.

Try it with: 'cwebp -size xxxxx -pass 8 ...' or 'cwebp -psnr 39 -pass 8 ...'
for instance

Allow full-looping with TokenBuffer case, and make the non-TokenBuffer
case match too.

In case Partition0 is likely to overflow, retry encoding with harder
limits on max_i4_header_bits_.

This CL should make -partition_limit option somewhat useless,
since the fix made automatically (albeit in a non-optimal way yet).

Change-Id: I46fde3564188b13b89d4cb69f847a5f24b8c735b
2013-09-11 21:15:28 +02:00
skal
b7d4e04255 add VP8EstimateTokenSize()
estimates final size of coded tokens given a set of probabilities

Change-Id: Ia5a459422557d98b78b3cd8e1a88cb30835825b6
2013-09-11 10:08:49 +02:00
James Zern
10fddf53bb enc/quant.c: silence a warning
score_t -> int: rd_i4.H contains a value from a uint16_t
lookup

Change-Id: I7227de2dfab74b4f796abbc47955197ffa0e6110
2013-09-11 00:04:11 -07:00
Pascal Massimino
399cd4568b Merge "fix compile error on ARM/gcc" 2013-09-10 15:12:30 -07:00
Pascal Massimino
9f24519e82 encoder: misc rate-related fixes
* fix VP8FixedCostsI4ÆÅ table
   (the constant cost '211' was erronenously included)
* use the rd-score for '211' correctly (calling SetRDScore() for good)
* count partition0 bits separately during rd-opt

No meaningful difference in rd-curve.

Change-Id: I6c49a150cf28928d9a92c32fff097600d7145ca4
2013-09-10 00:25:32 -07:00
James Zern
c663bb214a Merge "simplify VP8IteratorSaveBoundary() arg passing" 2013-09-06 14:21:42 -07:00
Urvang Joshi
fa46b31269 Demux.h: Correct a method name reference
Change-Id: I5638a4e40d9fcb44860952028f8b5ef2ea78621d
2013-09-06 11:26:00 -07:00
Pascal Massimino
f8398c9dab fix compile error on ARM/gcc
use of uint8_t type was causing error like:
src/dsp/upsampling.c:223:1: internal compiler error: in vect_determine_vectorization_factor, at tree-vect-loop.c:349

with gcc 4.6.3

Change-Id: Ieb6189a1375c47fc4ff992e6c09b34a7f1f605da
2013-09-06 03:07:28 -07:00
Pascal Massimino
f691f0e461 simplify VP8IteratorSaveBoundary() arg passing
we only need to save yuv_out_, so no need for the arg

Change-Id: I7bad5d910e81ed2eda5c9787821fd1cfe905bd92
2013-09-06 02:11:16 -07:00
Pascal Massimino
42542be855 up to 6% faster encoding with clang compiler
mostly by revamping the main loop of GetResidualCost() and avoiding some branches

Change-Id: Ib05763e18a6bf46c82dc3d5d1d8eb65e99474207
2013-09-05 10:38:22 -07:00
skal
93402f02db multi-threaded segment analysis
When -mt is used, the analysis pass will be split in two
and each halves performed in parallel. This gives a 5%-9% speed-up.

This was a good occasion to revamp the iterator and analysis-loop
code. As a result, the default (non-mt) behaviour is a tad (~1%) faster.

Change-Id: Id0828c2ebe2e968db8ca227da80af591d6a4055f
2013-09-05 09:13:36 +02:00
skal
7e2d65950f Merge "remove the PACK() bit-packing tricks" 2013-09-04 23:55:41 -07:00
skal
c13fecf908 remove the PACK() bit-packing tricks
was too smart for its own good :)
This is more ARM-friendly, since it removes a mult.

Change-Id: If146034c8efa2e71e3eaaf1230cb553884a42ebb
2013-09-05 08:53:36 +02:00
Pascal Massimino
2fd091c9ae Merge "use NULL for lf_stats_ testing, not bool" 2013-09-04 07:18:42 -07:00
James Zern
4bb8465f8c Merge "(de)mux.h: wrap pseudo-code in /* */" 2013-09-03 15:47:42 -07:00
skal
cfb56b1707 make -pass option work with token buffers
-pass 2 can be useful sometimes. More passes usually don't help more.
This change is a step toward being able to re-code the whole picture
with varying parameter (when token buffer is used).

Change-Id: Ia2538e2069a53c080e2ad248c18a1e04623a9304
2013-09-03 23:37:42 +02:00
James Zern
5416aab479 (de)mux.h: wrap pseudo-code in /* */
easier to copy / read as a block

Change-Id: Ia22689a13afd8ea2325dcdf432e35fc802d8177a
2013-09-03 14:31:02 -07:00
Pascal Massimino
35dba337a3 use NULL for lf_stats_ testing, not bool
Change-Id: I14c6341bce1c745b7f2d1b790f2d4f8441d01be6
2013-09-03 08:45:59 -07:00
skal
733a7faae4 enc->Iterator memory cleanup
* move yuv_in_/out_* scratch buffers to iterator
* add y_top_/uv_top_ shortcuts in iterator

That's ~3k of stack size instead of heap.
But it allows having several iterators work in parallel.

Change-Id: I6a437c0f2ef1e5d398c1d6a2fd4974fa0869f0c1
2013-08-31 23:38:11 +02:00
skal
bef7e9ccd1 Add doc precision about demux object keeping pointers to data.
Change-Id: I3d2139f975eedcce36606e586e0cbd6fa7d207e6
2013-08-21 11:09:37 -07:00
skal
be20decb5c fix compilation for BITS 24
in_bits is const. Trying to apply bswap on it, one gets the error message:
error: read-only variable 'in_bits' used as 'asm' output

Change-Id: I0bef494b822c83d8ea87b1938b0e486d94de4742
2013-08-20 18:55:00 -07:00
James Zern
b25a6fbfdc yuv.h: fix indent
Change-Id: I0c0bd5f7f71bc44e10134bd4f788769ec25cec1f
2013-08-19 18:06:15 -07:00
James Zern
388a7249c9 cosmetics: fix indent
Change-Id: Iad0fce79886bed0d61ddf2510ce133a5355ebc1f
2013-08-19 17:51:04 -07:00