Commit Graph

2702 Commits

Author SHA1 Message Date
80b54e1c69 allow search with token buffer loop and fix PARTITION0 problem
The convergence algo is noticeably faster and more accurate.

Try it with: 'cwebp -size xxxxx -pass 8 ...' or 'cwebp -psnr 39 -pass 8 ...'
for instance

Allow full-looping with TokenBuffer case, and make the non-TokenBuffer
case match too.

In case Partition0 is likely to overflow, retry encoding with harder
limits on max_i4_header_bits_.

This CL should make -partition_limit option somewhat useless,
since the fix made automatically (albeit in a non-optimal way yet).

Change-Id: I46fde3564188b13b89d4cb69f847a5f24b8c735b
2013-09-11 21:15:28 +02:00
b7d4e04255 add VP8EstimateTokenSize()
estimates final size of coded tokens given a set of probabilities

Change-Id: Ia5a459422557d98b78b3cd8e1a88cb30835825b6
2013-09-11 10:08:49 +02:00
10fddf53bb enc/quant.c: silence a warning
score_t -> int: rd_i4.H contains a value from a uint16_t
lookup

Change-Id: I7227de2dfab74b4f796abbc47955197ffa0e6110
2013-09-11 00:04:11 -07:00
399cd4568b Merge "fix compile error on ARM/gcc" 2013-09-10 15:12:30 -07:00
9f24519e82 encoder: misc rate-related fixes
* fix VP8FixedCostsI4ÆÅ table
   (the constant cost '211' was erronenously included)
* use the rd-score for '211' correctly (calling SetRDScore() for good)
* count partition0 bits separately during rd-opt

No meaningful difference in rd-curve.

Change-Id: I6c49a150cf28928d9a92c32fff097600d7145ca4
2013-09-10 00:25:32 -07:00
c663bb214a Merge "simplify VP8IteratorSaveBoundary() arg passing" 2013-09-06 14:21:42 -07:00
fa46b31269 Demux.h: Correct a method name reference
Change-Id: I5638a4e40d9fcb44860952028f8b5ef2ea78621d
2013-09-06 11:26:00 -07:00
f8398c9dab fix compile error on ARM/gcc
use of uint8_t type was causing error like:
src/dsp/upsampling.c:223:1: internal compiler error: in vect_determine_vectorization_factor, at tree-vect-loop.c:349

with gcc 4.6.3

Change-Id: Ieb6189a1375c47fc4ff992e6c09b34a7f1f605da
2013-09-06 03:07:28 -07:00
f691f0e461 simplify VP8IteratorSaveBoundary() arg passing
we only need to save yuv_out_, so no need for the arg

Change-Id: I7bad5d910e81ed2eda5c9787821fd1cfe905bd92
2013-09-06 02:11:16 -07:00
42542be855 up to 6% faster encoding with clang compiler
mostly by revamping the main loop of GetResidualCost() and avoiding some branches

Change-Id: Ib05763e18a6bf46c82dc3d5d1d8eb65e99474207
2013-09-05 10:38:22 -07:00
93402f02db multi-threaded segment analysis
When -mt is used, the analysis pass will be split in two
and each halves performed in parallel. This gives a 5%-9% speed-up.

This was a good occasion to revamp the iterator and analysis-loop
code. As a result, the default (non-mt) behaviour is a tad (~1%) faster.

Change-Id: Id0828c2ebe2e968db8ca227da80af591d6a4055f
2013-09-05 09:13:36 +02:00
7e2d65950f Merge "remove the PACK() bit-packing tricks" 2013-09-04 23:55:41 -07:00
c13fecf908 remove the PACK() bit-packing tricks
was too smart for its own good :)
This is more ARM-friendly, since it removes a mult.

Change-Id: If146034c8efa2e71e3eaaf1230cb553884a42ebb
2013-09-05 08:53:36 +02:00
2fd091c9ae Merge "use NULL for lf_stats_ testing, not bool" 2013-09-04 07:18:42 -07:00
4bb8465f8c Merge "(de)mux.h: wrap pseudo-code in /* */" 2013-09-03 15:47:42 -07:00
cfb56b1707 make -pass option work with token buffers
-pass 2 can be useful sometimes. More passes usually don't help more.
This change is a step toward being able to re-code the whole picture
with varying parameter (when token buffer is used).

Change-Id: Ia2538e2069a53c080e2ad248c18a1e04623a9304
2013-09-03 23:37:42 +02:00
5416aab479 (de)mux.h: wrap pseudo-code in /* */
easier to copy / read as a block

Change-Id: Ia22689a13afd8ea2325dcdf432e35fc802d8177a
2013-09-03 14:31:02 -07:00
35dba337a3 use NULL for lf_stats_ testing, not bool
Change-Id: I14c6341bce1c745b7f2d1b790f2d4f8441d01be6
2013-09-03 08:45:59 -07:00
733a7faae4 enc->Iterator memory cleanup
* move yuv_in_/out_* scratch buffers to iterator
* add y_top_/uv_top_ shortcuts in iterator

That's ~3k of stack size instead of heap.
But it allows having several iterators work in parallel.

Change-Id: I6a437c0f2ef1e5d398c1d6a2fd4974fa0869f0c1
2013-08-31 23:38:11 +02:00
bef7e9ccd1 Add doc precision about demux object keeping pointers to data.
Change-Id: I3d2139f975eedcce36606e586e0cbd6fa7d207e6
2013-08-21 11:09:37 -07:00
be20decb5c fix compilation for BITS 24
in_bits is const. Trying to apply bswap on it, one gets the error message:
error: read-only variable 'in_bits' used as 'asm' output

Change-Id: I0bef494b822c83d8ea87b1938b0e486d94de4742
2013-08-20 18:55:00 -07:00
b25a6fbfdc yuv.h: fix indent
Change-Id: I0c0bd5f7f71bc44e10134bd4f788769ec25cec1f
2013-08-19 18:06:15 -07:00
388a7249c9 cosmetics: fix indent
Change-Id: Iad0fce79886bed0d61ddf2510ce133a5355ebc1f
2013-08-19 17:51:04 -07:00
4c7322c86f Merge "dsp: msvc compatibility" 2013-08-19 17:42:16 -07:00
df6cebfa9e 5-7% faster SSE2 versions of YUV->RGB conversion functions
The C-version gets ~7-8% slower in order to match the SSE2
output exactly. The old (now off-by-1) code is kept under
the WEBP_YUV_USE_TABLE flag for reference.

(note that calc rounding precision is slightly better ~= +0.02dB)

on ARM-neon, we somehow recover the ~4% speed that was lost by mimicking
the initial C-version (see https://gerrit.chromium.org/gerrit/#/c/41610)

Change-Id: Ia4363c5ed9b4c9edff5d932b002e57bb7814bf6f
2013-08-19 17:05:58 -07:00
ad6ac32d7c simplify upsampler calls: only allow 'bottom' to be NULL
If 'top' was meant to be NULL, then bottom and top can be
swapped. Logic is simpler.

+ fix compilation in non-FANCY_UPSAMPLING mode

Change-Id: I7c62bbb59454017f072c0945d1ff2d24d89286ff
2013-08-19 16:47:51 -07:00
f358450feb dsp: msvc compatibility
intrin.h is available after VS2003

patch from the FreeImage project

Change-Id: I58a18a0db00e247f871d05e3ba99772704f0e079
2013-08-16 20:46:16 -07:00
43a7c8ebee Merge "cosmetics" 2013-08-15 11:24:30 -07:00
4c5f19c148 Merge "bit_reader.h: cosmetics" 2013-08-15 11:11:03 -07:00
f72fab7045 cosmetics
Change-Id: Ib6fa2a7e1db8c5c48607c7097520ffca34a6cb66
2013-08-15 04:16:37 -07:00
a2f5f73de3 Merge "Support for "Do not blend" in mux and demux libraries" 2013-08-14 09:27:58 -07:00
e081f2f359 Pack code & extra_bits to Struct (VP8LPrefixCode).
Also created variant VP8LPrefixEncodeBits that returns the
code & extra_bits only.
There's no impact on compression density and compression speed.

Change-Id: I2cafdd3438ac9270cd72ad9d57b383cdddfdfa4c
2013-08-12 11:56:42 -07:00
6284854bd5 Support for "Do not blend" in mux and demux libraries
Change-Id: I9566a8e2d059fe1ebd9ca99c7e13311bf3f8f281
2013-08-12 11:49:00 -07:00
f486aaa9f8 Merge "slightly faster ParseIntraMode" 2013-08-09 02:17:16 -07:00
d17186328c slightly faster ParseIntraMode
+ cosmetics

Change-Id: Icb906d5f84f025ef9e04b71a73801a22cc990ee5
2013-08-09 02:15:21 -07:00
3ceca8ad31 bit_reader.h: cosmetics
- use const where applicable
- drop unnecessary string.h include

Change-Id: I560eef84fe17d3925768f6817c02ea79604c4379
2013-08-06 14:25:26 -07:00
69257f70df Create LUT for PrefixEncode.
This speeds up lossless compression by 5%.
Change-Id: Ifd114b1d9850dc3aac74593809e7d48529d35e3d
2013-08-05 10:20:18 -07:00
988b70844e add WebPWorkerExecute() for convenient bypass
This is mainly for re-using the worker structs without using the
thread.

Change-Id: I8e1be29e53874ef425b15c192fb68036b4c0a359
2013-08-02 12:20:15 -07:00
06e24987e7 Merge "VP8EncIterator clean-up" 2013-08-02 00:40:02 -07:00
de4d4ad598 VP8EncIterator clean-up
- remove unused fields from iterator
- introduce VP8IteratorSetRow() too
- rename 'done_' to 'countdown_'
- bring y_left_/u_left_/v_left_ from VP8Encoder

Change-Id: Idc1c15743157936e4cbb7002ebb5cc3c90e7f92a
2013-08-01 23:05:54 -07:00
7bbe95293f Merge "cosmetics: thread.c: drop a redundant comment" 2013-07-31 23:56:06 -07:00
da41148560 cosmetics: thread.c: drop a redundant comment
+ fix #if/#else/#endif comment

Change-Id: I76bfd3b123ed181897ed0feba721d5c1a3a2b0d7
2013-07-31 22:55:51 -07:00
feb4b6e6b3 thread.h: #ifdef when checking WEBP_USE_THREAD
prevents a warning with -Wundef and is consistent with thread.c

Change-Id: I60fa337c3b9daeea7302e86802402cb943cdb262
2013-07-31 22:50:57 -07:00
8924a3a704 thread.c: drop WebPWorker prefix from static funcs
Change-Id: I7cd39c9e41bbf11157c488aff18631ef17fde464
2013-07-31 19:22:44 -07:00
1aed8f2afc Merge "fix indent" 2013-07-26 17:08:37 -07:00
4038ed154d fix indent
Change-Id: I7f226ec4276e5e1b46896086a7d96cb67f36de6a
2013-07-26 17:03:19 -07:00
1693fd9b16 Demux: A new state WEBP_DEMUX_PARSE_ERROR
WebPDemuxPartial() returns NULL for both of the following cases:
- There was a parsing error.
- It doesn't have enough data to start parsing.

Now, one can differentiate between these two cases by checking the value
of 'state' returned by WebPDemuxPartial().

Change-Id: Ia2377f0c516b3fcfae475c0662c4932d2eddcd0b
2013-07-26 14:35:46 -07:00
8dcae8b3cf fix rescaling-with-alpha inaccuracy
(still missing YUVA decoding case for now)

https://code.google.com/p/webp/issues/detail?id=160

Change-Id: If723b4a5c0a303d0853ec9d839f995adce056095
2013-07-26 12:10:26 -07:00
11249abfc7 Merge changes I9b4dc36c,I4e0eef4d
* changes:
  Mux: support parsing unknown chunks within a frame/fragment.
  Stricter check for presence of alpha when writing lossless images
2013-07-22 17:05:27 -07:00
52508a1fe4 Mux: support parsing unknown chunks within a frame/fragment.
Change-Id: I9b4dc36c5ccc4b46f60cd64c1ee21008e20c8b95
2013-07-22 17:00:41 -07:00