libwebp

mirror of https://github.com/webmproject/libwebp.git synced 2025-10-27 00:23:55 +01:00

Author	SHA1	Message	Date
Pascal Massimino	58410cd6dc	fix bug in RefineUsingDistortion() When try_both_modes=0 (that is: -m 0 or -m 1), and the mode is i4, we were still sometimes falling back to (unexplored, uninitialized) i16 mode, which resulted in an enc/dec mismatch. This was mainly occurring for large images (when bit_limit is low enough). We disable the fall-back by disabling bit_limit using a large MAX_COST threshold. Change-Id: I0c60257595812bd813b239ff4c86703ddf63cbf8 (cherry picked from commit `0a3838ca77`)	2016-12-08 15:48:16 -08:00
Pascal Massimino	e168af8c6c	fix filtering auto-adjustment the min-distortion was quite too low. And we were also considering the fully skipped macroblocks (nz=0) in the stats. We need to have at least some non-zero dc coeffs (nz=0x100XXXX). Fix also two typos in StoreMaxDelta: the v0/v1 comparison was wrong, and the DCs[] coeffs are actually already in ZigZag order. Change-Id: I602aaa74b36f7ce80017e506212c7d6fd9deba1f (cherry picked from commit `e4cd4daf74`)	2016-12-08 15:48:08 -08:00
James Zern	9629f4bcda	SimplifySegments: quiet -Warray-bounds warning the number of segments are previously validated, but an explicit check is needed to avoid a warning under gcc-4.9 this is similar to the changes made in: `c8a87bb` AssignSegments: quiet -Warray-bounds warning `3e7f34a` AssignSegments: quiet array-bounds warning Change-Id: Iec7d470be424390c66f769a19576021d0cd9a2fd	2016-05-02 12:17:49 -07:00
Pascal Massimino	e88c4ca013	fix -m 2 mode-cost evaluation (causing partition0 overflow) The mode's bits were not taken into account, which is ok for most of cases. But in case of super large image, with 'easy' content, their overhead starts mattering a lot and we were omitting to optimize for these. Now, these mode bits have their own lambda values associated, limiting the jerkiness. We also limit (for -m 2 only) the individual number of bits to something that will prevent the partition 0 overflow. removed the I4_PENALTY constant, which was a rather crude approximation. Replaced by some q-dependent expression. fixes issue #289 Change-Id: I956ae2d2308c339adc4706d52722f0bb61ccf18c	2016-03-11 20:34:45 +01:00
Pascal Massimino	367bf903b3	fix PrintBlockInfo() ... which has gone out of sync since the last block-cache layout change. Change-Id: Ic441ec07b0198b508ce3fd34ab582cb60b1daabc	2015-12-17 15:47:25 +01:00
Pascal Massimino	9cf1cc2bd6	remove few TODO: * 256 -> RD_DISTO_MULT * don't use TDisto for UV mode picking Change-Id: I243148c716fe688b5c1b1fb9b7a6e58d0b5e6835	2015-12-15 22:52:12 -08:00
Pascal Massimino	e6c9351918	add disto-based refinement for UV mode (if method = 1 or 2) This doesn't slow down much and gives some quality improvement. Change-Id: I5afbe62b9c3922b3ec1bf6538c68dcdb0f25d2e4	2015-12-11 03:15:59 -08:00
Pascal Massimino	14d27a46be	improve method #2 by merging DistoRefine() and SimpleQuantize() it's now a single function, that reconstructs the intra4x4 block during the scan The I4_PENALTY had to be adjusted. Overall, result is better quality-wise (esp. at q < 50), and a tad faster too. method #0, #1 and #3+ are unchanged Change-Id: If262aeb552397860b3dd532df8df6b1357779222	2015-12-10 08:04:04 +01:00
skal	ac76801159	introduce FTransform2 to perform two transforms at a time. FTransform goes from ~12.0% to 11.5% total CPU time. Change-Id: Ibcb23155324f4fd8b235563f80668531c781f624	2015-05-18 21:06:15 -07:00
Pascal Massimino	82d980209b	add a dec/common.h header to collect common enc/dec #defines had to rename few structs. -> we can now include both vp8i.h and vp8enci.h without naming conflicts. Change-Id: Ib41b498f1b57aab3d6b796361afc45210ec75174	2015-03-31 22:17:58 -07:00
Pascal Massimino	2382050748	1-2% faster encoding by removing an indirection in GetResidualCost() The MIPS code for cost is not updated yet, that's why i keep Residual::cost around for now. Should be removed in favor of costs later. Change-Id: Id1d09a8c37ea8c5b34ad5eb8811d6a3ec6c4d89f	2015-02-19 08:44:35 +01:00
James Zern	9475bef4d7	PickBestUV: fix VP8Copy16x8 invocation param order is src, dst broken in: `66ad372` factorize BPS definition in dsp.h and add VP8Copy16x8 Change-Id: I761f618e3fe31ae7f58953256381f4f16bdb238e	2014-12-04 23:12:30 -08:00
Pascal Massimino	66ad372500	factorize BPS definition in dsp.h and add VP8Copy16x8 Change-Id: Id73a1e968c96455808755df4d131d74e3e2e135d	2014-12-04 13:45:14 +01:00
Pascal Massimino	57606047ec	encoder: switch BPS to 32 instead of 16 this is a first step to unifying encoding/decoding cache stride and possibly sharing the prediction functions in dsp/ With this layout, there's a little (~7%) space lost with unused samples. But no speed change was observed. Change-Id: I016df8cad41bde5088df3579e6ad65d884ee711e	2014-12-04 09:17:18 +01:00
skal	a48a2d7635	~3-5% faster encoding optimizing PickBestIntra() Add early-out check for Intra16 * replace some memcpy() by pointer swap Change-Id: I5edc5f7fbc8e39984deb48e6c045c97c61418589	2014-09-01 14:40:25 +02:00
skal	e632b0929b	fix indentation Change-Id: I2294a6c83e5f345f64bd5120b91532e00ed6c543	2014-08-25 23:52:09 -07:00
skal	73d361dd5f	introduce VP8EncQuantize2Blocks to quantize two blocks at a time No speed diff for now. We might reorder better the instructions later, to speed things up. Change-Id: I1949525a0b329c7fd861b8dbea7db4b23d37709c	2014-08-25 20:21:42 -07:00
skal	69fce2ea78	remove the special casing for res->first in VP8SetResidualCoeffs if res->first = 1, coeffs[0]=0 because of quant.c:749 and line added at quant.c:744 So, no need for the extra case. Going forward, TrellisQuantizeBlock() should also be calling a variant of VP8SetResidualCoeffs() to set the 'last' field. also: fixes a warning for win64 + slight speed-up Change-Id: Ib24b611f7396d24aeb5b56dc74d5c39160f048f0	2014-06-08 06:40:22 +02:00
skal	869eaf6c60	~30% encoding speedup: use NEON for QuantizeBlock() also revamped the signature to avoid having to pass the 'first' parameter Change-Id: Ief9af1747dcfb5db0700b595d0073cebd57542a5	2014-04-08 03:08:22 -07:00
James Zern	fbed36433d	Merge "dsp: reuse wht transform from dec in encoder"	2014-03-26 15:13:07 -07:00
skal	d1b33ad58b	2-5% faster trellis with clang/MacOS (and ~2-3% on ARM) We don't need to store cost/score for each node, but only for the current and previous one -> simplify code and save some memory. Also made the 'Node' structure tighter. Change-Id: Ie3ad7d3b678992b396242f56e2ac387fe43852e6	2014-03-26 22:33:01 +01:00
James Zern	df230f2723	dsp: reuse wht transform from dec in encoder Change-Id: Ide663db9eaecb7a37fe0e6ad4cd5f37de190c717	2014-03-22 13:25:08 -07:00
Pascal Massimino	59daf08362	Merge "cosmetics:"	2014-03-18 04:02:33 -07:00
Pascal Massimino	536220084c	cosmetics: - use VP8ScanUV, separate from VP8Scan[] (for luma) - fix indentation - few missing consts - change TrellisQuantizeBlock() signature Change-Id: I94b437d791cbf887015772b5923feb83dd145530	2014-03-18 03:34:56 -07:00
skal	30176619c6	4-5% faster trellis by removing some unneeded calculations. (We didn't need the exact value of the max_error properly. We can work with relative values instead of absolute) Output is bitwise the same as before. Change-Id: I67aeaaea5f81bfd9ca8e1158387a5083a2b6c649	2014-03-06 15:57:25 +01:00
skal	82af82644b	few cosmetics after patch #69079 Change-Id: Ifa758420421b5a05825a593f6b43504887603ee7	2014-03-03 23:53:08 +01:00
skal	5aeeb087d6	5-10% encoding speedup with faster trellis (-m 6) mostly by: - storing a single rd-score instead of cost / distortion separately - evaluating terminal cost only once - getting some invariants out of the loops - more consts behind fewer variables Change-Id: I79451f3fd1143d6537200fb8b90d0ba252809f8c	2014-03-03 22:07:06 +01:00
Pascal Massimino	4287d0d49b	speed-up trellis quant (~5-10% overall speed-up) store costs[] in node instead of context Change-Id: I6aeb0fd94af9e48580106c41408900fe3467cc54 also: various cosmetics	2014-02-27 00:06:00 -08:00
Pascal Massimino	390c8b316d	lossy encoding: ~3% speed-up incorporate non-last cost in per-level cost table also: correct trellis-quant cost evaluation at nodes (output a little bit different now). Method 6 is ~4% faster. Change-Id: Ic48bd6d33f9193838216e7dc3a9f9c5508a1fbe8	2014-02-26 05:52:24 -08:00
skal	0235d5e44b	1-2% faster quantization in SSE2 C-version is a bit faster too (sub-1% faster on ARM) Change-Id: I077262042f1d0937aba1ecf15174f2c51bf6cd97	2014-02-13 15:55:30 -08:00
Pascal Massimino	495bef413d	fix bug in TrellisQuantize the quantized level should be clipped to 2047, not the original coeff. (similar problem was fixed in the regular quantize function quite some time ago) Change-Id: I2fd2f8d94561ff0204e60535321ab41a565e8f85	2013-12-17 11:08:01 -08:00
James Zern	5227d99146	drop: ifdef __cplusplus checks from C files the prototypes are already marked in the headers Change-Id: I172fe742200c939ca32a70a2299809b8baf9b094	2013-12-13 11:42:13 -08:00
skal	73b731fb42	introduce a special quantization function for WHT WHT is somewhat a special case: no sharpen[] bias, etc. Will be useful in a later CL when precision of input is changed. Change-Id: I851b06deb94abdfc1ef00acafb8aa731801b4299	2013-12-10 14:21:47 +01:00
skal	a3359f5d2c	Only compute quantization params once (all quantization params #1..#15 are the same) Change-Id: If04058bd89fe2677b5b118ee4e1bcce88f0e4bf5	2013-12-10 05:36:23 +01:00
skal	d513bb62bc	* fix off-by-one zthresh calculation * remove the sharpening for non luma-AC coeffs * adjust the bias a little bit to compensate for this Using the multiply-by-reciprocal doesn't always give the same result as the exact divide, given the QFIX fixed-point precision we use. -> removed few now-unneeded SSE2 instructions (and checked for bit-exactness using -noasm) Change-Id: Ib68057cbdd69c4e589af56a01a8e7085db762c24	2013-12-09 13:56:04 +01:00
James Zern	4931c3294b	cosmetics: fix some typos Change-Id: I0d6efebd817815139db5ae87236fd8911df4d53c	2013-11-26 19:21:14 -08:00
skal	e3312ea681	detect flatness in blocks and favor DC prediction this avoids local-minima that look bad, even if the distortion looks low (e.g. gradients, sky,...). Mostly visible in the q=50-80 range. Output size is mostly unchanged. Change-Id: I425b600ec45420db409911367cda375870bc2c63	2013-11-01 00:47:04 +01:00
skal	a014e9c9cd	tune quantization biases toward higher precision * raise U/V quantization bias to more neutral values * also raise the non-zero AC bias for Y1/Y2 matrices (we need all the precision we can for U/V levels, which are often empty) This will increase quality in the higher range (q >= 90) mostly. File size is expected to rise a little (5-7%), and SSIM accordingly of course. Change-Id: I8a9ffdb6d8fb6dadb959e3fd392e66dc5aaed64e	2013-10-30 23:57:23 +01:00
skal	1e898619cb	add helpful PrintBlockInfo() function (protected under a DEBUG_BLOCK compile flag) Change-Id: Icb8da02dbd00e5cf856c314943c212f1c9578d9b	2013-10-30 19:25:27 +01:00
James Zern	d3408720d8	Merge "fast auto-determined filtering strength"	2013-10-29 12:47:58 -07:00
skal	f8bfd5cd1e	fast auto-determined filtering strength kLevelsFromDelta[sharpness][delta] is an inverse look-up table that tells the minimum filtering strength needed to trigger the filtering of a step with amplitude 'delta'. We use this table in various situations: a) when computing the initial (/global) filtering strength for each segment. We look at the quantization step and deduce the proper filtering strength needed to result this quantization noise (taking the -f option into account). b) during intra16 calculation, when a block ends up very empty (only DC coeffs are non-zero, all ACs have vanished). We'll rely on the in-loop filtering to restore the smoothness (if the source was gradient-like smooth. That's why we look at the distortion too before triggering the filtering). Step b) goes _in addition_ to a), potentially raising the filtering strength if blockiness is likely. Change-Id: Icaeca93ef21da195b079e6587a44d9edfc8e9efa	2013-10-29 20:13:29 +01:00
Pascal Massimino	ac0bf951ca	small clean-up in ExpandMatrix() Change-Id: Ib06cb1658a6548f06bb7320310b3864881b606a7	2013-10-29 19:58:57 +01:00
James Zern	10fddf53bb	enc/quant.c: silence a warning score_t -> int: rd_i4.H contains a value from a uint16_t lookup Change-Id: I7227de2dfab74b4f796abbc47955197ffa0e6110	2013-09-11 00:04:11 -07:00
Pascal Massimino	9f24519e82	encoder: misc rate-related fixes * fix VP8FixedCostsI4[] table (the constant cost '211' was erroneously included) * use the rd-score for '211' correctly (calling SetRDScore() for good) * count partition0 bits separately during rd-opt No meaningful difference in rd-curve. Change-Id: I6c49a150cf28928d9a92c32fff097600d7145ca4	2013-09-10 00:25:32 -07:00
skal	93402f02db	multi-threaded segment analysis When -mt is used, the analysis pass will be split in two and each halves performed in parallel. This gives a 5%-9% speed-up. This was a good occasion to revamp the iterator and analysis-loop code. As a result, the default (non-mt) behaviour is a tad (~1%) faster. Change-Id: Id0828c2ebe2e968db8ca227da80af591d6a4055f	2013-09-05 09:13:36 +02:00
skal	de4d4ad598	VP8EncIterator clean-up - remove unused fields from iterator - introduce VP8IteratorSetRow() too - rename 'done_' to 'countdown_' - bring y_left_/u_left_/v_left_ from VP8Encoder Change-Id: Idc1c15743157936e4cbb7002ebb5cc3c90e7f92a	2013-08-01 23:05:54 -07:00
James Zern	d640614d54	update copyright text rather than symlink the webm/vpx terms, use the same header as libvpx to reference in-tree files based on the discussion in: https://codereview.chromium.org/12771026/ Change-Id: Ia3067ecddefaa7ee01550136e00f7b3f086d4af4	2013-06-06 23:09:14 -07:00
Pascal Massimino	58ca6f65b7	rebalance method tools (-m) for methods [0..4] (methods 5 and 6 are still untouched). Methods #0 and #1 got much faster Method #2 gets vastly improved in quality Method #3 is noticeably faster for little lower quality Method #4 (default) is 10-20% faster for comparable quality + update the internal doc about the methods' tools. Example of speed difference: Time to encode picture: Method \| Before \| After -m 0 \| 1.272s \| 0.517s -m 1 \| 1.295s \| 0.623s -m 2 \| 2.217s \| 0.834s -m 3 \| 2.816s \| 2.243s -m 4 \| 3.235s \| 3.014s -m 5 \| 3.668s \| 3.654s -m 6 \| 8.296s \| 8.235s Change-Id: Ic41fda5de65066b3a6586cb8ae1ebb0206d47fe0	2013-02-27 02:19:20 -08:00
Pascal Massimino	5189957e07	describe rd-opt levels introduce VP8RDLevel enum makes things somehow clearer compared to using magic constants Change-Id: I9115cee71252511f722806427ee8a97f1a1cd95f	2013-02-26 02:20:59 -08:00
skal	e895059a05	add a -jpeg_like option This option remaps internal parameters to better match the expected compression curve of JPEG and produce output files of similar size, but with better quality. Change-Id: I96a1cbb480b1f6a0c6845a23c33dfd63f197b689	2013-02-06 14:19:16 +01:00

67 Commits