101 Commits

Author SHA1 Message Date
Pascal Massimino
423ecaf484 move some SSIM-accumulation function for dsp/
This is in preparation for some SSE2 code.

And generally speaking, the whole SSIM code needs some
revamp: we're not averaging the SSIM value at each pixels
but just computing the overall SSIM value once, for the whole
plane. The former might be better than the latter.

Change-Id: I935784a917f84a18ef08dc5ec9a7b528abea46a5
2016-03-08 07:50:09 +01:00
James Zern
db86088426 Merge "remove useless #include" 2016-02-17 23:17:12 +00:00
Pascal Massimino
75f4af4d54 remove useless #include
Change-Id: Id34f12ec94d8be6853fabd67609a6006ac99f152
2016-01-25 09:34:10 -08:00
James Zern
71100500a8 bump version to 0.5.0
libwebp{,decoder} - 0.5.0
libwebp libtool - 6.0.0
libwebpdecoder libtool - 2.0.0

mux/demux - 0.3.0
libtool - 2.0.0

Change-Id: I5346d13eb827fb5890efbb63ff3f28cea9d0c55f
2015-12-17 19:45:14 -08:00
Lode Vandevenne
fb4c7832f1 lossless: simpler alpha cleanup preprocessing
setting all transparent pixels to black rather than the "flatten" method.

0.3% smaller filesize on the 1000 PNGs if alpha cleanup is used (before: 18685774, after: 18622472)

Change-Id: Ib0db9e7ccde55b36e82de07855f2dbb630fe62b1
2015-12-17 15:04:50 +01:00
James Zern
5bd04a087c sync versions with 0.4.4
libwebp{,decoder} - 0.4.4
libwebp libtool - 5.4.0
libwebpdecoder libtool - 1.4.0

mux/demux - 0.2.2 (unchanged)
libtool - 1.2.0 (unchanged)

(cherry picked from commit 62864042c053da4482a18252c9b7c28e45af9dc4)

Change-Id: I7d421dc47ad4d25a17450ce1b04562c5d58c596b
2015-10-28 23:43:40 -07:00
Pascal Massimino
12ec204ec7 moved ALIGN_CST into util/utils.h and renamed WEBP_ALIGN_xxx
Note that ALIGN_CST is still kept different in dec/frame.c for now,
because the values is 31 there, not 15. We might re-unite these two
later.

Change-Id: Ibbee607fac4eef02f175b56f0bb0ba359fda3b87
2015-10-14 00:03:14 -07:00
Mislav Bradac
48f66b6687 Add delta_palettization feature to WebP
Change-Id: Ibaf4e49aa67d63d0eb11848cca4fd0c60815864a
2015-10-02 14:29:54 -07:00
James Zern
db12250fd1 cosmetics: vp8enci.h: break long line
Change-Id: Ib7c7ef6171506e826ed5f7df20c5644f240fd645
2015-04-06 16:11:02 -07:00
Pascal Massimino
82d980209b add a dec/common.h header to collect common enc/dec #defines
had to rename few structs.

-> we can now include both vp8i.h and vp8enci.h without naming
conflicts.

Change-Id: Ib41b498f1b57aab3d6b796361afc45210ec75174
2015-03-31 22:17:58 -07:00
James Zern
92a5da9c8c sync versions with 0.4.3
libwebp{,decoder} - 0.4.3
libwebp libtool - 5.3.0
libwebpdecoder libtool - 1.3.0

mux/demux - 0.2.2 (unchanged)
libtool - 1.2.0 (unchanged)

(cherry picked from commit bd852f5d81edbcf201a4f6a1567689c9b95444d1)

Change-Id: Ie8c35ffc20c1bfd782bdafd99da6c6b1373022c1
2015-03-11 17:29:23 -07:00
Pascal Massimino
44bd95612e fix signature for VP8RecordCoeffTokens()
Change-Id: Ia2fe764b7280931335237ced8190604129fae565
2015-03-02 23:38:20 -08:00
Pascal Massimino
2382050748 1-2% faster encoding by removing an indirection in GetResidualCost()
The MIPS code for cost is not updated yet, that's why i keep Residual::*cost
around for now. Should be removed in favor of *costs later.

Change-Id: Id1d09a8c37ea8c5b34ad5eb8811d6a3ec6c4d89f
2015-02-19 08:44:35 +01:00
James Zern
e15560107c move some cost tables from enc/ to dsp/
removes circular dependency between dsp and enc.

since:
a987fae MIPS: dspr2: added optimization for function GetResidualCost

Change-Id: Ifeb8fc02de89e2ba982ed7ffacd925d649bfec3c
2015-02-11 16:10:06 -08:00
Pascal Massimino
bad775715a simplify the Histogram struct, to only store max_value and last_nz
we don't need to store the whole distribution in order to compute the alpha

Later, we can incorporate the max_value / last_non_zero bookkeeping
in SSE2 directly.

Change-Id: I748ccea4ac17965d7afcab91845ef01be3aa3e15
2014-12-10 10:44:57 +01:00
Pascal Massimino
66ad372500 factorize BPS definition in dsp.h and add VP8Copy16x8
Change-Id: Id73a1e968c96455808755df4d131d74e3e2e135d
2014-12-04 13:45:14 +01:00
Pascal Massimino
57606047ec encoder: switch BPS to 32 instead of 16
this is a first step to unifying encoding/decoding cache stride
and possibly sharing the prediction functions in dsp/

With this layout, there's a little (~7%) space lost with unused samples.
But no speed change was observed.

Change-Id: I016df8cad41bde5088df3579e6ad65d884ee711e
2014-12-04 09:17:18 +01:00
James Zern
64ac51446d sync version numbers to 0.4.2 release
libwebp{,decoder} - 0.4.2
libwebp libtool - 5.2.0
libwebpdecoder libtool - 1.2.0

mux/demux - 0.2.2
libtool - 1.2.0

(cherry picked from commit eec5f5f12179a4d950fc27aef1c70f087beade59)
(cherry picked from commit 857578a811ea96abfa3235e4e47dd1ab4726e655)

Change-Id: Ie9d10c68e28083674a8865ad8447b1a70dcea95d
2014-10-17 19:50:21 +02:00
Pascal Massimino
6c6736816c Improved near-lossless mode.
Compared to previous mode it gives another 10-30% improvement in compression keeping comparable PSNR on corresponding quality settings.

Still protected by the WEBP_EXPERIMENTAL_FEATURES flag.

Change-Id: I4821815b9a508f4f38c98821acaddb74c73c60ac
2014-10-15 10:57:21 -07:00
James Zern
b47fb00ac0 vp8enci.h: cosmetics: fix '*' placement
associate with the type

Change-Id: Icf94f11bf79f6ccee3150e27b228755f8f3f0f37
2014-08-05 22:14:12 -07:00
skal
b5a36cc9ad add -near_lossless [0..100] experimental option
This compresses the uimage using lossless compression and controlable
decimating pre-process.
Code is under WEBP_EXPERIMENTAL_FEATURE while it's being experimented with.

Change-Id: I8b7f4cfcc3c6afc52a556102842bdbb045ed5ee8
2014-08-05 19:17:10 +02:00
James Zern
29a9fe222a libwebp 0.4.1
- 7/24/14: version 0.4.1
   This is a binary compatible release.
   * AArch64 (arm64) & MIPS support/optimizations
   * NEON assembly additions:
     - ~25% faster lossy decode / encode (-m 4)
     - ~10% faster lossless decode
     - ~5-10% faster lossless encode (-m 3/4)
   * dwebp/vwebp can read from stdin
   * cwebp/gif2webp can write to stdout
   * cwebp can read webp files; useful if storing sources as webp lossless
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJT1xp9AAoJEPnD1r24IytdjDEP/3ZOnrWG0OIThlGE6bqgO3oy
 Y5O7RrvzFuPdGEZ1Kl9jDXjzsYY018/+HJmOD3kf+Qt/+F/8hpGH520VuEiJdVIW
 UcvoYaYq9xrmKNqEJx910Vh8TP7wE2T62OJcqKWg2JEczfUWn8WOKjmM5c8N1kJ2
 q6EbpCdWlxcD49L/MavJ5Yfw9jSZAjKzOIxxz0C294iMTK4IcSmeVvdqhkdyh96E
 CABw3o8sJfqB6p+KXjweXcE2KOhvzAWqTRcIogDC0jV/PgOlindf6k0am2FJHvMM
 A+sf/pmD0YKI1vEaXW+Vs6cz6LzvwbIkJSwuzBA7FYHAG5yqTSkQDxTSttw/RwiW
 fUScqHjQVBUqkM5bdOsdYBSDutQKDF2+WfcK5jXFdnydkQi59HKHV2R0K5cXYqfN
 Tu7aMBqFcfGunLlzfKCJcz8SElEmUjG6oAzRZYcdM9dmnR7ypQK17A/GbaysKKOE
 HMmep7uNX25w+6AL7zExnmPPPtSz+kj1SXt9fgldkelDhg1faAgfwXb/N4E+00lA
 1+aJD3gHcR4QnDI4gnKBKHyIktQPfNKMQ6xuL0oyvsalQ/loz08wu0aACcGDFrg4
 uOVVxTqU+pEITuwGcNk228+O2EbMWzzi3+Vhi1v3Gg3jJ3TRB3QN6NohmrsIackL
 4W2V5NoX5i2VizGfLy2g
 =GWd5
 -----END PGP SIGNATURE-----

Merge tag 'v0.4.1'

libwebp 0.4.1
- 7/24/14: version 0.4.1
  This is a binary compatible release.
  * AArch64 (arm64) & MIPS support/optimizations
  * NEON assembly additions:
    - ~25% faster lossy decode / encode (-m 4)
    - ~10% faster lossless decode
    - ~5-10% faster lossless encode (-m 3/4)
  * dwebp/vwebp can read from stdin
  * cwebp/gif2webp can write to stdout
  * cwebp can read webp files; useful if storing sources as webp lossless

* tag 'v0.4.1':
  update ChangeLog
  iosbuild.sh: specify optimization flags
  update ChangeLog
  makefile.unix: add vwebp.1 to the dist target
  update ChangeLog
  gif2webp: dust up the help message
  remove -noalphadither option from README/vwebp.1
  update NEWS for the next release
  update AUTHORS
  bump version to 0.4.1
  restore mux API compatibility
  remove the !WEBP_REFERENCE_IMPLEMENTATION tweak in Put8x8uv
  restore encode API compatibility
  restore decode API compatibility
  gif2webp: fix compile with giflib 5.1.0
  gif2webp: simplify giflib version checking

Change-Id: Icf599f29bc6c0db757bc133aaddb3dbbbc316e08
2014-07-29 18:06:58 -07:00
James Zern
85213b9bbe bump version to 0.4.1
libwebp{,decoder} - 0.4.1
libwebp libtool - 5.1.0
libwebpdecoder libtool - 1.1.0

mux/demux - 0.2.1
libtool - 1.1.0

Change-Id: If593a198f802fd68c7dbbdbe0fc2612dbc44e2df
2014-07-23 17:17:25 -07:00
James Zern
c2fc52e4ec restore encode API compatibility
protect WebPConfigLosslessPreset/WebPMemoryWriterClear w/a
WEBP_ENCODER_ABI_VERSION check

Change-Id: If4debc15fee172a3f18079bc2bd29eb8447bc14b
2014-07-22 22:19:55 -07:00
Pascal Massimino
736f2a175e extract colorspace code from picture.c into picture_csp.c
had to refactor few functions here and there.

Change-Id: I86fde6fec7c2fc7eb48f0ecf327dbbd2bd40b9d4
2014-07-16 16:37:26 -07:00
Pascal Massimino
fbadb48026 split monolithic picture.c into picture_{tools,psnr,rescale}.c
Change-Id: Ia5eb5496e4337e5bac8203872c5b014cad21c4f9
2014-07-12 09:13:33 -07:00
Pascal Massimino
f1e771735a remove all unused layer code
Change-Id: I220590162b24c70f404fe3087f19dd3e6cac3608
2014-05-08 22:37:38 -07:00
skal
5fe628d35d make the token page size be variable instead of fixed 8192
also changed the token-page layout a little bit to remove
a not-needed field.

This reduces the number of malloc()/free() calls substantially
with minimal increase in memory consumption (~2%).
For the tail of large sources, the number of malloc calls goes
typically from ~10000 to ~100 (e.g.: bryce_big.jpg: 22711 -> 105)

Change-Id: Ib847f41e618ed8c303d26b76da982fbc48de45b9
2014-05-05 14:26:14 -07:00
skal
d1b33ad58b 2-5% faster trellis with clang/MacOS
(and ~2-3% on ARM)

We don't need to store cost/score for each node, but only for
the current and previous one -> simplify code and save some memory.

Also made the 'Node' structure tighter.

Change-Id: Ie3ad7d3b678992b396242f56e2ac387fe43852e6
2014-03-26 22:33:01 +01:00
Pascal Massimino
536220084c cosmetics:
- use VP8ScanUV, separate from VP8Scan[] (for luma)
 - fix indentation
 - few missing consts
 - change TrellisQuantizeBlock() signature

Change-Id: I94b437d791cbf887015772b5923feb83dd145530
2014-03-18 03:34:56 -07:00
skal
0235d5e44b 1-2% faster quantization in SSE2
C-version is a bit faster too (sub-1% faster on ARM)

Change-Id: I077262042f1d0937aba1ecf15174f2c51bf6cd97
2014-02-13 15:55:30 -08:00
James Zern
ea59a8e980 Merge "Merge tag 'v0.4.0'" 2014-01-10 17:09:19 -08:00
skal
7574bed43d fix comments related to array sizes
thanks to foivos_g4 at hotmail dot com for spotting this.

Change-Id: I8cd301bae58a6edbc3ef47607654bc21321721ca
2014-01-09 17:30:25 +01:00
James Zern
8c524db84c bump version to 0.4.0
libwebp{,decoder} - 0.4.0
libwebp libtool - 5.0.0
libwebpdecoder libtool - 1.0.0

mux/demux - 0.2.0
libtool - 1.0.0

Change-Id: Idbd067f95a6af2f0057d6a63ab43176fcdbb767d
2013-12-18 19:20:00 -08:00
James Zern
605a712701 simplify __cplusplus ifdef
drop c_plusplus which is from a quite ancient pre-standard compiler

Change-Id: I9e357b3292a6b52b14c2641ba11f4f872c04b7fb
2013-12-16 20:16:02 -08:00
James Zern
4931c3294b cosmetics: fix some typos
Change-Id: I0d6efebd817815139db5ae87236fd8911df4d53c
2013-11-26 19:21:14 -08:00
skal
f8bfd5cd1e fast auto-determined filtering strength
kLevelsFromDelta[sharpness][delta] is an inverse look-up table
that tells the minimum filtering strength needed to trigger the
filtering of a step with amplitude 'delta'. We use this table
in various situations:

a) when computing the initial (/global) filtering
strength for each segment. We look at the quantization
step and deduce the proper filtering strength needed
to result this quantization noise (talking the -f option
into account).

b) during intra16 calculation, when a block ends up
very empty (only DC coeffs are non-zero, all ACs have
vanished). We'll rely on the in-loop filtering to
restore the smoothness (if the source was gradient-like
smooth. That's why we look at the distortion too before
triggering the filtering).

Step b) goes _in addition_ to a), potentially raising
the filtering strength if blockiness is likely.

Change-Id: Icaeca93ef21da195b079e6587a44d9edfc8e9efa
2013-10-29 20:13:29 +01:00
skal
b7d4e04255 add VP8EstimateTokenSize()
estimates final size of coded tokens given a set of probabilities

Change-Id: Ia5a459422557d98b78b3cd8e1a88cb30835825b6
2013-09-11 10:08:49 +02:00
Pascal Massimino
9f24519e82 encoder: misc rate-related fixes
* fix VP8FixedCostsI4ÆÅ table
   (the constant cost '211' was erronenously included)
* use the rd-score for '211' correctly (calling SetRDScore() for good)
* count partition0 bits separately during rd-opt

No meaningful difference in rd-curve.

Change-Id: I6c49a150cf28928d9a92c32fff097600d7145ca4
2013-09-10 00:25:32 -07:00
Pascal Massimino
f691f0e461 simplify VP8IteratorSaveBoundary() arg passing
we only need to save yuv_out_, so no need for the arg

Change-Id: I7bad5d910e81ed2eda5c9787821fd1cfe905bd92
2013-09-06 02:11:16 -07:00
skal
93402f02db multi-threaded segment analysis
When -mt is used, the analysis pass will be split in two
and each halves performed in parallel. This gives a 5%-9% speed-up.

This was a good occasion to revamp the iterator and analysis-loop
code. As a result, the default (non-mt) behaviour is a tad (~1%) faster.

Change-Id: Id0828c2ebe2e968db8ca227da80af591d6a4055f
2013-09-05 09:13:36 +02:00
skal
733a7faae4 enc->Iterator memory cleanup
* move yuv_in_/out_* scratch buffers to iterator
* add y_top_/uv_top_ shortcuts in iterator

That's ~3k of stack size instead of heap.
But it allows having several iterators work in parallel.

Change-Id: I6a437c0f2ef1e5d398c1d6a2fd4974fa0869f0c1
2013-08-31 23:38:11 +02:00
skal
de4d4ad598 VP8EncIterator clean-up
- remove unused fields from iterator
- introduce VP8IteratorSetRow() too
- rename 'done_' to 'countdown_'
- bring y_left_/u_left_/v_left_ from VP8Encoder

Change-Id: Idc1c15743157936e4cbb7002ebb5cc3c90e7f92a
2013-08-01 23:05:54 -07:00
James Zern
5a92c1a5e9 bump version to 0.3.1
libwebp{,decoder} - 0.3.1
libwebp libtool - 4.3.0 (compatible release)
libwebpdecoder libtool - 0.1.0 (compatible release)

mux/demux - 0.1.1
libtool - 0.1.0 (compatible release)

Change-Id: Icc8329a6bcd9eea5a715ea83f1535a66d6ba4b58
2013-06-12 23:19:13 -07:00
James Zern
c7e89cbb02 update copyright text
rather than symlink the webm/vpx terms, use the same header as libvpx to
reference in-tree files

based on the discussion in:
https://codereview.chromium.org/12771026/

Change-Id: Ia3067ecddefaa7ee01550136e00f7b3f086d4af4
(cherry picked from commit d640614d544453ce14e2bbeeef8f33c78c8b09fd)
2013-06-11 15:03:22 -07:00
Pascal Massimino
3c8eb9a806 fix bad saturation order in QuantizeBlock
Saturation was done on input coeff, not quantized one.

This saturation is not absolutely needed: output of FTransformWHT
is in range [-16320, 16321]. At quality 100, max quantization steps is 8,
so the maximal range used by QuantizeBlock() is [-2040, 2040].
But there's some extra bias (mtx->bias_[] and mtx->sharpen_[]) so
it's better to leave this saturation check for now.

addresses issue #145

Change-Id: I4b14f71cdc80c46f9eaadb2a4e8e03d396879d28
2013-03-25 14:53:29 -07:00
James Zern
cef9388283 bump version to 0.3.0
libwebp{,decoder} - 0.3.0
libwebp libtool - 4.2.0 (compatible release)
libwebpdecoder libtool - 0.0.0 (new release)

mux/demux - 0.1.0
libtool - 0.0.0 (new release)

Change-Id: Ied6efa390b2f97f1f41fc8349a365613c639d6cc
2013-03-16 14:08:14 -07:00
skal
9bfbdd144f 1.5x-2x faster encoding for method 3 and up
using token-buffer (that is: slightly more memory. O(output_size))

This change is ON by default. To return to previous behaviour, use
'cwebp -low_memory' or set config.low_memory to true.

Side-effect of this new mode: it forces 1 partition only (which was
default anyway), and makes some statistics about the bitstream
no longer available. cwebp will no longer report 'intra4-coeffs', etc.

This mode also doesn't work (yet) with multi-pass, and -low_memory
is currently forced for multi-pass.

also: reversed the flag: USE_TOKEN_BUFFER -> DISABLE_TOKEN_BUFFER
also: fixed the kAverageBytesPerMB estimate

Change-Id: I4ea80382038d6df4309663e0cb7bd88d9bca9cf1
2013-03-11 17:01:33 -07:00
James Zern
153f94e8b5 fix windows build
broken since:
ad25032 Merge "multi-threaded alpha encoding for lossy"

this produced an error due to an empty VP8TBuffer struct.

Change-Id: I640809d07d20092c1d660e2b59b58a62a12e4371
2013-03-08 15:50:04 -08:00
skal
ad2503203a Merge "multi-threaded alpha encoding for lossy" 2013-03-01 16:00:10 -08:00