2513 Commits

Author SHA1 Message Date
Pascal Massimino
8dd28bb560 Slightly faster lossless decoding (1%)
-> introduce special case 64b pattern-copy, similar to the 8b one for alpha.
-> use mempcy() for non-overlapping areas
+ cosmetics and homogenezation of the code

Change-Id: I0e65e04b96fec94c009a4614137dfba2a0f98561
2014-09-09 11:18:30 +02:00
Djordje Pesut
f0103595dd MIPS: dspr2: added optimization for ColorIndexInverseTransforms
Change-Id: I5b6094ce489d4f896bc4b8f575142eb3c5054beb
2014-09-08 17:22:59 +02:00
Pascal Massimino
d3242aee16 make VP8LSetBitPos() set br->eos_ flag
ReadSymbol() finishes with a VP8LSetBitPos() call only and could miss an eos_ during the decode loop.

Things are faster because of inlining too.

Change-Id: I2d2a275f38834ba005bc767d45c5de72d032103e
2014-09-06 08:40:20 +02:00
Pascal Massimino
a9decb5584 Lossless decoding: fix eos_ flag condition
eos_ needs to be set only when superfluous bits have actually
been requested.
Earlier, we were assuming pre-mature end-of-stream to be an error.
Now, more precisely, we mark error when we have encountered end-of-stream *and*
we attempt to read more bits after that.

This handles cases where image data requires no bits to be read

Change-Id: I628e2c39c64f10c443fb51f86b1f5919cc9fd299
2014-09-05 20:21:50 +02:00
Pascal Massimino
3fea6a28da fix erroneous dec->status_ setting
We only need to set BITSTREAM_ERROR if !ok.

Change-Id: I5bd13e64797e8bc509477edb29158abb39cb0ee1
2014-09-05 19:48:11 +02:00
Djordje Pesut
80b8099fd8 MIPS: dspr2: add some specific mips code to commit I2c3f2b12f8df15b785fad5a9c56316e954ae0c53
added some C-code tuning also

Change-Id: I67ce70a063ef6b5821b9158a4defd6987eccbb9a
2014-09-04 13:42:39 +02:00
skal
e564062522 Merge "further refine the COPY_PATTERN optim for DecodeAlpha" 2014-09-04 03:43:55 -07:00
James Zern
854509fec0 enc/histogram.c: reindent after f4059d0
fixes indent in HistogramRemap after:
f4059d0 Code cleanup for HistogramRemap.

Change-Id: I9f53a088749e9100a70331bda1662488666c5156
2014-09-03 16:58:49 -07:00
skal
344219645b Merge "~3-5% faster encoding optimizing PickBestIntra*()" 2014-09-03 15:53:32 -07:00
skal
865069c12e further refine the COPY_PATTERN optim for DecodeAlpha
* use functions instead of MACRO
* adjust var's name

Overall, same speed, with more readible code

Change-Id: I2c3f2b12f8df15b785fad5a9c56316e954ae0c53
2014-09-04 00:25:27 +02:00
Djordje Pesut
a59562283f added C-level optimization for DecodeAlphaData function
Copies with short distances of 1,2 and 4 are specialized.

up to 10-14% faster alpha decoding.

Change-Id: I9708e98193910bfaf8ef43091f3fdea73b63896d
2014-09-03 16:49:17 +02:00
skal
187d379db6 add a fallback to ALPHA_NO_COMPRESSION
if ALPHA_LOSSLESS_COMPRESSION produces a too big file (very rare!),
we fall-back to no-compression automatically.

Change-Id: I5f3f509c635ce43a5e7c23f5d0f0c8329a5f24b7
2014-09-02 21:55:13 +02:00
skal
a48a2d7635 ~3-5% faster encoding optimizing PickBestIntra*()
* Add early-out check for Intra16
* replace some memcpy() by pointer swap

Change-Id: I5edc5f7fbc8e39984deb48e6c045c97c61418589
2014-09-01 14:40:25 +02:00
skal
77d4c7e337 address cosmetic comments from patch #71380
Change-Id: Iaba301b9e77aa4febe0efe1e6016fab42d5589f3
2014-08-28 18:08:00 -07:00
skal
f75dfbf23d Speed up Huffman decoding for lossless
speed-up is ~1.6% for photographic image to 10% for graphical image
(1000 images corpus was sped up by 5.8 %)

Code by akramarz@google.com and jyrki@google.com

Change-Id: Iceb2e50e6cc761b9315a3865d22ec9d19b8011c6
2014-08-28 12:28:04 -07:00
James Zern
637b388809 dsp/lossless: workaround gcc-4.9 bug on arm
force Sub3() to not be inlined, otherwise the code in Select() will be
incorrect.
https://android-review.googlesource.com/#/c/102511

Change-Id: I90ae58bf3e6cc92ca9897f69974733d562e29aaf
2014-08-27 20:31:21 -07:00
James Zern
8323a9038d dsp.h: collect gcc/clang version test macros
endian_inl.h already relies on dsp.h, grab the definitions from there.

Change-Id: I445f7d0631723043c55da1070498f89965bec7b1
2014-08-27 19:33:09 -07:00
skal
e6c4b52f28 move static initialization of WebPYUV444Converters[] to the Init function.
Split initialization of YUV444Converters[] out of Upsamplers init.

update test for NULL function pointers

Change-Id: I9603f54250f90c85a12ffbecfd6c59e9b06c47e0
2014-08-27 11:36:37 -07:00
skal
49911d4df2 Merge "fix indentation" 2014-08-27 07:52:36 -07:00
Vikas Arora
f4059d0c7d Code cleanup for HistogramRemap.
Avoid call to HistogramAddThresh when there's only one Histogram image.
Change-Id: I43b09e8e2d218c95969567034224777dcce37ab9
2014-08-26 15:45:22 -07:00
skal
e632b0929b fix indentation
Change-Id: I2294a6c83e5f345f64bd5120b91532e00ed6c543
2014-08-25 23:52:09 -07:00
skal
f5c04d64b7 Merge "add a DispatchAlpha() for SSE2 that handles 8 pixels at a time" 2014-08-25 22:43:42 -07:00
skal
fc98edd936 add a DispatchAlpha() for SSE2 that handles 8 pixels at a time
Only slightly faster.

Change-Id: Ie2e57e6a0950166124cf1075c6c9b45b7abdad8c
2014-08-25 21:03:03 -07:00
skal
73d361dd5f introduce VP8EncQuantize2Blocks to quantize two blocks at a time
No speed diff for now. We might reorder better the instructions later,
to speed things up.

Change-Id: I1949525a0b329c7fd861b8dbea7db4b23d37709c
2014-08-25 20:21:42 -07:00
Djordje Pesut
0b21c30b1a MIPS: dspr2: added optimization for EmitAlphaRGB
New dsp function: WebPDispatchAlpha()

Change-Id: I48e539d22471279ec75185759bc68d18b127f716
2014-08-21 20:39:35 -07:00
James Zern
953acd56a4 enc_neon: enable QuantizeBlock for aarch64
vtbl4_u8 is available everywhere except iOS arm64: use vtbl2q_u8 there
with a corresponding change in the load.

Change-Id: Ib84212dda3c7875348282726c29e3b79b78b0eac
2014-08-20 11:48:25 -07:00
Djordje Pesut
f4ae143720 MIPS: mips32: code rebase
mips code rebased to be same as C code
from commit I8c29a8a0285076cb3423b01ffae9fcc465da6a81

Change-Id: I3848f4ce43387c3a62b336606498779f7b07ec44
2014-08-19 15:13:16 +02:00
Djordje Pesut
569771549a MIPS: dspr2: added optimizations for VP8YuvTo*
VP8YuvToRgb
VP8YuvToBgr
VP8YuvToRgb565
VP8YuvToRgba4444
VP8YuvToArgb
VP8YuvToBgra
VP8YuvToRgba

Change-Id: I22212a125d890e1fd28388fec906a1a5c07ff386
2014-08-19 14:29:32 +02:00
skal
2523aa73cb SmartRGBYUV: fix odd-width problem with pixel replication
rightmost pixel was missing a copy, which could lead to invalid read.

Also added a lower dimension of 4, below which we use the regular conversion.
This is to prevent corner cases, in addition to not being overkill.

Change-Id: Iac12e7a3d74590f12fe8eeb1830b9891e61439f6
2014-08-18 15:58:36 -07:00
Pascal Massimino
ee52dc4e54 fix some MSVC64 warning about float conversion
Change-Id: I27ab27fc15033d27d0505729f6275fb542c8d473
2014-08-16 00:15:29 -07:00
James Zern
3fca851a20 cpu: check for _MSC_VER before using msvc inline asm
_M_IX86 will be defined in mingw builds after including windows.h. as
the gcc inline asm is first, this missing check would only have caused
an error if the code was reorganized.

Change-Id: I395679bcfc43e94d308d1ceb0c0fbf932b2c378c
2014-08-15 15:11:40 -07:00
skal
e2a83d7109 faster RGB->YUV conversion function (~7% speedup)
with a special case for dithering==0., it gets somewhat faster on x86
thanks to inlining.

Also, less macros.

Change-Id: Ic2f2bf6718310743bb40cef2104fa759a073e6d5
2014-08-15 11:13:25 -07:00
skal
de2d03e12f Merge "Add smart RGB->YUV conversion option -pre 4" 2014-08-15 11:07:49 -07:00
skal
3fc4c539aa Add smart RGB->YUV conversion option -pre 4
New function: WebPPictureSmartARGBToYUVA()
This implement smart RGB->YUV conversion.

This is rather undocumented for now, and is triggered using '-pre 4'
preprocessing option.

This is slow-ish and use quite some memory, but should be improvable.
This is somehow a usable beta version.

Change-Id: Ia50a8c30134e4cab8a7d3eb70aef13ce1f6187a1
2014-08-15 10:55:09 -07:00
Djordje Pesut
b4dc4069a2 MIPS: dspr2: added optimization for (un)filters
HorizontalFilter
VerticalFilter
GradientFilter
HorizontalUnfilter
VerticalUnfilter
GradientUnfilter

Change-Id: I54055b4767c37719691811072e95bf79c1f627b1
2014-08-14 11:55:19 -07:00
Djordje Pesut
b61c9ceca8 MIPS: dspr2: Optimization of some simple point-sampling functions
Change-Id: I6a4ab29bd0cc5a2951a8882cf9997032dc38bd79
2014-08-13 17:18:49 +02:00
Djordje Pesut
98c54107df MIPS: mips32r2: added optimization for BSwap32
gcc < 4.8.3 doesn't translate bswap optimally.
use optimized version always

Change-Id: I979ea26ad6dc0166d3d2f39c4148eb8adfb7ddec
2014-08-12 09:29:13 +02:00
Djordje Pesut
b7e5a5c451 MIPS: detect mips32r6 and disable mips32r1 code
Change-Id: Id1325c789a990c9a8704e84e99a22d580303eb8a
2014-08-08 17:29:31 +02:00
pascal massimino
bb07022b66 Merge "cosmetics" 2014-08-06 12:30:08 -07:00
James Zern
e300c9d819 cosmetics
fix some indent/whitespace, remove a few duplicate includes, extra
semi-colons

Change-Id: If937182b40a21e0f2028496e7b4b06c6e8a41352
2014-08-06 12:10:59 -07:00
pascal massimino
0e519eea8e Merge "cosmetics: remove some extraneous 'extern's" 2014-08-05 23:00:04 -07:00
pascal massimino
3ef0f08af5 Merge "vp8enci.h: cosmetics: fix '*' placement" 2014-08-05 22:34:13 -07:00
James Zern
4c6dde37b9 bit_writer: cosmetics: rename kFlush() -> Flush()
Change-Id: I8907927974188bee85ffade1d75d2e50817aa115
2014-08-05 22:14:29 -07:00
James Zern
f7b4c48bba cosmetics: remove some extraneous 'extern's
Change-Id: Ib3f0cff37120c51633387dd1c46592c53ab0ba6d
2014-08-05 22:14:24 -07:00
James Zern
b47fb00ac0 vp8enci.h: cosmetics: fix '*' placement
associate with the type

Change-Id: Icf94f11bf79f6ccee3150e27b228755f8f3f0f37
2014-08-05 22:14:12 -07:00
skal
b5a36cc9ad add -near_lossless [0..100] experimental option
This compresses the uimage using lossless compression and controlable
decimating pre-process.
Code is under WEBP_EXPERIMENTAL_FEATURE while it's being experimented with.

Change-Id: I8b7f4cfcc3c6afc52a556102842bdbb045ed5ee8
2014-08-05 19:17:10 +02:00
James Zern
0524d9e5e8 dsp: detect mips64 & disable mips32 code
Change-Id: Icf68dafd5cf0614ca25b36a0252caa1784ac8059
2014-08-01 21:18:53 -07:00
James Zern
29a9fe222a libwebp 0.4.1
- 7/24/14: version 0.4.1
   This is a binary compatible release.
   * AArch64 (arm64) & MIPS support/optimizations
   * NEON assembly additions:
     - ~25% faster lossy decode / encode (-m 4)
     - ~10% faster lossless decode
     - ~5-10% faster lossless encode (-m 3/4)
   * dwebp/vwebp can read from stdin
   * cwebp/gif2webp can write to stdout
   * cwebp can read webp files; useful if storing sources as webp lossless
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJT1xp9AAoJEPnD1r24IytdjDEP/3ZOnrWG0OIThlGE6bqgO3oy
 Y5O7RrvzFuPdGEZ1Kl9jDXjzsYY018/+HJmOD3kf+Qt/+F/8hpGH520VuEiJdVIW
 UcvoYaYq9xrmKNqEJx910Vh8TP7wE2T62OJcqKWg2JEczfUWn8WOKjmM5c8N1kJ2
 q6EbpCdWlxcD49L/MavJ5Yfw9jSZAjKzOIxxz0C294iMTK4IcSmeVvdqhkdyh96E
 CABw3o8sJfqB6p+KXjweXcE2KOhvzAWqTRcIogDC0jV/PgOlindf6k0am2FJHvMM
 A+sf/pmD0YKI1vEaXW+Vs6cz6LzvwbIkJSwuzBA7FYHAG5yqTSkQDxTSttw/RwiW
 fUScqHjQVBUqkM5bdOsdYBSDutQKDF2+WfcK5jXFdnydkQi59HKHV2R0K5cXYqfN
 Tu7aMBqFcfGunLlzfKCJcz8SElEmUjG6oAzRZYcdM9dmnR7ypQK17A/GbaysKKOE
 HMmep7uNX25w+6AL7zExnmPPPtSz+kj1SXt9fgldkelDhg1faAgfwXb/N4E+00lA
 1+aJD3gHcR4QnDI4gnKBKHyIktQPfNKMQ6xuL0oyvsalQ/loz08wu0aACcGDFrg4
 uOVVxTqU+pEITuwGcNk228+O2EbMWzzi3+Vhi1v3Gg3jJ3TRB3QN6NohmrsIackL
 4W2V5NoX5i2VizGfLy2g
 =GWd5
 -----END PGP SIGNATURE-----

Merge tag 'v0.4.1'

libwebp 0.4.1
- 7/24/14: version 0.4.1
  This is a binary compatible release.
  * AArch64 (arm64) & MIPS support/optimizations
  * NEON assembly additions:
    - ~25% faster lossy decode / encode (-m 4)
    - ~10% faster lossless decode
    - ~5-10% faster lossless encode (-m 3/4)
  * dwebp/vwebp can read from stdin
  * cwebp/gif2webp can write to stdout
  * cwebp can read webp files; useful if storing sources as webp lossless

* tag 'v0.4.1':
  update ChangeLog
  iosbuild.sh: specify optimization flags
  update ChangeLog
  makefile.unix: add vwebp.1 to the dist target
  update ChangeLog
  gif2webp: dust up the help message
  remove -noalphadither option from README/vwebp.1
  update NEWS for the next release
  update AUTHORS
  bump version to 0.4.1
  restore mux API compatibility
  remove the !WEBP_REFERENCE_IMPLEMENTATION tweak in Put8x8uv
  restore encode API compatibility
  restore decode API compatibility
  gif2webp: fix compile with giflib 5.1.0
  gif2webp: simplify giflib version checking

Change-Id: Icf599f29bc6c0db757bc133aaddb3dbbbc316e08
2014-07-29 18:06:58 -07:00
James Zern
85213b9bbe bump version to 0.4.1
libwebp{,decoder} - 0.4.1
libwebp libtool - 5.1.0
libwebpdecoder libtool - 1.1.0

mux/demux - 0.2.1
libtool - 1.1.0

Change-Id: If593a198f802fd68c7dbbdbe0fc2612dbc44e2df
2014-07-23 17:17:25 -07:00
James Zern
695f80ae25 Merge "restore mux API compatibility" into 0.4.1 2014-07-23 17:11:33 -07:00