1914 Commits

Author SHA1 Message Date
James Zern
d2afe974f9 thread: use CreateThread for windows phone
_beginthreadex is unavailable for winrt/uwp

Change-Id: Ie7412a568278ac67f0047f1764e2521193d74d4d
2015-11-23 23:00:40 -08:00
James Zern
0fd0e12bfe thread: use WaitForSingleObjectEx if available
Windows XP and up

Change-Id: Ie1a46a82722b8624437c8aba0aa4566a4b0b3f57
2015-11-23 23:00:05 -08:00
James Zern
63fadc9ffa thread: use InitializeCriticalSectionEx if available
Windows Vista / Server 2008 and up

Change-Id: I32c5b4e5384d614c5a821ef511293ff014c67966
2015-11-23 22:58:28 -08:00
James Zern
110ad5835e thread: use native windows cond var if available
Vista / Server 2008 and up. no speed difference observed.

Change-Id: Ice19704777cb679b290dc107a751a0f36dd0c0a9
2015-11-23 22:58:11 -08:00
James Zern
158763dea3 Merge "always call WebPInitSamplers(), don't try to be smart" 2015-11-23 22:23:21 +00:00
Pascal Massimino
3770f3bbb6 Merge "cleanup the YFIX/TFIX difference by removing some code and #define" 2015-11-23 20:47:42 +00:00
James Zern
a40f60a9b4 Merge "3% speed improvement for lossless webp encoder for low effort mode:" 2015-11-23 20:44:15 +00:00
Pascal Massimino
ed1c2bc655 always call WebPInitSamplers(), don't try to be smart
if FANCY_UPSAMPLING was not defined but io->fancy_upsampling was set,
then the call to WebPInitSamplers() was skipped -> boom.

Change-Id: Id63e2ecc09f532fbe2ec9936d9ce4b502ba8fac5
2015-11-23 09:53:52 -08:00
Lode Vandevenne
b8c44f1aa4 3% speed improvement for lossless webp encoder for low effort mode:
prevent updating unused histogram.

Benchmark on 1000 PNGs, 30 iterations, lossless, quality 0, method 0:
before: Compression (output/input): 2.9120/3.2667 bpp, Encode rate (raw data): 34.578 MP/s
after: Compression (output/input): 2.9120/3.2667 bpp, Encode rate (raw data): 36.980 MP/s

Change-Id: Id62759d4d111a6ba41c85c611a15d4f6ffc9f935
2015-11-22 09:12:54 +01:00
Pascal Massimino
997e103871 cleanup the YFIX/TFIX difference by removing some code and #define
no speed or output difference

Change-Id: I50bfb44f357e19431457b1cf9504a5a6bcce1945
2015-11-21 23:51:58 -08:00
Lode Vandevenne
1f9be97c22 Make discarding invisible RGB values (cleanup alpha) the default.
Rename the flag to exact instead of the opposite cleanup_alpha. Add the flag to
WebPConfig. Do the cleanup in the webp encoder library rather than the cwebp
binary; this will be needed for the next stage: smarter alpha cleanup for
better compression, which cannot be done as preprocessing because it depends
on predictor choices in the encoder.

Change-Id: I2fbf57f918a35f2da6186ef0b5d85e5fd0020eef
2015-11-21 12:32:32 -08:00
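
To illustrate what "discarding invisible RGB values" means in the commit above, here is a minimal sketch. It assumes packed 32-bit ARGB pixels and simply zeroes the color of fully transparent pixels unless an exact-style flag asks to preserve them. The function name and the choice of zero as the replacement value are assumptions for illustration, not the library's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: clears the RGB part of every fully transparent pixel.
// 'exact' mirrors the idea of the flag described in the commit: when it is
// non-zero, the hidden RGB values are left untouched.
static void ClearInvisiblePixels(uint32_t* argb, size_t num_pixels, int exact) {
  size_t i;
  assert(argb != NULL || num_pixels == 0);
  if (exact) return;  // caller wants the RGB values under alpha == 0 preserved
  for (i = 0; i < num_pixels; ++i) {
    if ((argb[i] >> 24) == 0) {
      argb[i] = 0;  // alpha is already 0; drop the invisible color too
    }
  }
}
```

Uniform values under transparent areas tend to compress better, which is presumably why the cleanup became the default.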
Pascal Massimino
b37b0179c5 fix for issue #275: don't compare to out-of-bound pointers
the original change triggered several internal API modifications.
This is to ensure that we're never computing a pointer that can
possibly wrap around, or differences between pointers that can
overflow.

no observed speed difference

Change-Id: I9c94dda38d94fecc010305e4ad12f13b8fda5380
2015-11-20 16:25:17 -08:00
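
The pointer-safety concern in the commit above can be sketched as follows. Instead of forming `pos + needed` (which may point past the buffer or wrap around) and comparing it with `end`, the check compares sizes. The function name is hypothetical and this is not the code from the fix itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Returns 1 if 'needed' bytes are available in [pos, end), 0 otherwise.
// 'pos' and 'end' must point into (or one past) the same buffer, pos <= end.
static int HasBytes(const uint8_t* pos, const uint8_t* end, size_t needed) {
  assert(pos != NULL && end != NULL && pos <= end);
  // Risky form: 'pos + needed <= end' computes a pointer that can lie far
  // beyond the buffer, which is undefined behavior and may wrap around.
  // Safe form: both pointers are valid, so their difference is well defined.
  return needed <= (size_t)(end - pos);
}
```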
Pascal Massimino
21735e06f7 speed-up trivial one-symbol decoding case for lossless
We now consider 3 special cases:
 * htree-group has only 1 code (no bit is read from bitstream)
 * htree-group has few enough literal symbols, so that all the bit
   codes can fit into a look-up table of less than 64 entries
 * htree-group has a trivial arb literal (not GREEN!), like before

No overall speed change.

Change-Id: I6077fa0b7e5c31a6c67aa8aca859c22cc50ee254
2015-11-16 14:04:51 -08:00
Urvang Joshi
397863bd66 Refactor CopyPlane() and CopyPixels() methods: put them in utils.
Change-Id: I0e1533df557a0fa42c670e3b826fc0675c36e0a5
2015-11-13 11:39:22 -08:00
Urvang Joshi
6ecd72f845 Re-enable encoding of alpha plane with color cache for next release.
This is a revert of: https://chromium-review.googlesource.com/#/c/73607/

Change-Id: I7ec45277d73608d77d5e873290c6c185caa30c32
2015-11-13 07:15:19 +00:00
Pascal Massimino
775d3a373c remove unused fields from WebPDecoderOptions and WebPBitstreamFeatures
Change-Id: I92692d2975644dba10a7ac54f5c0f63ebd1580e6
2015-11-13 00:16:29 +01:00
Urvang Joshi
c13245c7d8 AnimEncoder: Add a GetError() method.
We now get error string instead of printing it.
The verbose option is now only used to print info and warnings.

Change-Id: I985c5acd427a9d1973068e7b7a8af5dd0d6d2585
2015-11-11 16:14:09 -08:00
Urvang Joshi
688b265d5e AnimDecoder API: Add a GetDemuxer() method.
Change-Id: Ic6a86e8788f1a3e21d1287ece36d80d1153b8f5a
2015-11-11 10:36:17 -08:00
Urvang Joshi
1aa4e3d6ba WebPAnimDecoder: add an option to enable multi-threaded decoding.
Change-Id: I3ff12bc07fc5a1b57a6950afa0e5f54a12985e75
2015-11-11 10:34:42 -08:00
Urvang Joshi
3584abca16 AnimDecoder: option to decode to common color modes.
Change-Id: I77ddab9abe3c4b35a9bcfe4c90b3e43d3aef166d
2015-11-10 09:27:59 -08:00
Urvang Joshi
945cfa3b7c mux.h does NOT need to include encode.h
It was needed earlier for WebPAnimEncoder API when it was using structs
like WebPConfig, but it only uses pointers to those now.

Change-Id: Ic0c144966421c678e8ef54b3fa81574bb2c9cd08
2015-11-09 15:40:09 -08:00
Pascal Massimino
bfd3fc02df ~2x faster SSE2 RGB24toY, BGR24toY, ARGBToY|UV
global effect is ~2% faster encoding from JPG source
and ~8% faster lossless-webp source decoding to PGM (e.g.)

Also revamped the YUVA case to first accumulate R/G/B values into a 16b
temporary buffer, and then do the UV conversion.
-> New function: WebPConvertRGBA32ToUV

Change-Id: I1d7d0c4003aa02966ad33490ce0fcdc7925cf9f5
2015-11-06 15:02:01 -08:00
Pascal Massimino
52fdbdfe66 extract some RGB24 to Luma conversion function from enc/ to dsp/
Just for RGB24/BGR24 for now, which are the hard-to-optimize ones.
SSE2 implementation coming next.

ConvertRowToY() should go into dsp/ too, at some point.

Change-Id: Ibc705ede5cbf674deefd0d9332cd82f618bc2425
2015-10-30 00:28:11 -07:00
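
As a rough picture of the RGB24-to-luma conversion this commit moves into dsp/, here is a plain-C sketch of a row converter. The coefficients are the common BT.601 limited-range weights in 16-bit fixed point; they, and all names below, are assumptions for illustration rather than a copy of the library's code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { FIX = 16, HALF = 1 << (FIX - 1) };  // fixed-point precision and rounder

// Converts one pixel to limited-range luma (16..235), assuming BT.601 weights.
static uint8_t RGBToLuma(int r, int g, int b) {
  const int luma = 16839 * r + 33059 * g + 6420 * b;
  return (uint8_t)((luma + HALF + (16 << FIX)) >> FIX);
}

// Converts 'width' packed R,G,B triplets to one luma byte each.
// A BGR24 variant would only swap the first and third byte.
static void ConvertRGB24RowToY(const uint8_t* rgb, uint8_t* y, int width) {
  int i;
  assert(rgb != NULL && y != NULL && width >= 0);
  for (i = 0; i < width; ++i, rgb += 3) {
    y[i] = RGBToLuma(rgb[0], rgb[1], rgb[2]);
  }
}
```

The 3-byte stride is what makes these formats awkward to vectorize: pixels do not fall on natural SIMD lane boundaries.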
Pascal Massimino
ab8c2300b6 add missing \n
Change-Id: I0c9236bbeef5868629d4dc02e3fae6e79ca55949
2015-10-30 00:02:27 -07:00
James Zern
5bd04a087c sync versions with 0.4.4
libwebp{,decoder} - 0.4.4
libwebp libtool - 5.4.0
libwebpdecoder libtool - 1.4.0

mux/demux - 0.2.2 (unchanged)
libtool - 1.2.0 (unchanged)

(cherry picked from commit 62864042c0)

Change-Id: I7d421dc47ad4d25a17450ce1b04562c5d58c596b
2015-10-28 23:43:40 -07:00
Pascal Massimino
8f1fcc15af Merge "Move ARGB->YUV functions from dec/vp8l.c to dsp/yuv.c" 2015-10-29 06:38:52 +00:00
Pascal Massimino
fa8927efe4 Move ARGB->YUV functions from dec/vp8l.c to dsp/yuv.c
also switch to using ExtractAlpha() instead of hard-coding the loop.

The ARGBToY/UV functions are rather easy to port to SSE2 / NEON.

Change-Id: I8f1346a9ca427a36ce2d6c848369ca7964d8b3c7
2015-10-28 01:45:08 -07:00
James Zern
f7c507a5f8 Merge "remove unnecessary #include "yuv.h"" 2015-10-27 21:54:21 +00:00
Pascal Massimino
14e4043b67 remove unnecessary #include "yuv.h"
Change-Id: I8b277433663e063e7a182f66818afec1654a39bd
2015-10-27 01:27:36 -07:00
Pascal Massimino
d64d376c2a change WEBP_ALIGN_CST value to 31
(and make dec/frame.c use the common macros too)

Change-Id: Ie44dbd82e067934b17ca3ffba4dd45ab0d61d3f6
2015-10-19 21:39:55 +00:00
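
An alignment constant of 31 corresponds to rounding addresses up to a 32-byte boundary. The sketch below shows the usual add-and-mask idiom; the macro and function names are hypothetical stand-ins, not the library's definitions.

```c
#include <assert.h>
#include <stdint.h>

#define ALIGN_CST 31  // hypothetical name; 31 means "align to 32 bytes"
// Rounds 'v' up to the next multiple of (ALIGN_CST + 1).
#define ALIGN_UP(v) (((uintptr_t)(v) + ALIGN_CST) & ~(uintptr_t)ALIGN_CST)

// Returns the first 32-byte-aligned address at or after 'raw'. The caller is
// expected to have over-allocated by ALIGN_CST bytes to leave room for this.
static uint8_t* AlignPointer(uint8_t* raw) {
  assert(raw != 0);
  return (uint8_t*)ALIGN_UP(raw);
}
```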
James Zern
f717b82864 vp8l.c, cosmetics: fix indent after 95509f9
95509f9 large re-organization of the delta-palettization code

Change-Id: I9d27f15cb6072a2bd1dd593d53db5b2dd3c30133
2015-10-19 12:28:57 -07:00
James Zern
927ccdc43b Merge "fix alignment of allocated memory in AllocateTransformBuffer" 2015-10-19 19:15:04 +00:00
Pascal Massimino
fea94b2b36 fix alignment of allocated memory in AllocateTransformBuffer
likely to avoid unaligned reads in the future

Change-Id: I434ba17c139ad6e190ebd9b909b241c6c6f1e7f8
2015-10-18 13:09:22 -07:00
Pascal Massimino
5aa8d61f75 Merge "MIPS: rescaler code synced with C implementation" 2015-10-17 07:52:36 +00:00
Djordje Pesut
e7fb267df7 MIPS: rescaler code synced with C implementation
Change-Id: I4cec115d3fe6f3f825084d7388249694c500256a
2015-10-17 00:16:27 -07:00
Pascal Massimino
93c86ed5b9 Merge "format_constants.h: MKFOURCC, correct cast" 2015-10-17 05:45:05 +00:00
James Zern
5d791d2603 format_constants.h: MKFOURCC, correct cast
'd' should be promoted to uint32 before shifting by 24

Change-Id: I6212661af3802709b0098af8402ed73a0d9373ee
2015-10-16 18:43:40 -07:00
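
The cast issue behind this commit can be shown with a small stand-alone macro. A character constant has type int in C, so shifting a byte value of 0x80 or more left by 24 overflows a 32-bit signed int. Casting to uint32_t first avoids that. The macro below is an illustrative stand-in with a hypothetical name, not the definition from format_constants.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Packs four bytes into a 32-bit tag, least significant byte first.
// Every operand is promoted to uint32_t before shifting.
#define MAKE_FOURCC(a, b, c, d)                                  \
  ((uint32_t)(a) | (uint32_t)(b) << 8 | (uint32_t)(c) << 16 |    \
   (uint32_t)(d) << 24)

// Convenience wrapper for a 4-character tag such as "RIFF" or "WEBP".
static uint32_t PackTag(const char tag[4]) {
  assert(tag != NULL);
  return MAKE_FOURCC((uint8_t)tag[0], (uint8_t)tag[1],
                     (uint8_t)tag[2], (uint8_t)tag[3]);
}
```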
James Zern
65726cd3a7 dsp/lossless: Average2, make a constant unsigned
use 'u' rather than the unnecessary 'l' as a suffix. this prevents a
conversion warning with some toolchains

Change-Id: I21c33ce08819b3c839c75e03a8f7f3a6041d0695
2015-10-16 18:39:42 -07:00
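
A sketch of the kind of function this commit touches: averaging two packed 32-bit pixels channel by channel without unpacking them. With an 'l' suffix the mask would have type long, which is 64 bits on many platforms, so the whole expression would be computed in a wider signed type and then narrowed, presumably the source of the conversion warning. The code below is an illustration written for this listing; the helper AverageRows is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Per-channel average (rounded down) of two packed 4x8-bit pixels.
// a + b == 2 * (a & b) + (a ^ b), so the average is (a & b) + ((a ^ b) >> 1).
// The 0xfefefefeu mask drops each channel's low bit before the shift, so no
// bit leaks into the neighboring channel.
static uint32_t Average2(uint32_t a0, uint32_t a1) {
  return (((a0 ^ a1) & 0xfefefefeu) >> 1) + (a0 & a1);
}

// Hypothetical helper applying Average2 to two rows of pixels.
static void AverageRows(const uint32_t* a, const uint32_t* b,
                        uint32_t* out, size_t n) {
  size_t i;
  assert(a != NULL && b != NULL && out != NULL);
  for (i = 0; i < n; ++i) out[i] = Average2(a[i], b[i]);
}
```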
Johann
d26d9def80 Use __has_builtin to check clang support
Older versions of Xcode with clang reporting versions 4.[012] and 5.0
did not include support for __builtin_bswap16. Checking in this manner
avoids using brittle version checks.

Matches a change to libvpx:
https://chromium-review.googlesource.com/305573
to fix:
https://code.google.com/p/webm/issues/detail?id=1082

Change-Id: I23ea466ee1b53b12cd3fb45f65a2186c8dda95a1
2015-10-14 17:48:08 -07:00
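
The feature-check approach described in this commit can be sketched like this. The code asks the compiler directly whether `__builtin_bswap16` exists instead of comparing version numbers, and falls back to portable shifts otherwise. The macro name HAVE_BUILTIN_BSWAP16 and the function name are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

// Compilers that do not provide __has_builtin get a fallback that answers
// "not available" for every query.
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

#if __has_builtin(__builtin_bswap16)
#define HAVE_BUILTIN_BSWAP16 1  // hypothetical macro name
#endif

// Swaps the two bytes of a 16-bit value.
static uint16_t ByteSwap16(uint16_t x) {
#if defined(HAVE_BUILTIN_BSWAP16)
  return __builtin_bswap16(x);
#else
  return (uint16_t)((x >> 8) | (x << 8));  // portable fallback
#endif
}
```

Either branch gives the same result, so the check only affects which instruction sequence the compiler is likely to emit.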
Pascal Massimino
12ec204ec7 moved ALIGN_CST into util/utils.h and renamed WEBP_ALIGN_xxx
Note that ALIGN_CST is still kept different in dec/frame.c for now,
because the value is 31 there, not 15. We might re-unite these two
later.

Change-Id: Ibbee607fac4eef02f175b56f0bb0ba359fda3b87
2015-10-14 00:03:14 -07:00
Pascal Massimino
67c547fdcd rescaler: ~20% faster SSE2 implementation for lossless ImportRowExpand
lossy (1-channel) speed-up is more on the 5% side.

Change-Id: Id19d97b9e9a34804b59604a5b48f94a37fdafd62
2015-10-14 07:32:12 +02:00
Pascal Massimino
99e3f8128a Merge "large re-organization of the delta-palettization code" 2015-10-14 05:11:47 +00:00
Pascal Massimino
95509f9914 large re-organization of the delta-palettization code
same functionality, but better code layout.

What changed:
  * don't trash the palette_[] in EncodePalette(), so it can be re-used
  * split generation of image from bit-stream coding
  * move all the delta-palette code to delta_palettization.c, and only have 1 entry point there WebPSearchOptimalDeltaPalette()
  * minimize the number of "#ifdef WEBP_EXPERIMENTAL_FEATURES" in vp8l.c
  * clarify the TransformBuffer stuff. more clean-up to come here...

This should make experimenting with delta-palettization easier and more compartmentalized.

Change-Id: Iadaa90e6c5b9dabc7791aec2530e18c973a94610
2015-10-14 00:25:42 +02:00
Pascal Massimino
74fb458bbc fix for weird msvc warning message
" warning C4098: 'RescalerImportRowShrinkSSE2' : 'void' function returning a value"

Change-Id: Ifa893502e3e4b394910e142d954393dda9d59d1a
2015-10-10 22:35:59 -07:00
Pascal Massimino
ae49ad8641 Merge "SSE2 implementation of ImportRowShrink" 2015-10-10 06:02:24 +00:00
Pascal Massimino
932fd4df61 SSE2 implementation of ImportRowShrink
some limitations: only for RGBA output,
and if reduction factor is not too small (dst_width > src_width / 128)

20-25% faster, ~4-6% global improvement total decoding.

Change-Id: I95366ddaa4a38e0a96bed754dfe790126f7bb84a
2015-10-09 13:04:54 -07:00
Pascal Massimino
b0c9d8af32 label rename: NO_CHANGE -> NoChange
Change-Id: I5b2beb93169d7c2bc95e6cdeb57770fc44b4963f
2015-10-07 22:53:34 -07:00
skal
b4e731cd93 neon-implementation for rescaler code
It's better to stay with a 32b fixed-point precision overall, otherwise
the C-version on ARM gets *slower*.
Actually, gcc ARM compiler optimizes some instructions pretty
well when WEBP_RESCALER_FIX is exactly 32, even in C.

Change-Id: I0eea97f7db5947470f5af355dee098eca81e178d
2015-10-07 21:18:39 -07:00
Pascal Massimino
6dfa5e3e58 rescaler: better handling of the fxy_scale=0 special case.
Change-Id: I635cb62c028e373a54fcafdc6b996812a9b2ace5
2015-10-07 17:53:16 -07:00
Pascal Massimino
55c05293d5 Revert "rescaler: better handling of the fxy_scale=0 special case."
This reverts commit 9f226bf8c3.

I dropped a 'dst_height' from 'ratio'!! My bad...

Change-Id: Id355f0f012a754cddf97012715d69aa5e03c2e5c
2015-10-07 17:49:24 -07:00