Commit Graph

2027 Commits

Author SHA1 Message Date
James Zern
71100500a8 bump version to 0.5.0
libwebp{,decoder} - 0.5.0
libwebp libtool - 6.0.0
libwebpdecoder libtool - 2.0.0

mux/demux - 0.3.0
libtool - 2.0.0

Change-Id: I5346d13eb827fb5890efbb63ff3f28cea9d0c55f
2015-12-17 19:45:14 -08:00
James Zern
d48e427b1d Merge "demux: accept raw bitstreams" 2015-12-17 22:52:10 +00:00
James Zern
99a01f4f8b Merge "Unify some entropy functions." 2015-12-17 22:35:29 +00:00
James Zern
4b025f10f7 Merge "configure: disable asserts by default" 2015-12-17 22:28:37 +00:00
James Zern
92cbddf89c Merge "fix PrintBlockInfo()" 2015-12-17 21:00:57 +00:00
Vincent Rabaud
ca509a3362 Unify some entropy functions.
The code and logic is unified when computing bit entropy + Huffman cost.

Speed-wise, we gain 8% for lossless encoding.
Logic-wise, the beginning/end of the distributions are handled properly
and the compression ratio does not change much.

Change-Id: Ifa91d7d3e667c9a9a421faec4e845ecb6479a633
2015-12-17 17:00:08 +01:00
Pascal Massimino
367bf903b3 fix PrintBlockInfo()
... which has gone out of sync since the last block-cache layout change.

Change-Id: Ic441ec07b0198b508ce3fd34ab582cb60b1daabc
2015-12-17 15:47:25 +01:00
Pascal Massimino
b0547ff0b4 move back common constants for lossless_enc*.c into the .h
Change-Id: I11bc979db691f6518d85e2e1c3ac7f05d69681b0
2015-12-17 15:11:56 +01:00
Lode Vandevenne
fb4c7832f1 lossless: simpler alpha cleanup preprocessing
setting all transparent pixels to black rather than the "flatten" method.

0.3% smaller filesize on the 1000 PNGs if alpha cleanup is used (before: 18685774, after: 18622472)

Change-Id: Ib0db9e7ccde55b36e82de07855f2dbb630fe62b1
2015-12-17 15:04:50 +01:00
Vincent Rabaud
47ddd5a4cc Move some codec logic out of ./dsp .
The functions containing magic constants are moved out of ./dsp .
VP8LPopulationCost got put back in ./enc
VP8LGetCombinedEntropy is now unrefined (refinement happening in ./enc)
VP8LBitsEntropy is now unrefined (refinement happening in ./enc)
VP8LHistogramEstimateBits got put back in ./enc
VP8LHistogramEstimateBitsBulk got deleted.

Change-Id: I09c4101eebbc6f174403157026fe4a23a5316beb
2015-12-17 07:03:25 +00:00
James Zern
357f455dec yuv_sse2: fix 32-bit visual studio build
src\dsp\yuv_sse2.c : C2719: 'in': formal parameter with
  __declspec(align('16')) won't be aligned
src\dsp\yuv_sse2.c : C2719: 'out': formal parameter with
  __declspec(align('16')) won't be aligned

Change-Id: Ifd79e33b35c70748faff19cd64eba4a8ffce5a5a
2015-12-16 15:04:36 -08:00
James Zern
b9d80fa4e8 configure: disable asserts by default
--enable-asserts can be used to avoid defining NDEBUG

Change-Id: I6216668e3f79f69bd8c453f0b36cecb3b585688e
2015-12-16 13:15:53 -08:00
Pascal Massimino
7badd3da4a cosmetic fix: sizeof(type) -> sizeof(*var)
Change-Id: I1a39fccfdcb9f0a4b9b025d3c9b522e8edfe7fd6
2015-12-16 18:29:14 +01:00
Vincent Rabaud
80ce27d34e Speed up 24-bit packing / unpacking in YUV / RGB conversions.
This implementation brings:
- an SSE implementation of packing / unpacking
- bigger buffers processed at the same time
The speedup is of 4% on lossy decoding (YUV to RGB), 0.5% on
lossy encoding (RGB to YUV was already optimized).

Change-Id: Iec677ee17f91c08614d1adab67c6df551925767f
2015-12-16 11:06:42 +01:00
Pascal Massimino
68eebcb0ff remove a TODO about rotation
(won't happen yet)

Change-Id: Ibb4ceccd1d7af0f76594e71062983dc311ba9aa2
2015-12-15 23:36:12 -08:00
Pascal Massimino
2dee2966df remove few obsolete TODO about aligned loads in SSE2
Change-Id: I3628602942ea2ce34dbcb85975d15afc1041f76c
2015-12-15 23:00:41 -08:00
Pascal Massimino
e0c0bb3480 remove TODO about unused ref_lf_delta[]
Change-Id: I54983c0dfc6927564143bad56bd2e4c4cdfefc0e
2015-12-15 22:57:53 -08:00
Pascal Massimino
9cf1cc2bd6 remove few TODO:
* 256 -> RD_DISTO_MULT
  * don't use TDisto for UV mode picking

Change-Id: I243148c716fe688b5c1b1fb9b7a6e58d0b5e6835
2015-12-15 22:52:12 -08:00
James Zern
791896455a Merge changes from topic 'demux-fragment-cleanup'
* changes:
  demux: remove GetFragment()
  demux: remove dead fragment related TODO
  demux, Frame: remove is_fragment_ field
  demux,WebPIterator: remove fragment_num/num_fragments
  demux: remove WebPDemuxSelectFragment
2015-12-16 06:45:00 +00:00
James Zern
47399f92b0 demux: remove GetFragment()
Change-Id: Ibea117b64ca91ccafde80411c10e0035dc3247f3
2015-12-15 19:26:13 -08:00
James Zern
d3cfb79ad6 demux: remove dead fragment related TODO
Change-Id: Iea6bf4742f803af46cd18f5d26843548e1b5cf00
2015-12-15 17:44:17 -08:00
James Zern
ab714b8ac4 demux, Frame: remove is_fragment_ field
this hasn't been set since parsing of the experimental chunk was
removed.
+ cleanup IsValidExtendedFormat(). is_fragmented has caused immediate
  failure since:
  4e2589f demux: restore strict fragment flag check

Change-Id: If9ecfc19556297100a6d5de1ba2cffdcbdc6c8fd
2015-12-15 17:43:23 -08:00
James Zern
b105921c7d yuv_sse2, cosmetics: fix indent
+ remove unneeded header

Change-Id: I3247378fd3315d95bb3345625d3575aa9e05c1b8
2015-12-15 17:29:04 -08:00
James Zern
466c92e829 demux,WebPIterator: remove fragment_num/num_fragments
these are remnants of an unused experiment

Change-Id: Ia08f9e6a895d5afff41a49f6e680fd76f024a5ee
2015-12-15 14:06:34 -08:00
James Zern
11714ff158 demux: remove WebPDemuxSelectFragment
this never had any affect, fragments were an abandoned experiment

Change-Id: Ifef15486a04cdb58f89f7faf56c31fd0a06e44ab
2015-12-15 13:02:32 -08:00
Pascal Massimino
c0f7cc47f2 fix for bug #280: UMR in next->bits
exit as early as possible upon error.

Change-Id: I4f7702228a146c31cab3c3d21079fa1fe6904cb2
2015-12-15 14:05:13 +01:00
James Zern
d4f9c2efd4 enc/Makefile.am: add missing headers
Change-Id: Ic29497f425909eda1a7f23e6c8e92bd4ca17d44b
2015-12-14 23:07:54 -08:00
James Zern
3f3ea2c539 demux: accept raw bitstreams
i.e., allow the WebP file header and leading chunk header to be
stripped.

Change-Id: I43818ed56a14881d9ad11aaddcd0bf5c0ca6aef3
2015-12-14 20:14:54 -08:00
Sriraman Tallam
b275e598b5 fix optimized build with -mcmodel=medium
INFO: From Compiling src/dsp/cpu.c:
src/dsp/cpu.c: In function 'x86CPUInfo':
src/dsp/cpu.c:36:3: inconsistent operand constraints in an 'asm'

With PIC and mcmodel=medium, the %rbx register must be saved and
restored which causes this problem.  This was also solved in GCC-4.9 with
this patch:
https://gcc.gnu.org/ml/gcc-patches/2012-12/msg01484.html

Tested:
Builds fine with this change.

Change-Id: Icca8eea7bf5af3ef9f17f6ae2886e3430143febf
2015-12-11 16:49:10 -08:00
Pascal Massimino
038a060dfc Merge "add disto-based refinement for UV mode (if method = 1 or 2)" 2015-12-11 23:57:25 +00:00
Vincent Rabaud
2835089d6a Provide an SSE2 implementation of CombinedShannonEntropy.
CombinedShannonEntropy takes 30% for lossless compression.
This implementation speeds up the overall process by 2 to 3 %.

Change-Id: I04a71743284c38814fd0726034d51a02b1b6ba8f
2015-12-11 15:12:19 +01:00
Pascal Massimino
e6c9351918 add disto-based refinement for UV mode (if method = 1 or 2)
This doesn't slow down much and give some quality improvement.

Change-Id: I5afbe62b9c3922b3ec1bf6538c68dcdb0f25d2e4
2015-12-11 03:15:59 -08:00
James Zern
04507dc91f Merge "fix undefined behaviour during shift, using a cast" 2015-12-11 06:32:45 +00:00
Vincent Rabaud
d3d163972f Optimize the heap usage in HistogramCombineGreedy.
The previous priority system used a heap which was too heavy to
maintain (what was gained from insertions / deletions was lost
due to a linear that still happened on the heap for invalidation).
The new structure is a priority queue where only the head is
ordered.

Change-Id: Id13f8694885a934fe2b2f115f8f84ada061b9016
2015-12-10 12:44:11 +01:00
Pascal Massimino
202a710b26 fix undefined behaviour during shift, using a cast
Change-Id: Ibca261d01092cecf8b37c54e9fcc920c9527c0a9
2015-12-10 08:09:23 +01:00
Pascal Massimino
14d27a46be improve method #2 by merging DistoRefine() and
SimpleQuantize()

it's now a single function, that reconstructs the intra4x4 block during the scan
The I4_PENALTY had to be adjusted.

Overall, result is better quality-wise (esp. at q < 50), and a tad faster too.

method #0, #1 and #3+ are unchanged

Change-Id: If262aeb552397860b3dd532df8df6b1357779222
2015-12-10 08:04:04 +01:00
Pascal Massimino
cb1ce9969c Merge "10% faster table-less SSE2/NEON version of YUV->RGB conversion" 2015-12-09 10:41:24 +00:00
Pascal Massimino
ac761a3738 10% faster table-less SSE2/NEON version of YUV->RGB conversion
* Precision is slightly different
* also implemented in SSE2 the missing WebPUpsamplers for MODE_ARGB, MODE_Argb, MODE_RGB565, etc.
* removing yuv_tables_sse2.h saved ~8k of binary size
* the mips32/mips_dsp_r2 code is disabled for now, since it has drifted away
* the NEON code is somewhat tricky

Change-Id: Icf205faa62cf46c2825d79f3af6725dc1ec7f052
2015-12-08 20:05:56 -08:00
Pascal Massimino
7eb01ff3e8 Merge "Improved alpha cleanup for the webp encoder when prediction transform is used." 2015-12-08 11:32:37 +00:00
Pascal Massimino
fb8c9106c7 Merge "introduce WebPMemToUint32 and WebPUint32ToMem for memory access" 2015-12-08 11:32:05 +00:00
James Zern
bd91af200a Merge "bit_reader: remove aarch64 BITS TODO" 2015-12-08 07:39:11 +00:00
Vincent Rabaud
6c702b81ac Speed up hash chain initialization using memset.
That gains 1% on lossy compression.

Change-Id: Ib9aa210194ed2f17eaff85b499b55cc4eb99ff11
2015-12-07 11:54:50 +01:00
James Zern
464ed10fa9 bit_reader: remove aarch64 BITS TODO
set BITS=56 in this case as it's mildly better on iOS (Xcode 7) and
Android (r10e + gcc-4.9)

Change-Id: I3265021a3572987d01edfafd5c1431207f07a170
2015-12-04 19:58:23 -08:00
Lode Vandevenne
6938111357 Improved alpha cleanup for the webp encoder when prediction transform is used.
Gives 0.9% smaller (2.4% compared to before alpha cleanup) size on the 1000 PNGs dataset:
Alpha cleanup before: 18856614
Alpha cleanup after: 18685802
For reference, with no alpha cleanup: 19159992

Note: WebPCleanupTransparentArea is still also called in WebPEncode. This cleanup still helps
preprocessing in the encoder, and the cases when the prediction transform is not used.

Change-Id: I63e69f48af6ddeb9804e2e603c59dde2718c6c28
2015-12-04 13:50:56 +00:00
Pascal Massimino
2c08aac81a introduce WebPMemToUint32 and WebPUint32ToMem for memory access
it uses memcpy() when unaligned memory write is tricky

Change-Id: I5d966ca9d19e9b43ac90140fa487824116982874
2015-12-04 13:43:01 +00:00
Vincent Rabaud
010ca3d10d Fix FindMatchLength with non-aligned buffers.
The 32-bit buffers are actually rarely 64-bit aligned.
The new solution uses memcmp and is alignment agnostic.
It is also slightly faster.

Change-Id: I863003e9ee4ee8a3eed25b7b2478cb82a0ddbb20
2015-12-04 10:19:58 +01:00
James Zern
e4a7eed49d cosmetics: fix indent
Change-Id: I8be5152115618016e1e2a59fbfec78d5282ce57e
2015-12-03 00:53:59 -08:00
James Zern
0837512964 Merge "Make a separate case for low_effort in CopyImageWithPrediction" 2015-12-03 08:46:31 +00:00
James Zern
aa2eb2d4a1 Merge "cosmetics: fix indent" 2015-12-03 08:44:54 +00:00
James Zern
b7551e90e1 cosmetics: fix indent
Change-Id: I67e5a0308a964bc37b2314d96f3691fc0550e9bc
2015-12-03 00:34:15 -08:00
Lode Vandevenne
5bda52d4e8 Make a separate case for low_effort in CopyImageWithPrediction
for more speed.

This gives a roughly a 1% speedup for low_effort. But actually this is a
preparation for the upcoming CL that changes RGB values of transparent pixels
based on prediction, which should not be done for low_effort because that would
slightly hurt its performance.

On 1000 PNGs, with quality 0, method 0:
Before:
Compression (output/input): 2.9120/3.2667 bpp, Encode rate (raw data): 36.034 MP/s
After:
Compression (output/input): 2.9120/3.2667 bpp, Encode rate (raw data): 36.428 MP/s

Change-Id: I5ed9f599bbf908a917723f3c780551ceb7fd724d
2015-12-03 00:22:50 -08:00
Scott Hancher
5ae220bef6 backward_references.c: Fixed compiler warning
"Implicit conversion loses integer precision: 'long' to 'int'."

Change-Id: I1aec7431f84123e5280447883eb80b84a3821d91
2015-12-02 23:51:06 -08:00
Pascal Massimino
363babe255 Merge "fix some warning about unaligned 32b reads" 2015-12-02 10:29:40 +00:00
Vincent Rabaud
a141178255 Optimization in hash chain comparison for 64 bit
Arrays were compared 32 bits at a time, it is now done 64 bits at a time.
Overall encoding speed-up is only of 0.2% on @skal's small PNG corpus.
It is of 3% on my initial 1.3 Mp desktop screenshot image.

Change-Id: I1acb32b437397a7bf3dcffbecbcd4b06d29c05e1
2015-12-01 13:01:57 +01:00
Vincent Rabaud
829bd14145 Combine Huffman cost and bit entropy into one loop
The same computation was done for both values: go over two buffers,
sum them up, and take a decision on the sum at each iteration.

MIPS32 code has been disabled for now, pending a code update.

Change-Id: I997984326f7092b3dbb8cfa1e524bd8132b2ab9d
2015-11-30 13:57:25 +01:00
James Zern
a7a954c851 Merge "lossless: make prediction in encoder work per scanline" 2015-11-25 20:40:44 +00:00
Pascal Massimino
61b605b407 Merge "fix of undefined multiply (int32 overflow)" 2015-11-25 08:39:33 +00:00
Lode Vandevenne
239421c5ef lossless: make prediction in encoder work per scanline
instead of per block. This prepares for a next CL that can make the
predictors alter RGB value behind transparent pixels for denser
encoding. Some predictors depend on the top-right pixel, and it must
have been already processed to know its new RGB value, so requires per
scanline instead of per block.

Running the encode speed test on 1000 PNGs 10 times with default
settings:
Before:
Compression (output/input): 2.3745/3.2667 bpp, Encode rate (raw data): 1.497 MP/s
After:
Compression (output/input): 2.3745/3.2667 bpp, Encode rate (raw data): 1.501 MP/s

Same but with quality 0, method 0 and 30 iterations:
Before:
Compression (output/input): 2.9120/3.2667 bpp, Encode rate (raw data): 36.379 MP/s
After:
Compression (output/input): 2.9120/3.2667 bpp, Encode rate (raw data): 36.462 MP/s

No effect on compressed size, this produces exactly same files. No
significant measured effect on speed. Expected faster speed from better
memory layout with scanline processing but slower speed due to needing
to get predictor mode per pixel, may compensate each other.

Change-Id: I40f766f1c1c19f87b62c1e2a1c4cd7627a2c3334
2015-11-25 00:38:27 -08:00
Pascal Massimino
f5ca40e05f fix of undefined multiply (int32 overflow)
the problem was the incorporation of the extra constant 1<<16 in the kC1
constant, to emulate the addition. It's now removed and the addition is
performed explicitly.

No real speed difference observed.

cf. issue #278

Change-Id: I2c6499031571d98afff392fb5ebe21a5fa60722d
2015-11-24 23:18:31 -08:00
James Zern
5cd2ef4c4a Merge changes from topic 'win-threading-compat'
* changes:
  Makefile.vc: enable WEBP_USE_THREAD for windows phone
  thread: use CreateThread for windows phone
  thread: use WaitForSingleObjectEx if available
  thread: use InitializeCriticalSectionEx if available
  thread: use native windows cond var if available
2015-11-25 00:41:02 +00:00
James Zern
d2afe974f9 thread: use CreateThread for windows phone
_beginthreadex is unavailable for winrt/uwp

Change-Id: Ie7412a568278ac67f0047f1764e2521193d74d4d
2015-11-23 23:00:40 -08:00
James Zern
0fd0e12bfe thread: use WaitForSingleObjectEx if available
Windows XP and up

Change-Id: Ie1a46a82722b8624437c8aba0aa4566a4b0b3f57
2015-11-23 23:00:05 -08:00
James Zern
63fadc9ffa thread: use InitializeCriticalSectionEx if available
Windows Vista / Server 2008 and up

Change-Id: I32c5b4e5384d614c5a821ef511293ff014c67966
2015-11-23 22:58:28 -08:00
James Zern
110ad5835e thread: use native windows cond var if available
Vista / Server 2008 and up. no speed difference observed.

Change-Id: Ice19704777cb679b290dc107a751a0f36dd0c0a9
2015-11-23 22:58:11 -08:00
James Zern
912c9fdf0c dec/webp: use GetLE(24|32) from utils
picks up the undefined behavior fix from the previous commit

BUG=278

Change-Id: Ie17bf7db827b1dc564194aadcf6c5e47f61681f7
2015-11-23 22:53:27 -08:00
James Zern
f1694481a9 utils/GetLE32: correct uint32 promotion
avoids undefined behavior when shifting an int by 24.

BUG=278

Change-Id: I7b5ad96715002c8f425d81789bb75f22c176ab76
2015-11-23 22:51:33 -08:00
James Zern
158763dea3 Merge "always call WebPInitSamplers(), don't try to be smart" 2015-11-23 22:23:21 +00:00
Pascal Massimino
3770f3bbb6 Merge "cleanup the YFIX/TFIX difference by removing some code and #define" 2015-11-23 20:47:42 +00:00
James Zern
a40f60a9b4 Merge "3% speed improvement for lossless webp encoder for low effort mode:" 2015-11-23 20:44:15 +00:00
Pascal Massimino
ed1c2bc655 always call WebPInitSamplers(), don't try to be smart
if FANCY_UPSAMPLING was not defined but io->fancy_upsampling was set,
then the call to WebPInitSamplers() was skipped -> boom.

Change-Id: Id63e2ecc09f532fbe2ec9936d9ce4b502ba8fac5
2015-11-23 09:53:52 -08:00
Lode Vandevenne
b8c44f1aa4 3% speed improvement for lossless webp encoder for low effort mode:
prevent updating unused histogram.

Benchmark on 1000 PNGs, 30 iterations, lossless, quality 0, method 0:
before: Compression (output/input): 2.9120/3.2667 bpp, Encode rate (raw data): 34.578 MP/s
after: Compression (output/input): 2.9120/3.2667 bpp, Encode rate (raw data): 36.980 MP/s

Change-Id: Id62759d4d111a6ba41c85c611a15d4f6ffc9f935
2015-11-22 09:12:54 +01:00
Pascal Massimino
997e103871 cleanup the YFIX/TFIX difference by removing some code and #define
no speed or output difference

Change-Id: I50bfb44f357e19431457b1cf9504a5a6bcce1945
2015-11-21 23:51:58 -08:00
Lode Vandevenne
1f9be97c22 Make discarding invisible RGB values (cleanup alpha) the default.
Rename the flag to exact instead of the opposite cleanup_alpha. Add the flag to
WebPConfig. Do the cleanup in the webp encoder library rather than the cwebp
binary, this will be needed for the next stage: smarter alpha cleanup for
better compression which cannot be done as a preprocessing due to depending on
predictor choices in the encoder.

Change-Id: I2fbf57f918a35f2da6186ef0b5d85e5fd0020eef
2015-11-21 12:32:32 -08:00
Pascal Massimino
b37b0179c5 fix for issue #275: don't compare to out-of-bound pointers
the original change triggered several internal API modifs.
This is to ensure that we're never computing pointer that can
possibly wrap around, or differences between pointers that can
overflow.

no observed speed difference

Change-Id: I9c94dda38d94fecc010305e4ad12f13b8fda5380
2015-11-20 16:25:17 -08:00
Pascal Massimino
21735e06f7 speed-up trivial one-symbol decoding case for lossless
We now consider 3 special cases:
 * htree-group has only 1 code (no bit is read from bitstream)
 * htree-group has few enough literal symbols, so that all the bit
   codes can fit into a look-up table of less than 64 entries
 * htree-group has a trivial arb literal (not GREEN!), like before

No overall speed change.

Change-Id: I6077fa0b7e5c31a6c67aa8aca859c22cc50ee254
2015-11-16 14:04:51 -08:00
Urvang Joshi
397863bd66 Refactor CopyPlane() and CopyPixels() methods: put them in utils.
Change-Id: I0e1533df557a0fa42c670e3b826fc0675c36e0a5
2015-11-13 11:39:22 -08:00
Urvang Joshi
6ecd72f845 Re-enable encoding of alpha plane with color cache for next release.
This is a revert of: https://chromium-review.googlesource.com/#/c/73607/

Change-Id: I7ec45277d73608d77d5e873290c6c185caa30c32
2015-11-13 07:15:19 +00:00
Pascal Massimino
775d3a373c remove unused fields from WebPDecoderOptions and WebPBitstreamFeatures
Change-Id: I92692d2975644dba10a7ac54f5c0f63ebd1580e6
2015-11-13 00:16:29 +01:00
Urvang Joshi
c13245c7d8 AnimEncoder: Add a GetError() method.
We now get error string instead of printing it.
The verbose option is now only used to print info and warnings.

Change-Id: I985c5acd427a9d1973068e7b7a8af5dd0d6d2585
2015-11-11 16:14:09 -08:00
Urvang Joshi
688b265d5e AnimDecoder API: Add a GetDemuxer() method.
Change-Id: Ic6a86e8788f1a3e21d1287ece36d80d1153b8f5a
2015-11-11 10:36:17 -08:00
Urvang Joshi
1aa4e3d6ba WebPAnimDecoder: add an option to enable multi-threaded decoding.
Change-Id: I3ff12bc07fc5a1b57a6950afa0e5f54a12985e75
2015-11-11 10:34:42 -08:00
Urvang Joshi
3584abca16 AnimDecoder: option to decode to common color modes.
Change-Id: I77ddab9abe3c4b35a9bcfe4c90b3e43d3aef166d
2015-11-10 09:27:59 -08:00
Urvang Joshi
945cfa3b7c mux.h does NOT need to include encode.h
It was needed earlier for WebPAnimEncoder API when it was using structs
like WebPConfig, but it only uses pointers to those now.

Change-Id: Ic0c144966421c678e8ef54b3fa81574bb2c9cd08
2015-11-09 15:40:09 -08:00
Pascal Massimino
bfd3fc02df ~2x faster SSE2 RGB24toY, BGR24toY, ARGBToY|UV
global effect is ~2% faster encoding from JPG source
and ~8% faster lossless-webp source decoding to PGM (e.g.)

Also revamped the YUVA case to first accumulate R/G/B value into 16b
temporary buffer, and then doing the UV conversion.
-> New function: WebPConvertRGBA32ToUV

Change-Id: I1d7d0c4003aa02966ad33490ce0fcdc7925cf9f5
2015-11-06 15:02:01 -08:00
Pascal Massimino
52fdbdfe66 extract some RGB24 to Luma conversion function from enc/ to dsp/
Just for RGB24/BGR24 for now, which are the hard-to-optimize ones.
SSE2 implementation coming next.

ConvertRowToY() should go into dsp/ too, at some point.

Change-Id: Ibc705ede5cbf674deefd0d9332cd82f618bc2425
2015-10-30 00:28:11 -07:00
Pascal Massimino
ab8c2300b6 add missing \n
Change-Id: I0c9236bbeef5868629d4dc02e3fae6e79ca55949
2015-10-30 00:02:27 -07:00
James Zern
5bd04a087c sync versions with 0.4.4
libwebp{,decoder} - 0.4.4
libwebp libtool - 5.4.0
libwebpdecoder libtool - 1.4.0

mux/demux - 0.2.2 (unchanged)
libtool - 1.2.0 (unchanged)

(cherry picked from commit 62864042c0)

Change-Id: I7d421dc47ad4d25a17450ce1b04562c5d58c596b
2015-10-28 23:43:40 -07:00
Pascal Massimino
8f1fcc15af Merge "Move ARGB->YUV functions from dec/vp8l.c to dsp/yuv.c" 2015-10-29 06:38:52 +00:00
Pascal Massimino
25bf2ce5cc fix some warning about unaligned 32b reads
on x86 + gcc, the assembly code is the same.

Change-Id: Ib0d23772ccf928f8d9ebcb0e157c0573d1f6a786
2015-10-28 15:51:55 -07:00
Pascal Massimino
fa8927efe4 Move ARGB->YUV functions from dec/vp8l.c to dsp/yuv.c
also switch to using ExtractAlpha() instead of hard-coding the loop.

The ARGBToY/UV functions are rather easy to port to SSE2 / NEON.

Change-Id: I8f1346a9ca427a36ce2d6c848369ca7964d8b3c7
2015-10-28 01:45:08 -07:00
James Zern
f7c507a5f8 Merge "remove unnecessary #include "yuv.h"" 2015-10-27 21:54:21 +00:00
Pascal Massimino
14e4043b67 remove unnecessary #include "yuv.h"
Change-Id: I8b277433663e063e7a182f66818afec1654a39bd
2015-10-27 01:27:36 -07:00
Pascal Massimino
d64d376c2a change WEBP_ALIGN_CST value to 31
(and make dec/frame.c use the common macros too)

Change-Id: Ie44dbd82e067934b17ca3ffba4dd45ab0d61d3f6
2015-10-19 21:39:55 +00:00
James Zern
f717b82864 vp8l.c, cosmetics: fix indent after 95509f9
95509f9 large re-organization of the delta-palettization code

Change-Id: I9d27f15cb6072a2bd1dd593d53db5b2dd3c30133
2015-10-19 12:28:57 -07:00
James Zern
927ccdc43b Merge "fix alignment of allocated memory in AllocateTransformBuffer" 2015-10-19 19:15:04 +00:00
Pascal Massimino
fea94b2b36 fix alignment of allocated memory in AllocateTransformBuffer
likely to avoid unaligned reads in the future

Change-Id: I434ba17c139ad6e190ebd9b909b241c6c6f1e7f8
2015-10-18 13:09:22 -07:00
Pascal Massimino
5aa8d61f75 Merge "MIPS: rescaler code synced with C implementation" 2015-10-17 07:52:36 +00:00
Djordje Pesut
e7fb267df7 MIPS: rescaler code synced with C implementation
Change-Id: I4cec115d3fe6f3f825084d7388249694c500256a
2015-10-17 00:16:27 -07:00
Pascal Massimino
93c86ed5b9 Merge "format_constants.h: MKFOURCC, correct cast" 2015-10-17 05:45:05 +00:00
James Zern
5d791d2603 format_constants.h: MKFOURCC, correct cast
'd' should be promoted to uint32 before shifting by 24

Change-Id: I6212661af3802709b0098af8402ed73a0d9373ee
2015-10-16 18:43:40 -07:00