skal
ca1bfff53f
Merge "5-10% encoding speedup with faster trellis (-m 6)"
2014-03-03 13:09:17 -08:00
skal
5aeeb087d6
5-10% encoding speedup with faster trellis (-m 6)
...
mostly by:
- storing a single rd-score instead of cost / distortion separately
- evaluating terminal cost only once
- getting some invariants out of the loops
- more consts behind fewer variables
Change-Id: I79451f3fd1143d6537200fb8b90d0ba252809f8c
2014-03-03 22:07:06 +01:00
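The first bullet above (one rd-score instead of separate cost and distortion) can be sketched roughly as follows. This is only an illustration: the names `rd_score`, `best_candidate` and the weight `DISTO_MULT` are hypothetical, and the linear form `lambda * rate + DISTO_MULT * distortion` is an assumption, not taken from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical distortion weight, chosen for illustration only. */
#define DISTO_MULT 256

typedef int64_t score_t;

/* Fold rate and distortion into one comparable score (assumed linear form). */
static score_t rd_score(int lambda, score_t rate, score_t distortion) {
  return rate * lambda + DISTO_MULT * distortion;
}

/* Return the index of the candidate with the lowest combined score.
 * Only one value per candidate has to be computed, stored and compared. */
static int best_candidate(const score_t* rates, const score_t* distos,
                          int n, int lambda) {
  int best = 0;
  score_t best_score;
  assert(n > 0);
  best_score = rd_score(lambda, rates[0], distos[0]);
  for (int i = 1; i < n; ++i) {
    const score_t s = rd_score(lambda, rates[i], distos[i]);
    if (s < best_score) {
      best_score = s;
      best = i;
    }
  }
  return best;
}
```

Keeping a single score per node means each comparison in the inner loop touches one value instead of two.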
James Zern
82ae1bf299
cosmetics: normalize VP8GetCPUInfo checks
...
- use '!= NULL'
+ dec_neon/STORE_WHT: align '\'s
Change-Id: I0f0ce49bd9c58e771bafb24c51c070d5ebd77e53
2014-02-28 18:47:41 -08:00
James Zern
e3dd9243cb
Merge "Refactor GetBestPredictorForTile for future tuning."
2014-02-28 18:39:27 -08:00
Vikas Arora
206cc1be5a
Refactor GetBestPredictorForTile for future tuning.
...
This change doesn't impact compression gain or compression speed.
Change-Id: Ia87d8a46c6f1ce0f8974178d75a6b0ba0a6e3696
2014-02-28 11:30:23 -08:00
James Zern
3cb8406262
Merge "speed-up trellis quant (~5-10% overall speed-up)"
2014-02-27 14:34:01 -08:00
Pascal Massimino
b66f2227c1
Merge "lossy encoding: ~3% speed-up"
2014-02-27 11:42:16 -08:00
Pascal Massimino
4287d0d49b
speed-up trellis quant (~5-10% overall speed-up)
...
store costs[] in node instead of context
Change-Id: I6aeb0fd94af9e48580106c41408900fe3467cc54
also: various cosmetics
2014-02-27 00:06:00 -08:00
Pascal Massimino
390c8b316d
lossy encoding: ~3% speed-up
...
incorporate non-last cost in per-level cost table
also: correct trellis-quant cost evaluation at nodes
(output is slightly different now). Method 6 is ~4% faster.
Change-Id: Ic48bd6d33f9193838216e7dc3a9f9c5508a1fbe8
2014-02-26 05:52:24 -08:00
James Zern
9a463c4a51
Merge "dec_neon: convert TransformWHT to intrinsics"
2014-02-25 14:36:44 -08:00
pascal massimino
e8605e9625
Merge "dec_neon: add ConvertU8ToS16"
2014-02-25 08:56:17 -08:00
Djordje Pesut
4aa3e4122b
MIPS: MIPS32r1: rescaler bugfix
...
Change-Id: I6de6e2488bd5bd58c1f705739e4467feb211f8b4
2014-02-25 14:36:48 +01:00
Vikas Arora
c16cd99aba
Speed up lossless encoder.
...
Speed up the lossless encoder by 20-25% by optimizing:
- GetBestColorTransformForTile: use techniques like binary search and
local minima search to reduce the search space.
- VP8LFastSLog2Slow & VP8LFastLog2Slow: add the correction factor for
log(1 + x) and increase the threshold for calling the approximate
version of log_2 (compared to the costly call to log()).
Change-Id: Ia2444c914521ac298492aafa458e617028fc2f9d
2014-02-21 22:13:50 -08:00
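The log(1 + x) correction mentioned above can be sketched as a table-based log2 approximation. This is a minimal illustration, not the code from the commit: `fast_log2`, `slow_log2`, the table size and the first-order correction `x / ln(2)` are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define TABLE_BITS 8                      /* hypothetical table size */
#define TABLE_SIZE (1 << TABLE_BITS)

static float log2_table[TABLE_SIZE];

/* Portable log2 used only to fill the table (bit-by-bit squaring). */
static double slow_log2(double x) {
  double result = 0.0, bit = 0.5;
  assert(x >= 1.0);
  while (x >= 2.0) { x *= 0.5; result += 1.0; }
  for (int i = 0; i < 40; ++i) {
    x *= x;
    if (x >= 2.0) { x *= 0.5; result += bit; }
    bit *= 0.5;
  }
  return result;
}

static void init_log2_table(void) {
  log2_table[0] = 0.f;                    /* unused: log2(0) is undefined */
  for (int i = 1; i < TABLE_SIZE; ++i) {
    log2_table[i] = (float)slow_log2((double)i);
  }
}

/* Approximate log2(v) for v > 0; init_log2_table() must be called first. */
static float fast_log2(uint32_t v) {
  int shift = 0;
  uint32_t y = v, base, r;
  assert(v > 0);
  if (v < TABLE_SIZE) return log2_table[v];
  while (y >= TABLE_SIZE) { y >>= 1; ++shift; }
  /* v = base + r with base = y << shift, so
   * log2(v) = shift + log2(y) + log2(1 + r / base). */
  base = y << shift;
  r = v - base;
  /* First-order correction: log2(1 + x) ~ x / ln(2) for small x. */
  return (float)shift + log2_table[y] + 1.44269504f * (float)r / (float)base;
}
```

Without the correction term, the low bits dropped by the shift would be ignored entirely; the extra multiply recovers most of that error at low cost.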
James Zern
9d6b5ff1e6
dec_neon: convert TransformWHT to intrinsics
...
Change-Id: I34dc1d75ddebab131cfed031764117e3f7b75c6b
2014-02-21 11:23:46 -08:00
James Zern
2ff0aae2fe
dec_neon: add ConvertU8ToS16
...
Change-Id: Ifc4fb8e7f862e72154d2f2739811b1022d2b9416
2014-02-20 15:35:33 -08:00
skal
77a8f91981
fix compilation with USE_YUVj flag
...
(not that we'll ever need it, but...)
Change-Id: I9af993c62372097846c5ca6bae8362b59c3502dc
2014-02-20 13:23:18 +01:00
James Zern
4acbec1bef
Merge changes I3b240ffb,Ia9370283,Ia2d28728
...
* changes:
dec_neon: TransformAC3: work on packed vectors
dec_neon: add SaturateAndStore4x4
dec_neon.c: convert TransformDC to intrinsics
2014-02-19 14:47:33 -08:00
James Zern
2719bb7e98
dec_neon: TransformAC3: work on packed vectors
...
pack 2 rows in 1 vector similar to TransformDC
Change-Id: I3b240ffb4f51a632b5c8c2daf54d938333ed4b0d
2014-02-18 19:47:20 -08:00
James Zern
b7b60ca16c
dec_neon: add SaturateAndStore4x4
...
converts 2 s16 vectors to 2 u8 vectors and stores them to a uint8_t destination;
TransformAC3 can reuse this after a rework
Change-Id: Ia9370283ee3d9bfbc8c008fa883412100ff483d0
2014-02-18 19:42:35 -08:00
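A scalar sketch of what a saturate-and-store helper does, for readers without NEON at hand. The function names, the argument layout (two rows packed per 8-element array) and the `stride` parameter are assumptions for illustration; the commit itself uses NEON intrinsics.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp a signed 16-bit value to the [0, 255] range. */
static uint8_t saturate_u8(int16_t v) {
  return (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
}

/* Store a 4x4 block held as two packed arrays (rows 0-1 and rows 2-3)
 * into a byte destination with the given row stride. */
static void saturate_and_store_4x4(const int16_t rows01[8],
                                   const int16_t rows23[8],
                                   uint8_t* dst, int stride) {
  assert(dst != NULL && stride >= 4);
  for (int x = 0; x < 4; ++x) {
    dst[0 * stride + x] = saturate_u8(rows01[x]);
    dst[1 * stride + x] = saturate_u8(rows01[4 + x]);
    dst[2 * stride + x] = saturate_u8(rows23[x]);
    dst[3 * stride + x] = saturate_u8(rows23[4 + x]);
  }
}
```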
Pascal Massimino
b7685d73fe
Rescale: let ImportRow / ExportRow be pointer-to-function
...
Separate the C version from the MIPS32 version and have run-time
initialization during RescalerInit()
Change-Id: I93cfa5691c073a099fe62eda1333ad2bb749915b
2014-02-17 00:58:17 -08:00
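The pointer-to-function pattern described above looks roughly like this. It is a sketch under stated assumptions: `ImportRowFn`, `import_row_c`, `have_fast_path` and `rescaler_init` are hypothetical names and signatures, not the library's actual API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical row-import signature, for illustration only. */
typedef void (*ImportRowFn)(const uint8_t* src, int32_t* dst, int width);

/* Set once at init time, then called through the pointer. */
static ImportRowFn import_row = NULL;

/* Portable C implementation: widen each byte to 32 bits. */
static void import_row_c(const uint8_t* src, int32_t* dst, int width) {
  for (int i = 0; i < width; ++i) dst[i] = src[i];
}

/* Hypothetical CPU probe; a real build would query the platform. */
static int have_fast_path(void) { return 0; }

static void rescaler_init(void) {
  import_row = import_row_c;              /* always-valid default */
  if (have_fast_path()) {
    /* A platform-specific version would be installed here. */
  }
  assert(import_row != NULL);
}
```

Callers invoke `import_row(...)` after `rescaler_init()`, so the platform choice is made once rather than at every call site.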
James Zern
e02f16ef45
dec_neon.c: convert TransformDC to intrinsics
...
no noticeable difference in performance
Change-Id: Ia2d287289c3865ddd0fc99edaf7a030778aa7025
2014-02-14 12:11:58 -08:00
skal
9cba963f9a
add missing file
...
Change-Id: I17eab2fedc64ee3bba941a592ecef765fcd2b402
2014-02-13 21:56:19 -08:00
skal
8992ddb756
use static clipping tables
...
(shared with mips32)
removed abs1[] table along the way
sub-1% speed-up, but still...
Change-Id: I8c29a8a0285076cb3423b01ffae9fcc465da6a81
2014-02-13 19:32:59 -08:00
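A clipping table replaces two comparisons per pixel with one lookup. The sketch below fills the table at start-up for brevity, whereas the commit refers to static (precomputed) tables; the range `[-255, 510]` and the names are assumptions chosen for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical input range covered by the table. */
#define CLIP_MIN (-255)
#define CLIP_MAX 510

static uint8_t clip_table[CLIP_MAX - CLIP_MIN + 1];

/* Fill the table once; each entry is the input clamped to [0, 255]. */
static void init_clip_table(void) {
  for (int v = CLIP_MIN; v <= CLIP_MAX; ++v) {
    clip_table[v - CLIP_MIN] = (uint8_t)(v < 0 ? 0 : (v > 255 ? 255 : v));
  }
}

/* Branch-free clamp for values known to lie in [CLIP_MIN, CLIP_MAX]. */
static uint8_t clip_u8(int v) {
  assert(v >= CLIP_MIN && v <= CLIP_MAX);
  return clip_table[v - CLIP_MIN];
}
```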
skal
0235d5e44b
1-2% faster quantization in SSE2
...
C-version is a bit faster too (sub-1% faster on ARM)
Change-Id: I077262042f1d0937aba1ecf15174f2c51bf6cd97
2014-02-13 15:55:30 -08:00
Pascal Massimino
b2fbc36c26
fix VC12-x64 warning
...
"conversion from 'vp8l_atype_t' to 'uint8_t', possible loss of data"
Change-Id: I7607a688d16aca8fae8ce472450f8423c48f3a26
2014-02-12 12:19:32 -08:00
James Zern
6e37cb942f
Merge "cosmetics: backward_references.c: reindent after a7d2ee3"
2014-02-11 16:28:55 -08:00
James Zern
a42ea9742a
cosmetics: backward_references.c: reindent after a7d2ee3
...
a7d2ee3
Optimize cache estimate logic.
Change-Id: I81dd1eea49f603465dc5f3afae8a101e5205e963
2014-02-11 15:52:22 -08:00
skal
6c32744214
Merge "fix missing __BIG_ENDIAN__ definition on some platform"
2014-02-11 14:51:11 -08:00
skal
a8b6aad155
fix missing __BIG_ENDIAN__ definition on some platform
...
e.g., mips-gcc doesn't define __BIG_ENDIAN__
Change-Id: Ic06bf453164ddddc69a523e7845a4993e14a1af2
2014-02-11 14:43:44 -08:00
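One way to cope with a compiler that does not define `__BIG_ENDIAN__` is to fall back on the `__BYTE_ORDER__` macros that GCC and Clang provide. The sketch below is an assumption about the general approach, not the exact change in the commit; `SKETCH_BIG_ENDIAN` and the helper functions are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Derive a single flag from whichever macro the compiler offers. */
#if defined(__BIG_ENDIAN__) || \
    (defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
     (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__))
#define SKETCH_BIG_ENDIAN 1
#else
#define SKETCH_BIG_ENDIAN 0
#endif

/* Runtime probe: inspect the first byte of a known 16-bit value. */
static int is_big_endian_runtime(void) {
  const uint16_t probe = 0x0102;
  uint8_t first;
  memcpy(&first, &probe, 1);
  return first == 0x01;
}

/* Sanity check that the compile-time flag matches the machine. */
static void check_endianness(void) {
  assert(SKETCH_BIG_ENDIAN == is_big_endian_runtime());
}
```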
Vikas Arora
fde2904b8a
Increase initial buffer size for VP8L Bit Writer.
...
Increase the initial buffer size for VP8L Bit Writer from 4bpp to 8bpp.
Resizing the buffer is expensive (it requires a realloc and a copy), and this additional
memory (0.5 * W * H bytes) doesn't add much overhead to the lossless encoder.
Change-Id: Ic1fe55cd7bc3d1afadc799e4c2c8786ec848ee66
2014-02-11 11:13:21 -08:00
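The arithmetic behind the 0.5 * W * H figure can be checked with a small helper. `initial_buffer_bytes` is a hypothetical name used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed to hold width * height pixels at the given bits per pixel,
 * rounded up to a whole byte. */
static size_t initial_buffer_bytes(int width, int height, int bits_per_pixel) {
  assert(width > 0 && height > 0 && bits_per_pixel > 0);
  return ((size_t)width * (size_t)height * (size_t)bits_per_pixel + 7) / 8;
}
```

For a 1000 x 1000 image, 4bpp gives 500000 bytes and 8bpp gives 1000000 bytes, so the increase is 500000 bytes, i.e. 0.5 * W * H.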
Vikas Arora
a7d2ee39be
Optimize cache estimate logic.
...
Optimize 'VP8LCalculateEstimateForCacheSize' for lower quality ranges (Q < 50).
The entropy is generally lower for higher cache_bits, so start searching from
higher cache_bits and settle for a local minimum, instead of evaluating all
values.
This speeds up the lossless encoding at lower qualities by 10-15%.
Change-Id: I33c1e958515a2549f2e6f64b1aab3f128660dcec
2014-02-11 10:59:01 -08:00
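The search strategy described above (start at the highest cache_bits, walk down, stop at the first local minimum) can be sketched as below. The callback type `EntropyFn`, `find_cache_bits` and the table-backed estimator are hypothetical; the real estimate function is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cost callback: estimated entropy for a given cache_bits. */
typedef double (*EntropyFn)(int cache_bits, void* ctx);

/* Demonstration estimator that reads from a precomputed table. */
static double table_entropy(int cache_bits, void* ctx) {
  return ((const double*)ctx)[cache_bits];
}

/* Walk down from max_bits and stop as soon as the estimate stops
 * improving, instead of evaluating every value. */
static int find_cache_bits(int max_bits, EntropyFn estimate, void* ctx) {
  int best_bits = max_bits;
  double best;
  assert(max_bits >= 0 && estimate != NULL);
  best = estimate(max_bits, ctx);
  for (int bits = max_bits - 1; bits >= 0; --bits) {
    const double cur = estimate(bits, ctx);
    if (cur >= best) break;               /* local minimum reached */
    best = cur;
    best_bits = bits;
  }
  return best_bits;
}
```

The trade-off is that a local minimum may not be the global one; the commit accepts this for the lower quality range in exchange for speed.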
pascal massimino
7fb6095b03
Merge "dec_neon.c: add TransformAC3"
2014-02-11 09:17:34 -08:00
skal
bf182e837e
VP8LBitWriter: use a bit-accumulator
...
* simplify the endian logic
* remove the need for memset()
* write 16 or 32 at a time (likely aligned)
Makes the code a bit faster on ARM (~1%)
Change-Id: I650bc5654e8d0b0454318b7a78206b301c5f6c2c
2014-02-11 09:12:45 -08:00
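A bit accumulator collects bits in a wide integer and flushes whole words, so no per-bit memory writes or buffer pre-clearing are needed. The sketch below is an illustration only: the `BitWriter` layout and function names are hypothetical, the output buffer is fixed-size, and the flush stores bytes one by one in little-endian order to stay portable, where the commit writes 16 or 32 bits at a time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
  uint64_t bits;    /* accumulator, filled from the least significant bit */
  int used;         /* number of valid bits in the accumulator */
  uint8_t* buf;     /* caller-provided output buffer */
  size_t pos, cap;  /* write position and capacity, in bytes */
} BitWriter;

static void bw_init(BitWriter* bw, uint8_t* buf, size_t cap) {
  bw->bits = 0; bw->used = 0; bw->buf = buf; bw->pos = 0; bw->cap = cap;
}

/* Append the low n_bits of value (0 <= n_bits <= 32). */
static void bw_put(BitWriter* bw, uint32_t value, int n_bits) {
  assert(n_bits >= 0 && n_bits <= 32);
  if (n_bits == 0) return;
  if (n_bits < 32) value &= (1u << n_bits) - 1;
  bw->bits |= (uint64_t)value << bw->used;
  bw->used += n_bits;
  while (bw->used >= 32) {                /* flush 32 bits at a time */
    assert(bw->pos + 4 <= bw->cap);
    for (int i = 0; i < 4; ++i) {
      bw->buf[bw->pos++] = (uint8_t)(bw->bits >> (8 * i));
    }
    bw->bits >>= 32;
    bw->used -= 32;
  }
}

/* Write out remaining bits (zero-padded); returns total bytes written. */
static size_t bw_finish(BitWriter* bw) {
  while (bw->used > 0) {
    assert(bw->pos < bw->cap);
    bw->buf[bw->pos++] = (uint8_t)bw->bits;
    bw->bits >>= 8;
    bw->used -= 8;
  }
  bw->used = 0;
  return bw->pos;
}
```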
Djordje Pesut
3f40b4a581
Merge "MIPS: MIPS32r1: clang macro warning resolved"
2014-02-11 00:06:36 -08:00
Urvang Joshi
1684f4ee37
WebP Decoder: Mark some truncated bitstreams as invalid
...
Specifically, check for truncated RIFF and/or VP8(L) chunks.
For more context, see:
https://code.google.com/p/webp/issues/detail?id=185
Change-Id: I91ca2dbf05080660fbc513244fc53adc57fc04b5
2014-02-10 16:35:27 -08:00
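A truncation check on the RIFF container compares the size declared in the header with the bytes actually available. The sketch below assumes the whole file is expected to be in memory and covers only the outer RIFF header, not the VP8(L) chunks; `check_riff` and the `RiffStatus` values are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { RIFF_OK, RIFF_NOT_RIFF, RIFF_TRUNCATED } RiffStatus;

/* Read a 32-bit little-endian value, independent of host endianness. */
static uint32_t read_le32(const uint8_t* p) {
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
         ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Header layout: "RIFF", 32-bit payload size, "WEBP". The payload size
 * excludes the first 8 bytes, so a complete file has size >= payload + 8. */
static RiffStatus check_riff(const uint8_t* data, size_t size) {
  uint32_t riff_size;
  assert(data != NULL);
  if (size < 12 || memcmp(data, "RIFF", 4) != 0 ||
      memcmp(data + 8, "WEBP", 4) != 0) {
    return RIFF_NOT_RIFF;
  }
  riff_size = read_le32(data + 4);
  if ((uint64_t)riff_size + 8 > (uint64_t)size) return RIFF_TRUNCATED;
  return RIFF_OK;
}
```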
Jovan Zelincevic
acbedac475
MIPS: MIPS32r1: clang macro warning resolved
...
.set macro - Enables the expansion of macro instructions.
Change-Id: I1e44fe056798aeff803cc97171724d21da1fc2bf
2014-02-10 06:50:11 -08:00
James Zern
228e4877ab
dec_neon.c: add TransformAC3
...
based on SSE2 version
Change-Id: Icc6782955253c98e83d5984153b596ef5f1c0d34
2014-02-08 12:47:54 -08:00
James Zern
393f89b763
Android.mk: avoid gcc-specific flags with clang
...
Change-Id: Idb1ed2bb1dd5d9f65ca07185ef9838e587dc4e64
2014-02-07 20:31:44 -08:00
skal
32aeaf115a
revamp VP8LColorSpaceTransform() a bit
...
-> remove the 'color_transform' multiplier, use more constants, etc.
This function is particularly critical, mostly because of
GetBestColorTransformForTile().
Loop is a bit faster (maybe ~1%)
Change-Id: I90c96a3437cafb184773acef55c77e40c224388f
2014-02-05 10:37:06 +01:00
James Zern
0c7cc4ca20
Merge "Don't dereference NULL, ensure HashChain fully initialized"
2014-02-03 22:58:37 -08:00
Scott Talbot
391316fee2
Don't dereference NULL, ensure HashChain fully initialized
...
Found by clang's static analyzer; they look validly uninitialized
to me.
Change-Id: I650250f516cdf6081b35cdfe92288c20a3036ac8
2014-02-03 21:16:59 -08:00
skal
926ff40229
WEBP_SWAP_16BIT_CSP: remove code dup
...
and prepare for potentially supporting both RGBA4444 and BARG4444
Change-Id: If5200289bc6338757a2ceb2df1a19de732595052
2014-02-03 13:24:33 -08:00
Vikas Arora
1d1cd3bbd6
Fix decode bug for rgbA_4444/RGBA_4444 color-modes.
...
The WEBP_SWAP_16BIT_CSP flag needs to be honored while filling the Alpha (4 bits)
data in the destination buffer and while pre-multiplying the alpha to RGB colors.
Change-Id: I3b07307d60963db8d09c3b078888a839cefb35ba
2014-02-03 09:20:54 -08:00
Pascal Massimino
939e70e7d3
update AUTHORS file
...
Change-Id: I50e8f20016097cf63eaeb46a8588203a2165b161
2014-01-31 12:08:05 -08:00
James Zern
8934a622ac
cosmetics: *_mips32.c
...
indent, comments, unused includes
Change-Id: Id0aabc52d05bb633f62aec022155ec27699cf5a0
2014-01-30 18:03:48 -08:00
Djordje Pesut
dd438c9a7d
MIPS: MIPS32r1: Optimization of some simple point-sampling functions. PATCH [6/6]
...
Change-Id: I2020e71e9be5d17d4bf67cabf6c470ca43d5d838
2014-01-29 15:37:31 +01:00
Djordje Pesut
53520911c3
Added support for calling sampling functions via pointers.
...
Change-Id: Ic4d72e6b175a6b27bcdcc8cd97828e44ea93e743
2014-01-29 15:32:35 +01:00
Jovan Zelincevic
d16c69749b
MIPS: MIPS32r1: Optimization of filter functions. PATCH [5/6]
...
Change-Id: Ifbd305e0514f09a587db02c3970f22190808503a
2014-01-29 15:03:45 +01:00
Djordje Pesut
04336fc7f8
MIPS: MIPS32r1: Optimization of function TransformOne. PATCH [4/6]
...
Change-Id: I5b98e2de940977538cf91bfa2128f4d1daa5c170
2014-01-28 20:10:43 -08:00
Djordje Pesut
92d8fc7dd4
MIPS: MIPS32r1: Optimization of function WebPRescalerImportRow. PATCH [3/6]
...
Change-Id: I32339a8d2d03f1a8d8638563d2b2c9e3a13a4909
2014-01-29 00:12:41 +01:00