2-5% faster trellis with clang/MacOS

(and ~2-3% on ARM)

We don't need to store cost/score for each node, but only for
the current and previous one -> simplify code and save some memory.
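
For illustration, the two-row idea can be sketched as below. This is a minimal, hypothetical example and not the code from this commit: NUM_STATES, MinPathCost, the flat cost/trans arrays and the transition model are all assumptions made up for the sketch. Only score_t and MAX_COST mirror the definitions shown in the diff. A real trellis would still keep per-node back-pointers to recover the best path; only the running scores are reduced to two rows here.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t score_t;
// Deliberately below the int64_t maximum, so that sums involving
// MAX_COST do not overflow.
#define MAX_COST ((score_t)0x7fffffffffffffLL)
#define NUM_STATES 3  // hypothetical number of candidates per stage

// Returns the minimal total cost of a path through 'num_stages' stages.
// cost[i * NUM_STATES + s]  : cost of picking state s at stage i.
// trans[p * NUM_STATES + s] : cost of going from state p to state s.
// Both layouts are assumptions made for this sketch.
static score_t MinPathCost(const score_t* cost, const score_t* trans,
                           int num_stages) {
  score_t score_buf[2][NUM_STATES];  // two rows instead of one per stage
  score_t* prev = score_buf[0];
  score_t* cur = score_buf[1];
  score_t best;
  int i, s, p;

  assert(num_stages >= 0);
  if (num_stages == 0) return 0;

  // The first stage has no predecessor: its score is its own cost.
  for (s = 0; s < NUM_STATES; ++s) prev[s] = cost[s];

  for (i = 1; i < num_stages; ++i) {
    for (s = 0; s < NUM_STATES; ++s) {
      score_t best_prev = MAX_COST;
      for (p = 0; p < NUM_STATES; ++p) {
        const score_t candidate = prev[p] + trans[p * NUM_STATES + s];
        if (candidate < best_prev) best_prev = candidate;
      }
      cur[s] = best_prev + cost[i * NUM_STATES + s];
    }
    // Swap roles: the row just computed becomes 'previous'.
    {
      score_t* const tmp = prev;
      prev = cur;
      cur = tmp;
    }
  }

  // After the final swap, 'prev' holds the scores of the last stage.
  best = prev[0];
  for (s = 1; s < NUM_STATES; ++s) {
    if (prev[s] < best) best = prev[s];
  }
  return best;
}
```

In this sketch the score storage is 2 * NUM_STATES values regardless of the number of stages, which is where the memory saving would come from.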

Also made the 'Node' structure tighter.

Change-Id: Ie3ad7d3b678992b396242f56e2ac387fe43852e6
skal
2014-03-22 10:20:42 +01:00
parent 80e218d43a
commit d1b33ad58b
2 changed files with 54 additions and 35 deletions

@@ -160,6 +160,8 @@ extern const int VP8I4ModeOffsets[NUM_BMODES];
#define I4TMP (6 * 16 * BPS + 8 * BPS + 8)
typedef int64_t score_t; // type used for scores, rate, distortion
// Note that MAX_COST is not the maximum allowed by sizeof(score_t),
// in order to allow overflowing computations.
#define MAX_COST ((score_t)0x7fffffffffffffLL)
#define QFIX 17