avoiding triplets of data should make it easier to write SSE2 versions.
FilterRow() can now filter all input in one single pass
-> conversion is 15-20% faster (but still overall slow compared to -pre 0)
Change-Id: I14c3215e672fdecde7ec80394e814bdc7445019f
we don't need to centralize best_uv[] since target_uv[] and best_rgb_uv[]
are already centralized. The diff 'W' was just in the ~[-2,2] range, so
we can ignore the correction.
Overall speed-impact is not large, though. Around ~4% faster conversion.
Output with -pre 4 is expected to be slightly different
Change-Id: Ib59f033955577c49b084d0560108020f42d84102
also: remove the useless clipping in StoreGray()
For speed reason, the 'gray' plane was initialized with the same
value for 2x2 block. But in some cases (underlying camera noise, e.g.),
it could lead to instability during iteration, noise amplification,
and visible banding.
Using a precise (but slower) initialization solves the issue, and
since the convergence is faster, we might actually gain some speed.
Change-Id: I81c42101497e7096a8f60289d710f5a3bcb0ddea
We usually need at least 2 iterations to converge
(and usually not much more after that). Only 1 was not enough.
Change-Id: Iaf802ea81afa2596f4ba045c92f5eaff61623b7b
global effect is ~2% faster encoding from JPG source
and ~8% faster lossless-webp source decoding to PGM (e.g.)
Also revamped the YUVA case to first accumulate R/G/B value into 16b
temporary buffer, and then doing the UV conversion.
-> New function: WebPConvertRGBA32ToUV
Change-Id: I1d7d0c4003aa02966ad33490ce0fcdc7925cf9f5
Just for RGB24/BGR24 for now, which are the hard-to-optimize ones.
SSE2 implementation coming next.
ConvertRowToY() should go into dsp/ too, at some point.
Change-Id: Ibc705ede5cbf674deefd0d9332cd82f618bc2425
we look at average global improvement and stop when things are
moving slow, or when we had a quite good first iteration already
(means: the picture is "not difficult")
Change-Id: I8ab7d100353039b5b32bb5fac3fe03c8440c78d5
kGammaFix is now only defined with USE_GAMMA_COMPRESSION;
fixes:
use of undeclared identifier 'kGammaFix'
Change-Id: Ib1e2f410eff9b83be065894f88181f91dd2776e1
* use the same TFIX == YFIX precision (2bits)
* use int instead of float in LinearToGammaF()
output is visually equivalent. Code is a little faster.
Change-Id: Ie3cfebca351dbcbd924b3d00801d6523dca6981f
inline function MakeARGB32 calls changed to call
via pointers to functions which make (a)rgb for
entire row
Change-Id: Ia4bd4be171a46c1e1821e408b073ff5791c587a9
rightmost pixel was missing a copy, which could lead to invalid read.
Also added a lower dimension of 4, below which we use the regular conversion.
This is to prevent corner cases, in addition to not being overkill.
Change-Id: Iac12e7a3d74590f12fe8eeb1830b9891e61439f6
with a special case for dithering==0., it gets somewhat faster on x86
thanks to inlining.
Also, less macros.
Change-Id: Ic2f2bf6718310743bb40cef2104fa759a073e6d5
New function: WebPPictureSmartARGBToYUVA()
This implement smart RGB->YUV conversion.
This is rather undocumented for now, and is triggered using '-pre 4'
preprocessing option.
This is slow-ish and use quite some memory, but should be improvable.
This is somehow a usable beta version.
Change-Id: Ia50a8c30134e4cab8a7d3eb70aef13ce1f6187a1