A minor improvement for arm targets with ndk r27/gcc-13 in H/VFilter8 (a
couple fewer moves w/aarch64) and much better vectorization of
DitherCombine8x8_C in most targets.
This only affects non-vector pointers; any vector pointers are left as a
follow up.
Change-Id: I03e73e6d6404261bb8408a9ae76a4b6ef142f8f0
on SetResidualCoeffs_*. This results in some minor code reordering when
targeting arvm7 with ndk r27 and other recent versions of clang. No
changes in the x86 compilations with clang-16 / gcc-13.
This only affects non-vector pointers; any vector pointers are left as a
follow up.
Change-Id: I7c3554ece848fafbc5ac9c4944f1dc85129f6fd8
histogram_symbols is converted to uint32_t and <<8 into
histogram_argb.
Using a uint32_t buffer from the start prevents copying and
converting the data.
Change-Id: I245003a6a0f048c31519afa25a600d4479e762e3
This is useful for a forward change that will improve compression.
It splits the residual computation and the best predictor
selection.
The only downside is that more memory is allocated: we had 2
histograms before, we now have 14, but this is necessary for the
later change. Still, this is nothing compared to what is done
later in the pipeline in HistogramSetTotalSize where the number of
histograms created is the number of pixels in the subsampled image.
Change-Id: If03501a26f00462dd1809daa6e9314abd180945d
WebPCleanupTransparentAreaLossless() was renamed to
WebPReplaceTransparentPixels() in:
55a080e5 Add WebPReplaceTransparentPixels() in dsp
Change-Id: I91e32574e6add2748c0655146f100eb2b40498b2
In practice, this can never happen because:
- 'streak' is at most as long as a histogram
- 'count' counts the number of streaks
'streak' and 'count' are therefore at most as big as the histogram
length which is at most the max of VP8LHistogramNumCodes,
which is 256+24+(1<<10).
Change-Id: I31c8834543479c8a9260732313ea26b045519515
The row parameter became a constant in:
2102ccd update the Unfilter API in dsp to process one row independently
num_rows is always equal to height.
Change-Id: Ie43dc5ef222e442ce8c92766da0b9824ccbca236
The inverse parameter became a constant in:
2102ccd update the Unfilter API in dsp to process one row independently
The row parameter to these functions is in a similar state; it will be
removed in a follow up.
Change-Id: I94cd8babe0e42474ff794ba5fa29dd48039de5f8
Replace vmovl_u8 -> s16 + signed vaddq with unsigned vaddw.
No change in assembly with clang-16 (armv7 & aarch64) and gcc-13
(aarch64). armv7 gcc-13 had kept the vmovl instructions, those are now
gone.
Change-Id: Ibb4fbdd5680d3e9dd06933c100528a6f363de472
This brings minor size improvements because repetitive values in
the transform images are easily explainable through LZ77. Still,
it makes an upcoming pull request a bit more stable.
This is a rollforward of
7ec51c591608333fd5e6fdddbee16e1d65bef3db
ee26766a89a149afe5f73fdcb8f2493ec808f2b7
Change-Id: I254ab3ccd5053344f89099280e8d994ecd55aee0
This needs to be done with signed saturation as the sum may be negative.
fixes mismatch with C code after:
3bfb05e3 Add AArch64 Neon implementation of Intra16Preds
Change-Id: I017e939d7155cc3489ceb76fc8ad50ac9917f23d
This needs to be done with signed saturation as the sum may be negative.
fixes mismatch with C code after:
baa93808 Add AArch64 Neon implementation of Intra4Preds
Change-Id: I190c3d7f78cfd2c7ae83fb7059de41e307abda36
* changes:
Use QuantizeBlock_NEON for VP8EncQuantizeBlockWHT on Arm
Add AArch64 Neon implementation of Intra16Preds
Add AArch64 Neon implementation of Intra4Preds
Add a Neon implementation of Intra16Preds for use on 64-bit Arm
platforms. (This implementation cannot be used on 32-bit Arm
platforms as it makes use of a number of AArch64-only Neon
instructions.)
Change-Id: I24c67cd54b66307e3924fd332c2795fd7422f082
Add Neon implementation of Intra4Preds for use on 64-bit Arm
platforms. (The same implementation cannot be used for 32-bit Arm
platforms as it uses a number of AArch64-only Neon instructions.)
Change-Id: Id781e7614f4e8e876dfeecd95cfc85e04611d8c6
The lossless encoding speed-ups are:
- up to 1% with default parameters
- up to 4% in cruncher mode: -q 100 -m 6
Change-Id: Id92d4bad0b0a2c28c8aa9ff5280eea5717017f30
This reverts commit ee26766a89a149afe5f73fdcb8f2493ec808f2b7.
This change also reverts the parent.
Revert "Increase the transform bits if possible."
This reverts commit 7ec51c591608333fd5e6fdddbee16e1d65bef3db.
These changes result in non-lossless encodes.
Bug: oss-fuzz:69231, oss-fuzz:69109, oss-fuzz:69208
Bug: b:341475869, b:342743143
Change-Id: Ia28f558992e0aa6f024af1ff66da52e0a5e26fa3
A 3 by 1 image would not have its 1st and 3rd lines compared at
the second iteration.
BUG=oss-fuzz:69208
Change-Id: I9213e73995d31907f358310a0b7d5ebb21c1f8b2
This brings minor size improvements because repetitive values in
the transform images are easily explainable through LZ77. Still,
it makes an upcoming pull request a bit more stable.
This is 971a03d8204c3ab1e12db8ae68ba6f919c615ade with a fix to
not forget to analyze the end of the line.
A const has also been added to match VP8LColorSpaceTransform's
signature.
Change-Id: Iae03216fef298c7abc96a766f8a799552b05ade5
This reverts commit 971a03d8204c3ab1e12db8ae68ba6f919c615ade.
Reason for revert:
This creates non-lossless encodes.
Original change's description:
> Increase the transform bits if possible.
>
> This brings minor size improvements because repetitive values in
> the transform images are easily explainable through LZ77. Still,
> it makes an upcoming pull request a bit more stable.
>
> Change-Id: I1c7135675cb59b5e27ca960738d74465f10d0deb
Bug: oss-fuzz:69109, b:341475869
Change-Id: I3b9f21a5498735eb3681e62fb35bf9f9c2ed4f9f
This brings minor size improvements because repetitive values in
the transform images are easily explainable through LZ77. Still,
it makes an upcoming pull request a bit more stable.
Change-Id: I1c7135675cb59b5e27ca960738d74465f10d0deb