Taking the comments from the internal ParseHeadersInternal.
(which is called by GetFeatures).
BUG=webp:411
Change-Id: I9999b4a183805e2db1456610a30024a0d8be4d00
It's safer to clip the passed param instead of doing 32b arithmetic
and clipping afterward.
Output is unchanged, but code no longer rely on UB.
Change-Id: Ia5b4de6e8863981753f1d17f062965a6a5da5bed
dst->cur_ was not set.
The bug occurred only with several VP8LBitWriter instances
(thread_level > 0) and in 32-bit (in 64-bit, src->cur_ was
always 0 in VP8LBitWriterClone()).
BUG=chromium:917029
Change-Id: I0d94a3d8e62b247fd616eebe1009868dc8a5ed2e
Move IsFlat to its own header. This allows it to continue to be
inlined. Using the RTCD and creating a distinct function slows down arm
builds.
flower mug
C 3.59 2.12
NEON 3.47 2.01
BUG=b/118740850
Change-Id: Id77e8f76d9e9790c498806e7070bbe37c10bc2e9
thresh is defined by FLATNESS_LIMIT_* which ranges from 2-10.
score_t is int64 which is a touch overkill.
Change-Id: I308bd440bf11643665d3642fe361495a257b6e52
Direct copy of sse2. Slight improvement because neon has
abs().
flower.ppm had minimal improvement. Somewhat expected because
GetResidualCost_C is only ~3.6%
mug.ppm had a better improvement because GetResidualCost_C is
almost 9%.
C 2.150
NEON 2.130
BUG=b/118740850
Change-Id: Ibc0dd97a81596635f5599cf568205974b4fd2597
Much faster with aarch64. Still somewhat faster without vmaxv.
C: 3.700s
ArmV7: 3.675
aarch64: 3.600
BUG=b/118740850
Change-Id: I3be852da89633eca4bddce443c87f5e4a2f55868
The old code simply did not make sense.
The effect is that the pair would be popped from the
queue no matter what; as the queue is small, it does
not matter that much on the results.
But it will matter for a later CL.
Change-Id: If50c9fa9d7f3ac3c48bb7336d81479287d4944c4
(cherry picked from commit 485ff86fbb)
The old code simply did not make sense.
The effect is that the pair would be popped from the
queue no matter what; as the queue is small, it does
not matter that much on the results.
But it will matter for a later CL.
Change-Id: If50c9fa9d7f3ac3c48bb7336d81479287d4944c4
Also, histograms in a HistogramSet can be initialized all
at once.
Change-Id: Ibbfa6034dce58dca8bb9113487e2ae507222ce7d
(cherry picked from commit 6752904b2f)
When histograms are empty, it is easy to add them.
They should also not be considered when merging histograms
(it is a waste of CPU).
This does not change the compression performance,
just the speed.
Change-Id: I42c721ca0f9c5ea067e73b792aa3db6d5e71d01f
(cherry picked from commit decf6f6b87)
When histograms are empty, it is easy to add them.
They should also not be considered when merging histograms
(it is a waste of CPU).
This does not change the compression performance,
just the speed.
Change-Id: I42c721ca0f9c5ea067e73b792aa3db6d5e71d01f
We should be using 'floor' when doing the final divide.
-> new MACRO is MULT_FIX_FLOOR()
XXX*** Mips code is DISABLED for now ***XXX
I'll update and re-enable it in a later
patch, since this code needs some refactoring first.
BUG=oss-fuzz:9179
Change-Id: Ic0693cdca4e71f5beab1029475e35c4d06b12d13
* Assert chunklist
* fix potential memory leak and
* fix null pointer access
There should not be several alpha_ or img_ chunks in SynthesizeBitstream. Use ChunkListDelete in MuxImageRelease to be safe.
A null pointer accessed in WebPMuxPushFrame triggered a harmless runtime error.
Change-Id: I3027f8752093652bd41f55e667d041c0de77ab6e
The chunk list only has two operations: append and set
to one element. The two operations are split and the append
one is sped up by storing the last element.
Corrupted data could make a very long list to search through.
BUG=oss-fuzz:9190
Change-Id: I1aa813ca629df29efaa3b46dbd4c4c42dbeaa34c
The standard allows for Huffman images with any coefficients.
Hence potentially big memory allocations. The previous workaround
was "trying" things out, the new one is more rigorous and
only allocates what is needed, modifying the Huffman image
to contain the minimal set of coefficients.
BUG=oss-fuzz:8623,oss-fuzz:9111,oss-fuzz:9134
Change-Id: I6a972e90e4ae509c15cb41ee22c58b775fa3f4aa
idec_dec.c, DecodeRemaining: Set decoder state to ERROR to prevent VP8ExitCritical to be called again
Change-Id: Id5f893f45c348e1c529680d930e640f780a73d4c
treat an ANMF chunk containing multiple VP8/VP8L file as malformed.
fixes a WebPMuxImage::img_ leak.
Though the invalid free in #9106 was avoided in (ubsan):
be738c6d muxread,ChunkVerifyAndAssign: validate chunk_size
that file would still cause a leak similar to #9099.
BUG=oss-fuzz:9099,oss-fuzz:9106
Change-Id: Ib873446a1188afeeb2fe5d53a86b75e0c5de9573
(we also limit radius based on height too, for good measure, although it's not an asan bug)
fixes oss-fuzz issue #9105
Change-Id: Ie0d79dd81480dc4e2b653b7e992e5cdcd3dfa834
before accounting for padding which might overflow if chunk_size is >
MAX_CHUNK_PAYLOAD.
BUG=webp:387,webp:388
Change-Id: I3985b8817ed4faaec0629102c5333c228a0e9c98
previously when adjusting size down based on a smaller riff_size the
checks were insufficient to prevent 'size -= RIFF_HEADER_SIZE' from
rolling over causing ChunkVerifyAndAssign to over read. the new checks
are imported from demux.c.
BUG=webp:386
Change-Id: If863c4a9892977b9ade7dd894392a0ecae13775c
this internalizes the init checks and provides stronger synchronization
with pthreads when available while still allowing VP8GetCPUInfo to be
modified (mostly for testing purposes). windows is left as is since a
critical section or mutex would cause a leak.
Change-Id: Ieb997e014f2805c0ae39c16f13337663521356f4
(cherry picked from commit d77bf512bd)
the 'accum' variable can be larger than 15b for large
rescale values.
Assert triggered:
src/dsp/rescaler_sse2.c:249: RescalerExportRowExpand_SSE2: Assertion `v >= 0 && v <= 255' failed.
src/dsp/rescaler_sse2.c:350: RescalerExportRowShrink_SSE2: Assertion `v >= 0 && v <= 255' failed.
-> fall back to C implementation in this case for now
Change-Id: I7ea1cb72301cafc1459be403f6a6f4e3cbc89bb1
Control Flow Integrity [1] indirect call checking verifies that function
pointers only call valid functions with a matching type signature. This
change eliminates function pointer casts that were causing cfi-icall
failures.
[1] https://www.chromium.org/developers/testing/control-flow-integrity
BUG=chromium:827826
Change-Id: I5db021d06390a6cefd670fdd2f0d34c9e530465e
(cherry picked from commit 978eec2507)
Output is <.1% difference in size, randomly.
Speed is 30-50% faster (-m 0 -sharp_yuv).
It also gives the exact same output on ARM and x86, because floats
are no longer used.
Change-Id: Id0f0aa748cc4fc0b82bac1fc5ca954775a0a1b7c
do_copy is a loop invariant, but based on a variable parameter; it would
only be extracted if Import was inlined.
Change-Id: Id5b4a1a4a83a4f2083444da4934e4c994df65b44
for q<=98, we always enable error diffusion.
+ reduce storage 2x by using int8_t
+ make the error diffusion more robust
BUG=webp:340,308
Change-Id: I0608df839ff7b64d6843005a0f81d2577143af9e
* regarding alpha_data_ used for testing.
alpha_data_!=NULL is as close a good test as we'll get.
* regarding filter-strength / sharpness forcing
no practical use (can be done during encode cycles,
for experimentation)
* regarding a 'less-complex' filtering:
no practical use so far. Next version!
Change-Id: If2dfff5818552a7d3e7c23ac08d64fe6d270229c
For large images overflowing the partition0, we re-do a number
of passes but were forgetting to reset the block_count[].
This was leading to incorrect summary.
+ some cosmetic fixes here and there
BUG=webp:355
Change-Id: Ie87158d7f177f8efdca429b146cfcd0e81652d2f
We can't predict if the quality is going to be below the threshold
eventually, so we might as well enable it always.
Change-Id: I30aedecc8c6d4daf159f6ef152697df0206d1e93
This fixes some color smearing due to heavy quantization.
This is only enabled for q <= 30 (cf ERROR_DIFFUSION_QUALITY)
Change-Id: I07e83a4d38461357a32c9e214f7eadc6db73baa9
with WEBP_NEON_OMIT_C_CODE the default _C functions won't be set and
with WEBP_REDUCE_CSP the NEON functions won't be either triggering an
assert for an empty table member.
BUG=chromium:792627
Change-Id: I8d2d430eaa37bb92885b61a3dd39f961924a8def
RGB to YUV conversion was not using SSE to finish up the row.
End data is now copied to a buffer big enough to fit in a
SSE register.
(UPSAMPLE_LAST_BLOCK was already using that trick).
Change-Id: Ie539bcbe570a643a774aa88263503c0d2c41890f
alpha processing is still required when requesting premultiplied output
since:
1b27bf8b WEBP_REDUCE_SIZE: disable all rescaler code
Change-Id: Id1b03256c4c04b8db31527e60cd31dd20ce6f3ad
only supported ones are: RGBA/BGRA/rgbA/bgrA (decoder)
as well as: WebPPictureImportRGB/RGBX/RGBA (encoder).
(note: extras/get_disto is affected too)
Change-Id: If6c4f95054ca15759c4e289fb3b4c352b3521c2c
(cherry picked from commit 6de20df02c)
remove auto-filter (-af) support and make WebPPictureCopy,
WebPPictureIsView, WebPPictureView, WebPPictureCrop, and
WebPPictureRescale noops.
Change-Id: If39d512cc268a0015298a1138dbc94feb86575e5
with gcc-4.8, clang-4.0.1/5 this is no faster (actually up to 2x slower)
than the code generated for memset (0x01010... * dst[-1]). shuffles in
sse4 recover a bit, but performance is still down.
Change-Id: Ie85e8353f8ede559d0b05a1d388787fd18ecc80f
Rewrote WebPPictureHasTransparency() to use them (even for argb).
This is 10% faster, for some reasons.
SSE2 version should be straightforward.
Removes a TODO.
Change-Id: I7ad5848fc5e355e2df505dbcd5a0f42fb6cbab41
The WebPDemux and WebPAnimDecoder APIs are provided for the purpose of
animated webp parsing and decoding. No major changes are currently
planned for the libwebp API.
Change-Id: I2758ecda195b0c4091572d5731a0a85fa3716303
this can result in an alignment hint on arm causing a SIGBUS. casting
the input ptr to anything aside from its type is unnecessary for memcpy
and is contrary to the intent of this function.
Change-Id: I9a4d3f4be90f80cd8c3e96ccbe557e51e34cf7a5
when targeting NEON C functions with NEON equivalents won't be used, but
will contribute to binary size. the same goes for sse2, etc., but this
change is primarily concerned with binary sizes for android arm targets.
note '-noasm' or otherwise modifying VP8GetCPUInfo will have no effect
on the use of NEON functions.
this decision can be overridden by defining WEBP_DSP_OMIT_C_CODE to 0.
Change-Id: I47bd453c84a3d341ca39bc986a39eb9c785aface
* changes:
{dec,enc}_neon: harmonize function suffixes x2
upsampling_neon: harmonize function suffixes
yuv_neon: harmonize function suffixes
rescaler_neon: harmonize function suffixes
lossless_neon: harmonize function suffixes
lossless_enc_neon: harmonize function suffixes
filters_neon,cosmetics: fix indent
enc_neon: harmonize function suffixes
dec_neon: harmonize function suffixes
and all older versions.
force Sub3() to not be inlined, otherwise the code in Select() will be
incorrect.
extends the check add previously in:
637b3888 dsp/lossless: workaround gcc-4.9 bug on arm
BUG=webp:363
Change-Id: I1403b558f8660b764f3a570a3326822d5ef0be29
avoids over reading if the reported ANMF payload is < 8 bytes.
likely broken since:
81b8a741 Design change in ANMF and FRGM chunks:
Change-Id: I3e267bafea348a50545587dea8fafb2199c6b650
ssse3 is bit #9 in ecx, bit 1 is sse3. this only controls the check for
slow ssse3 and likely had no ill effect.
Change-Id: I84ce73dc480e1cdbd085e37be06f3f402116c201
calculate the file duration using unsigned math. this could still result
in an incorrect average duration calculation if there were multiple
rollovers. caching the duration is an option if it was desirable to
support such an extreme case.
Change-Id: I3875d94d081fec947c03a857055df6e27ff5351d
For a pixel, we look for the longest match starting in a window around it.
For the following pixel, the previous result can be used and smaller
search window is used.
Change-Id: Ice16f9a7c8754099d068380848f0d77de3f756ac
add a check for cpu-features.h and rework some of the ifdef's around
android + neon. for android builds with cpu-features enabled the
*_neon.c files will still need to be flagged correctly (with e.g.,
.c.neon in Android.mk) to properly build them.
BUG=webp:353
Change-Id: I905ce305af0a204e560b915d8665093a3edaceb9
including the type in the macro doesn't bring much benefit to ordering,
current platforms work with a prefix, this would be insufficient if the
attribute needed to follow the function prototype. this form makes it
easier to override on the command line.
BUG=webp:355
Change-Id: Iba41ec0bb319403054be0e899c4cc472dd932fd9
this check was relocated in:
b903b80c Split cost-based backward references in its own file.
quiets -Wundef
Change-Id: I7f7a4773fb8cc77ca9f671b11f50d5db2275d415
If "exact" is false, we can modify the luma samples in fully transparent
areas to facilitate lossy compression. Experiments on some PNG images
show compression improvement of more than 20%.
Change-Id: I1a728cfa920a6652bc1f600d87c01f7f648c4942
Half of the functionality was duplicated.
The rest is about the alpha channel handling so we
might as well put it in the appropriate file.
Change-Id: I8d5ef0afce82cc4842ab7132fd97995c42e6140a