Compare commits

9 Commits

Author SHA1 Message Date
306335198d muxread: fix reading of buffers > riff size
After:
  2c70ad76 muxread,CreateInternal: fix riff size checks (cl/200674839)

`SizeWithPadding()` adds `CHUNK_HEADER_SIZE` (plus additional 1 byte
padding if needed). A later check included `CHUNK_HEADER_SIZE` before
capping the value of the size passed to `WebPMuxCreateInternal()`,
missing cases with a small amount of extra data after the RIFF chunk
(like a newline when the file is opened and saved in a text editor) and
setting size to an incorrect value, so larger sizes would also fail.

Another check of `riff_size < CHUNK_HEADER_SIZE` after the call to
`SizeWithPadding()` is removed because 1) it could not fail given
`SizeWithPadding()` adds `CHUNK_HEADER_SIZE` to the value; and 2) it is
redundant as `size < RIFF_HEADER_SIZE + CHUNK_HEADER_SIZE` is checked
earlier in the function.

Bug: webp:42340561
Change-Id: I58dc4f071b27c2841001b4012aabdb1869f64f97
2024-11-22 12:40:34 -08:00
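
The size handling described in this commit message can be sketched in isolation. This is a minimal illustration and not the libwebp code: `ClampToRiffSize` is a hypothetical helper, and the constant values (8-byte chunk header, 12-byte RIFF header) as well as the body of `SizeWithPadding()` are assumptions based on the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values; the commit message only names these constants. */
#define CHUNK_HEADER_SIZE 8  /* chunk tag + 32-bit payload size */
#define RIFF_HEADER_SIZE 12  /* "RIFF" + size + "WEBP" */

/* Assumed behavior: header size plus the payload rounded up to an even size. */
static size_t SizeWithPadding(size_t chunk_size) {
  return CHUNK_HEADER_SIZE + ((chunk_size + 1) & ~(size_t)1);
}

/* Hypothetical helper. On success returns 1 and stores in *out the number of
 * bytes that should be parsed; returns 0 if the buffer is too small. */
static int ClampToRiffSize(size_t size, size_t riff_payload, size_t* out) {
  size_t riff_size;
  if (size < RIFF_HEADER_SIZE + CHUNK_HEADER_SIZE) return 0;
  /* riff_size already includes CHUNK_HEADER_SIZE after this call. */
  riff_size = SizeWithPadding(riff_payload);
  if (riff_size > size) return 0;
  /* Cap against riff_size itself, not riff_size + CHUNK_HEADER_SIZE. */
  if (size > riff_size) size = riff_size;
  *out = size;
  return 1;
}
```

With a 100-byte RIFF payload, `riff_size` is 108. A 109-byte buffer (one trailing newline) is capped to 108 here, whereas a comparison against `riff_size + CHUNK_HEADER_SIZE` (116) would leave the size at 109 and include the stray byte.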
4c85d860ea yuv.h: update RGB<->YUV coefficients in comment
The values for the R/G/B floating point formulas resembled
https://fourcc.org/fccyvrgb.php and Video Demystified, but the fixed
point values are more closely aligned to rounded values from
https://en.wikipedia.org/wiki/YCbCr and BT.601.

The R/G/B formulas with the values prior to this change are added to
sharpyuv_csp.c as they align with the fixed values. The origin of those
coefficients is unclear. For consistency between library versions we'll
leave them as is.

Bug: webp:375011696
Change-Id: Id3f2a57530eee700cc52a899b32b25b5c015e89b
2024-11-21 16:21:45 -08:00
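
The relationship between the fixed-point values and the two sets of comment coefficients can be checked numerically. The sketch below is illustrative only: `FixedToFloat`, `Bt601LumaCoeff` and `AbsDiff` are hypothetical helpers, the 16 fractional bits follow the "16bit fixed point" note in yuv.h, and 0.299/0.587/0.114 are the standard BT.601 luma weights scaled to the limited luma range (219 of 255 steps).

```c
#include <assert.h>

#define YUV_FIX 16  /* fractional bits of the fixed-point coefficients */

/* Converts a YUV_FIX fixed-point coefficient to floating point. */
static double FixedToFloat(int coeff) {
  return (double)coeff / (double)(1 << YUV_FIX);
}

/* BT.601 luma weight scaled to the limited range [16,235]. */
static double Bt601LumaCoeff(double k) {
  return k * 219.0 / 255.0;
}

/* Absolute difference, to avoid depending on <math.h>. */
static double AbsDiff(double a, double b) {
  return (a > b) ? (a - b) : (b - a);
}
```

For the first row of `kWebpMatrix`, `FixedToFloat(16839)` is about 0.25694, which rounds to the 0.2569 of the formulas moved to sharpyuv_csp.c, while `Bt601LumaCoeff(0.299)` is about 0.25679, which rounds to the 0.2568 now quoted in yuv.h.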
0ab789e067 Merge changes I6dfedfd5,I2376e2dc into main
* changes:
  rework AddVectorEq_SSE2
  rework AddVector_SSE2
2024-11-15 02:58:10 +00:00
0323645066 {ios,xcframework}build.sh: fix compilation w/Xcode 16
Don't use `-fembed-bitcode`, fixes:
ld: warning: -bitcode_bundle is no longer supported and will be ignored
ld: -mllvm and -bitcode_bundle (Xcode setting ENABLE_BITCODE=YES) cannot
    be used together

Change-Id: I4ead0fc71da39bb5ec92c1f5ba467b95ad8b7461
2024-11-14 20:26:57 +00:00
61e2cfdadd rework AddVectorEq_SSE2
Take advantage of the known sizes used by VP8LHistogramAdd() and
remove loop for the remainder. The loop was being auto-vectorized making
the code larger and slower than the vectorized C code.

For larger sizes the new code is ~3-4.5% faster than the old code with
about the same improvement against the vectorized C code. For the
minimal size (40), the new code is ~30% faster than the C and old SSE2
code.

The LINE_SIZE==8 option is removed with this change. It had been set
to 16 for its entire life and clang-16 was unrolling the LINE_SIZE==8
case by 2 in any case; they both profile similarly.

Change-Id: I6dfedfd57474f44d15e2ce510a48e5252221077a
2024-11-14 12:21:39 -08:00
7bda3deb89 rework AddVector_SSE2
Take advantage of the known sizes used by VP8LHistogramAdd() and remove
loop for the remainder. The loop was being auto-vectorized making the
code larger and slower than the vectorized C code.

For larger sizes the new code is ~4-7% faster than the old code with
about the same improvement against the vectorized C code. For the
minimal size (40), the new code is ~30% faster than the C and old SSE2
code.

The LINE_SIZE==8 option is removed with this change. It had been set to
16 for its entire life and clang-16 was unrolling the LINE_SIZE==8 case
by 2 in any case; they both profile similarly.

Change-Id: I2376e2dca3bffa38477b4a432f4c533419e3be0e
2024-11-14 12:21:33 -08:00
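
The way both reworked functions split the known sizes into fixed blocks, instead of looping over a remainder, can be sketched without intrinsics. This is a portable scalar illustration under stated assumptions, not the SSE2 code: `AddBlock` and `AddVectorSketch` are hypothetical names, and each `AddBlock` call stands in for the corresponding group of 128-bit (or 64-bit) loads, adds and stores.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a fixed-width group of SIMD load/add/store operations. */
static void AddBlock(const uint32_t* a, const uint32_t* b, uint32_t* out,
                     int n) {
  int j;
  for (j = 0; j < n; ++j) out[j] = a[j] + b[j];
}

/* Sizes are assumed to be even, at least 16, and to leave a remainder of
 * 0, 2 or 4 modulo 8, matching the sizes listed in the commit message. */
static void AddVectorSketch(const uint32_t* a, const uint32_t* b,
                            uint32_t* out, int size) {
  int i = 0;
  const int aligned_size = size & ~15;
  assert(size >= 16);
  assert(size % 2 == 0);
  do {  /* main part: 16 values per iteration */
    AddBlock(a + i, b + i, out + i, 16);
    i += 16;
  } while (i != aligned_size);
  if ((size & 8) != 0) {  /* at most one block of 8 */
    AddBlock(a + i, b + i, out + i, 8);
    i += 8;
  }
  size &= 7;
  if (size == 4) {  /* tail of 4 or 2, no loop needed */
    AddBlock(a + i, b + i, out + i, 4);
  } else if (size == 2) {
    AddBlock(a + i, b + i, out + i, 2);
  }
}
```

For the minimal size of 40 this runs two 16-wide iterations and one 8-wide block; for 284 (256 + 24 + 4) it runs seventeen 16-wide iterations, one 8-wide block and a tail of 4.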
2ddaaf0aa5 Fix variable names in SharpYuvComputeConversionMatrix
Change-Id: Ia07e71aae42396100a4f50dc104e828239522d77
2024-11-07 09:37:40 +01:00
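
The renaming matters because each scale belongs to a specific chroma channel: U (Cb) is a scaled `B - Y` and depends on `kb`, while V (Cr) is a scaled `R - Y` and depends on `kr`. A minimal sketch, assuming the BT.601 weights kr = 0.299 and kb = 0.114 as example input; `ChromaScales`, `ComputeChromaScales` and `AbsDiff` are hypothetical names, not part of libsharpyuv.

```c
#include <assert.h>

/* Hypothetical container for the two chroma scale factors. */
typedef struct {
  double cb;  /* scale applied to (B - Y) to obtain U (Cb) */
  double cr;  /* scale applied to (R - Y) to obtain V (Cr) */
} ChromaScales;

/* Same expressions as in the diff, with the corrected names. */
static ChromaScales ComputeChromaScales(double kr, double kb) {
  ChromaScales s;
  s.cb = 0.5 / (1.0 - kb);
  s.cr = 0.5 / (1.0 - kr);
  return s;
}

/* Absolute difference, to avoid depending on <math.h>. */
static double AbsDiff(double a, double b) {
  return (a > b) ? (a - b) : (b - a);
}
```

With these weights, `cb` is about 0.5643 and `cr` about 0.7133. Since `cb * (1 - kb)` equals 0.5, the blue coefficient of U reaches the full half-range, and likewise for the red coefficient of V.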
a3ba6f19e9 Makefile.vc: fix gif2webp link error
Add missing dependency on libsharpyuv.

needed after:
f999d94f gif2webp: add -sharp_yuv/-near_lossless

Change-Id: I8bdd5c0fd4622f9c8ec6ffdf4ac11399f86350da
2024-11-06 10:14:05 -08:00
f999d94f4a gif2webp: add -sharp_yuv/-near_lossless
This change is the same as the one that introduced the options to
img2webp:
0825faa4 img2webp: add -sharp_yuv/-near_lossless

Change-Id: Id380d159299c38dd6440f833d487e00c0976afec
2024-11-04 12:29:24 -08:00
12 changed files with 115 additions and 41 deletions

@@ -567,7 +567,8 @@ if(WEBP_BUILD_GIF2WEBP)
   add_executable(gif2webp ${GIF2WEBP_SRCS})
   target_link_libraries(gif2webp exampleutil imageioutil webp libwebpmux
                         ${WEBP_DEP_GIF_LIBRARIES})
-  target_include_directories(gif2webp PRIVATE ${CMAKE_CURRENT_BINARY_DIR}/src)
+  target_include_directories(gif2webp PRIVATE ${CMAKE_CURRENT_BINARY_DIR}/src
+                             ${CMAKE_CURRENT_SOURCE_DIR})
   install(TARGETS gif2webp RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR})
 endif()

@@ -393,7 +393,7 @@ $(DIRBIN)\dwebp.exe: $(IMAGEIO_UTIL_OBJS)
 $(DIRBIN)\dwebp.exe: $(LIBWEBPDEMUX)
 $(DIRBIN)\gif2webp.exe: $(DIROBJ)\examples\gif2webp.obj $(EX_GIF_DEC_OBJS)
 $(DIRBIN)\gif2webp.exe: $(EX_UTIL_OBJS) $(IMAGEIO_UTIL_OBJS) $(LIBWEBPMUX)
-$(DIRBIN)\gif2webp.exe: $(LIBWEBP)
+$(DIRBIN)\gif2webp.exe: $(LIBWEBP) $(LIBSHARPYUV)
 $(DIRBIN)\vwebp.exe: $(DIROBJ)\examples\vwebp.obj $(EX_UTIL_OBJS)
 $(DIRBIN)\vwebp.exe: $(IMAGEIO_UTIL_OBJS) $(LIBWEBPDEMUX) $(LIBWEBP)
 $(DIRBIN)\vwebp_sdl.exe: $(DIROBJ)\extras\vwebp_sdl.obj

@@ -354,6 +354,10 @@ Options:
   -lossy ................. encode image using lossy compression
   -mixed ................. for each frame in the image, pick lossy
                            or lossless compression heuristically
+  -near_lossless <int> ... use near-lossless image preprocessing
+                           (0..100=off), default=100
+  -sharp_yuv ............. use sharper (and slower) RGB->YUV conversion
+                           (lossy only)
   -q <float> ............. quality factor (0:small..100:big)
   -m <int> ............... compression method (0=fast, 6=slowest)
   -min_size .............. minimize output size (default:off)

@@ -67,7 +67,7 @@ dwebp_LDADD += ../src/libwebp.la
 dwebp_LDADD +=$(PNG_LIBS) $(JPEG_LIBS)
 gif2webp_SOURCES = gif2webp.c gifdec.c gifdec.h
-gif2webp_CPPFLAGS = $(AM_CPPFLAGS) $(GIF_INCLUDES)
+gif2webp_CPPFLAGS = $(AM_CPPFLAGS) $(GIF_INCLUDES) -I$(top_srcdir)
 gif2webp_LDADD =
 gif2webp_LDADD += libexample_util.la
 gif2webp_LDADD += ../imageio/libimageio_util.la

@@ -28,6 +28,7 @@
 #endif
 #include <gif_lib.h>
+#include "sharpyuv/sharpyuv.h"
 #include "webp/encode.h"
 #include "webp/mux.h"
 #include "../examples/example_util.h"
@@ -70,6 +71,11 @@ static void Help(void) {
   printf(" -lossy ................. encode image using lossy compression\n");
   printf(" -mixed ................. for each frame in the image, pick lossy\n"
          " or lossless compression heuristically\n");
+  printf(" -near_lossless <int> ... use near-lossless image preprocessing\n"
+         " (0..100=off), default=100\n");
+  printf(" -sharp_yuv ............. use sharper (and slower) RGB->YUV "
+         "conversion\n"
+         " (lossy only)\n");
   printf(" -q <float> ............. quality factor (0:small..100:big)\n");
   printf(" -m <int> ............... compression method (0=fast, 6=slowest)\n");
   printf(" -min_size .............. minimize output size (default:off)\n"
@@ -166,6 +172,10 @@ int main(int argc, const char* argv[]) {
     } else if (!strcmp(argv[c], "-mixed")) {
       enc_options.allow_mixed = 1;
       config.lossless = 0;
+    } else if (!strcmp(argv[c], "-near_lossless") && c < argc - 1) {
+      config.near_lossless = ExUtilGetInt(argv[++c], 0, &parse_error);
+    } else if (!strcmp(argv[c], "-sharp_yuv")) {
+      config.use_sharp_yuv = 1;
     } else if (!strcmp(argv[c], "-loop_compatibility")) {
       loop_compatibility = 1;
     } else if (!strcmp(argv[c], "-q") && c < argc - 1) {
@@ -226,10 +236,13 @@ int main(int argc, const char* argv[]) {
     } else if (!strcmp(argv[c], "-version")) {
       const int enc_version = WebPGetEncoderVersion();
       const int mux_version = WebPGetMuxVersion();
+      const int sharpyuv_version = SharpYuvGetVersion();
       printf("WebP Encoder version: %d.%d.%d\nWebP Mux version: %d.%d.%d\n",
              (enc_version >> 16) & 0xff, (enc_version >> 8) & 0xff,
              enc_version & 0xff, (mux_version >> 16) & 0xff,
              (mux_version >> 8) & 0xff, mux_version & 0xff);
+      printf("libsharpyuv: %d.%d.%d\n", (sharpyuv_version >> 24) & 0xff,
+             (sharpyuv_version >> 16) & 0xffff, sharpyuv_version & 0xff);
       FREE_WARGV_AND_RETURN(EXIT_SUCCESS);
     } else if (!strcmp(argv[c], "-quiet")) {
       quiet = 1;

@@ -53,7 +53,7 @@ DEMUXLIBLIST=''
 if [[ -z "${SDK}" ]]; then
   echo "iOS SDK not available"
   exit 1
-elif [[ ${SDK%%.*} -gt 8 ]]; then
+elif [[ ${SDK%%.*} -gt 8 && "${XCODE%%.*}" -lt 16 ]]; then
   EXTRA_CFLAGS="-fembed-bitcode"
 elif [[ ${SDK%%.*} -le 6 ]]; then
   echo "You need iOS SDK version 6.0 or above"

@@ -1,5 +1,5 @@
 .\" Hey, EMACS: -*- nroff -*-
-.TH GIF2WEBP 1 "July 18, 2024"
+.TH GIF2WEBP 1 "November 4, 2024"
 .SH NAME
 gif2webp \- Convert a GIF image to WebP
 .SH SYNOPSIS
@@ -39,6 +39,18 @@ Encode the image using lossy compression.
 Mixed compression mode: optimize compression of the image by picking either
 lossy or lossless compression for each frame heuristically.
 .TP
+.BI \-near_lossless " int
+Specify the level of near\-lossless image preprocessing. This option adjusts
+pixel values to help compressibility, but has minimal impact on the visual
+quality. It triggers lossless compression mode automatically. The range is 0
+(maximum preprocessing) to 100 (no preprocessing, the default). The typical
+value is around 60. Note that lossy with \fB\-q 100\fP can at times yield
+better results.
+.TP
+.B \-sharp_yuv
+Use more accurate and sharper RGB->YUV conversion. Note that this process is
+slower than the default 'fast' RGB->YUV conversion.
+.TP
 .BI \-q " float
 Specify the compression factor for RGB channels between 0 and 100. The default
 is 75.

@@ -22,16 +22,16 @@ void SharpYuvComputeConversionMatrix(const SharpYuvColorSpace* yuv_color_space,
   const float kr = yuv_color_space->kr;
   const float kb = yuv_color_space->kb;
   const float kg = 1.0f - kr - kb;
-  const float cr = 0.5f / (1.0f - kb);
-  const float cb = 0.5f / (1.0f - kr);
+  const float cb = 0.5f / (1.0f - kb);
+  const float cr = 0.5f / (1.0f - kr);
   const int shift = yuv_color_space->bit_depth - 8;
   const float denom = (float)((1 << yuv_color_space->bit_depth) - 1);
   float scale_y = 1.0f;
   float add_y = 0.0f;
-  float scale_u = cr;
-  float scale_v = cb;
+  float scale_u = cb;
+  float scale_v = cr;
   float add_uv = (float)(128 << shift);
   assert(yuv_color_space->bit_depth >= 8);
@@ -60,6 +60,10 @@ void SharpYuvComputeConversionMatrix(const SharpYuvColorSpace* yuv_color_space,
 // Matrices are in YUV_FIX fixed point precision.
 // WebP's matrix, similar but not identical to kRec601LimitedMatrix
+// Derived using the following formulas:
+// Y = 0.2569 * R + 0.5044 * G + 0.0979 * B + 16
+// U = -0.1483 * R - 0.2911 * G + 0.4394 * B + 128
+// V = 0.4394 * R - 0.3679 * G - 0.0715 * B + 128
 static const SharpYuvConversionMatrix kWebpMatrix = {
     {16839, 33059, 6420, 16 << 16},
     {-9719, -19081, 28800, 128 << 16},

@@ -175,64 +175,102 @@ static void CollectColorRedTransforms_SSE2(const uint32_t* WEBP_RESTRICT argb,
 // Note we are adding uint32_t's as *signed* int32's (using _mm_add_epi32). But
 // that's ok since the histogram values are less than 1<<28 (max picture size).
-#define LINE_SIZE 16 // 8 or 16
 static void AddVector_SSE2(const uint32_t* WEBP_RESTRICT a,
                            const uint32_t* WEBP_RESTRICT b,
                            uint32_t* WEBP_RESTRICT out, int size) {
-  int i;
-  for (i = 0; i + LINE_SIZE <= size; i += LINE_SIZE) {
+  int i = 0;
+  int aligned_size = size & ~15;
+  // Size is, at minimum, NUM_DISTANCE_CODES (40) and may be as large as
+  // NUM_LITERAL_CODES (256) + NUM_LENGTH_CODES (24) + (0 or a non-zero power of
+  // 2). See the usage in VP8LHistogramAdd().
+  assert(size >= 16);
+  assert(size % 2 == 0);
+  do {
     const __m128i a0 = _mm_loadu_si128((const __m128i*)&a[i + 0]);
     const __m128i a1 = _mm_loadu_si128((const __m128i*)&a[i + 4]);
-#if (LINE_SIZE == 16)
     const __m128i a2 = _mm_loadu_si128((const __m128i*)&a[i + 8]);
     const __m128i a3 = _mm_loadu_si128((const __m128i*)&a[i + 12]);
-#endif
     const __m128i b0 = _mm_loadu_si128((const __m128i*)&b[i + 0]);
     const __m128i b1 = _mm_loadu_si128((const __m128i*)&b[i + 4]);
-#if (LINE_SIZE == 16)
     const __m128i b2 = _mm_loadu_si128((const __m128i*)&b[i + 8]);
     const __m128i b3 = _mm_loadu_si128((const __m128i*)&b[i + 12]);
-#endif
     _mm_storeu_si128((__m128i*)&out[i + 0], _mm_add_epi32(a0, b0));
     _mm_storeu_si128((__m128i*)&out[i + 4], _mm_add_epi32(a1, b1));
-#if (LINE_SIZE == 16)
     _mm_storeu_si128((__m128i*)&out[i + 8], _mm_add_epi32(a2, b2));
     _mm_storeu_si128((__m128i*)&out[i + 12], _mm_add_epi32(a3, b3));
-#endif
+    i += 16;
+  } while (i != aligned_size);
+  if ((size & 8) != 0) {
+    const __m128i a0 = _mm_loadu_si128((const __m128i*)&a[i + 0]);
+    const __m128i a1 = _mm_loadu_si128((const __m128i*)&a[i + 4]);
+    const __m128i b0 = _mm_loadu_si128((const __m128i*)&b[i + 0]);
+    const __m128i b1 = _mm_loadu_si128((const __m128i*)&b[i + 4]);
+    _mm_storeu_si128((__m128i*)&out[i + 0], _mm_add_epi32(a0, b0));
+    _mm_storeu_si128((__m128i*)&out[i + 4], _mm_add_epi32(a1, b1));
+    i += 8;
   }
-  for (; i < size; ++i) {
-    out[i] = a[i] + b[i];
+  size &= 7;
+  if (size == 4) {
+    const __m128i a0 = _mm_loadu_si128((const __m128i*)&a[i]);
+    const __m128i b0 = _mm_loadu_si128((const __m128i*)&b[i]);
+    _mm_storeu_si128((__m128i*)&out[i], _mm_add_epi32(a0, b0));
+  } else if (size == 2) {
+    const __m128i a0 = _mm_loadl_epi64((const __m128i*)&a[i]);
+    const __m128i b0 = _mm_loadl_epi64((const __m128i*)&b[i]);
+    _mm_storel_epi64((__m128i*)&out[i], _mm_add_epi32(a0, b0));
   }
 }
 static void AddVectorEq_SSE2(const uint32_t* WEBP_RESTRICT a,
                              uint32_t* WEBP_RESTRICT out, int size) {
-  int i;
-  for (i = 0; i + LINE_SIZE <= size; i += LINE_SIZE) {
+  int i = 0;
+  int aligned_size = size & ~15;
+  // Size is, at minimum, NUM_DISTANCE_CODES (40) and may be as large as
+  // NUM_LITERAL_CODES (256) + NUM_LENGTH_CODES (24) + (0 or a non-zero power of
+  // 2). See the usage in VP8LHistogramAdd().
+  assert(size >= 16);
+  assert(size % 2 == 0);
+  do {
     const __m128i a0 = _mm_loadu_si128((const __m128i*)&a[i + 0]);
     const __m128i a1 = _mm_loadu_si128((const __m128i*)&a[i + 4]);
-#if (LINE_SIZE == 16)
     const __m128i a2 = _mm_loadu_si128((const __m128i*)&a[i + 8]);
     const __m128i a3 = _mm_loadu_si128((const __m128i*)&a[i + 12]);
-#endif
     const __m128i b0 = _mm_loadu_si128((const __m128i*)&out[i + 0]);
     const __m128i b1 = _mm_loadu_si128((const __m128i*)&out[i + 4]);
-#if (LINE_SIZE == 16)
     const __m128i b2 = _mm_loadu_si128((const __m128i*)&out[i + 8]);
     const __m128i b3 = _mm_loadu_si128((const __m128i*)&out[i + 12]);
-#endif
     _mm_storeu_si128((__m128i*)&out[i + 0], _mm_add_epi32(a0, b0));
     _mm_storeu_si128((__m128i*)&out[i + 4], _mm_add_epi32(a1, b1));
-#if (LINE_SIZE == 16)
     _mm_storeu_si128((__m128i*)&out[i + 8], _mm_add_epi32(a2, b2));
     _mm_storeu_si128((__m128i*)&out[i + 12], _mm_add_epi32(a3, b3));
-#endif
+    i += 16;
+  } while (i != aligned_size);
+  if ((size & 8) != 0) {
+    const __m128i a0 = _mm_loadu_si128((const __m128i*)&a[i + 0]);
+    const __m128i a1 = _mm_loadu_si128((const __m128i*)&a[i + 4]);
+    const __m128i b0 = _mm_loadu_si128((const __m128i*)&out[i + 0]);
+    const __m128i b1 = _mm_loadu_si128((const __m128i*)&out[i + 4]);
+    _mm_storeu_si128((__m128i*)&out[i + 0], _mm_add_epi32(a0, b0));
+    _mm_storeu_si128((__m128i*)&out[i + 4], _mm_add_epi32(a1, b1));
+    i += 8;
   }
-  for (; i < size; ++i) {
-    out[i] += a[i];
+  size &= 7;
+  if (size == 4) {
+    const __m128i a0 = _mm_loadu_si128((const __m128i*)&a[i]);
+    const __m128i b0 = _mm_loadu_si128((const __m128i*)&out[i]);
+    _mm_storeu_si128((__m128i*)&out[i], _mm_add_epi32(a0, b0));
+  } else if (size == 2) {
+    const __m128i a0 = _mm_loadl_epi64((const __m128i*)&a[i]);
+    const __m128i b0 = _mm_loadl_epi64((const __m128i*)&out[i]);
+    _mm_storel_epi64((__m128i*)&out[i], _mm_add_epi32(a0, b0));
   }
 }
-#undef LINE_SIZE
 //------------------------------------------------------------------------------
 // Entropy

@@ -11,15 +11,15 @@
 //
 // The exact naming is Y'CbCr, following the ITU-R BT.601 standard.
 // More information at: https://en.wikipedia.org/wiki/YCbCr
-// Y = 0.2569 * R + 0.5044 * G + 0.0979 * B + 16
-// U = -0.1483 * R - 0.2911 * G + 0.4394 * B + 128
-// V = 0.4394 * R - 0.3679 * G - 0.0715 * B + 128
+// Y = 0.2568 * R + 0.5041 * G + 0.0979 * B + 16
+// U = -0.1482 * R - 0.2910 * G + 0.4392 * B + 128
+// V = 0.4392 * R - 0.3678 * G - 0.0714 * B + 128
 // We use 16bit fixed point operations for RGB->YUV conversion (YUV_FIX).
 //
 // For the Y'CbCr to RGB conversion, the BT.601 specification reads:
 // R = 1.164 * (Y-16) + 1.596 * (V-128)
-// G = 1.164 * (Y-16) - 0.813 * (V-128) - 0.391 * (U-128)
-// B = 1.164 * (Y-16) + 2.018 * (U-128)
+// G = 1.164 * (Y-16) - 0.813 * (V-128) - 0.392 * (U-128)
+// B = 1.164 * (Y-16) + 2.017 * (U-128)
 // where Y is in the [16,235] range, and U/V in the [16,240] range.
 //
 // The fixed-point implementation used here is:

@@ -223,11 +223,11 @@ WebPMux* WebPMuxCreateInternal(const WebPData* bitstream, int copy_data,
   // Note this padding is historical and differs from demux.c which does not
   // pad the file size.
   riff_size = SizeWithPadding(riff_size);
-  if (riff_size < CHUNK_HEADER_SIZE) goto Err;
   if (riff_size > size) goto Err;
-  // There's no point in reading past the end of the RIFF chunk.
-  if (size > riff_size + CHUNK_HEADER_SIZE) {
-    size = riff_size + CHUNK_HEADER_SIZE;
+  // There's no point in reading past the end of the RIFF chunk. Note riff_size
+  // includes CHUNK_HEADER_SIZE after SizeWithPadding().
+  if (size > riff_size) {
+    size = riff_size;
   }
   end = data + size;

@@ -172,7 +172,9 @@ for (( i = 0; i < $NUM_PLATFORMS; ++i )); do
   CFLAGS="-pipe -isysroot ${SDKROOT} -O3 -DNDEBUG"
   case "${PLATFORM}" in
     iPhone*)
-      CFLAGS+=" -fembed-bitcode"
+      if [[ "${XCODE%%.*}" -lt 16 ]]; then
+        CFLAGS+=" -fembed-bitcode"
+      fi
       CFLAGS+=" -target ${ARCH}-apple-ios${IOS_MIN_VERSION}"
       [[ "${PLATFORM}" == *Simulator* ]] && CFLAGS+="-simulator"
       ;;