Commit Graph

119 Commits

Author SHA1 Message Date
James Zern
979c0ebbcd sharpyuv: add SharpYuvGetCPUInfo
This gives a similar structure to libwebp and fixes a bug where passing
NULL to SharpYuvInit() would unconditionally set optimized function
pointers. SharpYuvInit() is left as an undocumented public function and
SharpYuvGetCPUInfo is kept private to serialize updates to the pointer.

Change-Id: Id72fbf3ba5b396367510e3bcd1ee2e4e11b95b8c
2022-10-26 13:07:01 -07:00
Maryla
2d607ee646 sharpyuv: increase precision of gamma<->linear conversion
Change-Id: I261bae3628315bda4ec0dafb8798c7512dd03a36
2022-06-01 15:38:44 +02:00
Maryla
01a05de1a7 libsharpyuv: add colorspace utilities
Change-Id: I620c8593dc81f39e747de5ed354332adb7278bed
2022-04-08 10:07:22 +02:00
Maryla
d55d447c9a Make libwebp depend on libsharpyuv.
Change-Id: I6d8ebfe1f855024fc0694b1aa584f71fa27b83ae
2022-03-04 11:35:03 +01:00
Ilya Kurdyukov
a09a647241 SSE4.1 version of TransformColorInverse
Change-Id: I6ba5cb35917eef7a52152c4924eca205b4af7220
2021-02-18 12:42:39 +01:00
Johann
0fd7514b55 neon: SetResidualCoeffs
Much faster with aarch64. Still somewhat faster without vmaxv.

C: 3.700s
ArmV7: 3.675
aarch64: 3.600

BUG=b/118740850

Change-Id: I3be852da89633eca4bddce443c87f5e4a2f55868
2018-11-14 11:46:40 -08:00
Vincent Rabaud
cbf82cc04d Remove AVX2 files.
There is only enc_avx2.c and we never managed to get
something fast enough.

Change-Id: I7465b5d8ccf47d9aa612173b8f80f96060cdb366
2018-10-16 14:12:03 +02:00
James Zern
c56a02d971 Android.mk: use LOCAL_EXPORT_C_INCLUDES w/public libs
dependents can then pickup the include path for webp/ automatically

Change-Id: Ie768a93d0054f8ebc1720f16fbb550c0b10ef61d
2018-06-07 16:12:58 -07:00
James Zern
f4dd92565e remove WEBP_EXPERIMENTAL_FEATURES
the webp bitstream is considered stable at this point

Change-Id: I4b13f9ed4c45f63785474b097e96cb7bf651be7b
2018-02-09 10:25:11 -08:00
Vincent Rabaud
807b53c47e Implement the upsampling/yuv functions in SSE41
Change-Id: If122da22b74a974262063d232f6ca0ab902ff64e
2017-12-04 22:29:43 +01:00
James Zern
4fbdc9fb12 Android.mk,mips: fix clang build with r15
-integrated-as is now required, the opposite of r14

Change-Id: Ic478b2b3b933e66e7d159030eac29f58743eecda
2017-08-31 22:42:14 -07:00
James Zern
c6d1db4b36 fix Android standalone toolchain build
add a check for cpu-features.h and rework some of the ifdef's around
android + neon. for android builds with cpu-features enabled the
*_neon.c files will still need to be flagged correctly (with e.g.,
.c.neon in Android.mk) to properly build them.

BUG=webp:353

Change-Id: I905ce305af0a204e560b915d8665093a3edaceb9
2017-08-01 22:59:03 -07:00
Vincent Rabaud
8acb4942f7 Remove the argb* files.
Half of the functionality was duplicated.
The rest is about the alpha channel handling so we
might as well put it in the appropriate file.

Change-Id: I8d5ef0afce82cc4842ab7132fd97995c42e6140a
2017-06-25 14:44:33 +02:00
James Zern
67de68b5d9 Android.mk/build.gradle: fix mips build with clang from r14b
fixes unknown instruction errors for e.g.,
usw    $15, 0($8)

BUG=webp:343

Change-Id: I71d00527fecd2370a40f6bd12f4e361fb525477f
2017-06-03 13:06:54 -07:00
Vincent Rabaud
b903b80c30 Split cost-based backward references in its own file.
Change-Id: I4d8281e69b0e41f7c90337e5be70a6c65b044086
2017-06-01 16:22:31 +02:00
Pascal Massimino
52245424b0 NEON implementation of some Sharp-YUV420 functions
Change-Id: I449ef9c76b06f971f6e2ad7f9db96bf906d8fe1f
new-file: dsp/yuv_neon.c
2017-04-18 19:22:37 +02:00
Pascal Massimino
693bf74ec0 move the SSIM calculation code in ssim.c / ssim_sse2.c
Change-Id: I63a63fa7f44f257f2e17e45358b206c23069c448
2017-02-21 12:53:35 +01:00
James Zern
668e1dd44f src/{dec,enc,utils}: give filenames a unique suffix
this avoids duplicates between these trees and dsp/, e.g., enc/tree.c,
dec/tree.c, making pulling the whole library source tree into one target
possible

BUG=webp:279

Change-Id: I060a614833c7c24ddd37bf641702ae6a5eef1775
2017-01-19 19:09:48 -08:00
Pascal Massimino
1de931c669 NEON: implement alpha-filters (horizontal/vertical/gradient)
gradient-filter code is not much faster, but maybe improvable in the future.

Change-Id: Ia16070e409fe8703b02276166f19526917df6b35
2017-01-17 15:44:46 +01:00
Pascal Massimino
49d0280df1 NEON: implement several alpha-processing functions
- ApplyAlphaMultiply
 - DispatchAlpha
 - DispatchAlphaToGreen
 - ExtractAlpha

Decoding to Argb / rgbA / ... is 10-15% faster (measured on N4)

new file: alpha_processing_neon.c

Change-Id: I40f1a809e9885d1031ff0bc886d8d001efa66bca
2017-01-11 17:39:29 +01:00
James Zern
de568abfdb Android.mk: use -fvisibility=hidden
brings the final libwebp.so size down 16/20K with arm64/armv7 builds
using ndk-r13

Change-Id: I20d8aba61d6b692b0fc32f4b271e2f9872f03c28
2016-11-18 19:24:09 -08:00
Vincent Rabaud
6cc48b1728 Move some lossless logic out of dsp.
Change-Id: I4cfd60cd5497666a2e1c188ceada2e71b05f1505
2016-09-13 15:37:32 +02:00
Parag Salasakar
d3ddacb625 Add MSA optimized YUV to RGB upsampling functions
We add the following MSA optimized YUV to RGB upsampling functions:
- UpsampleRgbLinePair
- UpsampleBgrLinePair
- UpsampleRgbaLinePair
- UpsampleBgraLinePair
- UpsampleArgbLinePair
- UpsampleRgba4444LinePair
- UpsampleRgb565LinePair

Change-Id: I7264a615edc7eb376e443e9d38bd8e3c9a2cab1f
2016-07-22 14:28:30 +00:00
James Zern
c379b55a93 move examples/{example_util,image_dec} to imageio/
Change-Id: I2508c786a095a2a75bebf766210c64e2af88f9b6
2016-07-19 19:06:29 -07:00
Parag Salasakar
9ac74f922e Add MSA optimized rescaling functions
We add the following MSA optimized rescaling functions:
- RescalerExportRowExpand
- RescalerExportRowShrink

Change-Id: Ic1c76065423b02617db94cf0c22bb564219b36e6
2016-07-19 15:52:42 +00:00
Parag Salasakar
cb19dbc1a4 Add MSA optimized color transform functions
We add the following MSA optimized color transform functions:
- TransformColor
- SubtractGreenFromBlueAndRed

Change-Id: Ib182d2b5faa7191f503ce70f0dfde0ac89402fd3
2016-07-18 13:49:24 +00:00
Parag Salasakar
435308e029 Add MSA optimized encoder transform functions
We add the following MSA optimized encoder transform functions:
- ITransform
- FTransform
- FTransformWHT

Change-Id: Ia6b17556aba5aff2d7a88208905fb45293d080a8
2016-07-05 14:35:47 +00:00
Parag Salasakar
dce64bfa1b Add MSA optimized alpha filter functions
We add the following MSA optimized alpha filter functions:
- HorizontalFilter
- VerticalFilter
- GradientFilter

Change-Id: I71e2e04050e569b8c0bf086fadf210ee16d50924
2016-07-01 19:58:25 +00:00
Parag Salasakar
701c772eed Add MSA optimized colorspace conversion functions
We add the following MSA optimized colorspace conversion functions:
- ConvertBGRAToRGBA
- ConvertBGRAToBGR
- ConvertBGRAToRGB

Change-Id: I76db1c829d593a06d4975d54dbafa385c82b84fb
2016-06-27 21:19:06 +00:00
Parag Salasakar
5e60c42a76 Added MSA optimized transform functions
1. TransformWHT
2. TransformTwo
3. TransformDC
4. TransformAC3

Change-Id: Ia3624cb4aed215bcaffce542b28794e643207039
2016-06-16 09:04:27 +00:00
skal
b4e731cd93 neon-implementation for rescaler code
It's better to stay with a 32b fixed-point precision overall, otherwise
the C-version on ARM gets *slower*.
Actually, gcc ARM compiler optimizes some instructions pretty
well when WEBP_RESCALER_FIX is exactly 32, even in C.

Change-Id: I0eea97f7db5947470f5af355dee098eca81e178d
2015-10-07 21:18:39 -07:00
Mislav Bradac
48f66b6687 Add delta_palettization feature to WebP
Change-Id: Ibaf4e49aa67d63d0eb11848cca4fd0c60815864a
2015-10-02 14:29:54 -07:00
Pascal Massimino
76a7dc39e5 rescaler: add some SSE2 code
The rounding and arithmetic is not the same as previously, to prevent overflow cases for large upscale factors.

We still rely on 32b x 32b -> 64b multiplies. Raised the fixed-point precision to 32b
so that we have some nice shifts from epi64 to epi32.
Changed rescaler_t type to 'uint32_t' in order to squeeze in all the precision required.

The MIPS code has been disabled because it's now out-of-sync. Will be fixed in
a subsequent CL when the dust settles.
~30-35% faster

Change-Id: I32e4ddc00933f1b1aa3463403086199fd5dad07b
2015-09-25 15:07:13 +02:00
Urvang Joshi
d39dc8f3cc Create a WebPAnimDecoder API.
This is designed for the simple use-case where one wants to decode all
frames one-by-one in order.

Also, use this API in anim_util library, which is in turn used by
anim_diff tool.

Change-Id: Ie8b653c04e867d40fd23321b3dd41b87689656c7
2015-09-02 16:23:10 -07:00
James Zern
14efabbf1c Android: limit use of cpufeatures
cpufeatures is only used with armeabi-v7a.*

Change-Id: I80284061d71d9defa50d139c7f1bda67c00f567e
2015-08-19 18:44:33 -07:00
Pascal Massimino
f3d687e3fa SSE4.1 implementation of some lossless encoding functions
New implementations: SubtractGreenFromBlueAndRed and TransformColor

around 1-2% faster lossless encoding.

Change-Id: I1668e36fdc316ba55b3b798b91b4a3e36ce62861
2015-06-23 08:46:57 +02:00
Pascal Massimino
bfc300c7ff SSE4.1 implementation of some alpha-processing functions
DispatchAlpha* functions are hard to speed up, compared to SSE2.
ExtractAlpha sees a ~15% speed-up though.

Change-Id: I8715c2defecbc832f469eed7e6ffd012146b52de
2015-06-19 14:17:39 -07:00
Pascal Massimino
94055503e3 encoding SSE4.1 stub for StoreHistogram + Quantize + SSE_16xN
Visible speed-up, thanks to pshufb and pabsw and psignw use.

had to tweak configure.ac to make "smmintri.h" presence correctly
detected (we need to set the CPPFLAGS instead of the CFLAGS!)

Change-Id: I2ab99e16a27a64fdf1f09b2b4e30a5e74ccca080
2015-03-25 20:23:51 -07:00
James Zern
553051f741 dsp/lossless: split enc/dec functions
adds lossless_enc*.c; reduces the size of the decode-only so: ~78K
w/gcc-4.8.2 on x86_64.

Change-Id: If5e4610b67d05eba5896bc64bab79e9df92b2092
2015-03-23 22:57:50 -07:00
Pascal Massimino
e9570dd987 stub for SSE4.1 support.
Change-Id: I0c845a98d2871cc8907ff7b914bab7747a92c7ed
2015-03-20 00:26:35 -07:00
James Zern
53c16ff047 Android.mk: add webpmux target
Change-Id: I60fc898fd804e23f08d760694192c5d04adcae91
2015-02-24 19:18:10 -08:00
James Zern
21852a00a1 Android.mk: add webpdemux target
Change-Id: I2fbbefbee59a96c52f5addcfc5bfe1216caad5cc
2015-02-24 19:18:10 -08:00
James Zern
8697a3bcc8 Android.mk: add webpdecoder{,_static} targets
webpdecoder_static is reused to create libwebpdecoder.so and
libwebp.{a,so}

Change-Id: I940293cb755040c0ea45dc13f22624de8f355867
2015-02-24 19:18:09 -08:00
James Zern
4a67049113 Android.mk: split source lists per-directory
will allow reuse in future targets

Change-Id: Iededc19d954226e62f2d2383a2b80f268d613647
2015-02-24 19:18:09 -08:00
Pascal Massimino
2a407092ab 4-5% faster encoding using SSE2 for GetResidualCost
new file: cost_sse2.c

Change-Id: I4896c07f5ff2443ef743f4435fe2758d95a672ed
2015-02-18 09:41:02 +01:00
Pascal Massimino
a987faedfa MIPS: dspr2: added optimization for function GetResidualCost
set/get residual C functions moved to new file in src/dsp
mips32 version of GetResidualCost moved to new file

Change-Id: I7cebb7933a89820ff28c187249a9181f281081d2
2015-02-07 02:13:26 -08:00
Pascal Massimino
022d2f886c add SSE2 variants for alpha filtering functions
The 'inverse' variants are harder to parallelize, since
the result of filtering is used for prediction.
The 'direct' way is relatively easier.

The heavy bottleneck left for optimization is still GradientUnfilter()

Change-Id: I358008f492a887e8fff6600cb27857b18dee86e9
2015-01-29 08:46:22 +01:00
Pascal Massimino
7afdaf8496 Alpha coding: reorganize the filter/unfiltering code
Move the filtering code to their own dsp/ spot
New function: VP8FiltersInit()

Change-Id: I0b2041eab42346c59b972f2575b05509e6a8f7b1
2015-01-28 08:02:41 +01:00
Djordje Pesut
cbcbedd0de move rescaler functions to rescaler* files in src/dsp/
Change-Id: I906add1b1010a59ebfcc2dd81e15745433cc206b
2015-01-09 16:47:09 +01:00
Pascal Massimino
ca7f60db5f SSE2 implementation of VP8PackARGB
Change-Id: I40c0e26a6a2701216e4ddebcf793aa535677f437
2015-01-05 05:17:51 -08:00