Pascal Massimino
44a0ee3fa7
introduce WebPHasAlpha8b and WebPHasAlpha32b
...
Rewrote WebPPictureHasTransparency() to use them (even for argb).
This is 10% faster, for some reason.
SSE2 version should be straightforward.
Removes a TODO.
Change-Id: I7ad5848fc5e355e2df505dbcd5a0f42fb6cbab41
2017-11-20 15:20:29 +01:00
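A minimal sketch of what the two alpha checks in the entry above might look like in plain C. The function names, signatures and the "any byte other than 0xff means transparency" rule are assumptions made for illustration; they are not copied from the library.

```c
#include <stdint.h>

// Hypothetical stand-in for an 8-bit alpha-plane check: returns 1 if any of
// the 'length' alpha samples is not fully opaque (0xff), 0 otherwise.
static int HasAlpha8bSketch(const uint8_t* src, int length) {
  while (length-- > 0) {
    if (*src++ != 0xff) return 1;
  }
  return 0;
}

// Hypothetical stand-in for a 32-bit check: 'src' points at the alpha byte of
// the first pixel and 'length' counts pixels, so every fourth byte is tested.
static int HasAlpha32bSketch(const uint8_t* src, int length) {
  int x;
  for (x = 0; length-- > 0; x += 4) {
    if (src[x] != 0xff) return 1;
  }
  return 0;
}
```

With interleaved 4-byte pixels, the 32-bit variant would be called with a pointer to the first alpha byte, so the same loop serves any channel order.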
Vincent Rabaud
c462cd0065
Remove useless code.
...
The casts are to the same type and the #define not used.
Change-Id: I8d69c3b9dde7a1c53c2ba5a026a653d8c2e1d2a7
2017-11-08 10:52:49 +01:00
James Zern
b7971d0e22
dsp: avoid defining _C functions w/NEON builds
...
when targeting NEON, C functions with NEON equivalents won't be used, but
will contribute to binary size. the same goes for sse2, etc., but this
change is primarily concerned with binary sizes for android arm targets.
note '-noasm' or otherwise modifying VP8GetCPUInfo will have no effect
on the use of NEON functions.
this decision can be overridden by defining WEBP_DSP_OMIT_C_CODE to 0.
Change-Id: I47bd453c84a3d341ca39bc986a39eb9c785aface
2017-10-27 10:54:56 -07:00
James Zern
8d033b14d7
{dec,enc}_neon: harmonize function suffixes x2
...
+ neon.h
BUG=webp:355
Change-Id: Ia17c7dfc7d61742a4758823675a2d556a739c389
2017-10-20 19:00:53 -07:00
James Zern
0295e9815d
upsampling_neon: harmonize function suffixes
...
BUG=webp:355
Change-Id: I75423abbe0bcea3c98a42e412cc2116be81b5d08
2017-10-20 19:00:53 -07:00
James Zern
d572c4e52b
yuv_neon: harmonize function suffixes
...
BUG=webp:355
Change-Id: Ia2f716b459950c18717b062175197d1e6419bf2a
2017-10-20 19:00:53 -07:00
James Zern
ab9c2500db
rescaler_neon: harmonize function suffixes
...
BUG=webp:355
Change-Id: I161caa14f7ebbc3ae978b1722472625a77d0a4a4
2017-10-20 19:00:53 -07:00
James Zern
93e0ce27f4
lossless_neon: harmonize function suffixes
...
BUG=webp:355
Change-Id: I4210081a39800b5c2589c443da237269908af666
2017-10-20 19:00:53 -07:00
James Zern
22fbc50edd
lossless_enc_neon: harmonize function suffixes
...
BUG=webp:355
Change-Id: I462facaeade4f0f4fc1e96895493306d095a6a9a
2017-10-20 19:00:53 -07:00
James Zern
447875b47b
filters_neon,cosmetics: fix indent
...
BUG=webp:355
Change-Id: I9df1119f1ea94868f75253a92c2e878c9290f744
2017-10-20 19:00:29 -07:00
James Zern
785da7eadd
enc_neon: harmonize function suffixes
...
BUG=webp:355
Change-Id: Ie59efd271d16f12d21f3c800667dfc0980dc2e68
2017-10-20 00:18:32 -07:00
James Zern
bc1a251fcf
dec_neon: harmonize function suffixes
...
BUG=webp:355
Change-Id: I61c9a0c9e24515322955e04afd8c4ea6a44b9319
2017-10-20 00:14:18 -07:00
James Zern
61e535f1ac
dsp/lossless: workaround gcc-4.8 bug on arm
...
and all older versions.
force Sub3() to not be inlined, otherwise the code in Select() will be
incorrect.
extends the check added previously in:
637b3888
dsp/lossless: workaround gcc-4.9 bug on arm
BUG=webp:363
Change-Id: I1403b558f8660b764f3a570a3326822d5ef0be29
2017-10-19 13:05:48 -07:00
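The workaround in the entry above amounts to keeping one helper out of line. A sketch of that idea follows; the macro name and the helper body are assumptions for illustration and may differ from the library's code.

```c
#include <stdlib.h>

// Hypothetical macro, not the library's own: forces a function out of line on
// GCC-compatible compilers and expands to nothing elsewhere.
#if defined(__GNUC__)
#define SKETCH_NOINLINE __attribute__((noinline))
#else
#define SKETCH_NOINLINE
#endif

// Difference of two absolute distances to 'c'. Marked non-inlinable so the
// compiler cannot fold it into its caller.
static SKETCH_NOINLINE int Sub3Sketch(int a, int b, int c) {
  const int pb = b - c;
  const int pa = a - c;
  return abs(pb) - abs(pa);
}
```

In a real build the attribute would presumably be applied only for the affected compiler versions and targets, so other builds keep the inlined form.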
Pascal Massimino
0a17f4712c
Merge "WIP: list includes as descendants of the project dir"
2017-10-11 08:21:42 +00:00
James Zern
a439972175
WIP: list includes as descendants of the project dir
...
#include "(.|..)/..." -> #include "src/..."
Change-Id: I772880aa097a770722043c8a4393552ba38a89b6
2017-10-10 23:04:05 -07:00
James Zern
d361a6a733
yuv_sse2: harmonize function suffixes
...
BUG=webp:355
Change-Id: I02a66f7446c75a10c3ce4766235e5767617d0dce
2017-10-08 14:06:34 -07:00
James Zern
6921aa6f0c
upsampling_sse2: harmonize function suffixes
...
BUG=webp:355
Change-Id: I3a02cc717eb7506bd87511d6a17ab1691e84f72c
2017-10-08 14:06:30 -07:00
James Zern
08c67d3ed1
ssim_sse2: harmonize function suffixes
...
BUG=webp:355
Change-Id: I1282559888118b8cb0a46b7f0aa627d26b8838f5
2017-10-08 14:06:24 -07:00
James Zern
582a1b572a
rescaler_sse2: harmonize function suffixes
...
BUG=webp:355
Change-Id: I978fd826ff90149c0ffd9d7607dcc6f88082d3e6
2017-10-08 14:06:19 -07:00
James Zern
2c1b18ba2f
lossless_sse2: harmonize function suffixes
...
BUG=webp:355
Change-Id: I59d828800c2ab2a36e0ea90f629b74bd57207411
2017-10-08 14:06:14 -07:00
James Zern
0ac46e818b
lossless_enc_sse2: harmonize function suffixes
...
BUG=webp:355
Change-Id: I06c64416103c3f3fc0519dd46d64b0a35f9798e4
2017-10-08 14:06:05 -07:00
James Zern
bc634d57c2
enc_sse2: harmonize function suffixes
...
BUG=webp:355
Change-Id: Idd2f289fcf99f12bf36494111b07a8906c99c826
2017-10-08 14:05:59 -07:00
James Zern
bcb7347c2b
dec_sse2: harmonize function suffixes
...
BUG=webp:355
Change-Id: Ic0390a4a24a5d8caff5b8af9fc9d59769ec533b1
2017-10-07 15:14:03 -07:00
James Zern
fb3daad604
cpu: fix ssse3 check
...
ssse3 is bit #9 in ecx, bit #0 is sse3. this only controls the check for
slow ssse3 and likely had no ill effect.
Change-Id: I84ce73dc480e1cdbd085e37be06f3f402116c201
2017-09-29 16:27:47 -07:00
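To illustrate the bit positions involved in the entry above, here is a small sketch that tests an ECX value as returned by CPUID leaf 1, assuming the standard layout where bit 0 reports SSE3 and bit 9 reports SSSE3. The helper names are hypothetical, and the ECX value is passed in rather than read from the CPU so the code stays portable.

```c
#include <stdint.h>

// CPUID leaf 1, ECX register: feature-bit positions (zero-based).
enum { kEcxBitSSE3 = 0, kEcxBitSSSE3 = 9 };

// Returns 1 if the SSE3 bit is set in 'ecx'.
static int EcxHasSSE3(uint32_t ecx) {
  return (int)((ecx >> kEcxBitSSE3) & 1u);
}

// Returns 1 if the SSSE3 bit is set in 'ecx'.
static int EcxHasSSSE3(uint32_t ecx) {
  return (int)((ecx >> kEcxBitSSSE3) & 1u);
}
```

Testing the wrong bit still returns a plausible answer on most CPUs, since processors that support SSSE3 also support SSE3, which may be why the mistake had no visible effect.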
Vincent Rabaud
a5216efc8c
Fix integer overflow warning.
...
Though the overflow could happen, it does not change the
end results.
Change-Id: I1b84e022a0776d35eab5c5c4fb7d3563f5667bfa
2017-09-25 11:02:22 +02:00
James Zern
f78da3dea6
add LOCAL_CLANG_PREREQ and avoid WORK_AROUND_GCC w/3.8+
...
this results in a 15-20% speedup for lossy decoding on a N5/S6/CM1
BUG=webp:339
Change-Id: Icdeb84c3e0b8908147ac276b4d8f76c3d565b735
2017-09-19 20:59:49 -07:00
James Zern
01c426f1e7
define WEBP_USE_INTRINSICS w/gcc-4.9+
...
32-bit builds are neutral to slightly faster using ndk r15c on a
N5/S6/CM1
BUG=webp:339
Change-Id: I94b9442e0ceaf2f5edb2b4026bc8b99cd77c918b
2017-09-19 20:59:43 -07:00
Pascal Massimino
3822762a6c
rationalize the Makefile.am
...
one library addition per line, etc...
BUG=webp:355
Change-Id: I95761dea598a382db5632c5187210937e129ff75
2017-08-29 00:00:14 -07:00
Pascal Massimino
42c79aa66b
Merge "Encoder: harmonize function suffixes"
2017-08-09 18:13:57 +00:00
skal
b09307dcde
Encoder: harmonize function suffixes
...
BUG=webp:355
Change-Id: Ia2fe95db7dfb303f3f64e390d43bc41b8933256c
2017-08-09 02:41:01 +00:00
James Zern
bed0456d58
Merge "SSIM: harmonize the function suffix"
2017-08-09 02:37:39 +00:00
skal
54f6a3cf3a
lossless_sse2.c: fix some missed suffix changes
...
BUG=webp:355
Change-Id: If830e3169a4021899ed850aa7edfd94b81fa2cf9
2017-08-08 14:19:05 -07:00
skal
088f1dcce8
SSIM: harmonize the function suffix
...
BUG=webp:355
Change-Id: I751852ddb2abb7319e41e6c7d022ac4f288b4d08
2017-08-08 08:52:06 -07:00
skal
a0f72a4fe0
VP8LTransformColorFunc: drop a non-respected 'const' from the signature.
...
BUG=webp:355
Change-Id: Ie99bf377a55db2950bfbac9423bfe0967623ea5d
2017-08-07 19:05:01 -07:00
Pascal Massimino
8c934902cd
Merge "Lossless dec: harmonize the function suffixes"
2017-08-08 02:04:10 +00:00
skal
622242aaba
Lossless dec: harmonize the function suffixes
...
BUG=webp:355
Change-Id: I445d64df6aa2e347f41e7af306be12a77e2ac6a5
2017-08-07 18:22:41 -07:00
skal
1411f02761
Lossless Enc: harmonize the function suffixes
...
BUG=webp:355
Change-Id: I8baf506bd2a27095b956ef22a862b071f60c0d72
2017-08-07 18:02:07 -07:00
James Zern
7beed2807b
add missing ()s to macro parameters
...
BUG=webp:355
Change-Id: I616c6d3540d6551edd1b1cfdb5bffcf0a044c90f
2017-08-04 17:02:53 -07:00
James Zern
6473d20b3e
Merge "fix Android standalone toolchain build"
2017-08-04 18:25:21 +00:00
James Zern
0c83a8bc69
Merge "yuv: harmonize suffix naming"
2017-08-02 06:35:36 +00:00
James Zern
c6d1db4b36
fix Android standalone toolchain build
...
add a check for cpu-features.h and rework some of the ifdef's around
android + neon. for android builds with cpu-features enabled the
*_neon.c files will still need to be flagged correctly (with e.g.,
.c.neon in Android.mk) to properly build them.
BUG=webp:353
Change-Id: I905ce305af0a204e560b915d8665093a3edaceb9
2017-08-01 22:59:03 -07:00
skal
663a6d9d2e
unify the ALTERNATE_CODE flag usage
...
Pattern is now:
#if !defined(FLAG)
#define FLAG 0 // ALTERNATE_CODE
#endif
...
#if (FLAG == 1)
...
#else
...
#endif // FLAG
...
Removed some unused code / flags:
WEBP_YUV_USE_TABLE, WEBP_REFERENCE_IMPLEMENTATION,
experimental code, VP8YUVInit(), ...
BUG=webp:355
Change-Id: I98deb9189446a4cfd665c13ea8aa1ce6a308c63f
2017-08-01 20:49:29 -07:00
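A concrete instance of the flag pattern from the entry above, as a sketch. The flag name SKETCH_USE_TABLE and both function bodies are made up for illustration; only the preprocessor structure follows the pattern.

```c
// Hypothetical flag; it defaults to 0 unless set on the command line,
// e.g. with -DSKETCH_USE_TABLE=1.
#if !defined(SKETCH_USE_TABLE)
#define SKETCH_USE_TABLE 0   // ALTERNATE_CODE
#endif

#if (SKETCH_USE_TABLE == 1)
// Alternate path: table lookup.
static const int kDoubleTable[4] = { 0, 2, 4, 6 };
static int Double(int v) { return kDoubleTable[v & 3]; }
#else
// Default path: direct computation.
static int Double(int v) { return 2 * (v & 3); }
#endif  // SKETCH_USE_TABLE
```

Because the flag always ends up defined as 0 or 1, `#if (FLAG == 1)` never evaluates an undefined macro, which keeps -Wundef quiet.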
skal
73ea9f2702
yuv: harmonize suffix naming
...
BUG=webp:355
Change-Id: I403c4b3cdfc55b3b1648f98a1d189326a3e660a3
2017-08-01 20:40:00 -07:00
skal
c4568b47fd
Rescaler: harmonize the suffix naming
...
BUG=webp:355
Change-Id: I7720502c62f96c780793d3d881eac7b3afae1418
2017-08-01 23:49:44 +00:00
Pascal Massimino
6cb13b0532
Merge "alpha_processing: harmonize the naming suffixes to be _C()"
2017-08-01 03:38:03 +00:00
James Zern
83a3e69a20
Merge "simplify WEBP_EXTERN macro"
2017-08-01 03:29:12 +00:00
Pascal Massimino
7295fde2e6
Merge "filters: harmonize the suffixes naming to _SSE2(), _C(), etc."
2017-08-01 01:55:48 +00:00
James Zern
8e42ba4c80
simplify WEBP_EXTERN macro
...
including the type in the macro brings little benefit for ordering:
current platforms work with a prefix, and it would still be insufficient
if the attribute needed to follow the function prototype. this form makes
it easier to override on the command line.
BUG=webp:355
Change-Id: Iba41ec0bb319403054be0e899c4cc472dd932fd9
2017-07-31 18:27:52 -07:00
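The difference between the two macro forms in the entry above can be sketched as follows. SKETCH_EXTERN and SketchVersion are hypothetical names, and the visibility attribute is only one plausible expansion on GCC-compatible compilers.

```c
// Hypothetical macro name. Because the macro no longer wraps the return type,
// it can be replaced from the command line, e.g. -DSKETCH_EXTERN=extern.
#ifndef SKETCH_EXTERN
#if defined(__GNUC__) && __GNUC__ >= 4
#define SKETCH_EXTERN extern __attribute__((visibility("default")))
#else
#define SKETCH_EXTERN extern
#endif
#endif

// Old style (type inside the macro):  SKETCH_EXTERN(int) SketchVersion(void);
// New style (macro as a plain prefix):
SKETCH_EXTERN int SketchVersion(void);

int SketchVersion(void) { return 1; }
```

A function-like macro cannot be redefined with a simple `-DNAME=value`, which is presumably the command-line convenience the commit refers to.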
skal
331ab34bcd
cost*.c: harmonize the suffix namings
...
BUG=webp:355
Change-Id: Ic2e60eaab71cdffe1ebf93fc36aaa3eb25bbf08d
2017-07-31 17:18:32 -07:00
skal
b161f670f8
filters: harmonize the suffixes naming to _SSE2(), _C(), etc.
...
BUG=webp:355
Change-Id: I28f464eb13444c3046332cdda3c547f81700ecf4
2017-08-01 00:09:05 +00:00
skal
dec5e4d330
alpha_processing: harmonize the naming suffixes to be _C()
...
BUG=webp:355
Change-Id: Iae8221cd34957764ead21aa46abfc320e5514a4b
2017-07-31 23:34:24 +00:00
James Zern
92982609bc
dsp.h: fix -Wundef w/__mips_dsp_rev
...
Change-Id: I552a543c7b039774041b43ace75b0cbea566b119
2017-07-11 16:12:32 -07:00
James Zern
4ea49f6b82
rescaler_sse2.c: fix WEBP_RESCALER_FIX -> _RFIX typo
...
quiets -Wundef
Change-Id: I8f1facf401b6f1ab393005c93086ac3e2ae354d5
2017-07-11 15:35:27 -07:00
James Zern
b34a9db1a1
cosmetics,dec_sse2: remove some redundant comments
...
Change-Id: I5a59d6dde9b6638b318f36d51d0d53870a3de273
2017-07-06 23:19:18 -07:00
Vincent Rabaud
8acb4942f7
Remove the argb* files.
...
Half of the functionality was duplicated.
The rest is about the alpha channel handling so we
might as well put it in the appropriate file.
Change-Id: I8d5ef0afce82cc4842ab7132fd97995c42e6140a
2017-06-25 14:44:33 +02:00
Vincent Rabaud
7ca0df1363
Have the SSE2 version of PackARGB use common code.
...
The common code actually got sped-up by 25% by using the code
from PackARGB.
Change-Id: I94be6ccff2bfe02fff13c8e2698669e6a0d8fc74
2017-06-20 17:41:14 +02:00
Vincent Rabaud
8f6df1d0b9
Unroll Predictors 10, 11 and 12.
...
We see the following speed-ups:
10 -> 13%
11 -> 13%
12 -> 13%
Change-Id: I4734fd388d0f4e508884d0b123976bf2cbe69d2f
2017-06-08 20:37:47 +02:00
Vincent Rabaud
e4eb458741
lossless, VP8LTransformColor_C: make sure no overflow happens with colors.
...
Change-Id: Iec0d07cf1188ba96391cdb1b62131fc1469dfac6
2017-05-24 11:34:40 +02:00
Pascal Massimino
faf42213f4
NEON: implement ConvertRGB24ToY/BGR24/ARGB/RGBA32ToUV/ARGBToUV
...
Change-Id: Ie68aaed36d17f56d998c1b284514860cf5d28b8a
2017-05-09 15:57:20 +02:00
Pascal Massimino
f768218966
yuv: rationalize the C/SSE2 function naming
...
+ implement some easy missing targets in SSE2 (565/4444)
Change-Id: Ib575f7ada2a0ed7309cddd238f8bfc0e8999f145
2017-04-21 13:52:25 +02:00
Pascal Massimino
52245424b0
NEON implementation of some Sharp-YUV420 functions
...
Change-Id: I449ef9c76b06f971f6e2ad7f9db96bf906d8fe1f
new-file: dsp/yuv_neon.c
2017-04-18 19:22:37 +02:00
Pascal Massimino
28c37ebd5a
VP8LEnc: remove use of BitsLog2Ceiling()
...
was only used once. Better fall back for Log2Floor.
Change-Id: Ibcc26505440971bffe62ba6aca3d179ca85791d4
2017-03-20 02:58:16 -07:00
James Zern
80a2218668
ssim.c: remove dead include
...
Change-Id: Ia4be534b3b95d5d9f712ff53e530c98b942df860
2017-02-21 20:17:19 -08:00
Pascal Massimino
693bf74ec0
move the SSIM calculation code in ssim.c / ssim_sse2.c
...
Change-Id: I63a63fa7f44f257f2e17e45358b206c23069c448
2017-02-21 12:53:35 +01:00
Pascal Massimino
4105d565d3
disable WEBP_USE_XXX optimisations when EMSCRIPTEN is defined
...
Currently, none are available. If WEBP_HAVE_SSE2 eventually works,
we'll have to refine these conditionals.
BUG=webp:261
Change-Id: Ibc63ee1c013f2a4169eeb85cc8b6317b6420c2ad
2017-02-08 15:44:20 +00:00
Parag Salasakar
aa893914fc
Add clang build fix for MSA
...
Change-Id: If139f4ecbdce756c69ba4ae032a70f81179683f8
2017-02-01 17:45:17 +05:30
Pascal Massimino
4f3e3bbd44
disable GradientUnfilter_NEON
...
Compiled with Xcode, it appears quite a bit slower than the C-version,
especially for arm64.
Change-Id: Ic46dba184a36be454fef674129d2f909003788fc
2017-01-25 16:33:26 -08:00
Pascal Massimino
79bf46f120
rename the pretentious SmartYUV into SharpYUV
...
Change-Id: Ifeeb9cb85896c5f3ba0cc1c2c821f8d00295f69e
2017-01-20 14:36:21 +01:00
James Zern
668e1dd44f
src/{dec,enc,utils}: give filenames a unique suffix
...
this avoids duplicates between these trees and dsp/, e.g., enc/tree.c,
dec/tree.c, making pulling the whole library source tree into one target
possible
BUG=webp:279
Change-Id: I060a614833c7c24ddd37bf641702ae6a5eef1775
2017-01-19 19:09:48 -08:00
Pascal Massimino
71c53f1aeb
NEON: speed-up strong filtering
...
The sub-expression trick removes two constants and
two vmlal_s8 instructions.
Change-Id: I200022573b4880871b528b13a11a8f3d95def113
2017-01-19 20:46:48 +00:00
Pascal Massimino
749a45a520
Merge "NEON: implement alpha-filters (horizontal/vertical/gradient)"
2017-01-17 15:13:08 +00:00
Pascal Massimino
74c053b57d
Merge "NEON: fix overflow in SSE NxN calculation"
2017-01-17 15:10:54 +00:00
Pascal Massimino
1de931c669
NEON: implement alpha-filters (horizontal/vertical/gradient)
...
gradient-filter code is not much faster, but maybe improvable in the future.
Change-Id: Ia16070e409fe8703b02276166f19526917df6b35
2017-01-17 15:44:46 +01:00
Pascal Massimino
9b3aca404d
NEON: fix overflow in SSE NxN calculation
...
vmlal_u8() is prone to overflow during the accumulation.
There was a mismatch, happening mostly at low q, because in this
case the distortion is large and the accumulated sum was
larger than 16-bit unsigned.
Change-Id: I1a08a2f744bcdf0b26647e61b9ee92a0c2e28fe8
2017-01-17 11:47:36 +01:00
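The overflow described in the entry above can be reproduced in scalar C. The sketch below models a single 16-bit accumulator lane next to a 32-bit one; it is an illustration of the failure mode, not the NEON code itself, and the function names are hypothetical.

```c
#include <stdint.h>

// Sum of squared differences with a 16-bit accumulator, mimicking one lane of
// a widening multiply-accumulate whose running sum never leaves 16 bits.
static uint32_t SumSquares16(const uint8_t* a, const uint8_t* b, int n) {
  uint16_t acc = 0;
  int i;
  for (i = 0; i < n; ++i) {
    const int d = a[i] - b[i];
    acc = (uint16_t)(acc + d * d);   // wraps modulo 65536
  }
  return acc;
}

// Same sum with a 32-bit accumulator, which cannot wrap for small blocks.
static uint32_t SumSquares32(const uint8_t* a, const uint8_t* b, int n) {
  uint32_t acc = 0;
  int i;
  for (i = 0; i < n; ++i) {
    const int d = a[i] - b[i];
    acc += (uint32_t)(d * d);
  }
  return acc;
}
```

A single squared difference can reach 255 * 255 = 65025, so two large terms are already enough to exceed 65535; that matches the observation that the mismatch shows up when distortion is high.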
Pascal Massimino
1c07a3c639
dsp: WebPExtractGreen function for alpha decompression
...
+ NEON implementation
Change-Id: I67204f99d6e4c5974718bdf21dad30381978f72c
2017-01-17 09:33:25 +00:00
Pascal Massimino
8fda56126e
Merge "add a kSlowSSSE3 feature for CPUInfo"
2017-01-13 07:01:48 +00:00
Pascal Massimino
86bbd24552
add a kSlowSSSE3 feature for CPUInfo
...
This is meant to be used for run-time detection of slow platforms
regarding instructions like pshufb and bsr.
Adapted from libvpx patch: https://chromium-review.googlesource.com/#/c/367731
Change-Id: I2c22fbb9aae699d87a041393ba1ad5f1f21ff640
2017-01-13 06:19:27 +00:00
Vincent Rabaud
7c2779e95a
Get code to fully compile in C++.
...
Change-Id: I6d8490c8c9b955d90dcc89ee8a9cf29ca0f93b08
2017-01-12 18:03:55 +01:00
Vincent Rabaud
250c358662
Merge "When compiling as C++, avoid narrowing warnings."
2017-01-12 13:00:56 +00:00
Vincent Rabaud
c0648ac2ae
When compiling as C++, avoid narrowing warnings.
...
The gcc compilation warning was: narrowing conversion from ‘int’ to ‘int8_t’
Change-Id: I4803dd60ad04060cdb5d61a1aa98b25215b9d4eb
2017-01-12 13:39:22 +01:00
Pascal Massimino
0d55f60c91
40% faster ApplyAlphaMultiply_SSE2
...
process four pixels at a time
Change-Id: I1dee7f70772be4915654fc6638ef4729a1a239d4
2017-01-12 02:33:09 -08:00
Pascal Massimino
49d0280df1
NEON: implement several alpha-processing functions
...
- ApplyAlphaMultiply
- DispatchAlpha
- DispatchAlphaToGreen
- ExtractAlpha
Decoding to Argb / rgbA / ... is 10-15% faster (measured on N4)
new file: alpha_processing_neon.c
Change-Id: I40f1a809e9885d1031ff0bc886d8d001efa66bca
2017-01-11 17:39:29 +01:00
Pascal Massimino
48b1e85fbe
SSE2: 15% faster alpha-processing functions
...
ApplyAlphaMultiply / MultARGBRow / MultRow
we use now: x/255 = (x * 0x8081) >> (16 + 7)
and x/255 + .5 = ((x + 128) * 0x0101) >> 16
Change-Id: I8931091316ffc8bbf65aa3402f2e7d2b800e1971
2017-01-11 15:35:16 +01:00
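The two multiply-based formulae from the entry above can be checked exhaustively in scalar C. This sketch assumes x is the product of two 8-bit values (0 to 255 * 255) and uses 32-bit unsigned arithmetic; the function names are hypothetical.

```c
#include <stdint.h>

// Truncating x / 255 via multiply and shift.
static uint32_t Div255Mul(uint32_t x) {
  return (x * 0x8081u) >> (16 + 7);
}

// Rounded x / 255, i.e. floor(x / 255 + .5).
static uint32_t Div255RoundMul(uint32_t x) {
  return ((x + 128u) * 0x0101u) >> 16;
}

// Returns 1 if both formulae agree with plain integer division for every
// product of two 8-bit values, 0 otherwise.
static int CheckDiv255Mul(void) {
  uint32_t x;
  for (x = 0; x <= 255u * 255u; ++x) {
    if (Div255Mul(x) != x / 255u) return 0;
    if (Div255RoundMul(x) != (x + 127u) / 255u) return 0;
  }
  return 1;
}
```

The rounded reference uses (x + 127) / 255 because 255 is odd, so x / 255 never lands exactly on a .5 tie.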
Pascal Massimino
28fe054e73
SSE2: 30% faster ApplyAlphaMultiply()
...
and 15% faster MultARGBRow()
by switching to formulae:
X / 255 = (X + 1 + (X >> 8)) >> 8 for any 16bit value X.
(X / 255 + .5) = (XX + (XX >> 8)) >> 8, with XX = X + 128
Change-Id: Ia4a7408aee74d7f61b58f5dff304d05546c04e81
2017-01-10 23:34:22 +01:00
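The shift-only formulae from the entry above can be verified the same way. The sketch restricts X to products of two 8-bit values, which is the range alpha multiplication produces; the function names are hypothetical.

```c
#include <stdint.h>

// Truncating X / 255 using only adds and shifts.
static uint32_t Div255Shift(uint32_t x) {
  return (x + 1u + (x >> 8)) >> 8;
}

// Rounded X / 255, i.e. floor(X / 255 + .5), with XX = X + 128.
static uint32_t Div255RoundShift(uint32_t x) {
  const uint32_t xx = x + 128u;
  return (xx + (xx >> 8)) >> 8;
}

// Returns 1 if both formulae match plain integer division for every product
// of two 8-bit values, 0 otherwise.
static int CheckDiv255Shift(void) {
  uint32_t x;
  for (x = 0; x <= 255u * 255u; ++x) {
    if (Div255Shift(x) != x / 255u) return 0;
    if (Div255RoundShift(x) != (x + 127u) / 255u) return 0;
  }
  return 1;
}
```

Both forms avoid a multiplier entirely, which suits SIMD lanes that only have 16 bits of headroom.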
Pascal Massimino
be0ef6395f
fix a comment typo
...
Change-Id: I0fabd08cd8abd3cea7ddfd2e498507adb0d3c67e
2017-01-10 21:17:13 +01:00
Pascal Massimino
00b08c88c0
Merge "NEON: 5% faster conversion to RGB565 and RGBA4444"
2016-12-22 08:39:01 +00:00
Pascal Massimino
0e7f444702
Merge "NEON: faster fancy upsampling"
2016-12-21 14:53:24 +00:00
Pascal Massimino
b016cb91c5
NEON: faster fancy upsampling
...
2-3% faster decoding overall
Change-Id: I2c53e50dc7e0ade5245cff8cc5d7b96a14062955
2016-12-21 15:23:54 +01:00
Vincent Rabaud
1cb638010c
Call the C function to finish off lossless SSE loops only when necessary.
...
Change-Id: I4e221d80879dc9c90c24d69a40bc5811d73787ad
2016-12-21 14:25:54 +01:00
Vincent Rabaud
875fafc191
Implement BundleColorMap in SSE2.
...
Change-Id: I44cd23647bd0a49330b6b2b3ed08050a5500e58e
2016-12-21 10:44:31 +01:00
Pascal Massimino
341d711c43
NEON: 5% faster conversion to RGB565 and RGBA4444
...
We use the magic 'shift and insert' instruction instead of
the multiple shifts and or's.
Change-Id: I48df0320668b502a91792defc0423a9441669d19
2016-12-20 17:01:48 +01:00
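A scalar model of the 'shift and insert' idea from the entry above, applied to RGB565 packing. Sri8 imitates an 8-bit shift-right-and-insert; the function names, the shift amounts and the output bit layout are assumptions for illustration and may not match the NEON code.

```c
#include <stdint.h>

// Scalar model of an 8-bit 'shift right and insert' by n (0 < n < 8): keeps
// the top n bits of 'dst' and fills the low 8 - n bits with 'src >> n'.
static uint8_t Sri8(uint8_t dst, uint8_t src, int n) {
  const uint8_t keep = (uint8_t)(0xffu << (8 - n));
  return (uint8_t)((dst & keep) | (src >> n));
}

// Packs 8-bit R, G, B into RGB565 with two shift-and-insert steps.
static uint16_t PackRGB565Sri(uint8_t r, uint8_t g, uint8_t b) {
  const uint8_t rg = Sri8(r, g, 5);                  // rrrrrggg
  const uint8_t gb = Sri8((uint8_t)(g << 3), b, 3);  // gggbbbbb
  return (uint16_t)((rg << 8) | gb);
}

// Reference using explicit masks, shifts and ors.
static uint16_t PackRGB565Ref(uint8_t r, uint8_t g, uint8_t b) {
  return (uint16_t)(((r & 0xf8) << 8) | ((g & 0xfc) << 3) | (b >> 3));
}
```

Each insert step replaces a mask, a shift and an or, which is where the saving over the multiple-shifts-and-ors version would come from.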
Pascal Massimino
a4bbe4b38b
fix indentation
...
Change-Id: I5593fb2441f253c6b8cc43949c11909f19184b55
2016-12-13 22:50:29 -08:00
Pascal Massimino
58fc507842
Merge "PredictorSub: implement fully-SSE2 version"
2016-12-13 11:03:13 +00:00
Pascal Massimino
9cc421675b
PredictorSub: implement fully-SSE2 version
...
and inline the C-version too.
Predictor #13 is still a hard one.
Change-Id: Iedecfb5cbf216da4e28ccfdd0810286133f42331
2016-12-13 02:19:35 -08:00
James Zern
2423017a28
dsp/lossless.c,cosmetics: fix indent
...
after:
fbba5bc
optimize predictor #1 in plain-C: For some reason, gcc has a hard
time inlining this one...
Change-Id: I2e2416593acd4c9d14958d8757bfd284d999100b
2016-12-12 12:53:23 -08:00
Pascal Massimino
fbba5bc2c1
optimize predictor #1 in plain-C
...
For some reason, gcc has a hard time inlining this one...
Also optimize predictor #0 and #1 for encoding, so we don't have to
call the generic pointers VP8LPredictors[...]
Change-Id: I1ff31e3b83874b53f84fe23487f644619fd61db9
2016-12-12 17:41:36 +01:00
Pascal Massimino
9ae0b3f65a
Merge "SSE2: slightly (~2%) faster Predictor #1 "
2016-12-12 14:46:21 +00:00
Pascal Massimino
c1f97bd758
SSE2: slightly (~2%) faster Predictor #1
...
by removing a load from memory
Change-Id: If6c4aa7fb99309d09f943393ec772891449971f0
2016-12-12 02:24:38 -08:00
Pascal Massimino
ea664b8995
SSE2: 10% faster Predictor #11
...
Change-Id: I14ae5f6603071b86dfdbe8e6f7dfdbe5d8510185
2016-12-12 02:20:41 -08:00
Pascal Massimino
b3fb8bb602
slightly faster Predictor #11 in NEON
...
(+some slight modifications on Predictor #12 )
Change-Id: Ic2132dcd83d961cd069fa01ca1670e35e35274e2
2016-12-08 07:32:51 -08:00