1488 Commits

Author SHA1 Message Date
Frank Barchard
d71cda1bb0 Rollback util cpuid hybrid detect due to android build errors
Bug: 438241552
Change-Id: Ie56aa7296e796e44e63d0dd913120b897b12cc9b
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6843504
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2025-08-12 14:13:24 -07:00
Frank Barchard
cdd3bae848 TestI400LargeSize fix for warning message build error
- change %ld to %zd for size_t printf warnings
- disable TestI400LargeSize when disabling SLOW_TESTS
- disable cpuid tests that read proc/cpuinfo test data files
- add ifdef around timers to allow hexagon build
- remove faulty hybrid detect
- remove old mips LIBYUV_DISABLE_DSPR2 reference in gyp build
- apply clang-format

Bug: 434382656
Change-Id: Id74812e6ef29d4a8d0ff967a9189d249b80816d4
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6812825
Reviewed-by: Jeremy Leconte <jleconte@google.com>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2025-08-01 12:03:11 -07:00
Frank Barchard
3ff31b2a5f Make LibYUVConvertTest.TestI400LargeSize skip test on low end arm cpu
- detect lack of dot product instruction to infer the cpu is low end
- only run the test on higher end arm

Bug: 416842099
Change-Id: Idd2dd16a624bbba280cf531644440024b12f7ecf
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6804632
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
2025-07-31 02:41:17 -07:00
Frank Barchard
6f729fbe65 ARGBToUV SSE use average of 4 pixels
- Was using avgb twice for non-exact and C for exact.

On Skylake Xeon:

Now SSE3
ARGBToJ420_Opt (326 ms)

Was
Exact C
ARGBToJ420_Opt (871 ms)
Not exact AVX2
ARGBToJ420_Opt (237 ms)
Not exact SSSE3
ARGBToJ420_Opt (312 ms)

Bug: 381138208
Change-Id: I6d1081bb52e36f06736c0c6575fa82bb2268629b
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6629605
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Ben Weiss <bweiss@google.com>
2025-06-17 11:55:27 -07:00
Frank Barchard
889613683a Add hybrid detect for Intel laptop cpus
- Add +i8mm build option for sve ARGBToUV which uses usdot
- util/cpuid Get cpu count (windows, macos, linux)
- For each x86 cpu, detect hybrid (e-core)
- Includes a comment fix for ubsan unittest
- Bump version
- Apply clang format to util/*.c as well as all *.cc/*.h

Bug: 424637372
Change-Id: I08310e18051fff62c9e4e4a10d1e4361871119ac
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6635640
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2025-06-13 13:22:54 -07:00
Frank Barchard
4ac0a3ae3d ubsan compliant '_any' functions using ptrdiff_t for pointer math
Bug: 416842099
Change-Id: I1e3c7bc1b363c11baeb3b529ee78e5ac8878c359
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6634217
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2025-06-10 15:01:52 -07:00
Frank Barchard
0853c9353f ARGBToUV 64 bit use ymm8 for shuffler
Bug: 381138208
Change-Id: I5e69bc1610bd6269bf9a4113e729cf307dd36f60
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6536833
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2025-05-12 15:09:40 -07:00
Frank Barchard
9f9b5cf660 ARGBToUV allow 32 bit x86 build
- make width loop count on stack
- set YMM constants in its own asm block
- make struct for shuffle and add constants
- disable clang format on row_neon.cc function

Bug: 413781394
Change-Id: I263f6862cb7589dc31ac65d118f7ebeb65dbb24a
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6495259
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2025-04-28 12:11:00 -07:00
Frank Barchard
23d416d6f3 Detect SME without SVE dependency
Bug: None
Change-Id: Ibe29488e893a493699ea3fae1a1a54a4fff5969c
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6418571
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2025-03-31 17:27:40 -07:00
Frank Barchard
5f284054cb RVV disable 64 bit elements and vcombine_v
Bug: 405451074
Change-Id: I8e4437be92934b3c367c94d867d7967c32747260
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6385788
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2025-03-25 12:51:25 -07:00
Jordan
0fd4581a51 Updating license id for libyuv
Bug: b/358504615

Change-Id: I93fecd22c16df8949a8ebe85aabe539c0231985e
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6275535
Commit-Queue: Wan-Teh Chang <wtc@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2025-03-18 16:37:39 -07:00
Frank Barchard
c060118bea ARGBToJ444 use 256 for fixed point scale UV
- use negative coefficients for UV to allow -128
- change shift to truncate instead of round for UV
- adapt all row_gcc RGB to UV into matrix functions
- add -DLIBYUV_ENABLE_ROWWIN to allow clang on Windows to use row_win.cc

Bug: 381138208
Change-Id: I6016062c859faf147a8a2cdea6c09976cbf2963c
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6277710
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Reviewed-by: James Zern <jzern@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2025-02-27 13:04:15 -08:00
Frank Barchard
61354d2671 ARGBToUV Matrix for AVX2 and SSSE3
- Round before shifting to 8 bit to match NEON
  - RAWToARGB use unaligned loads and port to AVX2

Was C/SSSE/AVX2
ARGBToI444_Opt (343 ms)
ARGBToJ444_Opt (677 ms)
RAWToI444_Opt (405 ms)
RAWToJ444_Opt (803 ms)

Now AVX2
ARGBToI444_Opt (283 ms)
ARGBToJ444_Opt (284 ms)
RAWToI444_Opt (316 ms)
RAWToJ444_Opt (339 ms)

Profile Now AVX2
  38.31%  ARGBToUVJ444Row_AVX2
  32.31%  RAWToARGBRow_AVX2
  23.99%  ARGBToYJRow_AVX2

Profile Was C/SSSE/AVX2
    73.15%  ARGBToUVJ444Row_C
    15.74%  RAWToARGBRow_SSSE3
     8.87%  ARGBToYJRow_AVX2

Bug: 381138208
Change-Id: I696b2d83435bc985aa38df831e01ff1a658da56e
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6231592
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Reviewed-by: Ben Weiss <bweiss@google.com>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2025-02-10 18:36:18 -08:00
Frank Barchard
d32d19ccf2 UV subsample on ARM use rounding average of 4 pixels
Performance on Samsung S22 Exynos (SVE2+I8MM+DOTPROD+Neon)
AArch64
ARGBToI400_Opt (168 ms)
ARGBToJ400_Opt (103 ms)
ABGRToJ400_Opt (81 ms)
RGBAToJ400_Opt (82 ms)
RGB24ToJ400_Opt (176 ms)
RAWToJ400_Opt (176 ms)
ABGRToI420_Opt (258 ms)
ARGBToI420_Opt (259 ms)
ARGBToI422_Opt (403 ms)
ARGBToI444_Opt (213 ms)
ARGBToJ420_Opt (257 ms)
ARGBToJ422_Opt (403 ms)
ARGBToJ444_Opt (214 ms)
ABGRToJ420_Opt (255 ms)
ABGRToJ422_Opt (399 ms)
ARGB4444ToI420_Opt (285 ms)
RGB565ToI420_Opt (316 ms)
ARGB1555ToI420_Opt (324 ms)
BGRAToI420_Opt (260 ms)
RAWToI420_Opt (303 ms)
RAWToI444_Opt (303 ms)
RAWToJ420_Opt (335 ms)
RAWToJ444_Opt (308 ms)
RGB24ToI420_Opt (372 ms)
RGB24ToJ420_Opt (365 ms)
RGBAToI420_Opt (259 ms)

AArch32 (Neon)
ARGBToI400_Opt (496 ms)
ARGBToJ400_Opt (478 ms)
ABGRToJ400_Opt (483 ms)
RGBAToJ400_Opt (493 ms)
RGB24ToJ400_Opt (343 ms)
RAWToJ400_Opt (341 ms)
ABGRToI420_Opt (993 ms)
ARGBToI420_Opt (992 ms)
ARGBToI422_Opt (1503 ms)
ARGBToI444_Opt (1257 ms)
ARGBToJ420_Opt (1006 ms)
ARGBToJ422_Opt (1521 ms)
ARGBToJ444_Opt (1267 ms)
ABGRToJ420_Opt (1002 ms)
ABGRToJ422_Opt (1504 ms)
ARGB4444ToI420_Opt (1180 ms)
RGB565ToI420_Opt (1112 ms)
ARGB1555ToI420_Opt (1115 ms)
BGRAToI420_Opt (993 ms)
RAWToI420_Opt (703 ms)
RAWToI444_Opt (1717 ms)
RAWToJ420_Opt (704 ms)
RAWToJ444_Opt (1739 ms)
RGB24ToI420_Opt (703 ms)
RGB24ToJ420_Opt (703 ms)
RGBAToI420_Opt (993 ms)

Bug: 381138208
Change-Id: I33728d5237f357362b0bfc509a9ebe6fe46f45d4
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6228987
Reviewed-by: Ben Weiss <bweiss@google.com>
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2025-02-04 15:19:19 -08:00
Frank Barchard
5a9a6ea936 Add RAWToI444
Skylake Xeon
  RAWToI444_Opt (433 ms)
  RAWToJ444_Opt (1781 ms)
  ARGBToI444_Opt (352 ms)
  ARGBToJ444_Opt (1577 ms)

Samsung S22 Exynos
  ARGBToI444_Opt (283 ms)
  ARGBToJ444_Opt (209 ms)
  RAWToI444_Opt (294 ms)
  RAWToJ444_Opt (293 ms)

Profiling on Samsung S22 Exynos
37.62%,  ARGBToUV444Row_NEON_I8MM
29.42%,  RAWToARGBRow_SVE2
19.61%,  ARGBToYRow_NEON_DotProd

Passing different --libyuv_cpu_info=N etc we can compare each ISA
C           1  RAWToI444_Opt (781 ms)
NEON      511  RAWToI444_Opt (757 ms)
NEONDOT  1023  RAWToI444_Opt (571 ms)
NEONI8MM 2047  RAWToI444_Opt (334 ms)
SVE2     8191  RAWToI444_Opt (307 ms)



Bug: 390247964
Change-Id: I0316fedd32222588455afa751f5b854f46bce024
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6223658
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2025-02-03 16:13:03 -08:00
Frank Barchard
c1bac9e6a5 RAWToJ444 and ARGBToJ444
- ARGBToJ444 implements ARGBToUVJ444Row_C
- RAWToJ444 implemented as 2 steps - RAWToARGB and ARGBToJ444

libyuv_test '--gunit_filter=*R*To?444_Opt' --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=1000 --libyuv_flags=-1 --libyuv_cpu_info=-1
(with bit exact off)

Samsung S23
RAWToJ444_Opt (437 ms)
ARGBToJ444_Opt (337 ms)
ARGBToI444_Opt (196 ms)

Skylake Xeon
RAWToJ444_Opt (1699 ms)
ARGBToJ444_Opt (1559 ms)
ARGBToI444_Opt (346 ms)

Bug: 390247964
Change-Id: Id1b1b45a5e4512ab50830aadf62f780fbe631575
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6207845
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2025-01-29 15:18:38 -08:00
Frank Barchard
6c2415bfab J420ToI420 AVX2
libyuv_test '--gunit_filter=*J420ToI420*' --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=1000 --libyuv_flags=-1 --libyuv_cpu_info=-1

Skylake Xeon
AVX2 J420ToI420_Opt (114 ms)
C    J420ToI420_Opt (596 ms)

Sapphire Rapids
AVX2 J420ToI420_Opt (126 ms)
C    J420ToI420_Opt (717 ms)

Samsung S23
NEON J420ToI420_Opt (46 ms)
C    J420ToI420_Opt (95 ms)

Bug: 381327032
Change-Id: I2b551507c2a8b1da4f04651b622fc9247a75050d
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6201239
Reviewed-by: Justin Green <greenjustin@google.com>
2025-01-27 11:23:44 -08:00
Frank Barchard
26277baf96 J420ToI420 using planar 8 bit scaling
- Add Convert8To8Plane which scale and add 8 bit values allowing full range
  YUV to be converted to limited range YUV

libyuv_test '--gunit_filter=*J420ToI420*' --gunit_also_run_disabled_tests --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=1000 --libyuv_flags=-1 --libyuv_cpu_info=-1

Samsung S23
J420ToI420_Opt (45 ms)
I420ToI420_Opt (37 ms)

Skylake
J420ToI420_Opt (596 ms)
I420ToI420_Opt (99 ms)

Bug: 381327032
Change-Id: I380c3fa783491f2e3727af28b0ea9ce16d2bb8a4
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6182631
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2025-01-22 02:50:24 -08:00
Frank Barchard
10592b60c0 Add required 'Security Critical' field to README.chromium
Marked as yes to match webrtc

Bug: b/365319755
Change-Id: I92ba17e3215b6290211519e2b671087ec1386270
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6170587
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Mirko Bonadei <mbonadei@chromium.org>
2025-01-12 03:45:13 -08:00
Frank Barchard
47ddac2996 Sub sampling conversions use CopyPlane for Y channel
- Replace ScalePlane with CopyPlane for Y channel
- Vertical mirroring is supported, but not horizontal mirroring.
- Check src_y is not null when dst_y is not null for all libyuv functions that allow a null dst_y.
- Apply clang-format
- Bump version to 1899

Bug: None
Change-Id: Id1805b52b8024ba95a7f1b098dabf45af48670eb
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6128599
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2025-01-02 13:34:11 -08:00
Frank Barchard
1c501a8f3f CpuId test FSMR - Fast Short Rep Movsb
- Renumber cpuid bits to use low byte to ID the type of CPU and upper 24 bits for features

Intel CPUs starting at Icelake support FSMR
adl:Has FSMR 0x8000
arl:Has FSMR 0x0
bdw:Has FSMR 0x0
clx:Has FSMR 0x0
cnl:Has FSMR 0x0
cpx:Has FSMR 0x0
emr:Has FSMR 0x8000
glm:Has FSMR 0x0
glp:Has FSMR 0x0
gnr:Has FSMR 0x8000
gnr256:Has FSMR 0x8000
hsw:Has FSMR 0x0
icl:Has FSMR 0x8000
icx:Has FSMR 0x8000
ivb:Has FSMR 0x0
knl:Has FSMR 0x0
knm:Has FSMR 0x0
lnl:Has FSMR 0x8000
mrm:Has FSMR 0x0
mtl:Has FSMR 0x8000
nhm:Has FSMR 0x0
pnr:Has FSMR 0x0
rpl:Has FSMR 0x8000
skl:Has FSMR 0x0
skx:Has FSMR 0x0
slm:Has FSMR 0x0
slt:Has FSMR 0x0
snb:Has FSMR 0x0
snr:Has FSMR 0x0
spr:Has FSMR 0x8000
srf:Has FSMR 0x0
tgl:Has FSMR 0x8000
tnt:Has FSMR 0x0
wsm:Has FSMR 0x0

Intel CPUs starting at Ivybridge support ERMS

adl:Has ERMS 0x4000
arl:Has ERMS 0x4000
bdw:Has ERMS 0x4000
clx:Has ERMS 0x4000
cnl:Has ERMS 0x4000
cpx:Has ERMS 0x4000
emr:Has ERMS 0x4000
glm:Has ERMS 0x4000
glp:Has ERMS 0x4000
gnr:Has ERMS 0x4000
gnr256:Has ERMS 0x4000
hsw:Has ERMS 0x4000
icl:Has ERMS 0x4000
icx:Has ERMS 0x4000
ivb:Has ERMS 0x4000
knl:Has ERMS 0x4000
knm:Has ERMS 0x4000
lnl:Has ERMS 0x4000
mrm:Has ERMS 0x0
mtl:Has ERMS 0x4000
nhm:Has ERMS 0x0
pnr:Has ERMS 0x0
rpl:Has ERMS 0x4000
skl:Has ERMS 0x4000
skx:Has ERMS 0x4000
slm:Has ERMS 0x4000
slt:Has ERMS 0x0
snb:Has ERMS 0x0
snr:Has ERMS 0x4000
spr:Has ERMS 0x4000
srf:Has ERMS 0x4000
tgl:Has ERMS 0x4000
tnt:Has ERMS 0x4000
wsm:Has ERMS 0x0
Change-Id: I18e5a3905f2691ab66d4d0cb6f668c0a0ff72d37
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6027541
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2024-11-18 17:56:45 +00:00
Frank Barchard
ffd791f749 Check malloc allocation sizes are less than SIZE_MAX
Bug: b/371615496
Change-Id: I75a94b08469d6d6b6fd55a8659031cbcb3d48eed
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5912039
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2024-10-07 21:34:15 +00:00
Frank Barchard
4620f17058 ScalePlane crash fix for 3/4 scaling
- Scaling 48 pixels at a time, but calling code checked for 24 pixels
- Added test for scaling to 1080x1920
  libyuv_test --gunit_filter=LibYUVScaleTest.I420ScaleTo1080x1920_Box* --libyuv_width=1440 --libyuv_height=2560

Was
libyuv_test --gunit_filter=LibYUVScaleTest.I420ScaleTo1080x1920_Box* --libyuv_width=1440 --libyuv_height=2560
[ RUN      ] LibYUVScaleTest.I420ScaleTo1080x1920_Box
Segmentation fault
Traceback (most recent call last):

Now
[ RUN      ] LibYUVScaleTest.I420ScaleTo1080x1920_Box
filter 3 -     6741 us C -     3566 us OPT
[       OK ] LibYUVScaleTest.I420ScaleTo1080x1920_Box (43 ms)

Bug: b/366045177
Change-Id: I0ea6c2d6a32b2e7ca44cd030abc9f248115be44a
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5857554
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2024-09-13 01:20:39 +00:00
Frank Barchard
a97746349b Add test for I010ToNV12
- Add support for negative height to invert
- Fix off by 1 on odd width and height
- Bump version to 1895

Initial I010 is 2 step planar conversion

libyuv_test '--gunit_filter=*010ToNV12_Opt' --gunit_also_run_disabled_tests --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=1000 --libyuv_flags=-1 --libyuv_cpu_info=-1

Skylake Xeon
[       OK ] LibYUVConvertTest.I010ToNV12_Opt (2675 ms)
[       OK ] LibYUVConvertTest.P010ToNV12_Opt (1547 ms)
Pixel 7
[       OK ] LibYUVConvertTest.I010ToNV12_Opt (464 ms)
[       OK ] LibYUVConvertTest.P010ToNV12_Opt (125 ms)

Bug: b/357721018, b/357439226
Change-Id: I2ae59783cf328a6592d0ab80c374ae4dc281daf3
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5778595
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2024-08-12 18:57:56 +00:00
Chunbo Hua
e23bc72e8e Bump version number in order to expose new API
Bug: 357721018
Change-Id: I2c6e115cd049db2038631195305c5907764d5c7b
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5768078
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2024-08-07 22:10:05 +00:00
Frank Barchard
32ccd53bb3 Add P010ToNV12 to convert 10 bit biplanar to 8 bit biplanar
- P010 and NV12 have the same layout: Full size Y plane and half size UV plane.
  P010 and NV12 are 4:2:0 subsampling
- P010 uses upper 10 bits of 16 bit elements
- NV12 uses 8 bit elements
- The Convert16To8 used internally will discard the low 2 bits.
- UV order is the same - U first in memory, followed by V, interleaved
- UV plane is be rounded up in size to allow odd size Y to have UV values
- Similar code could be used to convert P210ToNV16, P410ToNV24, with the size
  of the UV plane affected by subsampling 4:2:2 and 4:4:4 variants.

Bug: b/357439226
Change-Id: I5d6ec84d97d0e0cc4008eeb18a929ea28570d6d9
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5761958
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2024-08-05 18:55:44 +00:00
Frank Barchard
4cd90347e7 Rotate use NULL for C compatability
Bug: b/353323977
Change-Id: I2472f23ce8fcc0bc09a292bd6fb758304c6c2b18
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5735714
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2024-07-23 18:02:47 +00:00
Frank Barchard
611806a155 [AArch64] Fix SVE/SME vector length printing in cpuid
A semicolon is treated as the start of a comment by some assemblers
causing the vector length to be reported incorrectly, so use a newline
instead.

- Add volatile asm in row_gcc and row_neon64

Bug: b/5631539
Change-Id: I6b0836fcdd9247ef7b9e8ceda01df3150519ecf8
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5666060
Reviewed-by: Justin Green <greenjustin@google.com>
2024-07-02 19:44:41 +00:00
Frank Barchard
fa16ddbb9f cpuid show vector length on ARM and RISCV
- additional asm volatile changes from github
- rotate mips remove C function - moved to common

Run on Samsung S22
[ RUN      ] LibYUVBaseTest.TestCpuHas
Kernel Version 5.10
Has Arm 0x2
Has Neon 0x4
Has Neon DotProd 0x10
Has Neon I8MM 0x20
Has SVE 0x40
Has SVE2 0x80
Has SME 0x0
SVE vector length: 16 bytes
[       OK ] LibYUVBaseTest.TestCpuHas (0 ms)
[ RUN      ] LibYUVBaseTest.TestCompilerMacros
__ATOMIC_RELAXED 0
__cplusplus 201703
__clang_major__ 17
__clang_minor__ 0
__GNUC__ 4
__GNUC_MINOR__ 2
__aarch64__ 1
__clang__ 1
__llvm__ 1
__pic__ 2
INT_TYPES_DEFINED
__has_feature

Run on RISCV qemu emulating SiFive X280:
[ RUN      ] LibYUVBaseTest.TestCpuHas
Kernel Version 6.6
Has RISCV 0x10000000
Has RVV 0x20000000
RVV vector length: 64 bytes
[       OK ] LibYUVBaseTest.TestCpuHas (4 ms)
[ RUN      ] LibYUVBaseTest.TestCompilerMacros
__ATOMIC_RELAXED 0
__cplusplus 202002
__clang_major__ 9999
__clang_minor__ 0
__GNUC__ 4
__GNUC_MINOR__ 2
__riscv 1
__riscv_vector 1
__riscv_v_intrinsic 12000
__riscv_zve64x 1000000
__clang__ 1
__llvm__ 1
__pic__ 2
INT_TYPES_DEFINED
__has_feature

Bug: b/42280943
Change-Id: I53cf0450be4965a28942e113e4c77295ace70999
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5672088
Reviewed-by: David Gao <davidgao@google.com>
2024-07-02 18:10:56 +00:00
Frank Barchard
616bee5420 Add volatile for gcc inline to avoid being removed
Bug: b/42280943
Change-Id: I4439077a92ffa6dff91d2d10accd5251b76f7544
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5671187
Reviewed-by: David Gao <davidgao@google.com>
2024-07-02 01:25:24 +00:00
Frank Barchard
9d660a0f3b Fix environment variable LIBYUV_CPU_INFO for unittests
- Also bump version number

Bug: libyuv:979
Change-Id: I2903f15f9b9f3cd1b556eba95b01c4c58d1733b7
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5466641
Reviewed-by: James Zern <jzern@google.com>
2024-04-20 17:41:56 +00:00
Frank Barchard
b66c42d4a8 Revert "AMX detect OS support for linux kernel"
This reverts commit 8c8a33762d64b916ae8469cc3fc602a64080a23a.

Reason for revert: breaks sandbox

Original change's description:
> AMX detect OS support for linux kernel
>
> Bug: b/327013106
> Change-Id: Ie1784249f3a121c52e6504ff502bdc3eb245d858
> Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5329907
> Commit-Queue: Frank Barchard <fbarchard@chromium.org>
> Reviewed-by: richard winterton <rrwinterton@gmail.com>

Bug: b/327013106
Change-Id: If54bb84bc1167177c1869763f6ccfdf1f92fbe09
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5332617
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2024-02-29 00:33:29 +00:00
Frank Barchard
8c8a33762d AMX detect OS support for linux kernel
Bug: b/327013106
Change-Id: Ie1784249f3a121c52e6504ff502bdc3eb245d858
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5329907
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2024-02-28 03:13:44 +00:00
Frank Barchard
a6a2ec654b Add AMXINT8 cpu detect
sde -spr -- libyuv_test -- --gunit_filter=*Cpu*
Note: Google Test filter = *Cpu*
[==========] Running 4 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 3 tests from LibYUVBaseTest
[ RUN      ] LibYUVBaseTest.TestCpuHas
Cpu Flags 0x57fff9
Has X86 0x8
Has SSE2 0x10
Has SSSE3 0x20
Has SSE41 0x40
Has SSE42 0x80
Has AVX 0x100
Has AVX2 0x200
Has ERMS 0x400
Has FMA3 0x800
Has F16C 0x1000
Has AVX512BW 0x2000
Has AVX512VL 0x4000
Has AVX512VNNI 0x8000
Has AVX512VBMI 0x10000
Has AVX512VBMI2 0x20000
Has AVX512VBITALG 0x40000
Has AVX10 0x0
HAS AVXVNNI 0x100000
Has AVXVNNIINT8 0x0
Has AMXINT8 0x400000
[       OK ] LibYUVBaseTest.TestCpuHas (34 ms)

Bug: b/324356616
Change-Id: I5129b8946363a501bdd570e6dba3936c54aacd6c
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5283433
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2024-02-15 21:44:47 +00:00
Frank Barchard
3e435fe6d4 YUY2ToARGB use ymm6/7 for shuffle constants
- 1 load and 2 shuffles from registers replaces 2 loads and 2 memory shuffles
- vbroadcastf128 16 byte shuffler replaces 32 byte shufflers
- bump version and apply clang-format

libyuv_test '--gunit_filter=*.???2ToARGB_Opt' --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=1000 --libyuv_flags=-1 --libyuv_cpu_info=-1

AMD Zen2
I422ToARGB_Opt (272 ms)
NV12ToARGB_Opt (255 ms)
YUY2ToARGB_Opt (208 ms)

Was
YUY2ToARGB_Opt (214 ms)

Change-Id: I1fa4d462d04536c877d1cab1a14586be8ed1b2f2
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5218447
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2024-01-22 21:47:23 +00:00
Frank Barchard
914624f0b8 YUY2ToARGBMatrix and UYVYToARGBMatrix added to allow any color matrix
Bug: libyuv:971
Change-Id: If15d4598d75500a3717f07d02c0c295fdc58254e
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5214453
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2024-01-19 21:21:37 +00:00
Frank Barchard
5625f42424 I444ToI420 and I422ToI420 check U and V pointers and return -1 if NULL.
- Add detect linux kernel version number in util/cpuid

adbrun -- blaze-bin/third_party/libyuv/cpuid
Kernel Version 4.14
Cpu Flags 0x7
Has ARM 0x2

Bug: libyuv:970
Change-Id: I655ed598db3655ca8448be08f1d71fbc328ced66
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5207990
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2024-01-18 21:56:11 +00:00
Frank Barchard
af6ac8265b AVX10 cpuid detect added
Replace unused popcount feature bit

Bug: libyuv:911
Change-Id: Icd88fcc732751d39b0950d5f09a58bc9ac2c4e30
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5179911
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2024-01-10 00:08:22 +00:00
Frank Barchard
6dc03dacbf Split scale_test and scale_plane_test to allow building on small devices
Bug: libyuv:956
Change-Id: I1903aa616243e891440ed92836dfb0992d31d4cd
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5107257
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2023-12-09 18:39:41 +00:00
Frank Barchard
9e61d7f9c1 Split convert_test and convert_argb_test to allow building on small systems that run out of memory compiling unittests.
Update build files to include the new tests and source code.

Bug: libyuv:956
Change-Id: I6ec0beb6dc9570f0597d7df1835d616489dbaece
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5103585
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-12-08 13:39:56 +00:00
Wan-Teh Chang
fb6341d326 Change ScalePlane,ScalePlane_16,... to return int
Change ScalePlane(), ScalePlane_16(), and ScalePlane_12() to return int
so that they can report memory allocation failures (by returning 1).

BUG=libyuv:968

Change-Id: Ie5c183ee42e3d595302671f9ecb7b3472dc8fdb5
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5005031
Commit-Queue: Wan-Teh Chang <wtc@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2023-11-03 23:53:24 +00:00
Frank Barchard
31e1d6f896 Check allocations that return NULL and return early
BUG=libyuv:968

Change-Id: I9e8594440a6035958511f9c50072820131331fc8
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4977552
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-10-27 17:41:36 +00:00
Frank Barchard
331c361581 AVX-VNNI detect
- Add kCpuHasAVXVNNI flag
- Remove deprecated GFNI detect to make space.

Meteor Lake has AVX-VNNI but not AVX512
~/intelsde/sde -mtl -- blaze-bin/third_party/libyuv/libyuv_test --gunit_filter=*CpuHas
doyuv3

Note: Google Test filter = *CpuHas
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from LibYUVBaseTest
[ RUN      ] LibYUVBaseTest.TestCpuHas
Cpu Flags 0x203ff1
Has X86 0x10
Has SSE2 0x20
Has SSSE3 0x40
Has SSE41 0x80
Has SSE42 0x100
Has AVX 0x200
Has AVX2 0x400
Has ERMS 0x800
Has FMA3 0x1000
Has F16C 0x2000
Has AVX512BW 0x0
Has AVX512VL 0x0
Has AVX512VNNI 0x0
Has AVX512VBMI 0x0
Has AVX512VBMI2 0x0
Has AVX512VBITALG 0x0
Has AVX512VPOPCNTDQ 0x0
HAS AVXVNNI 0x200000
Has AVXVNNIINT8 0x0


AVX-VNNI detect

- Add kCpuHasAVXVNNI flag
- Remove deprecated GFNI detect to make space.

https://bugs.chromium.org/p/libyuv/issues/detail?id=967

Meteor Lake has AVX-VNNI but not AVX512
~/intelsde/sde -mtl -- blaze-bin/third_party/libyuv/libyuv_test --gunit_filter=*CpuHas
doyuv3
Note: Google Test filter = *CpuHas
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from LibYUVBaseTest
[ RUN      ] LibYUVBaseTest.TestCpuHas
Cpu Flags 0x203ff1
Has X86 0x10
Has SSE2 0x20
Has SSSE3 0x40
Has SSE41 0x80
Has SSE42 0x100
Has AVX 0x200
Has AVX2 0x400
Has ERMS 0x800
Has FMA3 0x1000
Has F16C 0x2000
Has AVX512BW 0x0
Has AVX512VL 0x0
Has AVX512VNNI 0x0
Has AVX512VBMI 0x0
Has AVX512VBMI2 0x0
Has AVX512VBITALG 0x0
Has AVX512VPOPCNTDQ 0x0
HAS AVXVNNI 0x200000
Has AVXVNNIINT8 0x0

Running on all cpus the following report avx-vnni
grep 'AVXVNNI 0x2' */*
adl/libyuv64.txt:HAS AVXVNNI 0x200000
gnr/libyuv64.txt:HAS AVXVNNI 0x200000
grr/libyuv64.txt:HAS AVXVNNI 0x200000
mtl/libyuv64.txt:HAS AVXVNNI 0x200000
rpl/libyuv64.txt:HAS AVXVNNI 0x200000
spr/libyuv64.txt:HAS AVXVNNI 0x200000
srf/libyuv64.txt:HAS AVXVNNI 0x200000

while these support avx512 vnni
grep 'VNNI 0x1' */*
clx/libyuv64.txt:Has AVX512VNNI 0x10000
cpx/libyuv64.txt:Has AVX512VNNI 0x10000
gnr/libyuv64.txt:Has AVX512VNNI 0x10000
icl/libyuv64.txt:Has AVX512VNNI 0x10000
icx/libyuv64.txt:Has AVX512VNNI 0x10000
spr/libyuv64.txt:Has AVX512VNNI 0x10000
tgl/libyuv64.txt:Has AVX512VNNI 0x10000

and these support avx-vnni-int8
grep AVXVNNIINT8.0x4 */*
grr/libyuv64.txt:Has AVXVNNIINT8 0x400000
srf/libyuv64.txt:Has AVXVNNIINT8 0x400000

Bug: libyuv:967
Change-Id: I84cd71d1b320e7c284173eb695fc1d3b72d14ddb
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4912017
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2023-10-05 21:24:09 +00:00
Frank Barchard
709d60e6ee VNNI-INT8 detect
- Add kCpuHasAVXVNNIINT8 flag
- Move mips flags up a bit to make space.

~/intelsde/sde -srf         -- blaze-bin/third_party/libyuv/libyuv_test --gunit_filter=*CpuHas
Note: Google Test filter = *CpuHas
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from LibYUVBaseTest
[ RUN      ] LibYUVBaseTest.TestCpuHas
Cpu Flags 0x403ff1
Has X86 0x10
Has SSE2 0x20
Has SSSE3 0x40
Has SSE41 0x80
Has SSE42 0x100
Has AVX 0x200
Has AVX2 0x400
Has ERMS 0x800
Has FMA3 0x1000
Has F16C 0x2000
Has AVX512BW 0x0
Has AVX512VL 0x0
Has AVX512VNNI 0x0
Has AVX512VBMI 0x0
Has AVX512VBMI2 0x0
Has AVX512VBITALG 0x0
Has AVX512VPOPCNTDQ 0x0
Has AVXVNNIINT8 0x400000
Has GFNI 0x0
[       OK ] LibYUVBaseTest.TestCpuHas (32 ms)

INT8 supported on srf and grr
     -srf                Set chip-check and CPUID for Intel(R) Sierra Forest CPU
     -grr                Set chip-check and CPUID for Intel(R) Grand Ridge CPU

Bug: b/303434603
Change-Id: I628007929ff0518b2b36e1469b4d9aed71a9fa8f
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4912015
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-10-04 16:31:36 +00:00
Frank Barchard
f0921806a2 Disable NEON if memory sanitizer is enabled
- MSAN fails on most inline assembly, unaware of what the load and store instructions do.
- MSAN is also failing on row_any functions, which memcpy a correct number of pixels into a buffer that is SIMD vector sized, apply SIMD to the full vector, and then memcpy the exact number of resulting pixels to the output buffer.  MSAN wants the temporary buffer to be initialized.  Which genenerally is done with a memset(buf, 0, sizeof(buf)); to satisify MSAN.
- RVV may not require disabling MSAN, since row functions are all 'any' number of elements, and implementation is intrinsics.

Bug: b/297979878
Change-Id: Ic21200689c0c7d2c85bb1de3eef38570137d3d8b
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4832740
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2023-08-31 18:07:42 +00:00
Anne Redulla
9b6895ccd9 [ssci] Added Shipped field to READMEs
This CL adds the Shipped field (and may update the
License File field) in Chromium READMEs. Changes were
automatically created, so if you disagree with any of
them (e.g. a package is used only for testing purposes
and is not shipped), comment the suggested change and
why.

See the LSC doc at go/lsc-chrome-metadata.

Bug: b:285450740
Change-Id: I69bd0f58ab3b3861498f355e5a5650dcddfa3a6f
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4666442
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Anne Redulla <aredulla@google.com>
2023-07-07 06:09:53 +00:00
Frank Barchard
650be7496f Fix warnings for missing prototypes
- Add static to internal scale and rotate functions
- Remove unittest that tested an internal scale function
- Remove unused private functions
- Include missing scale_argb.h header
- Bump version and apply clang format

Bug: libyuv:830
Change-Id: I45bab0423b86334f9707f935aedd0c6efc442dd4
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4658956
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
2023-06-30 17:46:56 +00:00
Frank Barchard
a366ad714a ARGBAttenuate use (a + b + 255) >> 8
- Makes ARM and Intel match and fixes some off by 1 cases
- Add ARGBToUV444MatrixRow_NEON
- Add ConvertFP16ToFP32Column_NEON
- scale_rvv fix intinsic build error
- disable row_win version of ARGBAttenuate/Unattenuate

Bug: libyuv:936, libyuv:956
Change-Id: Ied99aaad3a11a8eb69212b628c58f86ec0723c38
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4617013
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-06-16 21:37:53 +00:00
Frank Barchard
2a5d7e2fbc FilterRows_NEON - remove unused function - same as InterpolateRow_NEON
- Bump version to 1872
- Add scale_rvv to build files

Bug: libyuv:956
Change-Id: Ib9e9fd840a0774bd35bcdcca55a2596f33272383
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4608519
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-06-13 15:20:02 +00:00
Frank Barchard
157b153b60 Fix tidy warning that uint32_t dither4 should not be const
- Remove const from uint32_t dither4 parameter to fix clang-tidy warning
- Apply clang format
- Bump version
- Remove unused MMI source; superceded by MSA

Bug: None
Change-Id: Id49991db25bca4e99590b415312542d917471c62
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4581882
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2023-06-02 00:42:02 +00:00