1520 Commits

Author SHA1 Message Date
Frank Barchard
10592b60c0 Add required 'Security Critical' field to README.chromium
Marked as yes to match webrtc

Bug: b/365319755
Change-Id: I92ba17e3215b6290211519e2b671087ec1386270
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6170587
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Mirko Bonadei <mbonadei@chromium.org>
2025-01-12 03:45:13 -08:00
Frank Barchard
47ddac2996 Sub sampling conversions use CopyPlane for Y channel
- Replace ScalePlane with CopyPlane for Y channel
- Vertical mirroring is supported, but not horizontal mirroring.
- Check src_y is not null when dst_y is not null for all libyuv functions that allow a null dst_y.
- Apply clang-format
- Bump version to 1899

Bug: None
Change-Id: Id1805b52b8024ba95a7f1b098dabf45af48670eb
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6128599
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2025-01-02 13:34:11 -08:00
Frank Barchard
1c501a8f3f CpuId test FSMR - Fast Short Rep Movsb
- Renumber cpuid bits to use low byte to ID the type of CPU and upper 24 bits for features

Intel CPUs starting at Icelake support FSMR
adl:Has FSMR 0x8000
arl:Has FSMR 0x0
bdw:Has FSMR 0x0
clx:Has FSMR 0x0
cnl:Has FSMR 0x0
cpx:Has FSMR 0x0
emr:Has FSMR 0x8000
glm:Has FSMR 0x0
glp:Has FSMR 0x0
gnr:Has FSMR 0x8000
gnr256:Has FSMR 0x8000
hsw:Has FSMR 0x0
icl:Has FSMR 0x8000
icx:Has FSMR 0x8000
ivb:Has FSMR 0x0
knl:Has FSMR 0x0
knm:Has FSMR 0x0
lnl:Has FSMR 0x8000
mrm:Has FSMR 0x0
mtl:Has FSMR 0x8000
nhm:Has FSMR 0x0
pnr:Has FSMR 0x0
rpl:Has FSMR 0x8000
skl:Has FSMR 0x0
skx:Has FSMR 0x0
slm:Has FSMR 0x0
slt:Has FSMR 0x0
snb:Has FSMR 0x0
snr:Has FSMR 0x0
spr:Has FSMR 0x8000
srf:Has FSMR 0x0
tgl:Has FSMR 0x8000
tnt:Has FSMR 0x0
wsm:Has FSMR 0x0

Intel CPUs starting at Ivybridge support ERMS

adl:Has ERMS 0x4000
arl:Has ERMS 0x4000
bdw:Has ERMS 0x4000
clx:Has ERMS 0x4000
cnl:Has ERMS 0x4000
cpx:Has ERMS 0x4000
emr:Has ERMS 0x4000
glm:Has ERMS 0x4000
glp:Has ERMS 0x4000
gnr:Has ERMS 0x4000
gnr256:Has ERMS 0x4000
hsw:Has ERMS 0x4000
icl:Has ERMS 0x4000
icx:Has ERMS 0x4000
ivb:Has ERMS 0x4000
knl:Has ERMS 0x4000
knm:Has ERMS 0x4000
lnl:Has ERMS 0x4000
mrm:Has ERMS 0x0
mtl:Has ERMS 0x4000
nhm:Has ERMS 0x0
pnr:Has ERMS 0x0
rpl:Has ERMS 0x4000
skl:Has ERMS 0x4000
skx:Has ERMS 0x4000
slm:Has ERMS 0x4000
slt:Has ERMS 0x0
snb:Has ERMS 0x0
snr:Has ERMS 0x4000
spr:Has ERMS 0x4000
srf:Has ERMS 0x4000
tgl:Has ERMS 0x4000
tnt:Has ERMS 0x4000
wsm:Has ERMS 0x0
Change-Id: I18e5a3905f2691ab66d4d0cb6f668c0a0ff72d37
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6027541
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2024-11-18 17:56:45 +00:00
Frank Barchard
ffd791f749 Check malloc allocation sizes are less than SIZE_MAX
Bug: b/371615496
Change-Id: I75a94b08469d6d6b6fd55a8659031cbcb3d48eed
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5912039
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2024-10-07 21:34:15 +00:00
Frank Barchard
4620f17058 ScalePlane crash fix for 3/4 scaling
- Scaling 48 pixels at a time, but calling code checked for 24 pixels
- Added test for scaling to 1080x1920
  libyuv_test --gunit_filter=LibYUVScaleTest.I420ScaleTo1080x1920_Box* --libyuv_width=1440 --libyuv_height=2560

Was
libyuv_test --gunit_filter=LibYUVScaleTest.I420ScaleTo1080x1920_Box* --libyuv_width=1440 --libyuv_height=2560
[ RUN      ] LibYUVScaleTest.I420ScaleTo1080x1920_Box
Segmentation fault
Traceback (most recent call last):

Now
[ RUN      ] LibYUVScaleTest.I420ScaleTo1080x1920_Box
filter 3 -     6741 us C -     3566 us OPT
[       OK ] LibYUVScaleTest.I420ScaleTo1080x1920_Box (43 ms)

Bug: b/366045177
Change-Id: I0ea6c2d6a32b2e7ca44cd030abc9f248115be44a
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5857554
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2024-09-13 01:20:39 +00:00
Frank Barchard
a97746349b Add test for I010ToNV12
- Add support for negative height to invert
- Fix off by 1 on odd width and height
- Bump version to 1895

Initial I010 is 2 step planar conversion

libyuv_test '--gunit_filter=*010ToNV12_Opt' --gunit_also_run_disabled_tests --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=1000 --libyuv_flags=-1 --libyuv_cpu_info=-1

Skylake Xeon
[       OK ] LibYUVConvertTest.I010ToNV12_Opt (2675 ms)
[       OK ] LibYUVConvertTest.P010ToNV12_Opt (1547 ms)
Pixel 7
[       OK ] LibYUVConvertTest.I010ToNV12_Opt (464 ms)
[       OK ] LibYUVConvertTest.P010ToNV12_Opt (125 ms)

Bug: b/357721018, b/357439226
Change-Id: I2ae59783cf328a6592d0ab80c374ae4dc281daf3
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5778595
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2024-08-12 18:57:56 +00:00
Chunbo Hua
e23bc72e8e Bump version number in order to expose new API
Bug: 357721018
Change-Id: I2c6e115cd049db2038631195305c5907764d5c7b
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5768078
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2024-08-07 22:10:05 +00:00
Frank Barchard
32ccd53bb3 Add P010ToNV12 to convert 10 bit biplanar to 8 bit biplanar
- P010 and NV12 have the same layout: Full size Y plane and half size UV plane.
  P010 and NV12 are 4:2:0 subsampling
- P010 uses upper 10 bits of 16 bit elements
- NV12 uses 8 bit elements
- The Convert16To8 used internally will discard the low 2 bits.
- UV order is the same - U first in memory, followed by V, interleaved
- UV plane is be rounded up in size to allow odd size Y to have UV values
- Similar code could be used to convert P210ToNV16, P410ToNV24, with the size
  of the UV plane affected by subsampling 4:2:2 and 4:4:4 variants.

Bug: b/357439226
Change-Id: I5d6ec84d97d0e0cc4008eeb18a929ea28570d6d9
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5761958
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2024-08-05 18:55:44 +00:00
Frank Barchard
4cd90347e7 Rotate use NULL for C compatability
Bug: b/353323977
Change-Id: I2472f23ce8fcc0bc09a292bd6fb758304c6c2b18
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5735714
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2024-07-23 18:02:47 +00:00
Frank Barchard
611806a155 [AArch64] Fix SVE/SME vector length printing in cpuid
A semicolon is treated as the start of a comment by some assemblers
causing the vector length to be reported incorrectly, so use a newline
instead.

- Add volatile asm in row_gcc and row_neon64

Bug: b/5631539
Change-Id: I6b0836fcdd9247ef7b9e8ceda01df3150519ecf8
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5666060
Reviewed-by: Justin Green <greenjustin@google.com>
2024-07-02 19:44:41 +00:00
Frank Barchard
fa16ddbb9f cpuid show vector length on ARM and RISCV
- additional asm volatile changes from github
- rotate mips remove C function - moved to common

Run on Samsung S22
[ RUN      ] LibYUVBaseTest.TestCpuHas
Kernel Version 5.10
Has Arm 0x2
Has Neon 0x4
Has Neon DotProd 0x10
Has Neon I8MM 0x20
Has SVE 0x40
Has SVE2 0x80
Has SME 0x0
SVE vector length: 16 bytes
[       OK ] LibYUVBaseTest.TestCpuHas (0 ms)
[ RUN      ] LibYUVBaseTest.TestCompilerMacros
__ATOMIC_RELAXED 0
__cplusplus 201703
__clang_major__ 17
__clang_minor__ 0
__GNUC__ 4
__GNUC_MINOR__ 2
__aarch64__ 1
__clang__ 1
__llvm__ 1
__pic__ 2
INT_TYPES_DEFINED
__has_feature

Run on RISCV qemu emulating SiFive X280:
[ RUN      ] LibYUVBaseTest.TestCpuHas
Kernel Version 6.6
Has RISCV 0x10000000
Has RVV 0x20000000
RVV vector length: 64 bytes
[       OK ] LibYUVBaseTest.TestCpuHas (4 ms)
[ RUN      ] LibYUVBaseTest.TestCompilerMacros
__ATOMIC_RELAXED 0
__cplusplus 202002
__clang_major__ 9999
__clang_minor__ 0
__GNUC__ 4
__GNUC_MINOR__ 2
__riscv 1
__riscv_vector 1
__riscv_v_intrinsic 12000
__riscv_zve64x 1000000
__clang__ 1
__llvm__ 1
__pic__ 2
INT_TYPES_DEFINED
__has_feature

Bug: b/42280943
Change-Id: I53cf0450be4965a28942e113e4c77295ace70999
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5672088
Reviewed-by: David Gao <davidgao@google.com>
2024-07-02 18:10:56 +00:00
Frank Barchard
616bee5420 Add volatile for gcc inline to avoid being removed
Bug: b/42280943
Change-Id: I4439077a92ffa6dff91d2d10accd5251b76f7544
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5671187
Reviewed-by: David Gao <davidgao@google.com>
2024-07-02 01:25:24 +00:00
Frank Barchard
9d660a0f3b Fix environment variable LIBYUV_CPU_INFO for unittests
- Also bump version number

Bug: libyuv:979
Change-Id: I2903f15f9b9f3cd1b556eba95b01c4c58d1733b7
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5466641
Reviewed-by: James Zern <jzern@google.com>
2024-04-20 17:41:56 +00:00
Frank Barchard
b66c42d4a8 Revert "AMX detect OS support for linux kernel"
This reverts commit 8c8a33762d64b916ae8469cc3fc602a64080a23a.

Reason for revert: breaks sandbox

Original change's description:
> AMX detect OS support for linux kernel
>
> Bug: b/327013106
> Change-Id: Ie1784249f3a121c52e6504ff502bdc3eb245d858
> Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5329907
> Commit-Queue: Frank Barchard <fbarchard@chromium.org>
> Reviewed-by: richard winterton <rrwinterton@gmail.com>

Bug: b/327013106
Change-Id: If54bb84bc1167177c1869763f6ccfdf1f92fbe09
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5332617
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Bot-Commit: Rubber Stamper <rubber-stamper@appspot.gserviceaccount.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2024-02-29 00:33:29 +00:00
Frank Barchard
8c8a33762d AMX detect OS support for linux kernel
Bug: b/327013106
Change-Id: Ie1784249f3a121c52e6504ff502bdc3eb245d858
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5329907
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2024-02-28 03:13:44 +00:00
Frank Barchard
a6a2ec654b Add AMXINT8 cpu detect
sde -spr -- libyuv_test -- --gunit_filter=*Cpu*
Note: Google Test filter = *Cpu*
[==========] Running 4 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 3 tests from LibYUVBaseTest
[ RUN      ] LibYUVBaseTest.TestCpuHas
Cpu Flags 0x57fff9
Has X86 0x8
Has SSE2 0x10
Has SSSE3 0x20
Has SSE41 0x40
Has SSE42 0x80
Has AVX 0x100
Has AVX2 0x200
Has ERMS 0x400
Has FMA3 0x800
Has F16C 0x1000
Has AVX512BW 0x2000
Has AVX512VL 0x4000
Has AVX512VNNI 0x8000
Has AVX512VBMI 0x10000
Has AVX512VBMI2 0x20000
Has AVX512VBITALG 0x40000
Has AVX10 0x0
HAS AVXVNNI 0x100000
Has AVXVNNIINT8 0x0
Has AMXINT8 0x400000
[       OK ] LibYUVBaseTest.TestCpuHas (34 ms)

Bug: b/324356616
Change-Id: I5129b8946363a501bdd570e6dba3936c54aacd6c
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5283433
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2024-02-15 21:44:47 +00:00
Frank Barchard
3e435fe6d4 YUY2ToARGB use ymm6/7 for shuffle constants
- 1 load and 2 shuffles from registers replaces 2 loads and 2 memory shuffles
- vbroadcastf128 16 byte shuffler replaces 32 byte shufflers
- bump version and apply clang-format

libyuv_test '--gunit_filter=*.???2ToARGB_Opt' --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=1000 --libyuv_flags=-1 --libyuv_cpu_info=-1

AMD Zen2
I422ToARGB_Opt (272 ms)
NV12ToARGB_Opt (255 ms)
YUY2ToARGB_Opt (208 ms)

Was
YUY2ToARGB_Opt (214 ms)

Change-Id: I1fa4d462d04536c877d1cab1a14586be8ed1b2f2
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5218447
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2024-01-22 21:47:23 +00:00
Frank Barchard
914624f0b8 YUY2ToARGBMatrix and UYVYToARGBMatrix added to allow any color matrix
Bug: libyuv:971
Change-Id: If15d4598d75500a3717f07d02c0c295fdc58254e
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5214453
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2024-01-19 21:21:37 +00:00
Frank Barchard
5625f42424 I444ToI420 and I422ToI420 check U and V pointers and return -1 if NULL.
- Add detect linux kernel version number in util/cpuid

adbrun -- blaze-bin/third_party/libyuv/cpuid
Kernel Version 4.14
Cpu Flags 0x7
Has ARM 0x2

Bug: libyuv:970
Change-Id: I655ed598db3655ca8448be08f1d71fbc328ced66
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5207990
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2024-01-18 21:56:11 +00:00
Frank Barchard
af6ac8265b AVX10 cpuid detect added
Replace unused popcount feature bit

Bug: libyuv:911
Change-Id: Icd88fcc732751d39b0950d5f09a58bc9ac2c4e30
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5179911
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2024-01-10 00:08:22 +00:00
Frank Barchard
6dc03dacbf Split scale_test and scale_plane_test to allow building on small devices
Bug: libyuv:956
Change-Id: I1903aa616243e891440ed92836dfb0992d31d4cd
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5107257
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2023-12-09 18:39:41 +00:00
Frank Barchard
9e61d7f9c1 Split convert_test and convert_argb_test to allow building on small systems that run out of memory compiling unittests.
Update build files to include the new tests and source code.

Bug: libyuv:956
Change-Id: I6ec0beb6dc9570f0597d7df1835d616489dbaece
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5103585
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-12-08 13:39:56 +00:00
Wan-Teh Chang
fb6341d326 Change ScalePlane,ScalePlane_16,... to return int
Change ScalePlane(), ScalePlane_16(), and ScalePlane_12() to return int
so that they can report memory allocation failures (by returning 1).

BUG=libyuv:968

Change-Id: Ie5c183ee42e3d595302671f9ecb7b3472dc8fdb5
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5005031
Commit-Queue: Wan-Teh Chang <wtc@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2023-11-03 23:53:24 +00:00
Frank Barchard
31e1d6f896 Check allocations that return NULL and return early
BUG=libyuv:968

Change-Id: I9e8594440a6035958511f9c50072820131331fc8
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4977552
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-10-27 17:41:36 +00:00
Frank Barchard
331c361581 AVX-VNNI detect
- Add kCpuHasAVXVNNI flag
- Remove deprecated GFNI detect to make space.

Meteor Lake has AVX-VNNI but not AVX512
~/intelsde/sde -mtl -- blaze-bin/third_party/libyuv/libyuv_test --gunit_filter=*CpuHas
doyuv3

Note: Google Test filter = *CpuHas
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from LibYUVBaseTest
[ RUN      ] LibYUVBaseTest.TestCpuHas
Cpu Flags 0x203ff1
Has X86 0x10
Has SSE2 0x20
Has SSSE3 0x40
Has SSE41 0x80
Has SSE42 0x100
Has AVX 0x200
Has AVX2 0x400
Has ERMS 0x800
Has FMA3 0x1000
Has F16C 0x2000
Has AVX512BW 0x0
Has AVX512VL 0x0
Has AVX512VNNI 0x0
Has AVX512VBMI 0x0
Has AVX512VBMI2 0x0
Has AVX512VBITALG 0x0
Has AVX512VPOPCNTDQ 0x0
HAS AVXVNNI 0x200000
Has AVXVNNIINT8 0x0


AVX-VNNI detect

- Add kCpuHasAVXVNNI flag
- Remove deprecated GFNI detect to make space.

https://bugs.chromium.org/p/libyuv/issues/detail?id=967

Meteor Lake has AVX-VNNI but not AVX512
~/intelsde/sde -mtl -- blaze-bin/third_party/libyuv/libyuv_test --gunit_filter=*CpuHas
doyuv3
Note: Google Test filter = *CpuHas
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from LibYUVBaseTest
[ RUN      ] LibYUVBaseTest.TestCpuHas
Cpu Flags 0x203ff1
Has X86 0x10
Has SSE2 0x20
Has SSSE3 0x40
Has SSE41 0x80
Has SSE42 0x100
Has AVX 0x200
Has AVX2 0x400
Has ERMS 0x800
Has FMA3 0x1000
Has F16C 0x2000
Has AVX512BW 0x0
Has AVX512VL 0x0
Has AVX512VNNI 0x0
Has AVX512VBMI 0x0
Has AVX512VBMI2 0x0
Has AVX512VBITALG 0x0
Has AVX512VPOPCNTDQ 0x0
HAS AVXVNNI 0x200000
Has AVXVNNIINT8 0x0

Running on all cpus the following report avx-vnni
grep 'AVXVNNI 0x2' */*
adl/libyuv64.txt:HAS AVXVNNI 0x200000
gnr/libyuv64.txt:HAS AVXVNNI 0x200000
grr/libyuv64.txt:HAS AVXVNNI 0x200000
mtl/libyuv64.txt:HAS AVXVNNI 0x200000
rpl/libyuv64.txt:HAS AVXVNNI 0x200000
spr/libyuv64.txt:HAS AVXVNNI 0x200000
srf/libyuv64.txt:HAS AVXVNNI 0x200000

while these support avx512 vnni
grep 'VNNI 0x1' */*
clx/libyuv64.txt:Has AVX512VNNI 0x10000
cpx/libyuv64.txt:Has AVX512VNNI 0x10000
gnr/libyuv64.txt:Has AVX512VNNI 0x10000
icl/libyuv64.txt:Has AVX512VNNI 0x10000
icx/libyuv64.txt:Has AVX512VNNI 0x10000
spr/libyuv64.txt:Has AVX512VNNI 0x10000
tgl/libyuv64.txt:Has AVX512VNNI 0x10000

and these support avx-vnni-int8
grep AVXVNNIINT8.0x4 */*
grr/libyuv64.txt:Has AVXVNNIINT8 0x400000
srf/libyuv64.txt:Has AVXVNNIINT8 0x400000

Bug: libyuv:967
Change-Id: I84cd71d1b320e7c284173eb695fc1d3b72d14ddb
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4912017
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2023-10-05 21:24:09 +00:00
Frank Barchard
709d60e6ee VNNI-INT8 detect
- Add kCpuHasAVXVNNIINT8 flag
- Move mips flags up a bit to make space.

~/intelsde/sde -srf         -- blaze-bin/third_party/libyuv/libyuv_test --gunit_filter=*CpuHas
Note: Google Test filter = *CpuHas
[==========] Running 1 test from 1 test suite.
[----------] Global test environment set-up.
[----------] 1 test from LibYUVBaseTest
[ RUN      ] LibYUVBaseTest.TestCpuHas
Cpu Flags 0x403ff1
Has X86 0x10
Has SSE2 0x20
Has SSSE3 0x40
Has SSE41 0x80
Has SSE42 0x100
Has AVX 0x200
Has AVX2 0x400
Has ERMS 0x800
Has FMA3 0x1000
Has F16C 0x2000
Has AVX512BW 0x0
Has AVX512VL 0x0
Has AVX512VNNI 0x0
Has AVX512VBMI 0x0
Has AVX512VBMI2 0x0
Has AVX512VBITALG 0x0
Has AVX512VPOPCNTDQ 0x0
Has AVXVNNIINT8 0x400000
Has GFNI 0x0
[       OK ] LibYUVBaseTest.TestCpuHas (32 ms)

INT8 supported on srf and grr
     -srf                Set chip-check and CPUID for Intel(R) Sierra Forest CPU
     -grr                Set chip-check and CPUID for Intel(R) Grand Ridge CPU

Bug: b/303434603
Change-Id: I628007929ff0518b2b36e1469b4d9aed71a9fa8f
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4912015
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-10-04 16:31:36 +00:00
Frank Barchard
f0921806a2 Disable NEON if memory sanitizer is enabled
- MSAN fails on most inline assembly, unaware of what the load and store instructions do.
- MSAN is also failing on row_any functions, which memcpy a correct number of pixels into a buffer that is SIMD vector sized, apply SIMD to the full vector, and then memcpy the exact number of resulting pixels to the output buffer.  MSAN wants the temporary buffer to be initialized.  Which genenerally is done with a memset(buf, 0, sizeof(buf)); to satisify MSAN.
- RVV may not require disabling MSAN, since row functions are all 'any' number of elements, and implementation is intrinsics.

Bug: b/297979878
Change-Id: Ic21200689c0c7d2c85bb1de3eef38570137d3d8b
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4832740
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2023-08-31 18:07:42 +00:00
Anne Redulla
9b6895ccd9 [ssci] Added Shipped field to READMEs
This CL adds the Shipped field (and may update the
License File field) in Chromium READMEs. Changes were
automatically created, so if you disagree with any of
them (e.g. a package is used only for testing purposes
and is not shipped), comment the suggested change and
why.

See the LSC doc at go/lsc-chrome-metadata.

Bug: b:285450740
Change-Id: I69bd0f58ab3b3861498f355e5a5650dcddfa3a6f
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4666442
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Anne Redulla <aredulla@google.com>
2023-07-07 06:09:53 +00:00
Frank Barchard
650be7496f Fix warnings for missing prototypes
- Add static to internal scale and rotate functions
- Remove unittest that tested an internal scale function
- Remove unused private functions
- Include missing scale_argb.h header
- Bump version and apply clang format

Bug: libyuv:830
Change-Id: I45bab0423b86334f9707f935aedd0c6efc442dd4
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4658956
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
2023-06-30 17:46:56 +00:00
Frank Barchard
a366ad714a ARGBAttenuate use (a + b + 255) >> 8
- Makes ARM and Intel match and fixes some off by 1 cases
- Add ARGBToUV444MatrixRow_NEON
- Add ConvertFP16ToFP32Column_NEON
- scale_rvv fix intinsic build error
- disable row_win version of ARGBAttenuate/Unattenuate

Bug: libyuv:936, libyuv:956
Change-Id: Ied99aaad3a11a8eb69212b628c58f86ec0723c38
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4617013
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-06-16 21:37:53 +00:00
Frank Barchard
2a5d7e2fbc FilterRows_NEON - remove unused function - same as InterpolateRow_NEON
- Bump version to 1872
- Add scale_rvv to build files

Bug: libyuv:956
Change-Id: Ib9e9fd840a0774bd35bcdcca55a2596f33272383
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4608519
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-06-13 15:20:02 +00:00
Frank Barchard
157b153b60 Fix tidy warning that uint32_t dither4 should not be const
- Remove const from uint32_t dither4 parameter to fix clang-tidy warning
- Apply clang format
- Bump version
- Remove unused MMI source; superceded by MSA

Bug: None
Change-Id: Id49991db25bca4e99590b415312542d917471c62
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4581882
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2023-06-02 00:42:02 +00:00
Vignesh Venkatasubramanian
c0f64c14ca Add I412/I212 to I420 functions
They re-use the same method as I410/I210 to I420 with a depth
value of 12 instead of 10.

Bug: b/268505204
Change-Id: I299862b4556461d8c95f0fc1dcd5260e1c1f25cd
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4581867
Commit-Queue: Vignesh Venkatasubramanian <vigneshv@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2023-06-01 19:50:16 +00:00
Frank Barchard
a37799344d ARGBToI420Alpha function to convert ARGB to I420 with Alpha
Bug: b/281866362
Change-Id: Ic1093a887fb483f134c78909cf1ee7495e7345ba
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4534100
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2023-05-17 00:23:24 +00:00
Frank Barchard
6a68b18a96 Bump version and apply clang format
Bug: libyuv:956
Change-Id: I2375a02583789af2a5f13f8dba6c663d5975aaa9
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4522352
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-05-11 11:27:28 +00:00
Frank Barchard
7c6a7e5737 cpuid for arm/mips/riscv initialize buffer
- change cpu printf to hex to better show flags

util/cpuid:
Cpu Flags 0x30000001
Has RISCV 0x10000000
Has RVV 0x20000000


[ RUN      ] LibYUVBaseTest.TestCpuHas
Cpu Flags 0x30000001
Has RISCV 0x10000000
Has RVV 0x20000000
Has RVVZVFH 0x0
[       OK ] LibYUVBaseTest.TestCpuHas (1 ms)
[ RUN      ] LibYUVBaseTest.TestCompilerMacros
__ATOMIC_RELAXED 0
__cplusplus 201703
__clang_major__ 9999
__clang_minor__ 0
__GNUC__ 4
__GNUC_MINOR__ 2
__riscv 1
__riscv_vector 1
__clang__ 1
__llvm__ 1
__pic__ 2
INT_TYPES_DEFINED
__has_feature

Bug: libyuv:956
Change-Id: Iee4f1f34799434390e756de1e6c2c4596d82ace5
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4484957
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-04-27 22:46:27 +00:00
Frank Barchard
68659d0d68 UVScale down by 2 fix for C and optimize for NEON
- update cpu_id to use "re" for fopen to avoid leaking handles if a thread is started while the file is open.

Bug: libyuv:958
Change-Id: I1af9de68fce12e440e1226fc8070634ccb1bf090
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4417176
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-04-12 22:49:20 +00:00
Darren Hsieh
e8af6cb2e4 Add RAWToARGBRow_RVV,RAWToRGBARow_RVV,RAWToRGB24Row_RVV
* Run on SiFive internal FPGA:

RAWToARGB_Opt (~2x vs scalar)

RAWToRGBA_Opt (~2x vs scalar)

RAWToRGB24_Opt (~1.5x vs scalar)

LIBYUV_WIDTH=1280 LIBYUV_HEIGHT=720 LIBYUV_REPEAT=10

Change-Id: I21a13d646589ea2aa3822cb9225f5191068c285b
Signed-off-by: Darren Hsieh <darren.hsieh@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4408357
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-04-07 18:45:08 +00:00
Frank Barchard
464c51a035 AArch32 YUVTORGB_SETUP use load and dup to avoid modifying pointer
- Allows code to be optimized with clang 17 -flto-thin
- Bump version number to 1864 to allow detection of fix
- Apply clang format to standardize formatting; No impact on code generated

Bug: chromium:1424089
Change-Id: Ib745836b27915a5e4cb1d7d928ee52659360612b
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4370052
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Justin Green <greenjustin@google.com>
2023-03-24 19:32:30 +00:00
Frank Barchard
3f219a3501 GCC warning fix for MT2T
- Fix redundent assignment compile warning in GCC
- Apply clang-format
- Bump version to 1863

Bug: libyuv:955
Change-Id: If2b6588cd5a7f068a1745fe7763e90caa7277101
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4344729
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
2023-03-16 06:57:20 +00:00
Frank Barchard
f9b23b9cc0 Transpose 4x4 for SSE2 and AVX2
Skylake Xeon
AVX2 Transpose4x4_Opt (290 ms)
SSE2 Transpose4x4_Opt (302 ms)
C    Transpose4x4_Opt (522 ms)

AMD Zen2
AVX2 Transpose4x4_Opt (136 ms)
SSE2 Transpose4x4_Opt (137 ms)
C    Transpose4x4_Opt (431 ms)

Bug: None
Change-Id: I4997dbd5c5387c22bfd6c5960b421504e4bc8a2a
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4292946
Reviewed-by: Justin Green <greenjustin@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-03-03 17:46:23 +00:00
Frank Barchard
88b050f337 MergeUV AVX512BW use assembly
- Convert MergeUVRow_AVX512BW to assembly
- Enable MergeUVRow_AVX512BW for Windows with clangcl
- MergeUVRow_AVX2 use vpmovzxbw and vpsllw
- MergeUVRow_16_AVX2 use vpmovzxbw and vpsllw with different shift for U and V

AMD Zen 4 640x360 100000 iterations
Was
AVX512 MergeUVPlane_Opt (884 ms)
AVX2   MergeUVPlane_Opt (945 ms)
AVX2   MergeUVPlane_16_Opt (2167 ms)

Now
AVX512 MergeUVPlane_Opt (865 ms)
AVX2   MergeUVPlane_Opt (943 ms)
SSE2   MergeUVPlane_Opt (973 ms)
AVX2   MergeUVPlane_16_Opt (2102 ms)

Bug: None
Change-Id: I658ada2a75d44c3f93be8bd3ed96f83d5fa2ab8d
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4271230
Reviewed-by: Fritz Koenig <frkoenig@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2023-02-22 21:19:08 +00:00
Frank Barchard
2bdc210be9 MergeUV_AVX512BW for I420ToNV12
On Skylake Xeon 640x360 100000 iterations
AVX512   MergeUVPlane_Opt (1196 ms)
AVX2     MergeUVPlane_Opt (1565 ms)
SSE2     MergeUVPlane_Opt (1780 ms)
Pixel 7  MergeUVPlane_Opt (1177 ms)

Bug: None
Change-Id: If47d4fa957cf27781bba5fd6a2f0bf554101a5c6
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4242247
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2023-02-13 20:14:57 +00:00
Frank Barchard
d5aa3d4a76 P010ToI010 and P012ToI012 conversion functions
- Convert 10 and 12 bit biplanar formats to planar.
- Shift 10 MSB to 10 LSB
- P010 is similar to NV12 in layout, but uses 10 MSB of 16 bit values.
- I010 is similar to I420 in layout, but uses 10 LSB of 16 bit values.

Bug: libyuv:951
Change-Id: I16a1bc64239d0fa4f41810910da448bf5720935f
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4166560
Reviewed-by: Justin Green <greenjustin@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-01-13 19:20:12 +00:00
Frank Barchard
6e4b0acb4b I422Rotate take stride for temporary buffers
- Minor variable name changes first/last to top/bottom
- Comments explaining rotate temporary buffers usage
- Add asserts for scale parameter
- Use NULL and stddef.h instead of 0
- Use void * for allocation in row.h
- Add () around size parameter in macros


Bug: libyuv:926, libyuv:949
Change-Id: Ib55417570926ccada0a0f8abd1753dc12e5b162e
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4136762
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2023-01-04 23:11:52 +00:00
Sergio Garcia Murillo
f583b1b4b8 Add I410Copy and I410ToI420 methods
The I410To420 implementation does a two step approach for scaling down and 10-to-8 bit conversion using the Y plane as temporal storage.

Bug: libyuv:950
Change-Id: I3d35fad4b99e17253230456233fbd947e013c0ec
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4110783
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2023-01-03 20:27:28 +00:00
Frank Barchard
3abd6f36b6 Casting for scale functions
- MT2T support for source strides added, but only works for positive values.
- Reduced casting in row_common - one cast per assignment.
- scaling functions use intptr_t for intermediate calculations, then cast strides to ptrdiff_t

Bug: libyuv:948, b/257266635, b/262468594
Change-Id: I0409a0ce916b777da2a01c0ab0b56dccefed3b33
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4102203
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Reviewed-by: Justin Green <greenjustin@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Ernest Hua <ernesthua@google.com>
2022-12-15 22:34:22 +00:00
Frank Barchard
610e0cdead MT2T Warning fixes for fuchsia
Bug: b/258474032, b/257266635
Change-Id: Ic5cbbc60e2e1463361e359a2fe3e97976c1ea929
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4081348
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Justin Green <greenjustin@google.com>
2022-12-06 19:54:40 +00:00
Frank Barchard
8713ba3f0b Add vzeroupper to AVX row functions
- move power of two macro to planar functions source
- revert row.h IS_ALIGNED change

Bug: b/258474032
Change-Id: If87bb8d55c9b9930dd3e378614f8e4faae0870e9
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4035166
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2022-11-17 23:00:08 +00:00
Frank Barchard
2d2cee418a Add Detile_16 planar function for 10 bit MT2T format
- Neon and SSE2
- Any for odd widths

Pixel 2 little core AArch32 build
C
TestDetilePlane_16 (1275 ms)
TestDetilePlane (1203 ms)
Neon
TestDetilePlane_16 (693 ms)
TestDetilePlane (660 ms)

Bug: b/258474032
Change-Id: Idbd09c5e9324e4deef5f1d54090d4b63cc7db812
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4031848
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2022-11-17 02:47:57 +00:00
Frank Barchard
fe9ced6e3c ScaleRowUp2_Bilinear_12_SSSE3 preserve xmm7 for Windows
- Preserve xmm7 in ScaleRowUp2_Bilinear_12_SSSE3
- Previously xmm7 was used in ScaleRowUp2_Bilinear_12_SSSE3 without being preserved, which violates the Windows x64 calling conventions and can cause undefined behavior.


Bug: libyuv:945, 1218384
Change-Id: If18b292b588573355f9b4ba8c5b9c3fbe143d36b
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3972137
Reviewed-by: Bruce Dawson <brucedawson@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2022-10-21 19:35:17 +00:00
Frank Barchard
3da24c3ca3 Detile vld for gcc build fix
- add {} around loaded register

Bug: libyuv:944
Change-Id: I0d916e37beb50bda0838e4867742eb7afa57e1cc
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3957634
Reviewed-by: Justin Green <greenjustin@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2022-10-14 19:12:53 +00:00
Frank Barchard
cb35d5f90e BGRAToI420 use SSSE3 for Y but C for UV when LIBYUV_BIT_EXACT enabled
- Previously was C for both Y and UV.

Was BGRAToI420_Opt (17780 ms)
Now BGRAToI420_Opt (9546 ms)

Bug: b/253491233
Change-Id: Id103d8d5ba0fed0f7a427dd5955e1830275eff6b
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3953131
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2022-10-14 03:09:56 +00:00
Frank Barchard
b9adaef113 Enable unlimited data for YUV to RGB
- Provide LIBYUV_LIMITED_DATA macro for backwards compatiblity

Bug: b/474156256
Change-Id: I5d5d7fb640d51ae3c5ad363f2a28c8bfbd3048a5
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3912081
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2022-09-23 12:51:37 +00:00
Frank Barchard
f9fda6e7d8 Fix shift amount for SSSE3 assembly for I012 format conversions
Bug: libyuv:938, libyuv:942
Change-Id: I6fb6e7e17fa941785e398bc630f465baf72fcabd
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3906091
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2022-09-20 23:07:53 +00:00
Frank Barchard
e4b1ddd8fe Fix immediate offsets for row_neon build on gcc
Bug: libyuv:942
Change-Id: I7d2dc87a44cc1cc5c79c37f407583e0c907dc2de
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3906088
Reviewed-by: Justin Green <greenjustin@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2022-09-20 20:16:13 +00:00
Frank Barchard
248172e2ba I422ToRGB24, I422ToRAW, I422ToRGB24MatrixFilter conversion functions added.
- YUV to RGB use linear for first and last row.
- add assert(yuvconstants)
- rename pointers to match row functions.
- use macros that match row functions.
- use 12 bit upsampler for conversions of 10 and 12 bits

Cortex A53 AArch32
I420ToRGB24_Opt (3627 ms)
I422ToRGB24_Opt (4099 ms)
I444ToRGB24_Opt (4186 ms)
I420ToRGB24Filter_Opt (5451 ms)
I422ToRGB24Filter_Opt (5430 ms)

AVX2
Was I420ToRGB24Filter_Opt (583 ms)
Now I420ToRGB24Filter_Opt (560 ms)

Neon Cortex A7
Was I420ToRGB24Filter_Opt (5447 ms)
Now I420ToRGB24Filter_Opt (5439 ms)

Bug: libyuv:938


Change-Id: I1731f2dd591073ae11a756f06574103ba0f803c7
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3906082
Reviewed-by: Justin Green <greenjustin@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2022-09-20 02:00:52 +00:00
Frank Barchard
f71c83552d I420ToRGB24MatrixFilter function added
- Implemented as 3 steps: Upsample UV to 4:4:4, I444ToARGB, ARGBToRGB24
- Fix some build warnings for missing prototypes.

Pixel 4
I420ToRGB24_Opt (743 ms)
I420ToRGB24Filter_Opt (1331 ms)

Windows with skylake xeon:
x86 32 bit
I420ToRGB24_Opt (387 ms)
I420ToRGB24Filter_Opt (571 ms)
x64 64 bit
I420ToRGB24_Opt (384 ms)
I420ToRGB24Filter_Opt (582 ms)


Bug: libyuv:938, libyuv:830
Change-Id: Ie27f70816ec084437014f8a1c630ae011ee2348c
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3900298
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2022-09-16 19:46:47 +00:00
Frank Barchard
65e7c9d570 MM21ToYUY2 and ABGRToJ420 conversion
MM21 to YUY2 use zip1 for performance

Cortex A510
Was MM21ToYUY2 (612 ms)
Now MM21ToYUY2 (573 ms)

Prefetches help Cortex A53
Was MM21ToYUY2 (4998 ms)
Now MM21ToYUY2 (1900 ms)

Pixel 4 Cortex A76
Was MM21ToYUY2 (215 ms)
Now MM21ToYUY2 (173 ms)

ABGRToJ420
- NEON, SSSE3 and AVX2 row functions
- J400, J420 and J422 formats.
- Added AVX2 for UV on ARGBToJ420.  Was SSSE3

Same code/performance as ARGBToJ420 but with constants re-ordered.
Pixel 4
ABGRToJ420_Opt (623 ms)
ABGRToJ422_Opt (702 ms)
ABGRToJ400_Opt (238 ms)

Skylake Xeon
With LIBYUV_BIT_EXACT which uses C for UV
ABGRToJ420_Opt (988 ms)
ABGRToJ422_Opt (1872 ms)
ABGRToJ400_Opt (186 ms)
Skylake Xeon using AVX2
ABGRToJ420_Opt (251 ms)
ABGRToJ422_Opt (245 ms)
ABGRToJ400_Opt (184 ms)
Skylake Xeon using SSSE3
ABGRToJ420_Opt (328 ms)
ABGRToJ422_Opt (362 ms)
ABGRToJ400_Opt (185 ms)

Bug: b/238137982
Change-Id: I559c3fe3fb80fa2ce5be3d8218736f9cbc627666
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3832111
Reviewed-by: Justin Green <greenjustin@google.com>
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2022-08-16 22:07:38 +00:00
Frank Barchard
1c5a8bb17a AB64ToARGB fix for inplace conversion
- add tests for all single plane formats that reduce or stay same in size

Bug: b/242233673
Change-Id: Ic25d808114f11995ac56ea9c31b99f66ba36d345
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3828485
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2022-08-12 01:28:13 +00:00
Vignesh Venkatasubramanian
9b17af9bef Bump up version to 1838
Commit a5a1102a added a function to the public ABI. Update the
version number to 1838.

Bug: b/241451603

Change-Id: I615792672c0dc097e2b1b2637ec5c3a1586d9f09
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3821166
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2022-08-09 21:47:55 +00:00
Frank Barchard
d53f1beecd RAWToJ400 require multiple of 16 pixels for NEON
- fix crash when width is not a multiple of 16
- apply clang format
- bump version

Bug: libyuv:940, b/240094327
Change-Id: Ic18e5b7b64f78f26e8b7d8440bf490a679bda200
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3812594
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2022-08-04 22:55:48 +00:00
Frank Barchard
b028453ba6 Disable bilinear 16 bit scale up for SSE2
- Undefine HAS_SCALEROWUP2_BILINEAR_16_SSE2
- Save XMM7 in ScaleRowUp2_Bilinear_16_SSE2().
- Rename HAS_SCALEROWUP2LINEAR_xxx to HAS_SCALEROWUP2_LINEAR_xxx
- DetileSplitUVRow_C() is implemented using SplitUVRow_C().
- Changes to unit_test/planar_test.cc.

Bug: libyuv:882
Change-Id: I0a8e8e5fb43bdf58ded87244e802343eacb789f2
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3795063
Reviewed-by: Wan-Teh Chang <wtc@google.com>
2022-08-01 22:54:48 +00:00
Frank Barchard
d248929c05 Enable 256x144 scale tests for libyuv
- This test used to fail on ARM, but is passing now, so re-enable
- Kept behind a flag so it can be disabled with /DDISABLE_SLOW_TESTS

Bug:  libyuv:905, b/197551385
Change-Id: Iff3c75c1778610c136621b595adee4b1004df9a5
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3758943
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2022-07-13 11:08:29 +00:00
Frank Barchard
6900494d90 Merge/SplitRGB fix -mcmodel=large x86 and InterpolateRow_16To8_NEON
MergeRGB and SplitRGB use a register to point to 9 shuffle tables.

- fixes an out of registers error with -mcmodel=large

InterpolateRow_16To8_NEON improves performance for I210ToI420:

On Pixel 4 for 720p x1000 images
Was I210ToI420_Opt (608 ms)
Now I210ToI420_Opt (336 ms)

On Skylake Xeon
Was I210ToI420_Opt (259 ms)
Now I210ToI420_Opt (209 ms)


Bug: libyuv:931, libyuv:930
Change-Id: I20f8244803f06da511299bf1a2ffc7945eb35221
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3717054
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Justin Green <greenjustin@google.com>
2022-06-29 00:00:46 +00:00
Frank Barchard
fe4a50df8e Bilinear scale up msan fix
- Avoid stepping to height + 1 for bilinear filter 2nd row for last row of source
- Box filter ubsan fix for 3/4 and 3/8 scaling for 16 bit planar
- Height 1 asan fixes

Bug: libyuv:935, b/206716399
Change-Id: I56088520f2a884a37b987ee5265def175047673e
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3717263
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2022-06-22 00:11:49 +00:00
Frank Barchard
e906ba9fe9 InterpolateRow_Any test if fraction is 0 and dont memcpy 2nd row.
Bug: b/228605787
Change-Id: Ia8912e4c1599401320ee82882a2593e78bf56582
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3708833
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2022-06-17 18:15:09 +00:00
Frank Barchard
30f9b28048 Add I210ToI420
Bug: libyuv:931, b/228605787, b/233233302, b/233634772, b/234558395, b/234340482
Change-Id: Ib135d0b4ff17665f6a4ab60edb782a7b314219a4
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3696042
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
2022-06-09 08:07:50 +00:00
Frank Barchard
d011314f14 Revert "I210ToI420, InterpolatePlane_16, and ScalePlane Vertical-only asan fix"
This reverts commit 60254a1d846a93a4d7559009004cdd91bcc04d82.

Reason for revert: breaks PaintCanvasVideoRendererTest.HighBitDepth

Original change's description:
> I210ToI420, InterpolatePlane_16, and ScalePlane Vertical-only asan fix
>
> - Add I210ToI420 to convert 10 bit 4:2:2 YUV to 4:2:0 8 bit
> - Add NEON InterpolateRow_16 for fast 10 bit scaling
> - When scaling up, set step to interpolate toward height - 1 to avoid buffer overread
> - When scaling down, center the 2 rows used for source to achieve filtering.
> - CopyPlane check for 0 size and return
>
> Bug:  libyuv:931, b/228605787, b/233233302, b/233634772, b/234558395, b/234340482
> Change-Id: I63e8580710a57812b683c2fe40583ac5a179c4f1
> Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3687552
> Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
> Reviewed-by: richard winterton <rrwinterton@gmail.com>

Bug: libyuv:931, b/228605787, b/233233302, b/233634772, b/234558395, b/234340482
Change-Id: Icc05bb340db0e7fe864061fb501d0a861c764116
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3692886
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Mirko Bonadei <mbonadei@chromium.org>
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
2022-06-07 09:16:05 +00:00
Frank Barchard
60254a1d84 I210ToI420, InterpolatePlane_16, and ScalePlane Vertical-only asan fix
- Add I210ToI420 to convert 10 bit 4:2:2 YUV to 4:2:0 8 bit
- Add NEON InterpolateRow_16 for fast 10 bit scaling
- When scaling up, set step to interpolate toward height - 1 to avoid buffer overread
- When scaling down, center the 2 rows used for source to achieve filtering.
- CopyPlane check for 0 size and return

Bug:  libyuv:931, b/228605787, b/233233302, b/233634772, b/234558395, b/234340482
Change-Id: I63e8580710a57812b683c2fe40583ac5a179c4f1
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3687552
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2022-06-07 01:41:56 +00:00
Frank Barchard
715150b5aa Add UYVYToY function
This function reads 2 byte values and writes the 2nd byte to the destination.
It turns out this is useful for P010ToNV12 as well, so adding the planar function allows a high level to call this.
And adds UYVY support for something YUY2 already had.  Which is writing the 1st byte.

Bug: b/233233302, b/233634772
Change-Id: I10a9454cb4f5b2c4ac5532fa86feddf78284d8b8
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3659055
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2022-05-24 01:42:31 +00:00
Frank Barchard
de71c67e53 MergeUV test fix - depth is 16 (bits)
Bug: b/230550621
Change-Id: Ie36d3b8bdadb4300d54611798a4dfd488c30ca8d
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3609691
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2022-04-27 07:38:05 +00:00
Frank Barchard
d62ee21e66 UVScale fix for vertical-only scaling
Bug: b/228841445
Change-Id: I0342856e1bfcea69851d718459d66926bb170219
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3595240
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Miguel Casas-Sanchez <mcasas@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2022-04-20 01:27:33 +00:00
Frank Barchard
3c0f408607 Enable HAS_DETILESPLITUVROW_NEON
On Pixel 4
Was C
AArch64 TestDetileSplitUVPlane_Benchmark (935 ms)
AArch32 TestDetileSplitUVPlane_Benchmark (787 ms)

Now NEON
AArch64 TestDetileSplitUVPlane_Benchmark (248 ms)
AArch32 TestDetileSplitUVPlane_Benchmark (256 ms)

Bug: libyuv:915, b/228518489
Change-Id: Ib82b702c1321285738c044ad8c2a7805b16f074a
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3594524
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Justin Green <greenjustin@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2022-04-19 21:17:03 +00:00
Frank Barchard
eec8dd37e8 Change ScaleUVRowUp2_Biinear_16_SSE2 to SSE41
Bug: libyuv:928

xed -i scale_gcc.o:
SYM ScaleUVRowUp2_Linear_16_SSE2:
XDIS 0: LOGICAL   SSE2       660FEFED                 pxor xmm5, xmm5
XDIS 4: SSE       SSE2       660F76E4                 pcmpeqd xmm4, xmm4
XDIS 8: SSE       SSE2       660F72D41F               psrld xmm4, 0x1f
XDIS d: SSE       SSE2       660F72F401               pslld xmm4, 0x1
XDIS 12: DATAXFER  SSE2       F30F7E07                 movq xmm0, qword ptr [rdi]
XDIS 16: DATAXFER  SSE2       F30F7E4F04               movq xmm1, qword ptr [rdi+0x4]
XDIS 1b: SSE       SSE2       660F61C5                 punpcklwd xmm0, xmm5
XDIS 1f: SSE       SSE2       660F61CD                 punpcklwd xmm1, xmm5
XDIS 23: DATAXFER  SSE2       660F6FD0                 movdqa xmm2, xmm0
XDIS 27: DATAXFER  SSE2       660F6FD9                 movdqa xmm3, xmm1
XDIS 2b: SSE       SSE2       660F70D24E               pshufd xmm2, xmm2, 0x4e
XDIS 30: SSE       SSE2       660F70DB4E               pshufd xmm3, xmm3, 0x4e
XDIS 35: SSE       SSE2       660FFED4                 paddd xmm2, xmm4
XDIS 39: SSE       SSE2       660FFEDC                 paddd xmm3, xmm4
XDIS 3d: SSE       SSE2       660FFED0                 paddd xmm2, xmm0
XDIS 41: SSE       SSE2       660FFED9                 paddd xmm3, xmm1
XDIS 45: SSE       SSE2       660FFEC0                 paddd xmm0, xmm0
XDIS 49: SSE       SSE2       660FFEC9                 paddd xmm1, xmm1
XDIS 4d: SSE       SSE2       660FFEC2                 paddd xmm0, xmm2
XDIS 51: SSE       SSE2       660FFECB                 paddd xmm1, xmm3
XDIS 55: SSE       SSE2       660F72D002               psrld xmm0, 0x2
XDIS 5a: SSE       SSE2       660F72D102               psrld xmm1, 0x2
XDIS 5f: SSE       SSE4       660F382BC1               packusdw xmm0, xmm1
XDIS 64: DATAXFER  SSE2       F30F7F06                 movdqu xmmword ptr [rsi], xmm0
XDIS 68: MISC      BASE       488D7F08                 lea rdi, ptr [rdi+0x8]
XDIS 6c: MISC      BASE       488D7610                 lea rsi, ptr [rsi+0x10]
XDIS 70: BINARY    BASE       83EA04                   sub edx, 0x4
XDIS 73: COND_BR   BASE       7F9D                     jnle 0x12 <ScaleUVRowUp2_Linear_16_SSE2+0x12>
XDIS 75: RET       BASE       C3                       ret

SYM ScaleUVRowUp2_Bilinear_16_SSE2:
XDIS 0: LOGICAL   SSE2       660FEFFF                 pxor xmm7, xmm7
XDIS 4: SSE       SSE2       660F76F6                 pcmpeqd xmm6, xmm6
XDIS 8: SSE       SSE2       660F72D61F               psrld xmm6, 0x1f
XDIS d: SSE       SSE2       660F72F603               pslld xmm6, 0x3
XDIS 12: DATAXFER  SSE2       F30F7E07                 movq xmm0, qword ptr [rdi]
XDIS 16: DATAXFER  SSE2       F30F7E4F04               movq xmm1, qword ptr [rdi+0x4]
XDIS 1b: SSE       SSE2       660F61C7                 punpcklwd xmm0, xmm7
XDIS 1f: SSE       SSE2       660F61CF                 punpcklwd xmm1, xmm7
XDIS 23: DATAXFER  SSE2       660F6FD0                 movdqa xmm2, xmm0
XDIS 27: DATAXFER  SSE2       660F6FD9                 movdqa xmm3, xmm1
XDIS 2b: SSE       SSE2       660F70D24E               pshufd xmm2, xmm2, 0x4e
XDIS 30: SSE       SSE2       660F70DB4E               pshufd xmm3, xmm3, 0x4e
XDIS 35: SSE       SSE2       660FFED0                 paddd xmm2, xmm0
XDIS 39: SSE       SSE2       660FFED9                 paddd xmm3, xmm1
XDIS 3d: SSE       SSE2       660FFEC0                 paddd xmm0, xmm0
XDIS 41: SSE       SSE2       660FFEC9                 paddd xmm1, xmm1
XDIS 45: SSE       SSE2       660FFEC2                 paddd xmm0, xmm2
XDIS 49: SSE       SSE2       660FFECB                 paddd xmm1, xmm3
XDIS 4d: DATAXFER  SSE2       F30F7E1477               movq xmm2, qword ptr [rdi+rsi*2]
XDIS 52: DATAXFER  SSE2       F30F7E5C7704             movq xmm3, qword ptr [rdi+rsi*2+0x4]
XDIS 58: SSE       SSE2       660F61D7                 punpcklwd xmm2, xmm7
XDIS 5c: SSE       SSE2       660F61DF                 punpcklwd xmm3, xmm7
XDIS 60: DATAXFER  SSE2       660F6FE2                 movdqa xmm4, xmm2
XDIS 64: DATAXFER  SSE2       660F6FEB                 movdqa xmm5, xmm3
XDIS 68: SSE       SSE2       660F70E44E               pshufd xmm4, xmm4, 0x4e
XDIS 6d: SSE       SSE2       660F70ED4E               pshufd xmm5, xmm5, 0x4e
XDIS 72: SSE       SSE2       660FFEE2                 paddd xmm4, xmm2
XDIS 76: SSE       SSE2       660FFEEB                 paddd xmm5, xmm3
XDIS 7a: SSE       SSE2       660FFED2                 paddd xmm2, xmm2
XDIS 7e: SSE       SSE2       660FFEDB                 paddd xmm3, xmm3
XDIS 82: SSE       SSE2       660FFED4                 paddd xmm2, xmm4
XDIS 86: SSE       SSE2       660FFEDD                 paddd xmm3, xmm5
XDIS 8a: DATAXFER  SSE2       660F6FE0                 movdqa xmm4, xmm0
XDIS 8e: DATAXFER  SSE2       660F6FEA                 movdqa xmm5, xmm2
XDIS 92: SSE       SSE2       660FFEE0                 paddd xmm4, xmm0
XDIS 96: SSE       SSE2       660FFEEE                 paddd xmm5, xmm6
XDIS 9a: SSE       SSE2       660FFEE0                 paddd xmm4, xmm0
XDIS 9e: SSE       SSE2       660FFEE5                 paddd xmm4, xmm5
XDIS a2: SSE       SSE2       660F72D404               psrld xmm4, 0x4
XDIS a7: DATAXFER  SSE2       660F6FEA                 movdqa xmm5, xmm2
XDIS ab: SSE       SSE2       660FFEEA                 paddd xmm5, xmm2
XDIS af: SSE       SSE2       660FFEC6                 paddd xmm0, xmm6
XDIS b3: SSE       SSE2       660FFEEA                 paddd xmm5, xmm2
XDIS b7: SSE       SSE2       660FFEE8                 paddd xmm5, xmm0
XDIS bb: SSE       SSE2       660F72D504               psrld xmm5, 0x4
XDIS c0: DATAXFER  SSE2       660F6FC1                 movdqa xmm0, xmm1
XDIS c4: DATAXFER  SSE2       660F6FD3                 movdqa xmm2, xmm3
XDIS c8: SSE       SSE2       660FFEC1                 paddd xmm0, xmm1
XDIS cc: SSE       SSE2       660FFED6                 paddd xmm2, xmm6
XDIS d0: SSE       SSE2       660FFEC1                 paddd xmm0, xmm1
XDIS d4: SSE       SSE2       660FFEC2                 paddd xmm0, xmm2
XDIS d8: SSE       SSE2       660F72D004               psrld xmm0, 0x4
XDIS dd: DATAXFER  SSE2       660F6FD3                 movdqa xmm2, xmm3
XDIS e1: SSE       SSE2       660FFED3                 paddd xmm2, xmm3
XDIS e5: SSE       SSE2       660FFECE                 paddd xmm1, xmm6
XDIS e9: SSE       SSE2       660FFED3                 paddd xmm2, xmm3
XDIS ed: SSE       SSE2       660FFED1                 paddd xmm2, xmm1
XDIS f1: SSE       SSE2       660F72D204               psrld xmm2, 0x4
XDIS f6: SSE       SSE4       660F382BE0               packusdw xmm4, xmm0
XDIS fb: DATAXFER  SSE2       F30F7F22                 movdqu xmmword ptr [rdx], xmm4
XDIS ff: SSE       SSE4       660F382BEA               packusdw xmm5, xmm2
XDIS 104: DATAXFER  SSE2       F30F7F2C4A               movdqu xmmword ptr [rdx+rcx*2], xmm5
XDIS 109: MISC      BASE       488D7F08                 lea rdi, ptr [rdi+0x8]
XDIS 10d: MISC      BASE       488D5210                 lea rdx, ptr [rdx+0x10]
XDIS 111: BINARY    BASE       4183E804                 sub r8d, 0x4
XDIS 115: COND_BR   BASE       0F8FF7FEFFFF             jnle 0x12 <ScaleUVRowUp2_Bilinear_16_SSE2+0x12>
XDIS 11b: RET       BASE       C3                       ret

Change-Id: Ia20860e9c3c45368822cfd8877167ff0bf973dcc
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3587602
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2022-04-15 18:46:09 +00:00
Wan-Teh Chang
18f9110516 Avoid AVX instructions in ScaleRowUp2_Linear_SSSE3
The "vpackuswb   %%xmm2,%%xmm0,%%xmm0" and "vmovdqu     %%xmm0,(%1)"
instructions in ScaleRowUp2_Linear_SSSE3() are AVX instructions. They
cause an illegal instruction exception on CPUs that do not support AVX.

Bug: libyuv:927
Bug: chromium:1312551
Change-Id: I87b2aaf041e7d185e7e8fb07172d4f37482e9d08
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3585881
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Wan-Teh Chang <wtc@google.com>
2022-04-15 00:18:39 +00:00
Frank Barchard
2ad73733d9 I422Rotate update to remove name space for ios build warning
- Remove libyuv:: from within libyuv to resolve a build warning on IOS.
- Check src_y parameter is not NULL if there is a dst_y parameter
- Apply clang-format
- Bump version

Performance on Intel Skylake Xeon
ARGBRotate90_Opt (795 ms)
I420Rotate90_Opt (283 ms)
I422Rotate90_Opt (867 ms)  <-- scales and rotates
I444Rotate90_Opt (565 ms)
NV12Rotate90_Opt (289 ms)

Performance on Pixel 4 (Cortex A76)
ARGBRotate90_Opt (4208 ms)
I420Rotate90_Opt (273 ms)
I422Rotate90_Opt (1207 ms)
I444Rotate90_Opt (718 ms)
NV12Rotate90_Opt (282 ms)


Bug: libyuv:926
Change-Id: I42e1b93a9595f6ed075918e91bed977dd3d23f6f
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3576778
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2022-04-07 21:06:44 +00:00
Wan-Teh Chang
ebd9e130f0 Fix bugs in I010AlphaToARGBMatrixBilinear()
Add a missing increment of src_a and ARGBAttenuateRow() call.

Bug: libyuv:922
Change-Id: I26e04e70c6a1a231cbe54b60c249f4c2e8af112a
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3549976
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Wan-Teh Chang <wtc@google.com>
2022-03-25 15:39:52 +00:00
Frank Barchard
124bf08fee RGBScale function using 3 steps: RGB24ToARGB, ARGBScale, ARGBToRGB24
1920x1080 to/from 1280x720 to ARGB on Intel Skylake Xeon
RGBScaleTo1920x1080_Bilinear (2625 ms)
RGBScaleFrom1920x1080_Bilinear (2115 ms)
ARGBScaleTo1920x1080_Bilinear (1668 ms)
ARGBScaleFrom1920x1080_Bilinear (1164 ms)

Bug: b/224814071
Change-Id: Ifc7611b597409771728b13c9c39e5a7e06131021
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3537341
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2022-03-19 01:44:06 +00:00
Frank Barchard
95b14b2446 RAWToJ400 faster version for ARM
- Unrolled to 16 pixels
- Take constants via structure, allowing different colorspace and channel order
- Use ADDHN to add 16.5 and take upper 8 bits of 16 bit values, narrowing to 8 bits
- clang-format applied, affecting mips code

On Cortex A510
Was RAWToJ400_Opt (1623 ms)
Now RAWToJ400_Opt (862 ms)

C   RAWToJ400_Opt (1627 ms)

Bug: b/220171611
Change-Id: I06a9baf9650ebe2802fb6ff6dfbd524e2c06ada0
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3534023
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2022-03-18 07:22:36 +00:00
Yuan Tong
3aebf69d66 Fix newline in version.h
Bug: libyuv:920
Change-Id: I10406f6db8a1d161d6a6a9539add2075e4a4a7b2
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3527834
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2022-03-16 08:35:13 +00:00
Frank Barchard
42d76a342f RAWToJNV21 function with 2 step conversion
RAWToJ420 + J420ToNV21 on row level

Pixel 6
RAWToJNV21_Opt (320 ms)

Skylake Xeon
RAWToJNV21_Opt (302 ms)

Bug: b/220171611
Change-Id: I39dcce9cf56c576b95666bb4fb1baccf9fbc7f7a
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3495876
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2022-03-01 19:33:49 +00:00
Frank Barchard
e77531f6f1 Fix RotatePlane by 90 on Neon when source width is not a multiple of 8
Bug: b/220888716, b/218875554, b/220205245
Change-Id: I17e118ac9b9a7013386a5f0ad27a2dd249474ae5
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3483576
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2022-02-23 19:16:53 +00:00
Justin Green
b4ddbaf549 Add support for MM21.
Add support for MM21 to NV12 and I420 conversion, and add SIMD
optimizations for arm, aarch64, SSE2, and SSSE3 machines.

Bug: libyuv:915, b/215425056
Change-Id: Iecb0c33287f35766a6169d4adf3b7397f1ba8b5d
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3433269
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Justin Green <greenjustin@google.com>
2022-02-03 17:01:49 +00:00
Frank Barchard
804980bbab DetilePlane and unittest for NEON
Bug: libyuv:915, b/215425056
Change-Id: Iccab1ed3f6d385f02895d44faa94d198ad79d693
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3424820
Reviewed-by: Justin Green <greenjustin@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2022-01-31 20:05:55 +00:00
Frank Barchard
2c6bfc02d5 Remove MMI support
Bug: libyuv:916
Change-Id: I345b7e271ceb4b32fe91e292915e66be40812810
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3415817
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2022-01-26 08:41:33 +00:00
Frank Barchard
cdd62da670 VNNI detect
Bug: libyuv:911
Change-Id: Ic4e7720b4d5c20010470f06a7021d1a2426e765f
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3381495
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2022-01-12 07:08:20 +00:00
Frank Barchard
78625492cb InterpolateRow_AVX2 use AVX2 instead of ERMS for 100%
Bug: b/210066781
Change-Id: I709e403f03bd6b9f8fe693b165b242b784076fe0
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3329072
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2021-12-15 03:54:18 +00:00
Frank Barchard
d7a2d5da87 J400ToARGB optimized for Exynos using ZIP+ST1
Bug: 204562143
Change-Id: I56c98198c02bd0dd1283f1c14837730c92832c39
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3328702
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2021-12-10 01:00:07 +00:00
Frank Barchard
000806f373 NV21ToYUV24 replace ST3 with ST1. ARGBToAR64 replace ST2 with ST1
On Samsung S8 Exynos M2
Was ST3 NV21ToYUV24_Opt (769 ms)
Now ST1 NV21ToYUV24_Opt (473 ms)
Was ST2 ARGBToAR64_Opt (1759 ms)
Now ST1 ARGBToAR64_Opt (987 ms)

Skylake Xeon, AVX2 version:
Was NV21ToYUV24_Opt (885 ms)
Now NV21ToYUV24_Opt (194 ms)

Bug: b/204562143, b/124413599
Change-Id: Icc9cb64d822cd11937789a4e04fbb773b3e33aa3
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3290664
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2021-11-24 07:38:49 +00:00
Frank Barchard
a04e4f87fb Fix scale any mask parameter bug for NV12Scale
Bug: None
Change-Id: Ib4e174c086162ee709faf4b04c7d5d5847a7de3d
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3267488
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2021-11-08 20:00:04 +00:00
Frank Barchard
fa043c7a64 Android420ToI420Rotate function to convert with rotation
- adapted from Android420ToI420, adding a rotation parameter
- SplitRotateUV added to rotate and split the UV channel of NV12 or NV21
- rename RotateUV functions to SplitRotateUV

Bug: b/203549508
Change-Id: I6774da5fb5908fdf1fc12393f0001f41bbda9851
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3251282
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2021-10-28 22:38:04 +00:00
Frank Barchard
b179f1847a Enable SIMD for exact RGB to Y conversions
Bug: libyuv:908, b/202888439
Change-Id: Icc5470b85d91b441ded9958ee04b4f32246646f0
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3230489
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
2021-10-19 07:54:50 +00:00
Frank Barchard
55b97cb48f BIT_EXACT for unattenuate and attenuate.
- reenable Intel SIMD unaffected by BIT_EXACT
- add bit exact version of ARGBAttenuate, which uses ARM version of formula.
- add bit exact version of ARGBUnatenuate, which mimics the AVX code.

Apply clang format to cleanup code.

Bug: libyuv:908, b/202888439
Change-Id: Ie842b1b3956b48f4190858e61c02998caedc2897
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3224702
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2021-10-15 19:46:02 +00:00
Frank Barchard
11cbf8f976 Add LIBYUV_BIT_EXACT macro to force C to match SIMD
- C code use ARM path, so NEON and C match
- C used on Intel platforms, disabling AVX.

Bug: libyuv:908, b/202888439
Change-Id: Ie035a150a60d3cf4ee7c849a96819d43640cf020
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3223507
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
2021-10-14 20:37:39 +00:00
Frank Barchard
daf9778a24 Fix for failed compile with armv-7a neon gcc
Bug: libyuv:907
Change-Id: I955e83c72b57ce5ba45730030b32f337be610a21
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3216739
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2021-10-12 18:17:50 +00:00
Frank Barchard
d13d9d5972 Disable slow and redundant scaling tests
- Filter None and Filter Linear disabled
- Filter Box disabled in UV and ARGB scaling
- Tests are only disabled if DISABLE_SLOW_TESTS macro is set.

Bug: b/197551385
Change-Id: If0a357541412dc762e61c98ef0d80a2c86292177
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3194126
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
2021-09-30 18:08:20 +00:00
Frank Barchard
b9bd1b5537 DISABLE_SLOW_TESTS replaces ENABLE_SLOW_TESTS
- change default to enable all tests for better test/bot coverage
- DISABLE_SLOW_TESTS turns off tests that are redundent or unoptimized

Bug: libyuv:905, b/197551385
Change-Id: Ia720526864af774a009852751a1a85c6b1b7f978
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3183099
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
2021-09-27 22:40:43 +00:00
Frank Barchard
48d167108f Prune conversion tests to OPT and I420 variations
- ENABLE_FULL_TESTS added internally to select which tests to build

Bug: libyuv:905, b/197551385
Change-Id: Ib4add87fee829402321fd65acebeb6123bf19ec4
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3183182
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
2021-09-25 00:03:41 +00:00
Frank Barchard
b92a60320f ConvertFromI420 respect destination stride for NV12 and NV21
Bug: libyuv:904
Change-Id: Ie1fd39c693e64661eb52f75492a261384db70776
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3176483
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2021-09-22 19:44:06 +00:00