The Clang static analyzer (scan-build) in LLVM 14 warns about
array index out of bounds in scaletbl[boxwidth - minboxwidth] in
ScaleAddCols2_C() and ScaleAddCols2_16_C(). The scaletbl array has two
elements. It's not clear the index boxwidth - minboxwidth is either 0 or
1.
Change-Id: I072476e86950154beffe6b1a89915755118b3cbd
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4403882
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Wan-Teh Chang <wtc@google.com>
- Allows code to be optimized with clang 17 -flto-thin
- Bump version number to 1864 to allow detection of fix
- Apply clang format to standardize formatting; No impact on code generated
Bug: chromium:1424089
Change-Id: Ib745836b27915a5e4cb1d7d928ee52659360612b
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4370052
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Justin Green <greenjustin@google.com>
Fix the algorithm for unpacking the lower 2 bits of M2T2 pixels.
Bug: b:258474032
Change-Id: Iea1d63f26e3f127a70ead26bc04ea3d939e793e3
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4337978
Commit-Queue: Justin Green <greenjustin@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
- Convert MergeUVRow_AVX512BW to assembly
- Enable MergeUVRow_AVX512BW for Windows with clangcl
- MergeUVRow_AVX2 use vpmovzxbw and vpsllw
- MergeUVRow_16_AVX2 use vpmovzxbw and vpsllw with different shift for U and V
AMD Zen 4 640x360 100000 iterations
Was
AVX512 MergeUVPlane_Opt (884 ms)
AVX2 MergeUVPlane_Opt (945 ms)
AVX2 MergeUVPlane_16_Opt (2167 ms)
Now
AVX512 MergeUVPlane_Opt (865 ms)
AVX2 MergeUVPlane_Opt (943 ms)
SSE2 MergeUVPlane_Opt (973 ms)
AVX2 MergeUVPlane_16_Opt (2102 ms)
Bug: None
Change-Id: I658ada2a75d44c3f93be8bd3ed96f83d5fa2ab8d
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4271230
Reviewed-by: Fritz Koenig <frkoenig@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
- was dup of 8h but mul of 4s. now use umull
Bug: libyuv:951
Change-Id: If6cb01f5f006c2235886b81ce120642d7e24a9bb
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4166563
Reviewed-by: Justin Green <greenjustin@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
- Convert 10 and 12 bit biplanar formats to planar.
- Shift 10 MSB to 10 LSB
- P010 is similar to NV12 in layout, but uses 10 MSB of 16 bit values.
- I010 is similar to I420 in layout, but uses 10 LSB of 16 bit values.
Bug: libyuv:951
Change-Id: I16a1bc64239d0fa4f41810910da448bf5720935f
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4166560
Reviewed-by: Justin Green <greenjustin@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
- Minor variable name changes first/last to top/bottom
- Comments explaining rotate temporary buffers usage
- Add asserts for scale parameter
- Use NULL and stddef.h instead of 0
- Use void * for allocation in row.h
- Add () around size parameter in macros
Bug: libyuv:926, libyuv:949
Change-Id: Ib55417570926ccada0a0f8abd1753dc12e5b162e
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4136762
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
This initial implementation is based on current unoptimized code in webrtc using just plain for loops.
Bug: libyuv:949
Change-Id: Ic87ee49c3a0b62edbaaa4255c263c1f7be4ea02b
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4110782
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
The I410To420 implementation does a two step approach for scaling down and 10-to-8 bit conversion using the Y plane as temporal storage.
Bug: libyuv:950
Change-Id: I3d35fad4b99e17253230456233fbd947e013c0ec
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4110783
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
- MT2T support for source strides added, but only works for positive values.
- Reduced casting in row_common - one cast per assignment.
- scaling functions use intptr_t for intermediate calculations, then cast strides to ptrdiff_t
Bug: libyuv:948, b/257266635, b/262468594
Change-Id: I0409a0ce916b777da2a01c0ab0b56dccefed3b33
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4102203
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Reviewed-by: Justin Green <greenjustin@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Ernest Hua <ernesthua@google.com>
- fix ifdefs for DetilePlane_16 to use 16 bit versions, not 8 bit. (no functional change)
Bug: b/258474032
Change-Id: Ic07e02d9801e21126ebee0ceb5779aa712a493ce
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4034812
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
- Preserve xmm7 in ScaleRowUp2_Bilinear_12_SSSE3
- Previously xmm7 was used in ScaleRowUp2_Bilinear_12_SSSE3 without being preserved, which violates the Windows x64 calling conventions and can cause undefined behavior.
Bug: libyuv:945, 1218384
Change-Id: If18b292b588573355f9b4ba8c5b9c3fbe143d36b
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3972137
Reviewed-by: Bruce Dawson <brucedawson@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
- Previously was C for both Y and UV.
Was BGRAToI420_Opt (17780 ms)
Now BGRAToI420_Opt (9546 ms)
Bug: b/253491233
Change-Id: Id103d8d5ba0fed0f7a427dd5955e1830275eff6b
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3953131
Reviewed-by: Wan-Teh Chang <wtc@google.com>
- Optimized YUY2ToNV12 that reduces it from 3 steps to 2 steps
- Was SplitUV, memcpy Y, InterpolateUV
- Now YUY2ToY, YUY2ToNVUV
- rollback LIBYUV_UNLIMITED_DATA
3840x2160 1000 iterations:
Pixel 2 Cortex A73
Was YUY2ToNV12_Opt (6515 ms)
Now YUY2ToNV12_Opt (3350 ms)
AB7 Mediatek P35 Cortex A53
Was YUY2ToNV12_Opt (6435 ms)
Now YUY2ToNV12_Opt (3301 ms)
Skylake AVX2 x64
Was YUY2ToNV12_Opt (1872 ms)
Now YUY2ToNV12_Opt (1657 ms)
SSE2 x64
Was YUY2ToNV12_Opt (2008 ms)
Now YUY2ToNV12_Opt (1691 ms)
Windows Skylake AVX2 32 bit x86
Was YUY2ToNV12_Opt (2161 ms)
Now YUY2ToNV12_Opt (1628 ms)
Bug: libyuv:943
Change-Id: I6c2ba2ae765413426baf770b837de114f808f6d0
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3929843
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
- YUV to RGB use linear for first and last row.
- add assert(yuvconstants)
- rename pointers to match row functions.
- use macros that match row functions.
- use 12 bit upsampler for conversions of 10 and 12 bits
Cortex A53 AArch32
I420ToRGB24_Opt (3627 ms)
I422ToRGB24_Opt (4099 ms)
I444ToRGB24_Opt (4186 ms)
I420ToRGB24Filter_Opt (5451 ms)
I422ToRGB24Filter_Opt (5430 ms)
AVX2
Was I420ToRGB24Filter_Opt (583 ms)
Now I420ToRGB24Filter_Opt (560 ms)
Neon Cortex A7
Was I420ToRGB24Filter_Opt (5447 ms)
Now I420ToRGB24Filter_Opt (5439 ms)
Bug: libyuv:938
Change-Id: I1731f2dd591073ae11a756f06574103ba0f803c7
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3906082
Reviewed-by: Justin Green <greenjustin@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
- add tests for all single plane formats that reduce or stay same in size
Bug: b/242233673
Change-Id: Ic25d808114f11995ac56ea9c31b99f66ba36d345
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3828485
Reviewed-by: Wan-Teh Chang <wtc@google.com>
The code already exists to use a specific matrix. This CL simply
adds a function to use a generic YUV matrix for the conversion.
Bug: b/241451603
Change-Id: I0eea7e96a891d045905a9c963b56c053097029ec
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3820903
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
- fix crash when width is not a multiple of 16
- apply clang format
- bump version
Bug: libyuv:940, b/240094327
Change-Id: Ic18e5b7b64f78f26e8b7d8440bf490a679bda200
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3812594
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Explicitly initialize the 'pad' field of RgbConstants to 0. This
prevents the following warning/error in some compilers:
error: missing field 'pad' initializer [-Werror,-Wmissing-field-initializers]
Bug: b/241008246
Change-Id: Id6a0beb75c5c709404290c75915049f8a3898c83
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3808044
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Fix the following MSVC warnings:
src\source\row_win.cc(117): warning C4309: 'argument': truncation of
constant value
src\source\row_win.cc(136): warning C4309: 'argument': truncation of
constant value
src\source\row_win.cc(155): warning C4309: 'argument': truncation of
constant value
src\source\row_win.cc(174): warning C4309: 'argument': truncation of
constant value
src\source\row_common.cc(1712): warning C4244: 'initializing':
conversion from 'uint16_t' to 'int8_t', possible loss of data
src\source\row_common.cc(1731): warning C4244: 'initializing':
conversion from 'int16_t' to 'int8_t', possible loss of data
src\source\row_common.cc(1786): warning C4244: 'initializing':
conversion from 'uint16_t' to 'int8_t', possible loss of data
src\source\row_common.cc(1805): warning C4244: 'initializing':
conversion from 'uint16_t' to 'int8_t', possible loss of data
Bug: libyuv:939
Change-Id: Ie87ba6e716732d1ff1ae5c236dfd9cfdac13439d
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3807105
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
- Define HAS_SCALEROWUP2_BILINEAR_16_SSE2: it's now fixed.
- Correct function name to ScaleRowUp2_Bilinear_16_Any_SSE2:
this row function uses only SSE2 instructions.
Bug: libyuv:882
Change-Id: Ib1c7ac5b09997cb5b32bc54109d8c566af762433
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3800842
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
MergeRGB and SplitRGB use a register to point to 9 shuffle tables.
- fixes an out of registers error with -mcmodel=large
InterpolateRow_16To8_NEON improves performance for I210ToI420:
On Pixel 4 for 720p x1000 images
Was I210ToI420_Opt (608 ms)
Now I210ToI420_Opt (336 ms)
On Skylake Xeon
Was I210ToI420_Opt (259 ms)
Now I210ToI420_Opt (209 ms)
Bug: libyuv:931, libyuv:930
Change-Id: I20f8244803f06da511299bf1a2ffc7945eb35221
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3717054
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Justin Green <greenjustin@google.com>
- Avoid stepping to height + 1 for bilinear filter 2nd row for last row of source
- Box filter ubsan fix for 3/4 and 3/8 scaling for 16 bit planar
- Height 1 asan fixes
Bug: libyuv:935, b/206716399
Change-Id: I56088520f2a884a37b987ee5265def175047673e
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3717263
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Fixes chromium PaintCanvasVideoRendererTest.HighBitDepth
sqdmulh was creating a 9 bit value with rounding, and then shifted it right 1 with no rounding. The rounding had an off by 1 impact in some tests.
Pixel 3
C I010ToI420_Opt (749 ms)
Was sqdmulh I010ToI420_Opt (370 ms)
Now ushl I010ToI420_Opt (324 ms)
Pixel 4
C I010ToI420_Opt (581 ms)
Was sqdmulh I010ToI420_Opt (240 ms)
Now ushl I010ToI420_Opt (231 ms)
Bug: b/216321733, b/233233302
Change-Id: I26f673bb411401d1e4a8126bf22d61c649223e9b
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3694143
Reviewed-by: Justin Green <greenjustin@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
- Add I210ToI420 to convert 10 bit 4:2:2 YUV to 4:2:0 8 bit
- Add NEON InterpolateRow_16 for fast 10 bit scaling
- When scaling up, set step to interpolate toward height - 1 to avoid buffer overread
- When scaling down, center the 2 rows used for source to achieve filtering.
- CopyPlane check for 0 size and return
Bug: libyuv:931, b/228605787, b/233233302, b/233634772, b/234558395, b/234340482
Change-Id: I63e8580710a57812b683c2fe40583ac5a179c4f1
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3687552
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
If a width, height, and src/dst strides passed in are all 0, height is updated to 1 which means some CPU optimized functions may try to copy data when the dst rect is not valid.
Bug: b:234340482
Change-Id: I63be1c6ba05d669d67f5079d812acbec09c8f6c9
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3689909
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Pixel 3
Was C I010ToI420_Opt (749 ms)
Now NEON I010ToI420_Opt (356 ms)
Pixel 4
Was C I010ToI420_Opt (581 ms)
Now NEON I010ToI420_Opt (163 ms)
Bug: b/233233302, b/233634772
Change-Id: I60a84648a66f77d97c0a7822b29bd18b8e3a3355
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3661401
Reviewed-by: Justin Green <greenjustin@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
This function reads 2 byte values and writes the 2nd byte to the destination.
It turns out this is useful for P010ToNV12 as well, so adding the planar function allows a high level to call this.
And adds UYVY support for something YUY2 already had. Which is writing the 1st byte.
Bug: b/233233302, b/233634772
Change-Id: I10a9454cb4f5b2c4ac5532fa86feddf78284d8b8
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3659055
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
The "vpackuswb %%xmm2,%%xmm0,%%xmm0" and "vmovdqu %%xmm0,(%1)"
instructions in ScaleRowUp2_Linear_SSSE3() are AVX instructions. They
cause an illegal instruction exception on CPUs that do not support AVX.
Bug: libyuv:927
Bug: chromium:1312551
Change-Id: I87b2aaf041e7d185e7e8fb07172d4f37482e9d08
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3585881
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Wan-Teh Chang <wtc@google.com>
When doing 90 or 270 degrees rotation we need to do a rotate&scale of the UV planes, as there are no helper optimized functions to do this, we use the Y plane as temporal memory and perform each of the transforms independently:
First U plane is rotated, putting the result in the Y plane. After the rotation, the output has double the samples horizontally and half the samples vertically, so it is scaled into the final U plane. Same process is done with the V plane.
Last the Y plane that can be just rotated without scaling.
It would be great to have an optimized version for this, but maybe this is helpfull for triggering the discussions.
Bug: libyuv:926
Change-Id: I188af103c4d0e3f9522021b4bf2b63c9d5de8b93
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3568424
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
- Unrolled to 16 pixels
- Take constants via structure, allowing different colorspace and channel order
- Use ADDHN to add 16.5 and take upper 8 bits of 16 bit values, narrowing to 8 bits
- clang-format applied, affecting mips code
On Cortex A510
Was RAWToJ400_Opt (1623 ms)
Now RAWToJ400_Opt (862 ms)
C RAWToJ400_Opt (1627 ms)
Bug: b/220171611
Change-Id: I06a9baf9650ebe2802fb6ff6dfbd524e2c06ada0
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3534023
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
When used in C enum keyword can't be eliminated.
Bug: libyuv:872
Change-Id: Iacff5a8bd84ec7caa1f90889e48f81ffc10071ae
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3513317
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
This patch adds some deleted control macros so that these MSA
optimization functions can be called normally on mips platform.
There are also some modifications to adapt to the clang compiler.
Bug: libyuv:918
Change-Id: I6ffadc6582682b5eaeae2e0f4033d66d370b48b9
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3494667
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
This patch fixes compilation errors caused by the removal of kUVBias
and two failed test cases of LibYUVConvertTest.RGB565ToI420_Opt and
LibYUVConvertTest.ARGB1555ToI420_Opt.
Bug: libyuv:918
Change-Id: I1a66bcd7ef616aacbeca5b4015013015ccdf0f18
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3477416
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Add support for MM21 to NV12 and I420 conversion, and add SIMD
optimizations for arm, aarch64, SSE2, and SSSE3 machines.
Bug: libyuv:915, b/215425056
Change-Id: Iecb0c33287f35766a6169d4adf3b7397f1ba8b5d
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3433269
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Justin Green <greenjustin@google.com>
Optimize 20 functions in source/scale_lsx.cc file.
All test cases passed on loongarch platform.
Bug: libyuv:913
Change-Id: I85bcb3b0bfd9461bb6f93202546507352cbd624a
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3351469
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Optimize two functions in source/rotate_lsx.cc file.
All test cases passed on loongarch platform.
Bug: libyuv:913
Change-Id: Idf670a1bc078f6284a499a292e0cb795f5b603b4
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3351468
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Optimize 44 functions in source/row_lsx.cc file.
All test cases passed on loongarch platform.
Bug: libyuv:913
Change-Id: Ic80a5751314adc2e9bd435f2bbd928ab017a90f9
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3351467
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
1. Add supports for LSX and LASX.
2. Three optimization functions are added in loongarch/row_lasx.cc file:
I422ToARGBRow_LASX,I422ToRGBARow_LASX,I422AlphaToARGBRow_LASX.
Bug: libyuv:912, Bug: libyuv:913
Change-Id: I043c2704f99a5215724b5c0b7f97e6bf5f7a199b
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3329189
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
- adapted from Android420ToI420, adding a rotation parameter
- SplitRotateUV added to rotate and split the UV channel of NV12 or NV21
- rename RotateUV functions to SplitRotateUV
Bug: b/203549508
Change-Id: I6774da5fb5908fdf1fc12393f0001f41bbda9851
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3251282
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
- ubsan complains on unaligned tests when an int16 or int32 is stored unaligned in C.
Although current Intel, ARM, Mips and PPC can do unaligned load/store, its not guaranteed
and could crash a CPU that doesnt support it.
- unaligned tests use offset of 2 or 4, which ubsan accepts.
- unittest fills in random buffer with 2 bytes at a time instead of a short.
- row common functions for int16 types use 2 shorts instead of 1 int.
Bug: libyuv:908, b/203243873
Change-Id: Idf13fa901647d7b0975f1947291caa781999a9bc
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3229782
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
- reenable Intel SIMD unaffected by BIT_EXACT
- add bit exact version of ARGBAttenuate, which uses ARM version of formula.
- add bit exact version of ARGBUnatenuate, which mimics the AVX code.
Apply clang format to cleanup code.
Bug: libyuv:908, b/202888439
Change-Id: Ie842b1b3956b48f4190858e61c02998caedc2897
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3224702
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
- C code use ARM path, so NEON and C match
- C used on Intel platforms, disabling AVX.
Bug: libyuv:908, b/202888439
Change-Id: Ie035a150a60d3cf4ee7c849a96819d43640cf020
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3223507
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
With "m" GCC generates a memory address with offset which is
not allowed with ld1 on aarch64. Change constraint to "Q" to
force address without offset.
Bug: chromium:819294, libyuv:903
Change-Id: Iaae24bc6882cdef823259040a37fdbfc31f91185
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2922146
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
- swap U and V when crop x is odd
- document YUY2 and UYVY formats
- apply clang-format
Bug: libyuv:902
Change-Id: I045e44c907f4a9eb625d7c024b669bb308055f32
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3039549
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>