libyuv

mirror of https://chromium.googlesource.com/libyuv/libyuv synced 2026-08-03 02:06:19 +08:00

George Steed 2c32b689e4 [AArch64] Improve instruction interleaving in READI212_SVE	2024-12-03 21:50:47 +00:00

The existing instruction arrangement is sub-optimal on little cores since it has instructions with dependencies next to each other, so spread them out to improve performance.

No significant change observed on bigger cores, but little cores do show some small improvements except for the Alpha kernels which regress slightly.

Runtimes observed compared to the previous SVE implementation:

                   | Cortex-A510 | Cortex-A520
I210AlphaToARGBRow |   (!) +7.0% |   (!) +6.8%
I210ToAR30Row      |      -10.3% |       -9.9%
I210ToARGBRow      |       -2.4% |       -2.3%
I212ToAR30Row      |      -10.3% |       -9.9%
I212ToARGBRow      |       -2.4% |       -2.3%

Change-Id: I626942ce02c4610cfac1ea4f8e7890653ee4324f
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6067150
Reviewed-by: Frank Barchard <fbarchard@chromium.org>

..
compare_common.cc	clang-tidy applied	2021-04-01 21:42:47 +00:00
compare_gcc.cc	Add volatile for gcc inline to avoid being removed	2024-07-02 01:25:24 +00:00
compare_msa.cc	use unix line endings	2018-06-20 23:19:59 +00:00
compare_neon64.cc	Add volatile for gcc inline to avoid being removed	2024-07-02 01:25:24 +00:00
compare_neon.cc	Add volatile for gcc inline to avoid being removed	2024-07-02 01:25:24 +00:00
compare_win.cc	Switch win32 to row_gcc for clangcl.	2021-04-22 19:32:32 +00:00
compare.cc	[AArch64] Add Neon implementation of HashDjb2	2024-05-01 19:37:31 +00:00
convert_argb.cc	Fix bugs in ARGBAttenuateRow_LASX/LSX function	2024-11-30 23:09:04 +00:00
convert_from_argb.cc	[AArch64] Add I8MM implementation of ARGBToUV444Row	2024-07-16 17:32:52 +00:00
convert_from.cc	Change ScalePlane,ScalePlane_16,... to return int	2023-11-03 23:53:24 +00:00
convert_jpeg.cc	PlaneScale, UVScale and ARGBScale test 3x and 4x down sample.	2020-10-28 20:41:59 +00:00
convert_to_argb.cc	Make functions that malloc check for ubsan math overflow	2024-10-08 21:08:34 +00:00
convert_to_i420.cc	Make functions that malloc check for ubsan math overflow	2024-10-08 21:08:34 +00:00
convert.cc	[AArch64] Use full Neon vectors in RGB565To{ARGB,UV,Y}Row_NEON	2024-09-16 04:35:47 +00:00
cpu_id.cc	Add CopyPlane_Unaligned, _Any and _Invert tests/benchmarksCpuId test	2024-11-19 23:53:05 +00:00
mjpeg_decoder.cc	Add AMXINT8 cpu detect	2024-02-15 21:44:47 +00:00
mjpeg_validate.cc	Update to r1732 for more robust jpeg	2019-07-01 22:32:36 +00:00
planar_functions.cc	HalfFloat fix SigIll on aarch64	2024-11-22 22:08:00 +00:00
rotate_any.cc	[AArch64] Fix rotate by odd sizes	2024-07-15 18:13:31 +00:00
rotate_argb.cc	CpuId test FSMR - Fast Short Rep Movsb	2024-11-18 17:56:45 +00:00
rotate_common.cc	[AArch64] Use full vectors in TransposeWx{8 => 16}_NEON	2024-05-21 07:46:42 +00:00
rotate_gcc.cc	Add volatile for gcc inline to avoid being removed	2024-07-02 01:25:24 +00:00
rotate_lsx.cc	[AArch64] Use full vectors in TransposeWx{8 => 16}_NEON	2024-05-21 07:46:42 +00:00
rotate_msa.cc	cpuid show vector length on ARM and RISCV	2024-07-02 18:10:56 +00:00
rotate_neon64.cc	Add volatile for gcc inline to avoid being removed	2024-07-02 01:25:24 +00:00
rotate_neon.cc	Add volatile for gcc inline to avoid being removed	2024-07-02 01:25:24 +00:00
rotate_sme.cc	[AArch64] Re-enable SME only for Linux and new versions of Clang	2024-09-23 09:29:53 +00:00
rotate_win.cc	Switch win32 to row_gcc for clangcl.	2021-04-22 19:32:32 +00:00
rotate.cc	CpuId test FSMR - Fast Short Rep Movsb	2024-11-18 17:56:45 +00:00
row_any.cc	HalfFloat fix SigIll on aarch64	2024-11-22 22:08:00 +00:00
row_common.cc	Change ARGBMultiplyRow_C to match Neon	2024-09-23 21:48:33 +00:00
row_gcc.cc	CpuId test FSMR - Fast Short Rep Movsb	2024-11-18 17:56:45 +00:00
row_lasx.cc	Fix bugs in ARGBAttenuateRow_LASX/LSX function	2024-11-30 23:09:04 +00:00
row_lsx.cc	Fix bugs in ARGBAttenuateRow_LASX/LSX function	2024-11-30 23:09:04 +00:00
row_msa.cc	Fix Bugs on mips platform V2.	2022-03-01 13:16:31 +00:00
row_neon64.cc	HalfFloat fix SigIll on aarch64	2024-11-22 22:08:00 +00:00
row_neon.cc	HalfFloat fix SigIll on aarch64	2024-11-22 22:08:00 +00:00
row_rvv.cc	Fix -Wmissing-prototypes warnings	2024-08-12 19:08:24 +00:00
row_sme.cc	[AArch64] Add SME implementation of I444ToARGBRow	2024-10-29 18:10:23 +00:00
row_sve.cc	[AArch64] Improve instruction interleaving in READI212_SVE	2024-12-03 21:50:47 +00:00
row_win.cc	Fix tidy warning that uint32_t dither4 should not be const	2023-06-02 00:42:02 +00:00
scale_any.cc	[AArch64] Unroll and use TBL in ScaleRowDown34_NEON	2024-09-16 15:37:27 +00:00
scale_argb.cc	[AArch64] Add SME implementation of I422ToARGBRow	2024-10-29 05:49:28 +00:00
scale_common.cc	Fix warnings for missing prototypes	2023-06-30 17:46:56 +00:00
scale_gcc.cc	cpuid show vector length on ARM and RISCV	2024-07-02 18:10:56 +00:00
scale_lsx.cc	DetilePlane and unittest for NEON	2022-01-31 20:05:55 +00:00
scale_msa.cc	Switch to C99 types	2018-01-23 19:16:05 +00:00
scale_neon64.cc	[AArch64] Add Neon implementation of ScaleRowDown2Linear_16	2024-11-25 21:10:26 +00:00
scale_neon.cc	scale_neon.cc: Fix -Wmissing-prototypes warnings	2024-08-13 03:50:51 +00:00
scale_rgb.cc	Make functions that malloc check for ubsan math overflow	2024-10-08 21:08:34 +00:00
scale_rvv.cc	Add volatile for gcc inline to avoid being removed	2024-07-02 01:25:24 +00:00
scale_sme.cc	CpuId test FSMR - Fast Short Rep Movsb	2024-11-18 17:56:45 +00:00
scale_uv.cc	[AArch64] Add SME implementation of ScaleUVRowDown2Box	2024-11-12 18:30:30 +00:00
scale_win.cc	Switch win32 to row_gcc for clangcl.	2021-04-22 19:32:32 +00:00
scale.cc	[AArch64] Add Neon implementation of ScaleRowDown2Linear_16	2024-11-25 21:10:26 +00:00
test.sh	Optimze ABGRToI420 for AVX2	2020-06-04 18:24:45 +00:00
video_common.cc	Lint cleanup after C99 change CL	2018-01-24 19:16:03 +00:00