Frank Barchard 65e7c9d570 MM21ToYUY2 and ABGRToJ420 conversion
MM21ToYUY2 uses zip1 for performance

Cortex A510
Was MM21ToYUY2 (612 ms)
Now MM21ToYUY2 (573 ms)

Prefetches help Cortex A53
Was MM21ToYUY2 (4998 ms)
Now MM21ToYUY2 (1900 ms)

Pixel 4 Cortex A76
Was MM21ToYUY2 (215 ms)
Now MM21ToYUY2 (173 ms)
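
The zip1 note above fits the YUY2 layout: YUY2 stores Y0 U0 Y1 V0, which is a byte-wise interleave of a Y row with an NV12-style interleaved UV row. A minimal scalar sketch of that interleave follows. It assumes the MM21 tiles have already been read out as linear rows, and the helper name InterleaveYUVToYUY2Row is hypothetical, not a libyuv function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper, not libyuv's API: builds one YUY2 row from a linear
// Y row and an interleaved UV row (NV12-style chroma, assumed already
// detiled). YUY2 stores Y0 U0 Y1 V0, so the output is a byte-wise zip of
// the two inputs.
static void InterleaveYUVToYUY2Row(const uint8_t* src_y,
                                   const uint8_t* src_uv,
                                   uint8_t* dst_yuy2,
                                   size_t width) {
  assert(width % 2 == 0);  // This sketch handles even widths only.
  for (size_t i = 0; i < width; ++i) {
    dst_yuy2[2 * i + 0] = src_y[i];   // Y0, Y1, Y2, ...
    dst_yuy2[2 * i + 1] = src_uv[i];  // U0, V0, U1, V1, ...
  }
}
```

On AArch64, zip1 interleaves the low halves of two vector registers, so a vector version can produce several output bytes of this pattern per instruction. The actual row function in row_neon64.cc may be structured differently.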

ABGRToJ420
- NEON, SSSE3 and AVX2 row functions
- J400, J420 and J422 formats.
- Added AVX2 for UV on ARGBToJ420 (was SSSE3).

Same code/performance as ARGBToJ420 but with constants re-ordered.
Pixel 4
ABGRToJ420_Opt (623 ms)
ABGRToJ422_Opt (702 ms)
ABGRToJ400_Opt (238 ms)

Skylake Xeon
With LIBYUV_BIT_EXACT which uses C for UV
ABGRToJ420_Opt (988 ms)
ABGRToJ422_Opt (1872 ms)
ABGRToJ400_Opt (186 ms)
Skylake Xeon using AVX2
ABGRToJ420_Opt (251 ms)
ABGRToJ422_Opt (245 ms)
ABGRToJ400_Opt (184 ms)
Skylake Xeon using SSSE3
ABGRToJ420_Opt (328 ms)
ABGRToJ422_Opt (362 ms)
ABGRToJ400_Opt (185 ms)
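
The "constants re-ordered" remark above can be illustrated with a scalar luma sketch: one row loop serves both formats, and only the coefficient table changes with the byte order. The weights below are the common 8-bit approximation of full-range BT.601 luma (0.299 R + 0.587 G + 0.114 B). They are an assumption for illustration, not values copied from libyuv, and the names kArgbToYJ, kAbgrToYJ and PixelsToYJRow are hypothetical. The sketch also assumes libyuv's convention that ARGB is B, G, R, A in memory and ABGR is R, G, B, A.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Illustrative weights, listed in memory byte order; alpha gets weight 0.
// 77 + 150 + 29 = 256, so the sum is divided by 256 with rounding.
static const uint8_t kArgbToYJ[4] = {29, 150, 77, 0};  // bytes: B, G, R, A
static const uint8_t kAbgrToYJ[4] = {77, 150, 29, 0};  // bytes: R, G, B, A

// Hypothetical row function: converts 4-byte pixels to full-range luma.
// The loop is identical for both formats; only coeff differs.
static void PixelsToYJRow(const uint8_t* src,
                          uint8_t* dst_y,
                          size_t width,
                          const uint8_t coeff[4]) {
  for (size_t x = 0; x < width; ++x) {
    const uint8_t* p = src + 4 * x;
    unsigned sum = p[0] * coeff[0] + p[1] * coeff[1] +
                   p[2] * coeff[2] + p[3] * coeff[3];
    dst_y[x] = (uint8_t)((sum + 128) >> 8);  // Round to nearest.
  }
}
```

Calling PixelsToYJRow on an opaque red pixel stored as ARGB with kArgbToYJ, and on the same colour stored as ABGR with kAbgrToYJ, yields the same luma value, which is why the two conversions can share code and performance.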

Bug: b/238137982
Change-Id: I559c3fe3fb80fa2ce5be3d8218736f9cbc627666
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3832111
Reviewed-by: Justin Green <greenjustin@google.com>
Reviewed-by: Wan-Teh Chang <wtc@google.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2022-08-16 22:07:38 +00:00
compare_common.cc clang-tidy applied 2021-04-01 21:42:47 +00:00
compare_gcc.cc Make 2 step transitive tests measure 2 step time. 2021-04-30 18:14:57 +00:00
compare_mmi.cc MMI ifdef guards and add source to various build files. 2018-08-03 18:37:23 +00:00
compare_msa.cc use unix line endings 2018-06-20 23:19:59 +00:00
compare_neon64.cc Scale by even factor low level row function 2020-11-03 21:25:18 +00:00
compare_neon.cc Scale by even factor low level row function 2020-11-03 21:25:18 +00:00
compare_win.cc Switch win32 to row_gcc for clangcl. 2021-04-22 19:32:32 +00:00
compare.cc Remove MMI support 2022-01-26 08:41:33 +00:00
convert_argb.cc Add I422ToRGB565Matrix 2022-08-09 20:15:44 +00:00
convert_from_argb.cc MM21ToYUY2 and ABGRToJ420 conversion 2022-08-16 22:07:38 +00:00
convert_from.cc Change ScaleUVRowUp2_Biinear_16_SSE2 to SSE41 2022-04-15 18:46:09 +00:00
convert_jpeg.cc PlaneScale, UVScale and ARGBScale test 3x and 4x down sample. 2020-10-28 20:41:59 +00:00
convert_to_argb.cc Remove M420 and refactor NV12ToI420 2020-05-26 18:48:00 +00:00
convert_to_i420.cc Fix ConvertToI420 when using YUY2 or UYVY with odd crop_x. 2021-07-19 22:22:22 +00:00
convert.cc MM21ToYUY2 and ABGRToJ420 conversion 2022-08-16 22:07:38 +00:00
cpu_id.cc Add I210ToI420 2022-06-09 08:07:50 +00:00
mjpeg_decoder.cc JPeg decoder remove assert when out of data 2021-09-16 23:11:14 +00:00
mjpeg_validate.cc Update to r1732 for more robust jpeg 2019-07-01 22:32:36 +00:00
planar_functions.cc MM21ToYUY2 and ABGRToJ420 conversion 2022-08-16 22:07:38 +00:00
rotate_any.cc Remove MMI support 2022-01-26 08:41:33 +00:00
rotate_argb.cc Remove MMI support 2022-01-26 08:41:33 +00:00
rotate_common.cc Switch to C99 types 2018-01-23 19:16:05 +00:00
rotate_gcc.cc Make 2 step transitive tests measure 2 step time. 2021-04-30 18:14:57 +00:00
rotate_lsx.cc DetilePlane and unittest for NEON 2022-01-31 20:05:55 +00:00
rotate_mmi.cc MMI ifdef guards and add source to various build files. 2018-08-03 18:37:23 +00:00
rotate_msa.cc Switch to C99 types 2018-01-23 19:16:05 +00:00
rotate_neon64.cc Scale by even factor low level row function 2020-11-03 21:25:18 +00:00
rotate_neon.cc Scale by even factor low level row function 2020-11-03 21:25:18 +00:00
rotate_win.cc Switch win32 to row_gcc for clangcl. 2021-04-22 19:32:32 +00:00
rotate.cc I422Rotate update to remove name space for ios build warning 2022-04-07 21:06:44 +00:00
row_any.cc MM21ToYUY2 and ABGRToJ420 conversion 2022-08-16 22:07:38 +00:00
row_common.cc MM21ToYUY2 and ABGRToJ420 conversion 2022-08-16 22:07:38 +00:00
row_gcc.cc MM21ToYUY2 and ABGRToJ420 conversion 2022-08-16 22:07:38 +00:00
row_lasx.cc RAWToJ400 faster version for ARM 2022-03-18 07:22:36 +00:00
row_lsx.cc Optimize functions for LASX in row_lasx.cc. 2022-03-09 08:52:54 +00:00
row_mmi.cc clang-tidy applied 2021-04-01 21:42:47 +00:00
row_msa.cc Fix Bugs on mips platform V2. 2022-03-01 13:16:31 +00:00
row_neon64.cc MM21ToYUY2 and ABGRToJ420 conversion 2022-08-16 22:07:38 +00:00
row_neon.cc MM21ToYUY2 and ABGRToJ420 conversion 2022-08-16 22:07:38 +00:00
row_win.cc Fix MSVC warnings by adding casts 2022-08-03 21:24:21 +00:00
scale_any.cc Fix SSE2 version of ScalePlaneUp2_16_Bilinear 2022-08-02 20:35:48 +00:00
scale_argb.cc Bilinear scale up msan fix 2022-06-22 00:11:49 +00:00
scale_common.cc Merge/SplitRGB fix -mcmodel=large x86 and InterpolateRow_16To8_NEON 2022-06-29 00:00:46 +00:00
scale_gcc.cc Fix SSE2 version of ScalePlaneUp2_16_Bilinear 2022-08-02 20:35:48 +00:00
scale_lsx.cc DetilePlane and unittest for NEON 2022-01-31 20:05:55 +00:00
scale_mmi.cc MMI Optimized functions I422ToARGB for 1080p video 2019-09-11 21:06:21 +00:00
scale_msa.cc Switch to C99 types 2018-01-23 19:16:05 +00:00
scale_neon64.cc GCC: replace mov .8h with mov .16b 2021-06-01 17:44:56 +00:00
scale_neon.cc Add full 16 bit scaling up by 2x function 2021-03-02 19:29:02 +00:00
scale_rgb.cc RGBScale function using 3 steps: RGB24ToARGB, ARGBScale, ARGBToRGB24 2022-03-19 01:44:06 +00:00
scale_uv.cc Bilinear scale up msan fix 2022-06-22 00:11:49 +00:00
scale_win.cc Switch win32 to row_gcc for clangcl. 2021-04-22 19:32:32 +00:00
scale.cc MM21ToYUY2 and ABGRToJ420 conversion 2022-08-16 22:07:38 +00:00
test.sh Optimze ABGRToI420 for AVX2 2020-06-04 18:24:45 +00:00
video_common.cc Lint cleanup after C99 change CL 2018-01-24 19:16:03 +00:00