Frank Barchard 1019e4537f port I444ToARGB avx2 code from Visual C to GCC.
SSSE3
Note: Google Test filter = *I444ToARGB*
[==========] Running 8 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 8 tests from LibYUVConvertTest
[ RUN      ] LibYUVConvertTest.I444ToARGB_Any
[       OK ] LibYUVConvertTest.I444ToARGB_Any (435 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_Unaligned
[       OK ] LibYUVConvertTest.I444ToARGB_Unaligned (418 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_Invert
[       OK ] LibYUVConvertTest.I444ToARGB_Invert (417 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_Opt
[       OK ] LibYUVConvertTest.I444ToARGB_Opt (411 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Any
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Any (419 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned (432 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Invert
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Invert (435 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Opt
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Opt (421 ms)
[----------] 8 tests from LibYUVConvertTest (3389 ms total)

AVX2
Note: Google Test filter = *I444ToARGB*
[==========] Running 8 tests from 1 test case.
[----------] Global test environment set-up.
[----------] 8 tests from LibYUVConvertTest
[ RUN      ] LibYUVConvertTest.I444ToARGB_Any
[       OK ] LibYUVConvertTest.I444ToARGB_Any (340 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_Unaligned
[       OK ] LibYUVConvertTest.I444ToARGB_Unaligned (325 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_Invert
[       OK ] LibYUVConvertTest.I444ToARGB_Invert (316 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_Opt
[       OK ] LibYUVConvertTest.I444ToARGB_Opt (316 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Any
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Any (315 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned (341 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Invert
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Invert (331 ms)
[ RUN      ] LibYUVConvertTest.I444ToARGB_ARGB_Opt
[       OK ] LibYUVConvertTest.I444ToARGB_ARGB_Opt (329 ms)
[----------] 8 tests from LibYUVConvertTest (2615 ms total)

TBR=harryjin@google.com
BUG=libyuv:492

Review URL: https://codereview.chromium.org/1445893002 .
2015-11-13 18:31:22 -08:00
compare_common.cc xmmword cast for clang 2015-08-18 11:13:12 -07:00
compare_gcc.cc nolint removed 2015-08-31 10:52:13 -07:00
compare_neon64.cc xmmword cast for clang 2015-08-18 11:13:12 -07:00
compare_neon.cc xmmword cast for clang 2015-08-18 11:13:12 -07:00
compare_win.cc xmmword cast for clang 2015-08-18 11:13:12 -07:00
compare.cc xmmword cast for clang 2015-08-18 11:13:12 -07:00
convert_argb.cc rename yuv matrix constants to be more clear about what they are 2015-11-03 17:09:53 -08:00
convert_from_argb.cc change all pix parameters to width for consistency 2015-10-07 22:30:36 -07:00
convert_from.cc rename yuv matrix constants to be more clear about what they are 2015-11-03 17:09:53 -08:00
convert_jpeg.cc libyuv::MJPGToI420() and libyuv::MJPGToARGB() return failure if callback to JPeg fails. 2014-01-28 03:08:59 +00:00
convert_to_argb.cc Remove Q420 fourcc support. 2015-02-11 18:20:54 +00:00
convert_to_i420.cc Remove Q420 fourcc support. 2015-02-11 18:20:54 +00:00
convert.cc change all pix parameters to width for consistency 2015-10-07 22:30:36 -07:00
cpu_id.cc remove mips dsp detect 2015-11-03 16:57:40 -08:00
mjpeg_decoder.cc nolint removed 2015-08-31 10:52:13 -07:00
mjpeg_validate.cc validate scan EOI from end for better coverage 2015-09-14 10:58:51 -07:00
planar_functions.cc rename yuv matrix constants to be more clear about what they are 2015-11-03 17:09:53 -08:00
rotate_any.cc rotate nv12 any width 2015-08-07 23:48:38 -07:00
rotate_argb.cc rotate include and proto cleanup 2015-07-22 18:09:04 -07:00
rotate_common.cc rotate include and proto cleanup 2015-07-22 18:09:04 -07:00
rotate_gcc.cc use visual c 32 bit code for clangcl 2015-08-11 10:10:45 -07:00
rotate_mips.cc rename rotate macros and functions to match 2015-07-27 17:00:41 -07:00
rotate_neon64.cc rotate include and proto cleanup 2015-07-22 18:09:04 -07:00
rotate_neon.cc remove align directives 2015-08-04 17:00:03 -07:00
rotate_win.cc use visual c 32 bit code for clangcl 2015-08-11 10:10:45 -07:00
rotate.cc Remove sse2 functions that also have ssse3 2015-09-30 14:24:44 -07:00
row_any.cc disable more avx2 functions that don't link in chrome 2015-11-09 17:20:02 -08:00
row_common.cc fix yvu constants for avx2 yuv to rgb 2015-11-10 10:45:44 -08:00
row_gcc.cc port I444ToARGB avx2 code from Visual C to GCC. 2015-11-13 18:31:22 -08:00
row_mips.cc remove I422ToBGRA and use I422ToRGBA internally 2015-11-02 10:24:12 -08:00
row_neon64.cc Neon versions of I420AlphaToARGB 2015-11-03 19:21:36 -08:00
row_neon.cc set d19 alpha on inner loop 2015-11-06 11:38:21 -08:00
row_win.cc Raw 24 bit RGB to RGB24 (bgr) 2015-11-03 10:30:30 -08:00
scale_any.cc Box filter for YUV use rows with accumulation buffer for better memory behavior. The old code would do columns accumulated into registers, and then store the result once. This was slow from a memory point of view. The new code does a row of source at a time, updating an accumulation buffer every row. The accumulation buffer is small, and should fit cache. Before each accumulation of N rows, the buffer needs to be reset to zero. If the memset is a bottleneck, it would be faster to do the first row without an add, storing to the accumulation buffer, and then add for the remaining rows. 2015-06-09 01:05:18 +00:00
scale_argb.cc scale with conversion using 2 steps with unittest 2015-11-13 11:25:56 -08:00
scale_common.cc Box filter for YUV use rows with accumulation buffer for better memory behavior. The old code would do columns accumulated into registers, and then store the result once. This was slow from a memory point of view. The new code does a row of source at a time, updating an accumulation buffer every row. The accumulation buffer is small, and should fit cache. Before each accumulation of N rows, the buffer needs to be reset to zero. If the memset is a bottleneck, it would be faster to do the first row without an add, storing to the accumulation buffer, and then add for the remaining rows. 2015-06-09 01:05:18 +00:00
scale_gcc.cc fix avx2 box filter bug for yuv down sampling. 2015-10-07 11:02:33 -07:00
scale_mips.cc remove align directives 2015-08-04 17:00:03 -07:00
scale_neon64.cc workarounds for iOS 64 bit compiler where int passed into assembly needs to be explicitly cast to 'w' register. 2015-05-05 22:46:16 +00:00
scale_neon.cc remove align directives 2015-08-04 17:00:03 -07:00
scale_win.cc clang use scalewin 2015-08-18 14:50:27 -07:00
scale.cc Box filter for YUV use rows with accumulation buffer for better memory behavior. The old code would do columns accumulated into registers, and then store the result once. This was slow from a memory point of view. The new code does a row of source at a time, updating an accumulation buffer every row. The accumulation buffer is small, and should fit cache. Before each accumulation of N rows, the buffer needs to be reset to zero. If the memset is a bottleneck, it would be faster to do the first row without an add, storing to the accumulation buffer, and then add for the remaining rows. 2015-06-09 01:05:18 +00:00
video_common.cc Remove bayer format support from libyuv. This format is very rare and used on legacy hardware. It's not well optimized and has bugs related to odd widths. Removing the format will allow tests to pass under more circumstances, run faster and allow focus on higher priority quality and performance issues. 2015-02-09 19:58:19 +00:00
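
The box filter commit message listed for scale.cc, scale_common.cc and scale_any.cc describes processing one source row at a time into a small accumulation buffer that is reset before each group of N rows. The following is a minimal sketch of that idea, not libyuv's actual code: the function name BoxDownscaleSketch, its parameters, the 32-bit accumulator type, and the rounding are all assumptions made for illustration, and it handles only integer box sizes with dimensions that divide evenly.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of libyuv): downscale an 8-bit plane by
// integer factors box_w x box_h using a row accumulation buffer.
// Assumes box_w > 0, box_h > 0 and src.size() == src_width * src_height.
std::vector<uint8_t> BoxDownscaleSketch(const std::vector<uint8_t>& src,
                                        int src_width, int src_height,
                                        int box_w, int box_h) {
  const int dst_width = src_width / box_w;
  const int dst_height = src_height / box_h;
  std::vector<uint8_t> dst(static_cast<size_t>(dst_width) * dst_height);
  // One accumulator per source column; small enough to stay in cache.
  std::vector<uint32_t> acc(static_cast<size_t>(src_width));
  const uint32_t area = static_cast<uint32_t>(box_w) * box_h;

  for (int dy = 0; dy < dst_height; ++dy) {
    // Reset the buffer before accumulating the next box_h rows.
    std::fill(acc.begin(), acc.end(), 0u);
    // Walk the source one row at a time, adding into the buffer.
    for (int r = 0; r < box_h; ++r) {
      const uint8_t* row =
          src.data() + static_cast<size_t>(dy * box_h + r) * src_width;
      for (int x = 0; x < src_width; ++x) acc[x] += row[x];
    }
    // Sum box_w adjacent accumulators and divide by the box area (rounded).
    for (int dx = 0; dx < dst_width; ++dx) {
      uint32_t sum = 0;
      for (int c = 0; c < box_w; ++c) sum += acc[dx * box_w + c];
      dst[static_cast<size_t>(dy) * dst_width + dx] =
          static_cast<uint8_t>((sum + area / 2) / area);
    }
  }
  return dst;
}
```

The commit message also suggests that, if the reset is a bottleneck, the first row could be stored into the buffer without an add and the remaining rows added afterwards; in this sketch that would replace the std::fill call with a plain copy of the first row.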