libyuv/unit_test
Frank Barchard f0a9d6d206 Gaussian reorder for benefit of A73
Roughly, instead of 4 loads and 8 multiplies, use 1 load and 2 multiplies,
4 times over.  The original code, like the code clang and gcc generate
from the C source, did all the loads, then all the math, then the store.
The new code does a load, then the math, then the next load, etc.
This schedules better on current ARM64 CPUs.
The number of registers is also reduced by reusing the same registers.
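
The reordering described above can be illustrated with a scalar sketch. This is not the NEON code from the change: the function names, the element types, and the assumed 1, 4, 6, 4, 1 kernel are all illustrative. It only contrasts the two instruction orders, and an optimizing compiler is free to reschedule either loop, so the real benefit comes from the hand-written assembly.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch, not libyuv's API: column pass of a 5-tap filter
// with assumed weights 1, 4, 6, 4, 1 applied across five input rows.

// Schedule A: issue all the loads, then all the math, then the store.
void GaussColBatched(const uint16_t* r0, const uint16_t* r1,
                     const uint16_t* r2, const uint16_t* r3,
                     const uint16_t* r4, uint32_t* dst, int width) {
  assert(width >= 0);
  for (int x = 0; x < width; ++x) {
    // Five values are live at once, so five registers are needed.
    uint32_t a = r0[x];
    uint32_t b = r1[x];
    uint32_t c = r2[x];
    uint32_t d = r3[x];
    uint32_t e = r4[x];
    dst[x] = a + b * 4 + c * 6 + d * 4 + e;
  }
}

// Schedule B: load one value, fold it into the accumulator, then load
// the next. The temporary t is reused, so fewer registers are live.
void GaussColInterleaved(const uint16_t* r0, const uint16_t* r1,
                         const uint16_t* r2, const uint16_t* r3,
                         const uint16_t* r4, uint32_t* dst, int width) {
  assert(width >= 0);
  for (int x = 0; x < width; ++x) {
    uint32_t acc = r0[x];
    uint32_t t = r1[x];
    acc += t * 4;
    t = r2[x];
    acc += t * 6;
    t = r3[x];
    acc += t * 4;
    t = r4[x];
    acc += t;
    dst[x] = acc;
  }
}
```

Both functions produce identical output; only the order of loads and arithmetic differs.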

HiSilicon ARM A73:

Now
TestGaussRow_Opt (890 ms)
TestGaussCol_Opt (571 ms)

Was
TestGaussRow_Opt (1061 ms)
TestGaussCol_Opt (595 ms)

Qualcomm 821 (Pixel):

Now
TestGaussRow_Opt (571 ms)
TestGaussCol_Opt (474 ms)

Was
TestGaussRow_Opt (751 ms)
TestGaussCol_Opt (520 ms)
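
To compare the timings above as relative savings, a small helper like the following could be used. The function name is hypothetical; the inputs are the "Was" and "Now" millisecond figures reported above.

```cpp
#include <cassert>

// Hypothetical helper: percentage of run time saved when a test goes
// from was_ms to now_ms.
double PercentSaved(double was_ms, double now_ms) {
  assert(was_ms > 0.0);
  return (was_ms - now_ms) / was_ms * 100.0;
}
```

For example, `PercentSaved(1061, 890)` gives the TestGaussRow_Opt saving on the A73, and `PercentSaved(751, 571)` gives the same test's saving on the Qualcomm 821.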

TBR=kjellander@chromium.org
BUG=libyuv:719
TEST=LibYUVPlanarTest.TestGaussRow_Opt

Reviewed-on: https://chromium-review.googlesource.com/627478
Reviewed-by: Cheng Wang <wangcheng@google.com>
Reviewed-by: Frank Barchard <fbarchard@google.com>
Change-Id: I5ec81191d460801f0d4a89f0384f89925ff036de
Reviewed-on: https://chromium-review.googlesource.com/634448
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-08-25 19:00:05 +00:00
testdata Detect asimd as same as Neon for Arm features. Used on Juno aarch64 linux. 2014-09-22 18:30:17 +00:00
basictypes_test.cc break up unittests into categories 2015-10-13 16:01:07 -07:00
color_test.cc clang-format libyuv 2016-11-07 17:37:23 -08:00
compare_test.cc Move compare functions into a unittest class 2017-06-19 19:39:10 +00:00
convert_test.cc Roll chromium_revision da7cc8ca4c..ce95e5d83f (465147:465389) 2017-04-18 22:40:59 +00:00
cpu_test.cc lint warning fixes for CpuID 2017-05-25 22:00:17 +00:00
cpu_thread_test.cc MaskCpuFlags return cpuinfo so InitCpuFlags can call it 2017-05-24 22:27:03 +00:00
math_test.cc clang-format libyuv 2016-11-07 17:37:23 -08:00
planar_test.cc Gaussian reorder for benefit of A73 2017-08-25 19:00:05 +00:00
rotate_argb_test.cc clang-format libyuv 2016-11-07 17:37:23 -08:00
rotate_test.cc clang-format libyuv 2016-11-07 17:37:23 -08:00
scale_argb_test.cc scale test clipping code unused cpu parameters removed 2017-02-14 03:26:50 +00:00
scale_test.cc Add MSA optimized SplitUV, Set, MirrorUV, SobelX and SobelY row functions. 2017-08-17 18:39:22 +00:00
unit_test.cc Move compare functions into a unittest class 2017-06-19 19:39:10 +00:00
unit_test.h scale float samples and return max value 2017-08-04 23:34:30 +00:00
video_common_test.cc clang-format libyuv 2016-11-07 17:37:23 -08:00