70 Commits

Author SHA1 Message Date
Shiyou Yin
bed9292f2c Move init process of msa after mmi.
Some processors support both MSA and MMI.
When both are enabled, MSA is preferred.
This patch moves MSA initialization after MMI so that
MSA overrides MMI and takes effect.

Change-Id: I8a52cce83ee4ec9727d47c99b287c9580329b149
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2155944
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2020-04-28 11:01:51 +00:00
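The ordering described in bed9292f2c can be pictured with a minimal C sketch. It assumes a dispatch pattern in which each feature check may overwrite the previous choice, so the check that runs last wins. The flag names, the `Impl` enum, and `SelectImpl` are hypothetical and exist only for illustration; they are not taken from the library.

```c
/* Hypothetical feature bits; a real library defines its own flags. */
enum CpuFlag { kCpuHasMMI = 1 << 0, kCpuHasMSA = 1 << 1 };

/* Identifiers for the illustrative implementations. */
enum Impl { kImplC = 0, kImplMMI = 1, kImplMSA = 2 };

/* Later assignments win, so the preferred instruction set is checked last. */
static enum Impl SelectImpl(int cpu_flags) {
  enum Impl impl = kImplC; /* portable fallback */
  if (cpu_flags & kCpuHasMMI) {
    impl = kImplMMI;
  }
  if (cpu_flags & kCpuHasMSA) {
    impl = kImplMSA; /* checked after MMI, so MSA overrides it */
  }
  return impl;
}
```

With both bits set, `SelectImpl` returns `kImplMSA`; swapping the two checks would make MMI win instead.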
Frank Barchard
f9aacffa02 Fix arm unittest failure by removing unused FloatDivToByteRow.
Apply clang-format to fix jpeg if() for lint fix.
Change comments about 4th pixel for open source compliance.
Rename UVToVU to SwapUV for consistency with MergeUV.

BUG=b/135532289, b/136515133

Change-Id: I9ce377c57b1d4d8f8b373c4cb44cd3f836300f79
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1685936
Reviewed-by: Chong Zhang <chz@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2019-07-02 20:00:30 +00:00
Martin Storsjö
9b772abf97 Restore the file mode for source files
This was changed in 21be9122aadf7824efe3fc19b2a09ff253a688e1.

Change-Id: I6c04dc92f673557e10c231bd090ec8aa88b6bee4
Reviewed-on: https://chromium-review.googlesource.com/1146183
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
2018-08-06 18:53:32 +00:00
lixia zhang
21be9122aa libyuv:loongson optimize compare/row/scale/rotate files with mmi.
Currently, libyuv supports the MIPS SIMD Architecture (MSA),
but it does not support MultiMedia Instructions (MMI), as found on platforms such as Loongson 3A.

To improve libyuv performance on the Loongson 3A platform,
this change optimizes 98 functions with MMI.

BUG=libyuv:804

Change-Id: I8947626009efad769b3103a867363ece25d79629
Reviewed-on: https://chromium-review.googlesource.com/1122064
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2018-07-20 22:53:04 +00:00
Frank Barchard
b792e0dbc1 tidy applied with all cppcoreguidelines and google
TBR=braveyao@chromium.org
Bug: libyuv:750
Test: builds and runs and passes more tidy tests
Change-Id: I1400a915ee5734c38d19dab9cf1f210ca43d17fc
Reviewed-on: https://chromium-review.googlesource.com/905810
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2018-02-07 02:28:25 +00:00
Frank Barchard
92e22cf5b6 Lint cleanup after C99 change CL
TBR=braveyao@chromium.org
Bug: libyuv:774
Test: git cl lint
Change-Id: I51cf8107a8db17fbc9952d610f3e4d7aac5aa743
Reviewed-on: https://chromium-review.googlesource.com/882217
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2018-01-24 19:16:03 +00:00
Frank Barchard
7e389884a1 Switch to C99 types
Append _t to all sized types.
uint64 becomes uint64_t etc

Bug: libyuv:774
Test: try bots build on all platforms
Change-Id: Ide273d7f8012313d6610415d514a956d6f3a8cac
Reviewed-on: https://chromium-review.googlesource.com/879922
Reviewed-by: Miguel Casas <mcasas@chromium.org>
2018-01-23 19:16:05 +00:00
Frank Barchard
c682abe597 libyuv: fix undefined mul overflow
Bug: libyuv:771
Test: build asm with ubsan and check
Change-Id: I966d0bff74eef9ddfbeb93965fbff24c1472927c
Reviewed-on: https://chromium-review.googlesource.com/860898
Reviewed-by: Weiyong Yao <braveyao@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
2018-01-10 23:22:46 +00:00
Frank Barchard
80077a80c2 HammingDistance_X86 using popcnt assembly
popcnt has a fake dependency on the destination.
This assembly avoids the dependency by using a different
register for each popcnt.

Bug: libyuv:701
Test: LIBYUV_DISABLE_SSSE3=1 out/Release/libyuv_unittest --gtest_filter=*Ham*Opt --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=9999 --libyuv_flags=-1 --libyuv_cpu_info=-1
Change-Id: Ie1d202e2613b7fa8a3c02acd433940e92c80eafa
Reviewed-on: https://chromium-review.googlesource.com/731826
Reviewed-by: Cheng Wang <wangcheng@google.com>
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-10-23 21:15:12 +00:00
Frank Barchard
e23b27d040 Reduce HammingDistance block size to 32k to avoid overflow
Bug: libyuv:701
Test: HammingDistance unittest with large size
Change-Id: Id41a2c27eb8922d03b3a21dab32fa2e7b015ba38
Reviewed-on: https://chromium-review.googlesource.com/708335
Reviewed-by: Cheng Wang <wangcheng@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-10-10 18:42:47 +00:00
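The block-size idea in e23b27d040 can be sketched as follows, assuming the per-block kernel returns a narrow count while the caller keeps a 64-bit total. The helper names are hypothetical, and the byte-wise bit counter stands in for whatever SIMD kernel is really used.

```c
#include <stdint.h>

enum { kBlockSize = 1 << 15 }; /* 32768 bytes per block */

/* Count the set bits of one byte (simple reference version). */
static uint32_t PopCount8(uint8_t v) {
  uint32_t n = 0;
  while (v) {
    n += v & 1u;
    v = (uint8_t)(v >> 1);
  }
  return n;
}

/* Hamming distance of one block. A 32 KiB block yields at most
   32768 * 8 = 262144 differing bits, so the block result stays small. */
static uint32_t HammingBlock(const uint8_t* a, const uint8_t* b, int count) {
  uint32_t diff = 0;
  for (int i = 0; i < count; ++i) {
    diff += PopCount8((uint8_t)(a[i] ^ b[i]));
  }
  return diff;
}

/* Sum per-block results into a 64-bit total, then handle the tail. */
static uint64_t HammingDistanceBlocked(const uint8_t* a, const uint8_t* b,
                                       int count) {
  uint64_t total = 0;
  int i = 0;
  for (; count - i >= kBlockSize; i += kBlockSize) {
    total += HammingBlock(a + i, b + i, kBlockSize);
  }
  if (i < count) {
    total += HammingBlock(a + i, b + i, count - i);
  }
  return total;
}
```

Keeping the block small bounds the value any single kernel call can produce, which is one way to keep a narrow accumulator from overflowing.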
Frank Barchard
60f433fbd9 Revert "ComputeHammingDistance reduce SIMD loop to 1 call when possible."
This reverts commit ec75df5894845b8d6b1341885a78db1de83decd8.

Reason for revert: <INSERT REASONING HERE>

Original change's description:
> ComputeHammingDistance reduce SIMD loop to 1 call when possible.
> 
> 32 bit x86 has high overhead due to -fpic.  So this reduces the
> number of calls by 1.
> 
> TBR=kjellander@chromium.org
> Bug: libyuv:701
> Test: BenchmarkHammingDistance
> Change-Id: I7f557ef047920db65eab362a5f93abbd274ca051
> Reviewed-on: https://chromium-review.googlesource.com/701755
> Reviewed-by: Frank Barchard <fbarchard@google.com>
> Reviewed-by: Cheng Wang <wangcheng@google.com>

TBR=rrwinterton@gmail.com,fbarchard@google.com,wangcheng@google.com

Change-Id: Ia61e8558a8f083c14be5f51e0e141550b6f2b5c1
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: libyuv:701
Reviewed-on: https://chromium-review.googlesource.com/707823
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-10-10 01:16:15 +00:00
Frank Barchard
ec75df5894 ComputeHammingDistance reduce SIMD loop to 1 call when possible.
32 bit x86 has high overhead due to -fpic.  So this reduces the
number of calls by 1.

TBR=kjellander@chromium.org
Bug: libyuv:701
Test: BenchmarkHammingDistance
Change-Id: I7f557ef047920db65eab362a5f93abbd274ca051
Reviewed-on: https://chromium-review.googlesource.com/701755
Reviewed-by: Frank Barchard <fbarchard@google.com>
Reviewed-by: Cheng Wang <wangcheng@google.com>
2017-10-09 22:51:23 +00:00
Frank Barchard
1734712a6f Fix odd length HammingDistance
If length of HammingDistance was not a multiple of 4,
the result was incorrect.  The old tests did not catch this
so a new test is done to count 1s.

Bug: libyuv:740
Test: LibYUVCompareTest.TestHammingDistance
Change-Id: I93db5437821c597f1f162ac263d4a594bb83231f
Reviewed-on: https://chromium-review.googlesource.com/699614
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Reviewed-by: Cheng Wang <wangcheng@google.com>
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-10-04 22:21:36 +00:00
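To illustrate the failure mode fixed in 1734712a6f, here is a small C sketch that processes 4 bytes at a time and then finishes the leftover bytes one by one. The function names are hypothetical. Comparing an all-ones buffer against an all-zeros buffer makes the expected distance equal to the number of 1 bits, which is the kind of check the commit message describes.

```c
#include <stdint.h>
#include <string.h>

/* Count set bits by clearing the lowest set bit until none remain. */
static uint32_t PopCount32(uint32_t v) {
  uint32_t n = 0;
  while (v) {
    v &= v - 1;
    ++n;
  }
  return n;
}

/* Hamming distance that is correct for any length, not only multiples of 4. */
static uint32_t HammingDistanceAnyLength(const uint8_t* a, const uint8_t* b,
                                         int count) {
  uint32_t diff = 0;
  int i = 0;
  /* Main loop: one 32-bit word per iteration. */
  for (; count - i >= 4; i += 4) {
    uint32_t wa, wb;
    memcpy(&wa, a + i, sizeof(wa)); /* memcpy avoids unaligned access */
    memcpy(&wb, b + i, sizeof(wb));
    diff += PopCount32(wa ^ wb);
  }
  /* Tail loop: the 1 to 3 bytes a word-only loop would skip. */
  for (; i < count; ++i) {
    diff += PopCount32((uint32_t)(a[i] ^ b[i]));
  }
  return diff;
}
```

For a 7-byte buffer of 0xFF compared against zeros, the result is 56; a word-only loop would report 32.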
Frank Barchard
fecd741794 Port HammingDistance to SSSE3
Bug: libyuv:701
Test: BenchmarkHammingDistance_Opt
Change-Id: Ibdd5d382677ebef4f82a62e0d5c3b88614a3b6e4
Reviewed-on: https://chromium-review.googlesource.com/696290
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Cheng Wang <wangcheng@google.com>
2017-10-03 19:11:05 +00:00
Frank Barchard
bde789b176 Hamming Distance SSE2 and AVX2 optimized
Bug: None
Test: None
Change-Id: Id52663f9c957aac3172fba92d888ad1b041d5cf0
Reviewed-on: https://chromium-review.googlesource.com/692981
Reviewed-by: Cheng Wang <wangcheng@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
2017-10-02 22:32:54 +00:00
Manojkumar Bhosale
2621c91bf1 Add MSA optimized HammingDistance and SumSquareError functions
TBR=kjellander@chromium.org
R=fbarchard@google.com

Bug:libyuv:634
Change-Id: Id0126ba5aff38817525b1efa6044f1dc2cfa1a36
Reviewed-on: https://chromium-review.googlesource.com/625739
Reviewed-by: Frank Barchard <fbarchard@google.com>
2017-09-05 21:32:33 +00:00
Frank Barchard
e0615c0e69 Optimize Hamming Distance C code to do 64 bits at a time.
BUG=libyuv:701
TEST=LibYUVBaseTest.BenchmarkHammingDistance_C
R=wangcheng@google.com

Change-Id: I243003b098bea8ef3809298bbec349ed52a43d8c
Reviewed-on: https://chromium-review.googlesource.com/499487
Reviewed-by: Cheng Wang <wangcheng@google.com>
2017-05-12 17:53:52 +00:00
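One way to process 64 bits at a time in portable C is a SWAR (SIMD within a register) population count on the XOR of two 64-bit words. The sketch below shows that technique under the assumption that it is an acceptable stand-in; it is not claimed to match the code in e0615c0e69, and the function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* SWAR population count: sum bits in pairs, nibbles, then bytes. */
static uint32_t PopCount64(uint64_t x) {
  x = x - ((x >> 1) & 0x5555555555555555ULL);
  x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
  x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
  /* The multiply adds all byte counts into the top byte. */
  return (uint32_t)((x * 0x0101010101010101ULL) >> 56);
}

/* Hamming distance using one 64-bit XOR and popcount per 8 bytes. */
static uint32_t HammingDistance64(const uint8_t* a, const uint8_t* b,
                                  int count) {
  uint32_t diff = 0;
  int i = 0;
  for (; count - i >= 8; i += 8) {
    uint64_t wa, wb;
    memcpy(&wa, a + i, sizeof(wa));
    memcpy(&wb, b + i, sizeof(wb));
    diff += PopCount64(wa ^ wb);
  }
  /* Remaining bytes, if count is not a multiple of 8. */
  for (; i < count; ++i) {
    diff += PopCount64((uint64_t)(a[i] ^ b[i]));
  }
  return diff;
}
```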
Frank Barchard
e62309f259 clang-format libyuv
BUG=libyuv:654
R=kjellander@chromium.org

Review URL: https://codereview.chromium.org/2469353005 .
2016-11-07 17:37:23 -08:00
Frank Barchard
cda9d38a4e xmmword cast for clang
clangcl uses compare_win for 32 bit, allowing fallback and enabling AVX2 code for clang.
Move defines/protos to compare_row.h.
Fix issue with odd width ARGBCopyAlpha functions by copying the destination to a temp buffer, doing the alpha copy, then copying back to the destination.

R=harryjin@google.com
TBR=harryjin@google.com
BUG=libyuv:484

Review URL: https://webrtc-codereview.appspot.com/59379004.
2015-08-18 11:13:12 -07:00
Frank Barchard
baf6a3c1bd Using the Visual C source allows clangcl to fall back seamlessly to Visual C, and supports SSE41 and AVX2 versions.
R=harryjin@google.com
BUG=libyuv:469

Review URL: https://webrtc-codereview.appspot.com/58469004.
2015-08-17 10:47:43 -07:00
fbarchard@google.com
d28cd77f99 Enable assembly for clangcl build on Windows. Previously assembly was disabled so clangcl would work, but only with C code. As clangcl mimics both Visual C and GCC, ifdefs need to pick one or the other or often you'll end up with both. In this CL we disable most Visual C code and use the GCC versions which allow assembly for both 32 and 64 bit intel.
BUG=412
TESTED=clang=1 build on windows
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/51389004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1341 16f28f9a-4ce2-e073-06de-1de4eb20be90
2015-03-19 20:36:31 +00:00
fbarchard@google.com
a6025e8b6b ARGBDetect do 2 pixels at a time for improved performance.
BUG=375
TESTED=libyuvTest.BenchmarkARGBDetect_Opt
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/26049004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1155 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-05 23:23:17 +00:00
fbarchard@google.com
b661b3ee0d Detect Endian of ARGB image.
BUG=375
TESTED=libyuv builds, but no test app for it yet
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/32389004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1154 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-05 18:46:06 +00:00
fbarchard@google.com
9c4c82181b Remove alignment constraint for SSE2. Allows the optimized function to be used with unaligned memory, improving performance in that use case. Hurts performance on core2 and prior where memory was faster with movdqa instruction.
BUG=365
TESTED=psnr, ssim and djb2 unittests pass.
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/22859004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1100 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-09-30 18:53:34 +00:00
yang.zhang@arm.com
26f43db1ef AArch64:add SumSquareError_NEON armv8 assembly version
BUG=none
TESTED=libyuv_unittest
R=fbarchard@google.com

Review URL: https://webrtc-codereview.appspot.com/16259004/

The benchmarking results are as follows:
toolchain: gcc 4.9
hardware: A53

| count | C Times/NEON times |
| 16    | 3.35               |
| 128   | 6.63               |
| 512   | 7.47               |
| 1024  | 7.72               |

Change-Id: Ic10bf22d77d069a1a2074b68bd5a310c579ec490

Review URL: https://webrtc-codereview.appspot.com/21159004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1043 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-08-13 06:10:02 +00:00
fbarchard@google.com
9c6e52791f Port compare to C89 / Visual C.
BUG=303
TESTED=cl /c /TC /Iinclude source/compare.cc
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/7019006

git-svn-id: http://libyuv.googlecode.com/svn/trunk@966 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-13 18:57:30 +00:00
fbarchard@google.com
da443d7adc Remainder calc needs to be after blocks are done. Move calc to old location.
BUG=303
TESTED=Djb2 unittests
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6849004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@960 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-09 20:29:27 +00:00
fbarchard@google.com
167d5d1c2f Porting parts of compare to c89
BUG=303
TESTED=try bots still build, gcc and vc direct for c testing.
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6739004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@956 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-08 00:59:40 +00:00
fbarchard@google.com
a1f5254a95 Switch to c style casts for all source and includes.
BUG=303
TESTED=try
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6629004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@952 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-07 03:03:00 +00:00
fbarchard@google.com
88c0b01cdd Use 64 bit Sum for planar function to remove size limitation
BUG=302
TESTED=out\release\libyuv_unittest --gtest_filter=*Psnr
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/6499004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@941 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-01-02 22:57:06 +00:00
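The size limitation mentioned in 88c0b01cdd can be illustrated with a C sketch in which each row is summed in 32 bits and the plane total is kept in 64 bits. The function names and signatures here are hypothetical and only resemble the general shape of a planar compare function.

```c
#include <stdint.h>

/* Sum of squared differences for one row.
   32 bits suffice while width * 255 * 255 stays below 2^32. */
static uint32_t SumSquareErrorRow(const uint8_t* a, const uint8_t* b,
                                  int width) {
  uint32_t sse = 0;
  for (int i = 0; i < width; ++i) {
    int d = a[i] - b[i];
    sse += (uint32_t)(d * d);
  }
  return sse;
}

/* Plane total accumulated in 64 bits so large planes do not wrap. */
static uint64_t SumSquareErrorPlane(const uint8_t* a, int stride_a,
                                    const uint8_t* b, int stride_b,
                                    int width, int height) {
  uint64_t sse = 0;
  for (int y = 0; y < height; ++y) {
    sse += SumSquareErrorRow(a, b, width);
    a += stride_a;
    b += stride_b;
  }
  return sse;
}
```

A 4096 x 2000 plane with the maximum difference in every pixel sums to 532,684,800,000, far beyond what a 32-bit total can hold.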
fbarchard@google.com
095f33d870 Coalesce rows by changing width/height and dropping into code instead of recursing. Improve coalesce by setting stride to 0 so it can be used even on odd width images. Reduce unittests to improve time to run emulators.
BUG=277
TEST=unittests all build and pass
R=ryanpetrie@google.com

Review URL: https://webrtc-codereview.appspot.com/2589004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@819 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-10-21 19:29:10 +00:00
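Row coalescing as described in 095f33d870 can be sketched like this: when rows are contiguous in every buffer, the plane is rewritten as a single long row and the code falls through to the normal loop instead of recursing. `CoalesceRows` and `CopyPlaneSketch` are hypothetical names, and the sketch assumes `width * height` fits in an `int`.

```c
#include <stdint.h>

/* Rewrite the geometry as one long row when both buffers are contiguous.
   Returns 1 if coalescing was applied, 0 otherwise. */
static int CoalesceRows(int* width, int* height, int* src_stride,
                        int* dst_stride) {
  if (*src_stride == *width && *dst_stride == *width) {
    *width *= *height; /* assumes the product fits in int */
    *height = 1;
    *src_stride = 0; /* only one row remains, so strides are set to 0 */
    *dst_stride = 0;
    return 1;
  }
  return 0;
}

/* Plane copy that coalesces first, then drops into the ordinary row loop. */
static void CopyPlaneSketch(const uint8_t* src, int src_stride, uint8_t* dst,
                            int dst_stride, int width, int height) {
  CoalesceRows(&width, &height, &src_stride, &dst_stride);
  for (int y = 0; y < height; ++y) {
    for (int x = 0; x < width; ++x) {
      dst[x] = src[x];
    }
    src += src_stride;
    dst += dst_stride;
  }
}
```

For a contiguous 4 x 3 plane the row loop body runs once with a width of 12; a padded source (stride 8) keeps the original three rows.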
fbarchard@google.com
07c3fe2f61 Fix DrMemory errors in unittests that were not initializing memory.
BUG=263
TEST=set GYP_DEFINES=build_for_tool=drmemory target_arch=ia32 & drmemory out\debug\libyuv_unittest.exe --gtest_catch_exceptions=0 --gtest_filter=*
R=kjellander@webrtc.org

Review URL: https://webrtc-codereview.appspot.com/2270007

git-svn-id: http://libyuv.googlecode.com/svn/trunk@798 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-09-24 07:39:10 +00:00
fbarchard@google.com
823548cb3b AVX2 hash using vex128 as first step.
BUG=none
TEST=BenchmarkDjb2_Opt
R=ryanpetrie@google.com

Review URL: https://webrtc-codereview.appspot.com/2219004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@792 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-09-17 20:34:28 +00:00
fbarchard@google.com
0d41aee26b Port compare functions to Nacl
BUG=253
TEST=none
R=nfullagar@google.com

Review URL: https://webrtc-codereview.appspot.com/1998004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@752 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-08-08 23:52:34 +00:00
fbarchard@google.com
abfeea9b81 Math functions - add, subtract, multiply and shade adapted to nacl friendly addressing.
BUG=253
TEST=out\release\libyuv_unittest --gtest_filter=*Add*
R=dingkai@google.com, nfullagar@chromium.org

Review URL: https://webrtc-codereview.appspot.com/1972004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@746 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-08-06 20:47:18 +00:00
fbarchard@google.com
9b4c00b908 Move vzeroupper to row functions to simplify caller and allow mix of avx2 and sse2. Impact reduced by row coalescing.
BUG=none
TEST=all tests pass with sde
Review URL: https://webrtc-codereview.appspot.com/1269009

git-svn-id: http://libyuv.googlecode.com/svn/trunk@641 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-04-04 05:54:59 +00:00
fbarchard@google.com
83a63e65a6 Change YUV_DISABLE_ASM to LIBYUV_DISABLE_NEON, LIBYUV_DISABLE_MIPS, LIBYUV_DISABLE_X86
BUG=189
TESTED=try
Review URL: https://webrtc-codereview.appspot.com/1113006

git-svn-id: http://libyuv.googlecode.com/svn/trunk@582 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-02-27 00:20:29 +00:00
fbarchard@google.com
3c7bb050bd Unattenuate AVX2
BUG=190
TEST=planar_test
Review URL: https://webrtc-codereview.appspot.com/1112004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@577 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-02-20 22:18:36 +00:00
fbarchard@google.com
408e574366 Use vmovd to avoid switch to sse mode
BUG=none
TEST=c:\intelsde\sde -hsw -- out\release\libyuv_unittest.exe --gtest_filter=*Psnr*
Review URL: https://webrtc-codereview.appspot.com/1097013

git-svn-id: http://libyuv.googlecode.com/svn/trunk@573 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-02-14 19:03:30 +00:00
fbarchard@google.com
f3ad618d40 Sum of Square Error ported to AVX2
BUG=187
TEST=compare_unittest
Review URL: https://webrtc-codereview.appspot.com/1099009

git-svn-id: http://libyuv.googlecode.com/svn/trunk@572 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-02-13 18:38:03 +00:00
fbarchard@google.com
cde587092f Replace two spaces with one after .
BUG=none
TEST=lint
Review URL: https://webrtc-codereview.appspot.com/1063010

git-svn-id: http://libyuv.googlecode.com/svn/trunk@553 16f28f9a-4ce2-e073-06de-1de4eb20be90
2013-01-28 00:02:35 +00:00
fbarchard@google.com
66fe097a2b Move compare modules into their own files, and scale for mips
BUG=none
TEST=none
Review URL: https://webrtc-codereview.appspot.com/920005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@434 16f28f9a-4ce2-e073-06de-1de4eb20be90
2012-10-22 16:18:53 +00:00
fbarchard@google.com
3f467451cf Move compare low levels into their own files, for consistency with NEON.
BUG=none
TEST=none
Review URL: https://webrtc-codereview.appspot.com/921004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@429 16f28f9a-4ce2-e073-06de-1de4eb20be90
2012-10-20 01:23:27 +00:00
fbarchard@google.com
64ce0ab544 Move Neon source to its own files.
BUG=none
TEST=none
Review URL: https://webrtc-codereview.appspot.com/860009

git-svn-id: http://libyuv.googlecode.com/svn/trunk@396 16f28f9a-4ce2-e073-06de-1de4eb20be90
2012-10-09 00:05:29 +00:00
fbarchard@google.com
fc7314e86b Add exports to allow libyuv to be built as a shared lib.
BUG=99
TEST=shared lib builds without impact and unittests link against import lib.
Review URL: https://webrtc-codereview.appspot.com/844005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@379 16f28f9a-4ce2-e073-06de-1de4eb20be90
2012-09-27 02:17:51 +00:00
fbarchard@google.com
142f6c4ed5 Move row.h to include and remove rotate_priv.h
BUG=93
TESTED=try server
Review URL: https://webrtc-codereview.appspot.com/811004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@360 16f28f9a-4ce2-e073-06de-1de4eb20be90
2012-09-18 20:56:51 +00:00
fbarchard@google.com
b0c9797589 Update Copyright notice to follow new chromium conventions.
BUG=63
TEST=none
Review URL: https://webrtc-codereview.appspot.com/730004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@314 16f28f9a-4ce2-e073-06de-1de4eb20be90
2012-08-08 19:04:24 +00:00
fbarchard@google.com
5bf29b59db p2align all loops, copy stride to local for scale, and copy last byte in bilinear more efficiently
BUG=none
TEST=none
Review URL: https://webrtc-codereview.appspot.com/547007

git-svn-id: http://libyuv.googlecode.com/svn/trunk@255 16f28f9a-4ce2-e073-06de-1de4eb20be90
2012-05-02 00:10:16 +00:00
fbarchard@google.com
5ff3a8fec5 ARGBBlendRow1_SSSE3 added to allow SSSE3 only alpha blending. Saves on SSE2 cpu dispatching
BUG=none
TEST=none
Review URL: https://webrtc-codereview.appspot.com/518002

git-svn-id: http://libyuv.googlecode.com/svn/trunk@251 16f28f9a-4ce2-e073-06de-1de4eb20be90
2012-04-24 19:43:45 +00:00
fbarchard@google.com
85ebc8e20b HashDjb2_SSE41 ported to gcc. gcc 4.5 required for pmulld instruction.
BUG=none
TEST=none
Review URL: https://webrtc-codereview.appspot.com/506002

git-svn-id: http://libyuv.googlecode.com/svn/trunk@248 16f28f9a-4ce2-e073-06de-1de4eb20be90
2012-04-23 20:09:35 +00:00