1114 Commits

Author SHA1 Message Date
fbarchard@google.com
59ed448685 MirrorAny functions so assembly can always be used.
BUG=none
TESTED=untested
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/29069004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1170 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-18 01:03:47 +00:00
fbarchard@google.com
55db4ec23b port lea removal for mirror to gcc
BUG=none
TESTED=none
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/27209004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1169 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-17 20:06:40 +00:00
fbarchard@google.com
b9d17e1d79 Fix offset in addresses for windows. Wants it within [] now.
BUG=none
TESTED=local windows build.
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/32479004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1168 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-17 19:50:42 +00:00
fbarchard@google.com
ad113fbaf6 Remove alignment from loops. Newer cpus will execute the loop efficiently without alignment, and the extra nops would slow the initial iteration marginally if anything.
BUG=none
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/27199004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1167 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-17 19:25:21 +00:00
fbarchard@google.com
5822505e0a Remove extra unaligned loop from alphablender. Both aligned and unaligned loops were the same, so remove the extra.
BUG=none
TESTED=try bots.
R=brucedawson@google.com, harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/29059004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1166 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-17 18:33:07 +00:00
fbarchard@google.com
1eb636d249 remove initial lea in mirror functions and add the offset in the address mode.
BUG=none
TESTED=local libyuv unittests on windows
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/26169004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1165 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-17 18:16:23 +00:00
fbarchard@google.com
35508d0979 Mirror_AVX2 ported to GCC.
BUG=269
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/32079004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1164 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-13 23:11:10 +00:00
fbarchard@google.com
91000425a3 ARGBUnattenuate_AVX2 ported to GCC. Minor cleanup of constants to use broadcast to make 16 byte constant instead of 32 byte.
BUG=269
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/30999004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1163 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-13 17:57:33 +00:00
fbarchard@google.com
f8c334473b ARGBAttenuate_AVX2 ported to GCC.
BUG=269
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/29049004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1162 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-12 18:38:06 +00:00
fbarchard@google.com
ec1f854f86 Use broadcast to duplicate constants from 16 bytes to 32 bytes to save data space.
BUG=none
TESTED=intelsde
R=brucedawson@google.com

Review URL: https://webrtc-codereview.appspot.com/32029004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1161 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-12 01:45:27 +00:00
fbarchard@google.com
a843cafbe4 ARGBMultiply_AVX2 ported to GCC.
BUG=269
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/27139005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1160 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-11 20:33:33 +00:00
fbarchard@google.com
0387df5186 ARGBSubtract_AVX2 ported to GCC.
BUG=269
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/27129004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1159 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-11 19:12:38 +00:00
fbarchard@google.com
9e9e26d60a ARGBAdd ported AVX2 ported to GCC.
BUG=269
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/32449004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1158 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-11 19:01:29 +00:00
fbarchard@google.com
10d9c0d0a7 MergeUV for AVX2 ported to gcc. Add missing vzeroupper to all avx2 functions.
BUG=none
TESTED=ncval for nacl
R=brucedawson@google.com, harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/25059005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1157 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-10 19:19:12 +00:00
fbarchard@google.com
d1885bcc73 SplitUVRow_AVX2 ported to GCC/NaCL.
BUG=269
TESTED=validator for nacl.
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/28979004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1156 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-06 01:39:26 +00:00
fbarchard@google.com
a6025e8b6b ARGBDetect do 2 pixels at a time for improved performance.
BUG=375
TESTED=libyuvTest.BenchmarkARGBDetect_Opt
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/26049004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1155 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-05 23:23:17 +00:00
fbarchard@google.com
b661b3ee0d Detect Endian of ARGB image.
BUG=375
TESTED=libyuv builds, but no test app for it yet
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/32389004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1154 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-05 18:46:06 +00:00
fbarchard@google.com
bb3a4b41e9 vextractf128 requuires a constant argument for which dqword to extract, so add a new macro.
BUG=none
TESTED=local build on clang for osx
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/30869004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1153 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-04 21:05:55 +00:00
fbarchard@google.com
3f87404769 Port YUY2ToUV, YUY2ToUV422, UYVYToUV and UYVYToUV422 to AVX2 on GCC/Nacl.
BUG=269
TESTED=ncval
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/26029004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1152 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-04 18:24:10 +00:00
fbarchard@google.com
067892c5a1 Port YUY2ToYRow_AVX2 and UYVYToYRow_AVX2 to gcc/NaCL from Windows AVX code.
BUG=269
TESTED=ncval
R=brucedawson@google.com, harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/25039004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1151 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-11-03 18:30:17 +00:00
fbarchard@google.com
260e3b2273 now that libyuv requires newer nacl compiler, bundles can be assumed and bundle align macro can be removed. no impact on code gen.
BUG=none
TESTED=validator still passes
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/30019004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1150 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-30 20:02:03 +00:00
fbarchard@google.com
ee4bc0d834 vzeroupper moved to just before ret. in one case it was done after ret, which is a bug that would cause a performance stall.
BUG=none
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/24159004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1149 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-30 19:27:21 +00:00
fbarchard@google.com
d10f80500f Improve cmake build. Add unittests to cmake build and automatically detect jpeg support. This change was originally generated to support the build of libyuv in naclports: https://chromium.googlesource.com/external/naclports/+/master/ports/libyuv/. Also add cmake artifacts to .gitignore file.
BUG=366
TESTED=build and run unittests with cmake
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/27009004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1146 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-28 23:37:11 +00:00
fbarchard@google.com
44b8fd363e Pass neon option to assembler but not the compiler. Step 1 of unifying the two libraries back into one.
BUG=371
TESTED=local ios builds ignore the option, but still work.
R=brucedawson@google.com, harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/31719004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1145 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-28 23:15:16 +00:00
fbarchard@google.com
2edea9454d Fix lint extraneous warning on row_win assembly by disabling the warning for those affected lines.
BUG=none
TESTED=line row_win.cc
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/29969004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1144 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-27 16:27:48 +00:00
fbarchard@google.com
9ed836b154 The 'Any' versions of functions can handle any width now, so remove the check from the calling code. This has 2 advantages - less code, and less overhead in calling function when any function is NOT used. Downside is more code for case where any is used.
BUG=373
TESTED=libyuv_unittest still passes
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/24129004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1143 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-24 23:29:31 +00:00
fbarchard@google.com
88ac01aed0 Change YAny functions to share, and use mask for how many bytes at a time for simd vs C.
BUG=373
TESTED=libyuv_unittest passes
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/31819004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1142 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-24 22:58:38 +00:00
fbarchard@google.com
78a3a6b345 Change Any functions that convert 1 to 1 formats, memcpy style, so use C for remainder to allow a minimum width of 1. This has some advantages - allows function to be used even with SIMD that only allows aligned memory. Fewer macros, used by more functions. SIMD is not used unaligned avoiding page/cache split. No overlap so it can be used in place. Disadvantage is it will be slower if close to the maximum number of non-SIMD pixels.
BUG=373
TESTED=libyuv_unittest still passes
R=brucedawson@google.com, tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/23209004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1141 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-24 22:17:59 +00:00
fbarchard@google.com
1f151f62a9 add a check that the simd function should be called. allows any functions to support any width, simplifing and speeding up the calling code.
BUG=373
TESTED=try bots
R=brucedawson@chromium.org, harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/25949004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1140 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-24 00:45:27 +00:00
fbarchard@google.com
0a6dab42c0 Add check for minimum of 8 pixels for any functions and multiple of 8 not 16 for neon functions.
BUG=373
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/23189004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1139 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-23 23:05:12 +00:00
fbarchard@google.com
f2fa453b94 Port I422ToABGR to AVX2.
BUG=269
TESTED=intelsde on I422ToABGR
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/23149004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1138 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-23 17:20:22 +00:00
fbarchard@google.com
22eb5965fc Optimize I422ToRGBA for AVX2 by hoisting ymm5 initialization and using different register for output of unpack.
BUG=269
TESTED=intelsde on I422ToABGR
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/29889004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1137 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-22 23:39:16 +00:00
fbarchard@google.com
c000955bc0 Port I422ToRGBA to AVX.
BUG=269
TESTED=intelsde on I422ToRGBA
R=brucedawson@google.com

Review URL: https://webrtc-codereview.appspot.com/28769004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1136 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-22 22:41:39 +00:00
fbarchard@google.com
af6f25245e Reenable AVX2 scaling with bug fix for any width
BUG=376
TESTED=unittest on scale functions
R=brucedawson@google.com, harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/30759004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1135 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-22 01:15:20 +00:00
fbarchard@google.com
4165437c3e Disable AVX2 version of bilinear filter used for scaling.
BUG=376
TESTED=d:\src\libyuv\trunk>c:\intelsde\sde -ast -hsw -- out\release\libyuv_unittest.exe --gtest_filter=libyuvTest.ScaleTo569x480_Bilinear
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/25909004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1134 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-21 23:10:16 +00:00
fbarchard@google.com
4ec55a21cf Use macros to simplify I422ToARGB for AVX code.
BUG=269
TESTED=local build with Visual C
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/24079004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1133 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-21 22:48:32 +00:00
fbarchard@google.com
a063a66de4 Change I422ToARGB_AVX2 register usage to match SSSE3. ymm0 = B, ymm1 = G, ymm2 = R.
BUG=269
TESTED=intelsde passes on unittests.
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/28759004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1132 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-21 19:02:06 +00:00
fbarchard@google.com
51b78880c5 gcc version of I422ToBGRA_AVX2. Original copied from https://webrtc-codereview.appspot.com/28729004/ and compatible with, but unrelated to windows version.
BUG=269
TESTED=untested
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/29849004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1131 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-21 02:18:11 +00:00
fbarchard@google.com
5a09c3ef2a remove ppapi/c/pp_macros.h dependency and assume m37 is available.
BUG=374
TESTED=untested
R=nfullagar@chromium.org, tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/26769005

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1130 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-20 23:54:14 +00:00
fbarchard@google.com
d81dddd3d0 port I420ToBGRA to AVX2.
BUG=269
TESTED=c:\intelsde\sde -ast -hsw -- out\release\libyuv_unittest.exe --gtest_filter=*I420ToBGRA*
R=brucedawson@google.com, harryjin@google.com, magjed@chromium.org

Review URL: https://webrtc-codereview.appspot.com/26869004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1127 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-20 19:35:55 +00:00
fbarchard@google.com
055725b8fe Neon does 8 at a time, so a check is added for any function of I422ToBGRA that width is >= 8 and for fast path that it is a multiple of 8 not 16.
BUG=373
TESTED=untested
R=brucedawson@google.com

Review URL: https://webrtc-codereview.appspot.com/24059004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1126 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-20 18:39:21 +00:00
fbarchard@google.com
9107460c7f Offset destination by 1 for I420ToARGB_Unaligned test to ensure destination alignment avoids exceptions.
BUG=372
TESTED=out\release_x64\libyuv_unittest --gtest_catch_exceptions=0 --gtest_filter=*I420ToARGB_Unaligned
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/23109004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1125 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-17 01:18:02 +00:00
fbarchard@google.com
3dbaaf0032 switch win64 intrinsics to loadu / storeu for unaligned memory.
BUG=372
TESTED=untested
R=brucedawson@google.com, harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/30729004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1124 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-16 23:46:48 +00:00
fbarchard@google.com
e737688603 Fix for r1122 to change back to elif for rotate build error on Mac.
BUG=268
TESTED=try bot
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/31749004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1123 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-16 22:21:48 +00:00
fbarchard@google.com
f713691a6f Change elif to endif and if to allow AVX2 as well as SSE2 in future changes instead of one or the other.
BUG=none
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/30719004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1122 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-16 20:47:22 +00:00
fbarchard@google.com
f6e495169c Copy width to 64 bit register to work around clang 3.4 warning
BUG=none
TESTED=local ios 64 bit build completes without size warnings on xcode 5.1.1
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/31699004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1120 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-13 23:26:17 +00:00
fbarchard@google.com
f58c85199e Roll chromium deps to match webrtc from 455c66b4375d72984b79249616d0a708ad568894 to 4d46be3930146bf9bdff7c17545c5d47361d3a80.
BUG=none
TESTED=try bots
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/24919004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1119 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-13 19:47:46 +00:00
fbarchard@google.com
4d46be3930 Declare CopyRow_AVX as using xmm usage, not ymm. Should resolve chromium build error for Android Atom.
BUG=libyuv:369
TESTED=untested
R=tpsiaki@google.com

Review URL: https://webrtc-codereview.appspot.com/31609004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1118 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-09 17:54:43 +00:00
zhongwei.yao@arm.com
0eb196f8db clear aarch64 related macro and fix bugs
fix 2 bugs:
 - build bug libyuv.gyp
 - runtime bug in ScaleRowDown38_2_Box_NEON
BUG=
TESTED=libyuv_unittest
R=fbarchard@google.com, fbarchard@chromium.org

Review URL: https://webrtc-codereview.appspot.com/23939004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1117 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-09 02:00:40 +00:00
fbarchard@google.com
205c1440cf Use movdqu then pavgb to allow unaligned memory for rgb subsampling code. Allows this assembly to be used for unaligned pointers as well as aligned ones with no performance hit when memory is aligned on a modern cpu.
BUG=365
TESTED=libyuvTest.ARGBToI420_Unaligned (453 ms)
R=harryjin@google.com

Review URL: https://webrtc-codereview.appspot.com/30679004

git-svn-id: http://libyuv.googlecode.com/svn/trunk@1116 16f28f9a-4ce2-e073-06de-1de4eb20be90
2014-10-07 19:47:06 +00:00