fbarchard@google.com
|
ee4bc0d834
|
Move vzeroupper to just before ret. In one case it was placed after ret, which is a bug that would cause a performance stall.
BUG=none
TESTED=try bots
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/24159004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1149 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2014-10-30 19:27:21 +00:00 |
|
fbarchard@google.com
|
2edea9454d
|
Fix extraneous lint warning on row_win assembly by disabling the warning for the affected lines.
BUG=none
TESTED=lint row_win.cc
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/29969004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1144 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2014-10-27 16:27:48 +00:00 |
|
fbarchard@google.com
|
f2fa453b94
|
Port I422ToABGR to AVX2.
BUG=269
TESTED=intelsde on I422ToABGR
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/23149004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1138 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2014-10-23 17:20:22 +00:00 |
|
fbarchard@google.com
|
22eb5965fc
|
Optimize I422ToRGBA for AVX2 by hoisting ymm5 initialization and using different register for output of unpack.
BUG=269
TESTED=intelsde on I422ToABGR
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/29889004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1137 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2014-10-22 23:39:16 +00:00 |
|
fbarchard@google.com
|
c000955bc0
|
Port I422ToRGBA to AVX.
BUG=269
TESTED=intelsde on I422ToRGBA
R=brucedawson@google.com
Review URL: https://webrtc-codereview.appspot.com/28769004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1136 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2014-10-22 22:41:39 +00:00 |
|
fbarchard@google.com
|
af6f25245e
|
Reenable AVX2 scaling with bug fix for any width
BUG=376
TESTED=unittest on scale functions
R=brucedawson@google.com, harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/30759004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1135 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2014-10-22 01:15:20 +00:00 |
|
fbarchard@google.com
|
4ec55a21cf
|
Use macros to simplify I422ToARGB for AVX code.
BUG=269
TESTED=local build with Visual C
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/24079004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1133 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2014-10-21 22:48:32 +00:00 |
|
fbarchard@google.com
|
a063a66de4
|
Change I422ToARGB_AVX2 register usage to match SSSE3. ymm0 = B, ymm1 = G, ymm2 = R.
BUG=269
TESTED=intelsde passes on unittests.
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/28759004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1132 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2014-10-21 19:02:06 +00:00 |
|
fbarchard@google.com
|
d81dddd3d0
|
Port I420ToBGRA to AVX2.
BUG=269
TESTED=c:\intelsde\sde -ast -hsw -- out\release\libyuv_unittest.exe --gtest_filter=*I420ToBGRA*
R=brucedawson@google.com, harryjin@google.com, magjed@chromium.org
Review URL: https://webrtc-codereview.appspot.com/26869004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1127 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2014-10-20 19:35:55 +00:00 |
|
fbarchard@google.com
|
3dbaaf0032
|
switch win64 intrinsics to loadu / storeu for unaligned memory.
BUG=372
TESTED=untested
R=brucedawson@google.com, harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/30729004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1124 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2014-10-16 23:46:48 +00:00 |
|
fbarchard@google.com
|
205c1440cf
|
Use movdqu then pavgb to allow unaligned memory for rgb subsampling code. Allows this assembly to be used for unaligned pointers as well as aligned ones with no performance hit when memory is aligned on a modern cpu.
BUG=365
TESTED=libyuvTest.ARGBToI420_Unaligned (453 ms)
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/30679004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1116 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2014-10-07 19:47:06 +00:00 |
|
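The idea in commit 205c1440cf can be illustrated with a scalar model. With legacy SSE encoding, a memory operand on pavgb must be 16-byte aligned, so loading into a register with movdqu first removes that requirement. The sketch below is an assumption-laden illustration, not libyuv code: the name AvgRowsModel is hypothetical, and it only reproduces the per-byte rounding of pavgb, (a + b + 1) >> 1, on pointers of any alignment.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model, not libyuv code: averages two rows byte by
 * byte with the same rounding as pavgb, (a + b + 1) >> 1. Byte-wise access
 * places no alignment requirement on the pointers, which is the property
 * the separate movdqu load provides in the SIMD version. */
static void AvgRowsModel(const uint8_t* row0, const uint8_t* row1,
                         uint8_t* dst, size_t width) {
  for (size_t i = 0; i < width; ++i) {
    dst[i] = (uint8_t)((row0[i] + row1[i] + 1) >> 1);
  }
}
```

Calling it with pointers offset by one byte from an array start exercises the unaligned case.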
fbarchard@google.com
|
ca308327d2
|
Remove unaligned functions, since most functions support unaligned memory now. This reduces complexity and improves performance for unaligned cases because C code can be avoided and overhead is lower. The downside is that old CPUs (Core 2 and earlier) will be slower in the aligned-memory case. MIPS is an exception in that it has an alignment requirement, but its unaligned variant is removed too.
BUG=365
TESTED=unittest builds and passes locally
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/24839004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1113 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2014-10-07 00:59:31 +00:00 |
|
fbarchard@google.com
|
b720049a54
|
Make row functions used by planar functions and convert use movdqu to relax the alignment constraint. Step 1: make functions unaligned.
BUG=365
TESTED=libyuv_unittest passes
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/26709004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1111 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2014-10-03 21:11:37 +00:00 |
|
fbarchard@google.com
|
d83f63a3b4
|
Make InterpolateRow, used for scale, handle unaligned memory. Remove HalfRow, which is not used.
BUG=367
TESTED=unittest on I422ToI420
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/28639004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1107 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2014-10-03 17:37:11 +00:00 |
|
fbarchard@google.com
|
455ae94c60
|
Make rotate SIMD allow unaligned pointers.
BUG=365
TESTED=libyuv_unittest
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/22899004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1102 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2014-10-02 17:56:48 +00:00 |
|
fbarchard@google.com
|
044f914c29
|
Change scale to unaligned movdqu.
BUG=365
TESTED=scale unittests
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/22879004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1101 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2014-10-01 01:16:04 +00:00 |
|
fbarchard@google.com
|
d33bf86b25
|
CopyRow_AVX which supports unaligned pointers for Sandy Bridge CPU.
BUG=363
TESTED=out\release\libyuv_unittest --gtest_filter=*ARGBToARGB_*
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/31489004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1097 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2014-09-29 23:53:18 +00:00 |
|
fbarchard@google.com
|
aec76f2e30
|
Add stride to pointer in C and pass it as a register to inline assembly.
BUG=357
TESTED=clang on ios
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/29489004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1086 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2014-09-19 22:51:39 +00:00 |
|
fbarchard@google.com
|
6e95f6f7e1
|
ifdef headers to avoid intrinsics if built with gcc 64 bit on windows.
BUG=351
TESTED=untested
R=jzern@chromium.org
Review URL: https://webrtc-codereview.appspot.com/22419004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1058 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2014-08-21 22:44:49 +00:00 |
|
fbarchard@google.com
|
9e0f21af0b
|
fixes for blank line lint warnings
BUG=348
TESTED=cpplint.py --filter=-casting source/*.cc include/libyuv/*.h
R=harryjin@google.com
Review URL: https://webrtc-codereview.appspot.com/18139004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1045 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2014-08-14 19:42:48 +00:00 |
|
fbarchard@google.com
|
e6dd1fa024
|
Port I420ToARGB to intrinsics for win64
BUG=336
TESTED=out\release_x64\libyuv_unittest --gunit_also_run_disabled_tests --gtest_filter=*I420To*B*
R=bryan.bernhart@intel.com, tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/15809005
git-svn-id: http://libyuv.googlecode.com/svn/trunk@1018 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2014-06-24 20:45:45 +00:00 |
|
fbarchard@google.com
|
a1f5254a95
|
Switch to c style casts for all source and includes.
BUG=303
TESTED=try
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/6629004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@952 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2014-01-07 03:03:00 +00:00 |
|
fbarchard@google.com
|
5dba58cb1e
|
FixedDiv1 using a single 64/32 divide. Removes size restriction from slope.
BUG=302
TESTED=libyuv scale tests
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/6489004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@940 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2014-01-02 22:32:09 +00:00 |
|
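Commit 5dba58cb1e does not show its code here, so the following is only a sketch of the general technique it names: a 16.16 fixed-point ratio computed with a single 64-by-32 divide. The helper name FixedDivSketch and its preconditions are assumptions made for illustration, not the library's FixedDiv1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not the libyuv source: returns num / den as a 16.16
 * fixed-point value. Widening num to 64 bits before the shift means the
 * intermediate cannot overflow for any non-negative 32-bit num. The caller
 * is assumed to keep num / den below 32768 so the result fits in 32 bits. */
static int32_t FixedDivSketch(int32_t num, int32_t den) {
  assert(num >= 0 && den > 0);
  return (int32_t)(((int64_t)num << 16) / den);
}
```

For example, a 2:1 ratio such as FixedDivSketch(1280, 640) yields 0x20000, which is 2.0 in 16.16 format.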
fbarchard@google.com
|
c2295807bd
|
Reduce alignment for loops from 16 bytes to 4 bytes. Reduces outer loop overhead without hurting innerloop time.
BUG=none
TESTED=try bots
R=fbarchard@chromium.org, mflodman@webrtc.org
Review URL: https://webrtc-codereview.appspot.com/4659004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@880 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2013-12-02 15:57:39 +00:00 |
|
fbarchard@google.com
|
a0630d77f0
|
Report of affine to nacl using %k0
BUG=none
TEST=none
R=johannkoenig@google.com
Review URL: https://webrtc-codereview.appspot.com/3929004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@855 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2013-11-15 17:42:44 +00:00 |
|
fbarchard@google.com
|
c2a889eb55
|
Bump reciprocal up by 1
BUG=none
TEST=none
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/3599004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@847 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2013-11-11 05:14:13 +00:00 |
|
fbarchard@google.com
|
191ab18073
|
Use fixed point for small blurs
BUG=none
TEST=libyuvTest.ARGBBlurSmall_Opt
R=ryanpetrie@google.com
Review URL: https://webrtc-codereview.appspot.com/3389004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@843 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2013-11-05 18:19:11 +00:00 |
|
fbarchard@google.com
|
4a4b7374c1
|
Load matrix with one vector and splat to 4 different ones.
BUG=none
TEST=none
R=ryanpetrie@google.com
Review URL: https://webrtc-codereview.appspot.com/3299004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@838 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2013-11-01 21:29:45 +00:00 |
|
fbarchard@google.com
|
11a0d48e45
|
pass parameter for yuv conversion
BUG=267
TEST=Luma
R=tpsiaki@google.com
Review URL: https://webrtc-codereview.appspot.com/3169005
git-svn-id: http://libyuv.googlecode.com/svn/trunk@834 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2013-10-31 05:47:13 +00:00 |
|
fbarchard@google.com
|
21796c94aa
|
Move constant to its own asm block to save 3 GPR registers for main loop
BUG=267
TESTED=32 bit mac build
Review URL: https://webrtc-codereview.appspot.com/3099004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@832 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2013-10-29 08:43:13 +00:00 |
|
fbarchard@google.com
|
ca8f826ba3
|
Luma fetch 4 pixels
BUG=267
TEST=Luma*
R=ryanpetrie@google.com
Review URL: https://webrtc-codereview.appspot.com/3079004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@831 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2013-10-28 22:53:22 +00:00 |
|
fbarchard@google.com
|
4c736098d6
|
Use packssdw, which is SSE2, not packusdw, which is SSE4.1.
BUG=none
TEST=Sobel* on AMD cpu
R=ryanpetrie@google.com
Review URL: https://webrtc-codereview.appspot.com/3069004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@829 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2013-10-28 19:12:49 +00:00 |
|
fbarchard@google.com
|
6f7e514caa
|
Full metal BCS
BUG=none
TEST=Luma* unittest
R=thorcarpenter@google.com
Review URL: https://webrtc-codereview.appspot.com/3029004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@828 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2013-10-28 17:10:49 +00:00 |
|
fbarchard@google.com
|
08b24a4232
|
Bayer GG specialized version for Sobel
BUG=none
TEST=Sobel
R=johannkoenig@google.com
Review URL: https://webrtc-codereview.appspot.com/2849004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@826 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2013-10-25 07:39:43 +00:00 |
|
fbarchard@google.com
|
092099507e
|
Sobel using max to get abs for SSE2
BUG=none
TEST=none
R=ryanpetrie@google.com
Review URL: https://webrtc-codereview.appspot.com/2769004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@824 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2013-10-23 00:51:52 +00:00 |
|
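The trick named in commit 092099507e relies on abs(x) = max(x, -x). SSE2 has a signed 16-bit maximum (pmaxsw) but no absolute-value instruction (pabsw arrived with SSSE3), so the negation and maximum stand in for it. The scalar model below covers one 16-bit lane; the name AbsViaMax is hypothetical and this is not the libyuv code.

```c
#include <stdint.h>

/* Hypothetical scalar model of one 16-bit lane, not libyuv code.
 * neg plays the role of psubw (0 - x); the return value plays the role of
 * pmaxsw (signed maximum of x and its negation). */
static int16_t AbsViaMax(int16_t x) {
  int16_t neg = (int16_t)(0 - x);
  return x > neg ? x : neg;
}
```

Sobel gradients computed from 8-bit pixels stay far from the 16-bit limits, so the one value whose negation does not fit in 16 bits (-32768) should not arise there.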
fbarchard@google.com
|
38157bdc71
|
Change Attenuate and Unattenuate to unaligned memory ops.
BUG=279
TEST=ARGBAttenuate_Unaligned
R=nfullagar@google.com, ryanpetrie@google.com
Review URL: https://webrtc-codereview.appspot.com/2709004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@821 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2013-10-21 21:44:23 +00:00 |
|
fbarchard@google.com
|
8be4b289c7
|
ARGBSobelToPlane which produces a planar output.
BUG=none
TEST=none
R=ryanpetrie@google.com
Review URL: https://webrtc-codereview.appspot.com/2415005
git-svn-id: http://libyuv.googlecode.com/svn/trunk@818 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2013-10-21 18:39:07 +00:00 |
|
fbarchard@google.com
|
adef267edf
|
CopyYToAlpha to copy from a plane to alpha channel of ARGB
BUG=275
TESTED=untested
R=ryanpetrie@google.com
Review URL: https://webrtc-codereview.appspot.com/2415004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@814 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2013-10-17 07:32:16 +00:00 |
|
fbarchard@google.com
|
3075de8285
|
Use simple masking for AVX2 version of CopyAlpha so it can be implemented using a more generic bit mask function in future, and use more broadly known and optimized opcodes that will always be fast. Same performance as vblend.
BUG=none
TEST=CopyAlpha*
R=johannkoenig@google.com
Review URL: https://webrtc-codereview.appspot.com/2393005
git-svn-id: http://libyuv.googlecode.com/svn/trunk@813 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2013-10-15 00:32:29 +00:00 |
|
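The masking approach described in commit 3075de8285 can be modeled per pixel with two ANDs and an OR instead of a blend. This is a minimal sketch under stated assumptions: the name CopyAlphaPixel is hypothetical, and each ARGB pixel is assumed to be a 32-bit word with alpha in the top 8 bits.

```c
#include <stdint.h>

/* Hypothetical scalar model, not libyuv code: takes the alpha byte from src
 * and keeps the colour bytes of dst. Assumes alpha occupies the top 8 bits
 * of each 32-bit pixel. */
static uint32_t CopyAlphaPixel(uint32_t src, uint32_t dst) {
  const uint32_t kAlphaMask = 0xff000000u;
  return (src & kAlphaMask) | (dst & ~kAlphaMask);
}
```

Changing kAlphaMask selects different bits to copy, which is the sense in which plain masking generalizes to a bit-mask function.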
fbarchard@google.com
|
f6631bb814
|
CopyAlpha AVX2
BUG=none
TEST=Alpha*
R=ryanpetrie@google.com
Review URL: https://webrtc-codereview.appspot.com/2392004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@812 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2013-10-14 19:37:21 +00:00 |
|
fbarchard@google.com
|
7f67961ec5
|
ARGBCopyAlpha for effects
BUG=none
TEST=none
R=johannkoenig@google.com
Review URL: https://webrtc-codereview.appspot.com/2385004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@810 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2013-10-12 22:27:37 +00:00 |
|
fbarchard@google.com
|
8b0cdb4a6e
|
ARGBShuffle_SSE2 ported to GCC and NaCl, and HalfRow_SSE2 ported to NaCl.
BUG=271
TESTED=ABGRToARGB on linux
R=johannkoenig@google.com, nfullagar@google.com
Review URL: https://webrtc-codereview.appspot.com/2362004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@808 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2013-10-08 00:15:34 +00:00 |
|
fbarchard@google.com
|
212a1a5000
|
ARGBShuffle_SSE2 for lower end CPUs
BUG=271
TESTED=out\release\libyuv_unittest --gtest_filter=**R*ToARGB*
R=johannkoenig@google.com, ryanpetrie@google.com
Review URL: https://webrtc-codereview.appspot.com/2361004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@807 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2013-10-05 04:17:50 +00:00 |
|
fbarchard@google.com
|
c99db063e2
|
Change ARGBColorMatrix to a 4x4.
BUG=none
TEST=planar_unittest updates
R=johannkoenig@google.com, ryanpetrie@google.com, thorcarpenter@google.com
Review URL: https://webrtc-codereview.appspot.com/2320008
git-svn-id: http://libyuv.googlecode.com/svn/trunk@805 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2013-10-01 01:27:30 +00:00 |
|
fbarchard@google.com
|
446f91d040
|
Use vbroadcastf128 to copy m128 to ymm duplicating the value to high and low 128 bits. Allows shared variables.
BUG=none
TEST=avx2 unittests still pass.
R=mflodman@webrtc.org
Review URL: https://webrtc-codereview.appspot.com/2324004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@803 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2013-09-30 06:49:10 +00:00 |
|
fbarchard@google.com
|
0d19fc5ed3
|
disable lint warning on movzx instructions
BUG=none
TEST=lint
R=johannkoenig@google.com
Review URL: https://webrtc-codereview.appspot.com/2290004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@802 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2013-09-24 21:48:50 +00:00 |
|
fbarchard@google.com
|
47e856c632
|
Make I411ToARGB read 2 bytes to avoid overread.
BUG=262
TESTED=I411ToARGB
R=kjellander@webrtc.org
Review URL: https://webrtc-codereview.appspot.com/2278004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@799 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2013-09-24 10:07:16 +00:00 |
|
fbarchard@google.com
|
afd1d6b4ec
|
Fix 2 bugs with Luma scale
BUG=267
TEST=luma unittest improved
R=ryanpetrie@google.com
Review URL: https://webrtc-codereview.appspot.com/2260005
git-svn-id: http://libyuv.googlecode.com/svn/trunk@794 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2013-09-20 01:00:54 +00:00 |
|
fbarchard@google.com
|
7a0d01ef8b
|
Luma Table optimized for SSSE3
BUG=267
TESTED=Luma unittest
R=jingning@google.com, nfullagar@google.com
Review URL: https://webrtc-codereview.appspot.com/2257004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@793 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2013-09-19 17:55:54 +00:00 |
|
fbarchard@google.com
|
a1ab194545
|
Color Table x86 reoptimized and ported to gcc.
BUG=266
TESTED=color table unittests
R=changjun.yang@intel.com
Review URL: https://webrtc-codereview.appspot.com/2216004
git-svn-id: http://libyuv.googlecode.com/svn/trunk@791 16f28f9a-4ce2-e073-06de-1de4eb20be90
|
2013-09-16 17:01:02 +00:00 |
|