libyuv

mirror of https://chromium.googlesource.com/libyuv/libyuv synced 2025-12-07 17:26:49 +08:00

Author	SHA1	Message	Date
Manojkumar Bhosale	56b5bbb0be	Add MSA optimized ARGB scaling functions R=fbarchard@google.com BUG=libyuv:634 Performance Gain (vs C vectorized) ScaleARGBRowDown2_MSA - ~2.6x ScaleARGBRowDown2Linear_MSA - ~7.9x ScaleARGBRowDown2Box_MSA - ~3.7x ScaleARGBRowDownEven_MSA - ~1.2x ScaleARGBRowDownEvenBox_MSA - ~3.5x ScaleARGBRowDown2_Any_MSA - ~2.6x ScaleARGBRowDown2Linear_Any_MSA - ~7.9x ScaleARGBRowDown2Box_Any_MSA - ~3.6x ScaleARGBRowDownEven_Any_MSA - ~1.2x ScaleARGBRowDownEvenBox_Any_MSA - ~3.5x Performance Gain (vs C non-vectorized) ScaleARGBRowDown2_MSA - 2.6x ScaleARGBRowDown2Linear_MSA - 13.5x ScaleARGBRowDown2Box_MSA - 5.8x ScaleARGBRowDownEven_MSA - 1.2x ScaleARGBRowDownEvenBox_MSA - 3.7x ScaleARGBRowDown2_Any_MSA - 2.6x ScaleARGBRowDown2Linear_Any_MSA - 13.5x ScaleARGBRowDown2Box_Any_MSA - 5.3x ScaleARGBRowDownEven_Any_MSA - 1.2x ScaleARGBRowDownEvenBox_Any_MSA - 3.7x Review URL: https://codereview.chromium.org/2527983002 .	2016-12-07 11:47:15 +05:30
Manojkumar Bhosale	83f460be33	Add MSA optimized ARGB Multiply/Add/Subtract row functions R=fbarchard@google.com BUG=libyuv:634 Performance Gain (vs C vectorized) ARGBMultiplyRow_MSA - 1.4x ARGBAddRow_MSA - 8.6x ARGBSubtractRow_MSA - 8.6x ARGBMultiplyRow_Any_MSA - 1.35x ARGBAddRow_Any_MSA - 7.3x ARGBSubtractRow_Any_MSA - 7.2x Performance Gain (vs C non-vectorized) ARGBMultiplyRow_MSA - 4.4x ARGBAddRow_MSA - 27x ARGBSubtractRow_MSA - 22x ARGBMultiplyRow_Any_MSA - 3.5x ARGBAddRow_Any_MSA - 23x ARGBSubtractRow_Any_MSA - 18x Review URL: https://codereview.chromium.org/2529983002 .	2016-12-02 15:21:10 +05:30
Frank Barchard	da0c29dada	Add MSA optimized ARGBToRGB565Row_MSA, ARGBToARGB1555Row_MSA, ARGBToARGB4444Row_MSA, ARGBToUV444Row_MSA functions R=fbarchard@google.com BUG=libyuv:634 Performance Gain (vs C vectorized) ARGBToRGB565Row_MSA - ~1.6x ARGBToRGB565Row_Any_MSA - ~1.6x ARGBToARGB1555Row_MSA - ~1.3x ARGBToARGB1555Row_Any_MSA - ~1.3x ARGBToARGB4444Row_MSA - ~3.8x ARGBToARGB4444Row_Any_MSA - ~3.8x ARGBToUV444Row_MSA - ~2.4x ARGBToUV444Row_Any_MSA - ~2.4x Performance Gain (vs C non-vectorized) ARGBToRGB565Row_MSA - ~2.8x ARGBToRGB565Row_Any_MSA - ~2.8x ARGBToARGB1555Row_MSA - ~2.2x ARGBToARGB1555Row_Any_MSA - ~2.2x ARGBToARGB4444Row_MSA - ~6.8x ARGBToARGB4444Row_Any_MSA - ~6.6x ARGBToUV444Row_MSA - ~6.7x ARGBToUV444Row_Any_MSA - ~6.7x Review URL: https://codereview.chromium.org/2520003004 .	2016-11-22 10:47:55 -08:00
Frank Barchard	b1504a8e48	Add MSA optimized ARGBToRGB24Row_MSA and ARGBToRAWRow_MSA functions R=fbarchard@google.com BUG=libyuv:634 Review URL: https://codereview.chromium.org/2487913004 .	2016-11-18 15:05:10 -08:00
Frank Barchard	3028e1bd97	clang-format row_gcc.cc with some functions disabled BUG=libyuv:654 TEST=try bots build R=kjellander@chromium.org Review URL: https://codereview.chromium.org/2484083003 .	2016-11-07 18:37:29 -08:00
Frank Barchard	e62309f259	clang-format libyuv BUG=libyuv:654 R=kjellander@chromium.org Review URL: https://codereview.chromium.org/2469353005 .	2016-11-07 17:37:23 -08:00
Frank Barchard	f2c27dafa2	HalfFloat neon armv7 fix for destination pointer. Improved unittests detect a difference in arm64 rounding. TEST=util/android/test_runner.py gtest -s libyuv_unittest -t 7200 --verbose --release --gtest_filter=Half -a "--libyuv_width=640 --libyuv_height=360" BUG=libyuv:560 R=wangcheng@google.com Review URL: https://codereview.chromium.org/2478313004 .	2016-11-07 12:13:04 -08:00
Frank Barchard	eca08525cb	HalfFloat Neon for ARMv7. 64 bit version made similar to 32 bit with registers 1 for load and store results, and 2 and 3 as expanded float temporary values. TEST=out/Release/libyuv_unittest --gtest_filter=Half BUG=libyuv:560 R=wangcheng@google.com Review URL: https://codereview.chromium.org/2467723002 .	2016-11-01 11:36:51 -07:00
Frank Barchard	10ce829bad	Add MSA optimized I422ToRGB565Row_MSA, I422ToARGB4444Row_MSA and I422ToARGB1555Row_MSA functions R=fbarchard@google.com BUG=libyuv:634 Performance Gain (vs C vectorized) I422ToRGB565Row_MSA : ~1.5x I422ToRGB565Row_Any_MSA : ~1.5x I422ToARGB4444Row_MSA : ~1.4x I422ToARGB4444Row_Any_MSA : ~1.4x I422ToARGB1555Row_MSA : ~1.4x I422ToARGB1555Row_Any_MSA : ~1.4x Performance Gain (vs C non-vectorized) I422ToRGB565Row_MSA : ~6.8x I422ToRGB565Row_Any_MSA : ~6.8x I422ToARGB4444Row_MSA : ~6.6x I422ToARGB4444Row_Any_MSA : ~6.6x I422ToARGB1555Row_MSA : ~6.6x I422ToARGB1555Row_Any_MSA : ~6.6x Review URL: https://codereview.chromium.org/2445343007 .	2016-10-27 10:47:35 -07:00
Frank Barchard	532f5708a9	Add MSA optimized I422AlphaToARGBRow_MSA and I422ToRGB24Row_MSA functions R=fbarchard@google.com BUG=libyuv:634 Performance Gain (vs C vectorized) I422AlphaToARGBRow_MSA : ~1.4x I422AlphaToARGBRow_Any_MSA : ~1.4x I422ToRGB24Row_MSA : ~4.8x I422ToRGB24Row_Any_MSA : ~4.8x Performance Gain (vs C non-vectorized) I422AlphaToARGBRow_MSA : ~7.0x I422AlphaToARGBRow_Any_MSA : ~7.0x I422ToRGB24Row_MSA : ~7.9x I422ToRGB24Row_Any_MSA : ~7.7x Review URL: https://codereview.chromium.org/2454433003 .	2016-10-26 11:12:17 -07:00
Frank Barchard	2488b3105b	White spaces, comments and lint fixes for msa. no functional changes. TBR=kjellander@chromium.org BUG=libyuv:634 Review URL: https://codereview.chromium.org/2446313002 .	2016-10-25 11:36:54 -07:00
Frank Barchard	c2073823b4	use __OPTIMIZE__ macro to determine debug vs release. Debug builds of x86 gcc/clang can run out of register. Previously NDEBUG or _DEBUG was used to detect a debug build. But those macros are not set by gentoo builds. This CL switches to the compiler predefine __OPTIMIZE__ which is built into clang and gcc. BUG=libyuv:602 TEST=untested R=wangcheng@google.com Review URL: https://codereview.chromium.org/2451503002 .	2016-10-24 18:02:48 -07:00
Frank Barchard	f5d5bd88d6	Add MSA optimized I422ToARGBRow_MSA and I422ToRGBARow_MSA functions R=fbarchard@google.com BUG=libyuv:634 Performance Gains :- (vs C vectorized) I422ToARGBRow_MSA : ~1.6x I422ToRGBARow_MSA : ~1.6x I422ToARGBRow_Any_MSA : ~1.58x I422ToRGBARow_Any_MSA : ~1.6x Performance Gains :- (vs C non-vectorized) I422ToARGBRow_MSA : ~7x I422ToRGBARow_MSA : ~7x I422ToARGBRow_Any_MSA : ~6.9x I422ToRGBARow_Any_MSA : ~6.8x Regarding performance measurement, We have created standalone tests which pass in row's data from a 1920x1080 filled buffer to both the C and MSA functions. And such N iterations are executed to get more accurate timings of C vs MSA. Review URL: https://codereview.chromium.org/2430313005 .	2016-10-24 15:37:08 -07:00
Frank Barchard	451af5e922	scale by 1 for neon implemented void HalfFloat1Row_NEON(const uint16* src, uint16* dst, float, int width) { asm volatile ( "1: \n" MEMACCESS(0) "ld1 {v1.16b}, [%0], #16 \n" // load 8 shorts "subs %w2, %w2, #8 \n" // 8 pixels per loop "uxtl v2.4s, v1.4h \n" // 8 int's "uxtl2 v1.4s, v1.8h \n" "scvtf v2.4s, v2.4s \n" // 8 floats "scvtf v1.4s, v1.4s \n" "fcvtn v4.4h, v2.4s \n" // 8 floats "fcvtn2 v4.8h, v1.4s \n" MEMACCESS(1) "st1 {v4.16b}, [%1], #16 \n" // store 8 shorts "b.gt 1b \n" : "+r"(src), // %0 "+r"(dst), // %1 "+r"(width) // %2 : : "cc", "memory", "v1", "v2", "v4" ); } void HalfFloatRow_NEON(const uint16* src, uint16* dst, float scale, int width) { asm volatile ( "1: \n" MEMACCESS(0) "ld1 {v1.16b}, [%0], #16 \n" // load 8 shorts "subs %w2, %w2, #8 \n" // 8 pixels per loop "uxtl v2.4s, v1.4h \n" // 8 int's "uxtl2 v1.4s, v1.8h \n" "scvtf v2.4s, v2.4s \n" // 8 floats "scvtf v1.4s, v1.4s \n" "fmul v2.4s, v2.4s, %3.s[0] \n" // adjust exponent "fmul v1.4s, v1.4s, %3.s[0] \n" "uqshrn v4.4h, v2.4s, #13 \n" // isolate halffloat "uqshrn2 v4.8h, v1.4s, #13 \n" MEMACCESS(1) "st1 {v4.16b}, [%1], #16 \n" // store 8 shorts "b.gt 1b \n" : "+r"(src), // %0 "+r"(dst), // %1 "+r"(width) // %2 : "w"(scale * 1.9259299444e-34f) // %3 : "cc", "memory", "v1", "v2", "v4" ); } TEST=LibYUVPlanarTest.TestHalfFloatPlane_One BUG=libyuv:560 R=hubbe@chromium.org Review URL: https://codereview.chromium.org/2430313008 .	2016-10-21 14:30:03 -07:00
Frank Barchard	550cf829fb	HalfFloat avx2 unpack bug fix. AVX unpack parameters were reverse ordered causing incorrect results on AVX2 hardware. TEST=/usr/local/google/home/fbarchard/intelsde/sde -skx -- out/Release/libyuv_unittest --gtest_filter=Half BUG=libyuv:560 R=wangcheng@google.com Review URL: https://codereview.chromium.org/2438893002 .	2016-10-20 15:49:00 -07:00
Frank Barchard	f553db2d30	HalfFloatPlane unittest for denormal half floats. Half floats have a limited range. It shouldn't normally come up, but if the scale value passed in produces a small value, the half floats will be denormals, which are slow and/or flushed to zero. This test ensures they behave the same in C and SIMD and tests the performance of denormals. TEST=TestHalfFloatPlane_denormal BUG=libyuv:560 R=hubbe@chromium.org Review URL: https://codereview.chromium.org/2424233004 .	2016-10-19 18:13:01 -07:00
Frank Barchard	78c58ab8aa	Add MSA optimized ARGB4444ToI420 and ARGB4444ToARGB functions R=fbarchard@google.com BUG=libyuv:634 Performance gains : (Auto-vectorized C vs MSA SIMD) ARGB4444ToYRow_MSA : ~3.0x ARGB4444ToUVRow_MSA : ~1.8x ARGB4444ToARGBRow_MSA : ~3.4x ARGB4444ToYRow_Any_MSA : ~2.8x ARGB4444ToUVRow_Any_MSA : ~1.7x ARGB4444ToARGBRow_Any_MSA : ~3.2x Review URL: https://codereview.chromium.org/2421843002 .	2016-10-19 11:10:51 -07:00
Frank Barchard	e16e3a629f	cpu_id cleanup. no functional change. remove old comment about initialize to zero. remove ifdef and replace with macro defined to zero. BUG=None TEST=try bots R=kjellander@chromium.org Review URL: https://codereview.chromium.org/2425623004 .	2016-10-18 12:26:02 -07:00
Frank Barchard	2d80fc3133	Port HalfFloatRow_SSE2 to AVX2 but not using F16C. R=wangcheng@google.com, hubbe@chromium.org BUG=libyuv:560 Review URL: https://codereview.chromium.org/2421993002 .	2016-10-14 19:01:41 -07:00
Frank Barchard	fdcf524aac	Add f16c (halffloat) cpuid R=wangcheng@google.com, hubbe@chromium.org BUG=libyuv:560 Review URL: https://codereview.chromium.org/2418763006 .	2016-10-14 16:34:08 -07:00
Frank Barchard	5333e94e70	Port ARGBExtractAlpha_AVX2 function to windows. BUG=libyuv:572 TEST=try bots R=wangcheng@google.com, magjed@chromium.org Review URL: https://codereview.chromium.org/2416783004 .	2016-10-13 23:20:57 -07:00
Frank Barchard	a5e93766a2	Add ARGBExtractAlpha_AVX2 function Port SSE2 version to AVX2. BUG=libyuv:572 TEST=/usr/local/google/home/fbarchard/intelsde/sde -skx -- out/Release/libyuv_unittest --gtest_filter=Extract R=wangcheng@google.com, magjed@chromium.org Review URL: https://codereview.chromium.org/2420553002 .	2016-10-13 16:03:43 -07:00
Frank Barchard	d363ea6527	Remove I411 support. YUV 411 is very uncommon format. Remove support. Update documentation to reflect that 411 is deprecated. Simplify tests for YUV to only test with the new side by side YUV but keep old 3 plane test around with a macro for now. BUG=libyuv:645 R=kjellander@chromium.org Review URL: https://codereview.chromium.org/2406123002 .	2016-10-11 11:14:16 -07:00
Frank Barchard	af87c11c9a	YUY2ToI422 coalesce rows for small images TBR=wangcheng@google.com BUG=libyuv:647 TESTED=LibYUVConvertTest.YUY2ToI422_Opt Review URL: https://codereview.chromium.org/2393393006 .	2016-10-07 18:35:42 -07:00
Frank Barchard	edd3a84d05	libyuv::YUY2ToY for isolating Y channel of YUY2. This function is the first step of YUY2 To I420. Provided primarily for diagnostics. TBR=wangcheng@google.com BUG=libyuv:647 TESTED=LibYUVConvertTest.YUY2ToY_Opt Review URL: https://codereview.chromium.org/2399153004 .	2016-10-07 17:20:30 -07:00
Frank Barchard	a2891ec77c	Add MSA optimized YUY2ToI422, YUY2ToI420, UYVYToI422, UYVYToI420 functions R=fbarchard@google.com BUG=libyuv:634 Performance gains as below, YUY2ToI422, YUY2ToI420 :- YUY2ToYRow_MSA : ~10x YUY2ToUVRow_MSA : ~11x YUY2ToUV422Row_MSA : ~9x YUY2ToYRow_Any_MSA : ~6x YUY2ToUVRow_Any_MSA : ~5x YUY2ToUV422Row_Any_MSA : ~4x UYVYToI422, UYVYToI420 :- UYVYToYRow_MSA : ~10x UYVYToUVRow_MSA : ~11x UYVYToUV422Row_MSA : ~9x UYVYToYRow_Any_MSA : ~6x UYVYToUVRow_Any_MSA : ~5x UYVYToUV422Row_Any_MSA : ~4x Review URL: https://codereview.chromium.org/2397693002 .	2016-10-07 10:37:22 -07:00
Frank Barchard	3b88a19ab1	YUY2ToI422_Any_Neon clean up to not require 16 pixels YUY2ToI422_Any_Neon previously required 16 pixels and duplicated the last pixel. The replication was not necessary after a previous change to treat YUY2 to 4 byte macro pixels. TBR=harryjin@google.com BUG=libyuv:648 TESTED=util/android/test_runner.py gtest -s libyuv_unittest -t 7200 --verbose --release --gtest_filter=YUY2ToI422 -a "--libyuv_width=17 --libyuv_height=7 --libyuv_repeat=999 --libyuv_flags=1" Review URL: https://codereview.chromium.org/2399143002 .	2016-10-06 12:11:40 -07:00
Frank Barchard	7018f5be0f	Add MSA optimized I422ToYUY2Row, I422ToUYVYRow functions R=fbarchard@google.com BUG=libyuv:634 Performance gains :- I422ToYUY2Row_MSA - ~12x I422ToYUY2Row_Any_MSA - ~7x I422ToUYVYRow_MSA - ~12x I422ToUYVYRow_Any_MSA - ~7x Review URL: https://codereview.chromium.org/2378753004 .	2016-10-03 18:21:31 -07:00
Frank Barchard	aa197ee1a3	HalfFloat_SSE2 for Visual C Low level support for 12 bit 420, 422 and 444 YUV video frame conversion. BUG=libyuv:560, chromium:445071 TEST=LibYUVPlanarTest.TestHalfFloatPlane on windows R=hubbe@chromium.org, wangcheng@google.com Review URL: https://codereview.chromium.org/2387713002 .	2016-10-03 10:33:38 -07:00
Frank Barchard	4a14cb2e81	HalfFloat_SSE2 port from C algorithm to SSE2 Low level support for 12 bit 420, 422 and 444 YUV video frame conversion. BUG=libyuv:560, chromium:445071 TEST=untested R=hubbe@chromium.org Review URL: https://codereview.chromium.org/2381493006 .	2016-09-30 09:47:16 -07:00
Frank Barchard	7fc932ddd3	Add low level support for 12 bit 420, 422 and 444 YUV video frame conversion. BUG=libyuv:560,chromium:445071 TEST=untested R=hubbe@chromium.org Review URL: https://codereview.chromium.org/2371293002 .	2016-09-29 15:06:30 -07:00
Frank Barchard	c11e9b7fb7	bt709 coefficients for video constrained space. Original bt709 color space coefficients were full range yuv for higher quality. This change makes the coefficients use the video constrained color space the same as bt601, which is 16 to 235 for Y and 16 to 240 for chroma channels. BUG=libyuv:639 TEST=libyuv unittests run locally R=hubbe@chromium.org Review URL: https://codereview.chromium.org/2367253003 .	2016-09-28 15:07:46 -07:00
Frank Barchard	6732bcbde9	ShortToHalfFloat_AVX2 function BUG=libyuv:560 TEST=local compile for windows R=wangcheng@google.com Review URL: https://codereview.chromium.org/2364293002 .	2016-09-27 14:18:32 -07:00
Frank Barchard	618149084e	Add MIPS SIMD Arch (MSA) optimized ARGBMirrorRow function This patch adds MSA optimized ARGBMirrorRow function in libYUV project. Performance gain ~3x R=fbarchard@google.com BUG=libyuv:634 Review URL: https://codereview.chromium.org/2368313003 .	2016-09-26 16:28:01 -07:00
Frank Barchard	34a29bf756	fix warning on visual C for mips cpu detect. Follow-up warning fixes: cpu_id.cc(167): warning C4267: 'initializing': conversion from 'size_t' to 'int', possible loss of data lint warning: cpu_id.cc:171: Missing space before ( in if( [whitespace/parens] [5] TBR=manojkumar.bhosale@imgtec.com BUG=libyuv:634 TEST=try bots for windows. Review URL: https://codereview.chromium.org/2365813002 .	2016-09-22 18:25:52 -07:00
Frank Barchard	c5323b0fdc	Add MIPS SIMD Arch (MSA) optimized MirrorRow function As per the preparation patch added in Chromium sources at, 2150943003: Add MIPS SIMD Arch (MSA) build flags for GYP/GN builds This patch adds first MSA optimized function in libYUV project. BUG=libyuv:634 R=fbarchard@google.com Review URL: https://codereview.chromium.org/2285683002 .	2016-09-22 16:12:22 -07:00
Frank Barchard	6ad3aa6ae4	fix multi-line comment warning ../../source/scale_neon.cc:576:1: error: multi-line comment [-Werror=comment] // #define BLENDER(a, b, f) (uint8)((int)(a) + \ ^ BUG=None TEST=try bots Review URL: https://codereview.chromium.org/2344203003 .	2016-09-16 15:16:39 -07:00
Frank Barchard	8279df963e	Scale by 3/8 only if source is multiple of 8 tall. BUG=libyuv:635 TEST=try bots R=harryjin@google.com Review URL: https://codereview.chromium.org/2347733002 .	2016-09-16 14:57:47 -07:00
Frank Barchard	137aa63afe	Fix some comment typos BUG=None TEST=try bots Review URL: https://codereview.chromium.org/2346633002 .	2016-09-15 15:38:19 -07:00
Frank Barchard	de944ed8c7	YuvConstants declare alignment for externs as well as declarations. On visual c 2013 and earlier a warning is generated if externs are not declared with the same alignment as the declaration, when using /ltcg BUG=libyuv:633 TEST=standalone test built with cl /Bv /GL /Ox /nologo a.cc b.cc /link /ltcg R=skal@google.com Review URL: https://codereview.chromium.org/2291533004 .	2016-08-30 11:06:46 -07:00
Frank Barchard	c244a3e9a0	Add SplitUVPlanes and MergeUVPlanes Add public methods SplitUVPlanes and MergeUVPlanes based on the optimized assembly functions that already exists. Also, de-duplicate the CPU dispatching code for these functions by moving them to helper functions. BUG=libyuv:629 R=braveyao@chromium.org Review URL: https://codereview.chromium.org/2277603004 .	2016-08-24 16:47:24 -07:00
Frank Barchard	161e5c4569	Allow NULL for dst_y in planar formats. BUG=libyuv:631 TEST=unittests build/pass R=harryjin@google.com Review URL: https://codereview.chromium.org/2271053003 .	2016-08-24 10:19:14 -07:00
Frank Barchard	17d31e6a4a	NV12 allow NULL for Y The conversion from NV12 and other Bi or Tri planar formats, differs only in the UV handling. The helper function supports passing a NULL for the dst_y channel indicating you only want to do the UV conversion. TBR=harryjin@google.com TEST=LibYUVConvertTest.NV12ToI420_NullY (601 ms) BUG=libyuv:626 Review URL: https://codereview.chromium.org/2276703002 .	2016-08-23 19:05:25 -07:00
Frank Barchard	d58297a2df	NV12ToI420 use SplitPlane function TBR=magjed@chromium.org BUG=libyuv:629 TEST=LibYUVConvertTest.NV12ToI420_Opt Review URL: https://codereview.chromium.org/2267303002 .	2016-08-22 18:35:55 -07:00
Frank Barchard	36ae08ce1c	Suppress MJPEG fprintf() runtime warning TBR=harryjin@google.com BUG=libyuv:630 TEST=local build and try bots pass Review URL: https://codereview.chromium.org/2264293002 .	2016-08-22 16:30:36 -07:00
Frank Barchard	46a8eaaf0c	fix typo in YUV R=braveyao@chromium.org BUG=None Review URL: https://codereview.chromium.org/2152623002 .	2016-07-13 17:17:19 -07:00
Frank Barchard	1aa4ddd21c	Attribute aligned 32 for YUV conversion structure on Intel Fix for unaligned memory exception. R=braveyao@chromium.org BUG=libyuv:616 Review URL: https://codereview.chromium.org/2152553002 .	2016-07-13 12:19:26 -07:00
Frank Barchard	abcb70f183	Test nv21 layout of Android420ToI420 function, which takes 3 pointers to Y,U,V and a pixel stride for U and V. The pixel stride is expected to be 1 or 2. [ RUN ] LibYUVConvertTest.Android420ToI420_1_Any [ OK ] LibYUVConvertTest.Android420ToI420_1_Any (253 ms) [ RUN ] LibYUVConvertTest.Android420ToI420_1_Unaligned [ OK ] LibYUVConvertTest.Android420ToI420_1_Unaligned (250 ms) [ RUN ] LibYUVConvertTest.Android420ToI420_1_Invert [ OK ] LibYUVConvertTest.Android420ToI420_1_Invert (254 ms) [ RUN ] LibYUVConvertTest.Android420ToI420_1_Opt [ OK ] LibYUVConvertTest.Android420ToI420_1_Opt (247 ms) [ RUN ] LibYUVConvertTest.Android420ToI420_2_Any [ OK ] LibYUVConvertTest.Android420ToI420_2_Any (132 ms) [ RUN ] LibYUVConvertTest.Android420ToI420_2_Unaligned [ OK ] LibYUVConvertTest.Android420ToI420_2_Unaligned (122 ms) [ RUN ] LibYUVConvertTest.Android420ToI420_2_Invert [ OK ] LibYUVConvertTest.Android420ToI420_2_Invert (124 ms) [ RUN ] LibYUVConvertTest.Android420ToI420_2_Opt [ OK ] LibYUVConvertTest.Android420ToI420_2_Opt (119 ms) TEST=LibYUVConvertTest.Android420ToI420_Opt BUG=libyuv:604 R=braveyao@chromium.org Review URL: https://codereview.chromium.org/2146733002 .	2016-07-12 18:34:04 -07:00
Frank Barchard	84e04699c2	Add libyuv:Android420ToI420 function which takes 3 pointers to Y,U,V and a pixel stride for U and V. The pixel stride is expected to be 1 or 2. TEST=LibYUVConvertTest.Android420ToI420_Opt BUG=libyuv:604 R=braveyao@chromium.org Review URL: https://codereview.chromium.org/2114843002 .	2016-07-12 16:23:51 -07:00
Frank Barchard	2f101fdbda	mingw64 fix - guard row_win.cc against mingw build. The old guard only checked for defined(_M_X64) which is defined by mingw64. Add a test for defined(_MSC_VER) which is defined for clangcl and visual c but not mingw. mingw should use row_gcc.cc for both 32 and 64 bit. R=harryjin@google.com BUG=webm:1252,libyuv:613 TEST=local gcc/clang builds on linux tested and try bots for others. Review URL: https://codereview.chromium.org/2105603002 .	2016-06-28 10:21:27 -07:00
Frank Barchard	b8ddb5a2a7	rounding for arm filter R=wangcheng@google.com, harryjin@google.com BUG=libyuv:607 Review URL: https://codereview.chromium.org/2093913004 .	2016-06-24 16:07:49 -07:00
Frank Barchard	1b3e4aee47	make count a memory variable for 32 bit 32 bit clang runs out of registers and compiler does core dump. force 32 bit build to use memory variable for counter. BUG=libyuv:612 TBR=harryjin@google.com Review URL: https://codereview.chromium.org/2091913003 .	2016-06-23 20:42:10 -07:00
Frank Barchard	cc88adc620	YUV scale filter columns improved filtering accuracy upscale a YUV image. observe change in hue.. green especially. disable ScaleFilterCols_SSSE3, falling back on ScaleFilterCols_C observe hue.. green especially, is better. was ScaleFrom1280x720_Bilinear (1620 ms) now ScaleFrom1280x720_Bilinear (1907 ms) BUG=libyuv:605 TEST=try bots R=harryjin@google.com, wangcheng@google.com Review URL: https://codereview.chromium.org/2084533006 .	2016-06-23 20:16:55 -07:00
Niels Möller	365ed3851c	Treat YU12 as an alias for I420. Simplify setting of inv_crop_height. BUG= R=fbarchard@google.com Review URL: https://codereview.chromium.org/2020193002 .	2016-06-16 12:49:17 +02:00
Frank Barchard	fd3e676e91	android_full_debug x86 fix - use +rm for width count. Work around for android full debug build running out of registers. 5 functions were running out of registers causing the compiler error: 'asm' operand has impossible constraints. These functions mostly have 4 pointers, a counter (width) and a temporary eax register. With fpic and debug using stackframes, 2 registers are unavailable. So a total of 8 registers are used. Although fpic and stack frame don't apply to assembly, the compiler reserves 2 registers. The optimized version builds, so it's likely freeing up the registers once it knows they are not used. These functions used to build, so compile options and/or compiler may have updated; likely fpic was turned on. An attribute can be done to disable each, and will avoid using the 2 GPR registers, but they are still reserved and unavailable in debug builds on current compilers (gcc 4.9 and clang 3.8). R=dhrosa@google.com BUG=libyuv:602 Review URL: https://codereview.chromium.org/2066933002 .	2016-06-14 15:25:28 -07:00
Frank Barchard	026be3cd85	neon64 use width int directly. With the %w size modifier the int width can be passed directly to arm assembly. For functions that take input constants, the outputs are declared as early write using &, meaning the outputs are written before all inputs are consumed. R=harryjin@google.com BUG=libyuv:598 Review URL: https://codereview.chromium.org/2043073003 .	2016-06-08 10:26:53 -07:00
Frank Barchard	17e8a4d3df	Remove ifdefs for neon in row_neon*.cc ifdefs on a function level are not needed for neon functions, unless they are conditionally enabled in row.h. No functions are conditionally enabled at this time, so all ifdefs can be removed from row_neon.cc and row_neon64.cc TBR=kjellander@chromium.org BUG=libyuv:599 Review URL: https://codereview.chromium.org/2044223002 .	2016-06-07 14:34:13 -07:00
Frank Barchard	6546096269	ARGBExtractAlpha 16 pixels at a time for ARM arm64 8 TestARGBExtractAlpha (10019 ms) <-original 64 bit code arm64 8 x2 TestARGBExtractAlpha (7639 ms) arm64 16 TestARGBExtractAlpha (7369 ms) <- new 64 bit code thumb32 8 TestARGBExtractAlpha (9505 ms) <- original 32 bit code thumb32 8 x2 TestARGBExtractAlpha (7400 ms) thumb32 8 x2i TestARGBExtractAlpha (7266 ms) <- new 32 bit code arm32 8 TestARGBExtractAlpha (10002 ms) BUG=libyuv:572 TESTED=local test on nexus 9 R=harryjin@google.com, wangcheng@google.com Review URL: https://codereview.chromium.org/2035573002 .	2016-06-07 10:44:28 -07:00
Frank Barchard	b00d40160a	make unittest allocator align to 64 bytes. blur requires memory be aligned. change the unittest allocator to guarantee 64 byte alignment. re-enable blur any test that fails if memory is unaligned. TBR=harryjin@google.com BUG=libyuv:596,libyuv:594 TESTED=local build passes with row.h removed from tests. Review URL: https://codereview.chromium.org/2019753002 .	2016-05-27 18:02:47 -07:00
Frank Barchard	ade85fb55c	remove row.h from unittests add SIMD_ALIGNED to unittest header. BUG=libyuv:594 TESTED=local build passes with row.h removed from tests. R=harryjin@google.com Review URL: https://codereview.chromium.org/2001373002 .	2016-05-27 10:57:49 -07:00
Magnus Jedvert	942db3016a	Add ARGBExtractAlpha function BUG=libyuv:572 R=fbarchard@google.com Review URL: https://codereview.chromium.org/1995293002 .	2016-05-26 10:30:57 +02:00
Frank Barchard	74a69522da	white space fixes for MIPS TBR=kjellander@chromium.org BUG=None Review URL: https://codereview.chromium.org/2005053004 .	2016-05-24 14:17:18 -07:00
Frank Barchard	7edf572e28	remove includes for duplicate functions R=harryjin@google.com BUG=libyuv:592 TESTED=local builds work with fewer headers Review URL: https://codereview.chromium.org/2006943002 .	2016-05-23 17:38:26 -07:00
Frank Barchard	fbdc43a03c	fix wrong HAS_ARGBCOPYALPHAROW_SSE2 ifdef TBR=kjellander@chromium.org BUG=libyuv:593 TESTED=try bots pass. Review URL: https://codereview.chromium.org/2000393002 .	2016-05-23 16:26:02 -07:00
Frank Barchard	cf101116c9	Remove initialize to zero on output variables for inline. Inline that uses temporary variables is currently initializing them to 0 and passing in as output "+r". This CL replaces the output constraint to "=&r" for most meaning an output with early write (before inputs). This allows the initialize to zero step to be removed, saving 1 instruction. BUG=libyuv:580 TESTED=local libyuv build on gcc/linux and try bots R=harryjin@google.com Review URL: https://codereview.chromium.org/1895743008 .	2016-04-18 16:24:26 -07:00
Frank Barchard	9c53ff2c57	Fix temporary stride for ConvertToARGB with rotation. BUG=libyuv:578 TESTED=local unittests pass R=harryjin@google.com Review URL: https://codereview.chromium.org/1879783002 .	2016-04-11 15:21:04 -07:00
Frank Barchard	3c862e3d29	Fix stride bug for msan on I420Interpolate. When using C version of I420Interpolate for msan, a 50% interpolation would cause stride to be cast to int, which could cause erroneous memory reads on 64 bit build. This CL makes the stride use ptrdiff_t for HalfRow_C BUG=libyuv:582 TESTED=try bots tests R=dhrosa@google.com Review URL: https://codereview.chromium.org/1872953002 .	2016-04-08 15:58:53 -07:00
Frank Barchard	c7372a323a	add if defined(_MSC_FULL_VER) for NaCL TBR=kjellander@chromium.org BUG=libyuv:573 TESTED=try bots Review URL: https://codereview.chromium.org/1850053002 .	2016-04-01 17:48:23 -07:00
Frank Barchard	76aee8ced7	Remove most clang-cl special cases from cpu_id.cc They are not needed, and due to them there was a call to _xgetbv() without a declaration of the function. This used to work because we implicitly included intrin.h in all translation units with clang-cl, but we want to stop doing that. BUG=chromium:592745 R=fbarchard@google.com Review URL: https://codereview.chromium.org/1780473003 .	2016-03-10 14:01:26 -08:00
Frank Barchard	ee99b85126	Port ARGBToRGB565 from aarch64 neon to 32 bit. Ports the 64 bit version of ARGBToRGB565 to 32 bit. 64 bit is using sri which shifts and inserts, saving some masking. The instruction is available for neon 32 bit as well. R=magjed@chromium.org, harryjin@google.com BUG=libyuv:571 Review URL: https://codereview.chromium.org/1724393002 .	2016-02-29 12:22:25 -08:00
Frank Barchard	22e062a448	Port ARGBToJ420 to AVX2 ARGBToJ420 had an SSSE3 version, but not AVX2. ARGBToI420 had an AVX2, so adapt that code to J420. TBR=harryjin@google.com BUG=libyuv:553 Review URL: https://codereview.chromium.org/1702373004 .	2016-02-17 23:16:39 -08:00
Frank Barchard	127ff512b3	add perf data files to ignores document play services update R=jkellander@chromium.org BUG=none Review URL: https://codereview.chromium.org/1712463002 .	2016-02-17 21:37:09 -08:00
Frank Barchard	cc33dc68c7	Port I411ToARGBRow to AVX2. An SSSE3 version already exists, and an AVX2 version is available for Visual C. This ports the function to AVX2 completing the AVX2 ports of all YUV to RGB functions for AVX2 on gcc. TBR=harryjin@google.com BUG=libyuv:555 Review URL: https://codereview.chromium.org/1687253002 .	2016-02-12 10:26:10 -08:00
Frank Barchard	0e554b18fe	port NV12ToRGB565Row_AVX2 to gcc NV12ToRGB565Row for Intel is implemented as a 2 step conversion: NV12ToARGBRow_SSSE3 and ARGBToRGB565Row_SSE2 NV12ToARGBRow has an AVX2 version, so this CL implements NV12ToRGB565Row_AVX2 with call to NV12ToARGBRow_AVX2 and ARGBToRGB565Row_SSE2. R=harryjin@google.com BUG=libyuv:554 Review URL: https://codereview.chromium.org/1687953002 .	2016-02-10 11:13:41 -08:00
Frank Barchard	c39509c8e5	add avx2 wrappers for functions that can call I422ToARGBRow_AVX2 R=harryjin@google.com BUG=libyuv:557 Review URL: https://codereview.chromium.org/1687713002 .	2016-02-09 17:14:29 -08:00
Frank Barchard	0d880e5bc0	rename MIPS_DSPR2 to DSPR2 for consistency When attempting to normalize function names to end in Row_SIMD it was made harder with MIPS_DSPR2 naming convention. Other CPUs do not include the vendor. This should be named consistently. Removed the DISABLE_MIPS in favour of DISABLE_ASM for consistency with other processors. TBR=harryjin@google.com BUG=libyuv:562 Review URL: https://codereview.chromium.org/1677633002 .	2016-02-05 14:49:54 -08:00
Frank Barchard	05ed0c539c	rework scale code for ubsan For more info on ubsan, see http://dev.chromium.org/developers/testing/undefinedbehaviorsanitizer TESTED=Passing compilation using: GYP_DEFINES="ubsan=1" GYP_DEFINES="ubsan_vptr=1" R=harryjin@google.com, pbos@webrtc.org BUG=libyuv:563 Review URL: https://codereview.chromium.org/1654253004 .	2016-02-02 11:01:49 -08:00
Frank Barchard	9e39c1f271	ubsan overflow fix for multiply by 0x01010101 This is an UBSan error reported by libjingle [ RUN ] WebRtcVideoFrameTest.ConvertToYUY2BufferStride [000:000] (videoframe.cc:375): Validate frame passed. format: I420 bpp: 12 size: 1280x720 bytes: 1382400 expected: 1382400 sample[0..3]: 73, 73, 73, 73 ../../chromium/src/third_party/libyuv/source/row_gcc.cc:2903:25: runtime error: signed integer overflow: 128 * 16843009 cannot be represented in type 'int' [8/614] WebRtcVideoFrameTest.ConvertToYUY2BufferStride returned/aborted with exit code 1 (32 ms) [9/614] WebRtcVideoFrameTest.ConvertToYUY2BufferInverted (29 ms) Note: Google Test filter = WebRtcVideoFrameTest.ConvertToYUY2BufferInverted The source is uint8 and the multiply is by 0x01010101 to replicate the byte to 4 bytes. Changing the constant to 0x01010101u should avoid overflow. R=harryjin@google.com TBR=harryjin@google.com BUG=libyuv:563 Review URL: https://codereview.chromium.org/1657533005 .	2016-02-01 12:29:04 -08:00
Frank Barchard	58cb534962	Fix memory overwrite in YUY2ToNV12 odd widths. When width was odd Y channel wrote an extra pixel. This change splits the Y from UV into a temporary buffer and memcpy's to the destination. Performance is slower. Was YUY2ToNV12_Any (307 ms) YUY2ToNV12_Unaligned (213 ms) TestYUY2ToNV12 (181 ms) YUY2ToNV12_Opt (177 ms) YUY2ToNV12_Invert (177 ms) Now YUY2ToNV12_Any (300 ms) YUY2ToNV12_Unaligned (226 ms) YUY2ToNV12_Invert (206 ms) TestYUY2ToNV12 (184 ms) YUY2ToNV12_Opt (181 ms) TBR=harryjin@google.com BUG=libyuv:545 Review URL: https://codereview.chromium.org/1593833002 .	2016-01-19 11:28:09 -08:00
Frank Barchard	8377c798fb	Fix I420ToNV21 for wrong dst_stride_y parameter. I420ToNV21 passes the wrong dst_stride_y when it calls I420ToNV12; parameter 8 (convert_from.cc:448) is src_stride_y but should be dst_stride_y. This causes image corruption when converting I420 -> NV21 with mismatched luminance strides. R=dhrosa@google.com, harryjin@google.com BUG=libyuv:547 Review URL: https://codereview.chromium.org/1582793008 .	2016-01-14 17:38:54 -08:00
Frank Barchard	081475b3c8	refactor ARGBToI422 using ARGBToI420 internally R=harryjin@google.com BUG=libyuv:546 Review URL: https://codereview.chromium.org/1574253004 .	2016-01-12 17:05:49 -08:00
Frank Barchard	23c6a83561	Fix ifdef mismatch for mirroruv Macro define and macro ifdef didn't match, leading to C code being used. Make macro match function name. TBR=harryjin@google.com BUG=libyuv:543 Review URL: https://codereview.chromium.org/1579023002 .	2016-01-11 16:33:36 -08:00
Frank Barchard	0e462e6f45	Remove use_sysroot=0 use_sysroot=0 is required for webrtc on linux intel builds, but libyuv doesn't use the affected libraries, so removing this. R=harryjin@google.com, sbc@chromium.org BUG=libyuv:534,libyuv:542 Review URL: https://codereview.chromium.org/1566303002 .	2016-01-11 14:57:50 -08:00
Frank Barchard	fc52d8ded2	Odd width variation of scale down by 2 for subsampling R=dhrosa@google.com, harryjin@google.com BUG=libyuv:538 Review URL: https://codereview.chromium.org/1558093003 .	2016-01-06 15:12:17 -08:00
Frank Barchard	36615d62a0	fix for InterpolateRow_AVX2 port scaledownby4_avx2 to gcc TBR=harryjin@google.com BUG=libyuv:492 Review URL: https://codereview.chromium.org/1546763002 .	2015-12-22 12:29:54 -08:00
Frank Barchard	71deb7ba3a	bug fix - remove shift from InterpolateRow_AVX2 TBR=harryjin@google.com BUG=libyuv:537 Review URL: https://codereview.chromium.org/1547703002 .	2015-12-22 10:28:48 -08:00
Frank Barchard	2cb2e9e1ad	fix for InterpolateRow_AVX2 TBR=harryjin@google.com BUG=libyuv:535 Review URL: https://codereview.chromium.org/1543773002 .	2015-12-21 18:35:12 -08:00
Frank Barchard	3f4d86053e	avx2 interpolate use 8 bit BUG=libyuv:535 R=dhrosa@google.com Review URL: https://codereview.chromium.org/1535833003 .	2015-12-21 10:57:32 -08:00
Frank Barchard	f4447745ae	Add rounding to InterpolateRow for improved quality and consistency. Remove inaccurate specializations for 1/4 and 3/4, since they round incorrectly. Specialize for 100% and 50% are kept due to performance. Make C and ARM code match SSSE3. Make unittests expect zero difference. BUG=libyuv:535 R=harryjin@google.com Review URL: https://codereview.chromium.org/1533643005 .	2015-12-17 15:24:06 -08:00
Frank Barchard	1ccbf8fb7b	use memory for loop counter to work around nearly out of registers TBR=harryjin@google.com BUG=libyuv:533 Review URL: https://codereview.chromium.org/1535433003 .	2015-12-16 17:13:37 -08:00
Frank Barchard	80ca4514ef	change scale down by 4 to use rounding. TBR=harryjin@google.com BUG=libyuv:447 Review URL: https://codereview.chromium.org/1525033005 .	2015-12-15 21:25:18 -08:00
Frank Barchard	70445ef2ef	avx2 scale down by 2 for gcc R=dhrosa@google.com, harryjin@google.com BUG=libyuv:527 Review URL: https://codereview.chromium.org/1520423003 .	2015-12-15 10:59:20 -08:00
Frank Barchard	ae55e41851	use rounding in scaledown by 2 When scaling down by 2 the formula should round consistently. (a+b+c+d+2)/4 The C version did but the SSE2 version was doing 2 averages. avg(avg(a,b),avg(c,d)) This change uses a sum, then rounds. R=dhrosa@google.com, harryjin@google.com BUG=libyuv:447,libyuv:527 Review URL: https://codereview.chromium.org/1513183004 .	2015-12-14 17:25:36 -08:00
Frank Barchard	b3bbcc1f4e	add ifdef for AVX2 so vs2010 can still compile R=harryjin@google.com BUG=libyuv:531 Review URL: https://codereview.chromium.org/1515503005 .	2015-12-09 15:23:51 -08:00
Frank Barchard	cb44936403	fix typo in avx2 gcc blend. was using wrong register on 32 pixel version. R=harryjin@google.com, dhrosa@google.com BUG=libyuv:527 Review URL: https://codereview.chromium.org/1511433006 .	2015-12-09 10:38:46 -08:00
Frank Barchard	353ffbab80	fix for gcc compile error: variable duplicate define TBR=harryjin@google.com BUG=libyuv:529 Review URL: https://codereview.chromium.org/1512793002 .	2015-12-08 19:03:43 -08:00
Frank Barchard	a2ea905679	BlendPlane any width. Benchmark out\release\libyuv_unittest --libyuv_width=1279 --libyuv_height=719 --libyuv_repeat=999 --libyuv_flags=-1 --gtest_filter=Blend \| sortms Was I420Blend_Any (2321 ms) I420Blend_Unaligned (1684 ms) I420Blend_Opt (1675 ms) I420Blend_Invert (1653 ms) BlendPlane_Invert (1556 ms) BlendPlane_Any (1552 ms) BlendPlane_Unaligned (1548 ms) BlendPlane_Opt (1535 ms) ARGBBlend_Unaligned (659 ms) ARGBBlend_Any (596 ms) ARGBBlend_Invert (591 ms) ARGBBlend_Opt (508 ms) BlendPlaneRow_Unaligned (186 ms) BlendPlaneRow_Opt (171 ms) Now ARGBBlend_Any (621 ms) ARGBBlend_Unaligned (585 ms) ARGBBlend_Invert (564 ms) ARGBBlend_Opt (512 ms) I420Blend_Unaligned (347 ms) I420Blend_Invert (345 ms) I420Blend_Any (337 ms) I420Blend_Opt (327 ms) BlendPlane_Unaligned (187 ms) BlendPlaneRow_Unaligned (187 ms) BlendPlane_Invert (186 ms) BlendPlane_Any (186 ms) BlendPlaneRow_Opt (173 ms) BlendPlane_Opt (171 ms) which is comparable to aligned case out\release\libyuv_unittest --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=999 --libyuv_flags=-1 --gtest_filter=Blend \| sortms ARGBBlend_Any (625 ms) ARGBBlend_Unaligned (602 ms) ARGBBlend_Invert (508 ms) ARGBBlend_Opt (506 ms) I420Blend_Any (353 ms) I420Blend_Unaligned (322 ms) I420Blend_Invert (304 ms) I420Blend_Opt (301 ms) BlendPlaneRow_Unaligned (188 ms) BlendPlane_Unaligned (186 ms) BlendPlane_Invert (185 ms) BlendPlane_Any (184 ms) BlendPlaneRow_Opt (173 ms) BlendPlane_Opt (169 ms) R=dhrosa@google.com, harryjin@google.com BUG=libyuv:527 Review URL: https://codereview.chromium.org/1513443002 .	2015-12-08 18:59:48 -08:00
Frank Barchard	dee77a4ebe	Optimize yuv alpha blend AVX2 code to do 32 pixels at time. out/Release/libyuv_unittest --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=9999 --libyuv_flags=-1 --gtest_filter=*I420Blend_Opt Was LibYUVPlanarTest.I420Blend_Opt (2335 ms) Now LibYUVPlanarTest.I420Blend_Opt (1937 ms) vs SSSE3 LibYUVPlanarTest.I420Blend_Opt (2599 ms) BUG=libyuv:527 R=dhrosa@google.com Review URL: https://codereview.chromium.org/1505673003 .	2015-12-08 18:20:30 -08:00
Frank Barchard	fae1a10545	Work around bug in xgetbv for Visual Studio. xgetbv is generating bad code, falsely disabling AVX2 and AVX512. disable optimization for the function affected on older versions of Visual C 32 bit. R=brucedawson@chromium.org, dhrosa@google.com, harryjin@google.com BUG=libyuv:529 Review URL: https://codereview.chromium.org/1503393004 .	2015-12-08 18:13:32 -08:00
Frank Barchard	2657688e70	Add support for odd height YUVA alpha blending. R=dhrosa@google.com, harryjin@google.com BUG=libyuv:527 Review URL: https://codereview.chromium.org/1507683003 .	2015-12-07 12:03:20 -08:00
Frank Barchard	b0b22f88b9	Unroll C version of YUV blender for improved performance. R=dhrosa@google.com, harryjin@google.com BUG=libyuv:527 Review URL: https://codereview.chromium.org/1502343003 .	2015-12-07 12:02:45 -08:00
Frank Barchard	48a919d86e	Bug fix for UYVYToNV12 odd height TBR=harryjin@google.com BUG=libyuv:528 Review URL: https://codereview.chromium.org/1506973002 .	2015-12-07 11:39:48 -08:00
Frank Barchard	bea690b3e0	AVX2 YUV alpha blender and improved unittests AVX2 version can process 16 pixels at a time for improved memory bandwidth and fewer instructions. unittests improved to test unaligned memory, and test exactness when alpha is 0 or 255. R=dhrosa@google.com, harryjin@google.com BUG=libyuv:527 Review URL: https://codereview.chromium.org/1505433002 .	2015-12-05 22:23:29 -08:00
Frank Barchard	fa2618ee26	Port BlendPlaneRow_SSSE3 to GCC R=dhrosa@google.com, harryjin@google.com BUG=libyuv:527 Review URL: https://codereview.chromium.org/1490273006 .	2015-12-04 11:19:41 -08:00
Frank Barchard	8af0ebf816	planar blend use signed images R=dhrosa@google.com, harryjin@google.com, jzern@chromium.org BUG=libyuv:527 Review URL: https://codereview.chromium.org/1491533002 .	2015-12-02 14:20:17 -08:00
Frank Barchard	b6f37bd8ec	Interpolate plane initial implementation. YUV version of interpolation between two images. R=dhrosa@google.com, harryjin@google.com BUG=libyuv:526 Review URL: https://codereview.chromium.org/1479593002 .	2015-11-25 16:11:42 -08:00
Frank Barchard	526558b2d8	disable debug build of 411 to work around compiler bug TBR=harryjin@google.com BUG=libyuv:524 Review URL: https://codereview.chromium.org/1461013002 .	2015-11-19 02:25:00 -08:00
Frank Barchard	b7dfb72559	fix for I411 build error on 32 bit x86 TBR=harrjin@google.com BUG=libyuv:525 Review URL: https://codereview.chromium.org/1461693004 .	2015-11-19 01:45:14 -08:00
Frank Barchard	528356a128	syntax fix for gcc movzwl TBR=harryjin@google.com BUG=libyuv:525 Review URL: https://codereview.chromium.org/1460723003 .	2015-11-18 13:14:15 -08:00
Frank Barchard	50f8cb2db3	port I411 movzx 2 byte reader to gcc previously the I411 format used movd to read U, V pixels. But this reads 4 bytes, and can cause a memory exception. pinsrw can be used, but fails on drmemory 1.5, and is slow. So in this change a movzxw is used to read 2 bytes into EBX, then copy to xmm0 with movd. Slightly slower, but no memory exception Was LibYUVConvertTest.I411ToARGB_Opt (577 ms) Now LibYUVConvertTest.I411ToARGB_Opt (608 ms) TBR=harryjin@google.com BUG=libyuv:525 Review URL: https://codereview.chromium.org/1457783004 .	2015-11-18 13:05:39 -08:00
Frank Barchard	5eefbe2330	Fix for drmemory failure on I411ToARGB Before I420ToARGB_Opt (594 ms) I422ToARGB_Opt (483 ms) I411ToARGB_Opt (748 ms) * I444ToARGB_Opt (452 ms) I400ToARGB_Opt (218 ms) After I420ToARGB_Opt (591 ms) I422ToARGB_Opt (454 ms) I411ToARGB_Opt (502 ms) * I444ToARGB_Opt (441 ms) I400ToARGB_Opt (216 ms) TBR=harryjin@google.com BUG=libyuv:525 Review URL: https://codereview.chromium.org/1459513002 .	2015-11-17 18:00:52 -08:00
Frank Barchard	0815568a50	test for unaligned vs aligned for CopyRow_SSE2 improves performance on older CPUs where movdqa is faster. TBR=harryjin@google.com BUG=libyuv:492 Review URL: https://codereview.chromium.org/1455463002 .	2015-11-17 00:04:03 -08:00
Frank Barchard	1019e4537f	port I444ToARGB avx2 code from Visual C to GCC. SSSE3 Note: Google Test filter = I444ToARGB [==========] Running 8 tests from 1 test case. [----------] Global test environment set-up. [----------] 8 tests from LibYUVConvertTest [ RUN ] LibYUVConvertTest.I444ToARGB_Any [ OK ] LibYUVConvertTest.I444ToARGB_Any (435 ms) [ RUN ] LibYUVConvertTest.I444ToARGB_Unaligned [ OK ] LibYUVConvertTest.I444ToARGB_Unaligned (418 ms) [ RUN ] LibYUVConvertTest.I444ToARGB_Invert [ OK ] LibYUVConvertTest.I444ToARGB_Invert (417 ms) [ RUN ] LibYUVConvertTest.I444ToARGB_Opt [ OK ] LibYUVConvertTest.I444ToARGB_Opt (411 ms) [ RUN ] LibYUVConvertTest.I444ToARGB_ARGB_Any [ OK ] LibYUVConvertTest.I444ToARGB_ARGB_Any (419 ms) [ RUN ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned [ OK ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned (432 ms) [ RUN ] LibYUVConvertTest.I444ToARGB_ARGB_Invert [ OK ] LibYUVConvertTest.I444ToARGB_ARGB_Invert (435 ms) [ RUN ] LibYUVConvertTest.I444ToARGB_ARGB_Opt [ OK ] LibYUVConvertTest.I444ToARGB_ARGB_Opt (421 ms) [----------] 8 tests from LibYUVConvertTest (3389 ms total) AVX2 Note: Google Test filter = I444ToARGB [==========] Running 8 tests from 1 test case. [----------] Global test environment set-up. [----------] 8 tests from LibYUVConvertTest [ RUN ] LibYUVConvertTest.I444ToARGB_Any [ OK ] LibYUVConvertTest.I444ToARGB_Any (340 ms) [ RUN ] LibYUVConvertTest.I444ToARGB_Unaligned [ OK ] LibYUVConvertTest.I444ToARGB_Unaligned (325 ms) [ RUN ] LibYUVConvertTest.I444ToARGB_Invert [ OK ] LibYUVConvertTest.I444ToARGB_Invert (316 ms) [ RUN ] LibYUVConvertTest.I444ToARGB_Opt [ OK ] LibYUVConvertTest.I444ToARGB_Opt (316 ms) [ RUN ] LibYUVConvertTest.I444ToARGB_ARGB_Any [ OK ] LibYUVConvertTest.I444ToARGB_ARGB_Any (315 ms) [ RUN ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned [ OK ] LibYUVConvertTest.I444ToARGB_ARGB_Unaligned (341 ms) [ RUN ] LibYUVConvertTest.I444ToARGB_ARGB_Invert [ OK ] LibYUVConvertTest.I444ToARGB_ARGB_Invert (331 ms) [ RUN ] LibYUVConvertTest.I444ToARGB_ARGB_Opt [ OK ] LibYUVConvertTest.I444ToARGB_ARGB_Opt (329 ms) [----------] 8 tests from LibYUVConvertTest (2615 ms total) TBR=harryjin@google.com BUG=libyuv:492 Review URL: https://codereview.chromium.org/1445893002 .	2015-11-13 18:31:22 -08:00
Frank Barchard	60adcbaf32	scale with conversion using 2 steps with unittest a prototype function to implement the yuv to rgb with conversion and scale. replace with 1 step function in future version, using same API. R=harryjin@google.com BUG=libyuv:471 Review URL: https://codereview.chromium.org/1421553016 .	2015-11-13 11:25:56 -08:00
Frank Barchard	6100f50f13	fix yvu constants for avx2 yuv to rgb the yvu matrix for yuv to rgb had an incorrect entry, affecting yuv to bgra, yuv to abgr and yuv to raw. fix the matrix and reenable avx2 functions. R=harryjin@google.com BUG=libyuv:522 Review URL: https://codereview.chromium.org/1411763004 .	2015-11-10 10:45:44 -08:00
Frank Barchard	72a9e282ec	disable more avx2 functions that don't link in chrome libyuv builds/runs, but when integrated into chromium, produces link errors. unclear why but this disables affected functions. will follow up with re-enabling them once the root cause in the runtime error is found. TBR=harryjin@google.com BUG=libyuv:522 Review URL: https://codereview.chromium.org/1427683004 .	2015-11-09 17:20:02 -08:00
Frank Barchard	98eb102bea	set d19 alpha on inner loop TBR=harryjin@google.com BUG=libyuv:521 Review URL: https://codereview.chromium.org/1429263004 .	2015-11-06 11:38:21 -08:00
Frank Barchard	431cb3667a	YUV to RGB for x64 use registers instead of memory. On Arm the YVU to RGB conversions move constants into registers. This change does the same for 64 bit intel builds where additional registers are available. The AVX2 saves 3 instructions because the 2nd argument needs to be a register, so a vmovdqu was avoided. x64 builds using memory: AVX2 I420ToARGB_Opt (3059 ms) SSSE3 I420ToARGB_Opt (3959 ms) Now using registers AVX2 I420ToARGB_Opt (2906 ms) SSSE3 I420ToARGB_Opt (3928 ms) TBR=harryjin@google.com BUG=libyuv:520 Review URL: https://codereview.chromium.org/1407353010 .	2015-11-04 16:16:18 -08:00
Frank Barchard	860cc0357a	Neon versions of I420AlphaToARGB Add alpha version of YUV to RGB to neon code for ARMv7 and aarch64. For other YUV to RGB conversions, hoist alpha set to 255 out of loop. TBR=harryjin@google.com BUG=libyuv:516 Review URL: https://codereview.chromium.org/1413763017 .	2015-11-03 19:21:36 -08:00
Frank Barchard	d95d2169d9	rename yuv matrix constants to be more clear about what they are R=harryjin@google.com BUG=none Review URL: https://codereview.chromium.org/1429693006 .	2015-11-03 17:09:53 -08:00
Frank Barchard	1f1d140bb6	remove mips dsp detect DSP code is not actually used, only DSPR2. Remove the detect. TBR=harryjin@google.com BUG=none Review URL: https://codereview.chromium.org/1405043008 .	2015-11-03 16:57:40 -08:00
Frank Barchard	ce4c2fad1d	Raw 24 bit RGB to RGB24 (bgr) Add unittests that do 1 step conversion vs 2 step conversion. Tests end swapping versions match direct conversions. R=harryjin@google.com BUG=libyuv:518 Review URL: https://codereview.chromium.org/1419103007 .	2015-11-03 10:30:30 -08:00
Frank Barchard	87926cec8b	remove store bgra, abgr, raw unused macros TBR=harryjin@google.com BUG=libyuv:518 Review URL: https://codereview.chromium.org/1420033004 .	2015-11-02 10:40:03 -08:00
Frank Barchard	2c7aa0070a	remove I422ToBGRA and use I422ToRGBA internally Removes low levels for I420ToBGRA and I420ToRAW and reimplements them as I420ToRGBA and I420ToRGB24 with transposed color matrix. Adds unittests that do 1 step conversion vs 2 steps to test end swapping versions match direct conversions. R=harryjin@google.com BUG=libyuv:518 Review URL: https://codereview.chromium.org/1427993004 .	2015-11-02 10:24:12 -08:00
Frank Barchard	5d97b93369	refactor I420ToABGR to use I420ToARGBRow Using a transposed conversion matrix, I420ToARGB can output ABGR. R=harryjin@google.com, xhwang@chromium.org BUG=libyuv:473 Review URL: https://codereview.chromium.org/1413573010 .	2015-10-30 11:56:57 -07:00
Frank Barchard	cdbdf5b723	Fix debug compilation problems for gcc and 32 bit x86. In some methods with 7 arguments gcc fails to find enough registers to compile the assembler code when compiling debug. Simplest solution is to skip the assembler version in debug of those particular functions (I422Alpha -> ARGB/ABGR) R=harryjin@google.com,bratell@opera.com BUG=libyuv:517 Review URL: https://codereview.chromium.org/1423283002 .	2015-10-28 14:27:29 -07:00
Frank Barchard	b86dbf24d3	refactor I420AlphaToABGR to use I420AlphaToARGB internally swap U and V and transpose conversion matrix, so I420AlphaToARGB and I420AlphaToABGR share low level code. Having less code with same performance allows more focused optimization for future ARM versions. R=harryjin@google.com TBR=harryjin@chromium.org BUG=libyuv:473,libyuv:516 Review URL: https://codereview.chromium.org/1422263002 .	2015-10-27 14:17:21 -07:00
Frank Barchard	cf160cdbaa	implement I444ToABGR by swapping uv and transpose matrix U contributes to B and G. V contributes to R and G. By swapping U and V, they contribute to the opposite channels. Adjust the matrix so the U contribution is in the matrix location such that it will contribute to the new B channel and vice versa. This allows ABGR versions of YUV conversion to use the same low level code as ARGB, just using a different matrix and swapping U and V pointers. As a result the existing I444ToABGRRow functions are no longer needed and are removed. Previously this function was only Intel AVX2 optimized for Windows. Now it is also optimized for Arm and GCC. ARMv7 Neon Was LibYUVConvertTest.I444ToABGR_Opt (75971 ms) Now LibYUVConvertTest.I444ToABGR_Opt (3672 ms) 20.6 times faster. R=xhwang@chromium.org BUG=libyuv:515 Review URL: https://codereview.chromium.org/1414133006 .	2015-10-27 10:21:21 -07:00
Frank Barchard	2844662e1c	Add avx512bw detection code R=harryjin@google.com BUG=libyuv:514 Review URL: https://codereview.chromium.org/1413463004 .	2015-10-26 14:42:49 -07:00
Frank Barchard	1502832a70	switch cpu flags to 0 for uninitialized to avoid compare R=harryjin@google.com BUG=libyuv:512 Review URL: https://codereview.chromium.org/1418253002 .	2015-10-23 10:57:42 -07:00
Frank Barchard	ad36ba5c48	initialize cpu flags to fix compile error on windows R=harryjin@google.com BUG=libyuv:512 Review URL: https://codereview.chromium.org/1422733003 .	2015-10-22 15:16:31 -07:00
Frank Barchard	430bb0a0f0	odd width 444 fix TBR=harryjin@google.com BUG=libyuv:510 Review URL: https://codereview.chromium.org/1415583003 .	2015-10-21 20:03:19 -07:00
Frank Barchard	90335f6043	bug fix for odd width 16/24 bit to i420 A bug was introduced on arm when the code for 'any' width switched to a temporary stack buffer and simd. The C version handles odd width by doing 1 pixel, instead of averaging 2. But the SIMD any version is supposed to replicate the last pixel, then the subsampling in Neon will average the pixel with itself, producing the same result. The previous version did this, but only for ARGB 32 bit, which was to avoid introducing issues with subsampled YUY2 source. This CL adds replication for RGB 16 bit values. TBR=harryjin@google.com BUG=libyuv:510 Review URL: https://codereview.chromium.org/1418983003 .	2015-10-21 18:23:02 -07:00
Frank Barchard	5bf4de0806	width and 3 bug fix in odd width support of ARGBToI411 TBR=harryjin@google.com BUG=none Review URL: https://codereview.chromium.org/1415213002 .	2015-10-21 12:45:08 -07:00
Frank Barchard	ba4b409d51	Fix ARGBToI411 odd width bug. The any function for handling ARGBToI411 was not handling the pixel replication correctly. On 422, an odd width was handled by duplicating a pixel of source. 411 needs replication for remainders of 1, 2 or 3 pixels. The C version was handling odd width but with an average of the remainder pixels, which does not match the SIMD 'any' handling of the remainder. This changes the odd width handling to mimic the any version. TBR=harryjin@google.com BUG=libyuv:491 Review URL: https://codereview.chromium.org/1411733004 .	2015-10-21 12:22:24 -07:00
Frank Barchard	9daa550a2e	Move cpu_info variable outside ifdef Fix compile error on arm, mips etc due to undefined variable. TBR=harryjin@google.com BUG=none Review URL: https://codereview.chromium.org/1403373008 .	2015-10-20 16:32:44 -07:00
Frank Barchard	9be6d21ae7	write to cpu_flags once To make init cpu flags thread safe, there can only be one write to the variable. R=richard.winterton@intel.com, harryjin@google.com BUG=libyuv:508 Review URL: https://codereview.chromium.org/1412793006 .	2015-10-20 16:24:01 -07:00
Frank Barchard	cf19a0c9a2	nv21 any fix R=harryjin@google.com BUG=libyuv:507 Review URL: https://codereview.chromium.org/1410643002 .	2015-10-15 16:24:51 -07:00
Frank Barchard	52a5504950	fix for C version of YUV to RGB for Arm YuvPixel for arm was miscomputing YG. TBR=harryjin@google.com BUG=libyuv:506 Review URL: https://codereview.chromium.org/1402333002 .	2015-10-15 12:43:37 -07:00
Frank Barchard	4abd096548	fix for yuv to rgb on arm64. fill in aarch64 yuv constants to match how the code expects them. TBR=harryjin@google.com BUG=libyuv:502 Review URL: https://codereview.chromium.org/1396253004 .	2015-10-12 12:02:54 -07:00
Frank Barchard	2e4466e282	change all pix parameters to width for consistency TBR=harryjin@google.com BUG=none Review URL: https://codereview.chromium.org/1398633002 .	2015-10-07 22:30:36 -07:00
Frank Barchard	76a599ec3b	fix jpeg and bt.709 yuvconstants for neon64. yuv constants for bt.601 were previously ported to neon64, as well as the code to respect other color spaces. But the jpeg and bt.709 colour conversion constants were still in armv7 form. This changes the constants for aarch64 builds to be compatible with the code. yuv constants are now passed as const * Remove Yvu constants which were used for older version on nv21 but not new code. TBR=harryjin@google.com BUG=none Review URL: https://codereview.chromium.org/1398623002 .	2015-10-07 19:46:56 -07:00
Frank Barchard	fae8e66d43	Fix for AVX2 dither function. Fix for 64 bit gcc parameter in dither function which requires m not r, when ABI uses register. BUG=none Review URL: https://codereview.chromium.org/1399463002 .	2015-10-07 19:17:56 -07:00
Frank Barchard	8f0cadede4	port ARGB to 565 dithering AVX2 code to GCC. Previously the assembly code was only available to Windows. This CL ports the AVX2 code to GCC syntax. TBR=harryjin@google.com BUG=libyuv:492 Review URL: https://codereview.chromium.org/1391273003 .	2015-10-07 19:13:59 -07:00
Frank Barchard	cc89e3a77b	port ARGB to 565 dithering SSE2 code to GCC. Previously the assembly code was only available to Windows. This CL ports the SSE2 code to GCC syntax. When running a profiler on all the unittests, this function was the slowest of all functions that still ran in C code. 3.71% libyuv_unittest libyuv_unittest [.] ARGBToRGB565DitherRow_C Was ARGBToRGB565Dither_Opt (2894 ms) Now ARGBToRGB565Dither_Opt (432 ms) TBR=harryjin@google.com BUG=libyuv:492 Review URL: https://codereview.chromium.org/1397673002 .	2015-10-07 18:24:50 -07:00
Frank Barchard	3e38762d6b	fix avx2 box filter bug for yuv down sampling. offset to second group of pixels was off by 16. should have been 32, not 16. requires avx2 hardware and wide image for test. R=harryjin@google.com TBR=harryjin@google.com BUG=libyuv:492,libyuv:501 Review URL: https://codereview.chromium.org/1395603002 .	2015-10-07 11:02:33 -07:00
Frank Barchard	013080f2d2	Pass yuvconstants to YUV conversions for neon 64 bit SETUP provided by zhongwei.yao@linaro.org Previously the 64 bit Neon code had hard coded constants in the setup macro for YUV conversion, while 32 bit Neon code supported the yuvconstants parameter. This change accepts the constants passed to the YUV conversion row function, allowing different color spaces to be respected, namely JPEG and BT.709, as well as the existing BT.601. TBR=harryjin@google.com BUG=libyuv:472 Review URL: https://codereview.chromium.org/1384323002 .	2015-10-06 22:19:14 -07:00
Frank Barchard	914a9856c7	Reimplement NV21ToARGB to allow different color matrix. Low level for NV21ToARGB written to accept yuv matrix used by other YUV to ARGB functions. Previously NV21 was implemented for Windows using NV12 with a different matrix that swapped U and V. But the Arm version of the low level does not allow the matrix U and V contributions to be swapped. Using a new low level function that reads NV21 and uses the same yuvconstants as other YUV conversion functions allows an Arm port of this function. TBR=harryjin@google.com BUG=libyuv:500 Review URL: https://codereview.chromium.org/1388273002 .	2015-10-06 20:34:44 -07:00
Frank Barchard	68fa59c873	add box scaling avx2 optimization for gcc TBR=harryjin@google.com BUG=libyuv:492 Review URL: https://codereview.chromium.org/1392803002 .	2015-10-06 20:01:02 -07:00
Frank Barchard	f00bc9ef46	Add J444ToARGB conversion function. J444 is JPeg YUV color space with 444 subsampling. This implementation uses the existing I444ToARGB conversion, which is BT.601 color space with 444 subsampling, but passing in the jpeg color matrix constants. TBR=harryjin@google.com BUG=449 Review URL: https://codereview.chromium.org/1387313002 .	2015-10-06 18:46:53 -07:00
Frank Barchard	d70293993f	port scale box filter sse2 to gcc TBR=harryjin@google.com BUG=libyuv:492 Review URL: https://codereview.chromium.org/1393653002 .	2015-10-06 16:54:26 -07:00
Frank Barchard	3eefeaeb69	test xsave before calling xgetbv. R=agl@chromium.org, harryjin@google.com BUG=libyuv:497 Review URL: https://codereview.chromium.org/1382803002 .	2015-09-30 17:25:41 -07:00
Frank Barchard	2cc1a2b233	Remove sse2 functions that also have ssse3 ARGBBlendRow_SSE2, ARGBAttenuateRow_SSE2, and MirrorRow_SSE2 Since vast majority of CPUs have SSSE3 now, removing the SSE2 improves the performance of CPU dispatching. R=harryjin@google.com BUG=none Review URL: https://codereview.chromium.org/1377053003 .	2015-09-30 14:24:44 -07:00
Frank Barchard	d039ad6e9b	Width use memory instead of register for 32 bit fpic. Code runs out of registers on 32 bit fpic builds. TBR=harryjin@google.com BUG=libyuv:496 Review URL: https://codereview.chromium.org/1369053002 .	2015-09-25 15:36:04 -07:00
Frank Barchard	febc26a2c9	win64 version of I422AlphaToARGB. Was I420AlphaToARGB_Premult (8861 ms) I420AlphaToARGB_Opt (7119 ms) Now I420AlphaToABGR_Premult (2840 ms) I420AlphaToARGB_Opt (484 ms) C function switched to 1 step. Was I420AlphaToARGB_Premult (8862 ms) I420AlphaToABGR_Opt (6718 ms) Now I420AlphaToARGB_Premult (8706 ms) I420AlphaToARGB_Opt (6541 ms) R=harryjin@google.com BUG=libyuv:496, libyuv:473 Review URL: https://codereview.chromium.org/1359183003 .	2015-09-25 15:06:41 -07:00
Frank Barchard	9a0e12f5f1	AVX2 1 step I422AlphaToARGB for gcc and win. C I420AlphaToARGB_Opt (5169 ms) SSSE3 I420AlphaToARGB_Opt (432 ms) AVX2 I420AlphaToARGB_Opt (358 ms) and with premultiplication as 2 step process: I420AlphaToARGB_Premult (7029 ms) I420AlphaToARGB_Premult (757 ms) I420AlphaToARGB_Premult (508 ms) R=harryjin@google.com BUG=libyuv:496,libyuv:473 Review URL: https://codereview.chromium.org/1372653003 .	2015-09-25 13:37:42 -07:00
Frank Barchard	e365cdde3b	I420Alpha row function in 1 pass. API change - I420AlphaToARGB takes flag indicating if RGB should be premultiplied by alpha. This version implements an efficient SSSE3 version for Windows. C version done in 2 steps. Was libyuvTest.I420AlphaToARGB_Any (1136 ms) libyuvTest.I420AlphaToARGB_Unaligned (1210 ms) libyuvTest.I420AlphaToARGB_Invert (966 ms) libyuvTest.I420AlphaToARGB_Opt (1031 ms) libyuvTest.I420AlphaToABGR_Any (1020 ms) libyuvTest.I420AlphaToABGR_Unaligned (1359 ms) libyuvTest.I420AlphaToABGR_Invert (1082 ms) libyuvTest.I420AlphaToABGR_Opt (986 ms) R=harryjin@google.com BUG=libyuv:496 Review URL: https://codereview.chromium.org/1367093002 .	2015-09-25 10:29:20 -07:00
Frank Barchard	d4594beefc	switch from ebp to ebx. ebx encodes more efficiently (1 byte less) for most address modes, than ebp. previously it was used for 411 format, but the reader uses pinsrw now avoiding gpr register. BUG=libyuv:488 R=harryjin@google.com Review URL: https://codereview.chromium.org/1365003003 .	2015-09-24 17:25:11 -07:00
Frank Barchard	8fb2048e9f	Fix nv12 64 bit gcc increment. Should be 16 bytes, but was 0x16 causing memory corruption. TBR=harryjin@google.com BUG=libyuv:492 Review URL: https://codereview.chromium.org/1368693002 .	2015-09-24 10:19:17 -07:00
Frank Barchard	accc04e6d8	NV12ToARGB_AVX2 ported to gcc TBR=harryjin@google.com BUG=none Review URL: https://codereview.chromium.org/1364913002 .	2015-09-23 15:54:16 -07:00
Frank Barchard	000cf89ca8	YUY2ToARGB avx2 in 1 step conversion. Includes UYVYToARGB ssse3 fix. Was YUY2ToARGB_Opt (433 ms) 69.79% libyuv_unittest libyuv_unittest [.] I422ToARGBRow_AVX2 20.73% libyuv_unittest libyuv_unittest [.] YUY2ToUV422Row_AVX2 6.04% libyuv_unittest libyuv_unittest [.] YUY2ToYRow_AVX2 0.77% libyuv_unittest libyuv_unittest [.] YUY2ToARGBRow_AVX2 Now YUY2ToARGB_Opt (280 ms) 95.66% libyuv_unittest libyuv_unittest [.] YUY2ToARGBRow_AVX2 BUG=libyuv:494 R=harryjin@google.com Review URL: https://codereview.chromium.org/1364813002 .	2015-09-23 11:15:18 -07:00
Frank Barchard	2b92ec8d0f	Fix git markers introduced on landing previous CL BUG=none Review URL: https://codereview.chromium.org/1359023003 .	2015-09-22 15:00:57 -07:00
Frank Barchard	5f3d4270d1	yuy2 to rgb gcc versions read in read function for yuv conversion R=harryjin@google.com BUG=libyuv:488 Review URL: https://codereview.chromium.org/1355393002 .	2015-09-22 14:27:33 -07:00
Frank Barchard	03cd8584e7	Read Y channel in read function for yuv conversion. Allows reader to support YUY2 format. Also contains fix for win64 build for yuv conversion. TBR=harryjin@google.com BUG=libyuv:488 Review URL: https://codereview.chromium.org/1355333002 .	2015-09-22 12:05:16 -07:00
Frank Barchard	f96890a0be	yuvconstants for all YUV to RGB conversion functions. R=harryjin@google.com BUG=libyuv:488 Review URL: https://codereview.chromium.org/1363503002 .	2015-09-22 10:26:03 -07:00
Frank Barchard	62c49dc811	move constants into common R=harryjin@google.com BUG=libyuv:488 Review URL: https://codereview.chromium.org/1359443005 .	2015-09-18 16:28:44 -07:00
Frank Barchard	0381673d19	port I444 to ARGB to matrix. Add I444 to ABGR. R=harryjin@google.com BUG=libyuv:488,libyuv:490 Review URL: https://codereview.chromium.org/1348763005 .	2015-09-18 14:36:15 -07:00
Frank Barchard	28427a53e2	I444ToABGR for android Reimplements I444ToARGB as a matrix function. new I444ToABGR as matrix functions with wrappers and any functions. Allows for future J444 and H444 versions. I444ToABGR user level function added. BUG=libyuv:490, libyuv:449 R=harryjin@google.com Review URL: https://codereview.chromium.org/1355733002 .	2015-09-18 11:20:58 -07:00
Frank Barchard	28ce7d94f5	j422toabgr neon port using i422toabgr matrix function. R=harryjin@google.com BUG=libyuv:488 Review URL: https://codereview.chromium.org/1353923003 .	2015-09-17 15:20:55 -07:00
Frank Barchard	6fcbae1409	J422ToARGB Neon but not aarch64 TBR=harryjin@google.com BUG=libyuv:493 Review URL: https://codereview.chromium.org/1348203004 .	2015-09-17 12:43:05 -07:00
Frank Barchard	6a6b67e7a9	Add H422ToARGB armv7 neon version. Patch provided by zhongwei.yao@linaro.org R=fbarchard@chromium.org, fbarchard@google.com BUG=libyuv:488 Review URL: https://codereview.chromium.org/1344393002 .	2015-09-17 10:38:15 -07:00
Frank Barchard	509c644245	Add J422ToARGB armv7 neon version. R=fbarchard@chromium.org, fbarchard@google.com BUG=libyuv:488 Review URL: https://codereview.chromium.org/1334173005 .	2015-09-15 15:01:48 -07:00
Frank Barchard	73c32d92d7	neon64 use yuvconstants like 32 bit code. TBR=harryjin@google.com BUG=libyuv:488 Review URL: https://codereview.chromium.org/1345643002 .	2015-09-14 16:43:07 -07:00
Frank Barchard	a67927c994	use struct instead of vectors TBR=harryjin@google.com BUG=libyuv:488 Review URL: https://codereview.chromium.org/1345623003 .	2015-09-14 16:07:58 -07:00
Frank Barchard	909160b3b5	use same macros as row_gcc.cc R=harryjin@google.com BUG=libyuv:488 Review URL: https://codereview.chromium.org/1343863002 .	2015-09-14 15:36:10 -07:00
Frank Barchard	fcacbfb27f	validate scan EOI from end for better coverage R=tpsiaki@google.com BUG=libyuv:478 Review URL: https://codereview.chromium.org/1344623003 .	2015-09-14 10:58:51 -07:00
Frank Barchard	67a9e30225	neon yuv matrix function R=harryjin@google.com BUG=libyuv:488 Review URL: https://codereview.chromium.org/1337973002 .	2015-09-11 11:12:30 -07:00
Frank Barchard	316e1ab996	avx2 width parameter bug fix R=harryjin@google.com BUG=libyuv:489 Review URL: https://codereview.chromium.org/1321773004 .	2015-09-09 11:56:35 -07:00
Frank Barchard	ed55d24d9f	H420 functionality R=harryjin@google.com BUG=libyuv:488 Review URL: https://webrtc-codereview.appspot.com/54869004 .	2015-09-06 11:01:40 -07:00
Frank Barchard	67b06e66cb	I422ToABGR for win64. Moves any functions to accommodate win64 subset of formats. TBR=harryjin@google.com BUG=libyuv:488 Review URL: https://webrtc-codereview.appspot.com/57679004 .	2015-09-03 11:00:18 -07:00
Frank Barchard	7060e0d826	I420ToABGRMatrix functions with J420ToABGR wrapper. Allows direct conversion from JPeg to ABGR for android. BUG=libyuv:488 R=harryjin@google.com Review URL: https://webrtc-codereview.appspot.com/55719004 .	2015-09-03 10:42:36 -07:00
Frank Barchard	925c3d9e26	I420ToARGB conversion with matrix. Take color conversion constants as a parameter to row function for I420ToARGBMatrixRow_SSSE3. Allows future variations of color space using a single low level. R=harryjin@google.com BUG=libyuv:488 Review URL: https://webrtc-codereview.appspot.com/56669004 .	2015-09-02 10:45:42 -07:00
Frank Barchard	0bc626a5d7	nolint removed R=harryjin@google.com BUG=none Review URL: https://webrtc-codereview.appspot.com/59389004.	2015-08-31 10:52:13 -07:00
Frank Barchard	0735245c52	pinsrw instruction allows reading 2 bytes directly into an xmm register. Saving a gpr register allows the register to not be pushed for now, and in future it can be used to point to color conversion matrix or alpha channel. R=harryjin@google.com BUG=libyuv:488 Review URL: https://webrtc-codereview.appspot.com/52789004.	2015-08-28 17:03:54 -07:00
Frank Barchard	be11f500f0	Use ebp to point to conversion table. Proof of concept that conversions can take a color matrix as a parameter. R=harryjin@google.com BUG=libyuv:472, libyuv:488 Review URL: https://webrtc-codereview.appspot.com/58489004.	2015-08-28 12:00:49 -07:00
Frank Barchard	3c4f5735ce	use pointer to inverse table for clangcl R=harryjin@google.com TBR=harryjin@google.com BUG=none Review URL: https://webrtc-codereview.appspot.com/54859004.	2015-08-26 12:53:03 -07:00
Frank Barchard	5452cce452	port row to clangcl BUG=libyuv:487 R=harryjin@google.com Review URL: https://webrtc-codereview.appspot.com/53799005.	2015-08-25 16:15:42 -07:00
Frank Barchard	fa7ce4af3f	fixed table for clangcl R=harryjin@google.com BUG=libyuv:487 Review URL: https://webrtc-codereview.appspot.com/53799004.	2015-08-25 10:47:30 -07:00
Frank Barchard	d317a70c1d	llvm64 link error fix. R=harryjin@google.com BUG=libyuv:485 Review URL: https://webrtc-codereview.appspot.com/58479004.	2015-08-24 14:21:04 -07:00
Frank Barchard	4dfdabb552	I420AlphaToABGR for android version of yuva conversion Same as I420AlphaToARGB but first step converts to ABGR instead of ARGB. TBR=harryjin@google.com BUG=libyuv:473 Review URL: https://webrtc-codereview.appspot.com/52779004.	2015-08-20 19:36:59 -07:00
Frank Barchard	ee9aaea02f	i422torgb565 is asm for clangcl as well Merge branch 'master' of https://chromium.googlesource.com/libyuv/libyuv into convertcl allow lto for llvm but not gcc R=harryjin@google.com BUG=libyuv:469 Review URL: https://webrtc-codereview.appspot.com/52769004.	2015-08-19 10:46:30 -07:00
Frank Barchard	94d4269936	clang use scalewin R=harryjin@google.com TBR=harryjin@google.com BUG=libyuv:469 Review URL: https://webrtc-codereview.appspot.com/51329004.	2015-08-18 14:50:27 -07:00
Frank Barchard	cda9d38a4e	xmmword cast for clang clangcl use compare_win for 32 bit, allowing fallback and enabling avx2 code for clang. move defines/protos to compare_row.h fix issue with odd width ARGBCopyAlpha functions by copying destination to temp buffer, then doing alpha copy, then copy back to destination. R=harryjin@google.com TBR=harryjin@google.com BUG=libyuv:484 Review URL: https://webrtc-codereview.appspot.com/59379004.	2015-08-18 11:13:12 -07:00
Frank Barchard	baf6a3c1bd	Using the visual C source allows clangcl to fallback seamlessly to visual c, and supports SSE41 and AVX2 versions. R=harryjin@google.com BUG=libyuv:469 Review URL: https://webrtc-codereview.appspot.com/58469004.	2015-08-17 10:47:43 -07:00
Frank Barchard	278d88f872	Copy Alpha odd width support R=harryjin@google.com BUG=none Review URL: https://webrtc-codereview.appspot.com/59369004.	2015-08-13 15:05:14 -07:00
Frank Barchard	8e7a62f22a	I420AlphaToARGB conversion for planar YUV with Alpha to ARGB. R=brucedawson@chromium.org, harryjin@google.com BUG=libyuv:473 Review URL: https://webrtc-codereview.appspot.com/54829004.	2015-08-12 17:01:24 -07:00
Frank Barchard	58f0020137	use visual c 32 bit code for clangcl R=harryjin@google.com BUG=libyuv:483 Review URL: https://webrtc-codereview.appspot.com/54819004.	2015-08-11 10:10:45 -07:00
Frank Barchard	9425c4b01a	rotate nv12 any width BUG=libyuv:464 R=harryjin@google.com Review URL: https://webrtc-codereview.appspot.com/55709004.	2015-08-07 23:48:38 -07:00
Frank Barchard	1f461f73d8	remove align directives R=harryjin@google.com BUG=none Review URL: https://webrtc-codereview.appspot.com/54809004.	2015-08-04 17:00:03 -07:00
Frank Barchard	6e7ef3fddc	allow xgetbv to be disabled for drmemory testing R=harryjin@google.com BUG=none Review URL: https://webrtc-codereview.appspot.com/56649004.	2015-08-04 15:00:39 -07:00
Frank Barchard	e40384b6d9	remove 32 bit gcc version of UV transpose TBR=harryjin@google.com BUG=libyuv:481 Review URL: https://webrtc-codereview.appspot.com/52249004.	2015-08-03 18:03:55 -07:00
Frank Barchard	f14c433916	rotate macros used for source R=brucedawson@chromium.org, harryjin@google.com BUG=libyuv:481 Review URL: https://webrtc-codereview.appspot.com/52239004.	2015-08-03 16:12:18 -07:00
Frank Barchard	7cd7f5a80f	avx ifdef for scale HAS_SCALEADDROW_AVX2. R=jzern@google.com BUG=libyuv:480 Review URL: https://webrtc-codereview.appspot.com/53779004.	2015-07-31 17:17:14 -07:00
Frank Barchard	f242a4a1a1	ValidateJpeg check for valid pointer and size R=harryjin@google.com BUG=chromium:497297 Review URL: https://webrtc-codereview.appspot.com/57649004.	2015-07-30 15:49:48 -07:00
Frank Barchard	93464b926c	Add rotate any support. Fix for sobel for neon which does 16 at a time, not 8. Disable scaling color test that fails on arm. Test is not complete. R=harryjin@google.com BUG=libyuv:479 Review URL: https://webrtc-codereview.appspot.com/52229004.	2015-07-28 15:06:20 -07:00
Frank Barchard	45230390ff	add support for odd width rotate R=harryjin@google.com BUG=libyuv:464 Review URL: https://webrtc-codereview.appspot.com/52219004.	2015-07-28 14:30:07 -07:00
Frank Barchard	cb54e8b69a	rename rotate macros and functions to match BUG=libyuv:477 R=harryjin@google.com Review URL: https://webrtc-codereview.appspot.com/52199004.	2015-07-27 17:00:41 -07:00
Frank Barchard	18a9027ad9	const warning fix on dither, bump chromium deps and add files to ignore list generated by arm build BUG=none R=harryjin@google.com Review URL: https://webrtc-codereview.appspot.com/57639004.	2015-07-27 11:47:01 -07:00
Frank Barchard	2fa4f5a3ea	Adds files and functions for rotate any, but does not hook them up to the caller. rotate any R=harryjin@google.com BUG=libyuv:464 Review URL: https://webrtc-codereview.appspot.com/53769004.	2015-07-27 10:32:08 -07:00
Frank Barchard	3a3a89ccd4	rotate include and proto cleanup R=harryjin@google.com BUG=libyuv:468 Review URL: https://webrtc-codereview.appspot.com/55679005.	2015-07-22 18:09:04 -07:00
Frank Barchard	5be90d23ee	rotate row included R=tpsiaki@google.com BUG=libyuv:468 Review URL: https://webrtc-codereview.appspot.com/55679004.	2015-07-22 17:10:08 -07:00
Frank Barchard	892807d860	move asm out of rotate into win/gcc and header R=harryjin@google.com BUG=libyuv:468 Review URL: https://webrtc-codereview.appspot.com/51319004.	2015-07-22 11:22:55 -07:00
Frank Barchard	ce98129951	yuy2tonv12 R=bcornell@google.com BUG=libyuv:466 Review URL: https://webrtc-codereview.appspot.com/51309004.	2015-07-17 16:22:59 -07:00
Frank Barchard	faa4b14f85	uyvy to nv12 R=harryjin@google.com BUG=libyuv:466 Review URL: https://webrtc-codereview.appspot.com/50339004.	2015-07-17 14:43:19 -07:00
Frank Barchard	faebf89ce0	src_uv typo fix R=harryjin@google.com BUG=none Review URL: https://webrtc-codereview.appspot.com/51299004.	2015-07-15 18:21:06 -07:00
Frank Barchard	3d190ee9f1	break rotate into files by cpu in preparation for optimization. R=bcornell@google.com BUG=libyuv:464 Review URL: https://webrtc-codereview.appspot.com/51289004.	2015-07-14 10:23:10 -07:00
Frank Barchard	673fe7a684	create rotate_row header R=tpsiaki@google.com, tpsiaki BUG=none TESTED=local build still works. Review URL: https://webrtc-codereview.appspot.com/50329004.	2015-07-09 14:40:35 -07:00
Frank Barchard	0e83b64e88	scalerow avx2 bug fix. was using ymm2 instead of ymm3. R=harryjin@google.com BUG=libyuv:462 Review URL: https://webrtc-codereview.appspot.com/56639004.	2015-07-07 17:48:04 -07:00
Frank Barchard	715a29195b	vpermq for avx2 ARGB4444ToARGB, ARGB1555ToARGB and RGB565ToARGB R=harryjin@google.com BUG=libyuv:462 Review URL: https://webrtc-codereview.appspot.com/52759004.	2015-07-07 17:06:04 -07:00
Frank Barchard	97b35daf75	disable faulty avx2 in argb conversions and box filter. and extend temporary buffer to 128 for an avx2 any function. R=harryjin@google.com BUG=libyuv:462 TESTED=libyuv_unittest run on haswell laptop Review URL: https://webrtc-codereview.appspot.com/53759004.	2015-07-07 15:40:24 -07:00
Frank Barchard	0737ff5bd0	128 for avx2 R=harryjin@google.com BUG=libyuv:461 Review URL: https://webrtc-codereview.appspot.com/55649004.	2015-07-04 09:13:20 -07:00
Frank Barchard	9487b9d6d8	any allow for avx2 32 pixels at a time of argb R=harryjin@google.com BUG=libyuv:461 Review URL: https://webrtc-codereview.appspot.com/54779004.	2015-07-01 17:50:48 -07:00
Frank Barchard	82180e8296	rgb24toyuv use 1 or 2 steps consistently. R=bcornell@google.com, impjdi@google.com BUG=libyuv:459 Review URL: https://webrtc-codereview.appspot.com/52149004.	2015-06-29 16:51:05 -07:00
Frank Barchard	0686f26938	blend remove alignment 1 pixel loop for less overhead. R=tpsiaki@google.com BUG=none TESTED=libyuvTest.ARGBBlend_Opt Review URL: https://webrtc-codereview.appspot.com/50289005.	2015-06-24 11:34:12 -07:00
Frank Barchard	553c7f85f1	mirror odd width with simd R=harryjin@google.com BUG=libyuv:448 Review URL: https://webrtc-codereview.appspot.com/54769004.	2015-06-23 17:53:02 -07:00
Frank Barchard	6a9ef1ea36	any 1 to 2 with stride use SIMD R=harryjin@google.com BUG=libyuv:448 Review URL: https://webrtc-codereview.appspot.com/54759004.	2015-06-23 17:08:08 -07:00
Frank Barchard	6dde4f14bd	argb to uv read 4 not 8 R=harryjin@google.com BUG=libyuv:457 Review URL: https://webrtc-codereview.appspot.com/52139004.	2015-06-23 14:48:37 -07:00
Frank Barchard	54100b91c1	copy 2 rows for interpolate and use SIMD. R=harryjin@google.com BUG=libyuv:448 Review URL: https://webrtc-codereview.appspot.com/50279004.	2015-06-23 10:41:46 -07:00
Frank Barchard	3b5d726a4f	1 to 1 any functions with a parameter use memcpy. R=harryjin@google.com BUG=libyuv:448 Review URL: https://webrtc-codereview.appspot.com/57619004.	2015-06-22 15:08:20 -07:00
Frank Barchard	a0fca88b1d	remove fmemcpy and bump version R=harryjin@google.com BUG=libyuv:448 Review URL: https://webrtc-codereview.appspot.com/50269004.	2015-06-19 17:58:17 -07:00
Frank Barchard	722e87f19f	string.h for memcpy R=harryjin BUG=libyuv:448 Review URL: https://webrtc-codereview.appspot.com/57609004.	2015-06-19 16:40:22 -07:00
Frank Barchard	dfb2120a42	set us simd R=harryjin@google.com BUG=libyuv:448 Review URL: https://webrtc-codereview.appspot.com/55629004.	2015-06-19 14:18:48 -07:00
Frank Barchard	6608c100e2	copy last 4 R=harryjin@google.com BUG=libyuv:448 Review URL: https://webrtc-codereview.appspot.com/54749004.	2015-06-18 17:40:19 -07:00
Frank Barchard	a209d7314b	simd for 1 to 1 R=harryjin@google.com, harryjin BUG=448 Review URL: https://webrtc-codereview.appspot.com/55619004.	2015-06-17 18:22:11 -07:00
Frank Barchard	72a235af9f	repeat y for yuy2 so that unittests that check the 2nd y on odd widths will match the C and SIMD. The C code duplicates the last Y. R=harryjin@google.com BUG=libyuv:455 Review URL: https://webrtc-codereview.appspot.com/50249004.	2015-06-16 16:27:15 -07:00
Frank Barchard	44ff3c333d	split share macro R=harryjin@google.com BUG=libyuv:448 Review URL: https://webrtc-codereview.appspot.com/55609004.	2015-06-16 12:44:15 -07:00
Frank Barchard	2edfe0f0c6	merge R=harryjin@google.com BUG=libyuv:448 Review URL: https://webrtc-codereview.appspot.com/52119004.	2015-06-16 12:17:53 -07:00
Frank Barchard	bff1e18e51	share functions in any R=harryjin@google.com BUG=libyuv:448 Review URL: https://webrtc-codereview.appspot.com/57599004.	2015-06-16 12:05:39 -07:00
Frank Barchard	0b3294af6c	disable I422ToYUY2 sse for odd sizes. BUG=455 R=harryjin@google.com Review URL: https://webrtc-codereview.appspot.com/51239004.	2015-06-16 11:09:03 -07:00
Frank Barchard	68e8d9bebd	Math functions need BPP of 4 for odd width support on first source argument BUG=455 TESTED=ARGBMultply R=harryjin@google.com Review URL: https://webrtc-codereview.appspot.com/54719004.	2015-06-16 09:34:51 -07:00
Frank Barchard	b071a3d321	subsample yuy2 dest BUG=455 TESTED=out\release\libyuv_unittest.exe --gtest_catch_exceptions=0 --gtest_filter=ARGBToYUY2 R=harryjin@google.com Review URL: https://webrtc-codereview.appspot.com/58429004.	2015-06-15 12:01:28 -07:00
Frank Barchard	58ca9f899e	remainder done unconditionally and with a variable BUG=448 TESTED=local build R=harryjin@google.com Review URL: https://webrtc-codereview.appspot.com/57559004.	2015-06-12 17:21:41 -07:00
Frank Barchard	242cb2554c	nv12 odd width support using SIMD for remainder BUG=libyuv:448 TESTED=NV21ToRGB565_Any etc R=harryjin@google.com Review URL: https://webrtc-codereview.appspot.com/53689004.	2015-06-12 16:07:20 -07:00
Frank Barchard	cae07fb0e0	bump subsampling up BUG=455 TESTED=libyuvTest.ARGBToYUY2_Random R=harryjin@google.com Review URL: https://webrtc-codereview.appspot.com/58419004.	2015-06-12 15:25:03 -07:00
Frank Barchard	03da5420bc	use SIMD for I420ToARGB odd widths in a temporary buffer instead of using C for remainder. Currently the C code does not exactly match the SIMD code, so an odd width produces different pixels than an even width, causing a subtle artifact. By using SIMD consistently, there is no difference in even and odd widths. Also the SIMD performance is faster, so even with overhead of memcpy, performance improves. BUG=447 TESTED=out\release\libyuv_unittest.exe --gtest_filter=I420ToARGB R=harryjin@google.com Review URL: https://webrtc-codereview.appspot.com/55579004.	2015-06-11 16:38:52 -07:00
Frank Barchard	ee351bc2d5	check height is non-zero BUG=none TESTED=libyuv unittest with even width R=bcornell@google.com Review URL: https://webrtc-codereview.appspot.com/51219004.	2015-06-11 16:35:20 -07:00
fbarchard@google.com	2e9f3e5cf5	rename source files from row_posix.cc etc to row_gcc.cc to avoid gyp build filtering out source files from build when on windows with clang. The source code contained in row_gcc.cc is gcc syntax inline assembly available for any platform that supports gcc or clang for intel cpus. BUG=440 TESTED=try bots R=harryjin@google.com Review URL: https://webrtc-codereview.appspot.com/56579004 git-svn-id: http://libyuv.googlecode.com/svn/trunk@1430 16f28f9a-4ce2-e073-06de-1de4eb20be90	2015-06-09 17:27:52 +00:00
fbarchard@google.com	05416e2d9a	Box filter for YUV uses rows with accumulation buffer for better memory behavior. The old code would do columns accumulated into registers, and then store the result once. This was slow from a memory point of view. The new code does a row of source at a time, updating an accumulation buffer every row. The accumulation buffer is small, and should fit in cache. Before each accumulation of N rows, the buffer needs to be reset to zero. If the memset is a bottleneck, it would be faster to do the first row without an add, storing to the accumulation buffer, and then add for the remaining rows. BUG=425 TESTED=out\release\libyuv_unittest --gtest_filter=ScaleTo1x1 R=harryjin@google.com Review URL: https://webrtc-codereview.appspot.com/52659004 git-svn-id: http://libyuv.googlecode.com/svn/trunk@1428 16f28f9a-4ce2-e073-06de-1de4eb20be90	2015-06-09 01:05:18 +00:00
fbarchard@google.com	b07de879b6	enable intrinsics for clangcl if -mssse3 is enabled. BUG=451 TESTED=untested R=harryjin@google.com Review URL: https://webrtc-codereview.appspot.com/52699004 git-svn-id: http://libyuv.googlecode.com/svn/trunk@1427 16f28f9a-4ce2-e073-06de-1de4eb20be90	2015-06-08 22:48:18 +00:00
fbarchard@google.com	bd2d903e1b	odd width support for ARGBSobel functions. Improves performance for images that are not a multiple of 8 pixels. BUG=444 TESTED=libyuvTest.ARGBSobel_Opt R=harryjin@google.com Review URL: https://webrtc-codereview.appspot.com/54589004 git-svn-id: http://libyuv.googlecode.com/svn/trunk@1415 16f28f9a-4ce2-e073-06de-1de4eb20be90	2015-05-28 22:22:28 +00:00

1490 Commits