libyuv

mirror of https://chromium.googlesource.com/libyuv/libyuv synced 2025-12-06 16:56:55 +08:00

Author	SHA1	Message	Date
Frank Barchard	0f98c3c1df	Add ARGBToAR30Row_SSE2 to speed up H010ToAR30 Port ARGBToAR30Row_AVX2 to ARGBToAR30Row_SSE2 using same instructions but xmm registers and doing half as many pixels per loop. Bug: libyuv:751 Test: LibYUVConvertTest.ARGBToAR30_Opt Change-Id: Id644e54639133d1caf28ea3cd11ff6ab6891a673 Reviewed-on: https://chromium-review.googlesource.com/817918 Commit-Queue: Frank Barchard <fbarchard@chromium.org> Reviewed-by: richard winterton <rrwinterton@gmail.com>	2017-12-09 00:11:20 +00:00
Frank Barchard	324fa32739	Convert16To8Row_SSSE3 port from AVX2 H010ToAR30 uses Convert16To8Row_SSSE3 to convert 10 bit YUV to 8 bit. Then standard YUV conversion can be used. This improves performance on low end CPUs. Future CL will bypass this conversion allowing for 10 bit YUV source, but the function will be useful as a utility for YUV conversions. Bug: libyuv:559, libyuv:751 Test: out/Release/libyuv_unittest --gtest_filter=H010ToAR30 --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=999 --libyuv_flags=-1 --libyuv_cpu_info=-1 Change-Id: I9b3ef22d88a5fd861de4cf1900b4c6e8fd24d0af Reviewed-on: https://chromium-review.googlesource.com/792334 Commit-Queue: Frank Barchard <fbarchard@chromium.org> Reviewed-by: Frank Barchard <fbarchard@chromium.org>	2017-11-28 19:22:39 +00:00
Lei Zhang	8445617191	Mark a bunch of kArray variables as const. This allows the linker to move the variables from the .data section to the .rodata section. Bug: libyuv:254 Test: out/Release/libyuv_unittest --gtest_filter=* --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=999 --libyuv_flags=-1 --libyuv_cpu_info=-1 Change-Id: I6998570f1af4337d7b80313d9e18e36aa20d6ec0 Reviewed-on: https://chromium-review.googlesource.com/777033 Commit-Queue: Frank Barchard <fbarchard@chromium.org> Reviewed-by: Frank Barchard <fbarchard@chromium.org>	2017-11-27 23:38:44 +00:00
Frank Barchard	26173eb73e	H010ToAR30 for 10 bit bt.709 YUV to 30 bit RGB This version of the H010ToAR30 provides a 3 step conversion Convert16To8Row_AVX2 H420ToARGB_AVX2 ARGBToAR30_AVX2 Low level function added to convert 16 bit to 8 bit using multiply to adjust 10 bit or other bit depths and then save the upper 16 bits. Bug: libyuv:751 Test: LibYUVPlanarTest.Convert16To8Row_Opt unittest added Change-Id: I9cc576fda8afa1003cb961d03e0e656e0b478f03 Reviewed-on: https://chromium-review.googlesource.com/783554 Commit-Queue: Frank Barchard <fbarchard@chromium.org> Reviewed-by: richard winterton <rrwinterton@gmail.com>	2017-11-22 23:58:30 +00:00
Frank Barchard	a98d6cdb17	ARGBToAR30 AVX2 conversion function Bug: libyuv:751 Test: LibYUVConvertTest.ARGBToAR30_Opt Change-Id: I09c13eb53ba5f1ce1740c013dc587f8300f1d9e0 Reviewed-on: https://chromium-review.googlesource.com/780437 Commit-Queue: Frank Barchard <fbarchard@chromium.org> Reviewed-by: richard winterton <rrwinterton@gmail.com>	2017-11-21 20:37:01 +00:00
Frank Barchard	46594be758	add ScalePlane_16 unit tests Tests ScalePlane vs ScalePlane_16 match. Bug: libyuv:749 Test: LibYUVScaleTest.ScalePlaneDownBy4_Box_16 Change-Id: I3f71748da404982d5d48bfb11bbd3ae95a1d021c Reviewed-on: https://chromium-review.googlesource.com/765045 Reviewed-by: Frank Barchard <fbarchard@google.com> Reviewed-by: richard winterton <rrwinterton@gmail.com> Reviewed-by: Weiyong Yao <braveyao@chromium.org>	2017-11-16 01:40:48 +00:00
Frank Barchard	49d1e3b036	MultiplyRow_16_AVX2 for converting 10 bit YUV When converting from lsb 10 bit formats to msb, the values need to be shifted to the top 10 bits. Using a multiply allows the different numbers of bits to be copied: // 128 = 9 bits // 64 = 10 bits // 16 = 12 bits // 1 = 16 bits Bug: libyuv:751 Test: LibYUVPlanarTest.MultiplyRow_16_Opt Change-Id: I9cf226053a164baa14155215cb175065b1c4f169 Reviewed-on: https://chromium-review.googlesource.com/762951 Reviewed-by: richard winterton <rrwinterton@gmail.com> Reviewed-by: Frank Barchard <fbarchard@google.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-11-10 22:02:32 +00:00
Frank Barchard	2f58d126b9	MergeUV10Row_AVX2 use multiply to handle different bit depths Instead of hardcoded shift, use a multiply by a parameter. 128 = 9 bits 64 = 10 bits 16 = 12 bits 1 = 16 bits Bug: libyuv:751 Test: LibYUVPlanarTest.MergeUV10Row_Opt Change-Id: Id925edfdbf91243370c90641b50eb8e7625ec329 Reviewed-on: https://chromium-review.googlesource.com/762523 Reviewed-by: richard winterton <rrwinterton@gmail.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-11-10 03:38:07 +00:00
Frank Barchard	e26b0a7e0e	casting for c89 compatibility and lint cleanup Bug: libyuv:756 Test: CFLAGS="-m32 -static -std=gnu89 -mno-sse -O2" CXXFLAGS="-m32 -x c -static -std=gnu99 -mno-sse -O2" make -f linux.mk libyuv.a Change-Id: Ic362f93e01ccbb0bea14f361a58585e79297e7d2 Reviewed-on: https://chromium-review.googlesource.com/759423 Reviewed-by: Frank Barchard <fbarchard@google.com> Reviewed-by: Patrik Höglund <phoglund@chromium.org> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-11-09 18:22:17 +00:00
Frank Barchard	735ace2ed3	Re-enable x86 assembly without requiring -msse2 clang does not require -msse2 or -msse for inline, except the "x" parameter. So change this to "m" for 32 bit. 64 bit requires sse2 so use "x" for 64 bit. gcc requires -msse for xmm registers in clobber list. Reduce compiler requirement from -msse2 to -msse for enabling assembly. Bug: libyuv:754, libyuv:757 Test: CC=clang CXX=clang++ CFLAGS="-m32" CXXFLAGS="-m32 -mno-sse -O2" make -f linux.mk Change-Id: I86df72cfee80b7d349561c1fd7c97ad360767255 Reviewed-on: https://chromium-review.googlesource.com/759303 Reviewed-by: richard winterton <rrwinterton@gmail.com> Reviewed-by: Frank Barchard <fbarchard@google.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-11-09 00:51:06 +00:00
Frank Barchard	d997ac287d	Revert "Enable SSE2 code without -msse" This reverts commit 01e994d74e4e3937ee1a3efdc048320a1e51f818. Change-Id: Ie76710d0f4e641e071889c5125fd3be23cdcdb59 Reviewed-on: https://chromium-review.googlesource.com/758499 Reviewed-by: Frank Barchard <fbarchard@google.com>	2017-11-08 19:33:09 +00:00
Frank Barchard	01e994d74e	Enable SSE2 code without -msse Bug: libyuv:754 Test: CC=clang CXX=clang++ CFLAGS="-m32" CXXFLAGS="-m32 -mno-sse -O2" make -f linux.mk Change-Id: I74bf8d032013694e65ea7637bc38d3253db53ff2 Reviewed-on: https://chromium-review.googlesource.com/758043 Reviewed-by: Frank Barchard <fbarchard@google.com>	2017-11-08 02:54:41 +00:00
Frank Barchard	a0c32b9e49	MergeUV10Row_AVX2 for converting H010 to P010 H010 is 10 bit planar format with 10 bits in lower bits. P010 is 10 bit biplanar format with 10 bits in upper bits. This function weaves the U and V channels and shifts the bits into the upper bits. Bug: libyuv:751 Test: LibYUVPlanarTest.MergeUV10Row_Opt Change-Id: I4a0bac0ef1ff95aa1b8d68261ec8e8e86f2d1fbf Reviewed-on: https://chromium-review.googlesource.com/752692 Reviewed-by: Cheng Wang <wangcheng@google.com> Reviewed-by: Frank Barchard <fbarchard@google.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-11-03 18:55:36 +00:00
Frank Barchard	1e16cb5c38	SplitRGBPlane and MergeRGBPlane functions added Converts packed RGB to planar and back. TBR=kjellander@chromium.org BUG=libyuv:728 TEST=MergeRGBPlane_Opt and SplitRGBPlane_Opt unittests added Change-Id: Ida59af940afcb1fc4a48bbf62c714f592665c3cc Reviewed-on: https://chromium-review.googlesource.com/658069 Reviewed-by: Frank Barchard <fbarchard@google.com> Reviewed-by: Cheng Wang <wangcheng@google.com>	2017-09-11 21:02:04 +00:00
Frank Barchard	6825b161d7	HalfFloat SSE2/AVX2 optimized port scheduling. Uses 1 add instead of 2 leas to reduce port pressure on ports 1 and 5 used for SIMD instructions. BUG=libyuv:670 TEST=~/iaca-lin64/bin/iaca.sh -arch HSW out/Release/obj/libyuv/row_gcc.o Change-Id: I3965ee5dcb49941a535efa611b5988d977f5b65c Reviewed-on: https://chromium-review.googlesource.com/433391 Reviewed-by: Frank Barchard <fbarchard@google.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-02-11 01:02:06 +00:00
Frank Barchard	76e7f104ae	documentation updates BUG=None TEST=Untested Change-Id: I8ab95654255d1aa9cf05a664ecf59ee6c0757e66 Reviewed-on: https://chromium-review.googlesource.com/434941 Reviewed-by: Henrik Kjellander <kjellander@chromium.org> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-02-02 18:31:32 +00:00
Frank Barchard	749e316ed8	Remove commented out code TEST=None BUG=libyuv:672 Change-Id: Ia5949fb20913e4397e62d6a302c89a27dbd7e169 Reviewed-on: https://chromium-review.googlesource.com/430321 Reviewed-by: Aaron Gable <agable@chromium.org>	2017-01-20 02:03:12 +00:00
Frank Barchard	a7c87e19f0	add Intel Code Analyst markers add macros to enable/disable code analyst around blocks of code. Normally these macros should not be used, but if performance details are wanted for intel code, enable them around the code and then run via the iaca tool, available on the intel website. BUG=libyuv:670 TEST=~/iaca-lin64/bin/iaca.sh -64 out/Release/libyuv_unittest R=wangcheng@google.com Review-Url: https://codereview.chromium.org/2626193002 .	2017-01-13 15:50:24 -08:00
Frank Barchard	3028e1bd97	clang-format row_gcc.cc with some functions disabled BUG=libyuv:654 TEST=try bots build R=kjellander@chromium.org Review URL: https://codereview.chromium.org/2484083003 .	2016-11-07 18:37:29 -08:00
Frank Barchard	e62309f259	clang-format libyuv BUG=libyuv:654 R=kjellander@chromium.org Review URL: https://codereview.chromium.org/2469353005 .	2016-11-07 17:37:23 -08:00
Frank Barchard	c2073823b4	use __OPTIMIZE__ macro to determine debug vs release. Debug builds of x86 gcc/clang can run out of register. Previously NDEBUG or _DEBUG was used to detect a debug build. But those macros are not set by gentoo builds. This CL switches to the compiler predefine __OPTIMIZE__ which is built into clang and gcc. BUG=libyuv:602 TEST=untested R=wangcheng@google.com Review URL: https://codereview.chromium.org/2451503002 .	2016-10-24 18:02:48 -07:00
Frank Barchard	451af5e922	scale by 1 for neon implemented void HalfFloat1Row_NEON(const uint16* src, uint16* dst, float, int width) { asm volatile ( "1: \n" MEMACCESS(0) "ld1 {v1.16b}, [%0], #16 \n" // load 8 shorts "subs %w2, %w2, #8 \n" // 8 pixels per loop "uxtl v2.4s, v1.4h \n" // 8 int's "uxtl2 v1.4s, v1.8h \n" "scvtf v2.4s, v2.4s \n" // 8 floats "scvtf v1.4s, v1.4s \n" "fcvtn v4.4h, v2.4s \n" // 8 floats "fcvtn2 v4.8h, v1.4s \n" MEMACCESS(1) "st1 {v4.16b}, [%1], #16 \n" // store 8 shorts "b.gt 1b \n" : "+r"(src), // %0 "+r"(dst), // %1 "+r"(width) // %2 : : "cc", "memory", "v1", "v2", "v4" ); } void HalfFloatRow_NEON(const uint16* src, uint16* dst, float scale, int width) { asm volatile ( "1: \n" MEMACCESS(0) "ld1 {v1.16b}, [%0], #16 \n" // load 8 shorts "subs %w2, %w2, #8 \n" // 8 pixels per loop "uxtl v2.4s, v1.4h \n" // 8 int's "uxtl2 v1.4s, v1.8h \n" "scvtf v2.4s, v2.4s \n" // 8 floats "scvtf v1.4s, v1.4s \n" "fmul v2.4s, v2.4s, %3.s[0] \n" // adjust exponent "fmul v1.4s, v1.4s, %3.s[0] \n" "uqshrn v4.4h, v2.4s, #13 \n" // isolate halffloat "uqshrn2 v4.8h, v1.4s, #13 \n" MEMACCESS(1) "st1 {v4.16b}, [%1], #16 \n" // store 8 shorts "b.gt 1b \n" : "+r"(src), // %0 "+r"(dst), // %1 "+r"(width) // %2 : "w"(scale * 1.9259299444e-34f) // %3 : "cc", "memory", "v1", "v2", "v4" ); } TEST=LibYUVPlanarTest.TestHalfFloatPlane_One BUG=libyuv:560 R=hubbe@chromium.org Review URL: https://codereview.chromium.org/2430313008 .	2016-10-21 14:30:03 -07:00
Frank Barchard	550cf829fb	HalfFloat avx2 unpack bug fix. AVX unpack parameters were reverse ordered causing incorrect results on AVX2 hardware. TEST=/usr/local/google/home/fbarchard/intelsde/sde -skx -- out/Release/libyuv_unittest --gtest_filter=Half BUG=libyuv:560 R=wangcheng@google.com Review URL: https://codereview.chromium.org/2438893002 .	2016-10-20 15:49:00 -07:00
Frank Barchard	2d80fc3133	Port HalfFloatRow_SSE2 to AVX2 but not using F16C. R=wangcheng@google.com, hubbe@chromium.org BUG=libyuv:560 Review URL: https://codereview.chromium.org/2421993002 .	2016-10-14 19:01:41 -07:00
Frank Barchard	a5e93766a2	Add ARGBExtractAlpha_AVX2 function Port SSE2 version to AVX2. BUG=libyuv:572 TEST=/usr/local/google/home/fbarchard/intelsde/sde -skx -- out/Release/libyuv_unittest --gtest_filter=Extract R=wangcheng@google.com, magjed@chromium.org Review URL: https://codereview.chromium.org/2420553002 .	2016-10-13 16:03:43 -07:00
Frank Barchard	d363ea6527	Remove I411 support. YUV 411 is very uncommon format. Remove support. Update documentation to reflect that 411 is deprecated. Simplify tests for YUV to only test with the new side by side YUV but keep old 3 plane test around with a macro for now. BUG=libyuv:645 R=kjellander@chromium.org Review URL: https://codereview.chromium.org/2406123002 .	2016-10-11 11:14:16 -07:00
Frank Barchard	aa197ee1a3	HalfFloat_SSE2 for Visual C Low level support for 12 bit 420, 422 and 444 YUV video frame conversion. BUG=libyuv:560, chromium:445071 TEST=LibYUVPlanarTest.TestHalfFloatPlane on windows R=hubbe@chromium.org, wangcheng@google.com Review URL: https://codereview.chromium.org/2387713002 .	2016-10-03 10:33:38 -07:00
Frank Barchard	4a14cb2e81	HalfFloat_SSE2 port from C algorithm to SSE2 Low level support for 12 bit 420, 422 and 444 YUV video frame conversion. BUG=libyuv:560, chromium:445071 TEST=untested R=hubbe@chromium.org Review URL: https://codereview.chromium.org/2381493006 .	2016-09-30 09:47:16 -07:00
Frank Barchard	7fc932ddd3	Add low level support for 12 bit 420, 422 and 444 YUV video frame conversion. BUG=libyuv:560,chromium:445071 TEST=untested R=hubbe@chromium.org Review URL: https://codereview.chromium.org/2371293002 .	2016-09-29 15:06:30 -07:00
Frank Barchard	fd3e676e91	android_full_debug x86 fix - use +rm for width count Work around for android full debug build running out of registers. 5 functions were running out of registers causing the compiler error error: 'asm' operand has impossible constraints These functions mostly have 4 pointers, a counter (width) and a temporary eax register. With fpic and debug using stackframes, 2 registers are unavailable. So a total of 8 registers are used. Although fpic and stack frame don't apply to assembly, the compiler reserves 2 registers. The optimized version builds, so it's likely freeing up the registers once it knows they are not used. These functions used to build, so compile options and/or compiler may have updated; likely fpic was turned on. An attribute can be done to disable each, and will avoid using the 2 GPR registers, but they are still reserved and unavailable in debug builds on current compilers (gcc 4.9 and clang 3.8). R=dhrosa@google.com BUG=libyuv:602 Review URL: https://codereview.chromium.org/2066933002 .	2016-06-14 15:25:28 -07:00
Magnus Jedvert	942db3016a	Add ARGBExtractAlpha function BUG=libyuv:572 R=fbarchard@google.com Review URL: https://codereview.chromium.org/1995293002 .	2016-05-26 10:30:57 +02:00
Frank Barchard	cf101116c9	Remove initialize to zero on output variables for inline. Inline that uses temporary variables is currently initializing them to 0 and passing in as output "+r". This CL replaces the output constraint to "=&r" for most meaning an output with early write (before inputs). This allows the initialize to zero step to be removed, saving 1 instruction. BUG=libyuv:580 TESTED=local libyuv build on gcc/linux and try bots R=harryjin@google.com Review URL: https://codereview.chromium.org/1895743008 .	2016-04-18 16:24:26 -07:00
Frank Barchard	22e062a448	Port ARGBToJ420 to AVX2 ARGBToJ420 had an SSSE3 version, but not AVX2. ARGBToI420 had an AVX2, so adapt that code to J420. TBR=harryjin@google.com BUG=libyuv:553 Review URL: https://codereview.chromium.org/1702373004 .	2016-02-17 23:16:39 -08:00
Frank Barchard	cc33dc68c7	Port I411ToARGBRow to AVX2. An SSSE3 version already exists, and an AVX2 version is available for Visual C. This ports the function to AVX2 completing the AVX2 ports of all YUV to RGB functions for AVX2 on gcc. TBR=harryjin@google.com BUG=libyuv:555 Review URL: https://codereview.chromium.org/1687253002 .	2016-02-12 10:26:10 -08:00
Frank Barchard	9e39c1f271	ubsan overflow fix for multiply by 0x01010101 This is an UBSan error reported by libjingle [ RUN ] WebRtcVideoFrameTest.ConvertToYUY2BufferStride [000:000] (videoframe.cc:375): Validate frame passed. format: I420 bpp: 12 size: 1280x720 bytes: 1382400 expected: 1382400 sample[0..3]: 73, 73, 73, 73 ../../chromium/src/third_party/libyuv/source/row_gcc.cc:2903:25: runtime error: signed integer overflow: 128 * 16843009 cannot be represented in type 'int' [8/614] WebRtcVideoFrameTest.ConvertToYUY2BufferStride returned/aborted with exit code 1 (32 ms) [9/614] WebRtcVideoFrameTest.ConvertToYUY2BufferInverted (29 ms) Note: Google Test filter = WebRtcVideoFrameTest.ConvertToYUY2BufferInverted The source is uint8 and the multiply is by 0x01010101 to replicate the byte to 4 bytes. Changing the constant to 0x01010101u should avoid overflow. R=harryjin@google.com TBR=harryjin@google.com BUG=libyuv:563 Review URL: https://codereview.chromium.org/1657533005 .	2016-02-01 12:29:04 -08:00
Frank Barchard	081475b3c8	refactor ARGBToI422 using ARGBToI420 internally R=harryjin@google.com BUG=libyuv:546 Review URL: https://codereview.chromium.org/1574253004 .	2016-01-12 17:05:49 -08:00
Frank Barchard	23c6a83561	Fix ifdef mismatch for mirroruv Macro define and macro ifdef didn't match, leading to C code being used. Make macro match function name. TBR=harryjin@google.com BUG=libyuv:543 Review URL: https://codereview.chromium.org/1579023002 .	2016-01-11 16:33:36 -08:00
Frank Barchard	71deb7ba3a	bug fix - remove shift from InterpolateRow_AVX2 TBR=harryjin@google.com BUG=libyuv:537 Review URL: https://codereview.chromium.org/1547703002 .	2015-12-22 10:28:48 -08:00
Frank Barchard	2cb2e9e1ad	fix for InterpolateRow_AVX2 TBR=harryjin@google.com BUG=libyuv:535 Review URL: https://codereview.chromium.org/1543773002 .	2015-12-21 18:35:12 -08:00
Frank Barchard	3f4d86053e	avx2 interpolate use 8 bit BUG=libyuv:535 R=dhrosa@google.com Review URL: https://codereview.chromium.org/1535833003 .	2015-12-21 10:57:32 -08:00
Frank Barchard	f4447745ae	Add rounding to InterpolateRow for improved quality and consistency. Remove inaccurate specializations for 1/4 and 3/4, since they round incorrectly. Specialize for 100% and 50% are kept due to performance. Make C and ARM code match SSSE3. Make unittests expect zero difference. BUG=libyuv:535 R=harryjin@google.com Review URL: https://codereview.chromium.org/1533643005 .	2015-12-17 15:24:06 -08:00
Frank Barchard	1ccbf8fb7b	use memory for loop counter to work around nearly out of registers TBR=harryjin@google.com BUG=libyuv:533 Review URL: https://codereview.chromium.org/1535433003 .	2015-12-16 17:13:37 -08:00
Frank Barchard	cb44936403	fix typo in avx2 gcc blend. was using wrong register on 32 pixel version. R=harryjin@google.com, dhrosa@google.com BUG=libyuv:527 Review URL: https://codereview.chromium.org/1511433006 .	2015-12-09 10:38:46 -08:00
Frank Barchard	dee77a4ebe	Optimize yuv alpha blend AVX2 code to do 32 pixels at a time. out/Release/libyuv_unittest --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=9999 --libyuv_flags=-1 --gtest_filter=*I420Blend_Opt Was LibYUVPlanarTest.I420Blend_Opt (2335 ms) Now LibYUVPlanarTest.I420Blend_Opt (1937 ms) vs SSSE3 LibYUVPlanarTest.I420Blend_Opt (2599 ms) BUG=libyuv:527 R=dhrosa@google.com Review URL: https://codereview.chromium.org/1505673003 .	2015-12-08 18:20:30 -08:00
Frank Barchard	bea690b3e0	AVX2 YUV alpha blender and improved unittests AVX2 version can process 16 pixels at a time for improved memory bandwidth and fewer instructions. unittests improved to test unaligned memory, and test exactness when alpha is 0 or 255. R=dhrosa@google.com, harryjin@google.com BUG=libyuv:527 Review URL: https://codereview.chromium.org/1505433002 .	2015-12-05 22:23:29 -08:00
Frank Barchard	fa2618ee26	Port BlendPlaneRow_SSSE3 to GCC R=dhrosa@google.com, harryjin@google.com BUG=libyuv:527 Review URL: https://codereview.chromium.org/1490273006 .	2015-12-04 11:19:41 -08:00
Frank Barchard	526558b2d8	disable debug build of 411 to work around compiler bug TBR=harryjin@google.com BUG=libyuv:524 Review URL: https://codereview.chromium.org/1461013002 .	2015-11-19 02:25:00 -08:00
Frank Barchard	b7dfb72559	fix for I411 build error on 32 bit x86 TBR=harrjin@google.com BUG=libyuv:525 Review URL: https://codereview.chromium.org/1461693004 .	2015-11-19 01:45:14 -08:00
Frank Barchard	528356a128	syntax fix for gcc movzwl TBR=harryjin@google.com BUG=libyuv:525 Review URL: https://codereview.chromium.org/1460723003 .	2015-11-18 13:14:15 -08:00
Frank Barchard	50f8cb2db3	port I411 movzx 2 byte reader to gcc previously the I411 format used movd to read U, V pixels. But this reads 4 bytes, and can cause a memory exception. pinsrw can be used, but fails on drmemory 1.5, and is slow. So in this change a movzxw is used to read 2 bytes into EBX, then copy to xmm0 with movd. Slightly slower, but no memory exception Was LibYUVConvertTest.I411ToARGB_Opt (577 ms) Now LibYUVConvertTest.I411ToARGB_Opt (608 ms) TBR=harryjin@google.com BUG=libyuv:525 Review URL: https://codereview.chromium.org/1457783004 .	2015-11-18 13:05:39 -08:00
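
The scale factors listed in the MultiplyRow_16_AVX2 and MergeUV10Row_AVX2 entries (128 = 9 bits, 64 = 10 bits, 16 = 12 bits, 1 = 16 bits) follow the pattern scale = 2^(16 - depth): multiplying an lsb-aligned sample by that value moves its valid bits to the top of the 16-bit word. A minimal scalar sketch of the idea follows. The names `scale_for_depth` and `multiply_row_16` are hypothetical and written only for illustration; this is not the libyuv implementation, which uses AVX2.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: scale factor that moves `depth` valid low bits to the
   top of a 16-bit word. Gives 128 for 9 bits, 64 for 10, 16 for 12, 1 for 16. */
static int scale_for_depth(int depth) {
  assert(depth >= 1 && depth <= 16);
  return 1 << (16 - depth);
}

/* Hypothetical scalar stand-in for a row function: multiply every 16-bit
   sample by `scale` and keep the low 16 bits of the product. */
static void multiply_row_16(const uint16_t* src, uint16_t* dst, int scale,
                            int width) {
  for (int x = 0; x < width; ++x) {
    dst[x] = (uint16_t)(src[x] * scale);
  }
}
```

For 10 bit input, `multiply_row_16(src, dst, scale_for_depth(10), width)` maps the maximum sample 1023 to 0xFFC0. Using a multiply rather than a hardcoded shift lets one routine serve several bit depths by changing only the parameter.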

84 Commits
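
The UBSan entry (9e39c1f271) describes replicating a uint8 into 4 bytes by multiplying by 0x01010101, and the overflow that results when the constant is a signed int. The sketch below illustrates why the `u` suffix matters. The function name `replicate_byte` is hypothetical and is not taken from row_gcc.cc.

```c
#include <stdint.h>

/* Hypothetical helper: copy one byte into all four bytes of a 32-bit word.
   With a signed constant, 128 * 16843009 exceeds INT_MAX, which is undefined
   behavior. The unsigned constant converts the promoted operand to unsigned
   int, so the multiply is well defined. */
static uint32_t replicate_byte(uint8_t v) {
  return v * 0x01010101u;
}
```

With the unsigned constant, an input of 128 yields 0x80808080, the case that triggered the reported signed overflow.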