libyuv

mirror of https://chromium.googlesource.com/libyuv/libyuv synced 2026-02-08 18:56:43 +08:00

Author	SHA1	Message	Date
Frank Barchard	c367751430	ARGBToAR30 SSSE3 use pmulhuw to replicate fields AR30 is optimized with 3 techniques 1. pmulhuw is used to replicate 8 bits to 10 bits. 2. Two channels are processed at a time. R and B, and A and G. 3. pshufb is used to shift and mask 2 channels of R and B Bug: libyuv:751 Test: ARGBToAR30_Opt Change-Id: I4e62d6caa4df7d0ae80395fa911d3c922b6b897b Reviewed-on: https://chromium-review.googlesource.com/822520 Reviewed-by: richard winterton <rrwinterton@gmail.com> Commit-Queue: Frank Barchard <fbarchard@chromium.org>	2017-12-12 20:12:58 +00:00
Mirko Bonadei	d94a4867bf	Using all_dependent_configs to pass libyuv_config around. Using public_configs, client projects must rely on public_deps to propagate configurations up in the build graph. This is bad because public_deps allows the exposition of headers that live in another target. This can lead to a really unhealthy build. On the other side, all_dependent_configs is automatically propagated up in the build graph but if a target includes a libyuv header it is forced by GN to declare the dependency (and this will propagate libyuv_config). Bug: webrtc:8605, webrtc:8603 Change-Id: I4d71bb5de0b5b62a4ec110349223614f0b98e655 No-Try: True Reviewed-on: https://chromium-review.googlesource.com/822112 Commit-Queue: Mirko Bonadei <mbonadei@chromium.org> Reviewed-by: Patrik Höglund <phoglund@chromium.org>	2017-12-12 13:39:42 +00:00
Frank Barchard	11dd1b956f	ARGBToAR30 use vpmulhuw to replicate fields AR30 is optimized with 3 techniques 1. vpmulhuw is used to replicate 8 bits to 10 bits. 2. Two channels are processed at a time. R and B, and A and G. 3. vpshufb is used to shift and mask 2 channels of R and B Red Blue With the 8 bit value in the upper bits, vpmulhuw by (1024+4) will produce a 10 bit value in the low 10 bits of each 16 bit value. This is what's wanted for the blue channel. The red needs to be shifted 4 left, so multiply by (1024+4)*16 for red. Alpha Green Alpha and Green are already in the high bits so vpand can zero out the other bits, keeping just 2 upper bits of alpha and 8 bit green. The same multiplier could be used for Green - (1024+4) putting the 10 bit green in the lsb. Alpha would be a simple multiplier to shift it into position. It wants a gap of 10 above the green. Green is 10 bits, so there are 6 bits in the low short. 4 more are needed, so a multiplier of 4 gets the 2 bits into the upper 16 bits, and then a shift of 4 is a multiply of 16, so (4*16) = 64. Then shift the result left 10 to position the A and G channels. Bug: libyuv:751 Test: ARGBToAR30_Opt Change-Id: Ie4f20dce18203bae7b75acb1fd5232db8a8a4f11 Reviewed-on: https://chromium-review.googlesource.com/820046 Commit-Queue: Frank Barchard <fbarchard@chromium.org> Reviewed-by: Cheng Wang <wangcheng@google.com>	2017-12-12 02:57:54 +00:00
Frank Barchard	0f98c3c1df	Add ARGBToAR30Row_SSE2 to speed up H010ToAR30 Port ARGBToAR30Row_AVX2 to ARGBToAR30Row_SSE2 using same instructions but xmm registers and doing half as many pixels per loop. Bug: libyuv:751 Test: LibYUVConvertTest.ARGBToAR30_Opt Change-Id: Id644e54639133d1caf28ea3cd11ff6ab6891a673 Reviewed-on: https://chromium-review.googlesource.com/817918 Commit-Queue: Frank Barchard <fbarchard@chromium.org> Reviewed-by: richard winterton <rrwinterton@gmail.com>	2017-12-09 00:11:20 +00:00
Frank Barchard	aabe380890	H010ToAR30 and H010ToARGB optimized YUV buffering Reduce allocations of row buffers to 1 alloc/free. Do 2 rows at a time to avoid converting U and V planes twice. Bug: libyuv:715 Test: LibYUVConvertTest.H010ToAR30_Opt Change-Id: I2f3a03b4875df5e3b969112a78a1a0b28399fa2f Reviewed-on: https://chromium-review.googlesource.com/816021 Reviewed-by: Cheng Wang <wangcheng@google.com>	2017-12-08 18:55:03 +00:00
Frank Barchard	3541e46a7e	Add H010ToARGB for 10 bit YUV to ARGB Bug: libyuv:751 Test: LibYUVConvertTest.H010ToARGB_Opt Change-Id: I668d3f3810e59a4fb6611503aae1c8edc7d596e7 Reviewed-on: https://chromium-review.googlesource.com/815015 Commit-Queue: Frank Barchard <fbarchard@chromium.org> Reviewed-by: richard winterton <rrwinterton@gmail.com>	2017-12-07 20:17:50 +00:00
Frank Barchard	2cec89a0d3	Add comment in Makefile OpenMP for MacOS Add a comment in util/Makefile for how to enable OpenMP for MacOS. Requires updated gcc or clang compile. Bug: None Test: /usr/local/bin/g++-7 -msse2 -O3 -fopenmp -static-libgcc -o psnr_omp psnr.cc ssim.cc psnr_main.cc Change-Id: Icb3389bf8cf94f09a185fea055c69823b9fbc66b time ./psnr_omp -ssim -s 1920 1080 ~/test/garden2_mp4.yuv ~/test/garden2_ogv.yuv Reviewed-on: https://chromium-review.googlesource.com/807546 Reviewed-by: Weiyong Yao <braveyao@chromium.org> Commit-Queue: Frank Barchard <fbarchard@chromium.org>	2017-12-05 18:53:45 +00:00
Frank Barchard	49d9b1039b	NV21ToABGR for Android camera conversions Bug: libyuv:762 Test: NV21ToABGR unittest Change-Id: I71448ab83930339083f07eeafccf240c6cb41c48 Reviewed-on: https://chromium-review.googlesource.com/795212 Commit-Queue: Frank Barchard <fbarchard@chromium.org> Reviewed-by: Cheng Wang <wangcheng@google.com>	2017-11-30 20:29:28 +00:00
Frank Barchard	324fa32739	Convert16To8Row_SSSE3 port from AVX2 H010ToAR30 uses Convert16To8Row_SSSE3 to convert 10 bit YUV to 8 bit. Then standard YUV conversion can be used. This improves performance on low end CPUs. A future CL will bypass this conversion, allowing for a 10 bit YUV source, but the function will be useful as a utility for YUV conversions. Bug: libyuv:559, libyuv:751 Test: out/Release/libyuv_unittest --gtest_filter=H010ToAR30 --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=999 --libyuv_flags=-1 --libyuv_cpu_info=-1 Change-Id: I9b3ef22d88a5fd861de4cf1900b4c6e8fd24d0af Reviewed-on: https://chromium-review.googlesource.com/792334 Commit-Queue: Frank Barchard <fbarchard@chromium.org> Reviewed-by: Frank Barchard <fbarchard@chromium.org>	2017-11-28 19:22:39 +00:00
Lei Zhang	8445617191	Mark a bunch of kArray variables as const. This allows the linker to move the variables from the .data section to the .rodata section. Bug: libyuv:254 Test: out/Release/libyuv_unittest --gtest_filter=* --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=999 --libyuv_flags=-1 --libyuv_cpu_info=-1 Change-Id: I6998570f1af4337d7b80313d9e18e36aa20d6ec0 Reviewed-on: https://chromium-review.googlesource.com/777033 Commit-Queue: Frank Barchard <fbarchard@chromium.org> Reviewed-by: Frank Barchard <fbarchard@chromium.org>	2017-11-27 23:38:44 +00:00
Frank Barchard	26173eb73e	H010ToAR30 for 10 bit bt.709 YUV to 30 bit RGB This version of the H010ToAR30 provides a 3 step conversion Convert16To8Row_AVX2 H420ToARGB_AVX2 ARGBToAR30_AVX2 Low level function added to convert 16 bit to 8 bit using multiply to adjust 10 bit or other bit depths and then save the upper 16 bits. Bug: libyuv:751 Test: LibYUVPlanarTest.Convert16To8Row_Opt unittest added Change-Id: I9cc576fda8afa1003cb961d03e0e656e0b478f03 Reviewed-on: https://chromium-review.googlesource.com/783554 Commit-Queue: Frank Barchard <fbarchard@chromium.org> Reviewed-by: richard winterton <rrwinterton@gmail.com>	2017-11-22 23:58:30 +00:00
Frank Barchard	a98d6cdb17	ARGBToAR30 AVX2 conversion function Bug: libyuv:751 Test: LibYUVConvertTest.ARGBToAR30_Opt Change-Id: I09c13eb53ba5f1ce1740c013dc587f8300f1d9e0 Reviewed-on: https://chromium-review.googlesource.com/780437 Commit-Queue: Frank Barchard <fbarchard@chromium.org> Reviewed-by: richard winterton <rrwinterton@gmail.com>	2017-11-21 20:37:01 +00:00
Frank Barchard	19a126ddfa	Add AR30 fourcc unittest Bug: libyuv:749 Test: LibYUVBaseTest.TestFourCC Change-Id: Iec378947248840c7e2cd87b1198503f39e7c7258 Reviewed-on: https://chromium-review.googlesource.com/780619 Reviewed-by: Frank Barchard <fbarchard@chromium.org> Commit-Queue: Frank Barchard <fbarchard@chromium.org>	2017-11-20 23:52:01 +00:00
Frank Barchard	a37fe16557	Add AR30 fourcc Bug: libyuv:749 Test: none Change-Id: Icdfb0ff7bb5886d73498f4d88ca4629b2dc3425c Reviewed-on: https://chromium-review.googlesource.com/780443 Reviewed-by: Weiyong Yao <braveyao@chromium.org>	2017-11-20 23:09:50 +00:00
Frank Barchard	f2978400d5	Document AR30 format Bug: libyuv:751 Test: none Change-Id: If6d5e7b9c5e6e8d2a272e03ce5a1cc199ef364ca Reviewed-on: https://chromium-review.googlesource.com/779980 Reviewed-by: Weiyong Yao <braveyao@chromium.org> Reviewed-by: Frank Barchard <fbarchard@google.com>	2017-11-20 22:05:45 +00:00
Frank Barchard	12c904a97c	H420ToRAW and H420ToRGB24 added for bt.709 support. Bug: libyuv:760 Test: LibYUVConvertTest.H420ToRAW_Opt Change-Id: I050385f477309d5db02bb2218088f224c83392ed Reviewed-on: https://chromium-review.googlesource.com/775785 Commit-Queue: Frank Barchard <fbarchard@google.com> Reviewed-by: Weiyong Yao <braveyao@chromium.org>	2017-11-17 01:20:05 +00:00
Frank Barchard	46594be758	add ScalePlane_16 unit tests Tests ScalePlane vs ScalePlane_16 match. Bug: libyuv:749 Test: LibYUVScaleTest.ScalePlaneDownBy4_Box_16 Change-Id: I3f71748da404982d5d48bfb11bbd3ae95a1d021c Reviewed-on: https://chromium-review.googlesource.com/765045 Reviewed-by: Frank Barchard <fbarchard@google.com> Reviewed-by: richard winterton <rrwinterton@gmail.com> Reviewed-by: Weiyong Yao <braveyao@chromium.org>	2017-11-16 01:40:48 +00:00
Frank Barchard	630c8ed1e0	Fix for ScaleDownBy4_Linear_16 The unittest compares the results of 8 and 16 bit scaling and expects them to be the same. This CL makes the 16 bit scaling filter logic match. Bug: libyuv:749 Test: LibYUVScaleTest.DISABLED_ScaleDownBy4_Linear_16 Change-Id: Ifb3ca4d770ef38f9f16abe9b9aeb843b779bf371 Reviewed-on: https://chromium-review.googlesource.com/772370 Reviewed-by: Weiyong Yao <braveyao@chromium.org> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-11-15 23:05:22 +00:00
Frank Barchard	3cf056f8c3	clang-format for align_buffer_page_end and free_aligned_buffer_page_end clang-format does nested indents for macros that don't end with ; example: align_buffer_page_end(dst_y_8, dst_y_plane_size) align_buffer_page_end(dst_u_8, dst_uv_plane_size) align_buffer_page_end(dst_v_8, dst_uv_plane_size) align_buffer_page_end(dst_y_16, dst_y_plane_size * 2) align_buffer_page_end(dst_u_16, dst_uv_plane_size * 2) align_buffer_page_end(dst_v_16, dst_uv_plane_size * 2) use a similar allocator to the one used within libyuv in row.h which makes the caller add ; align_buffer_page_end(dst_y_8, dst_y_plane_size); align_buffer_page_end(dst_u_8, dst_uv_plane_size); align_buffer_page_end(dst_v_8, dst_uv_plane_size); align_buffer_page_end(dst_y_16, dst_y_plane_size * 2); align_buffer_page_end(dst_u_16, dst_uv_plane_size * 2); align_buffer_page_end(dst_v_16, dst_uv_plane_size * 2); Bug: libyuv:758 Test: try bots Change-Id: I4a0770707e7053e094a37bbfc3c5884d5663d078 Reviewed-on: https://chromium-review.googlesource.com/762757 Reviewed-by: Patrik Höglund <phoglund@chromium.org> Reviewed-by: Frank Barchard <fbarchard@google.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-11-10 22:36:39 +00:00
Frank Barchard	49d1e3b036	MultiplyRow_16_AVX2 for converting 10 bit YUV When converting from lsb 10 bit formats to msb, the values need to be shifted to the top 10 bits. Using a multiply allows the different numbers of bits to be copied: // 128 = 9 bits // 64 = 10 bits // 16 = 12 bits // 1 = 16 bits Bug: libyuv:751 Test: LibYUVPlanarTest.MultiplyRow_16_Opt Change-Id: I9cf226053a164baa14155215cb175065b1c4f169 Reviewed-on: https://chromium-review.googlesource.com/762951 Reviewed-by: richard winterton <rrwinterton@gmail.com> Reviewed-by: Frank Barchard <fbarchard@google.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-11-10 22:02:32 +00:00
Frank Barchard	2f58d126b9	MergeUV10Row_AVX2 use multiply to handle different bit depths Instead of hardcoded shift, use a multiply by a parameter. 128 = 9 bits 64 = 10 bits 16 = 12 bits 1 = 16 bits Bug: libyuv:751 Test: LibYUVPlanarTest.MergeUV10Row_Opt Change-Id: Id925edfdbf91243370c90641b50eb8e7625ec329 Reviewed-on: https://chromium-review.googlesource.com/762523 Reviewed-by: richard winterton <rrwinterton@gmail.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-11-10 03:38:07 +00:00
Frank Barchard	e26b0a7e0e	casting for c89 compatibility and lint cleanup Bug: libyuv:756 Test: CFLAGS="-m32 -static -std=gnu89 -mno-sse -O2" CXXFLAGS="-m32 -x c -static -std=gnu99 -mno-sse -O2" make -f linux.mk libyuv.a Change-Id: Ic362f93e01ccbb0bea14f361a58585e79297e7d2 Reviewed-on: https://chromium-review.googlesource.com/759423 Reviewed-by: Frank Barchard <fbarchard@google.com> Reviewed-by: Patrik Höglund <phoglund@chromium.org> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-11-09 18:22:17 +00:00
Frank Barchard	735ace2ed3	Re-enable x86 assembly without requiring -msse2 clang does not require -msse2 or -msse for inline, except the "x" parameter. So change this to "m" for 32 bit. 64 bit requires sse2 so use "x" for 64 bit. gcc requires -msse for xmm registers in clobber list. Reduce compiler requirement from -msse2 to -msse for enabling assembly. Bug: libyuv:754, libyuv:757 Test: CC=clang CXX=clang++ CFLAGS="-m32" CXXFLAGS="-m32 -mno-sse -O2" make -f linux.mk Change-Id: I86df72cfee80b7d349561c1fd7c97ad360767255 Reviewed-on: https://chromium-review.googlesource.com/759303 Reviewed-by: richard winterton <rrwinterton@gmail.com> Reviewed-by: Frank Barchard <fbarchard@google.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-11-09 00:51:06 +00:00
Frank Barchard	68f852d835	Remove DISABLE_CLANG_MSA cleanup to remove ifdefs around functions affected by a clang bug. gn gen out/Release "--args=is_debug=false target_os=\"android\" target_cpu=\"mips64el\" mips_arch_variant=\"r6\" mips_use_msa=true is_component_build=true is_clang=true" ninja -v -C out/Release libyuv_unittest Bug: libyuv:634 Test: build for mips with clang Change-Id: I278b368dbb2fe89082240e280267d0a27a214c78 Reviewed-on: https://chromium-review.googlesource.com/757980 Reviewed-by: Frank Barchard <fbarchard@google.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-11-08 19:55:14 +00:00
Frank Barchard	d997ac287d	Revert "Enable SSE2 code without -msse" This reverts commit 01e994d74e4e3937ee1a3efdc048320a1e51f818. Change-Id: Ie76710d0f4e641e071889c5125fd3be23cdcdb59 Reviewed-on: https://chromium-review.googlesource.com/758499 Reviewed-by: Frank Barchard <fbarchard@google.com>	2017-11-08 19:33:09 +00:00
Frank Barchard	01e994d74e	Enable SSE2 code without -msse Bug: libyuv:754 Test: CC=clang CXX=clang++ CFLAGS="-m32" CXXFLAGS="-m32 -mno-sse -O2" make -f linux.mk Change-Id: I74bf8d032013694e65ea7637bc38d3253db53ff2 Reviewed-on: https://chromium-review.googlesource.com/758043 Reviewed-by: Frank Barchard <fbarchard@google.com>	2017-11-08 02:54:41 +00:00
Frank Barchard	12084cd068	SSSE3 scaling test detect SSSE3 before running Bug: libyuv:755 Test: ~/intelsde/sde -p4p -- out/Release/libyuv_unittest --gtest_filter=LibYUVScaleTest* Change-Id: Ibb0c908c38efc49dc56e86fa54ae7bd48ced02b5 Reviewed-on: https://chromium-review.googlesource.com/756363 Reviewed-by: Cheng Wang <wangcheng@google.com>	2017-11-07 19:34:10 +00:00
Frank Barchard	522fd699e6	AVX512 feature detects for cnl and icl Key instruction sets added for each microarchitecture: AVX512BW, AVX512VL, AVX512DQ - skylake server or later AVX512_VBMI, AVX512_IFMA - cannon lake or later AVX512_BITALG, AVX512_VBMI2, AVX512_VPOPCNTDQ, AVX512_VNNI, GFNI, VAES, VPCLMULQDQ - ice lake or later Bug: libyuv:752 Test: ~/intelsde/sde -icl -- out/Release/libyuv_unittest --gtest_filter=Cpu Change-Id: I9ee28904c90009d66721b9f805a440c5fc2da122 Reviewed-on: https://chromium-review.googlesource.com/755617 Reviewed-by: Frank Barchard <fbarchard@google.com> Reviewed-by: richard winterton <rrwinterton@gmail.com>	2017-11-07 00:56:37 +00:00
Frank Barchard	afa98e1f08	HammingDistance_SSSE3 use movd not vmovd vmovd is an AVX instruction. This will crash on an older CPU with only SSSE3 but not AVX. Use movd instead. Bug: libyuv:753 Test: ~/intelsde/sde -mrm -- out/Release/libyuv_unittest --gtest_filter=LibYUVCompareTest.BenchmarkHammingDistance_Opt Change-Id: I1fb0026039d5f83d124f5d03fed7dc0d2d723e49 Reviewed-on: https://chromium-review.googlesource.com/756200 Reviewed-by: Cheng Wang <wangcheng@google.com> Reviewed-by: richard winterton <rrwinterton@gmail.com>	2017-11-07 00:48:52 +00:00
Frank Barchard	a0c32b9e49	MergeUV10Row_AVX2 for converting H010 to P010 H010 is 10 bit planar format with 10 bits in lower bits. P010 is 10 bit biplanar format with 10 bits in upper bits. This function weaves the U and V channels and shifts the bits into the upper bits. Bug: libyuv:751 Test: LibYUVPlanarTest.MergeUV10Row_Opt Change-Id: I4a0bac0ef1ff95aa1b8d68261ec8e8e86f2d1fbf Reviewed-on: https://chromium-review.googlesource.com/752692 Reviewed-by: Cheng Wang <wangcheng@google.com> Reviewed-by: Frank Barchard <fbarchard@google.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-11-03 18:55:36 +00:00
Frank Barchard	75ec56b55a	documentation - iaca, yuvconvert, clang-cl doc updates Bug: None Test: None Change-Id: Ie7cab948b7e879b08e5e5efaae008977513a0a80 Reviewed-on: https://chromium-review.googlesource.com/749895 Commit-Queue: Frank Barchard <fbarchard@google.com> Reviewed-by: Patrik Höglund <phoglund@chromium.org>	2017-11-02 22:34:14 +00:00
Frank Barchard	258057f988	Use -Werror in external/libyuv/files Bug: libyuv:750 Test: build with WITH_TIDY=1 Change-Id: I09b7c24f52d0daa72fe7d1928d11a56208526bd1 Reviewed-on: https://chromium-review.googlesource.com/748307 Reviewed-by: Patrik Höglund <phoglund@chromium.org>	2017-11-01 17:30:10 +00:00
Frank Barchard	de8e9ae10b	HammingDistance_SSE42 register optimized to avoid push Bug: libyuv:701 Test: objdump to confirm code gen Change-Id: Ibdcb2cc6bc9bf14b4ccb874c49fc9ff664650e1a Reviewed-on: https://chromium-review.googlesource.com/745390 Reviewed-by: Frank Barchard <fbarchard@google.com> Reviewed-by: richard winterton <rrwinterton@gmail.com>	2017-10-31 21:12:32 +00:00
Frank Barchard	ffc4811863	clang-format fixes Bug: None Test: lint passes Change-Id: I1fd40d3506bab1f4f9100902f633a9c9e7b96337 Reviewed-on: https://chromium-review.googlesource.com/745038 Reviewed-by: Frank Barchard <fbarchard@google.com>	2017-10-31 02:25:01 +00:00
Frank Barchard	80077a80c2	HammingDistance_X86 using popcnt assembly popcnt has a fake dependency on the destination. This assembly avoids the dependency by using a different register for each popcnt. Bug: libyuv:701 Test: LIBYUV_DISABLE_SSSE3=1 out/Release/libyuv_unittest --gtest_filter=HamOpt --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=9999 --libyuv_flags=-1 --libyuv_cpu_info=-1 Change-Id: Ie1d202e2613b7fa8a3c02acd433940e92c80eafa Reviewed-on: https://chromium-review.googlesource.com/731826 Reviewed-by: Cheng Wang <wangcheng@google.com> Reviewed-by: Frank Barchard <fbarchard@google.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-10-23 21:15:12 +00:00
Frank Barchard	3e5bbea5bf	Add test programs to Android.bp Bug: libyuv:746 Test: mm cpuid yuvconvert compare psnr libyuv_unittest Change-Id: I08f6d5b519151f274e6baee993eb74950161ef53 Reviewed-on: https://chromium-review.googlesource.com/726749 Reviewed-by: Frank Barchard <fbarchard@google.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-10-19 01:20:17 +00:00
Frank Barchard	af37f9c1d3	cpuid util fix for build warning Bug: libyuv:747 Test: mm cpuid under android builds Change-Id: I7fff13006b47a59873f29f8992bb3faf9bdb85f1 Reviewed-on: https://chromium-review.googlesource.com/727263 Commit-Queue: Frank Barchard <fbarchard@google.com> Reviewed-by: Frank Barchard <fbarchard@google.com>	2017-10-19 00:08:53 +00:00
Frank Barchard	e4aa6aad48	Add Android.bp build file for Android master Bug: libyuv:746 Test: mm from android repo Change-Id: I7c124acfacc87cc263b19483cea79a63084f3f96 Reviewed-on: https://chromium-review.googlesource.com/724237 Reviewed-by: Frank Barchard <fbarchard@google.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-10-18 18:36:37 +00:00
Frank Barchard	8fa02df3c0	mingw fix ifdefs to use gcc source mingw gcc sets the macro _M_IX86, which is normally only set by Visual C and clangcl, which use Visual C style source code for assembly, but gcc is not Visual C compatible. Add _MSC_VER to most ifdefs to detect that it's really Visual C or clangcl and not mingw gcc so the gcc source code will be used. Bug: libyuv:744 Test: CXXFLAGS=-m32 CXX=~/prebuilts/gcc/linux-x86/host/x86_64-w64-mingw32-4.8/bin/x86_64-w64-mingw32-g++ make -f linux.mk Change-Id: I3431aa486eb769b145faa8d5eb75ed639f9d6f5e Reviewed-on: https://chromium-review.googlesource.com/722319 Reviewed-by: Cheng Wang <wangcheng@google.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-10-17 17:36:35 +00:00
Andrii Shyshkalov	15d48f1e83	Remove Rietveld CQ config. Rietveld CQ has already been disabled and is no longer supported. TBR=phoglund@chromium.org No-Try: True Bug: chromium:770592 Change-Id: I2679e9193dfb5ec751bdc76d35fe5d835b44bd0a Reviewed-on: https://chromium-review.googlesource.com/714302 Reviewed-by: Henrik Kjellander <kjellander@chromium.org> Commit-Queue: Andrii Shyshkalov <tandrii@chromium.org>	2017-10-12 06:43:30 +00:00
Frank Barchard	1cebe2c622	TestHammingDistance_Opt to test low level matches C reference. The low level hamming distance functions have size limitations based on counter sizes. The higher level calls the low level in blocks that avoid overflow and then accumulates in int64. This test compares the results of the low levels to the high level and against a known value (all ones) to ensure the count is correct for any specified size. If the size is very large, the result is expected to be different. Bug: libyuv:701 Test: TestHammingDistance_Opt Change-Id: I6716af7cd09ac4d88a8afa25bc845a1b62af7c93 Reviewed-on: https://chromium-review.googlesource.com/710800 Reviewed-by: Frank Barchard <fbarchard@google.com> Reviewed-by: richard winterton <rrwinterton@gmail.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-10-11 20:21:31 +00:00
Frank Barchard	e23b27d040	Reduce HammingDistance block size to 32k to avoid overflow Bug: libyuv:701 Test: HammingDistance unittest with large size Change-Id: Id41a2c27eb8922d03b3a21dab32fa2e7b015ba38 Reviewed-on: https://chromium-review.googlesource.com/708335 Reviewed-by: Cheng Wang <wangcheng@google.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-10-10 18:42:47 +00:00
Frank Barchard	60f433fbd9	Revert "ComputeHammingDistance reduce SIMD loop to 1 call when possible." This reverts commit ec75df5894845b8d6b1341885a78db1de83decd8. Reason for revert: <INSERT REASONING HERE> Original change's description: > ComputeHammingDistance reduce SIMD loop to 1 call when possible. > > 32 bit x86 has high overhead due to -fpic. So this reduces the > number of calls by 1. > > TBR=kjellander@chromium.org > Bug: libyuv:701 > Test: BenchmarkHammingDistance > Change-Id: I7f557ef047920db65eab362a5f93abbd274ca051 > Reviewed-on: https://chromium-review.googlesource.com/701755 > Reviewed-by: Frank Barchard <fbarchard@google.com> > Reviewed-by: Cheng Wang <wangcheng@google.com> TBR=rrwinterton@gmail.com,fbarchard@google.com,wangcheng@google.com Change-Id: Ia61e8558a8f083c14be5f51e0e141550b6f2b5c1 No-Presubmit: true No-Tree-Checks: true No-Try: true Bug: libyuv:701 Reviewed-on: https://chromium-review.googlesource.com/707823 Reviewed-by: Frank Barchard <fbarchard@google.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-10-10 01:16:15 +00:00
Frank Barchard	ec75df5894	ComputeHammingDistance reduce SIMD loop to 1 call when possible. 32 bit x86 has high overhead due to -fpic. So this reduces the number of calls by 1. TBR=kjellander@chromium.org Bug: libyuv:701 Test: BenchmarkHammingDistance Change-Id: I7f557ef047920db65eab362a5f93abbd274ca051 Reviewed-on: https://chromium-review.googlesource.com/701755 Reviewed-by: Frank Barchard <fbarchard@google.com> Reviewed-by: Cheng Wang <wangcheng@google.com>	2017-10-09 22:51:23 +00:00
Patrik Höglund	b7b5374263	kjellander -> phoglund in OWNERS R=kjellander@chromium.org Bug:libyuv:741 Change-Id: I6bc18e94ec82e00518f34695cf3b97deed82c76d Reviewed-on: https://chromium-review.googlesource.com/699996 Reviewed-by: Henrik Kjellander <kjellander@chromium.org> Commit-Queue: Patrik Höglund <phoglund@chromium.org>	2017-10-09 11:40:28 +00:00
Frank Barchard	1734712a6f	Fix odd length HammingDistance If length of HammingDistance was not a multiple of 4, the result was incorrect. The old tests did not catch this so a new test is done to count 1s. Bug: libyuv:740 Test: LibYUVCompareTest.TestHammingDistance Change-Id: I93db5437821c597f1f162ac263d4a594bb83231f Reviewed-on: https://chromium-review.googlesource.com/699614 Reviewed-by: richard winterton <rrwinterton@gmail.com> Reviewed-by: Cheng Wang <wangcheng@google.com> Reviewed-by: Frank Barchard <fbarchard@google.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-10-04 22:21:36 +00:00
Frank Barchard	fecd741794	Port HammingDistance to SSSE3 Bug: libyuv:701 Test: BenchmarkHammingDistance_Opt Change-Id: Ibdd5d382677ebef4f82a62e0d5c3b88614a3b6e4 Reviewed-on: https://chromium-review.googlesource.com/696290 Commit-Queue: Frank Barchard <fbarchard@google.com> Reviewed-by: Cheng Wang <wangcheng@google.com>	2017-10-03 19:11:05 +00:00
Frank Barchard	bde789b176	Hamming Distance SSE2 and AVX2 optimized Bug: None Test: None Change-Id: Id52663f9c957aac3172fba92d888ad1b041d5cf0 Reviewed-on: https://chromium-review.googlesource.com/692981 Reviewed-by: Cheng Wang <wangcheng@google.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-10-02 22:32:54 +00:00
Frank Barchard	311add63c2	CopyRow_NEON use ldp instead of ld1 for better performance. Under cache thrashing circumstances, ldp/stp perform better than ld1/st1 on QC820/QC821 CPUs. Same performance when hitting cache. Bug: libyuv:738 Test: LibYUVPlanarTest.TestCopySamples_Opt (445 ms) Change-Id: Ib6a0a5d5e6a1b7ef667b9bb2edb39d681cf3614c Reviewed-on: https://chromium-review.googlesource.com/691281 Commit-Queue: Frank Barchard <fbarchard@google.com> Reviewed-by: Cheng Wang <wangcheng@google.com>	2017-09-29 01:52:29 +00:00
Frank Barchard	ccd6d6fc57	add TestCopySamples_Opt unittest as reference for TestScaleSamples_Opt TestScaleSamples_Opt can be slow on ARM if the size of the buffer is 1 MB. This test does a memcpy and behaves the same. Bug: libyuv:738 Test: LibYUVPlanarTest.TestCopySamples_Opt Change-Id: Ia9f30190ed76ea350ebe054c9b899d5268e7e135 Reviewed-on: https://chromium-review.googlesource.com/685751 Reviewed-by: Cheng Wang <wangcheng@google.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-09-27 19:04:12 +00:00
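
Several of the commit messages above (ARGBToAR30, MultiplyRow_16, MergeUV10Row) describe using a multiply in place of a shift to handle bit depths. The scalar C sketch below works through that arithmetic only. It is not libyuv's code: the names msb_scale, multiply_row_16, mulhi_u16 and expand_8_to_10 are hypothetical, and mulhi_u16 is a one-lane model of what pmulhuw/vpmulhuw does to each 16 bit element.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: multiplier that moves a `bits`-wide, LSB-aligned
 * sample to the top of a 16 bit word. Matches the table in the commit
 * messages: 9 bits -> 128, 10 bits -> 64, 12 bits -> 16, 16 bits -> 1. */
static uint16_t msb_scale(int bits) {
  assert(bits >= 1 && bits <= 16);
  return (uint16_t)(1u << (16 - bits));
}

/* Scalar sketch of a multiply-based row conversion: each 16 bit sample is
 * multiplied by `scale` and the low 16 bits of the product are kept. */
static void multiply_row_16(const uint16_t* src, uint16_t* dst,
                            uint16_t scale, int width) {
  for (int x = 0; x < width; ++x) {
    /* Widen first so the product cannot overflow a signed int. */
    dst[x] = (uint16_t)((uint32_t)src[x] * scale);
  }
}

/* One-lane model of pmulhuw: the high 16 bits of an unsigned 16x16
 * multiply. */
static uint16_t mulhi_u16(uint16_t a, uint16_t b) {
  return (uint16_t)(((uint32_t)a * b) >> 16);
}

/* 8 bit -> 10 bit replication as described for the blue channel: with the
 * 8 bit value in the upper byte, a multiply-high by (1024 + 4) leaves
 * (v << 2) | (v >> 6) in the low 10 bits. */
static uint16_t expand_8_to_10(uint8_t v) {
  return mulhi_u16((uint16_t)(v << 8), 1024 + 4);
}
```

For example, expand_8_to_10(255) gives 1023 and expand_8_to_10(0) gives 0, so full-range 8 bit values map to full-range 10 bit values. Multiplying a 10 bit row by msb_scale(10) moves 1023 to 65472, that is, into the top 10 bits of the word.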

2065 Commits