libyuv

mirror of https://chromium.googlesource.com/libyuv/libyuv synced 2026-01-01 03:12:16 +08:00

Author	SHA1	Message	Date
Frank Barchard	2ed2402fa0	I420ToI010 for 8 to 10 bit YUV conversion. Convert planar 8 bit formats to planar 16 bit formats. Includes msan fix for Convert8To16Row_Opt unittest. I420 is YUV bt.601 8 bits per channel with 420 subsampling. I010 is YUV bt.601 10 bits per channel with 420 subsampling. I is color space - bt.601. The function does no color space conversion so H420ToI010 is aliased to this function as well. 0 = 420 subsampling. The chroma channels are half width / height. 10 = 10 bits per channel, stored in low 10 bits of 16 bit samples. For SSSE3 version: out/Release/libyuv_unittest --gtest_filter=*LibYUVConvertTest.I420ToI010_Opt --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=999 --libyuv_flags=-1 --libyuv_cpu_info=-1 [ RUN ] LibYUVConvertTest.I420ToI010_Opt [ OK ] LibYUVConvertTest.I420ToI010_Opt (276 ms) Bug: libyuv:751 Test: LibYUVConvertTest.I420ToI010_Opt Change-Id: I072876ee4fd74a2b74f459b628838bc808f9bdd2 Reviewed-on: https://chromium-review.googlesource.com/846421 Reviewed-by: Miguel Casas <mcasas@chromium.org> Commit-Queue: Frank Barchard <fbarchard@chromium.org>	2018-01-02 21:09:39 +00:00
Frank Barchard	768f103b8b	Convert8To16 for better H010 support Convert planar 8 bit formats to planar 16 bit formats. Accepts a parameter that determines the number of bits. Bug: libyuv:751 Test: Convert8To16 unittest Change-Id: I8f6ffe64428ddf5769b87e0c069093a50a2541e9 Reviewed-on: https://chromium-review.googlesource.com/835410 Reviewed-by: richard winterton <rrwinterton@gmail.com>	2017-12-28 22:27:24 +00:00
Frank Barchard	56480051e0	Reduce error in unittests to 0 for Intel Most conversion tests exactly match on Intel, so reduce max diff to zero for Intel and leave at 4 for ARM using a macro. TBR=braveyao@chromium.org Bug: libyuv:447 Test: libyuv_unittests pass Change-Id: Ife13a0dd9581b58135b39d95ff1081f74840655c Reviewed-on: https://chromium-review.googlesource.com/832927 Reviewed-by: Frank Barchard <fbarchard@chromium.org> Commit-Queue: Frank Barchard <fbarchard@chromium.org>	2017-12-20 18:16:38 +00:00
Frank Barchard	790054ff03	Add AR30ToARGB function Initial AR30ToARGB function to allow converion from AR30 to other formats if necessary and/or for testing. Not optimized at this point. Bug: libyuv:751 Test: LibYUVConvertTest.AR30ToARGB_Opt Change-Id: I38ef192315240f3caa7aee0218b38d5e88a2849f Reviewed-on: https://chromium-review.googlesource.com/833025 Commit-Queue: Frank Barchard <fbarchard@chromium.org> Reviewed-by: richard winterton <rrwinterton@gmail.com>	2017-12-19 01:54:42 +00:00
Frank Barchard	5336217f11	H010Copy function to copy 16 bit planar formats Bug: libyuv:751 Test: LibYUVConvertTest.H010ToH010_Opt Change-Id: I996d309040a14193a97d05b62ac0b3e1ad1ee74b Reviewed-on: https://chromium-review.googlesource.com/823445 Commit-Queue: Frank Barchard <fbarchard@chromium.org> Reviewed-by: Cheng Wang <wangcheng@google.com> Reviewed-by: richard winterton <rrwinterton@gmail.com>	2017-12-15 03:34:34 +00:00
Frank Barchard	3b81288ece	Remove Mips DSPR2 code Bug: libyuv:765 Test: build for mips still passes Change-Id: I99105ad3951d2210c0793e3b9241c178442fdc37 Reviewed-on: https://chromium-review.googlesource.com/826404 Reviewed-by: Weiyong Yao <braveyao@chromium.org> Commit-Queue: Frank Barchard <fbarchard@chromium.org>	2017-12-14 18:22:16 +00:00
Frank Barchard	bb3180ae80	Add I420ToAR30 10 bit RGB For more complete support of AR30 format, add I420ToAR30 allowing the new RGB 10 bit format to be used from standard 8 bit I420 format. Bug: libyuv:751 Test: I420ToAR30 unittest added Change-Id: Ia8b0857447408bd6adab485158ce5f38d6dc2faa Reviewed-on: https://chromium-review.googlesource.com/823243 Reviewed-by: Weiyong Yao <braveyao@chromium.org> Commit-Queue: Frank Barchard <fbarchard@chromium.org>	2017-12-12 23:40:58 +00:00
Frank Barchard	c367751430	ARGBToAR30 SSSE3 use pmulhuw to replicate fields AR30 is optimized with 3 techniques 1. pmulhuw is used to replicate 8 bits to 10 bits. 2. Two channels are processed at a time. R and B, and A and G. 3. pshufb is used to shift and mask 2 channels of R and B Bug: libyuv:751 Test: ARGBToAR30_Opt Change-Id: I4e62d6caa4df7d0ae80395fa911d3c922b6b897b Reviewed-on: https://chromium-review.googlesource.com/822520 Reviewed-by: richard winterton <rrwinterton@gmail.com> Commit-Queue: Frank Barchard <fbarchard@chromium.org>	2017-12-12 20:12:58 +00:00
Frank Barchard	11dd1b956f	ARGBToAR30 use vpmulhuw to replicate fields AR30 is optimized with 3 techniques 1. vpmulhuw is used to replicate 8 bits to 10 bits. 2. Two channels are processed at a time. R and B, and A and G. 3. vpshufb is used to shift and mask 2 channels of R and B Red Blue With the 8 bit value in the upper bits, vpmulhuw by (1024+4) will produce a 10 bit value in the low 10 bits of each 16 bit value. This is whats wanted for the blue channel. The red needs to be shifted 4 left, so multiply by (1024+4)16 for red. Alpha Green Alpha and Green are already in the high bits so vpand can zero out the other bits, keeping just 2 upper bits of alpha and 8 bit green. The same multiplier could be used for Green - (1024+4) putting the 10 bit green in the lsb. Alpha would be a simple multiplier to shift it into position. It wants a gap of 10 above the green. Green is 10 bits, so there are 6 bits in the low short. 4 more are needed, so a multiplier of 4 gets the 2 bits into the upper 16 bits, and then a shift of 4 is a multiply of 16, so (416) = 64. Then shift the result left 10 to position the A and G channels. Bug: libyuv:751 Test: ARGBToAR30_Opt Change-Id: Ie4f20dce18203bae7b75acb1fd5232db8a8a4f11 Reviewed-on: https://chromium-review.googlesource.com/820046 Commit-Queue: Frank Barchard <fbarchard@chromium.org> Reviewed-by: Cheng Wang <wangcheng@google.com>	2017-12-12 02:57:54 +00:00
Frank Barchard	0f98c3c1df	Add ARGBToAR30Row_SSE2 to speed up H010ToAR30 Port ARGBToAR30Row_AVX2 to ARGBToAR30Row_SSE2 using same instructions but xmm registers and doing half as many pixels per loop. Bug: libyuv:751 Test: LibYUVConvertTest.ARGBToAR30_Opt Change-Id: Id644e54639133d1caf28ea3cd11ff6ab6891a673 Reviewed-on: https://chromium-review.googlesource.com/817918 Commit-Queue: Frank Barchard <fbarchard@chromium.org> Reviewed-by: richard winterton <rrwinterton@gmail.com>	2017-12-09 00:11:20 +00:00
Frank Barchard	3541e46a7e	Add H010ToARGB for 10 bit YUV to ARGB Bug: libyuv:751 Test: LibYUVConvertTest.H010ToARGB_Opt Change-Id: I668d3f3810e59a4fb6611503aae1c8edc7d596e7 Reviewed-on: https://chromium-review.googlesource.com/815015 Commit-Queue: Frank Barchard <fbarchard@chromium.org> Reviewed-by: richard winterton <rrwinterton@gmail.com>	2017-12-07 20:17:50 +00:00
Frank Barchard	49d9b1039b	NV21ToABGR for Android camera conversions Bug: libyuv:762 Test: NV21ToABGR unittest Change-Id: I71448ab83930339083f07eeafccf240c6cb41c48 Reviewed-on: https://chromium-review.googlesource.com/795212 Commit-Queue: Frank Barchard <fbarchard@chromium.org> Reviewed-by: Cheng Wang <wangcheng@google.com>	2017-11-30 20:29:28 +00:00
Frank Barchard	324fa32739	Convert16To8Row_SSSE3 port from AVX2 H010ToAR30 uses Convert16To8Row_SSSE3 to convert 10 bit YUV to 8 bit. Then standard YUV conversion can be used. This improves performance on low end CPUs. Future CL will by pass this conversion allowing for 10 bit YUV source, but the function will be useful as a utility for YUV conversions. Bug: libyuv:559, libyuv:751 Test: out/Release/libyuv_unittest --gtest_filter=H010ToAR30 --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=999 --libyuv_flags=-1 --libyuv_cpu_info=-1 Change-Id: I9b3ef22d88a5fd861de4cf1900b4c6e8fd24d0af Reviewed-on: https://chromium-review.googlesource.com/792334 Commit-Queue: Frank Barchard <fbarchard@chromium.org> Reviewed-by: Frank Barchard <fbarchard@chromium.org>	2017-11-28 19:22:39 +00:00
Frank Barchard	26173eb73e	H010ToAR30 for 10 bit bt.709 YUV to 30 bit RGB This version of the H010ToAR30 provides a 3 step conversion Convert16To8Row_AVX2 H420ToARGB_AVX2 ARGBToAR30_AVX2 Low level function added to convert 16 bit to 8 bit using multiply to adjust 10 bit or other bit depths and then save the upper 16 bits. Bug: libyuv:751 Test: LibYUVPlanarTest.Convert16To8Row_Opt unittest added Change-Id: I9cc576fda8afa1003cb961d03e0e656e0b478f03 Reviewed-on: https://chromium-review.googlesource.com/783554 Commit-Queue: Frank Barchard <fbarchard@chromium.org> Reviewed-by: richard winterton <rrwinterton@gmail.com>	2017-11-22 23:58:30 +00:00
Frank Barchard	a98d6cdb17	ARGBToAR30 AVX2 conversion function Bug: libyuv:751 Test: LibYUVConvertTest.ARGBToAR30_Opt Change-Id: I09c13eb53ba5f1ce1740c013dc587f8300f1d9e0 Reviewed-on: https://chromium-review.googlesource.com/780437 Commit-Queue: Frank Barchard <fbarchard@chromium.org> Reviewed-by: richard winterton <rrwinterton@gmail.com>	2017-11-21 20:37:01 +00:00
Frank Barchard	19a126ddfa	Add AR30 fourcc unittest Bug: libyuv:749 Test: LibYUVBaseTest.TestFourCC Change-Id: Iec378947248840c7e2cd87b1198503f39e7c7258 Reviewed-on: https://chromium-review.googlesource.com/780619 Reviewed-by: Frank Barchard <fbarchard@chromium.org> Commit-Queue: Frank Barchard <fbarchard@chromium.org>	2017-11-20 23:52:01 +00:00
Frank Barchard	12c904a97c	H420ToRAW and H420ToRGB24 added for bt.709 support. Bug: libyuv:760 Test: LibYUVConvertTest.H420ToRAW_Opt Change-Id: I050385f477309d5db02bb2218088f224c83392ed Reviewed-on: https://chromium-review.googlesource.com/775785 Commit-Queue: Frank Barchard <fbarchard@google.com> Reviewed-by: Weiyong Yao <braveyao@chromium.org>	2017-11-17 01:20:05 +00:00
Frank Barchard	46594be758	add ScalePlane_16 unit tests Tests ScalePlane vs ScalePlane_16 match. Bug: libyuv:749 Test: LibYUVScaleTest.ScalePlaneDownBy4_Box_16 Change-Id: I3f71748da404982d5d48bfb11bbd3ae95a1d021c Reviewed-on: https://chromium-review.googlesource.com/765045 Reviewed-by: Frank Barchard <fbarchard@google.com> Reviewed-by: richard winterton <rrwinterton@gmail.com> Reviewed-by: Weiyong Yao <braveyao@chromium.org>	2017-11-16 01:40:48 +00:00
Frank Barchard	3cf056f8c3	clang-format for align_buffer_page_end and free_aligned_buffer_page_end clang-format does nested indents for macros that dont end with ; example: align_buffer_page_end(dst_y_8, dst_y_plane_size) align_buffer_page_end(dst_u_8, dst_uv_plane_size) align_buffer_page_end(dst_v_8, dst_uv_plane_size) align_buffer_page_end(dst_y_16, dst_y_plane_size * 2) align_buffer_page_end(dst_u_16, dst_uv_plane_size * 2) align_buffer_page_end(dst_v_16, dst_uv_plane_size * 2) use a similar allocator to the one used within libyuv in row.h which makes the caller add ; align_buffer_page_end(dst_y_8, dst_y_plane_size); align_buffer_page_end(dst_u_8, dst_uv_plane_size); align_buffer_page_end(dst_v_8, dst_uv_plane_size); align_buffer_page_end(dst_y_16, dst_y_plane_size * 2); align_buffer_page_end(dst_u_16, dst_uv_plane_size * 2); align_buffer_page_end(dst_v_16, dst_uv_plane_size * 2); Bug: libyuv:758 Test: try bots Change-Id: I4a0770707e7053e094a37bbfc3c5884d5663d078 Reviewed-on: https://chromium-review.googlesource.com/762757 Reviewed-by: Patrik Höglund <phoglund@chromium.org> Reviewed-by: Frank Barchard <fbarchard@google.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-11-10 22:36:39 +00:00
Frank Barchard	49d1e3b036	MultiplyRow_16_AVX2 for converting 10 bit YUV When converting from lsb 10 bit formats to msb, the values need to be shifted to the top 10 bits. Using a multiply allows the different numbers of bits to be copied: // 128 = 9 bits // 64 = 10 bits // 16 = 12 bits // 1 = 16 bits Bug: libyuv:751 Test: LibYUVPlanarTest.MultiplyRow_16_Opt Change-Id: I9cf226053a164baa14155215cb175065b1c4f169 Reviewed-on: https://chromium-review.googlesource.com/762951 Reviewed-by: richard winterton <rrwinterton@gmail.com> Reviewed-by: Frank Barchard <fbarchard@google.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-11-10 22:02:32 +00:00
Frank Barchard	2f58d126b9	MergeUV10Row_AVX2 use multiply to handle different bit depths Instead of hardcoded shift, use a multiply by a parameter. 128 = 9 bits 64 = 10 bits 16 = 12 bits 1 = 16 bits Bug: libyuv:751 Test: LibYUVPlanarTest.MergeUV10Row_Opt Change-Id: Id925edfdbf91243370c90641b50eb8e7625ec329 Reviewed-on: https://chromium-review.googlesource.com/762523 Reviewed-by: richard winterton <rrwinterton@gmail.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-11-10 03:38:07 +00:00
Frank Barchard	e26b0a7e0e	casting for c89 compatibility and lint cleanup Bug: libyuv:756 Test: CFLAGS="-m32 -static -std=gnu89 -mno-sse -O2" CXXFLAGS="-m32 -x c -static -std=gnu99 -mno-sse -O2" make -f linux.mk libyuv.a Change-Id: Ic362f93e01ccbb0bea14f361a58585e79297e7d2 Reviewed-on: https://chromium-review.googlesource.com/759423 Reviewed-by: Frank Barchard <fbarchard@google.com> Reviewed-by: Patrik Höglund <phoglund@chromium.org> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-11-09 18:22:17 +00:00
Frank Barchard	12084cd068	SSSE3 scaling test detect SSSE3 before running Bug: libyuv:755 Test: ~/intelsde/sde -p4p -- out/Release/libyuv_unittest --gtest_filter=LibYUVScaleTest* Change-Id: Ibb0c908c38efc49dc56e86fa54ae7bd48ced02b5 Reviewed-on: https://chromium-review.googlesource.com/756363 Reviewed-by: Cheng Wang <wangcheng@google.com>	2017-11-07 19:34:10 +00:00
Frank Barchard	522fd699e6	AVX512 feature detects for cnl and icl Key instruction sets added for each microarchitecture: AVX512BW, AVX512VL, AVX512DQ - skylake server or later AVX512_VBMI, AVX512_IFMA - cannon lake or later AVX512_BITALG, AVX512_VBMI2, AVX512_VPOPCNTDQ, AVX512_VNNI, GFNI, VAES, VPCLMULQDQ - ice lake or later Bug: libyuv:752 Test: ~/intelsde/sde -icl -- out/Release/libyuv_unittest --gtest_filter=Cpu Change-Id: I9ee28904c90009d66721b9f805a440c5fc2da122 Reviewed-on: https://chromium-review.googlesource.com/755617 Reviewed-by: Frank Barchard <fbarchard@google.com> Reviewed-by: richard winterton <rrwinterton@gmail.com>	2017-11-07 00:56:37 +00:00
Frank Barchard	a0c32b9e49	MergeUV10Row_AVX2 for converting H010 to P010 H010 is 10 bit planar format with 10 bits in lower bits. P010 is 10 bit biplanar format with 10 bits in upper bits. This function weaves the U and V channels and shifts the bits into the upper bits. Bug: libyuv:751 Test: LibYUVPlanarTest.MergeUV10Row_Opt Change-Id: I4a0bac0ef1ff95aa1b8d68261ec8e8e86f2d1fbf Reviewed-on: https://chromium-review.googlesource.com/752692 Reviewed-by: Cheng Wang <wangcheng@google.com> Reviewed-by: Frank Barchard <fbarchard@google.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-11-03 18:55:36 +00:00
Frank Barchard	ffc4811863	clang-format fixes Bug: None Test: lint passes Change-Id: I1fd40d3506bab1f4f9100902f633a9c9e7b96337 Reviewed-on: https://chromium-review.googlesource.com/745038 Reviewed-by: Frank Barchard <fbarchard@google.com>	2017-10-31 02:25:01 +00:00
Frank Barchard	80077a80c2	HammingDistance_X86 using popcnt assembly popcnt has a fake dependency on the destination. This assembly avoids the dependency by using a different register for each popcnt. Bug: libyuv:701 Test: LIBYUV_DISABLE_SSSE3=1 out/Release/libyuv_unittest --gtest_filter=HamOpt --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=9999 --libyuv_flags=-1 --libyuv_cpu_info=-1 Change-Id: Ie1d202e2613b7fa8a3c02acd433940e92c80eafa Reviewed-on: https://chromium-review.googlesource.com/731826 Reviewed-by: Cheng Wang <wangcheng@google.com> Reviewed-by: Frank Barchard <fbarchard@google.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-10-23 21:15:12 +00:00
Frank Barchard	1cebe2c622	TestHammingDistance_Opt to test low level matches C reference. The low level hamming distance functions have size limitations based on counter sizes. The higher level calls the low level in blocks that avoid overflow and then accumulators in int64. This test compares the results of the low levels to the high level and against a known value (all ones) to ensure the count is correct for any specified size. The the size is very large, the result is expected to be different. Bug: libyuv:701 Test: TestHammingDistance_Opt Change-Id: I6716af7cd09ac4d88a8afa25bc845a1b62af7c93 Reviewed-on: https://chromium-review.googlesource.com/710800 Reviewed-by: Frank Barchard <fbarchard@google.com> Reviewed-by: richard winterton <rrwinterton@gmail.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-10-11 20:21:31 +00:00
Frank Barchard	60f433fbd9	Revert "ComputeHammingDistance reduce SIMD loop to 1 call when possible." This reverts commit ec75df5894845b8d6b1341885a78db1de83decd8. Reason for revert: <INSERT REASONING HERE> Original change's description: > ComputeHammingDistance reduce SIMD loop to 1 call when possible. > > 32 bit x86 has high overhead due to -fpic. So this reduces the > number of calls by 1. > > TBR=kjellander@chromium.org > Bug: libyuv:701 > Test: BenchmarkHammingDistance > Change-Id: I7f557ef047920db65eab362a5f93abbd274ca051 > Reviewed-on: https://chromium-review.googlesource.com/701755 > Reviewed-by: Frank Barchard <fbarchard@google.com> > Reviewed-by: Cheng Wang <wangcheng@google.com> TBR=rrwinterton@gmail.com,fbarchard@google.com,wangcheng@google.com Change-Id: Ia61e8558a8f083c14be5f51e0e141550b6f2b5c1 No-Presubmit: true No-Tree-Checks: true No-Try: true Bug: libyuv:701 Reviewed-on: https://chromium-review.googlesource.com/707823 Reviewed-by: Frank Barchard <fbarchard@google.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-10-10 01:16:15 +00:00
Frank Barchard	ec75df5894	ComputeHammingDistance reduce SIMD loop to 1 call when possible. 32 bit x86 has high overhead due to -fpic. So this reduces the number of calls by 1. TBR=kjellander@chromium.org Bug: libyuv:701 Test: BenchmarkHammingDistance Change-Id: I7f557ef047920db65eab362a5f93abbd274ca051 Reviewed-on: https://chromium-review.googlesource.com/701755 Reviewed-by: Frank Barchard <fbarchard@google.com> Reviewed-by: Cheng Wang <wangcheng@google.com>	2017-10-09 22:51:23 +00:00
Frank Barchard	1734712a6f	Fix odd length HammingDistance If length of HammingDistance was not a multiple of 4, the result was incorrect. The old tests did not catch this so a new test is done to count 1s. Bug: libyuv:740 Test: LibYUVCompareTest.TestHammingDistance Change-Id: I93db5437821c597f1f162ac263d4a594bb83231f Reviewed-on: https://chromium-review.googlesource.com/699614 Reviewed-by: richard winterton <rrwinterton@gmail.com> Reviewed-by: Cheng Wang <wangcheng@google.com> Reviewed-by: Frank Barchard <fbarchard@google.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-10-04 22:21:36 +00:00
Frank Barchard	fecd741794	Port HammingDistance to SSSE3 Bug: libyuv:701 Test: BenchmarkHammingDistance_Opt Change-Id: Ibdd5d382677ebef4f82a62e0d5c3b88614a3b6e4 Reviewed-on: https://chromium-review.googlesource.com/696290 Commit-Queue: Frank Barchard <fbarchard@google.com> Reviewed-by: Cheng Wang <wangcheng@google.com>	2017-10-03 19:11:05 +00:00
Frank Barchard	bde789b176	Hamming Distance SSE2 and AVX2 optimized Bug: None Test: None Change-Id: Id52663f9c957aac3172fba92d888ad1b041d5cf0 Reviewed-on: https://chromium-review.googlesource.com/692981 Reviewed-by: Cheng Wang <wangcheng@google.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-10-02 22:32:54 +00:00
Frank Barchard	ccd6d6fc57	add TestCopySamples_Opt unittest as reference for TestScaleSamples_Opt TestScaleSamples_Opt can be slow on ARM if the size of the buffer is 1 MB. This test does a memcpy and behaves the same. Bug: libyuv:738 Test: LibYUVPlanarTest.TestCopySamples_Opt Change-Id: Ia9f30190ed76ea350ebe054c9b899d5268e7e135 Reviewed-on: https://chromium-review.googlesource.com/685751 Reviewed-by: Cheng Wang <wangcheng@google.com> Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-09-27 19:04:12 +00:00
Frank Barchard	0b0a891cb2	Change TestScaleSumSamples_C test to allow for some float error in sum. The sum of floats can optimize differently with vectorization, producing a different result between NEON and C. Adjust the unittest to allow for some difference in the sum. The NEON version is 8 samples at a time, so the test now rounds up the number of values to multiple of 8. TBR=kjellander@chromium.org Bug: libyuv:717 Test: LibYUVPlanarTest.TestScaleSumSamples_Opt Change-Id: I2a0783780c7e0f240f7a8e4700b2a4d3e6b52d87 Reviewed-on: https://chromium-review.googlesource.com/673708 Reviewed-by: Cheng Wang <wangcheng@google.com>	2017-09-22 18:58:11 +00:00
Frank Barchard	efbf15754a	Step thru full color test by increments of 5 for better test speed. Full color test is the slowest of the unittests, and not catching any additional bugs at the moment. Step thru range of 0 to 255 in steps of 5 to speed up the test. 255 is 3 * 5 * 17, so any of those primes would hit 0 and 255 exactly. Was LibYUVColorTest.TestFullYUV (896 ms) Now LibYUVColorTest.TestFullYUV (212 ms) TBR=kjellander@chromium.org Bug: libyuv:736 Test: LibYUVColorTest.TestFullYUV Change-Id: I5b55fb07ada0dc7bdc3c3c20569d36bf09bb3804 Reviewed-on: https://chromium-review.googlesource.com/672064 Commit-Queue: Frank Barchard <fbarchard@google.com> Reviewed-by: Frank Barchard <fbarchard@google.com>	2017-09-19 02:01:53 +00:00
Frank Barchard	80e27fc747	Add MaskCpuFlags(benchmark_cpu_info_) to unittest initialization When command line --libyuv_cpu_info is used the individual tests used to need to set the cpumask. This CL moves that to the init for each test class so the individual tests dont need to set it. TBR=kjellander@chromium.org BUG=libyuv:720 TEST=LibYUVBaseTest.TestCpuHas Change-Id: I6ae180388debf6cf76be6df5b81cfffeb35ee2eb Reviewed-on: https://chromium-review.googlesource.com/662367 Reviewed-by: Cheng Wang <wangcheng@google.com>	2017-09-12 19:13:06 +00:00
Frank Barchard	1200ef4a37	Gauss unittest reduce buffer sizes on stack Reduce buffers for test to 640 from 1280 to avoid bit stack warning. TBR=kjellander@chromium.org BUG=libyuv:730 TEST=LibYUVPlanarTest.TestGaussRow_Opt and LibYUVPlanarTest.TestGaussCol_Opt Change-Id: I710af3e952f9a4d1c0c0c8f73922c1d98ad9aa29 Reviewed-on: https://chromium-review.googlesource.com/660662 Reviewed-by: Frank Barchard <fbarchard@google.com>	2017-09-11 21:59:43 +00:00
Frank Barchard	1e16cb5c38	SplitRGBPlane and MergeRGBPlane functions added Converts packed RGB to planar and back. TBR=kjellander@chromium.org BUG=libyuv:728 TEST=MergeRGBPlane_Opt and SplitRGBPlane_Opt unittests added Change-Id: Ida59af940afcb1fc4a48bbf62c714f592665c3cc Reviewed-on: https://chromium-review.googlesource.com/658069 Reviewed-by: Frank Barchard <fbarchard@google.com> Reviewed-by: Cheng Wang <wangcheng@google.com>	2017-09-11 21:02:04 +00:00
Frank Barchard	8f5e9cd9eb	ScaleRowUp2_16_C port of NEON to C Single pass upsample with bilinear filter. NEON version optimized - Pixel Sailfish QC821 Was TestScaleRowUp2_16 (5741 ms) Now TestScaleRowUp2_16 (4484 ms) C TestScaleRowUp2_16 (6555 ms) TBR=kjellander@chromium.org BUG=libyuv:718 TEST=LibYUVScaleTest.TestScaleRowUp2_16 (709 ms) Change-Id: Ib04ceb53e0ab644a392c39c3396e313530161d92 Reviewed-on: https://chromium-review.googlesource.com/646701 Reviewed-by: Frank Barchard <fbarchard@google.com> Reviewed-by: Cheng Wang <wangcheng@google.com>	2017-09-05 21:40:39 +00:00
Frank Barchard	f0a9d6d206	Gaussian reorder for benefit of A73 Roughly. instead of 4 loads and 8 multiples, use 1 load and 2 multiples 4 times over. The original code, as with the C code from clang and gcc, did all the loads, then all the math, then the store. The new code does a load, then the math, then the next load, etc. This schedules better on current arm 64 cpus. Number of registers also reduced, reusing the same registers. HiSilicon ARM A73: Now TestGaussRow_Opt (890 ms) TestGaussCol_Opt (571 ms) Was TestGaussRow_Opt (1061 ms) TestGaussCol_Opt (595 ms) Qualcomm 821 (Pixel): Now TestGaussRow_Opt (571 ms) TestGaussCol_Opt (474 ms) Was TestGaussRow_Opt (751 ms) TestGaussCol_Opt (520 ms) TBR=kjellander@chromium.org BUG=libyuv:719 TEST=LibYUVPlanarTest.TestGaussRow_Opt Reviewed-on: https://chromium-review.googlesource.com/627478 Reviewed-by: Cheng Wang <wangcheng@google.com> Reviewed-by: Frank Barchard <fbarchard@google.com> Change-Id: I5ec81191d460801f0d4a89f0384f89925ff036de Reviewed-on: https://chromium-review.googlesource.com/634448 Commit-Queue: Frank Barchard <fbarchard@google.com>	2017-08-25 19:00:05 +00:00
Frank Barchard	ad2409443c	GaussRow_NEON from int to short [ RUN ] LibYUVPlanarTest.TestGaussRow_Opt [ OK ] LibYUVPlanarTest.TestGaussRow_Opt (601 ms) [ RUN ] LibYUVPlanarTest.TestGaussCol_Opt [ OK ] LibYUVPlanarTest.TestGaussCol_Opt (522 ms) TBR=kjellander@chromium.org BUG=libyuv:719 TEST=LibYUVPlanarTest.TestGaussRow_Opt Change-Id: I1242b98672538e889f3ab48f215d6dabc7144ea7 Reviewed-on: https://chromium-review.googlesource.com/627478 Reviewed-by: Cheng Wang <wangcheng@google.com> Reviewed-by: Frank Barchard <fbarchard@google.com>	2017-08-24 01:09:23 +00:00
Frank Barchard	1cc539f7d6	GaussCol_NEON resample from short to int Old NEON LibYUVPlanarTest.TestGaussCol_Opt (916 ms) New NEON LibYUVPlanarTest.TestGaussCol_Opt (520 ms) C vectorized LibYUVPlanarTest.TestGaussCol_Opt (739 ms) TBR=kjellander@chromium.org BUG=libyuv:719 TEST=LibYUVPlanarTest.TestGaussCol_Opt Change-Id: I863b66f700f7a71fcb08a2eabb03240fdaf8a238 Reviewed-on: https://chromium-review.googlesource.com/626938 Reviewed-by: Cheng Wang <wangcheng@google.com>	2017-08-22 23:07:17 +00:00
Frank Barchard	c5bad809b1	Gauss unittest, Scale comments for neon64 half size updated [ RUN ] LibYUVPlanarTest.TestGaussRow_Opt [ OK ] LibYUVPlanarTest.TestGaussRow_Opt (1274 ms) [ RUN ] LibYUVPlanarTest.TestGaussCol_Opt [ OK ] LibYUVPlanarTest.TestGaussCol_Opt (916 ms) TBR=kjellander@chromium.org BUG=libyuv:719 TEST=LibYUVPlanarTest.TestGaussRow_Opt Change-Id: Id480f3870c40c2b40dfb9f072cb7118ebad41afc Reviewed-on: https://chromium-review.googlesource.com/624701 Reviewed-by: Cheng Wang <wangcheng@google.com>	2017-08-21 23:41:46 +00:00
Frank Barchard	78e44628c6	Add MSA optimized SplitUV, Set, MirrorUV, SobelX and SobelY row functions. TBR=kjellander@chromium.org R=fbarchard@google.com Bug:libyuv:634 Change-Id: Ie2342f841f1bb8469fc4631b784eddd804f5d53e Reviewed-on: https://chromium-review.googlesource.com/616765 Reviewed-by: Frank Barchard <fbarchard@google.com>	2017-08-17 18:39:22 +00:00
Frank Barchard	bb17da97cf	Test C vs NEON for ScaleDown2Box_16 TBR=kjellander@chromium.org BUG=libyuv:718 TEST=LibYUVScaleTest.TestScaleRowDown2Box_16 Change-Id: Ic74d29d6f14983ff26e8af541ef702a0f8bf3f17 Reviewed-on: https://chromium-review.googlesource.com/616189 Reviewed-by: Cheng Wang <wangcheng@google.com>	2017-08-16 22:18:48 +00:00
Frank Barchard	7e59ee4c75	Upsample 8x2 pixels to 16x1 with bilinear filtering Downsample 16x2 to 8x1 with box filtering [ RUN ] LibYUVScaleTest.TestScaleRowUp2_16 [ OK ] LibYUVScaleTest.TestScaleRowUp2_16 (579 ms) [ RUN ] LibYUVScaleTest.TestScaleRowDown2Box_16 [ OK ] LibYUVScaleTest.TestScaleRowDown2Box_16 (329 ms) [----------] 2 tests from LibYUVScaleTest (909 ms total) TBR=kjellander@chromium.org BUG=libyuv:718 TEST=LibYUVScaleTest.TestScaleRowUp2_16 and LibYUVScaleTest.TestScaleRowDown2Box_16 Change-Id: I457d44123f2751e5f71bf3935401fff74b8e9db2 Reviewed-on: https://chromium-review.googlesource.com/608876 Reviewed-by: Cheng Wang <wangcheng@google.com>	2017-08-15 22:28:15 +00:00
Frank Barchard	56bbcdf422	Reintroduce the max version of scale add ScaleMaxSamples_NEON function with max done on original values. TBR=kjellander@chromium.org BUG=libyuv:717 TEST=LibYUVPlanarTest.TestScaleMaxSamples_Opt Change-Id: Id99338860782b10ffd24f66242eb42014c2e229e Reviewed-on: https://chromium-review.googlesource.com/614685 Reviewed-by: Frank Barchard <fbarchard@google.com> Reviewed-by: Cheng Wang <wangcheng@google.com>	2017-08-14 23:33:56 +00:00
Frank Barchard	83ca1abe09	Change ScaleSumSamples to return Sum of Squares TBR=kjellander@chromium.org BUG=libyuv:717 TEST=LibYUVPlanarTest.TestScaleSumSamples_Opt Change-Id: I5208666f3968c5c4b0f1b0c951f24216d78ee3fe Reviewed-on: https://chromium-review.googlesource.com/607184 Reviewed-by: Cheng Wang <wangcheng@google.com>	2017-08-09 22:19:45 +00:00
Frank Barchard	8676ad7004	scale float samples and return max value BUG=libyuv:717 TEST=ScaleSum unittest to compare C vs Arm implementation TBR=kjellander@chromium.org Change-Id: Iaa7af5547d979aad4722f868d31b405340115748 Reviewed-on: https://chromium-review.googlesource.com/600534 Reviewed-by: Cheng Wang <wangcheng@google.com>	2017-08-04 23:34:30 +00:00

1 2 3 4 5 ...

517 Commits