The original src_u calculation of FOURCC_I420 shifted half width if
crop_y is odd.
This CL fixs the problem and also add a test case for it.
Bug: b:115278653
Test: pass libyuv_unittest
Change-Id: Ia9732d22e64e13de26df47726ba44ad1c5a06484
Reviewed-on: https://chromium-review.googlesource.com/c/1258743
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
This was changed in 21be9122aadf7824efe3fc19b2a09ff253a688e1.
Change-Id: I6c04dc92f673557e10c231bd090ec8aa88b6bee4
Reviewed-on: https://chromium-review.googlesource.com/1146183
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Currently, libyuv supports MIPS SIMD Arch(MSA),
but libyuv does not supports MultiMedia Instruction(MMI)(such as loongson3a platform).
In order to improve performance of libyuv on loongson3a platform,
this provides optimize 98 functions with mmi.
BUG=libyuv:804
Change-Id: I8947626009efad769b3103a867363ece25d79629
Reviewed-on: https://chromium-review.googlesource.com/1122064
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Each byte is converted to float (0.0 to 255.0) and then multiplied
by a scale parameter.
Bug: None
Test: arm 64 build passes.
Change-Id: I04736798540b8d985f60abdf0388e24a209d075b
Reviewed-on: https://chromium-review.googlesource.com/930226
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Ian Field <ianfield@google.com>
Bug: libyuv:750
Test: builds and runs and passes more tidy tests
Change-Id: I023699a7aa61ea3f5e4a21647112691ea5739281
Reviewed-on: https://chromium-review.googlesource.com/902170
Reviewed-by: Weiyong Yao <braveyao@chromium.org>
Convert planar 8 bit formats to planar 16 bit formats.
Includes msan fix for Convert8To16Row_Opt unittest.
I420 is YUV bt.601 8 bits per channel with 420 subsampling.
I010 is YUV bt.601 10 bits per channel with 420 subsampling.
I is color space - bt.601. The function does no color space
conversion so H420ToI010 is aliased to this function as well.
0 = 420 subsampling. The chroma channels are half width / height.
10 = 10 bits per channel, stored in low 10 bits of 16 bit samples.
For SSSE3 version:
out/Release/libyuv_unittest --gtest_filter=*LibYUVConvertTest.I420ToI010_Opt --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=999 --libyuv_flags=-1 --libyuv_cpu_info=-1
[ RUN ] LibYUVConvertTest.I420ToI010_Opt
[ OK ] LibYUVConvertTest.I420ToI010_Opt (276 ms)
Bug: libyuv:751
Test: LibYUVConvertTest.I420ToI010_Opt
Change-Id: I072876ee4fd74a2b74f459b628838bc808f9bdd2
Reviewed-on: https://chromium-review.googlesource.com/846421
Reviewed-by: Miguel Casas <mcasas@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
LIBYUV_SSSE3_ONLY was for functions that have SSE2 and SSSE3 but are compiling for SSSE3, so SSE2 will never be used.
Remove the SSE2 implementation of ARGBSHUFFLEROW_SSE2 and rely on SSSE3.
Bug: libyuv: 769
Test: ~/intelsde/sde -p4 -- out/Release/libyuv_unittest --gtest_filter=LibYUVConvertTest.ARGBToABGR_Opt
Change-Id: I7443f4d8ee3c6f47edd2cf1d5a1eb0f8d7a1eeeb
Reviewed-on: https://chromium-review.googlesource.com/846541
Reviewed-by: Weiyong Yao <braveyao@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Bug: libyuv:765
Test: build for mips still passes
Change-Id: I99105ad3951d2210c0793e3b9241c178442fdc37
Reviewed-on: https://chromium-review.googlesource.com/826404
Reviewed-by: Weiyong Yao <braveyao@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
BUG=None
TEST=try bots and lint test
Change-Id: I1ab462adf2d309117862c5eb4b244a61ae202951
Reviewed-on: https://chromium-review.googlesource.com/450658
Commit-Queue: Frank Barchard <fbarchard@google.com>
Reviewed-by: Henrik Kjellander <kjellander@chromium.org>
R=fbarchard@google.com
BUG=libyuv:634
Performance Gains :- (vs C vectorized)
I422ToARGBRow_MSA : ~1.6x
I422ToRGBARow_MSA : ~1.6x
I422ToARGBRow_Any_MSA : ~1.58x
I422ToRGBARow_Any_MSA : ~1.6x
Performance Gains :- (vs C non-vectorized)
I422ToARGBRow_MSA : ~7x
I422ToRGBARow_MSA : ~7x
I422ToARGBRow_Any_MSA : ~6.9x
I422ToRGBARow_Any_MSA : ~6.8x
Regarding performance measurement, We have created standalone tests which pass in row's data from a 1920x1080 filled buffer to both the C and MSA functions. And such N iterations are executed to get more accurate timings of C vs MSA.
Review URL: https://codereview.chromium.org/2430313005 .
Halffloats have a limited range. It shouldnt normally come up, but if the scale value passed in produces a small value, the half floats will be denormals, which are slow and/or flust to zero. This test ensures they behave the same in C and SIMD and tests the performance of denormals.
TEST=TestHalfFloatPlane_denormal
BUG=libyuv:560
R=hubbe@chromium.org
Review URL: https://codereview.chromium.org/2424233004 .
YUY2ToI422_Any_Neon previously required 16 pixels and duplicated
the last pixel. The replication was not necessary after a previous
change to treat YUY2 to 4 byte macro pixels.
TBR=harryjin@google.com
BUG=libyuv:648
TESTED=util/android/test_runner.py gtest -s libyuv_unittest -t 7200 --verbose --release --gtest_filter=*YUY2ToI422* -a "--libyuv_width=17 --libyuv_height=7 --libyuv_repeat=999 --libyuv_flags=1"
Review URL: https://codereview.chromium.org/2399143002 .
As per the preparation patch added in Chromium sources at,
2150943003: Add MIPS SIMD Arch (MSA) build flags for GYP/GN builds
This patch adds first MSA optimized function in libYUV project.
BUG=libyuv:634
R=fbarchard@google.com
Review URL: https://codereview.chromium.org/2285683002 .
Add public methods SplitUVPlanes and MergeUVPlanes based on the
optimized assembly functions that already exists. Also, de-duplicate the
CPU dispatching code for these functions by moving them to helper
functions.
BUG=libyuv:629
R=braveyao@chromium.org
Review URL: https://codereview.chromium.org/2277603004 .
blur requires memory be aligned. change the unittest allocator to guarantee 64 byte alignment.
re-enable blur any test that fails if memory is unaligned.
TBR=harryjin@google.com
BUG=libyuv:596,libyuv:594
TESTED=local build passes with row.h removed from tests.
Review URL: https://codereview.chromium.org/2019753002 .
When using C version of I420Interpolate for msan, a 50% interpolation
would cause stride to be cast to int, which could cause erroneous
memory reads on 64 bit build.
This CL makes the stride use ptrdiff_t for HalfRow_C
BUG=libyuv:582
TESTED=try bots tests
R=dhrosa@google.com
Review URL: https://codereview.chromium.org/1872953002 .
When attempting to normalize function names to end in Row_SIMD it was made
harder with MIPS_DSPR2 naming convention.
Other CPUs do not include the vendor. This should be named consistently.
Removed the DISABLE_MIPS in favour of DISABLE_ASM for consistency with other
processors.
TBR=harryjin@google.com
BUG=libyuv:562
Review URL: https://codereview.chromium.org/1677633002 .
When width was odd Y channel wrote an extra pixel.
This change splits the Y from UV into a temporary
buffer and memcpy's to the destination. Performance
is slower.
Was
YUY2ToNV12_Any (307 ms)
YUY2ToNV12_Unaligned (213 ms)
TestYUY2ToNV12 (181 ms)
YUY2ToNV12_Opt (177 ms)
YUY2ToNV12_Invert (177 ms)
Npw
YUY2ToNV12_Any (300 ms)
YUY2ToNV12_Unaligned (226 ms)
YUY2ToNV12_Invert (206 ms)
TestYUY2ToNV12 (184 ms)
YUY2ToNV12_Opt (181 ms)
TBR=harryjin@google.com
BUG=libyuv:545
Review URL: https://codereview.chromium.org/1593833002 .
Remove inaccurate specializations for 1/4 and 3/4, since they round
incorrectly. Specialize for 100% and 50% are kept due to performance.
Make C and ARM code match SSSE3.
Make unittests expect zero difference.
BUG=libyuv:535
R=harryjin@google.com
Review URL: https://codereview.chromium.org/1533643005 .