The following functions are added:
planar YUV:
I410ToAR30, I410ToARGB
planar YUVA:
I010AlphaToARGB, I210AlphaToARGB, I410AlphaToARGB
biplanar YUV:
P010ToARGB, P210ToARGB
P010ToAR30, P210ToAR30
biplanar functions can also handle 12 bit and 16 bit samples.
libyuv_unittest --gtest_filter=LibYUVConvertTest.*10*ToA*:LibYUVConvertTest.*P?1?ToA*
R=fbarchard@chromium.org
Bug: libyuv:751, libyuv:844
Change-Id: I2be02244dfa23335e1e7bc241fb0613990208de5
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2707003
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
new color util to compute constants needed based on white point.
[ RUN ] LibYUVColorTest.TestFullYUVV
hist -2 -1 0 1 2
red 0 1627136 13670144 1479936 0
green 319285 3456836 9243059 3440771 317265
blue 0 1561088 14202112 1014016 0
Bug: libyuv:877, b/178283356
Change-Id: If432ebfab76b01302fdb416a153c4f26ca0832d6
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2678859
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
These functions convert between planar and interleaved ARGB,
optionally fill 255 to alpha / discard alpha.
This can help handle YUV(A) with Identity matrix, which is
basically planar ARGB.
libyuv_unittest --gtest_filter=LibYUVPlanarTest.*ARGBPlane*:LibYUVPlanarTest.*XRGBPlane*
R=fbarchard@google.com
Change-Id: I522a189b434f490ba1723ce51317727e7c5eb112
Bug: libyuv:877
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2649887
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
MAKEYUVCONSTANTS macro to generate struct for YUV to RGB
Fix I444AlphaToARGB unit test for ARM by adjusting C version to match Neon implementation.
Bug: libyuv:879, libyuv:878, libyuv:877, libyuv:862, b/178283356
Change-Id: Iedb171fbf668316e7d45ab9e3481de6205ed31e2
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2646472
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Wan-Teh Chang <wtc@google.com>
This includes:
- fixing a handrolled raw exec-based DEPS parser that was failing
to parse Str, similar to crbug.com/1106435.
- rolling chromium forward by nearly a year. (The last roll that
landed was crrev.com/c/1797295). This required a bunch of changes in
order to be able to successfully sync, run gn, and compile:
- switching the mirrors for three repositories to match chromium,
which switched in crrev.com/c/2062580.
- making libyuv write an empty gclient_args file
- adding a few build_override gn arguments
- adding nasm as a deps entry, as it's now required by libjpeg_turbo
- android:
- adding jdk, libunwindstack, and turbine
- rolling the android sdk
- rolling bazel and r8
- rolling the cipd packages managed by third_party/android_deps
- adding six and requests to .vpython for the test runner
- switching to memcpy in a few places to avoid SIGBUS errors on
arm due to unaligned reads
- linux:
- checking out instrumented libraries for msan (including adding
depot_tools to deps for the hook)
- mac:
- adding mac_xcode_version to gclient_gn_args
- win:
- limit mac_toolchain to checkout_mac
Bug: 1063768, 1097306
Change-Id: Idd86fffcdac174fd2f7899243a56af4f1ed8077e
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2384320
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
Commit-Queue: Mirko Bonadei <mbonadei@chromium.org>
Intel
Was ARGBSubtract_Opt (1760 ms)
Now ARGBSubtract_Opt (1546 ms)
ARM
Was ARGBAdd_Opt (1747 ms)
Now ARGBAdd_Opt (1260 ms)
Bug: None
Change-Id: I52436f6390b6b7313f2a8820833bb4f60ae958be
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2299639
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Neon move prfm after loads for all functions. Example performance improvement
Was
I444ToARGB_Opt (3275 ms)
I444ToNV12_Opt (1509 ms)
Now
I444ToARGB_Opt (2751 ms)
I444ToNV12_Opt (1367 ms)
Bug: libyuv:447
Change-Id: I78bf797b3600084c1eceb0be44cdbc9a575de803
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2189559
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
1. Switch to 8 bit precision.
2. Fix an error in the implementation of MMI and MSA.
About the error:
MMI and MSA implementation for RGBtoY and RGBToYJ used different
precision according to the C implementation( The C version has been
unified in commit fce0fed542001577e6b10f4cf859e0fa1774974e). This
patch unifies the precision to 8 bit for RGBToYJ in MMI and MSA.
Change-Id: Ic6a6e424d27a2f049b0c954f03174192d2beb091
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2155608
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Call a row function for each row, based on ARGBToI400 code.
But implement row functions as 2 step conversion. Adds the
row functions:
RAWToYJ, RGBToYJ, SSSE3 and AVX2 versions, and Any versions.
The smaller row buffer is more cache friendly on large images.
The max cache size can be configured, and is currently:
// Maximum temporary width for wrappers to process at a time, in pixels.
And the row buffer is
SIMD_ALIGNED(uint8_t row[MAXTWIDTH * 4]);
So 8192 bytes are used for the row buffer, leaving the rest for source
and destination buffers.
blaze-bin/third_party/libyuv/libyuv_test '--gunit_filter=*R*To?400_Opt' --libyuv_width=3600 --libyuv_height=2500 --libyuv_repeat=1000 --libyuv_flags=-1 --libyuv_cpu_info=-1 | sortms
Was
RAWToJ400_Opt (3996 ms)
ARGBToI400_Opt (3964 ms)
RGB24ToJ400_Opt (3960 ms)
ARGBToJ400_Opt (3909 ms)
RGBAToJ400_Opt (3885 ms)
Now
ARGBToJ400_Opt (4091 ms)
ARGBToI400_Opt (3936 ms)
RGBAToJ400_Opt (3428 ms)
RGB24ToJ400_Opt (3324 ms)
RAWToJ400_Opt (3309 ms)
Bug: libyuv:854, b/147753855
Change-Id: Ieb65fbda94e812c737f4c3c74107354b73c4bcd2
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2016203
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
This pulls in the changes that Firefox made to add BT.2020 support as well
as expands them to the existing 10-bit support. So we now have the following
input formats: U420, U422, U444, U010.
BUG=960620, libyuv:845
Change-Id: If0c47853a465d0ed660f849db08e71489fe1b9c2
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1884468
Commit-Queue: Dale Curtis <dalecurtis@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Recent versions of Clang started warning when the loop doesn't get
vectorized, such as when compiling with -Oz (see bug).
To fix the build, remove the pragma and let the compiler decide on its
own when to vectorize.
Bug: chromium:1015665
Change-Id: I40a610c9e0d94cfd577a6cd2b01e6fdaa08bef7d
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1872580
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Hans Wennborg <hans@chromium.org>
Replace ARM64 only row function with high level function
that implements SSSE3, 32 bit Neon and C.
Compared to 2 step RAWToARGB + ARGBToRGBA on row level:
3.1x faster on ARM
6.2% faster on Intel
BUG=b/140748379
Change-Id: Ia8636d9e4fcdbe10b8c2e81610a54728e29845cd
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1860914
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Neon and GCC Intel optimized, but win32 and mips not optimized.
BUG=libyuv:842, b/141482243
Change-Id: Ia56fa85c8cc1db51f374bd0c89b56d21ec94afa7
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1825642
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Miguel Casas <mcasas@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Apply clang-format to fix jpeg if() for lint fix.
Change comments about 4th pixel for open source compliance.
Rename UVToVU to SwapUV for consistency with MergeUV.
BUG=b/135532289, b/136515133
Change-Id: I9ce377c57b1d4d8f8b373c4cb44cd3f836300f79
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1685936
Reviewed-by: Chong Zhang <chz@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Alternatives to RGB24 and AYUV for working with GPU.
BUG=libyuv:832
TESTED=out/Release/libyuv_unittest --gtest_filter=*NV21To???24* --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=1000 --libyuv_flags=-1 --libyuv_cpu_info=-1
R=rrwinterton@gmail.com
Change-Id: I5559c63f4bd4c847492fcb1571f7b03c58146689
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1501735
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
AVX2 port of SSSE3 conversion to output 24 bit RGB
Bug: libyuv:778
Test: LibYUVConvertTest.NV21ToRGB24_Opt
Change-Id: I14f7815522d1b790ecd2bb39d9a3441e803b694a
Reviewed-on: https://chromium-review.googlesource.com/953303
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Use 2 step conversion for NV21ToRGB24 to leverage AVX2
low levels instead of C.
Was C
NV21ToRGB24_Opt (882 ms)
Now SSSE3
NV21ToRGB24_Opt (218 ms)
Bug: libyuv:778
Test: LibYUVConvertTest.NV21ToRGB24_Opt
Change-Id: I58faf766bbec4cc595aab2e217f6c874dd4b4363
Reviewed-on: https://chromium-review.googlesource.com/951629
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Each byte is converted to float (0.0 to 255.0) and then multiplied
by a scale parameter.
Bug: None
Test: arm 64 build passes.
Change-Id: I04736798540b8d985f60abdf0388e24a209d075b
Reviewed-on: https://chromium-review.googlesource.com/930226
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Ian Field <ianfield@google.com>
AR30ToARGB will vectorize if the output is masked
together as an int instead of 4 byte stores.
Performance is 2x faster
Was AR30ToARGB_Opt (1585 ms)
Now AR30ToARGB_Opt (746 ms)
Bug: libyuv:777
Test:LibYUVConvertTest.AR30ToARGB_Opt
Change-Id: Idd47ae599d5d125207bb53e618d6d7e784d4a37c
Reviewed-on: https://chromium-review.googlesource.com/923169
Reviewed-by: Miguel Casas <mcasas@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
vpshufb is used to reverse R and B channels;
Code is otherwise the same as ARGBToAR30.
Bug: libyuv:751
Test: ABGRToAR30 unittest
Change-Id: I30e02925f5c729e4496c5963ba4ba4af16633b3b
Reviewed-on: https://chromium-review.googlesource.com/891807
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
ABGR is the more common format on Android.
This CL converts 10 bit AR30, to standard 8 bit ABGR.
Unoptimized but allows better testing and feature completeness.
Bug: libyuv:751
Test: LibYUVConvertTest.AR30ToABGR_Opt
Change-Id: I0c7e7273158be215129e0a1d355587ae15942299
Reviewed-on: https://chromium-review.googlesource.com/891694
Reviewed-by: Miguel Casas <mcasas@chromium.org>
Removes macros that were part of standard basic_types
header but not used by libyuv itself.
TBR=braveyao@chromium.org
Bug: libyuv:774
Test: try bots still build
Change-Id: I8de6fad5a9277df0a50959881392ba212b1b5972
Reviewed-on: https://chromium-review.googlesource.com/879591
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Switch YUV conversion macro to output 16 bits per channel.
STOREAR30 macro to output AR30.
[ RUN ] LibYUVConvertTest.TestH420ToARGB
uniques: B 220, G, 220, R 220
[ OK ] LibYUVConvertTest.TestH420ToARGB (0 ms)
[ RUN ] LibYUVConvertTest.TestH010ToARGB
uniques: B 256, G, 256, R 256
[ OK ] LibYUVConvertTest.TestH010ToARGB (0 ms)
[ RUN ] LibYUVConvertTest.TestH010ToAR30
uniques: B 883, G, 883, R 883
[ OK ] LibYUVConvertTest.TestH010ToAR30 (0 ms)
Bug: libyuv:751
Test: LibYUVConvertTest.H010ToAR30_Opt
Change-Id: I902b718e2c8b68ede69625ccafebc6519d5af70d
Reviewed-on: https://chromium-review.googlesource.com/869511
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Miguel Casas <mcasas@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
SSSE3 optimized 10 bit YUV conversion to ARGB in single step.
Bug: libyuv:751
Test: I010ToARGB
Change-Id: I234b2850e35992113ee6bd638732bafc7010a60d
Reviewed-on: https://chromium-review.googlesource.com/848238
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Convert planar 8 bit formats to planar 16 bit formats.
Accepts a parameter that determines the number of bits.
Bug: libyuv:751
Test: Convert8To16 unittest
Change-Id: I8f6ffe64428ddf5769b87e0c069093a50a2541e9
Reviewed-on: https://chromium-review.googlesource.com/835410
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Initial AR30ToARGB function to allow converion
from AR30 to other formats if necessary and/or
for testing.
Not optimized at this point.
Bug: libyuv:751
Test: LibYUVConvertTest.AR30ToARGB_Opt
Change-Id: I38ef192315240f3caa7aee0218b38d5e88a2849f
Reviewed-on: https://chromium-review.googlesource.com/833025
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
This version of the H010ToAR30 provides a 3 step conversion
Convert16To8Row_AVX2
H420ToARGB_AVX2
ARGBToAR30_AVX2
Low level function added to convert 16 bit to 8 bit using multiply
to adjust 10 bit or other bit depths and then save the upper 16 bits.
Bug: libyuv:751
Test: LibYUVPlanarTest.Convert16To8Row_Opt unittest added
Change-Id: I9cc576fda8afa1003cb961d03e0e656e0b478f03
Reviewed-on: https://chromium-review.googlesource.com/783554
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>
When converting from lsb 10 bit formats to msb, the values
need to be shifted to the top 10 bits. Using a multiply
allows the different numbers of bits to be copied:
// 128 = 9 bits
// 64 = 10 bits
// 16 = 12 bits
// 1 = 16 bits
Bug: libyuv:751
Test: LibYUVPlanarTest.MultiplyRow_16_Opt
Change-Id: I9cf226053a164baa14155215cb175065b1c4f169
Reviewed-on: https://chromium-review.googlesource.com/762951
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
H010 is 10 bit planar format with 10 bits in lower bits.
P010 is 10 bit biplanar format with 10 bits in upper bits.
This function weaves the U and V channels and shifts the bits
into the upper bits.
Bug: libyuv:751
Test: LibYUVPlanarTest.MergeUV10Row_Opt
Change-Id: I4a0bac0ef1ff95aa1b8d68261ec8e8e86f2d1fbf
Reviewed-on: https://chromium-review.googlesource.com/752692
Reviewed-by: Cheng Wang <wangcheng@google.com>
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
add ScaleMaxSamples_NEON function with max
done on original values.
TBR=kjellander@chromium.org
BUG=libyuv:717
TEST=LibYUVPlanarTest.TestScaleMaxSamples_Opt
Change-Id: Id99338860782b10ffd24f66242eb42014c2e229e
Reviewed-on: https://chromium-review.googlesource.com/614685
Reviewed-by: Frank Barchard <fbarchard@google.com>
Reviewed-by: Cheng Wang <wangcheng@google.com>
Uses 1 add instead of 2 leas to reduce port pressure on ports 1 and 5
used for SIMD instructions.
BUG=libyuv:670
TEST=~/iaca-lin64/bin/iaca.sh -arch HSW out/Release/obj/libyuv/row_gcc.o
Change-Id: I3965ee5dcb49941a535efa611b5988d977f5b65c
Reviewed-on: https://chromium-review.googlesource.com/433391
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
ARGBToUV_C and ARGBToUVJ_C are generated functions with subtle
difference in rounding. Adding comment to make them easier to find.
TBR=kjellander@chromium.org
BUG=libyuv:634
TEST=untested
Change-Id: I9912d256a1e04c58475d33bdb472c37484f6cab9
Reviewed-on: https://chromium-review.googlesource.com/434980
Reviewed-by: Frank Barchard <fbarchard@google.com>
Commit-Queue: Frank Barchard <fbarchard@google.com>
YUV 411 is very uncommon format. Remove support.
Update documentation to reflect that 411 is deprecated.
Simplify tests for YUV to only test with the new side by side YUV but keep old 3 plane test around with a macro for now.
BUG=libyuv:645
R=kjellander@chromium.org
Review URL: https://codereview.chromium.org/2406123002 .
Original bt709 color space coefficients were full range yuv for higher
quality. This change makes the coefficients use the video constrained
color space the same as bt601 which is 16 to 240 for Y and 16 to 235 for
chroma channels.
BUG=libyuv:639
TEST=libyuv unittests run locally
R=hubbe@chromium.org
Review URL: https://codereview.chromium.org/2367253003 .
On visual c 2013 and earlier a warning is generated if externs
are not declared with the same alignment as the declaration, when
using /ltcg
BUG=libyuv:633
TEST=standalong test built with cl /Bv /GL /Ox /nologo a.cc b.cc /link /ltcg
R=skal@google.com
Review URL: https://codereview.chromium.org/2291533004 .
When using C version of I420Interpolate for msan, a 50% interpolation
would cause stride to be cast to int, which could cause erroneous
memory reads on 64 bit build.
This CL makes the stride use ptrdiff_t for HalfRow_C
BUG=libyuv:582
TESTED=try bots tests
R=dhrosa@google.com
Review URL: https://codereview.chromium.org/1872953002 .
NV12ToRGB565Row for Intel is implemented as a 2 step conversion:
NV12ToARGBRow_SSSE3 and ARGBToRGB565Row_SSE2
NV12ToARGBRow has an AVX2 version, so this CL implements
NV12ToRGB565Row_AVX2 with call to NV12ToARGBRow_AVX2 and
ARGBToRGB565Row_SSE2.
R=harryjin@google.com
BUG=libyuv:554
Review URL: https://codereview.chromium.org/1687953002 .
Remove inaccurate specializations for 1/4 and 3/4, since they round
incorrectly. Specialize for 100% and 50% are kept due to performance.
Make C and ARM code match SSSE3.
Make unittests expect zero difference.
BUG=libyuv:535
R=harryjin@google.com
Review URL: https://codereview.chromium.org/1533643005 .
Removes low levels for I420ToBGRA and I420ToRAW and reimplements them as I420ToRGBA and I420ToRGB24 with transposed color matrix.
Adds unittests that do 1 step conversion vs 2 steps to test end swapping versions match direct conversions.
R=harryjin@google.com
BUG=libyuv:518
Review URL: https://codereview.chromium.org/1427993004 .
U contributes to B and G. V contributes to R and G.
By swapping U and V, they contribute to the opposite channels. Adjust the matrix so the U contribution is in the matrix location such that it till contribute to the
new B channel and vice versa.
This allows ABGR versions of YUV conversion to use the same low level code as ARGB, just using a different matrix and swapping U and V pointers.
As a result the existing I444ToABGRRow functions are no longer needed and are removed.
Previously this function was only Intel AVX2 optimized for Windwos. Now it is also optimized for Arm and GCC.
ARMv7 Neon
Was LibYUVConvertTest.I444ToABGR_Opt (75971 ms)
Now LibYUVConvertTest.I444ToABGR_Opt (3672 ms)
20.6 times faster.
R=xhwang@chromium.org
BUG=libyuv:515
Review URL: https://codereview.chromium.org/1414133006 .
The any function for handling ARGBToI411 was not handling the pixel
replication correctly. On 422 and odd width was handled by duplicating
a pixel of source. 411 needs replication for remainders of 1, 2 or 3
pixels.
The C version was handling odd width but with an average of the remainder
pixels, which does not match the SIMD 'any' handling off remainder.
This changes the odd width handling to mimic the any version.
TBR=harryjin@google.com
BUG=libyuv:491
Review URL: https://codereview.chromium.org/1411733004 .
yuv constants for bt.601 were previously ported to neon64, as well
as the code to respect other color spaces. But the jpeg and bt.709
colour conversion constants were still in armv7 form. This changes
the constants for aarch64 builds to be compatible with the code.
yuv constants are now passed as const *
Remove Yvu constants which were used for older version on nv21 but not new code.
TBR=harryjin@google.com
BUG=none
Review URL: https://codereview.chromium.org/1398623002 .
SETUP provided by zhongwei.yao@linaro.org
Previously the 64 bit Neon code had hard coded constants in the setup macro
for YUV conversion, while 32 bit Neon code supported the yuvconstants
parameter.
This change accepts the constants passed to the YUV conversion row function,
allowing different color spaces to be respected - naming JPEG and BT.709.
As well as the existing BT.601.
TBR=harryjin@google.com
BUG=libyuv:472
Review URL: https://codereview.chromium.org/1384323002 .
Low level for NV21ToARGB written to accept yuv matrix used by
other YUV to ARGB functions.
Previously NV21 was implemented for Windows using NV12 with a different
matrix that swapped U and V. But the Arm version of the low level does
not allow the matrix U and V contributions to be swapped.
Using a new low level function that reads NV21 and uses the same
yuvconstants as other YUV conversion functions allows an Arm port of
this function.
TBR=harryjin@google.com
BUG=libyuv:500
Review URL: https://codereview.chromium.org/1388273002 .