* Run on SiFive internal FPGA:
ARGBToAR64_Opt (~13.7x vs scalar)
ARGBToAB64_Opt (~5.81x vs scalar)
AR64ToARGB_Opt (~15.8x vs scalar)
AB64ToARGB_Opt (~2.40x vs scalar)
LIBYUV_WIDTH=1280 LIBYUV_HEIGHT=720 LIBYUV_REPEAT=10
Bug: libyuv:956
Change-Id: Ida642a5077f59d25fb7c5328f671956b2293dadd
Signed-off-by: Bruce Lai <bruce.lai@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4442913
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
* Run on SiFive internal FPGA:
ARGBToRAW_Opt (~1.55x vs scalar)
ARGBToRGB24_Opt (~1.44x vs scalar)
RGB24ToARGB_Opt (~1.77x vs scalar)
LIBYUV_WIDTH=1280 LIBYUV_HEIGHT=720 LIBYUV_REPEAT=10
Bug: libyuv:956
Change-Id: I26722f6848cd68684d95d9a7ee06ce0416e7985d
Signed-off-by: Darren Hsieh <darren.hsieh@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4413083
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
* Run on SiFive internal FPGA:
RAWToARGB_Opt (~2x vs scalar)
RAWToRGBA_Opt (~2x vs scalar)
RAWToRGB24_Opt (~1.5x vs scalar)
LIBYUV_WIDTH=1280 LIBYUV_HEIGHT=720 LIBYUV_REPEAT=10
Change-Id: I21a13d646589ea2aa3822cb9225f5191068c285b
Signed-off-by: Darren Hsieh <darren.hsieh@sifive.com>
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/4408357
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
- YUV to RGB use linear for first and last row.
- add assert(yuvconstants)
- rename pointers to match row functions.
- use macros that match row functions.
- use 12 bit upsampler for conversions of 10 and 12 bits
Cortex A53 AArch32
I420ToRGB24_Opt (3627 ms)
I422ToRGB24_Opt (4099 ms)
I444ToRGB24_Opt (4186 ms)
I420ToRGB24Filter_Opt (5451 ms)
I422ToRGB24Filter_Opt (5430 ms)
AVX2
Was I420ToRGB24Filter_Opt (583 ms)
Now I420ToRGB24Filter_Opt (560 ms)
Neon Cortex A7
Was I420ToRGB24Filter_Opt (5447 ms)
Now I420ToRGB24Filter_Opt (5439 ms)
Bug: libyuv:938
Change-Id: I1731f2dd591073ae11a756f06574103ba0f803c7
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3906082
Reviewed-by: Justin Green <greenjustin@google.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
The code already exists to use a specific matrix. This CL simply
adds a function to use a generic YUV matrix for the conversion.
Bug: b/241451603
Change-Id: I0eea7e96a891d045905a9c963b56c053097029ec
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3820903
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
When used in C enum keyword can't be eliminated.
Bug: libyuv:872
Change-Id: Iacff5a8bd84ec7caa1f90889e48f81ffc10071ae
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3513317
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Optimize 44 functions in source/row_lsx.cc file.
All test cases passed on loongarch platform.
Bug: libyuv:913
Change-Id: Ic80a5751314adc2e9bd435f2bbd928ab017a90f9
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3351467
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
1. Add supports for LSX and LASX.
2. Three optimization functions are added in loongarch/row_lasx.cc file:
I422ToARGBRow_LASX,I422ToRGBARow_LASX,I422AlphaToARGBRow_LASX.
Bug: libyuv:912, Bug: libyuv:913
Change-Id: I043c2704f99a5215724b5c0b7f97e6bf5f7a199b
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/3329189
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Add J420 output from RAW.
Optimize RGB24 and RAW To J420 on ARM by using NEON for the 2 step conversion.
Also fix sign-compare warning that was breaking Windows build
Bug: libyuv:887, b/183534734
Change-Id: I8c39334552dc0b28414e638708db413d6adf8d6e
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2783382
Reviewed-by: Wan-Teh Chang <wtc@google.com>
The following functions are added:
planar YUV:
I410ToAR30, I410ToARGB
planar YUVA:
I010AlphaToARGB, I210AlphaToARGB, I410AlphaToARGB
biplanar YUV:
P010ToARGB, P210ToARGB
P010ToAR30, P210ToAR30
biplanar functions can also handle 12 bit and 16 bit samples.
libyuv_unittest --gtest_filter=LibYUVConvertTest.*10*ToA*:LibYUVConvertTest.*P?1?ToA*
R=fbarchard@chromium.org
Bug: libyuv:751, libyuv:844
Change-Id: I2be02244dfa23335e1e7bc241fb0613990208de5
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2707003
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
M420 is a row biplanar variation of NV12 supported on Microsoft webcams.
The code was hardcoded to bt.601 and should be jpeg, but the format is
very old and rare. Is a variation on NV12, so if someone needs it, it
can be re-implemented easily.
Bug: libyuv:858
Change-Id: I246167dba3c190cc76af741b8e91e58e68fde28f
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2212608
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Some processors support both MSA and MMI.
when they are enabled together, MSA will be preferd.
This patch move MSA initialization after MMI, so that
MSA can overide MMI and be setted to effective.
Change-Id: I8a52cce83ee4ec9727d47c99b287c9580329b149
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/2155944
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
This adds some missing prototypes from the BT.2020 CL as well as expands
the H444 and J444 results.
BUG=960620, libyuv:845, b/129864744
Change-Id: I8ea3959379f1bb2edb857d4eb90fb9a1f6aa4e03
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1899093
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
This pulls in the changes that Firefox made to add BT.2020 support as well
as expands them to the existing 10-bit support. So we now have the following
input formats: U420, U422, U444, U010.
BUG=960620, libyuv:845
Change-Id: If0c47853a465d0ed660f849db08e71489fe1b9c2
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1884468
Commit-Queue: Dale Curtis <dalecurtis@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Replace ARM64 only row function with high level function
that implements SSSE3, 32 bit Neon and C.
Compared to 2 step RAWToARGB + ARGBToRGBA on row level:
3.1x faster on ARM
6.2% faster on Intel
BUG=b/140748379
Change-Id: Ia8636d9e4fcdbe10b8c2e81610a54728e29845cd
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1860914
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Gaussian blur low levels ported to 32 bit neon.
But they are not hooked up to anything but a unittest.
Bug:b/248041731, b/132108021, b/129908793
Change-Id: Iccebb8ffd6b719810aa11dd770a525227da4c357
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1611206
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Chong Zhang <chz@google.com>
Alternatives to RGB24 and AYUV for working with GPU.
BUG=libyuv:832
TESTED=out/Release/libyuv_unittest --gtest_filter=*NV21To???24* --libyuv_width=1280 --libyuv_height=720 --libyuv_repeat=1000 --libyuv_flags=-1 --libyuv_cpu_info=-1
R=rrwinterton@gmail.com
Change-Id: I5559c63f4bd4c847492fcb1571f7b03c58146689
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/1501735
Reviewed-by: richard winterton <rrwinterton@gmail.com>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
RAW is a big endian style RGB buffer with R first in memory, then G and B.
Convert NV21 and NV12 to RAW format.
Performance on SkylakeX for 720p with AVX2
I420ToRAW_Opt (388 ms)
H420ToRAW_Opt (371 ms)
NV12ToRAW_Opt (341 ms)
NV21ToRAW_Opt (339 ms)
SSSE3
I420ToRAW_Opt (507 ms)
H420ToRAW_Opt (481 ms)
NV12ToRAW_Opt (498 ms)
NV21ToRAW_Opt (493 ms)
C
I420ToRAW_Opt (2287 ms)
H420ToRAW_Opt (2246 ms)
NV12ToRAW_Opt (2191 ms)
NV21ToRAW_Opt (2204 ms)
Performance on Pixel 2 for 720p
out/Release/bin/run_libyuv_unittest -v -t 7200 --gtest_filter=*NV??ToR*Opt --libyuv_repeat=1000 --libyuv_width=1280 --libyuv_height=720
LibYUVConvertTest.NV12ToRGB24_Opt (1739 ms)
LibYUVConvertTest.NV21ToRGB24_Opt (1734 ms)
LibYUVConvertTest.NV12ToRAW_Opt (1719 ms)
LibYUVConvertTest.NV21ToRAW_Opt (1691 ms)
LibYUVConvertTest.NV12ToRGB565_Opt (2152 ms)
Bug: libyuv:778, b:117522975
Test: add new NV21ToRAW and NV12ToRAW tests
Change-Id: Ieabb68a2c6d8c26743e609c5696c81bb14fb253f
Reviewed-on: https://chromium-review.googlesource.com/c/1272615
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Mirko Bonadei <mbonadei@chromium.org>
This was changed in 21be9122aadf7824efe3fc19b2a09ff253a688e1.
Change-Id: I6c04dc92f673557e10c231bd090ec8aa88b6bee4
Reviewed-on: https://chromium-review.googlesource.com/1146183
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Currently, libyuv supports MIPS SIMD Arch(MSA),
but libyuv does not supports MultiMedia Instruction(MMI)(such as loongson3a platform).
In order to improve performance of libyuv on loongson3a platform,
this provides optimize 98 functions with mmi.
BUG=libyuv:804
Change-Id: I8947626009efad769b3103a867363ece25d79629
Reviewed-on: https://chromium-review.googlesource.com/1122064
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
Use 2 step conversion for NV21ToRGB24 to leverage AVX2
low levels instead of C.
Was C
NV21ToRGB24_Opt (882 ms)
Now SSSE3
NV21ToRGB24_Opt (218 ms)
Bug: libyuv:778
Test: LibYUVConvertTest.NV21ToRGB24_Opt
Change-Id: I58faf766bbec4cc595aab2e217f6c874dd4b4363
Reviewed-on: https://chromium-review.googlesource.com/951629
Commit-Queue: Frank Barchard <fbarchard@chromium.org>
Reviewed-by: richard winterton <rrwinterton@gmail.com>