Manojkumar Bhosale
|
288bfbefb5
|
Add MSA optimized remaining scale row functions
R=fbarchard@google.com
BUG=libyuv:634
Performance Gain (vs C, compiler auto-vectorized)
ScaleRowDown2_MSA - ~22.3x
ScaleRowDown2_Any_MSA - ~19.9x
ScaleRowDown2Linear_MSA - ~31.2x
ScaleRowDown2Linear_Any_MSA - ~29.4x
ScaleRowDown2Box_MSA - ~20.1x
ScaleRowDown2Box_Any_MSA - ~19.6x
ScaleRowDown4_MSA - ~11.7x
ScaleRowDown4_Any_MSA - ~11.2x
ScaleRowDown4Box_MSA - ~15.1x
ScaleRowDown4Box_Any_MSA - ~15.1x
ScaleRowDown38_MSA - ~1x
ScaleRowDown38_Any_MSA - ~1x
ScaleRowDown38_2_Box_MSA - ~1.7x
ScaleRowDown38_2_Box_Any_MSA - ~1.7x
ScaleRowDown38_3_Box_MSA - ~1.7x
ScaleRowDown38_3_Box_Any_MSA - ~1.7x
ScaleAddRow_MSA - ~1.2x
ScaleAddRow_Any_MSA - ~1.15x
Performance Gain (vs C, compiler auto-vectorization disabled)
ScaleRowDown2_MSA - ~22.4x
ScaleRowDown2_Any_MSA - ~19.8x
ScaleRowDown2Linear_MSA - ~31.6x
ScaleRowDown2Linear_Any_MSA - ~29.4x
ScaleRowDown2Box_MSA - ~20.1x
ScaleRowDown2Box_Any_MSA - ~19.6x
ScaleRowDown4_MSA - ~11.7x
ScaleRowDown4_Any_MSA - ~11.2x
ScaleRowDown4Box_MSA - ~15.1x
ScaleRowDown4Box_Any_MSA - ~15.1x
ScaleRowDown38_MSA - ~3.2x
ScaleRowDown38_Any_MSA - ~3.2x
ScaleRowDown38_2_Box_MSA - ~2.4x
ScaleRowDown38_2_Box_Any_MSA - ~2.3x
ScaleRowDown38_3_Box_MSA - ~2.9x
ScaleRowDown38_3_Box_Any_MSA - ~2.8x
ScaleAddRow_MSA - ~8x
ScaleAddRow_Any_MSA - ~7.46x
Review-Url: https://codereview.chromium.org/2559683002 .
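Each function in the lists above appears with an _Any_ counterpart. In libyuv, the _Any_ variants accept widths that are not a multiple of the SIMD block size. The sketch below illustrates that wrapper pattern in plain C and is not the MSA code from this change. The names HalveRow_Block, HalveRow_Any, and kBlock are hypothetical. The 32-pixel block size and the choice of the odd source pixel are assumptions made for the example, and a scalar loop stands in for the vector kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical block size: output pixels produced per kernel step. */
enum { kBlock = 32 };

/* Stand-in for a SIMD kernel (scalar here, for illustration only).
 * Point-samples every second source pixel. Like a vector kernel, it
 * requires dst_width to be a positive multiple of kBlock. */
static void HalveRow_Block(const uint8_t* src, uint8_t* dst, int dst_width) {
  assert(dst_width > 0 && dst_width % kBlock == 0);
  for (int x = 0; x < dst_width; ++x) {
    dst[x] = src[2 * x + 1];
  }
}

/* "Any" wrapper: accepts any dst_width >= 0.
 * src must hold at least 2 * dst_width bytes. */
static void HalveRow_Any(const uint8_t* src, uint8_t* dst, int dst_width) {
  int n = dst_width & ~(kBlock - 1); /* largest multiple of kBlock */
  int r = dst_width - n;             /* leftover output pixels */
  if (n > 0) {
    HalveRow_Block(src, dst, n);
  }
  if (r > 0) {
    /* Copy the tail into padded scratch buffers so the block kernel
     * never reads or writes past the caller's rows. */
    uint8_t tmp_src[2 * kBlock] = {0};
    uint8_t tmp_dst[kBlock];
    memcpy(tmp_src, src + 2 * n, (size_t)(2 * r));
    HalveRow_Block(tmp_src, tmp_dst, kBlock);
    memcpy(dst + n, tmp_dst, (size_t)r);
  }
}
```

Calling HalveRow_Any with dst_width = 37 runs the block kernel once on the first 32 output pixels and once more on a padded scratch copy for the remaining 5. That extra copy is one plausible reason the _Any_ figures listed above are slightly lower than their fixed-width counterparts.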
|
2016-12-21 13:39:44 +05:30 |
|