2 Commits

Author SHA1 Message Date
Manojkumar Bhosale
73a6f100a9 Add MSA optimized rotate functions (used 16x16 transpose)
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
TransposeWx16_MSA        - ~6.0x
TransposeWx16_Any_MSA    - ~4.7x
TransposeUVWx16_MSA      - ~6.3x
TransposeUVWx16_Any_MSA  - ~5.4x

Performance Gain (vs C non-vectorized)
TransposeWx16_MSA        - ~6.0x
TransposeWx16_Any_MSA    - ~4.8x
TransposeUVWx16_MSA      - ~6.3x
TransposeUVWx16_Any_MSA  - ~5.4x

Review-Url: https://codereview.chromium.org/2617703002 .
2017-01-13 15:50:02 +05:30
Manojkumar Bhosale
6fa5e4eb78 Add MSA optimized TransposeWx8_MSA and TransposeUVWx8_MSA functions
R=fbarchard@google.com
BUG=libyuv:634

Performance Gain (vs C vectorized)
TransposeWx8_MSA          - ~2.7x
TransposeWx8_Any_MSA      - ~2.1x
TransposeUVWx8_MSA        - ~2.5x
TransposeUVWx8_Any_MSA    - ~2.7x

Performance Gain (vs C non-vectorized)
TransposeWx8_MSA          - ~4.6x
TransposeWx8_Any_MSA      - ~2.9x
TransposeUVWx8_MSA        - ~4.4x
TransposeUVWx8_Any_MSA    - ~3.7x

Review URL: https://codereview.chromium.org/2553403002 .
2016-12-15 10:06:01 +05:30