Mirror of https://chromium.googlesource.com/libyuv/libyuv (synced 2025-12-06 16:56:55 +08:00)
We cannot use the standard dot-product instructions since the coefficients multiplication results are both added and subtracted, but I8MM supports mixed-sign dot products which work well here. We need to add an additional variant of the coefficient structs since we need negative constants for the elements that were previously subtracted.

Reduction in runtimes observed compared to the previous Neon implementation:

- Cortex-A510: -37.3%
- Cortex-A520: -31.1%
- Cortex-A715: -37.1%
- Cortex-A720: -37.0%
- Cortex-X2: -62.1%
- Cortex-X3: -62.2%
- Cortex-X4: -40.4%

Bug: libyuv:977

Change-Id: Idc3d9a6408c30e1bce3816a1ed926ecd76792236

Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5712928

Reviewed-by: Frank Barchard <fbarchard@chromium.org>

Reviewed-by: Justin Green <greenjustin@google.com>

||
|---|---|---|
| basic_types.h | ||
| compare_row.h | ||
| compare.h | ||
| convert_argb.h | ||
| convert_from_argb.h | ||
| convert_from.h | ||
| convert.h | ||
| cpu_id.h | ||
| loongson_intrinsics.h | ||
| macros_msa.h | ||
| mjpeg_decoder.h | ||
| planar_functions.h | ||
| rotate_argb.h | ||
| rotate_row.h | ||
| rotate.h | ||
| row.h | ||
| scale_argb.h | ||
| scale_rgb.h | ||
| scale_row.h | ||
| scale_uv.h | ||
| scale.h | ||
| version.h | ||
| video_common.h | ||
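
The mixed-sign dot product mentioned in the commit message above can be modelled in portable C. This is a minimal scalar sketch, not the code from the commit: `usdot_lane`, `kSketchUCoeff`, `kSketchVCoeff`, `sketch_argb_to_u` and `sketch_argb_to_v` are hypothetical names, and the BT.601-style coefficients and the `0x8080` bias are illustrative assumptions rather than values taken from the libyuv sources. It shows why signed coefficients matter: once the previously subtracted terms carry negative constants, a single unsigned-by-signed dot product covers both the additions and the subtractions.

```c
#include <stdint.h>

/* Scalar model of one 32-bit lane of a mixed-sign dot product, in the
 * spirit of the Arm I8MM USDOT instruction: four unsigned 8-bit inputs are
 * multiplied by four signed 8-bit coefficients, and the products are added
 * to a signed 32-bit accumulator. */
static int32_t usdot_lane(int32_t acc, const uint8_t px[4],
                          const int8_t coeff[4]) {
  for (int i = 0; i < 4; ++i) {
    acc += (int32_t)px[i] * (int32_t)coeff[i];
  }
  return acc;
}

/* Assumption: illustrative BT.601-style chroma coefficients in B, G, R, A
 * byte order. Terms that would otherwise be subtracted are stored as
 * negative constants; alpha does not contribute. */
static const int8_t kSketchUCoeff[4] = {112, -74, -38, 0};
static const int8_t kSketchVCoeff[4] = {-18, -94, 112, 0};

/* The 0x8080 accumulator seed adds the +128 chroma offset plus rounding
 * before the 8-bit right shift. With these coefficients the sum stays
 * positive, so the shift never acts on a negative value. */
static uint8_t sketch_argb_to_u(const uint8_t bgra[4]) {
  return (uint8_t)(usdot_lane(0x8080, bgra, kSketchUCoeff) >> 8);
}

static uint8_t sketch_argb_to_v(const uint8_t bgra[4]) {
  return (uint8_t)(usdot_lane(0x8080, bgra, kSketchVCoeff) >> 8);
}
```

Under these assumptions a mid-gray pixel maps to the neutral chroma value 128 for both U and V, because each coefficient set sums to zero. A vectorized version would process many pixels per instruction, but the per-lane arithmetic is the same as in `usdot_lane`.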