[AArch64] Impose feature dependencies in detection code

The strict architectural requirements between features are reasonably
relaxed and difficult to map out fully, in particular:

* FEAT_DotProd is architecturally available from Armv8.1-A and becomes
  mandatory from Armv8.4-A.

* FEAT_I8MM is architecturally available from Armv8.1-A and becomes
  mandatory from Armv8.6-A. It does not strictly depend on FEAT_DotProd
  being implemented; however, I am not aware of a micro-architecture
  where FEAT_I8MM is implemented without FEAT_DotProd also being
  implemented.

* FEAT_SVE is architecturally available from Armv8.2-A. It does not
  strictly depend on either of FEAT_DotProd or FEAT_I8MM being
  implemented. The only micro-architecture I am aware of where FEAT_SVE
  is implemented without FEAT_DotProd and FEAT_I8MM both also being
  implemented is the Fujitsu A64FX.

* FEAT_SVE2 is architecturally available from Armv9.0-A. If FEAT_SVE2 is
  implemented then FEAT_SVE must also be implemented. Since Armv9.0-A is
  based on Armv8.5-A this implies that FEAT_DotProd is also implemented.
  Interestingly, this means that FEAT_I8MM is not mandatory, since it
  only becomes mandatory from Armv8.6-A (Armv9.1-A); however, I am not
  aware of a micro-architecture where FEAT_SVE2 is implemented without
  all three of the above features also being implemented.

Additionally, when testing under emulation there are sometimes bugs
where even mandatory architectural relationships are broken. For
example, there is one known case where SVE2 may be reported as
available even when SVE is explicitly disabled.

To simplify these dependencies, don't try to enable later extensions
unless earlier extensions are also reported as implemented. This notably
penalises code running on a Fujitsu A64FX, where SVE would no longer be
enabled because FEAT_DotProd and FEAT_I8MM are absent; however, this is
not a likely target for libyuv deployment.
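
As a rough illustration of the ordering described above, the following
standalone sketch applies the same chain (Neon, then DotProd, then I8MM,
then SVE, then SVE2) using early returns rather than nested ifs. The
names ChainedCaps, kHwcap* and the Feature enum, along with their bit
values, are hypothetical placeholders and are not the libyuv or kernel
definitions. It assumes C++14 or later for the constexpr function.

```cpp
// Hypothetical bit assignments, for illustration only. Real code should use
// the platform's hwcap definitions and libyuv's own kCpuHas* flags.
constexpr unsigned long kHwcapDotProd = 1UL << 0;
constexpr unsigned long kHwcapSve = 1UL << 1;
constexpr unsigned long kHwcap2I8mm = 1UL << 0;
constexpr unsigned long kHwcap2Sve2 = 1UL << 1;

enum Feature : int {
  kNeon = 1 << 0,
  kDotProd = 1 << 1,
  kI8mm = 1 << 2,
  kSve = 1 << 3,
  kSve2 = 1 << 4,
};

// Enable each extension only if every earlier extension in the chain
// Neon -> DotProd -> I8MM -> SVE -> SVE2 was also reported.
constexpr int ChainedCaps(unsigned long hwcap, unsigned long hwcap2) {
  int features = kNeon;  // Neon is mandatory on AArch64.
  if (!(hwcap & kHwcapDotProd)) return features;
  features |= kDotProd;
  if (!(hwcap2 & kHwcap2I8mm)) return features;
  features |= kI8mm;
  if (!(hwcap & kHwcapSve)) return features;
  features |= kSve;
  if (!(hwcap2 & kHwcap2Sve2)) return features;
  features |= kSve2;
  return features;
}

// A64FX-like report: SVE without DotProd or I8MM, so only Neon is enabled.
static_assert(ChainedCaps(kHwcapSve, 0) == kNeon, "SVE needs DotProd");

// Emulator-style report: SVE2 set while SVE is clear, so SVE2 is ignored.
static_assert(ChainedCaps(kHwcapDotProd, kHwcap2I8mm | kHwcap2Sve2) ==
                  (kNeon | kDotProd | kI8mm),
              "SVE2 needs SVE");

// Fully populated report: every extension is enabled.
static_assert(ChainedCaps(kHwcapDotProd | kHwcapSve,
                          kHwcap2I8mm | kHwcap2Sve2) ==
                  (kNeon | kDotProd | kI8mm | kSve | kSve2),
              "all features reported");
```

The two partial cases correspond to the trade-offs mentioned above: an
A64FX-style report loses SVE, and an inconsistent emulator report can no
longer enable SVE2 on its own.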

Change-Id: Ifa32f7a43043641f99afb120e591945e136c9fd1
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/5546385
Reviewed-by: Frank Barchard <fbarchard@chromium.org>
George Steed 2024-04-22 10:05:26 +01:00 committed by Frank Barchard
parent ec6f15079f
commit c6632d43ae

@@ -193,17 +193,23 @@ LIBYUV_API SAFEBUFFERS int AArch64CpuCaps(unsigned long hwcap,
   // Neon is mandatory on AArch64, so enable regardless of hwcaps.
   int features = kCpuHasNEON;
 
+  // Don't try to enable later extensions unless earlier extensions are also
+  // reported available. Some of these constraints aren't strictly required by
+  // the architecture, but are satisfied by all micro-architectures of
+  // interest. This also avoids an issue on some emulators where true
+  // architectural constraints are not satisfied, e.g. SVE2 may be reported as
+  // available while SVE is not.
   if (hwcap & YUV_AARCH64_HWCAP_ASIMDDP) {
     features |= kCpuHasNeonDotProd;
-  }
-  if (hwcap2 & YUV_AARCH64_HWCAP2_I8MM) {
-    features |= kCpuHasNeonI8MM;
-  }
-  if (hwcap & YUV_AARCH64_HWCAP_SVE) {
-    features |= kCpuHasSVE;
-  }
-  if (hwcap2 & YUV_AARCH64_HWCAP2_SVE2) {
-    features |= kCpuHasSVE2;
+    if (hwcap2 & YUV_AARCH64_HWCAP2_I8MM) {
+      features |= kCpuHasNeonI8MM;
+      if (hwcap & YUV_AARCH64_HWCAP_SVE) {
+        features |= kCpuHasSVE;
+        if (hwcap2 & YUV_AARCH64_HWCAP2_SVE2) {
+          features |= kCpuHasSVE2;
+        }
+      }
+    }
   }
   return features;
 }
@@ -244,9 +250,9 @@ LIBYUV_API SAFEBUFFERS int AArch64CpuCaps() {
 
   if (have_feature("hw.optional.arm.FEAT_DotProd")) {
     features |= kCpuHasNeonDotProd;
-  }
-  if (have_feature("hw.optional.arm.FEAT_I8MM")) {
-    features |= kCpuHasNeonI8MM;
+    if (have_feature("hw.optional.arm.FEAT_I8MM")) {
+      features |= kCpuHasNeonI8MM;
+    }
   }
   // No SVE feature detection available here at time of writing.
   return features;