Add Cortex-A320 to MIDR decode table #384

Merged: GregoryComer merged 1 commit into pytorch:main from npitre:a320-midr on Apr 30, 2026

@npitre (Contributor) commented Apr 28, 2026

Split out from #379 per review request.

ARM Cortex-A320 (MIDR part 0xD8F) is an ARMv9.2-A efficiency core.
Add its uarch enum value and MIDR decode entry so consumers (XNNPACK,
KleidiAI, etc.) can dispatch optimised kernels when running on this
core.

The A320 implements the ARMv9.2-A mandatory feature set: NEON, SVE2,
dotprod, FP16, BF16, I8MM (per the Cortex-A320 TRM).

The new MIDR/uarch entries are inserted in numerical order alongside
the existing ARMv9 cores added by recent commits (A520, A720, X4,
X925, A725, Lumex variants).
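
As a rough illustration of what a MIDR decode entry does, the sketch below extracts the implementer and part-number fields from a MIDR value and maps part 0xD8F to a uarch identifier. The `sketch_` enum and function names are hypothetical and are not cpuinfo's actual identifiers or code. The A510 (0xD46) and A520 (0xD80) part numbers are included only for context; the 0xD8F value is the one quoted in this PR.

```c
#include <stdint.h>

/* Hypothetical uarch identifiers for this sketch; not cpuinfo's real enum. */
enum sketch_uarch {
    sketch_uarch_unknown = 0,
    sketch_uarch_cortex_a510,
    sketch_uarch_cortex_a520,
    sketch_uarch_cortex_a320,
};

/* MIDR_EL1 field layout:
 *   [31:24] implementer, [23:20] variant, [19:16] architecture,
 *   [15:4]  part number, [3:0]   revision. */
static inline uint32_t midr_implementer(uint32_t midr) {
    return (midr >> 24) & 0xFFu;
}

static inline uint32_t midr_part(uint32_t midr) {
    return (midr >> 4) & 0xFFFu;
}

/* Map a MIDR value to a uarch; anything unrecognised stays "unknown". */
static enum sketch_uarch sketch_decode_midr(uint32_t midr) {
    if (midr_implementer(midr) != 0x41u) { /* 0x41 ('A') = Arm Ltd. */
        return sketch_uarch_unknown;
    }
    switch (midr_part(midr)) {
        case 0xD46: return sketch_uarch_cortex_a510;
        case 0xD80: return sketch_uarch_cortex_a520;
        case 0xD8F: return sketch_uarch_cortex_a320; /* part number from this PR */
        default:    return sketch_uarch_unknown;
    }
}
```

For example, a MIDR of 0x410FD8F0 (Arm implementer, part 0xD8F, revision 0) would decode to the A320 entry in this sketch, while the same part number under a different implementer code would not.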

@meta-cla (bot) added the "cla signed" label on Apr 28, 2026

@fbarchard (Collaborator) left a comment

Once this lands, I'll roll it into XNNPACK so that uarch_cortex_a320 is available. You're welcome to do it yourself, but it's almost easier for me to do it. If you're able to run benchmarks in XNNPACK, you could select the kernels that really are faster on the A320: likely the A55 ones, or whatever the A520 is doing.

@npitre (Contributor Author) commented Apr 28, 2026

Thanks for the approval!

Happy to let you handle the XNNPACK side — appreciate the offer. Mapping to xnn_uarch_cortex_a510 would be a sensible interim choice (closest existing entry); we can revisit once silicon lands. We also have an XNNPACK PR open (#10060) adding generic Zephyr platform support, which is what surfaced the A320 question in the first place.

On benchmarks: not yet, unfortunately — we're running on the Corstone-1000-A320 FVP, which is functionally accurate but not cycle-accurate for the CPU side, so any kernel ordering we'd derive from it would be misleading. Once we have real A320 silicon we'll come back with measurements and can compare against A55 / A520 kernel paths.
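
The interim mapping suggested in this comment could look roughly like the sketch below: a detected A320 is routed to the kernel selection already tuned for the A510 until real measurements exist. The `demo_` enum and function are hypothetical names made up for this illustration and do not reflect XNNPACK's actual dispatch code.

```c
/* Hypothetical uarch identifiers for this sketch; not XNNPACK's real enum. */
enum demo_uarch {
    demo_uarch_unknown = 0,
    demo_uarch_cortex_a55,
    demo_uarch_cortex_a510,
    demo_uarch_cortex_a520,
    demo_uarch_cortex_a320,
};

/* Return the uarch whose tuned kernels should be used for a detected core.
 * Assumption: the A320 borrows the A510 selection as an interim choice,
 * to be revisited once benchmarks on real silicon are available. */
static enum demo_uarch demo_kernel_uarch(enum demo_uarch detected) {
    switch (detected) {
        case demo_uarch_cortex_a320:
            return demo_uarch_cortex_a510; /* interim fallback */
        default:
            return detected; /* all other cores keep their own entry */
    }
}
```

Keeping the fallback in one function means that switching the A320 to its own kernel set later is a one-line change.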

@fbarchard (Collaborator) commented

Also update ./tools/cpu-info.c with the string. Ideally, build and run the tool to confirm that it detects the core.

ARM Cortex-A320 (MIDR part 0xD8F) is an ARMv9.2-A efficiency core.
Add its uarch enum and MIDR mapping so XNNPACK can select optimized
kernels when running on this core.

Signed-off-by: Nicolas Pitre <npitre@baylibre.com>

@npitre (Contributor Author) commented Apr 28, 2026

Done — added the Cortex-A320 string mapping in tools/cpu-info.c and confirmed the tool builds. The Zephyr backend's runtime detection on the A320 FVP correctly resolves to cpuinfo_uarch_cortex_a320 (verified via XNNPACK's cpu_get_uarch reporting 0x300553 once #379 is in place).
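
A string mapping of the kind described here might be sketched as follows. The enum and function names are hypothetical stand-ins rather than the code in tools/cpu-info.c; the only value taken from this thread is 0x300553, the uarch value reported for the A320.

```c
#include <stddef.h>

/* Hypothetical enum for this sketch; only the A320 value comes from the thread. */
enum sketch_uarch_id {
    sketch_uarch_id_unknown = 0,
    sketch_uarch_id_cortex_a320 = 0x00300553, /* value quoted in this thread */
};

/* Return a printable name for a uarch, or NULL if the tool has no string
 * for it, which lets the caller fall back to printing the raw value. */
static const char* sketch_uarch_to_string(enum sketch_uarch_id uarch) {
    switch (uarch) {
        case sketch_uarch_id_unknown:
            return "unknown";
        case sketch_uarch_id_cortex_a320:
            return "Cortex-A320";
        default:
            return NULL;
    }
}
```

Returning NULL for unmapped values is a design choice of this sketch: it makes a missing string entry visible to the caller instead of silently printing a wrong name.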

@GregoryComer merged commit 3681f0c into pytorch:main on Apr 30, 2026 (12 checks passed)