Intel and AMD Prepare ACE: A New x86 Extension to Accelerate AI on the CPU Itself

Intel and AMD have taken another significant step within the x86 Ecosystem Advisory Group, a consortium established by both companies to coordinate the future of the x86 architecture and to keep individual manufacturers from developing critical features on their own, incompatible paths. The latest development is the release of version 1.15 of the specification for ACE, short for AI Compute Extensions. This new extension to the x86 instruction set is geared towards accelerating artificial intelligence and machine learning workloads, particularly matrix multiplication and low-precision numerical operations.

The core of ACE lies in its approach: it is envisioned neither as a standalone NPU nor as a generic performance enhancement. Instead, it is an instruction layer that lets future x86 CPUs handle typical AI operations more effectively. Matrix multiplication is fundamental to neural networks and language models. While AVX10 can manage such calculations, the ACE technical document acknowledges the limits of traditional SIMD in compute density and scalability. Consequently, ACE introduces matrix primitives that combine AVX vector registers with tile-type registers, aiming for higher performance, scalability, and energy efficiency directly within the CPU.
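The compute-density argument can be illustrated with a generic sketch. It is not taken from the ACE specification and does not model any ACE instruction; the function names are hypothetical. It shows that a matrix product can be built from outer products of two vectors accumulated into a resident tile, and that an outer product performs far more multiply-accumulates per loaded input element than an element-wise SIMD multiply-add.

```python
def matmul_outer_product(a, b):
    """Compute C = A @ B as a sum of rank-1 (outer product) updates."""
    m, k, n = len(a), len(b), len(b[0])
    tile = [[0] * n for _ in range(m)]  # accumulator tile, kept for the whole loop
    for p in range(k):
        col = [a[i][p] for i in range(m)]  # column p of A, one input vector
        row = b[p]                         # row p of B, the other input vector
        for i in range(m):
            for j in range(n):
                tile[i][j] += col[i] * row[j]
    return tile


def macs_per_loaded_element(n):
    """Multiply-accumulates per input element for two vectors of length n."""
    simd_fma = n / (2 * n)             # element-wise: n MACs from 2n inputs
    outer_product = (n * n) / (2 * n)  # outer product: n*n MACs from 2n inputs
    return simd_fma, outer_product


product = matmul_outer_product([[1, 2], [3, 4]], [[5, 6], [7, 8]])
density = macs_per_loaded_element(16)
```

With 16-element vectors, the element-wise form yields 0.5 multiply-accumulates per loaded element and the outer-product form yields 8, which is the kind of gap that motivates keeping an accumulator tile close to the vector unit.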

What ACE Instructions Mean for Future AMD and Intel x86 CPUs

According to the official specification, ACE introduces a new register state, including tile registers and block-scaled registers. It also adds processing operations that consume AVX inputs and operate on this matrix state. Furthermore, it defines data movements between ACE and AVX registers, along with system management mechanisms. In practical terms, this means CPUs will be able to work on data in AI-oriented formats more directly, reducing reliance on less efficient code paths or on external accelerators like NPUs or GPUs for certain operations.

Another crucial aspect is support for low-precision formats. The data types listed for ACE are INT8, INT32, FP32, BF16, FP16, E8M0, FP8, MX FP8, MX FP6, MX FP4, and MX INT8. The reduced-precision formats among these are particularly relevant for inference with quantized models, where lower precision saves bandwidth, memory, and power. The specification also states that compatible implementations must be based on at least AVX10.1, and that various format conversion operations are provided within the AVX10 framework.
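As a rough illustration of what a block-scaled format does, the sketch below quantizes one block of values to 8-bit integers that share a single power-of-two scale. It is a simplification under stated assumptions: the block size of 32 and the power-of-two shared scale follow the convention of the OCP Microscaling (MX) formats as commonly described, the function names are hypothetical, and the code reproduces neither the exact MX bit encodings nor any ACE instruction.

```python
import math

BLOCK = 32  # assumed block size, following the MX convention


def quantize_block(values):
    """Return (shared_exponent, int8_elements) for one block of floats."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return 0, [0] * len(values)
    # Smallest power-of-two scale that keeps every element within [-127, 127].
    exp = math.ceil(math.log2(max_abs / 127))
    scale = math.ldexp(1.0, exp)  # 2.0 ** exp, computed exactly
    elements = [max(-127, min(127, round(v / scale))) for v in values]
    return exp, elements


def dequantize_block(exp, elements):
    """Rebuild approximate floats from the shared exponent and the elements."""
    scale = math.ldexp(1.0, exp)
    return [x * scale for x in elements]


values = [3.0 * math.sin(i) for i in range(BLOCK)]
exp, q = quantize_block(values)
restored = dequantize_block(exp, q)
max_error = max(abs(v - r) for v, r in zip(values, restored))
```

Each block stores one small shared exponent plus 32 narrow elements instead of 32 full-width floats, and the rounding error per element stays within half of the shared scale. That trade is where the bandwidth and memory savings come from.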

This initiative also carries strategic significance. Intel and AMD formed the x86 EAG in 2024 with other ecosystem players to improve compatibility, reduce fragmentation, and give developers a more consistent platform foundation. AMD itself highlighted FRED, AVX10, ChkTag, and ACE as major technical milestones on the group’s first anniversary. The companies also announced a common goal of modernizing x86 in terms of performance, security, and software compatibility. This matters given x86’s past fragmentation with extensions like AVX-512, where processors implemented different subsets of the extension or supported it inconsistently.

For Instance, AMD’s Zen 7 Architecture Is Expected to Apply This New Approach to Accelerate AI Workloads

In essence, ACE aims to prevent a repeat of the AVX-512 scenario. It is presented as a joint specification from Intel and AMD, designed so that developers can optimize libraries, compilers, and frameworks without maintaining entirely separate code paths for each manufacturer. Standardization is therefore one of the foundational reasons behind ACE, and both companies are backing it together.

For now, ACE should not be read as an immediate upgrade for existing processors. The specification itself notes that the document describes technologies in the design phase and that product plans may change. AMD has already discussed improvements related to new AI data types and more AI pipelines in Zen 6 (Ryzen 10000), while Zen 7 (Ryzen 11000) is expected to feature a new matrix engine and AI data format extensions. Even so, the actual impact will depend on when these instructions are implemented in silicon, and on support from operating systems, compilers, libraries like NumPy/SciPy, and frameworks like PyTorch or TensorFlow.

In summary, ACE is another piece in Intel and AMD’s effort to strengthen x86 against alternative architectures and to meet the growing demand for local AI. It does not replace GPUs, NPUs, or dedicated accelerators for massive workloads, but it can make future x86 CPUs more competitive and predictable for AI operations and light inference across servers, workstations, and laptops. The key takeaway is not just the potential performance boost, but that Intel and AMD are aligning the technical foundations of their future CPUs rather than pushing incompatible extensions.