Arm unveils Cortex-A320 CPU teamed with NPU to target edge AI

Hyperscalers are spending billions of dollars building AI data centers, but as IoT grows, AI computing is also moving out to edge devices more than ever before. Arm aims to meet the growing need for edge AI processing with its Armv9 edge AI platform, which includes the new Arm Cortex-A320 CPU and the Arm Ethos-U85 NPU.

The announcement comes about one year after the unveiling of the Ethos-U85 NPU, at a time when a growing number of edge AI designs incorporate NPUs for added processing power. Together, the new CPU and AI accelerator enable AI models of more than 1 billion parameters to run on-device, according to Paul Williamson, senior vice president and general manager of the IoT Line of Business at Arm.

“We can only realize the potential of AI if we move it to the physical devices and the environments around us and in the world of IoT, it's AI at the edge that matters most,” Williamson said during a media briefing. “Just a few years ago, edge AI workloads were much simpler than today. They were things like basic noise monitoring or anomaly detection, but now the workloads have become much more complex to meet the demands of much more sophisticated use cases.”

Williamson described the Cortex-A320 as “the heart of this new platform,” with the ability to deliver a 10x machine learning performance uplift and a 30% scalar performance uplift compared with its predecessor, the Cortex-A35. He also noted that the Ethos-U85 NPU had previously been paired with the Cortex-M85, which is based on the Armv8.1-M architecture, but now gets a more advanced partner in the Cortex-A320. The NPU can be driven directly from the A320, which improves latency and lowers cost because a separate controller is not required.

Arm partners such as AWS, Siemens, Renesas, Advantech, and Eurotech welcomed news of the new platform. Yasser Alsaied, vice president of IoT at AWS, said, “The new Arm edge AI platform will enable our customers to run nucleus lite, a lightweight device runtime of AWS IoT Greengrass for constrained edge devices with minimal memory needs, on Armv9 technology. This seamless integration between the two technologies provides an optimized solution for developers to build modern edge AI applications like anomaly detection in precision agriculture, smart manufacturing, and autonomous vehicles.”

Arm also announced that it is extending its Kleidi software library, introduced last year, to IoT. Kleidi provides a set of compute libraries for AI framework developers, designed to optimize AI and ML workloads on Arm-based CPUs with no additional developer work. The company said KleidiAI is already integrated into popular AI frameworks used in IoT, such as llama.cpp and, via XNNPACK, ExecuTorch and LiteRT, accelerating the performance of key models, including Meta Llama 3 and Phi-3. As an example, KleidiAI brings up to 70% more performance to the new Cortex-A320 when running Microsoft’s TinyStories dataset on llama.cpp.