Arm has come full circle. Before Arm CPUs were licensable designs, they were standalone chips powering Acorn Archimedes systems. Decades later, Arm is offering chips again, this time targeting the opposite end of the performance spectrum. The new Arm AGI processor integrates 136 Neoverse cores running at up to 3.7 GHz. Per-socket memory bandwidth and PCIe connectivity meet or beat those of similarly performing merchant-market server processors.
Arm’s expansion from licensing designs (IP) to selling chips has long been discussed but has been seen as problematic because it would put the company in competition with its licensees. Meanwhile, licensees such as Ampere have attempted to commercialize Arm-compatible server processors with little success. Against this backdrop, Arm’s move requires justification.
Arm’s Transition from IP Licensing to Merchant Silicon
Arm positions its decision to sell finished chips as a response to demand from agentic AI systems and the coming era of artificial general intelligence (AGI), answering the question “What’s changed to justify entering the chip market?” with these buzzwords. When Arm began the AGI project (in early 2023, we estimate, a few months after ChatGPT’s launch), these ideas weren’t common. Moreover, the project builds on the company’s Compute Subsystems (CSS) initiative, which delivers pre-integrated computing subsystems instead of individual IP blocks.
Regardless of the hot AI topics du jour and a seemingly negative environment, Arm has picked a propitious time to enter the server-processor market. Compatible deployments are well underway at hyperscalers; Amazon’s Graviton is in its fifth iteration. Recently, merchant-market Arm server processors finally notched a significant win: Meta agreed to deploy Nvidia Grace.
Again, Meta Plays Lead
Playing the server-processor field, much as it plays the AI-accelerator field, Meta is also Arm’s lead customer and collaborator on the AGI chip. The social media company has little reason to roll its own server processor. Block diagrams of competing hyperscalers’ (and SiPearl’s) chips closely resemble Arm’s Neoverse recipe suggestions with little or no added value. Meta can influence the AGI design. If it requires some special logic, Arm can add it, deactivating the circuits for other customers.
Previous Arm server processors integrated many small cores. This enabled them to score well on per-chip synthetic benchmarks, but their weak per-thread throughput, constricted memory bandwidth, and small cache limited their performance on real-world workloads. Server workloads and their requirements vary. Complex software may benefit from large instruction caches to hold a bigger working set. Programs employing frameworks such as Node.js may have big code footprints because they call a variety of small functions. Database and analytics software may scan large data sets, necessitating capacious DRAM pools and fast access thereto.
Agentic AI increases workload variety. It may entail running client-like software such as scripts and web browsers, characterized by branchy code that pressures a CPU front end. Alternatively, software that analyzes data could load up a CPU’s back end.
Arm AGI Architecture: Neoverse Cores and Memory Hierarchy
To tackle workload variety, successful server processors employ the most powerful CPU designs, big caches, multiple DRAM interfaces, and large physical address spaces. Indeed, Arm endowed the AGI with these characteristics. Its 3.7 GHz clock rate is modest compared with the peak rates of desktop PC processors, but many-core server processors run slower to constrain chip power.
Arm rates the AGI at 300 W. This rating is for TDP, a fuzzy concept like “typical power.” Peak wattage could be higher. The company references the power of various rack configurations, which, when divided by the number of chips implied by their core counts, allocate 600 W per AGI chip. That value includes memory and all other components, indicating 300 W is a reasonable claim, particularly considering the AGI’s speedy DDR5-8800 memory interfaces. The AGI has a dozen of those DRAM channels and can address a total of 3 TB per chip. It also has 96 PCIe Gen6 lanes, which support CXL 3.0 for memory expansion.
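The figures in the preceding paragraph can be sanity-checked with a little arithmetic. The sketch below is a back-of-the-envelope estimate, not an Arm-published calculation: it assumes each DDR5 channel carries 64 data bits (8 bytes) per transfer, and the variable and function names are ours. It computes theoretical peak DRAM bandwidth per socket and per core, plus the power left for memory and other components once the 300 W TDP is subtracted from the 600 W rack share.

```python
# Back-of-the-envelope figures for one Arm AGI socket.
# Values come from the article; names are illustrative.
CORES = 136
DRAM_CHANNELS = 12
TRANSFER_RATE_MT_S = 8800   # DDR5-8800: mega-transfers per second
BYTES_PER_TRANSFER = 8      # ASSUMPTION: 64 data bits per channel
TDP_W = 300                 # Arm's rated TDP
RACK_SHARE_W = 600          # rack power divided by chip count


def peak_dram_bandwidth_gb_s(channels: int, mt_s: int, bytes_per_transfer: int) -> float:
    """Theoretical peak bandwidth in GB/s (decimal), ignoring protocol overhead."""
    return channels * mt_s * bytes_per_transfer / 1000


bandwidth_gb_s = peak_dram_bandwidth_gb_s(
    DRAM_CHANNELS, TRANSFER_RATE_MT_S, BYTES_PER_TRANSFER
)
bandwidth_per_core_gb_s = bandwidth_gb_s / CORES
non_cpu_power_w = RACK_SHARE_W - TDP_W  # memory, NICs, fans, power conversion

print(f"Peak DRAM bandwidth: {bandwidth_gb_s:.1f} GB/s per socket")
print(f"Per core:            {bandwidth_per_core_gb_s:.2f} GB/s")
print(f"Non-CPU power share: {non_cpu_power_w} W per chip")
```

Sustained bandwidth will be lower than this theoretical peak, so the per-core number is best read as an upper bound.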
Comparing Arm AGI to x86 (AMD Epyc 9755 Turin)
By comparison, the AMD Epyc 9755 (Turin) integrates 128 Zen 5 cores. They can sprint to 4.1 GHz, neutralizing the 9755’s core-count disadvantage. However, its base clock is only 2.7 GHz. These clock comparisons ignore per-cycle instruction throughput (IPC). Previous-generation Neoverse and AMD processors performed similarly on average but differed on specific workloads. In its favor, the AGI has 2 MB of L2 cache per core, whereas the Epyc has only 1 MB.
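One crude way to see how clock rate offsets core count is to multiply the two into an aggregate "core-GHz" figure. The sketch below does so for the numbers quoted above. It is illustrative only: core-GHz is our shorthand rather than a standard metric, it ignores IPC as already noted, and it assumes every core could hold the quoted clock at once, which neither vendor promises.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    """Illustrative record; field names are ours, values are from the article."""
    name: str
    cores: int
    clock_ghz: float

    def core_ghz(self) -> float:
        # Aggregate clock capacity, assuming all cores run at clock_ghz.
        return self.cores * self.clock_ghz


agi = Chip("Arm AGI (max)", 136, 3.7)
epyc_boost = Chip("Epyc 9755 (boost)", 128, 4.1)
epyc_base = Chip("Epyc 9755 (base)", 128, 2.7)

for chip in (agi, epyc_boost, epyc_base):
    print(f"{chip.name:<20} {chip.core_ghz():6.1f} core-GHz")
```

At boost clocks the Epyc edges ahead despite having fewer cores; at base clocks it falls well behind, which is why sustained all-core frequency matters more than either headline number.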
Moreover, AMD rates its chip at 500 W TDP, allowing five Arm processors in the same power budget as three Epycs. Arm expects a fully stuffed 36 kW rack to deliver twice the performance of a comparable x86 design. However, the RISC vendor has yet to provide standard benchmarks or supply systems for third-party evaluation, and AMD plans to ship Zen 6 Epyc (Venice) processors this year.
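The five-to-three ratio follows directly from the two TDP ratings. The sketch below finds the smallest power budget that both ratings divide evenly, then counts chips and cores within it. It considers CPU TDP only, leaving out memory and other system power, and the variable names are ours.

```python
from math import gcd

# TDP and core counts as quoted in the article.
ARM_TDP_W, ARM_CORES = 300, 136
EPYC_TDP_W, EPYC_CORES = 500, 128

# Smallest budget that holds a whole number of each chip (least common multiple).
budget_w = ARM_TDP_W * EPYC_TDP_W // gcd(ARM_TDP_W, EPYC_TDP_W)

arm_chips = budget_w // ARM_TDP_W
epyc_chips = budget_w // EPYC_TDP_W
arm_cores = arm_chips * ARM_CORES
epyc_cores = epyc_chips * EPYC_CORES

print(f"Budget: {budget_w} W")
print(f"Arm AGI:   {arm_chips} chips, {arm_cores} cores")
print(f"Epyc 9755: {epyc_chips} chips, {epyc_cores} cores")
print(f"Core-count ratio: {arm_cores / epyc_cores:.2f}x")
```

The resulting core-count advantage falls short of 2x, so Arm’s claimed rack-level doubling would also have to draw on per-core performance or system-level differences that only independent benchmarks can confirm.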
Looking Forward
Arm has an aggressive short-term roadmap. Next year, it plans to release an AGI 2 based on the upcoming Neoverse CSS V4. Arm updates its Lumex (née Cortex) mobile CPUs annually, but the changes are usually small. The company is unlikely to ramp up the Neoverse cadence to match. However, the Neoverse V3 derives from the Cortex-X4 microarchitecture, and a refresh is due.
Bottom Line
Selling chips puts Arm in competition with its design and architecture licensees, but we do not foresee the AGI adding friction between Arm and them:
- Hyperscalers with in-house Arm processors aren’t hurt by the company selling chips to other customers and could prefer to buy the off-the-shelf AGI instead of continuing to assemble their own chips.
- Ampere is owned by SoftBank, Arm’s largest shareholder.
- Nvidia is too busy printing money to care about Arm AGI taking design wins from Grace or Vera.
- Qualcomm has yet to announce an Oryon-based server processor, but the animosity between Qualcomm and Arm can’t get any greater.
Hyperscalers have licensed Arm’s Neoverse IP and assembled general-purpose chips, but it’s more important that they have ported software to the architecture. Licensees can tweak their Arm-based designs to address unique requirements, yet well-crafted server processors excel at many tasks. Thus, Meta found the make-versus-buy decision favored buying, particularly if it could influence details and get early access. Arm has additionally lined up software and system partners, positioning it to address customers beyond hyperscalers.

