[Diagram: Arm Client CSS with Cortex-X925, Cortex-A725, Cortex-A520, Immortalis-G925, DSU-120, and interconnect and SMMU blocks]

Hard Macros Are Back in Town Thanks to the Arm Client CSS

Der Haifisch, der hat Zähne ("The shark, it has teeth").

To eke out more performance and efficiency, Arm now offers compute subsystems (CSS) for its client-targeted designs, including those employing the new Cortex-X925. An approach first employed with the infrastructure-focused Arm Neoverse intellectual property (IP), a CSS is a physical implementation combining CPUs, interconnect, and other blocks necessary to complete a subsystem. Customers stand to reduce design time by employing a CSS.

Arm has also released the following:

  • The Cortex-X925 CPU (the successor to the Cortex-X4)
  • The Cortex-A725 CPU
  • The Immortalis-G925 GPU (the successor to the Immortalis-G720; smaller configurations of the same architecture are available as the Mali-G725 and Mali-G625)
  • A new set of AI and computer-vision software libraries called Arm Kleidi

We analyze the Arm Cortex-X925 and Cortex-A725 separately and will do the same for the new GPU.

Hop Off, Pop

A CSS is a hard macroblock, a once-common approach to design IP that has since been overshadowed by RTL-based soft macros (soft IP). Soft digital IP meshes with most companies’ SoC design flows and affords them flexibility. Automated synthesis and place-and-route, however, can yield a physical design that requires more area and power and delivers less performance than a flow involving manual tuning.

To improve results from using soft IP, Arm has paired its RTL with its POP physical IP, which includes special cells that can speed up critical paths or save power and area, depending on a customer’s requirements. Nonetheless, a less-automated approach can yield better results, albeit at the expense of design effort. By handling the layout, Arm takes this work off licensees’ hands and can amortize its cost among them. Most of Arm’s effort likely goes into laying out the cores, with less required to configure the number and mix of cores for each customer.

New PC Processors Might Benefit the Most

An important market for the Neoverse CSS is hyperscalers seeking their own server processors. Long on cash but short on chip-design expertise, they have used the Neoverse CSS to jump-start their efforts. A Neoverse CSS includes CPUs, an on-chip mesh, and DRAM and PCIe interfaces, which together are the core functions of a server processor.

The dominant smartphone-processor companies differ. Qualcomm and MediaTek aren’t lacking in chip-design expertise. Starting this year, Qualcomm’s flagship smartphone processors will use its internally developed CPUs instead of licensed cores. MediaTek’s flagships have used Arm’s CPUs and GPUs, but the company has differentiated its chips through their CPUs’ leading performance per area. It’s unlikely Arm can get much more out of its designs than MediaTek. Moreover, a smartphone processor incorporates many blocks, and its maker benefits from the flexibility of soft IP. Therefore, CSS is irrelevant to leading smartphone-processor suppliers.

Many other companies use Cortex CPUs and Mali GPUs, however. Those pushing the boundaries of power, performance, and area (PPA) could use CSS. In particular, we expect several suppliers to launch Arm-compatible PC processors to compete with the various new Qualcomm Snapdragon X SKUs. Although these suppliers are major companies, their chips will integrate fewer blocks than a smartphone processor, power and performance pressure will be sky high, and the market is uncertain. Thus, we expect some to employ the new CSS.

Who Is the Competition?

  • DIY: Although Arm is happy to grant architecture licenses, it’s better for the company to license designs. The overlapping development among its CPUs has enabled the company to leverage Cortex-A’s popularity to create the Cortex-X and Neoverse lines. Nonetheless, Cortex-X and Neoverse need licensees to justify their development. To win designs, Arm must deliver cores that match the performance of the proprietary CPUs in the Snapdragon X, Apple M4, and AmpereOne processors.
  • x86: Poor CPU performance is one reason previous Arm-based Windows PCs have sold poorly. Backing Arm-based PCs, Microsoft seeks to shake up PC design. New PC-processor suppliers require speedy CPUs to compete with Intel and AMD, and Arm must employ both logic- and physical-design techniques to deliver competitive power and performance. Analysis of Intel’s dice shows that it optimizes physical design to good effect, enabling it to stay competitive as its process technology has fallen behind. However, it also shows how attending to physical design can slow microarchitecture changes; despite its resources, Intel has employed the same microarchitecture for a few generations.
  • RISC-V: The open architecture has been gaining adherents, but it’s not yet competitive in the markets that Cortex-X and high-end Cortex-A target.
  • GPU: Mali/Immortalis is competitive but not as strategic as Cortex because it doesn’t provide the same compatibility lock-in. In fact, in the PC market, Arm’s uptake could improve if chipmakers pair Cortex with an AMD or Nvidia GPU.

Bottom Line

Arm has introduced hard macros (IP blocks), first to meet the needs of data-center customers and now for client-device makers. Which processor/SoC types will adopt the Arm Client CSS is unclear, but adoption is more likely in forthcoming PC processors than in tier-one suppliers’ smartphone processors. Soft IP is not outmoded, but where PPA demands are highest, hardened designs like CSS help to meet customer needs.

