Tests Show Arm Server Chips Rival x86 Processors


In the past several weeks, Michael Larabel at Phoronix has tested the Arm-based Graviton 4. The new 96-CPU processor from Amazon Web Services (AWS) performs similarly to the 96-core AMD Epycs based on Zen 4. In so doing, it shows that x86 processors no longer have an across-the-board advantage in server applications.

Graviton 4 employs the Arm Neoverse V2 CPU, which mostly shares its microarchitecture with the Cortex-X3. It runs at 2.8 GHz and has four 128-bit SIMD units. By comparison, the Epyc 9654 (and 9R14) has a 2.4 GHz base clock and a six-pipe 256-bit SIMD unit, although it can execute only two fused multiply-adds in parallel.
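To make the SIMD comparison concrete, here is a rough back-of-the-envelope sketch of peak double-precision FMA throughput per core. It assumes that all four Neoverse V2 SIMD units can issue a fused multiply-add every cycle, which the figures above do not state, and it uses base clocks only. The function names are illustrative, and real workloads will fall well short of these peaks because of memory and frequency effects.

```python
def peak_fma_flops_per_cycle(fma_pipes: int, simd_bits: int, element_bits: int = 64) -> int:
    """Peak FLOPs per cycle from FMA pipes alone (one FMA counts as 2 FLOPs)."""
    lanes = simd_bits // element_bits  # elements processed per SIMD operation
    return fma_pipes * lanes * 2


def peak_gflops_per_core(flops_per_cycle: int, clock_ghz: float) -> float:
    """Peak GFLOPS for one core at the given clock."""
    return flops_per_cycle * clock_ghz


# ASSUMPTION: all four 128-bit Neoverse V2 SIMD units are FMA-capable.
v2_flops = peak_fma_flops_per_cycle(fma_pipes=4, simd_bits=128)
# From the text: Zen 4 can execute two 256-bit FMAs in parallel.
zen4_flops = peak_fma_flops_per_cycle(fma_pipes=2, simd_bits=256)

v2_gflops = peak_gflops_per_core(v2_flops, 2.8)      # Graviton 4 clock
zen4_gflops = peak_gflops_per_core(zen4_flops, 2.4)  # Epyc 9654 base clock

print(f"Neoverse V2: {v2_flops} FLOPs/cycle, {v2_gflops:.1f} GFLOPS/core")
print(f"Zen 4:       {zen4_flops} FLOPs/cycle, {zen4_gflops:.1f} GFLOPS/core")
```

Under these assumptions, the two cores have the same per-cycle FMA peak, so any difference in peak throughput at base clock comes from frequency alone.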

Although the Neoverse V2 microarchitecture looks like it could deliver competitive scalar execution throughput (IPC), it falls short on the CoreMark synthetic benchmark, even after normalizing for clock rate. Clock rate itself favors the Epyc in this case, owing to its 3.55 GHz all-core boost frequency.
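As a rough illustration of what normalizing for clock rate means, the sketch below divides a benchmark score by frequency and computes the raw-score ratio at which the two chips would be equal per clock. It uses only the 2.8 GHz and 3.55 GHz figures quoted above. The function names are hypothetical, no measured scores are included, and the example assumes the Epyc actually sustains its all-core boost during the run.

```python
GRAVITON4_GHZ = 2.8    # clock quoted in the text
EPYC_BOOST_GHZ = 3.55  # all-core boost quoted in the text


def per_clock(score: float, clock_ghz: float) -> float:
    """Normalize a benchmark score to score per GHz."""
    if clock_ghz <= 0:
        raise ValueError("clock_ghz must be positive")
    return score / clock_ghz


def parity_score_ratio(clock_a_ghz: float, clock_b_ghz: float) -> float:
    """Raw-score ratio (A / B) at which A and B are equal per clock."""
    return clock_a_ghz / clock_b_ghz


ratio = parity_score_ratio(GRAVITON4_GHZ, EPYC_BOOST_GHZ)
print(f"Per-clock parity at {ratio:.3f} of the Epyc's raw score")
```

In other words, Graviton 4 would need roughly 79% of the Epyc's raw score to match it per clock; falling short after normalization means its raw score is below that fraction.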

On more realistic server workloads, the results vary. The two processors’ performance is within 10% in many cases but varies substantially in others owing to the processors’ different cache configurations and other architectural differences. Overall, the similarities are great enough to tip the balance toward Graviton’s low-cost instances.

Database workloads are an example of an area where the two are similar taken as a whole, but one may be better for a specific workload. The Arm chip has the edge running the RocksDB key-value database, does about the same on the ClickHouse column-oriented database, and falls behind on the PostgreSQL relational database. Compilation speeds are all over the map, depending on the code base or system being tested.

On HPC, however, Graviton proves to be faster in most cases, outperforming Epyc on MiniFE finite-element analysis, likely because of its SIMD-unit configuration and higher base clock. The forthcoming Zen 5 Epyc (Turin) has wider SIMD data paths and could turn the tables.

Phoronix also recently ran a few tests on the first AmpereOne processors (Oracle A2 instances). AmpereOne performs about the same as the older Ampere Altra processor in some cases but is much faster in others. The preliminary results, however, give no indication that it will be faster than Graviton per core. Owing to its simpler microarchitecture, the AmpereOne CPU is likely much smaller than the Neoverse V2, and Ampere certainly stuffs more CPUs on each chip, but the results do nothing to quell concerns that the company would be better off licensing cores.

Bottom Line

Customers indifferent to instruction-set architecture (ISA) and attendant software issues will find that server processors based on the newest Arm CPUs, such as the AWS Graviton 4, perform similarly to high-core-count AMD Epyc processors. Moreover, Graviton instances are less expensive than their x86 counterparts. Lower-core-count Epycs, however, clock faster and could give AMD users an advantage, and AMD Zen 5 chips and new Intel Xeons are soon to be released. Nonetheless, the conclusion remains that the ISA monopoly has been broken. (Arm Neoverse V3 processors are likely in the offing, too.)

Note that Phoronix compared Graviton 4 with other AMD processors, the Ampere Altra Max, and Intel processors. We highlighted the comparison with the 96-core Zen 4 processor because of the common core count, one-thread-per-vCPU instances, and similar performance profile.

