Banner reading XPU dot pub
Photo: AMD Turin Dense server processor, showing its 12 compute dice.

AMD Announces Zen 5 Products and an MI300 Upgrade at Computex 2024

AMD’s Computex announcements span laptop, desktop, and data-center products as the company prepares a processor-upgrade wave.

In Theory, AMD is Staying Ahead of Nvidia. In Theory.

The AMD MI325X upgrades the AMD MI300 AI accelerator (NPU) by replacing its 192 GB of HBM3 memory with 288 GB of HBM3e. The newer memory type raises peak interface throughput by 1.3×. Bigger models, and those for which memory transfers bottleneck MI300 execution, should therefore run faster.
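To make the capacity argument concrete, here is a minimal sketch of the arithmetic, assuming model weights dominate memory use and ignoring the KV cache, activations, and runtime overhead. The model sizes and the helper names (`weight_footprint_gb`, `fits`) are hypothetical and chosen only for illustration; the 288 GB figure is the one quoted above.

```python
# Rough check of whether a model's weights fit in accelerator memory.
# Assumption: only weights are counted; KV cache and activations are ignored.

MI325X_CAPACITY_GB = 288  # HBM3e capacity quoted in the article


def weight_footprint_gb(params_billion: float, bytes_per_param: float) -> float:
    """Weight storage in GB (10^9 bytes): billions of parameters x bytes each."""
    return params_billion * bytes_per_param


def fits(params_billion: float, bytes_per_param: float, capacity_gb: float) -> bool:
    """True if the weights alone fit within the given capacity."""
    return weight_footprint_gb(params_billion, bytes_per_param) <= capacity_gb


# Hypothetical model sizes, in billions of parameters.
for params_b in (70, 180, 400):
    for fmt, nbytes in (("FP16", 2), ("FP8", 1)):
        size = weight_footprint_gb(params_b, nbytes)
        verdict = "fits" if fits(params_b, nbytes, MI325X_CAPACITY_GB) else "does not fit"
        print(f"{params_b}B params at {fmt}: {size:.0f} GB, {verdict} in {MI325X_CAPACITY_GB} GB")
```

In practice the usable headroom is smaller than this suggests, because inference also needs memory for the KV cache and working buffers.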

Targeting a 4Q24 release, AMD compares the MI325X with the Nvidia H200 (Hopper’s HBM3e-based midlife kicker) and claims a theoretical 1.3× processing advantage and the ability to run models that are twice as big. However, it’s well past put-up-or-shut-up time for AMD to release MLPerf results. Nvidia’s ability to extract performance from its monster chips is among its advantages, and AMD must show what the MI300 series can deliver.

AMD also announced the MI350 NPU will come in 2025. Based on an updated NPU architecture that adds FP4 support and will be implemented in a 3 nm process, it will compete with Nvidia Blackwell. AMD claims it will be 35× faster at inference than the MI300, which might say as much about the MI300’s real-world performance as it does about the MI350’s greater maximum throughput. Committing to a one-year cadence—a fast rate also promised by Nvidia—AMD plans an MI400 in 2026.

Zen 5 Comes to Server Processors

AMD claimed a 33% server-processor share, attributing its gains to hyperscalers employing Epyc for internal workloads, as opposed to public instances. Given the company’s data-center revenue and Epyc’s performance, the claims are believable but also speak to Intel’s stickiness with IT managers.

The new fifth-generation Epycs (Turin) will have up to 192 compact Zen 5 cores or 128 standard ones, compared with 128 and 96 Zen 4 cores for the current-gen Bergamo and Genoa processors. Slated to ship by the end of the year, they’ll compete with Intel’s Sierra Forest and Granite Rapids Xeons. With those chips, Intel will finally meet or beat AMD’s core count. Meanwhile, Amazon, Ampere, Google, and Microsoft will have Arm-based alternatives, at least for the denser server processors.

Zen 5 Increases IPC but Maybe Not GHz

In addition to offering larger core counts, the Turin family will increase performance by employing the new Zen 5 CPU. AMD hasn’t publicly confirmed Zen 5 details beyond stating that front-end bandwidth, some buses, and AVX512 throughput are up to 2× greater than Zen 4’s.

Zen 4 runs AVX512 operations on a 256-bit data path, halving throughput compared with native 512-bit execution. We infer that Zen 5 has a full-width data path, which will benefit applications like AI and video coding but not those like compilation or databases. This widening could significantly increase die size compared with Zen 4. Die photos show that after normalizing for its process-technology advantage, Zen 4’s vector/floating-point unit is much larger than its Xeon counterpart, suggesting that Intel expends more physical-design effort.
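The halving can be illustrated with a deliberately simplified model: a vector operation wider than the data path must be split into multiple passes. The sketch below assumes throughput is limited only by data-path width and ignores issue ports, scheduling, and memory effects. The function names are hypothetical.

```python
import math


def vector_op_passes(vector_bits: int, datapath_bits: int) -> int:
    """Passes needed to push one vector operation through a data path."""
    return math.ceil(vector_bits / datapath_bits)


def relative_throughput(vector_bits: int, datapath_bits: int) -> float:
    """Operations completed per pass, relative to single-pass execution."""
    return 1 / vector_op_passes(vector_bits, datapath_bits)


half_width = relative_throughput(512, 256)  # 512-bit op on a 256-bit path
full_width = relative_throughput(512, 512)  # 512-bit op on a 512-bit path

print(f"256-bit path: {vector_op_passes(512, 256)} passes per 512-bit op")
print(f"512-bit path: {vector_op_passes(512, 512)} pass per 512-bit op")
print(f"Full-width speedup in this model: {full_width / half_width:.1f}x")
```

Under these assumptions the full-width path doubles per-op throughput, which is consistent with an "up to 2×" claim for code that is bound by vector execution.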

The wider vector/FP unit would starve if the paths between it and the data cache weren’t also widened. Likewise, a wider front end requires wider paths to feed it. In summary, the “up to 2×” claim likely reflects an actual doubling of physical resources. Doubling the instruction decoders would be a major change, especially given x86 decoders’ porkiness, but would be consistent with a trend Apple has set for the industry. Note, however, that Zen 4 can already feed nine ops per cycle from its L0 cache; it’s unlikely Zen 5 doubles this rate.

These changes will increase instructions per cycle (IPC). AMD gave no indication that clock rates will increase, suggesting that pipeline depth is similar, if not unchanged. Turin’s 3 nm process may enable AMD to boost clock rates, but the higher core counts will thermally constrain base frequencies.

Wider AVX-512 Units Boost Desktop-Processor IPC

AMD announced its premium desktop-processor lineup based on Zen 5. Due to be available at retail later this year, the AMD Ryzen 9000 (Granite Ridge) series has core counts, cache sizes, and clock rates similar to those of the Zen 4 Ryzen 7000 (Raphael) series. Performance gains, therefore, come from Zen 5’s greater IPC.

AMD claims a 16% typical per-tick uplift and reports the results from a set of benchmarks. Among this set are a few programs for which AMD also supplied IPC improvement data when it launched Raphael. The new chip delivers a 10% speedup on Far Cry 6, whereas the older one provided a 12% gain, suggesting that in this generation users will see a decent but smaller improvement on integer code. However, code that can use Zen 5’s wider AVX unit will see a greater speedup than in the past generation. AMD reports that the Puget Adobe Premiere Pro and Cinebench R23 tests run 16% and 17% faster at the same frequency on Granite Ridge than on Raphael. This is more than Raphael provided: it was only 11% and 9% faster per clock on these benchmarks than its predecessor.
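The generation-over-generation comparison can be tabulated with a short sketch. It uses only the per-clock gains quoted above. The compounded two-generation figure is an illustrative inference that assumes AMD measured both generations comparably, which the company has not stated. Names such as `IPC_GAINS` are hypothetical.

```python
# Per-clock gains quoted in the article, as fractions:
# (Raphael over its predecessor, Granite Ridge over Raphael)
IPC_GAINS = {
    "Far Cry 6": (0.12, 0.10),
    "Puget Adobe Premiere Pro": (0.11, 0.16),
    "Cinebench R23": (0.09, 0.17),
}


def delta_points(prev_gain: float, new_gain: float) -> float:
    """Change in generational gain, in percentage points."""
    return round((new_gain - prev_gain) * 100, 1)


def compounded_gain(prev_gain: float, new_gain: float) -> float:
    """Per-clock gain across both generations, assuming the gains multiply."""
    return (1 + prev_gain) * (1 + new_gain) - 1


for name, (prev, new) in IPC_GAINS.items():
    print(
        f"{name}: {delta_points(prev, new):+.1f} points vs. last generation, "
        f"{compounded_gain(prev, new):.1%} over two generations"
    )
```

The sign of the point change separates the two cases the text describes: negative for the integer-heavy game, positive for the vector-friendly workloads.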

AMD has long done well with home PC builders owing to Ryzen’s performance and efficiency advantages over Intel’s desktop processors. Ryzen 9000 will put AMD ahead, at least until Intel launches its Arrow Lake family. The Ryzen 9000 family also has the advantage of being socket-compatible with its predecessor, enabling Ryzen 7000 customers to upgrade only their processor. (Note that Turin likewise retains socket compatibility with its predecessor.)

AMD Pushes Laptop AI

AMD also announced two laptop-processor models. The company would’ve previously called these APUs because they combine CPU and GPU cores, but they’re now Ryzen AI chips. The new Ryzen AI 9 HX 370 and Ryzen AI 9 365 (AMD Strix Point) update AMD Hawk Point and Phoenix 2 processors. Combining the smaller Zen 5 CPU employed in the dense Turin models and the bigger, faster version used in the 128-core Turin and Granite Ridge processors, the new chips bring hefty general processing to laptops. A newer graphics architecture with more GPU cores boosts matters further.

The headline product feature, however, is the new NPU delivering a peak 50 INT8 TOPS, surpassing Qualcomm Snapdragon X and dwarfing Intel Meteor Lake. This XDNA 2 NPU supports block FP16, which should improve performance density on applications that can use it instead of standard FP16. Block representations save area by sharing a single exponent among multiple mantissas.
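The shared-exponent idea can be sketched in a few lines. This is a generic block-floating-point illustration and not XDNA 2's actual format: the 8-bit mantissa width, 8-bit exponent width, and block size are assumptions, and the function names are hypothetical.

```python
import math


def bfp_encode(block, mantissa_bits=8):
    """Encode floats as one shared exponent plus signed integer mantissas."""
    max_mag = max(abs(x) for x in block)
    if max_mag == 0.0:
        return 0, [0] * len(block)
    # frexp gives max_mag = m * 2**e with 0.5 <= m < 1, so e bounds the block.
    _, shared_exp = math.frexp(max_mag)
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
    limit = 2 ** (mantissa_bits - 1) - 1
    # Round each value to the shared scale and clamp to the mantissa range.
    mantissas = [max(-limit, min(limit, round(x / scale))) for x in block]
    return shared_exp, mantissas


def bfp_decode(shared_exp, mantissas, mantissa_bits=8):
    """Reconstruct approximate floats from the shared exponent and mantissas."""
    scale = 2.0 ** (shared_exp - (mantissa_bits - 1))
    return [m * scale for m in mantissas]


def bfp_bits(n_values, mantissa_bits=8, exponent_bits=8):
    """Storage for a block: per-value mantissas plus one shared exponent."""
    return n_values * mantissa_bits + exponent_bits


values = [1.0, 0.5, -0.25, 0.003]
exp, mants = bfp_encode(values)
print("shared exponent:", exp, "mantissas:", mants)
print("decoded:", bfp_decode(exp, mants))
print("8-value block:", bfp_bits(8), "bits vs.", 8 * 16, "bits as FP16")
```

The decoded output shows the trade-off: values much smaller than the block's largest lose precision (0.003 becomes 0 here), so the format suits data whose magnitudes are similar within a block.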

More important than product features is OEM support, where AMD has badly lagged. The company promises Acer, Asus, HP, Lenovo, and MSI will have Ryzen AI 300 systems available in July. These companies have AMD-powered systems now. To gain share, AMD needs them to offer more SKUs and needs other OEMs to join them.

Validating Qualcomm’s market impact, AMD compared its new laptop processors to Snapdragon X as well as Intel Meteor Lake. Video editors and gamers will see a difference. Everyone else must wait to see if Microsoft Copilot wows us more on one processor than another or if battery life significantly differs.

Bottom Line

Zen 5 delivers moderately greater performance than Zen 4, despite being a more comprehensive redesign than Zen 4 was over Zen 3. As with the new Arm Cortex-X925, the biggest generational gains come from added vector/FP resources. The new CPU advances AMD’s server, desktop, and laptop processors, but AMD’s challenge is to cement, if not grow, its server-processor share as Intel rights its ship and Xeon full-chip throughput begins to catch up with Epyc. AMD’s relative PC-processor advantages over Intel, particularly for laptops, are growing, but its share has been stubbornly low, and Qualcomm is bearing down. Share gains in PCs at Intel’s expense would slow down its rival’s turnaround.

AMD’s planned steady data-center NPU upgrades parallel those of Nvidia and contrast with Intel’s regrouping around the new Falcon Shores architecture. AMD has gone from nowhere in data-center AI to second place among merchant suppliers in remarkably little time. In absolute terms, Nvidia’s business is growing faster. AMD investors must realize it will take years, not quarters, for the company to significantly grow its data-center-AI share. After all, the company took years to reach 33% server share in much more favorable conditions.



