For its newest flagship smartphone processor, MediaTek has adopted the most recent Arm technology, including the SME2 capabilities that speed up CPU-based AI processing. That’s not the only AI-related enhancement, however. The Dimensity 9500 includes a second-generation AI accelerator (NPU) that doubles the peak execution rate of its predecessor, and it adds a new low-power NPU based on unusual compute-in-memory technology. Graphics, multimedia, and communications also receive upgrades. Expected in smartphones shipping next month, the flagship chip not only raises performance but also improves power efficiency.
CPU
- Long a leading Arm licensee, MediaTek is the first to disclose using Arm Lumex designs (IP).
- Like its most recent predecessors, the Dimensity 9500 is an all-big-core design. It employs one Lumex C1-Ultra, three C1-Premiums, and four C1-Pros. Having used two Cortex-X configurations in previous years, MediaTek may have inspired Arm to create the Premium tier, an official Arm reduced-area version of the highest-performance microarchitecture.
- MediaTek continues to perform backend design instead of using Arm’s CSS hard macros.
- The Dimensity 9500 operates the C1-Ultra at up to 4.21 GHz, well above the 3.62 GHz of the Cortex-X925 in last year’s Dimensity 9400 and higher than the Arm-rated 4.1 GHz. Besting Arm could be a function of MediaTek’s backend design and possibly a willingness to raise voltage to achieve a temporary speed boost.
- The other cores run about 10% faster than their counterparts in the 9400.
- MediaTek claims that single-core performance is up 32%, consistent with Arm’s expected 15% instruction throughput (IPC) gains for the C1-Ultra and MediaTek’s clock-rate boost.
- The company withholds multicore performance gains, but the higher clock rates and the C1-Pro’s IPC increases suggest multicore performance will rise by at least 25% on some benchmarks.
- Peak multicore CPU power is 37% lower than the 9400’s, and peak single-core power is down by 55%, a significant improvement considering the higher performance.
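The single-core claim can be sanity-checked with a little arithmetic. The sketch below assumes that performance scales as clock rate times IPC and that the quoted peak-power reduction applies at the same operating point as the performance gain. Neither assumption comes from MediaTek, and the function and variable names are illustrative only.

```python
# Figures quoted in the article.
C1_ULTRA_CLOCK_GHZ = 4.21      # Dimensity 9500 peak clock
X925_CLOCK_GHZ = 3.62          # Dimensity 9400 peak clock
ARM_IPC_GAIN = 0.15            # Arm's expected IPC gain for the C1-Ultra
CLAIMED_PERF_GAIN = 0.32       # MediaTek's single-core claim
PEAK_POWER_REDUCTION = 0.55    # MediaTek's single-core peak-power claim


def combined_gain(clock_new, clock_old, ipc_gain):
    """Fractional speedup, ASSUMING performance = clock rate x IPC."""
    return (clock_new / clock_old) * (1.0 + ipc_gain) - 1.0


def perf_per_watt_ratio(perf_gain, power_reduction):
    """New/old performance per watt, ASSUMING both figures share one operating point."""
    return (1.0 + perf_gain) / (1.0 - power_reduction)


clock_gain = C1_ULTRA_CLOCK_GHZ / X925_CLOCK_GHZ - 1.0
estimated_gain = combined_gain(C1_ULTRA_CLOCK_GHZ, X925_CLOCK_GHZ, ARM_IPC_GAIN)
perf_per_watt = perf_per_watt_ratio(CLAIMED_PERF_GAIN, PEAK_POWER_REDUCTION)

print(f"Clock-rate gain:        {clock_gain:.1%}")
print(f"Clock x IPC estimate:   {estimated_gain:.1%}")
print(f"MediaTek's claim:       {CLAIMED_PERF_GAIN:.0%}")
print(f"Implied perf/W ratio:   {perf_per_watt:.2f}x")
```

Under these assumptions, the estimate lands within about two points of MediaTek’s 32% claim, and the implied single-core performance per watt nearly triples.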
AI
SME2
- Lumex is the first Arm CPU family to implement scalable matrix extensions, specifically SME2.
- A cluster’s CPUs share SME2 units, and the Dimensity 9500 implements only a single instance. By contrast, Arm’s comparable example configuration has two. MediaTek’s configuration occupies less area, reflecting a balance between seeding the market with a new feature and waiting for applications to ship that justify it.
NPU 990
- The 9500’s primary AI accelerator, the NPU 990, adds dedicated hardware for transformers, used by large language models (LLMs). This could be exponentiation hardware to speed up the softmax function.
- The NPU delivers twice the integer and floating-point performance of the Dimensity 9400’s NPU 890. Raw performance is about twice that offered by NPUs integrated into PC processors, which suggests that smartphones have so far been more effective AI devices than PCs.
- The new NPU supports microscaling data formats (reduced precision in which data blocks share a scaling factor or exponent). It is unclear if these account for any of the performance gains (as measured by operations per second).
- MediaTek reports the NPU doubles the speed of a 3-billion-parameter LLM’s execution. It also supports 128K-token contexts (about 200 pages of text).
- MediaTek claims to be the first to support the BitNet 1-bit framework, developed by Microsoft to improve 1.58-bit LLM power and performance. (Each parameter is a single ternary digit, or trit, taking one of three values; that equates to log2(3), or about 1.58, bits of information.)
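To make the two reduced-precision ideas above concrete, here is a minimal sketch in plain Python. The first function illustrates the microscaling idea of a block of values sharing one power-of-two scale; it is a simplification and not the exact encoding of any standardized format. The second maps weights to the three values -1, 0, and +1 using a mean-absolute-value scale, in the style described for BitNet b1.58; it is an assumption for illustration, not MediaTek’s or Microsoft’s implementation. All function names are hypothetical.

```python
import math


def quantize_block_shared_scale(block, elem_bits=8):
    """Quantize a block to signed integers sharing one power-of-two scale."""
    qmax = 2 ** (elem_bits - 1) - 1
    max_abs = max(abs(x) for x in block)
    if max_abs == 0.0:
        return [0] * len(block), 0
    # Smallest exponent such that max_abs / 2**exp fits within qmax.
    shared_exp = math.ceil(math.log2(max_abs / qmax))
    scale = 2.0 ** shared_exp
    q = [max(-qmax, min(qmax, round(x / scale))) for x in block]
    return q, shared_exp


def dequantize_block(q, shared_exp):
    """Recover approximate values from integers and the shared exponent."""
    scale = 2.0 ** shared_exp
    return [v * scale for v in q]


def ternary_quantize(weights):
    """Map weights to {-1, 0, +1} with one scale (mean absolute value)."""
    scale = sum(abs(w) for w in weights) / len(weights)
    if scale == 0.0:
        return [0] * len(weights), 0.0
    t = [max(-1, min(1, round(w / scale))) for w in weights]
    return t, scale


def weights_footprint_gb(num_params, bits_per_param):
    """Idealized weight storage in GB; ignores scales and packing overhead."""
    return num_params * bits_per_param / 8 / 1e9


# Information content of one three-valued weight.
BITS_PER_TERNARY_WEIGHT = math.log2(3)

q, exp = quantize_block_shared_scale([0.02, -1.5, 0.75, 3.0])
print(q, exp, dequantize_block(q, exp))

t, s = ternary_quantize([0.9, -0.05, -1.3, 0.4])
print(t, s)

# Idealized footprint of a 3-billion-parameter model at two precisions.
print(weights_footprint_gb(3e9, 16))
print(weights_footprint_gb(3e9, BITS_PER_TERNARY_WEIGHT))
```

The footprint figures are lower bounds: practical implementations store per-block or per-tensor scales and often pack ternary weights into a whole number of bits, so real storage is somewhat larger.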
Ultra-Efficient NPU
- MediaTek has added a low-power NPU.
- This NPU targets small models, likely for always-on functions. No indication of how OEMs will harness it has emerged.
- This NPU is a compute-in-memory (CIM) design that saves power by eliminating data movement between computing and storage functions. In this approach, circuits both store and process data; CIM may use analog techniques for the arithmetic operations.
GPU
- The Dimensity 9500 upgrades the GPU to Arm’s newest, the Mali G1-Ultra.
- As before, it’s a 12-core configuration.
- MediaTek claims the 9500 has 33% higher peak performance and 42% improved energy efficiency compared with the prior generation. These figures exceed Arm’s claims for the G1-Ultra. Better backend design and an updated 3 nm process could account for MediaTek exceeding its supplier. Independent benchmark tests are needed to validate these claims.
- The G1-Ultra implements better ray-tracing acceleration, and MediaTek reports gains exceeding 100%. Arm’s performance estimates are similar.
Other
- Because it’s a phone chip, communications are still relevant. Carrier aggregation supports 5 CC, delivering a 15% bandwidth improvement over the previous modem that supported only up to 4 CC.
- Multimedia and display functions have improved, such as by supporting 4K120 Dolby Vision video capture with stabilization.
- The interface to flash memory has doubled in width to four lanes, which should speed up loading large AI models and other large files.
Bottom Line
MediaTek’s all-big-core flagships must be meeting customers’ needs, because the company has produced yet another. Since first supplying the highest-tier devices in 2022, the company has grown shipments by 350% and raised revenue even more. The company mostly supplies Chinese brands but has notable wins in a few Samsung devices.
This year, the company has been relatively muted about AI, focusing on tangible hardware upgrades instead of showcasing eye-catching but marginally relevant demos or discussing a dreamy AI-powered future. Integrating three AI-acceleration options, the Dimensity 9500 offers developers more capability than earlier chips. The CPUs’ SME2 and the low-power CIM NPU present them with new opportunities to apply their creativity, but it will likely take a few years to see what they enable.
Like its predecessors, the Dimensity 9500 employs Arm IP, which the licensor upgrades on an annual cadence. MediaTek’s rival Qualcomm no longer uses Arm’s CPU designs, differentiating the two companies. However, new alternatives, such as Xiaomi’s Xring O1, aided by Arm’s CSS, stand to deny both MediaTek and Qualcomm sales. MediaTek’s modem, multimedia, and AI expertise will be important differentiators as Arm deals CPU and GPU technology to all comers. But the CPU and GPU playing field might not be level, as MediaTek’s high peak clock rates and GPU throughput suggest.