picture of a stoned-looking David Spade and Chris Farley in a car in the movie Black Sheep

Tesla Dojo Details Disclosed in 2022 Foreshadow Cautious Comments

Robes, rogues, rudes, roads, ro-ads.

I recently came across some words I wrote last year at the other place: “Tesla plans to scale out Dojo over the coming years but is notorious for missing milestones. Nonetheless, a single exapod should be capable of running the company’s models. Tesla is keen to improve its cars’ so-called full self-driving (FSD) capability—one of its many overpromised technologies. More AI resources will increase FSD’s rate of improvement. In the meantime, the company plans also to keep its Nvidia-based systems for secondary workloads.”

Uncharacteristically, Tesla’s CEO is admitting, at least implicitly, that previous claims were unrealistic. During the company’s most recent earnings call, he said the following about Dojo: “GPU is a funny word, like vestigial. … We’re pursuing the dual path of Nvidia and Dojo, but I would think of Dojo as a long shot. It’s a long shot worth taking because the payoff is potentially very high, but it’s not something that is a high probability.” (Source: Seeking Alpha.) I’m not sure what that first part was about, but it reminds me of the roads scene in Black Sheep featuring Farley and Spade. By comparison, less than a year ago, the company said it planned to have 100 exaflops of Dojo capacity online by October 2024.

The company disclosed Dojo 1 details in 2022 at Hot Chips and elsewhere. The system’s computing tile vaguely resembles the Cerebras WSE: its raw BF16 throughput falls between that of the WSE-2 and WSE-3, and it occupies a similar area (about 12 × 12 inches for the tile compared with about 8 × 8 inches for the WSE). Tesla’s tile isn’t a monolithic silicon sheet, however, but an array of chips packaged together using TSMC’s InFO_SoW technology. Each tile has less on-chip SRAM and less fabric bandwidth than a WSE-2 or WSE-3.
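To put the footprint comparison in numbers, here is a minimal Python sketch. It assumes the rounded side lengths quoted above and assumes that the 12-inch figure belongs to the Dojo tile and the 8-inch figure to the WSE. The variable names are mine, not Tesla’s or Cerebras’s.

```python
# Rounded side lengths from the text, in inches (both treated as squares).
# Assumption: 12 in is the Dojo tile, 8 in is the Cerebras WSE.
DOJO_TILE_SIDE_IN = 12.0
WSE_SIDE_IN = 8.0


def square_area(side_in: float) -> float:
    """Area of a square with the given side length, in square inches."""
    return side_in * side_in


dojo_area = square_area(DOJO_TILE_SIDE_IN)
wse_area = square_area(WSE_SIDE_IN)
area_ratio = dojo_area / wse_area

print(f"Dojo tile: {dojo_area:.0f} sq in")
print(f"WSE:       {wse_area:.0f} sq in")
print(f"Ratio:     {area_ratio:.2f}x")
```

On these rounded figures the tile covers a bit more than twice the WSE’s area, so “similar” is best read as “same order of magnitude.”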

To alleviate the I/O bottleneck that starved computing resources in an earlier Dojo iteration, Tesla attaches Dojo Interface Processors (DIPs) to the tiles. Each DIP provides additional connectivity among chips and 32 GB of HBM. That Tesla had to apply such a fix speaks to the challenges of building a large-scale AI system and to a possible lack of forethought.

The attachment of 20 DIPs underneath a six-tile tray is unusual, if not downright odd. In conjunction with the use of InFO_SoW and a Dojo rack’s enormous power consumption, this configuration reveals Tesla’s ambitions and unfettered design approach. If the company wants to free itself from Nvidia, it could source NPUs from (or acquire) a startup like Cerebras or Tenstorrent or turn to established Nvidia challengers like AMD.
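For a sense of scale, the DIP figures above imply a certain amount of HBM per tray. The following back-of-the-envelope Python sketch uses only the numbers quoted in this post (20 DIPs per six-tile tray, 32 GB of HBM per DIP). The even per-tile split is my simplifying assumption, not a disclosed Tesla specification.

```python
# Figures quoted in this post.
DIPS_PER_TRAY = 20
HBM_GB_PER_DIP = 32
TILES_PER_TRAY = 6

# Total HBM hanging off one six-tile tray.
tray_hbm_gb = DIPS_PER_TRAY * HBM_GB_PER_DIP

# Assumption: HBM is shared evenly across the tiles in a tray.
hbm_gb_per_tile = tray_hbm_gb / TILES_PER_TRAY

print(f"HBM per tray: {tray_hbm_gb} GB")
print(f"HBM per tile (even split): {hbm_gb_per_tile:.1f} GB")
```

Twenty DIPs do not divide evenly across six tiles, which is part of what makes the arrangement look odd.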



