RISC-V Startup Condor Computing’s Cuzco Core Takes Flight


A beautiful city in the Andes, Cuzco shares its name with a high-performance RISC-V core from Andes Technology affiliate Condor Computing. Disclosed at Hot Chips, Cuzco is a wide microarchitecture like other high-performance CPUs, but it implements a novel mechanism for issuing instructions that reduces power and area. The company also takes an unusual approach to organizing execution units, dividing them among four slices. Andes will handle licensing and support; lead customers should have access to Cuzco RTL by the end of the year.

Traditional Front and Back Ends

High-performance CPU designs comprise front and back ends. The former fetches instructions and issues them to the latter, which performs the actual execution. To raise performance, these CPUs issue instructions out of order (OoO) and speculatively, employing content-addressable memories (CAMs) to select candidate instructions to send downstream. During the execution process, if data, buses, or other resources aren’t available for an operation to complete, execution halts, and the CPU must reissue the instruction and replay execution. CAMs are notoriously power-hungry circuits, and canceling instructions is wasteful.
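
To make the cost of that CAM search concrete, here is a minimal Python sketch of one cycle of a conventional wakeup-and-select loop. It is a toy model, not any vendor's design: the function name `cam_issue`, the dictionary layout, and the `issue_width` parameter are all assumptions made for illustration, and replay is not modeled.

```python
def cam_issue(waiting, ready_tags, issue_width=2):
    """One cycle of a toy wakeup/select loop (illustrative only).

    waiting:    list of dicts {"name": str, "srcs": set of outstanding tags},
                ordered oldest first
    ready_tags: set of result tags broadcast this cycle
    Returns (names issued this cycle, number of tag comparisons performed).
    """
    comparisons = 0
    for entry in waiting:
        # CAM-style wakeup: every outstanding source tag of every waiting
        # entry is compared against every broadcast tag, every cycle.
        comparisons += len(entry["srcs"]) * len(ready_tags)
        entry["srcs"] -= ready_tags
    # Select: pick the oldest entries whose sources are all satisfied.
    ready = [e for e in waiting if not e["srcs"]]
    issued = ready[:issue_width]
    for e in issued:
        waiting.remove(e)
    return [e["name"] for e in issued], comparisons


waiting = [
    {"name": "add", "srcs": {"x1", "x4"}},
    {"name": "sub", "srcs": {"x1"}},
    {"name": "mul", "srcs": {"x9"}},
]
issued, comparisons = cam_issue(waiting, {"x1", "x4"})
print(issued, comparisons)
```

Even in this three-entry example, the comparison count grows with the product of waiting operands and broadcast tags, which hints at why the hardware equivalent burns power as queues deepen.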

New Time-Based Instruction Issuing

Taking advantage of the CPU's complete view of every instruction in the back end at any moment, Condor has replaced the nearly sixty-year-old Tomasulo algorithm alluded to above with a new approach. Based on previous instructions and cache timing, Cuzco knows when data dependencies will be met and when buses and function units will be available, and it tracks these times against a counter. Thus, instead of opportunistically pulling ready instructions from a CAM, Cuzco schedules each one to issue at a precise time. If a delay, such as a cache miss, occurs, Cuzco updates the planned issue time accordingly.
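
The general idea of assigning issue times up front can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not Condor's implementation: the `Instr` record, the `plan` function, the fixed latencies, the one-issue-per-unit-per-cycle rule, and the `extra_latency` argument standing in for a cache miss are all hypothetical. For simplicity the sketch recomputes the whole plan after a delay rather than updating it incrementally.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    name: str
    dest: str       # destination register
    srcs: tuple     # source registers
    unit: str       # hypothetical function-unit class, e.g. "alu" or "load"
    latency: int    # assumed fixed result latency in cycles


def plan(instrs, extra_latency=None):
    """Assign each instruction an issue cycle in program order.

    extra_latency maps an instruction name to extra cycles (e.g. a cache
    miss); dependents are simply planned later when it is nonzero.
    """
    extra_latency = extra_latency or {}
    reg_ready = {}   # register -> cycle its value becomes available
    busy = set()     # (unit, cycle) issue slots already claimed
    issue_at = {}
    for ins in instrs:
        # Earliest cycle at which all source operands are known to be ready.
        issue = max((reg_ready.get(r, 0) for r in ins.srcs), default=0)
        # Slide forward until the function unit has a free issue slot.
        while (ins.unit, issue) in busy:
            issue += 1
        busy.add((ins.unit, issue))
        issue_at[ins.name] = issue
        reg_ready[ins.dest] = issue + ins.latency + extra_latency.get(ins.name, 0)
    return issue_at


prog = [
    Instr("ld", "x1", ("x2",), "load", 4),
    Instr("add", "x3", ("x1", "x4"), "alu", 1),
    Instr("sub", "x5", ("x6", "x7"), "alu", 1),
    Instr("mul", "x8", ("x3", "x5"), "alu", 3),
]
print(plan(prog))                # planned issue cycles when the load hits
print(plan(prog, {"ld": 10}))    # same program with a 10-cycle load delay
```

In this sketch the independent `sub` is planned for cycle 0 even though it follows the load in program order, while `add` and `mul` are planned for the cycles at which their operands are expected. No per-cycle search over waiting instructions is needed.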

Power and Area Savings with a Small Penalty

Without CAMs and instruction replays, the design reduces power and die area. Condor reports that its time-based approach reduces per-cycle throughput (IPC) by about 5.3% on SPECint2006 compared with an ideal Tomasulo scheduler. Even so, Cuzco tops 17.5 points per gigahertz on that benchmark without tapping the vector units. That score places it among the fastest RISC-V cores, alongside those from Akeana and SiFive, and doubles what the fastest Andes RISC-V core achieves. Condor targets clock rates of 2.0 GHz and above in a 5 nm process.
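
A quick back-of-the-envelope calculation puts those figures in context. The sketch below assumes, without confirmation from Condor, that the 17.5 points/GHz figure was measured with the time-based scheduler and that the score scales linearly with both clock rate and IPC. Because Cuzco "tops" 17.5, the results are best read as lower bounds.

```python
POINTS_PER_GHZ = 17.5   # reported SPECint2006 score per GHz (a floor)
IPC_LOSS = 0.053        # reported IPC reduction vs. an ideal Tomasulo scheduler
TARGET_GHZ = 2.0        # stated minimum clock target

# Projected absolute score at the target clock, assuming linear scaling.
projected_score = POINTS_PER_GHZ * TARGET_GHZ

# Implied per-GHz score of the ideal Tomasulo baseline, assuming the
# 17.5 figure already includes the 5.3% IPC penalty.
ideal_points_per_ghz = POINTS_PER_GHZ / (1 - IPC_LOSS)

print(f"projected score at {TARGET_GHZ} GHz: {projected_score:.1f}")
print(f"implied ideal-scheduler score per GHz: {ideal_points_per_ghz:.2f}")
```

Under these assumptions, the scheduling penalty costs roughly one point per gigahertz.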

Bottom Line

As Condor takes flight, Cuzco’s novel technology exemplifies how an open instruction set fosters innovation. As RISC-V has consolidated less-popular and proprietary architectures, companies can focus resources on design instead of enablement. At the same time, RISC-V enables startups to flourish without petitioning for a license from Arm or the x86 suppliers. Helped by the RISC-V environment, Cuzco exhibits beauty, much like its namesake Andean city.

Future XPU.pub coverage will discuss the slice-based execution units and other details.

