If you follow CPU architectures and design, you know that the industry has been in a rut for almost a decade. We’ve explored these topics a great deal at ExtremeTech, including why CPU clock speeds have stalled out, why multithreading hasn’t been a perfect replacement solution, and the ongoing work from companies like Intel, Nvidia and AMD to find better solutions. Now, a new semiconductor design firm, Soft Machines, has gone public with a new conceptual architecture it thinks could usher in a new era of scaling and performance — and after talking to the company, I think there’s a chance it has invented something huge.

Introducing VISC

Soft Machines’ goal with Variable Instruction Set Computing (VISC) is to break through the efficiency barriers that have prevented modern chips from delivering major architectural improvements. The problem with conventional out-of-order execution (OoOE) designs is that adding re-ordering capability drives up power consumption and cost at an enormous rate. This is why Intel and AMD don’t bolt dozens of execution units onto enormously complex out-of-order engines: past a certain point, the difficulty of building them scales up much faster than the performance gained by doing so.
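To see why re-ordering capability gets expensive so quickly, consider a simplified cost model (my own back-of-the-envelope sketch, not Soft Machines' figures): every cycle, each result tag broadcast by a completing instruction must be compared against every source operand still waiting in the scheduler window, so the wakeup logic grows with window size times issue width.

```python
def wakeup_comparators(window_size: int, issue_width: int, srcs_per_inst: int = 2) -> int:
    """Rough count of tag comparators in an OoOE scheduler's wakeup logic.

    Each cycle, every one of the issue_width broadcast result tags is
    compared against both source operands of every instruction waiting
    in the window. A toy model: real designs use tricks to cut this down.
    """
    return window_size * issue_width * srcs_per_inst

# Scaling window and width together makes the logic blow up quadratically,
# while single-thread performance improves far more slowly.
for window, width in [(32, 2), (64, 4), (128, 6), (256, 8)]:
    print(f"window={window:3d} width={width} comparators={wakeup_comparators(window, width)}")
```

Under this model, growing from a 32-entry/2-wide machine to a 256-entry/8-wide one multiplies the comparator count 32x, which is the flavor of super-linear cost the article is describing.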

VISC attempts to sidestep the difficulty of scaling to multiple threads in hardware, and of supporting that scaling in software, by providing a framework in which workloads that appear sequential to the operating system are scheduled across a set of virtual cores in hardware.

Instead of relying on a heavy, power-intensive, and complicated out-of-order execution engine, VISC’s hardware focuses on extracting ILP (instruction-level parallelism) from a workload, then scheduling that workload as very small threads (called threadlets) across a flexible number of cores. While the diagram above shows each virtual core mapping to one physical core, that’s not required — in fact, the flexibility from mapping across all four cores is vital to the architecture’s function.
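The core idea — carving a sequential stream into small independent chains that can run on separate cores — can be illustrated with a toy partitioner. This is strictly my own illustration: Soft Machines hasn't published its actual threadlet-formation algorithm, and this sketch ignores cross-chain dependencies, which the real hardware would have to resolve with inter-core communication.

```python
def split_into_threadlets(instructions):
    """Greedily group a sequential instruction stream into dependency chains.

    instructions: list of (name, deps) pairs, where deps is a set of
    names of earlier instructions this one depends on.
    Returns a list of "threadlets" (ordered lists of instruction names).
    Toy policy: an instruction joins the first chain that already contains
    all of its dependencies; otherwise it starts a new chain. Dependencies
    that span chains are ignored here, though real hardware cannot ignore them.
    """
    chains = []  # each entry: (set of names in chain, ordered list of names)
    for name, deps in instructions:
        for names, order in chains:
            if deps and deps <= names:  # all deps satisfied within this chain
                names.add(name)
                order.append(name)
                break
        else:
            chains.append(({name}, [name]))
    return [order for _, order in chains]

# Two independent dependency chains hiding inside one serial stream:
stream = [("a", set()), ("b", {"a"}), ("c", set()), ("d", {"c"}), ("e", {"b"})]
print(split_into_threadlets(stream))  # [['a', 'b', 'e'], ['c', 'd']]
```

Each resulting chain could then be dispatched to a different physical core, which is roughly what the first scenario in the diagram depicts: a serial workload interleaved across two cores wherever independent work is found.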

The diagram above shows two different applications that have been passed to a VISC core. In the first case, the CPU interleaves the serial workload across two physical cores, extracting parallelism where it finds it and (theoretically) executing the application more efficiently than a conventional OoOE engine could. In the second case, two applications are scheduled to run simultaneously: the “Heavy” app is handed most of the core’s resources, while the “Light” app receives a fraction of the total.

According to Soft Machines executives, the VISC architecture is capable of shifting core resource allocation every cycle — meaning that if a heavy app becomes light or vice versa, the chip can flexibly adjust its scheduling to compensate, moving resources where they’re needed most. Performance will obviously still be best when workloads are fairly static, but this flexibility is what Soft Machines believes will give it an advantage over more general-purpose architectures.
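A per-cycle reallocation policy of the kind described can be sketched with a few lines of Python. The proportional-share rule below is an assumption of mine for illustration — the article doesn't say what heuristic Soft Machines actually uses — but it shows how a fixed pool of execution resources could be re-split every cycle as two apps' demands shift.

```python
def allocate_slots(demands, total_slots=8):
    """Split a fixed pool of execution slots between apps for one cycle.

    demands: per-app instantaneous demand (e.g. ready instructions).
    Toy policy: allocate proportionally to demand, then hand any slots
    lost to integer division to the heaviest app. Not Soft Machines'
    actual heuristic, just an illustration of per-cycle reallocation.
    """
    total = sum(demands) or 1  # avoid division by zero when both apps idle
    shares = [d * total_slots // total for d in demands]
    shares[shares.index(max(shares))] += total_slots - sum(shares)
    return shares

# Cycle 1: app 0 is "heavy", app 1 is "light"...
print(allocate_slots([6, 2]))  # [6, 2]
# Cycle 2: the roles flip, and the split flips with them.
print(allocate_slots([1, 7]))  # [1, 7]
```

The point of the article's claim is the granularity: because the split can change every cycle rather than at OS-scheduler timescales, a "heavy" app that suddenly goes "light" gives its slots back almost immediately.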

The approach is supposedly OS- and vendor-agnostic; VISC can handle both ARM and x86 code with a roughly 5% performance hit from translation. This may remind you of Transmeta, whose Code Morphing Software translated x86 ops into a VLIW-compatible form the CPU could execute, but Soft Machines stresses that VISC takes the opposite approach. Instead of relying on a complex and difficult translation layer that runs in software, VISC does the work of extracting ILP and scheduling the code across its hardware on-chip.

Next page: VISC performance, and does VISC stand a chance against ARM and x86?