NVIDIA released a couple of white papers this week detailing some of the design decisions behind its upcoming quad-core Tegra 3 mobile processor, codenamed "Kal-El." In particular, Kal-El aims to bring desktop-like performance in gaming and video, and nearly double the raw compute performance of current dual-core ARM-based mobile chips, while still saving power. To do so, Kal-El combines four "fast process" cores with a 5th "low power process" core that handles idle background processing tasks.

Kal-El includes four ARM Cortex A9-based cores designed on a 40nm process. The transistors that make up these cores can be switched at higher frequencies while using a lower operating voltage compared to, say, transistors on a 65nm process. Optimizing Kal-El for higher operating frequencies allowed NVIDIA to lower the operating voltage of the cores.

This lowering of core operating voltage is critical to improving Kal-El's power efficiency. The power draw of any processor is largely made up of dynamic power use—the power needed to switch its various transistors on and off billions of times per second—and a tiny amount of "leakage" power caused by current that "leaks" across the tiny gaps between conductors in a transistor.

Dynamic power in a processor is proportional to both the operating frequency and the square of the operating voltage. This makes it possible to increase the operating frequency—i.e. make the processor faster—while still reducing overall power consumption by reducing the operating voltage.
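
As a rough illustration of that relationship, the sketch below computes how dynamic power changes when frequency and voltage are scaled together. The function name and the example ratios are assumptions chosen for illustration; they are not figures from NVIDIA's white papers.

```python
def relative_dynamic_power(freq_ratio: float, voltage_ratio: float) -> float:
    """Return new dynamic power as a multiple of the baseline.

    Relies only on the proportionality P_dynamic ∝ f * V^2, so the
    chip-specific constant cancels out of the ratio.
    """
    return freq_ratio * voltage_ratio ** 2


# Hypothetical example: clock raised by 20%, voltage lowered by 15%.
ratio = relative_dynamic_power(freq_ratio=1.20, voltage_ratio=0.85)
print(f"{ratio:.3f}")  # prints 0.867
```

In this made-up case the chip runs 20 percent faster yet draws about 13 percent less dynamic power, because the voltage term is squared.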

Since Kal-El's four main ARM cores can run at higher frequencies but lower voltages than the current dual-core Tegra 2, the chip can either match Tegra 2's performance within a much lower power envelope, or it can nearly double the performance of Tegra 2 while still using slightly less power overall.

There is a downside to shrinking the process to increase performance-per-watt, however. The smaller transistor gate gaps increase the amount of leakage current that flows throughout each core. So while Kal-El's cores end up using less power under heavy load than the Tegra 2, those cores would end up using more power while a device is in an "idle" or standby state, when it runs only low-priority background processes such as checking for new e-mails or tweets, receiving push notifications, and running timers.

The graph below shows two hypothetical processors, one using a larger, low-power process, and one using a smaller, higher-performance process. To ramp up the frequency in a CPU using a larger, low-power process, voltage must be increased, resulting in a steep climb in power draw. Pushing a smaller, higher-performance process requires less voltage, and therefore lower power, at full tilt. But such a processor draws much more power at idle.
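
The tradeoff between the two hypothetical processors can be sketched with a toy model in which total power is leakage plus dynamic power. Every number here (leakage, voltage curves, the switching coefficient) is invented purely to reproduce the shape of the comparison, and leakage is treated as constant for simplicity; none of it comes from NVIDIA's data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HypotheticalProcess:
    """Toy model of a CPU built on one manufacturing process (all values invented)."""
    name: str
    leakage_mw: float      # leakage power, treated as constant for simplicity
    base_volts: float      # voltage needed near zero frequency
    volts_per_ghz: float   # extra voltage needed per GHz of clock speed
    switch_coeff: float = 400.0  # mW per (GHz * V^2), same for both processes

    def voltage(self, freq_ghz: float) -> float:
        # Assumed linear voltage/frequency relationship.
        return self.base_volts + self.volts_per_ghz * freq_ghz

    def power_mw(self, freq_ghz: float) -> float:
        # Total power = leakage + dynamic, with dynamic ∝ f * V^2.
        dynamic = self.switch_coeff * freq_ghz * self.voltage(freq_ghz) ** 2
        return self.leakage_mw + dynamic


low_power = HypotheticalProcess("low-power", leakage_mw=5.0,
                                base_volts=0.80, volts_per_ghz=0.50)
fast = HypotheticalProcess("fast", leakage_mw=60.0,
                           base_volts=0.70, volts_per_ghz=0.15)

for freq in (0.1, 0.5, 1.0):
    print(f"{freq:.1f} GHz: low-power {low_power.power_mw(freq):6.1f} mW, "
          f"fast {fast.power_mw(freq):6.1f} mW")
```

With these made-up parameters, the low-power process draws less at 0.1 GHz because its leakage is small, while the fast process draws less at 1.0 GHz because it needs far less voltage to reach that speed.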

To minimize power consumption both on standby and under heavy load, NVIDIA added what it calls a 5th "stealth" core to Kal-El. This core is built on a "low power process," which has a much lower leakage current than the process used for the four main cores. It runs at a maximum of 500MHz and handles the sorts of tasks that run while a mobile device is on standby, as well as common background tasks such as playing music.

As more CPU power is demanded, Kal-El's Variable Symmetric Multiprocessing (vSMP) technology will dynamically switch to use a high-performance core. vSMP continually manages whether Kal-El runs on a single low-power core, a single high-performance core, or two or four high-performance cores. The switching is handled by dedicated hardware on the chip, independent of the operating system. Kal-El can effectively act as a single-core low-power processor, a single-core high-performance processor, a dual-core high-performance processor, or a quad-core high-performance processor, as demand requires.

The vSMP system is designed to keep the shifts between cores to a maximum of 2ms, and according to NVIDIA these changes are "not perceptible to end users and [do] not result in any OS scheduling penalties." While a mobile device is on standby, background functions continue to run on the lower-clocked, power-efficient "companion" core. One high-performance core switches on for reading e-mail, basic browsing, or 2D graphics. Two cores come online for running Flash, multitasking, or playing back video. All four cores come online for multi-tabbed browsing, media processing, or high-end 3D gaming. The operating voltage can be regulated across all four cores to minimize power draw for a given performance level.
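
The workload examples above can be sketched as a simple lookup. This is only an illustration of the mapping NVIDIA describes, not how vSMP is implemented: the real switching happens in hardware, and the names `CoreConfig`, `WORKLOAD_TO_CONFIG`, and `config_for`, along with the rule of picking the largest configuration any active workload needs, are assumptions made for this example.

```python
from enum import Enum


class CoreConfig(Enum):
    # Declared from smallest to largest; the order is used for comparison below.
    COMPANION = "1 low-power companion core"
    ONE_FAST = "1 high-performance core"
    TWO_FAST = "2 high-performance cores"
    FOUR_FAST = "4 high-performance cores"


# Workload labels are hypothetical keys based on the examples in the text.
WORKLOAD_TO_CONFIG = {
    "standby": CoreConfig.COMPANION,
    "e-mail": CoreConfig.ONE_FAST,
    "basic browsing": CoreConfig.ONE_FAST,
    "2D graphics": CoreConfig.ONE_FAST,
    "Flash": CoreConfig.TWO_FAST,
    "multitasking": CoreConfig.TWO_FAST,
    "video playback": CoreConfig.TWO_FAST,
    "multi-tabbed browsing": CoreConfig.FOUR_FAST,
    "media processing": CoreConfig.FOUR_FAST,
    "3D gaming": CoreConfig.FOUR_FAST,
}


def config_for(active_workloads):
    """Pick the largest configuration any active workload needs (assumed rule)."""
    order = list(CoreConfig)  # Enum iteration follows definition order
    needed = [WORKLOAD_TO_CONFIG[w] for w in active_workloads]
    if not needed:
        return CoreConfig.COMPANION
    return max(needed, key=order.index)


print(config_for(["e-mail", "video playback"]).value)
```

Here a device that is both fetching e-mail and playing video would land on the two-core configuration, while an empty workload list falls back to the companion core.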

Kal-El was supposed to be available by August this year, according to NVIDIA's initial aggressive schedule. That milestone has come and gone, but judging from the data in its latest white papers, it should be nearing release in the first half of 2012. NVIDIA did not respond to our request for more information about a new target for a Tegra 3 launch.