Last night, I was sitting around thinking about Intel's forthcoming discrete GPU, codenamed "Larrabee." Specifically, I was thinking about how Larrabee will have lots of simple, in-order cores with a short pipeline, and it occurred to me that they might look something like Pentium cores. And lo and behold, I awakened this morning to a rumor that Intel is reviving the Pentium MMX microarchitecture for Larrabee.

According to Tech.co.uk, "the late 90s processor core will be resurrected in a highly modified form... Precisely how closely related Larrabee's execution cores will be to the original Pentium MMX isn't known."

Well, let's see. If you take the Pentium MMX microarchitecture, which fits the bill of short, simple, and in-order, then enlarge the front end to support four-way SMT, massively expand the floating-point/vector datapaths from 80 bits to 512 bits, and put Vec16 hardware in the back end, you have... something that bears about as much resemblance to the Pentium MMX as Core 2 Duo does to the Pentium Pro.

In other words, a high-level block diagram will show you a rough family resemblance, but calling it the "Pentium MMX" redivivus is a stretch.
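For a sense of scale, the 512-bit and Vec16 figures line up if you assume each vector element is a 32-bit value (single-precision floats, say). That element size is my assumption, not something the rumor states, and `vector_lanes` is just a hypothetical helper for the arithmetic:

```python
def vector_lanes(datapath_bits: int, element_bits: int) -> int:
    """Number of elements that fit side by side in one vector datapath."""
    if element_bits <= 0 or datapath_bits % element_bits != 0:
        raise ValueError("datapath width must be a whole multiple of element width")
    return datapath_bits // element_bits


# Assumption: 32-bit elements, which is what would make a 512-bit path "Vec16".
lanes_32bit = vector_lanes(512, 32)

# Same datapath with 64-bit elements (also an assumption) holds half as many.
lanes_64bit = vector_lanes(512, 64)

# How much wider the rumored datapath is than the 80-bit one it would replace.
widening_factor = 512 / 80

print(lanes_32bit, lanes_64bit, widening_factor)
```

Under that assumption the datapath is more than six times as wide as the one it would replace, which is part of why the family resemblance only goes so far.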

What the rumor makes me wonder is how many instructions Larrabee's cores will be able to issue per clock. The "Pentium MMX" rumor suggests that the answer is two, but that seems like a low number for a core that can do four-way SMT. Still, the SMT is there for latency hiding, and issuing more than two instructions per clock (without some kind of bundling, like micro-ops fusion) is probably infeasible for a statically scheduled machine. The idea that it could reach something like four or more instructions per clock using some kind of Pentium M-style macro- or micro-ops fusion mechanism is interesting, but such tricks may add too much complexity to the front end's predecode hardware to be worth it.