Nintendo hyped the Super FX as a "16-bit RISC" chip with "DSP functions." The first chip, used in StarFox and Super Star Fox Weekend, ran at 10.74 MHz. Later versions ran at 21.477 MHz. Below is what the original chip looked like, with a 100-pin package, 8 megabit (1 MByte) ROM, and 256 kilobit (32 KByte) SRAM. Note that it is indeed called the "Mario Chip 1".

Super NES sprites are always 4-bit pixels, but backgrounds can be 2-bit, 4-bit, or 8-bit. Super FX supports rendering to all of these pixel types, and 4-bit pixels can be automatically dithered between 2 colors. Star Fox uses this dithering extensively.
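To picture the two-color dithering, here is a minimal sketch of a checkerboard rule. It assumes two 4-bit colors are packed into one byte and that the choice between them depends on whether x + y is odd; the function names and the nibble order are made up for illustration and are not taken from the chip's documentation.

```python
def dithered_color(x, y, color_pair):
    """Pick one of two 4-bit colors packed in a byte, checkerboard style.

    Assumption: the low nibble is used where (x + y) is even and the
    high nibble where it is odd. The real chip's rule may differ.
    """
    if (x + y) & 1:
        return (color_pair >> 4) & 0x0F
    return color_pair & 0x0F


def fill_dithered(width, height, color_pair):
    """Fill a small chunky (one value per pixel) buffer with the pattern."""
    return [[dithered_color(x, y, color_pair) for x in range(width)]
            for y in range(height)]
```

Calling fill_dithered(4, 2, 0x3A) alternates colors 0xA and 0x3 along each row and flips the pattern on the next row, which the eye blends into an in-between shade.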

And then you need to draw the graphics, pixel-by-pixel. Here again the SNES falls flat. The graphics hardware mostly understands tile maps in a "bitplane" format, and you don't want to bog down the pokey CPU.
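To see why per-pixel drawing into a bitplane format is awkward, here is a rough sketch that converts an 8x8 tile of 4-bit pixels into the 32-byte planar layout the Super NES uses for 4-bit tiles. The layout assumed here is planes 0 and 1 interleaved row by row in the first 16 bytes and planes 2 and 3 in the last 16, with the leftmost pixel in bit 7; the function name is my own.

```python
def tile_to_bitplanes(tile):
    """Convert an 8x8 tile of 4-bit pixel values to 32 bytes of bitplane data.

    Assumed layout: planes 0/1 interleaved per row in bytes 0-15,
    planes 2/3 interleaved per row in bytes 16-31, leftmost pixel = bit 7.
    """
    out = bytearray(32)
    for y, row in enumerate(tile):
        for x, pixel in enumerate(row):
            mask = 0x80 >> x  # bit position of this pixel within each plane byte
            for plane in range(4):
                if pixel & (1 << plane):
                    offset = (plane // 2) * 16 + y * 2 + (plane % 2)
                    out[offset] |= mask
    return bytes(out)
```

Note that a single pixel is smeared across four different bytes. Plotting one pixel by hand means four read-modify-write operations, which is exactly the kind of busywork you don't want the main CPU doing.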

The Super NES, by itself, has limited math prowess, except for its renowned "Mode 7", which can only scale and rotate a single 2D background. Consider this: the Super NES' main processor, the CPU, doesn't even have an instruction for multiplication! (It has a math box bolted on after the fact.)

What made the Super FX special? Math power and pixel plotting. If there's one thing 3D graphics need a lot of, it's fast multiplication. Polygons, for example, are a lot like playing connect-the-dots, but before you draw them to the screen they need to be moved, animated, rotated, scaled, and placed in camera perspective. That needs a lot of matrix calculation.
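To show where all those multiplies come from, here is a small sketch of rotating one point about a single axis and projecting it to the screen using integer fixed-point math. The precision (8 fractional bits), the focal length, and the screen center are all assumptions picked for illustration, not values from any actual game.

```python
import math

FRAC_BITS = 8  # assumed fixed-point precision, chosen for illustration


def to_fixed(value):
    """Convert a float to a fixed-point integer with FRAC_BITS fractional bits."""
    return int(round(value * (1 << FRAC_BITS)))


def rotate_y(point, angle_rad):
    """Rotate an integer (x, y, z) point about the Y axis."""
    x, y, z = point
    c = to_fixed(math.cos(angle_rad))
    s = to_fixed(math.sin(angle_rad))
    # Four multiplies per point, and this is only one axis.
    rx = (x * c + z * s) >> FRAC_BITS
    rz = (z * c - x * s) >> FRAC_BITS
    return rx, y, rz


def project(point, focal=256, center=(112, 96)):
    """Perspective-project a camera-space point to screen coordinates."""
    x, y, z = point
    if z <= 0:
        return None  # behind the camera; a real engine would clip instead
    sx = center[0] + (x * focal) // z
    sy = center[1] - (y * focal) // z
    return sx, sy
```

A full three-axis rotation is a 3x3 matrix, so nine multiplies per point, plus two more and two divides for the projection. Multiply that by every vertex in the scene, every frame, and the appeal of a fast multiplier becomes obvious.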

The Mario Chip is the brainchild of Argonaut Software in the U.K., who specified its functionality. Ben Cheese Electronic Design implemented the chip's design. It was prototyped using Actel FPGAs and manufactured by Sharp.

What is the Super FX chip? The Super FX chip is the marketing name for the "Mario Chip" that was employed in StarFox to make that game's 3D polygon graphics feasible. There's no way the Super NES by itself can handle that kind of math fast enough for a good framerate. At the time the Super FX was an attractive, reasonably low-cost solution to math-intensive, per-pixel graphic effects on the Super NES. StarFox was released in 1993 as the result of years of work, both on the game and on the Super FX itself.

The Super FX is 16-bit internally but for various practicalities (cost, host system compatibility, board layout, power) talks to the outside world over 3 independent 8-bit buses.

The Super FX talks independently to 3 external devices: the Super NES itself, its own program ROM, and its own RAM. It also talks to its own internal buffers and cache. Moreover it lets the host Super NES also talk to the ROM and RAM as transparently as practical, although Super NES transactions are slow.

The Super FX RAM is normally a single-chip SRAM device, although the patent explains that DRAM could also be supported (at the expense of performance).

And although no commercial game ever did so, Nintendo's developer manual does describe exotic memory mappings for mixed memories. For example, you can have a Super FX with its own RAM and ROM (subject to its own capacity limits) and also have independent ROM and RAM that only the Super NES itself can see.

Okay, so what does the Super FX do for StarFox?

StarFox's game engine showcases the Super FX in many ways. It's amazing how much that game engine does on a 10.74 MHz chip:



Can't N64 and all modern consoles do all that and more?

Absolutely. A lot has changed since 1993. Today's systems are much more suited to making believable 3D worlds with even more complicated effects. They do this by running faster and having extra special circuitry that handles polygons directly. This may surprise you, but the Super FX has no specific "polygon hardware." It's all about software; it needs a good programmer to make proper use of it. Not to belittle today's more complicated games, but all a system like N64 needs to do is have a game engine generate a list of things to display and tell the graphics chip to draw it. On the Super FX, however, you also need to write software telling the Super FX how to draw stuff: a software rasterizer. Something of a lost art nowadays.
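To give a flavor of what a software rasterizer does, here is a minimal sketch of a flat-color triangle fill. It tests every pixel in the triangle's bounding box against the three edges, which I picked for brevity; it is not how StarFox or any shipped game did it, and a real engine on this hardware would more likely walk the edges and fill horizontal spans. All names here are my own.

```python
def edge(ax, ay, bx, by, px, py):
    """Signed area test: which side of edge A->B the point P lies on."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def fill_triangle(buffer, v0, v1, v2, color):
    """Fill a triangle into a list-of-rows pixel buffer, clipped to its bounds."""
    height, width = len(buffer), len(buffer[0])
    min_x = max(min(v0[0], v1[0], v2[0]), 0)
    max_x = min(max(v0[0], v1[0], v2[0]), width - 1)
    min_y = max(min(v0[1], v1[1], v2[1]), 0)
    max_y = min(max(v0[1], v1[1], v2[1]), height - 1)
    area = edge(v0[0], v0[1], v1[0], v1[1], v2[0], v2[1])
    if area == 0:
        return  # degenerate triangle, nothing to draw
    for y in range(min_y, max_y + 1):
        for x in range(min_x, max_x + 1):
            w0 = edge(v1[0], v1[1], v2[0], v2[1], x, y)
            w1 = edge(v2[0], v2[1], v0[0], v0[1], x, y)
            w2 = edge(v0[0], v0[1], v1[0], v1[1], x, y)
            # Accept either winding order by matching the sign of the area.
            if area > 0:
                inside = w0 >= 0 and w1 >= 0 and w2 >= 0
            else:
                inside = w0 <= 0 and w1 <= 0 and w2 <= 0
            if inside:
                buffer[y][x] = color
```

Even this toy version does several multiplies per pixel tested, which is why real rasterizers work hard to reduce the inner loop to additions.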

So can the Super FX do other stuff, too?

It can do whatever you program it to do, basically. That doesn't mean it's going to be ideal for everything. (I'm not sure I'd want it doing tons of graphics AND sound together, for example!) But that's a nice thing about it. Different games can do different things. Yoshi's Island does some simple polygons, but most of the time the Super FX is used for complicated 2D graphic effects like sprite scaling. DOOM, as many people know, doesn't use polygons at all for its graphics. Comanche was to have used a voxel engine.

What are its limits?



Of course a lot depends on how good your program is. You also have a limited amount of memory on the cartridge. (Typically 256 kilobits (32 KBytes) or 512 kilobits (64 KBytes) or 1 megabit (128 KBytes) of RAM and under 2 MBytes of ROM for the game program.) The images drawn by the FX fit in the RAM on the cartridge itself, and the Super NES itself is told to periodically stop the Super FX in its tracks and read the finished image. This is then shown on your TV until the next is ready. (A form of double buffering.)

Even in the worst case (16-bit x 16-bit multiply with a 32-bit result) it can return a result in 9 cycles. A smaller 8-bit x 8-bit multiply is 2 cycles. Pixel plotting is fast as long as its pixel buffers are used efficiently. So performance, if doing nothing but math, is still in the millions of operations/second. There is no instruction-level parallelism, so don't expect more than 1 operation per cycle in the best case. For the math-impaired, that means AT MOST 21 million operations/second on the fastest Super FX chip. In real life, out of its 10.74 MHz chip StarFox can achieve a few MIPS (usually around 4 on the map screen) and faster games can do more. DOOM, one of the most lambasted games, is actually impressive. It achieves the best cache utilization of any game I've looked at so far and can routinely hit 7-12 MIPS on a per-frame basis. Today of course $10 will buy you a chip that can do tens of billions of operations/second and decode a DVD but that's progress for you.

This means if you want 20 frames per second in a game you have a budget of--at most--a million or so operations per frame. Actually less since the Super FX has to be halted when the SNES is reading the results.
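The per-frame budget is simple arithmetic. Here it is as a small sketch, using the clock speeds and multiply timings quoted above. It assumes the best case of 1 operation per cycle and ignores the time the chip spends halted, so treat the results as upper bounds; the function names are my own.

```python
FAST_CLOCK_HZ = 21_477_000  # full-speed Super FX
SLOW_CLOCK_HZ = 10_740_000  # original chip, as used in StarFox


def ops_per_frame(clock_hz, fps, ops_per_cycle=1.0):
    """Upper bound on operations available per frame at a given frame rate."""
    return int(clock_hz * ops_per_cycle / fps)


def multiplies_per_frame(clock_hz, fps, cycles_per_multiply):
    """Upper bound on multiplies per frame if the chip did nothing else."""
    return clock_hz // (fps * cycles_per_multiply)


print(ops_per_frame(FAST_CLOCK_HZ, 20))            # 1073850
print(ops_per_frame(SLOW_CLOCK_HZ, 20))            # 537000
print(multiplies_per_frame(FAST_CLOCK_HZ, 20, 9))  # 119316
```

So the "million or so" figure is the fast chip at 20 frames per second. The original 10.74 MHz chip gets about half that, and if every operation were a worst-case 9-cycle multiply you would be down to roughly 119,000 per frame.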

The other limit is the SNES itself. You can only read so much from the cartridge in a given amount of time. (That's its DMA speed.) But more than that, you can only send it to video RAM when the PPU isn't accessing it to draw the screen, which brings me to the next point:

Another limit is the size of the image the Super FX can draw to: at most, 192 lines high, which is less than the SNES itself displays. Consequently, most Super FX games have a black border around the screen or cleverly use regular SNES graphics as a wrapper around Super FX-generated graphics. This black border serves a very critical secondary function -- it gives the SNES more time to get the results (via DMA) from the Super FX chip to its video RAM. The SNES cannot update its video RAM contents while it is driving the display, so by forcing a longer VBLANK (during which the SNES outputs black lines) the game can yield more time to DMA Super FX RAM contents to video RAM.

And so your framerate is limited by both the complexity of what the game is drawing (compute time) and by how long it takes to send the results to the Super NES itself (DMA speed).

StarFox, in particular, uses 75% of the screen resolution for its frame buffer during the action stages of the game. These frames (in the action stages) use 4-bit pixels. Each frame, once completed, takes 2 Super NES screen periods to transfer to video RAM. For less complex scenes, such as the title screen or the boss battle on Titania, 20 FPS is easily achieved. The worst performance is during the end credits where low single digit FPS is achieved as programmer names fly toward the screen.

Other games use different size frame buffers, both resolution and/or color depth, and sometimes it takes 3 Super NES screen periods to send larger amounts of data over.
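To put rough numbers on the StarFox case, here is a sketch that sizes a frame buffer covering 75% of the screen at 4 bits per pixel. It assumes the standard 256x224 Super NES display and the 32 KByte RAM mentioned earlier; the exact buffer dimensions StarFox uses are not stated here, so only the total pixel count is computed.

```python
SCREEN_WIDTH, SCREEN_HEIGHT = 256, 224  # assumed standard Super NES resolution
CART_RAM_BYTES = 32 * 1024              # 256 kilobit SRAM


def frame_buffer_bytes(pixels, bits_per_pixel):
    """Size of a packed frame buffer in bytes."""
    return pixels * bits_per_pixel // 8


pixels = SCREEN_WIDTH * SCREEN_HEIGHT * 3 // 4  # 75% of the screen
size_4bpp = frame_buffer_bytes(pixels, 4)
size_8bpp = frame_buffer_bytes(pixels, 8)

print(pixels)          # 43008
print(size_4bpp)       # 21504 bytes, fits in 32 KBytes
print(size_4bpp // 2)  # 10752 bytes per screen period over 2 periods
print(size_8bpp)       # 43008 bytes, too big for 32 KBytes
```

Under these assumptions each screen period has to move about 10.5 KBytes to video RAM, and the same buffer at 8 bits per pixel would not even fit in a 32 KByte RAM, which goes some way to explaining the choice of 4-bit pixels.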

How many polygons per second can it draw?

See above. I have no idea where Nintendo pulled the 76458 number from. I also still need to find that letter they sent me. It's very much lost right now.

But we have a better number that wasn't pulled from a hat. Nintendo Power volume 69, page 63, has a choice quote from Jez San regarding the ill-fated FX Fighter game. "When you throw in both fighters, plus the floor polygons and multiply it by the frame rate, you get somewhere between 15,000 to 20,000 polygons per second being displayed."

Now this low-sounding number is somewhat believable, and actually pretty impressive. We know that Fight for Life on the Atari Jaguar struggled to hit 20,000, and that system, in spite of its many handicaps and errata, is more capable overall.

As an aside, FX Fighter's development saga was weirder than Jez San himself would realize. From the title it is plainly obvious that this was intended for the Super FX chip, but when the Super NES game was cancelled it instead got ported to the PC. The PC game got a new software renderer called "B Render". The PC game also sucks, so you get an idea why the Super NES game was axed, even though we really might have appreciated it as a tech demo game. However, no one expected Silicon Graphics to pony up cash to buy B Render, which subsequently went nowhere.

Also for reference, a bottom rung Silicon Graphics Indy workstation from this time period (MIPS R4000 CPU, XL graphics) isn't rated for that much higher (39K triangle mesh polygons/second...or less in some of the Flip Objects demos) although you certainly do get higher resolution and more colors and so on.

What is the Super FX2 chip?

There are several incarnations of this chip. Nintendo rarely marketed their chips (outside of a few Nintendo Power articles, how many people knew or cared if their NES cartridge had MMC3B?) and in spite of the immense marketing push on the first Super FX games there was a lull until someone decided to pull the trigger and make up "Super FX2"!

Meanwhile in reality a few chips had come and gone.

It's clear the original chip was the cut-down model. Normally you cut down clock speed in a mass-market product for 2 reasons: yield (cost!) and/or power budget. (Nintendo's developer manual specifically mentions that Super FX games should detect the presence of a multitap and halt gameplay to avoid taking too much power!) They did, however, let you run the multiplier inside at full speed (21.477 MHz). There's a register bit that triggers this.

At some point a full-speed 21.477 MHz Super FX did indeed get out. It's unclear when exactly this cutover happened. It could have been a later flavor of the original MARIO chip or the GSU-1. And it's hard to pin down if this is "Super FX2" or not because it happened before the marketing message got out.

With GSU-2 we have a larger chip package, with 112 pins instead of 100 pins, and since the Super FX patent only covers the 100-pin model in any detail at all we don't know just what functionality was changed. Generally we associate the "Super FX2" moniker with a chip that runs at full speed (21.477 MHz) and can support (optionally) more ROM (16 megabits, as per Yoshi's Island). It is not two chips in one. That would cost a lot more because the chip would have twice as many parts. The Super FX cost a lot as it is. The initial cost to developers was reported to be $10 at the time, which is probably why so many third party games had such small memory sizes (4 megabits)--they wanted to reduce cost in other ways.

Below is a picture of the Super FX2, AKA "GSU-2". Notice the larger chip package, with 112 pins. Although the glare hides it, the board has a 64 KB SRAM installed and could support a 128 KB SRAM.



Did other games have extra chips?

Games like Pilotwings and Super Mario Kart have fixed-program NEC DSPs for extra math power. F1-ROC2 had a custom-programmed DSP as well.

Super Mario RPG has a much more general-purpose SA1 chip as a second CPU, and Street Fighter Alpha 2 has the SDD1 chip for graphics decompression. In this generation of games, coprocessor chips were not very common mostly due to cost, unlike on the NES where small, cheap MMC chips were required to access larger games on almost every cartridge. While the Super FX only added $10 to the cost of the cartridge, that price tag was enough to turn off most developers. There was a slightly reduced cost cartridge available--at least at a reduced price to Nintendo--that used simpler glop-top packaging, but that still didn't popularize it beyond a handful of games.



What games had FX chips?



Original Super FX chip games: