AMD Makes Some Lemonade…

AMD’s Mullins and Beema APUs redefine their mobile aspirations

I guess we could say that AMD has been rather busy lately. It seems that a significant amount of the content on PC Perspective this month revolved around the AMD AM1 platform. Before that we had the Kaveri products and the R7 265. AMD also reported some fairly solid growth over the past year with their graphics and APU lines. Things are not as grim and dire as they once were for the company. This is good news for consumers, as they will continue to be offered competing solutions that vie for that hard-earned dollar.

AMD is continuing their releases for 2014 with the announcement of their latest low-power and mainstream mobile APUs. These are codenamed “Beema” and “Mullins”, but they are based on the year-old Kabini chip. This may cause a few people to roll their eyes, as AMD has had some fairly unimpressive refreshes in the past. We saw rather meager improvements in clockspeed and power consumption with Brazos 2.0 a couple of years back, and it looked like this would be the case again for Beema and Mullins.

It isn’t.

I was fully expecting the same kind of meager improvements in power consumption and clockspeeds that we received with Brazos 2.0. Turns out I was wrong. This is a fairly major refresh which does a few things that I did not think were entirely possible, and I’m a rather optimistic person. So why is this release surprising? Let us take a good look under the hood.

Process and Design

The previous generation of low-power APUs is based on the Kabini and Temash designs. Kabini/Temash is composed of Jaguar CPU cores and GCN graphics/compute cores. These products spanned the 4W to 25W TDP range, and did so very well. They competed with the latest Intel Bay Trail parts fairly effectively (beating Bay Trail in graphics, but pulling more power overall). These products had a fair amount of success in the marketplace, culminating in the aforementioned AM1 platform. While these products are good, they just could not overcome the power and thermal advantages that Intel brought to the table with Bay Trail.

So AMD went back to the drawing board. Instead of doing a clean sheet design that would take years to complete, they went over Kabini/Temash with a fine-toothed comb and extracted every little bit of performance and efficiency out of the part. They also introduced new software technology that helps the APU handle multiple workloads in the fastest and most efficient manner it can. AMD also appears to have worked very closely with the foundry people to extract the greatest possible improvement from the 28 nm HKMG node.

Currently Intel is the only semiconductor manufacturer that is using a sub-28 nm process in any kind of volume. Yes, there are a few 20 nm products from other manufacturers in production, but they have not reached the market so far. Production at 20 nm and below has been problematic for the industry (except for Intel) for a variety of reasons. We do not expect 20 nm products from pure-play foundries and their partners until 2H 2014, and those look to be primarily low-power parts and SoCs. So to have any refresh product in 2014, these designers have to use 28 nm HKMG and its derivatives. For quite a while TSMC was the only pure-play foundry offering 28 nm, but GLOBALFOUNDRIES has finally gotten their process running at a decent pace. GLOBALFOUNDRIES appears to be the manufacturer of the 25 watt Kabini AM1 products for AMD.

AMD has not confirmed that GF is producing the latest Beema and Mullins chips, but there is a very good chance that they are. One thing that TSMC has and GF does not is a lot of customers vying for space on its process lines. When demand for line space is that high, designers do not get much time with the foundry engineers who could work on individual designs to improve thermals and clockspeeds. Foundry engineers still do a certain amount of work to improve yields and bins, but heavy demand does not allow work as in-depth as designers would like. There just is not enough manpower, money, and time to do this effectively.

If GF is in fact producing these parts, it looks like they have worked extensively with AMD to more thoroughly optimize Beema/Mullins for their 28 nm process. Now, this is complete speculation based on factors such as line space availability, GF opening up their 28 nm process, and the pretty significant improvements found with the latest Beema/Mullins APUs. Either way, even if TSMC is the primary foundry for this APU release, a lot of work has been done at 28 nm to get these improvements.

AMD engineers have also played a key role in the improvements we see. The Beema/Mullins parts are based on Puma+ x86 cores. This core is functionally identical to the earlier Jaguar, but it has been highly optimized to improve clock speeds and power consumption. There was some work done on the GPU side, but I do not think it was nearly as extensive as what we see on the CPU side. All indications point to process improvements and the lower TDP Puma+ cores allowing more thermal headroom for the GPU portion to be clocked significantly higher.

So how much faster are these new parts? Well, they are clocked quite a bit higher than the previous generation. What is more significant is that they can clock higher while achieving much LOWER TDPs. The results so far look almost like a half-node jump in thermals and clockspeeds, but these chips are still produced on the 28 nm HKMG process that was introduced almost two and a half years ago. I must reiterate, these chips are functionally identical (except for one major feature) to the previous Kabini/Temash parts. The basic design is a quad core CPU with 2 x GCN compute cores (128 total stream units).

The top-end SKU is the 15 watt A6-6310. This part has a top clock speed of 2.4 GHz for the four CPU cores, which is up 400 MHz from the 25 watt TDP Kabini parts. The GPU portion runs at 800 MHz, a full 200 MHz faster than the aforementioned Kabini. So, we have the same process node, but the TDP goes from 25 watts down to 15 watts, and the clockspeeds for the CPU and GPU portions rise pretty dramatically. Progress! That 15 watts also allows the memory controller to address memory at speeds up to DDR3-1866.
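To put those figures in proportion, here is a quick back-of-the-envelope sketch in Python. It only uses the numbers quoted above; the Kabini baseline (2.0 GHz CPU, 600 MHz GPU) is my own derivation from the stated 400 MHz and 200 MHz gains, and the dictionary keys and the percent_change helper are purely illustrative names. Keep in mind that TDP is a rated thermal figure and not measured power draw, so treat this as rough arithmetic only.

```python
# Figures quoted in the article. The Kabini baseline is derived from the
# stated deltas (2.4 GHz - 400 MHz and 800 MHz - 200 MHz), not from a spec sheet.
kabini = {"cpu_mhz": 2000, "gpu_mhz": 600, "tdp_w": 25}
beema = {"cpu_mhz": 2400, "gpu_mhz": 800, "tdp_w": 15}


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100


# Compare the two parts field by field.
changes = {key: percent_change(kabini[key], beema[key]) for key in kabini}

for key, pct in changes.items():
    print(f"{key}: {pct:+.1f}%")
```

In other words, the rated CPU clock goes up by 20%, the GPU clock by roughly a third, and the TDP drops by 40%, all on the same 28 nm node.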

The top “efficiency” SKU is the A10 Micro-6700T. This is still a quad core part with 2 x GCN compute cores (128 stream units), and its numbers are impressive. It has a max clock of 2.2 GHz for the CPU and 500 MHz for the GPU. This part has a rated TDP of 4.5 watts and an SDP (Scenario Design Power) of 2.8 watts. It also supports DDR3L-1333 memory (the L stands for low-voltage). Temash did not get anywhere close to these clockspeeds, though it was in the same 4 watt TDP ballpark.