Saying that Intel's Xeon Phi had a troubled birth is an understatement. Even with the products being deployed in thousands to tens of thousands of units, profitability is nowhere to be found. The company now plans to address that with a two-fold strategy, which begins with a reduction in BoM (bill of materials) cost. The original Xeon Phi, or “Larrabee resurrected”, is a standard PCI Express board with dual-slot cooling, and you could easily confuse it with a graphics card. The card, codenamed “Knights Corner”, is built around the Knights Corner accelerator chip (manufactured in 22nm) with 57 to 61 enabled cores. The number of physical cores on the die is of course larger (for yield purposes), and depending on which Xeon Phi you buy, you get 57, 60 or 61 cores paired with 6, 8 or 16GB of ultra-fast GDDR5 memory.

This does not come cheap, as Intel priced Xeon Phi between $1695 (3120P, 57 cores, 6GB) and $4129 (7120P, 61 cores, 16GB). The reality is different though, and we learned what the real ASPs (Average Selling Prices) are. First and foremost, in most Xeon Phi deals Intel made its money by bundling the card with the CPU and with the least expected, but nice, profit driver: the chipset itself. Xeon Phi on its own did not make a whole lot of money, especially given that most of the board's components come from outside Intel (the company does not manufacture GDDR5 memory, nor most of the PWM elements or the PCB itself). The figures we learned were even enough for a potential anti-trust action against Intel, given that the company was often selling Phi parts under price (or even giving them away, to banking institutions such as UBS), well in the red.

Furthermore, the first-generation Xeon Phi is extremely sensitive to the PCI Express bus, and latency and bandwidth can go up or come crashing down as different motherboards are used. Our friends at Puget Systems gave the best example of just how sensitive the Xeon Phi is to different PCIe standards. Remember, PCIe Gen3 x16 slots aren’t present on every server motherboard, as the makers just love to cram in as many PCIe x8 slots as possible. The fact that you have to adjust your software certainly did not help either. People in the know told us that what matters is not how many software changes you have to make, but how much of a speedup you can get, since multi-socket and multi-server approaches introduce unwanted latencies.
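
To put the slot differences in numbers, here is a rough sketch of the theoretical one-direction PCIe bandwidth for a few slot configurations. The per-lane rates and line encodings are the published PCIe Gen2 and Gen3 values; the function and table names are ours, and which link the card actually negotiates on a given motherboard is not something these figures tell you.

```python
# Per-lane raw rate in GT/s and line-encoding efficiency per PCIe generation.
PCIE_GENERATIONS = {
    2: (5.0, 8 / 10),     # Gen2: 5 GT/s, 8b/10b encoding
    3: (8.0, 128 / 130),  # Gen3: 8 GT/s, 128b/130b encoding
}


def pcie_bandwidth_gb_s(generation, lanes):
    """Theoretical one-direction bandwidth in GB/s, before protocol overhead."""
    rate_gt_s, efficiency = PCIE_GENERATIONS[generation]
    return rate_gt_s * efficiency * lanes / 8  # divide by 8: bits -> bytes


# Slot configurations chosen for illustration only.
for gen, lanes in [(3, 16), (3, 8), (2, 16)]:
    print(f"PCIe Gen{gen} x{lanes}: {pcie_bandwidth_gb_s(gen, lanes):.2f} GB/s")
```

Halving the lane count halves the ceiling, so a Gen3 x8 slot lands at roughly the same theoretical peak as a Gen2 x16 one. Real transfers come in below these numbers once protocol overhead is counted.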

Under the new management led by Brian Krzanich, Intel is trying to build a part that will make money for the company and not be reliant on third-party suppliers. How? The answer is easy: by turning Xeon Phi into a socketed product which requires a regular Intel Xeon to run. Yep, you’ve read it correctly: the future of Phi is to become a co-processor, something that reminds us of the days when AMD had technological leadership and talked about Torrenza, Open HyperTransport and so on. That was one of the key reasons why AMD snapped up ATI (after the merger talks with Nvidia failed): to bring a GPU onto the CPU socket. From the looks of it, Intel will be first out the door with a real coprocessor (if we forget what the Chinese are doing with HyperTransport and Alpha / MIPS cores), reducing Xeon Phi to a pure die-meets-organic-LGA-package play. Naturally, GDDR5 is no longer an option and we will see a reduction in sheer bandwidth, which currently reaches 352GB/s. Going with DDR4 on a 256-bit memory bus (should the company keep LGA-2011 and LGA-2011B for the foreseeable future) should cut the available bandwidth to the 150GB/s range. At the same time, memory capacity should grow from 16GB to 32, 64, 128 or 256GB, all while improving raw TFLOPS performance.
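
For readers who want to check the bandwidth arithmetic, peak memory bandwidth is simply bus width in bytes multiplied by the per-pin transfer rate. The sketch below assumes a 512-bit GDDR5 interface running at 5.5 GT/s for the current top card, which reproduces the 352GB/s figure. The DDR4 transfer rate of 2.133 GT/s is purely an illustrative assumption on our part, not a confirmed spec of the socketed part, and the function names are ours.

```python
def peak_bandwidth_gb_s(bus_width_bits, transfer_rate_gt_s):
    """Peak bandwidth in GB/s: bytes moved per transfer times transfers per second."""
    return bus_width_bits / 8 * transfer_rate_gt_s


def required_rate_gt_s(bus_width_bits, target_gb_s):
    """Per-pin transfer rate (GT/s) needed to hit a target bandwidth on a given bus."""
    return target_gb_s / (bus_width_bits / 8)


# Assumed configuration of the current GDDR5 card: 512-bit bus at 5.5 GT/s.
gddr5 = peak_bandwidth_gb_s(512, 5.5)

# Hypothetical DDR4 setup: 256-bit bus (four 64-bit channels) at 2.133 GT/s.
ddr4 = peak_bandwidth_gb_s(256, 2.133)

# Transfer rate a 256-bit bus would need to reach the 150GB/s range.
needed = required_rate_gt_s(256, 150)

print(f"GDDR5, 512-bit @ 5.5 GT/s:   {gddr5:.1f} GB/s")
print(f"DDR4,  256-bit @ 2.133 GT/s: {ddr4:.1f} GB/s")
print(f"Rate needed for 150 GB/s on 256-bit: {needed:.2f} GT/s")
```

Swap in whatever bus width and memory speed the final product ships with; the helper does not care which memory type sits behind the numbers.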

This way, it will be up to system integrators to decide just how much system memory should be plugged in. That spells a potential world of hurt for Nvidia's Tesla and AMD’s FirePro lineups, both of which have memory limited by the amount of PCB space the companies can give it. According to the information we have received so far, the first socketed samples are out in the wild, but the timeframe of the introduction itself is not known. We would not be surprised if Intel quickly replaces its current Phi lineup as soon as Haswell-EP/EX makes its public appearance, paving a path for a new and more profitable server platform. Naturally, Nvidia has already come up with a counter announcement, going with stacked DRAM for its 2015-16 Volta GPU architecture. AMD's plans aren’t known at this time, but knowing the reactive nature of that company, we wouldn’t be surprised if something comes a year or two after Intel. With around a 10-year delay from the original HTX concept, of course.

One thing is certain: things are heating up in the GPGPU/accelerator arena, and the key question is who stands to benefit the most. The ‘Big Iron’ market is growing, and Intel just might have a way to turn its ‘ugly duckling’ into a beautiful money-making swan.

Original Author: Theo Valich
