中文版在英文下方，阅读愉快!

(Chinese translation below English.)

There have been several successful television shows dedicated to the great feats of engineering. Medical breakthroughs, space exploration, technological marvels: we are fascinated by what we have been able to achieve. At the 2015 International Solid-State Circuits Conference (ISSCC), AMD revealed details on how we accomplished our latest engineering marvel, the upcoming “Carrizo” Accelerated Processing Unit (APU). The semiconductor industry has long relied on axioms of process technology, such as Moore’s Law and Dennard scaling, to drive improvements in device power and performance. As these scaling trends become harder to sustain, AMD is responding by implementing a wealth of power management and architecture improvements that in many cases deliver even greater benefit than traditional technology scaling. So, how do we do that?

Carrizo Real Estate





The new “Carrizo” microprocessor will include four “Excavator” processor cores and powerful AMD Radeon™ Graphics Core Next (GCN) cores. With approximately the same area footprint as its predecessor “Kaveri”, “Carrizo” fits 29% more transistors (3.1 billion in total) onto the die. By using a high-density library design, “Carrizo” achieves a 23% area reduction for the “Excavator” cores while still providing more transistors and more instructions per clock (IPC). The thermal density challenge of the smaller “Excavator” core is mitigated through intelligent floorplan placement and the use of lower-leakage transistors. The area saved on the cores allowed a larger share of the chip to be allocated to graphics and multimedia, and to integrating the southbridge and AMD Secure Processor logic onto the APU. The increased footprint for graphics intellectual property (IP) was used to improve the compute performance of “Carrizo,” which is designed to be the world’s first Heterogeneous System Architecture (HSA) 1.0 compliant part. The multimedia IP has been enhanced with a new high-performance video decoder and double the video compression engines of “Kaveri”. This larger multimedia engine can transcode nine real-time 1080p video streams, an impressive 3.5× improvement over “Kaveri”.
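The percentages above can be turned into rough baseline figures. The short Python sketch below is only back-of-the-envelope arithmetic on the numbers quoted in this post; it assumes that “29% more” and “3.5×” are simple ratios against “Kaveri”, and the variable and function names are illustrative, not AMD terminology.

```python
# Figures quoted in the text above.
CARRIZO_TRANSISTORS = 3.1e9   # total transistors on the "Carrizo" die
TRANSISTOR_GAIN = 0.29        # "29% more transistors" than "Kaveri"
CARRIZO_STREAMS = 9           # real-time 1080p transcode streams
TRANSCODE_RATIO = 3.5         # "3.5x improvement" (assumed to mean new / old)
CORE_AREA_REDUCTION = 0.23    # "23% area reduction" for the "Excavator" cores


def implied_baseline(new_value: float, ratio: float) -> float:
    """Return the older value implied by a new value and a new/old ratio."""
    return new_value / ratio


# Implied "Kaveri" transistor count: 3.1 billion / 1.29.
kaveri_transistors = implied_baseline(CARRIZO_TRANSISTORS, 1 + TRANSISTOR_GAIN)

# Implied "Kaveri" transcode capability, in real-time 1080p streams.
kaveri_streams = implied_baseline(CARRIZO_STREAMS, TRANSCODE_RATIO)

# The same logic in 77% of the area is at least 1 / 0.77 times denser.
# The text says the core also gained transistors, so this is a lower bound.
min_core_density_gain = 1 / (1 - CORE_AREA_REDUCTION)

print(f"Implied Kaveri transistors: {kaveri_transistors / 1e9:.2f} billion")
print(f"Implied Kaveri 1080p streams: {kaveri_streams:.2f}")
print(f"Minimum core density gain: {min_core_density_gain:.2f}x")
```

Because the die area is roughly unchanged, the 29% transistor increase is also approximately the chip-level density gain.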

Energy Efficiency and Power Consumption





HSA innovation from AMD saves energy by eliminating connections between discrete GPU and CPU processors, reduces computing cycles by treating the CPU and GPU as peers, and enables the seamless shift of computing workloads to the optimal processing component. HSA allows many workloads to execute more efficiently by using GPU compute resources in addition to CPU resources, providing better performance at the same energy consumption. Additionally, “Carrizo” moves the GCN cores to a separate, conditionally enabled power supply. This allows the graphics cores to operate at their optimal voltage, which can give a 20% power improvement over “Kaveri” with six GCN cores. “Excavator” supports AMD’s first implementation of adaptive voltage-frequency scaling (AVFS), an improved version of other adaptive voltage approaches. AVFS allows each part to self-calibrate and determine the optimal voltage for its current operating frequency and conditions. Comparing the predicted timing margin against the actual timing margin shows that AVFS can set the minimum required voltage across the entire voltage range, resulting in up to 30% power savings. The full implementation cost of AVFS is under one percent of the core area. In addition to the area reduction, the “Excavator” core has achieved its program goals by reducing power by 40% versus the previous “Steamroller” core!
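To give a feel for why per-part voltage calibration saves power, here is a minimal Python sketch of the general idea. It is a toy model and not AMD’s AVFS implementation: the `Part` class, its linear voltage model, the `timing_margin` sensor stand-in, and every numeric value are hypothetical. The only physics it relies on is the common approximation that dynamic power at a fixed frequency scales with the square of the supply voltage.

```python
from dataclasses import dataclass


@dataclass
class Part:
    """A hypothetical chip whose true minimum voltage varies part to part."""
    vmin_at_1ghz: float   # volts needed at 1 GHz (made-up value per part)
    slope: float = 0.15   # extra volts per additional GHz (made-up linear model)

    def timing_margin(self, voltage: float, freq_ghz: float) -> float:
        """Stand-in for an on-die timing sensor: positive means timing is met."""
        required = self.vmin_at_1ghz + self.slope * (freq_ghz - 1.0)
        return voltage - required


def calibrate(part: Part, freq_ghz: float, v_start: float = 1.40,
              step: float = 0.005, safety: float = 0.02) -> float:
    """Step the voltage down while the measured margin stays above `safety`."""
    v = v_start
    while part.timing_margin(v - step, freq_ghz) >= safety:
        v -= step
    return v


def relative_dynamic_power(v: float, v_ref: float) -> float:
    """Dynamic power at fixed frequency relative to v_ref, using P ~ V^2."""
    return (v / v_ref) ** 2


# A fixed, worst-case voltage must cover the slowest part ever shipped.
FIXED_VOLTAGE = 1.30                      # hypothetical guard-banded setting
typical_part = Part(vmin_at_1ghz=0.90)    # hypothetical better-than-worst part

v_cal = calibrate(typical_part, freq_ghz=2.0)
saving = 1 - relative_dynamic_power(v_cal, FIXED_VOLTAGE)
print(f"Calibrated voltage: {v_cal:.3f} V, dynamic power saving: {saving:.0%}")
```

Under the same V² approximation, a 30% reduction in dynamic power at constant frequency corresponds to roughly a 16% lower supply voltage, since the square root of 0.70 is about 0.84.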

So… How do we do that?





Through a multitude of impressive optimizations, AMD has been able to combine four “Excavator” cores, eight Radeon™ GCN cores, the southbridge, AMD Secure Processor technology for enterprise-class security, and an HSA 1.0 design on a single “Carrizo” APU. The new “Excavator” cores are smaller, more powerful, and more energy efficient than the previous generation. The power-optimized GCN graphics cores provide impressive performance-per-watt improvements. HSA capabilities enable new, more efficient applications. Multimedia throughput is improved by 3.5x, and hardware support for H.265 decode is included. All of this is done without a change in process technology, and while holding the die size flat generationally. “Carrizo” is truly a feat of engineering, a great step toward AMD’s 25x20 energy efficiency goal, and a testament to the AMD commitment to deliver great products.

To dig further into the details, check out the ISSCC 2015 AMD press release and presentation on the ISSCC page of the AMD website.

Kevin Lensing is Sr. Director, Client Product Management, Computing and Graphics for AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites are provided for convenience and unless explicitly stated, AMD is not responsible for the contents of such linked sites and no endorsement is implied.









高能效创新之道

有好几档不错的电视节目专门展现了人类在工程技术方面取得的许多重大成就。医疗技术突破、太空探索、技术飞跃，我们被人类自己的创举所震撼并为之着迷。即将问世的 AMD “Carrizo” 加速处理器 (APU) 就是一项最新的工程技术成就。AMD 在 2015 国际固态电路大会 (ISSCC) 上披露了这一产品的一些细节。长久以来，半导体行业依靠工艺技术的规律（例如摩尔定律和登纳德微缩定律）来改善电子器件的功耗和性能。如今这些定律越来越难应验和遵从。AMD 在诸多方面进行了大量的电源管理和架构改进，甚至比利用传统微缩工艺带来的优势更明显，从而很好地应对了这一挑战。那么AMD究竟是如何做到的呢？

Carrizo 的基础架构





新款 “Carrizo” 微处理器将包含四个 “挖掘机”处理器内核和多个性能强劲的AMD Radeon™ 下一代图形核心(GCN)。与前代“Kaveri”相比，“Carrizo” 在几乎同样大小的芯片上集成的晶体管数量多出了29% （31亿个）。通过采用高密度库，“Carrizo”使“挖掘机”内核的面积减少了23%，同时容纳了更多的晶体管并且每时钟周期指令数(IPC)也有所提高。 “挖掘机”内核变小会带来热密度挑战，AMD通过采用智能平面布局和低泄漏晶体管使这一挑战得以解决。内核面积缩小后，APU就会有更大的芯片空间留给集成显卡和多媒体，并用于集成南桥和AMD安全处理器逻辑电路。更大的集显 (专利技术) 面积用于增强 “Carrizo”的计算性能，并使其成为全球首款支持异步系统架构 (HSA) 1.0 的产品。由于采用了新型高性能视频解码器和两倍于“Kaveri”的视频压缩引擎，多媒体 (专利技术) 性能也得到了增强。更强大的多媒体引擎可以转码9个实时1080p视频流，比 “Kaveri”提升了惊人的3.5倍。

能效与功耗





HSA 是 AMD 的一大创新。它通过消除独显和CPU之间的连接节省了能源，通过把CPU和GPU放在对等的位置减少了计算周期，而且它能够把计算负载妥善地转移到最优的处理部件上。HSA能够利用GPU来更高效地执行许多负载，而不单纯依赖CPU资源，在相同能耗的情况下带来更好的性能。此外，“Carrizo”的GCN内核采用独立的、按条件启用的电源，因此图形核心可以工作在最佳电压之下，其功耗与具有6个GCN核心的 “Kaveri”相比改进了20%。 “挖掘机”支持AMD 率先应用的自适应电压-频率调节 (AVFS) 。相比其他自适应电压方案，AVFS经过了改进。它让每个部件都能够根据当前的工作频率和状态进行自我调节并确定最佳工作电压。通过比较预估时钟余裕和实际时钟余裕可以看出AVFS在整个电压范围内确定所需最低电压的能力。这一技术可节能高达30%。而完整实施AVFS的仅占用不到1%的内核面积。“挖掘机”内核不仅减小了面积，而且还实现了计划目标：功耗比“压路机”内核降低40%！

那么，效果如何呢？





经过大量令人赞叹的优化之后，AMD已经能够将四个“挖掘机”内核和八个Radeon™ GCN内核、南桥、面向企业级安全应用的AMD安全处理器技术、以及HSA-1.0设计，集成到一个“Carrizo” APU中。新的“挖掘机”内核比上一代更小、更强大并且更节能。功率经过优化的GCN图形核心的每瓦性能提升惊人。HSA特性带来新的、更加高效的应用。多媒体处理能力提升了3.5倍，并且支持 H.265 硬解码。如此多的革新和进步并未通过改变工艺便已经实现，而且芯片尺寸依旧纤薄小巧。“Carrizo” 真的是工程技术的精妙杰作，也是AMD朝着 25x20 能效目标迈出的一大步，更是 AMD 履行“创造伟大产品”承诺的有力证明。

如欲了解更多详情，请参见ISSCC 2015 AMD新闻稿和AMD官方网站的 ISSCC 讲稿。

Kevin Lensing 是AMD 图形与计算事业部的客户端产品管理高级总监。他的博文仅代表他个人观点，不代表AMD的立场、战略或观点。指向第三方网站的链接仅供方便您阅读之目的，除非特别注明，AMD不对指向的网站上的内容负责，也不意味着AMD对其内容持赞同态度。





*Originally Posted by System Admin in AMD Business on Feb 23, 2015 6:53:29 PM