AMD Radeon Instinct MI60

AMD today charted out its plans for the next few years of product development, with an array of new CPUs and GPUs in the development pipeline.

On the GPU front are two new datacenter-oriented GPUs: the Radeon Instinct MI60 and MI50. Based on the Vega architecture and built on TSMC’s 7nm process, the cards are aimed not primarily at graphics (despite what one might think given that they’re called GPUs) but rather at machine learning, high-performance computing, and rendering applications.

MI60 will come with 32GB of ECC HBM2 (second-generation High-Bandwidth Memory) while the MI50 gets 16GB, and both have a memory bandwidth of up to 1TB/s. ECC is also used to protect all internal memory within the GPUs themselves. The cards will also support PCIe 4.0 (which doubles the transfer rate of PCIe 3.0) and direct GPU-to-GPU links using AMD’s Infinity Fabric. This will offer up to 200GB/s of bandwidth (three times more than PCIe 4) between up to four GPUs.
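The interconnect comparison above can be checked with some rough arithmetic. The sketch below is only an illustration: the x16 link width, the 8 GT/s and 16 GT/s per-lane rates, the 128b/130b encoding, and the idea that the "three times" comparison counts both directions of a PCIe 4 link are all assumptions made here, not figures AMD stated.

```python
# Rough interconnect arithmetic. Assumptions (not from the article): an x16
# link, per-lane rates of 8 GT/s (PCIe 3.0) and 16 GT/s (PCIe 4.0), and
# 128b/130b line encoding for both generations.

LANES = 16
ENCODING_EFFICIENCY = 128 / 130


def pcie_gb_per_s(gigatransfers_per_s: float, lanes: int = LANES) -> float:
    """Usable bandwidth in GB/s for one direction of a PCIe link."""
    gigabits_per_s = gigatransfers_per_s * ENCODING_EFFICIENCY * lanes
    return gigabits_per_s / 8  # 8 bits per byte


pcie3_one_way = pcie_gb_per_s(8.0)
pcie4_one_way = pcie_gb_per_s(16.0)
pcie4_both_ways = 2 * pcie4_one_way  # assumed basis for the comparison

INFINITY_FABRIC_GB_PER_S = 200.0  # figure quoted in the article

print(f"PCIe 3.0 x16, one direction:  {pcie3_one_way:.1f} GB/s")
print(f"PCIe 4.0 x16, one direction:  {pcie4_one_way:.1f} GB/s")
print(f"PCIe 4.0 x16, both directions: {pcie4_both_ways:.1f} GB/s")
print(f"Infinity Fabric vs PCIe 4.0:  {INFINITY_FABRIC_GB_PER_S / pcie4_both_ways:.2f}x")
```

Under these assumptions, PCIe 4.0 comes out at exactly twice PCIe 3.0, and the quoted 200GB/s is a little over three times the bidirectional PCIe 4.0 x16 figure.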

The cards will support a range of data types for computation; for neural networks and machine learning, there is half-precision 16-bit floating point and 4- and 8-bit integer support; for HPC workloads, there’s single (32-bit) and double (64-bit) precision floating point. AMD claims that the MI60 will be the fastest double-precision accelerator at up to 7.4TFLOPS, with the MI50 not far behind at 6.7TFLOPS.

The cards also include built-in support for virtualization, allowing one card to be securely shared between multiple virtual machines. This makes it easier for cloud operators to offer GPU-accelerated virtual machines.

The MI60 will ship to datacenter customers by the end of the year; MI50 is coming a little later but should be available by the end of Q1 2019.

On the CPU side of things, AMD talked extensively about the forthcoming Zen 2 architecture. The goal of the original Zen architecture was to get AMD, at the very least, competitive with what Intel had to offer. AMD knew that Zen would not take the performance lead from Intel, but the pricing and features of its chips made them nonetheless attractive, especially in workloads that highlighted certain shortcomings of Intel’s parts (fewer memory channels, less I/O bandwidth). Zen 2 promises to be not merely competitive with Intel, but superior to it.

TSMC’s 7nm process gives AMD the manufacturing advantage over Intel.


Key to this is TSMC’s 7nm process, which offers twice the transistor density of the 14nm process the original Zen parts used. For the same performance level, power is reduced by about 50 percent, or conversely, at the same power consumption, performance is increased by about 25 percent. TSMC’s 14nm and 12nm processes both trail behind Intel’s 14nm process in terms of performance per watt, but with 7nm, TSMC will take the lead.
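The two operating points quoted above imply different performance-per-watt gains, which a few lines of arithmetic make explicit. This is a minimal sketch that treats the article's "about 50 percent" and "about 25 percent" as exact and normalizes the 14nm part to 1.0 performance at 1.0 power; the function name is just an illustration.

```python
# Performance-per-watt implied by the 7nm figures quoted in the article,
# relative to a 14nm baseline normalized to perf = 1.0 and power = 1.0.

POWER_AT_SAME_PERF = 0.50  # "power is reduced by about 50 percent"
PERF_AT_SAME_POWER = 1.25  # "performance is increased by about 25 percent"


def perf_per_watt(perf: float, power: float) -> float:
    """Relative efficiency; the 14nm baseline scores 1.0."""
    return perf / power


same_perf_point = perf_per_watt(1.0, POWER_AT_SAME_PERF)
same_power_point = perf_per_watt(PERF_AT_SAME_POWER, 1.0)

print(f"Same performance, half the power: {same_perf_point:.2f}x perf/W")
print(f"Same power, 25% more performance: {same_power_point:.2f}x perf/W")
```

Holding performance fixed doubles efficiency, while holding power fixed improves it by only a quarter, which is why the two figures are presented as alternatives rather than combined.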

Zen 2 will also address certain weak aspects of the original Zen. For example, the original Zen used 128-bit data paths to handle 256-bit AVX2 operations; each operation was split into two parts and processed sequentially. In workloads using AVX2, this gave Intel, with its native 256-bit implementation, a big advantage. Zen 2 doubles the floating-point execution units and data paths to be 256-bit, doubling the bandwidth available and greatly improving the performance of this code. For integer workloads, branch prediction and prefetching have been made more accurate, and some caches enlarged.
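The AVX2 splitting described above can be pictured with a toy model. The sketch below is purely illustrative and does not represent how the hardware actually schedules work: it counts how many passes a 256-bit operation needs on a given data path width, and mimics the split by adding eight 32-bit floats as two 128-bit halves. All function names are hypothetical.

```python
import math

AVX2_BITS = 256
FLOAT_BITS = 32


def passes_per_operation(vector_bits: int, datapath_bits: int) -> int:
    """Sequential passes needed to push one vector through the data path."""
    return math.ceil(vector_bits / datapath_bits)


def add_in_chunks(a: list[float], b: list[float], datapath_bits: int) -> list[float]:
    """Add two 256-bit vectors one data-path-sized chunk at a time."""
    lanes = datapath_bits // FLOAT_BITS  # floats handled per pass
    result: list[float] = []
    for start in range(0, len(a), lanes):
        chunk_a = a[start:start + lanes]
        chunk_b = b[start:start + lanes]
        result.extend(x + y for x, y in zip(chunk_a, chunk_b))
    return result


a = [1.0] * (AVX2_BITS // FLOAT_BITS)  # eight 32-bit floats
b = [2.0] * (AVX2_BITS // FLOAT_BITS)

print("Original Zen passes:", passes_per_operation(AVX2_BITS, 128))
print("Zen 2 passes:       ", passes_per_operation(AVX2_BITS, 256))
print(add_in_chunks(a, b, 128) == add_in_chunks(a, b, 256))
```

The result is identical either way; only the number of passes differs, which is where the doubled throughput comes from.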

Zen 2 will also offer improved protection against some variants of the Spectre attacks.

The original Zen used a multichip module design. Chips used one, two, or four dies (for Ryzen, first-generation Threadripper, and Epyc/second-generation Threadripper, respectively) all put together into a single package. Each die had two Core Complexes (blocks of four cores), two memory controllers, some Infinity Fabric links (for connections between dies), and some PCIe lanes. This made it easy for AMD to scale up from the single-die, 8-core/16-thread Ryzen up to the 32-core/64-thread Epyc.

The original Zen topology: each die has all the components needed for a complete processor.
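The scaling of the original Zen design follows directly from the per-die figures given above. Here is a small sketch of that arithmetic; the two-threads-per-core value is inferred from the 8-core/16-thread figure, and the `Package` type is a hypothetical helper rather than anything from AMD.

```python
from dataclasses import dataclass

# Per-die figures from the article.
CCX_PER_DIE = 2
CORES_PER_CCX = 4
MEMORY_CONTROLLERS_PER_DIE = 2
THREADS_PER_CORE = 2  # inferred from "8-core/16-thread"


@dataclass(frozen=True)
class Package:
    """Totals for a multichip package built from identical Zen dies."""
    dies: int

    @property
    def cores(self) -> int:
        return self.dies * CCX_PER_DIE * CORES_PER_CCX

    @property
    def threads(self) -> int:
        return self.cores * THREADS_PER_CORE

    @property
    def memory_controllers(self) -> int:
        return self.dies * MEMORY_CONTROLLERS_PER_DIE


for dies in (1, 2, 4):
    p = Package(dies)
    print(f"{dies} die(s): {p.cores} cores, {p.threads} threads, "
          f"{p.memory_controllers} memory controllers")
```

One die gives the 8-core/16-thread Ryzen and four dies give the 32-core/64-thread Epyc, with memory controllers scaling in step because each die carries its own.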


Zen 2 is taking a very different approach, albeit one that still uses a multichip design. Instead of having each die contain CPUs, memory controllers, and I/O, the new design splits up the different roles. There will be a single 14nm I/O die, with eight memory controllers, eight Infinity Fabric ports, and PCIe lanes, and then a number of 7nm “chiplets” containing only CPUs and Infinity Fabric. This new approach should remedy some of the more awkward aspects of the original Zen; for example, there’s a significant latency overhead when a core on one Zen die has to use memory from another die. With the Zen 2 design, memory latency should become much more uniform.

The new Zen 2 design: common I/O functions are put on the 14nm I/O die, with the 7nm “chiplets” containing only CPUs.
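The latency argument can be illustrated by counting Infinity Fabric hops instead of nanoseconds, since the article gives no latency figures. The model below is a deliberately crude sketch: it assumes every pair of original Zen dies is directly linked, uses a hypothetical count of four chiplets for Zen 2, and ignores everything except die-to-die hops.

```python
# Toy hop-count model of memory access. Assumptions (not from the article):
# original Zen dies are all directly linked to each other, and the Zen 2
# package has four chiplets (the article only says "a number of").

ZEN1_DIES = range(4)
ZEN2_CHIPLETS = range(4)            # hypothetical chiplet count
ZEN2_MEMORY_CONTROLLERS = range(8)  # all on the I/O die, per the article


def zen1_hops(core_die: int, memory_die: int) -> int:
    """Original Zen: memory controllers sit on the CPU dies themselves."""
    return 0 if core_die == memory_die else 1


def zen2_hops(chiplet: int, memory_controller: int) -> int:
    """Zen 2: every chiplet reaches every controller through the I/O die."""
    return 1


zen1_distinct = sorted({zen1_hops(c, m) for c in ZEN1_DIES for m in ZEN1_DIES})
zen2_distinct = sorted(
    {zen2_hops(c, m) for c in ZEN2_CHIPLETS for m in ZEN2_MEMORY_CONTROLLERS}
)

print("Original Zen hop counts:", zen1_distinct)
print("Zen 2 hop counts:       ", zen2_distinct)
```

In this model the original Zen has two classes of access (local and remote), while Zen 2 has only one, which is the sense in which its memory latency becomes more uniform.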


AMD says that Zen 2 is sampling now, with processors due to hit the market in 2019. Zen 3, using an enhanced version of the 7nm process, is currently “on track” and likely to land in 2020, and Zen 4, on a more advanced process, is currently in the design stage.

