Ampere Altra ARM CPUs Launch with Up to 80 Cores to Challenge Xeon, Epyc


Ampere has been working on its first ARM CPU architecture to challenge Intel and AMD in the data center market, and the company is finally ready to launch the part. The new Altra is a 7nm SoC with up to 80 cores. Until now, Ampere has had the eMAG in-market, but that chip was a single-socket part originally designed by Applied Micro. Altra supports dual-socket cache-coherent operation and the ARMv8.2+ CPU standard (the + is there because Ampere reportedly pulled in future features to get them into silicon more quickly). Each Altra core packs 64KB each of L1 instruction and data cache plus 1MB of L2, while the entire CPU is backed by a 32MB L3 cache.

[Image: Ampere Altra introduction]

Each CPU core contains two 128-bit SIMD units, which offer rather less vector throughput than an equivalent AMD or Intel core. INT8 and FP16 are supported for machine learning workloads, and the Altra is reportedly a 4-wide design with a 3GHz turbo frequency. SMT is not implemented; each core is single-threaded. An 80-core Altra is therefore an 80C/80T part, compared with AMD’s maximum of 64C/128T per socket. According to ServeTheHome, TDP is 210W for the 80-core part.

[Image: Ampere Altra processor complex]

The Ampere Altra supports 8 channels of DDR4-3200 per chip, with up to 4TB of memory per socket. Like Epyc, Altra offers 128 PCIe lanes per CPU. In a dual-socket configuration, each CPU uses 32 of them for the socket-to-socket connection, leaving 192 PCIe lanes in a 2P system. AMD currently uses 48 lanes per socket for its socket-to-socket link via Infinity Fabric (IF), which means 2P configurations top out at 160 PCIe lanes. AMD can technically use even more lanes for chip-to-chip connectivity but hasn’t fielded any parts in this configuration.
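For readers who want to check the lane math, here is a minimal Python sketch. The function names are made up for illustration. The memory bandwidth figure is a theoretical peak that assumes a standard 64-bit (8-byte) DDR4 channel; it is not a number Ampere has quoted.

```python
def usable_pcie_lanes(lanes_per_socket: int, link_lanes_per_socket: int,
                      sockets: int = 2) -> int:
    """PCIe lanes left for devices after each socket gives up some lanes
    to the socket-to-socket link."""
    return sockets * (lanes_per_socket - link_lanes_per_socket)


def peak_memory_bandwidth_gbs(channels: int, megatransfers_per_s: int,
                              bytes_per_transfer: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s.

    Assumption: each DDR4 channel is 64 bits (8 bytes) wide.
    """
    return channels * megatransfers_per_s * bytes_per_transfer / 1000


altra_2p = usable_pcie_lanes(128, 32)  # 2 * (128 - 32) = 192
epyc_2p = usable_pcie_lanes(128, 48)   # 2 * (128 - 48) = 160
altra_bw = peak_memory_bandwidth_gbs(channels=8, megatransfers_per_s=3200)

print(f"Altra 2P: {altra_2p} lanes, Epyc 2P: {epyc_2p} lanes")
print(f"Altra peak memory bandwidth per socket: {altra_bw:.1f} GB/s")
```

Under these assumptions, the 32-lane difference between the two platforms in a 2P system comes entirely from how many lanes each socket reserves for the interconnect.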

Performance Expectations

Ampere is making some significant estimated performance claims for Altra, arguing that its 80-core chip will outperform an Epyc 7742 and heavily outperform the Xeon Platinum 8280. (The new Intel Xeon Gold 6258R would be expected to replace the Platinum 8280 in this comparison, given that it’s vastly cheaper and offers identical performance.) Note, however, that Ampere chose to de-rate the Epyc and Xeon results to account for compiler differences. Essentially, this means Ampere lowered the expected performance of the AMD and Intel platforms by 16.5 percent and 24 percent, respectively.
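To get a feel for how large those de-rating factors are, here is a small Python sketch. The baseline score of 100 is an arbitrary placeholder, not a real benchmark result, and the helper names are hypothetical.

```python
def derate(score: float, percent: float) -> float:
    """Lower a score by the given percentage."""
    return score * (1 - percent / 100)


def uplift_to_match_raw(percent: float) -> float:
    """Fractional lead over the de-rated score needed to equal the raw score."""
    return 1 / (1 - percent / 100) - 1


BASELINE = 100.0  # arbitrary placeholder score, not measured data

for name, pct in [("Epyc 7742", 16.5), ("Xeon Platinum 8280", 24.0)]:
    lowered = derate(BASELINE, pct)
    needed = uplift_to_match_raw(pct)
    print(f"{name}: {BASELINE:.0f} -> {lowered:.1f} after de-rating; "
          f"matching the raw score takes a {needed:.1%} lead")
```

In other words, a chip would need to beat the de-rated Epyc figure by roughly 20 percent, and the de-rated Xeon figure by roughly 32 percent, just to draw level with the un-adjusted scores.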

[Image: Ampere Altra end note 1]

One other thing to note, however, is that the Ampere Altra estimated on this slide has a 3.3GHz top clock speed, 10 percent higher than the part the company is actually shipping. Given that Ampere claims only a 4 percent advantage over AMD, it’s obvious why the company chose to model a higher clock speed. The question of whether we see that clock on a shipping part hasn’t been answered yet. STH has more details on the performance and power comparisons, but there are concerns about the validity of the TDP comparisons, so I’m not going to bother writing them up. Basically, Ampere used rated TDP rather than measured real-world power consumption, which raises serious questions about how meaningful the comparison actually is.
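A rough back-of-the-envelope check in Python shows why the modeled clock matters. It rests on a loud assumption: that performance scales linearly with clock speed. Real workloads generally scale less than linearly, so treat this as a pessimistic bound on the shipping part, not a prediction.

```python
SHIPPING_TURBO_GHZ = 3.0  # turbo clock of the part Ampere is shipping
MODELED_TURBO_GHZ = 3.3   # clock used for the estimate on the slide
CLAIMED_LEAD = 0.04       # claimed advantage over the (de-rated) Epyc

# How much faster the modeled part is clocked than the shipping one.
clock_uplift = MODELED_TURBO_GHZ / SHIPPING_TURBO_GHZ - 1

# Assumption: performance is proportional to clock speed.
relative_at_shipping_clock = (1 + CLAIMED_LEAD) * SHIPPING_TURBO_GHZ / MODELED_TURBO_GHZ
margin_at_shipping_clock = relative_at_shipping_clock - 1

print(f"Modeled clock uplift: {clock_uplift:.1%}")
print(f"Estimated margin vs. Epyc at 3.0GHz: {margin_at_shipping_clock:.1%}")
```

Under that linear-scaling assumption, the 4 percent lead would turn into a deficit of about 5.5 percent at the shipping clock.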

Still, there’s no arguing with the larger point. We’re finally starting to see some ARM companies challenging for data center positions. So far, these efforts are fairly characterized as fledgling, but more firms are tackling the space. Nuvia recently launched its own efforts around future server parts and Amazon has launched its own Graviton parts for cloud instances. Whether they can take meaningful market share from x86 is unclear, but they obviously intend to try.
