Intel has released its Cooper Lake family of server processors, bringing some long-awaited improvements to the Xeon CPU family. In 2020, Intel's server CPU lineup is split between two families: Ice Lake in the 1-2 socket segment, and Cooper Lake in the 4S and 8S segments.
Originally, Cooper Lake was expected to be a top-to-bottom Xeon refresh, but Intel has since announced that Ice Lake will fill that role instead, implying better 10nm yields than the company previously expected. The three major new features are:
bfloat16: bfloat16 is a new 16-bit floating-point format for AI calculations that uses an 8-bit exponent and 7 bits of precision (mantissa), rather than the 5-bit exponent and 10-bit mantissa of the standard half-precision (fp16) format. Trading mantissa bits for exponent bits lets bfloat16 retain the full dynamic range of the 32-bit float32 format, at the cost of reducing precision from float32's 23 bits to 7 bits. In all of these formats, one bit is reserved to set the number positive or negative (the sign bit).
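Because bfloat16 is simply the top half of a float32's bit pattern, conversion can be sketched in a few lines. This is a minimal illustration using truncation; real hardware typically rounds rather than truncates:

```python
import struct

def float32_to_bfloat16_bits(x: float) -> int:
    # Reinterpret the float as its 32-bit pattern, then keep the top
    # 16 bits: 1 sign bit, 8 exponent bits, 7 mantissa bits.
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16

def bfloat16_bits_to_float32(b: int) -> float:
    # Zero-pad the mantissa back out to float32's 23 bits.
    (x,) = struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))
    return x

# The range survives, but precision drops:
pi_bf16 = bfloat16_bits_to_float32(float32_to_bfloat16_bits(3.14159265))
print(pi_bf16)  # 3.140625
```

The round trip shows exactly what the format gives up: pi comes back as 3.140625, correct to only a couple of decimal places, but huge and tiny values keep the same exponent range as float32.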
Doubled Socket Bandwidth: Straightforward increase here. Cooper Lake uses six UPI (Ultra Path Interconnect) links backed by three controllers, so each CPU can still only connect directly to up to three other chips, but with two 10.4GT/s links per connection, bandwidth between any pair of directly connected CPUs effectively doubles to 20.8GT/s.
Any CPU can reach any other CPU in at most two hops. Intel supports up to eight CPU sockets in total with this connection scheme, while additional logic can allow for more chips if a third-party OEM wants to build a platform for it.
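The per-connection math is simple enough to sketch. The two-links-per-connection figure is an assumption implied by the link and controller counts above, not something Intel spells out:

```python
UPI_LINKS = 6          # links per Cooper Lake CPU
UPI_CONTROLLERS = 3    # one controller per directly connected neighbor
LINK_RATE_GT_S = 10.4  # per-link UPI transfer rate

# Assumed: links are paired up, two per CPU-to-CPU connection.
links_per_connection = UPI_LINKS // UPI_CONTROLLERS
effective_rate = links_per_connection * LINK_RATE_GT_S
print(effective_rate)  # 20.8
```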
Faster RAM, New Optane Modules: Finally, there’s Cooper Lake’s support for higher-clocked RAM and new second-generation Optane modules from Intel. In terms of RAM clock, Cooper Lake supports up to DDR4-3200 if you use just 1 DIMM per channel (DPC). If you use two DIMMs per channel, you’ll need to drop to DDR4-2933.
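For a sense of scale, the theoretical peak memory bandwidth works out as below. This sketch assumes Cooper Lake's six DDR4 channels per socket and DDR4's 8-byte (64-bit) data bus:

```python
def peak_bandwidth_gb_s(mt_per_s: int, channels: int = 6, bus_bytes: int = 8) -> float:
    # MT/s = millions of transfers per second; each transfer moves bus_bytes.
    return mt_per_s * bus_bytes * channels / 1000  # GB/s per socket

print(peak_bandwidth_gb_s(3200))  # 153.6  (1 DPC at DDR4-3200)
print(peak_bandwidth_gb_s(2933))  # 140.784 (2 DPC at DDR4-2933)
```

So the 2 DPC penalty costs roughly 8 percent of peak bandwidth per socket in exchange for doubling capacity.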
Base model Cooper Lake CPUs now support 1.125TB of RAM, up from the previous 1TB standard, allowing customers to combine six 64GB and six 128GB RAM modules. It isn't clear why Intel doesn't just support 12x 128GB RAM modules for 1.5TB of total capacity; that would still leave a 3TB gap between the 1.5TB-supporting Xeons and the 4.5TB-supporting Xeons. Intel used to offer a 2TB-supporting set of products, but competition from AMD's Epyc, which supports 4.0TB of RAM across all SKUs, led the company to remove those chips. The Xeons with 4.5TB of RAM support are listed as "HL" CPUs.
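The capacity tiers are easy to verify; the module counts come from the configurations above, and the 12x 128GB case is the configuration Intel doesn't enable:

```python
GB_PER_TB = 1024  # memory capacities here use binary terabytes

mixed = 6 * 64 + 6 * 128  # six 64GB plus six 128GB modules = 1152 GB
print(mixed / GB_PER_TB)  # 1.125

all_128 = 12 * 128        # the 12x 128GB config Intel doesn't support
print(all_128 / GB_PER_TB)  # 1.5
```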
Then there’s the new second-generation Optane memory. According to Intel, it will be available in 128GB to 512GB capacities and will operate at DDR4-2666, both unchanged from the first generation. Intel claims, however, a 25 percent increase in bandwidth at the same clock, thanks to an (assumed) new Optane memory controller and software optimizations at the system level.
A 25 percent performance gain for Optane is an important jump, given the RAM’s uncertain position between DDR4 and NAND. Optane absolutely has workloads it excels at compared with NAND flash, but there’s no easily accessible killer application for most users. Any additional performance will help Optane close the gap with DRAM and differentiate itself from NAND flash.
The 25 percent performance improvement comes from comparing first-generation and second-generation Optane at 15W. First-generation Optane can draw up to 18W, while the second generation tops out at 15W; Intel did not disclose how the first generation at 18W compares against the second at 15W.
Sockets and SKUs
AKA, the worst role-playing game ever?
Cooper Lake debuts alongside Socket LGA4189, which technically comes in two flavors — LGA4189-4, and LGA4189-5. LGA4189-4 is intended for Ice Lake, while LGA4189-5 is for Cooper Lake. The difference comes down to PCIe version — LGA4189-4 supports PCIe 4.0, while LGA4189-5 is PCIe 3.0-only. Ice Lake Xeons should not be put in a Cooper Lake motherboard, but Cooper Lake can drop into an ICL motherboard without a problem.
Intel is only releasing three SKU families: Platinum 8300, Gold 6300, and Gold 5300. Anandtech has assembled a table of how they all compare against AMD’s Epyc:
Intel has substantially improved the frequencies and performance of the old 8180M with the new 8380HL. Intel offers wins in top frequency, maximum supported sockets, vector extensions, and AI-specific extensions, along with maximum addressable RAM if the use of Optane is included.
AMD has advantages in core counts, price, PCIe bandwidth, and memory bandwidth. Bfloat16 support is going to be the major draw here: if there’s a place where Intel can outperform AMD, it’s going to be AI workloads.
This shows where Intel expects to make its largest gains with Cooper Lake. Intel is predicting an 11 percent improvement for 3rd Gen over 2nd Gen in AI training performance when AVX-512 and FP32 are used. With DLBoost (and bfloat16) engaged, 3rd Gen (Cooper Lake) is 1.91x faster than 2nd Gen, and 1.73x faster than its own AVX-512 + FP32 results.
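Those two Cooper Lake claims are consistent with each other, allowing for rounding: the 1.91x gain over Cascade Lake is roughly the generational FP32 gain multiplied by the BF16-over-FP32 gain.

```python
fp32_gen_gain = 1.11   # 3rd Gen over 2nd Gen, AVX-512 + FP32
bf16_over_fp32 = 1.73  # 3rd Gen DLBoost/BF16 over its own FP32 results

# Stacking the two gains should land near Intel's quoted 1.91x.
combined = fp32_gen_gain * bf16_over_fp32
print(round(combined, 2))  # 1.92
```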
In inference, DLBoost with Cooper Lake is only 1.07x faster than DLBoost with Cascade Lake, and only 1.04x faster in FP32. The BF16 figures may look bad as well, but the lower performance is expected in this context.
BF16 isn’t as fast as INT8, but it offers better accuracy than that rival operating mode. Note that this graph also captures the impact of further software optimizations over time: you aren’t just seeing Intel’s AI processing get faster in hardware, you’re seeing the impact of better AI software, too.
After Cooper Lake, Intel will re-unify its entire product family with Sapphire Rapids, expected in 2021 and built on 7nm. Assuming Intel’s plans to launch 7nm late in 2021 are still on schedule, we can expect the CPU to debut towards the end of the year.
- Intel Announces Cooper Lake Will Be Socketed, Compatible With Future Ice Lake CPUs
- Intel’s Cascade Lake With DL Boost Goes Head to Head with Nvidia’s Titan RTX in AI Tests
- Intel Unveils Lakefield CPU Specifications: Up to 3GHz, 64 EUs, and 7W TDP