Date: 2020-02-10

AMD has rewritten the rules for desktop performance over the past three years. On Friday, the chipmaker launched its Ryzen Threadripper 3990X, the world’s first 64-core desktop CPU. I’ve already written a teaser for this article and worked through some of my early thoughts on the CPU, but here we dig into the data on the chip and see what the results can tell us.

In the right circumstances, the Ryzen Threadripper 3990X delivers incredible performance. In non-optimized workloads, it is flat against the Ryzen Threadripper 3970X or even slower. I’ve spent a lot of time putting the CPU through its paces, looking for scenarios where it succeeds and analyzing where it falls flat. I also put it outside in 12 degree Fahrenheit (-11C) air and overclocked it on the Asus Zenith II Extreme just for fun. We’ll talk about that too. I may have lost the world record to an enterprising overclocker with liquid nitrogen, but the CB20 score I achieved would still put this system in second place, according to HWBot.

I am assuming that you are familiar with the 3990X Threadripper and have read our 3970X review and last week’s introduction of the 3990X.

The Windows Thread Scheduler

One problem that affects our Windows 10 results is that the operating system doesn’t scale very effectively beyond 64 threads. Windows divides logical processors into processor groups, with each group holding up to 64 of them. Some applications implement their own thread scheduler, but applications that do not are often limited to ~50 percent CPU usage on the 3990X. Linux scaling is generally better. Rob Williams at Techgage has more data on this. 3D rendering applications are by far the strongest use case for the 3990X because the chip scales best in them. It has been reported that Windows 10 Enterprise may offer better scaling, but our contacts at AMD indicated that a 3990X buyer has no reason to run Windows 10 Enterprise. AMD’s official guidance is that Windows 10 Pro is enough.
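To make the ~50 percent figure concrete, here is a minimal sketch of the processor-group arithmetic. It is a simplified model, not how Windows actually assigns groups (which also depends on topology); the function names are hypothetical and chosen only for illustration.

```python
# Simplified model: logical processors are split into consecutive groups,
# and a group-unaware application is confined to a single group.
GROUP_SIZE = 64  # a Windows processor group holds at most 64 logical processors


def split_into_groups(logical_processors, group_size=GROUP_SIZE):
    """Partition logical processor IDs into consecutive groups."""
    ids = list(range(logical_processors))
    return [ids[i:i + group_size] for i in range(0, len(ids), group_size)]


def max_utilization_single_group(logical_processors, group_size=GROUP_SIZE):
    """Upper bound on total CPU usage for an app that stays in one group."""
    groups = split_into_groups(logical_processors, group_size)
    return len(groups[0]) / logical_processors


if __name__ == "__main__":
    threads_3990x = 64 * 2  # 64 cores with SMT enabled
    groups = split_into_groups(threads_3990x)
    print("groups:", len(groups), "sizes:", [len(g) for g in groups])
    print(f"ceiling for a single-group app: "
          f"{max_utilization_single_group(threads_3990x):.0%}")
```

Under this model, a 128-thread 3990X ends up with two groups of 64, so software that never leaves its starting group tops out at half the chip.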

The (missing) Intel competition

We contacted Intel to ask whether the company would provide Xeon server CPUs to benchmark against the 3990X, but the company declined to sample them. While these comparisons would not have been price-matched, they would have allowed us to compare top-end solutions from both companies. Without Xeons available, our comparison was limited to the Core i9-10980XE.

At $1,000, the 10980XE may not be seen as fair competition for the 3990X at $4,000, but it is the closest Intel CPU we have, and I wanted to give a sense of how it stacks up. Since different applications react so differently to CPUs with high core counts, there are cases in which the 10980XE matches or exceeds the 3990X. Even today, more cores are not always better.

Result formatting, test setup

This review is bigger than my typical coverage, and I have divided the results into different categories. Our standard test suite compares the top Threadripper CPUs and the Core i9-10980XE (along with the Ryzen 9 3950X where possible) in a variety of applications. The next section examines SMT scaling on the 3990X, with some specific evaluations in applications like DaVinci Resolve using Puget Systems’ extended benchmarks for that application. Finally, the overclocking section discusses our OC results, obtained with a little help from Mother Nature.

All testbeds were equipped with 64 GB of DDR4-3600 across four DIMMs. XMP was enabled on both AMD and Intel systems, but the Intel Cascade Lake system required a DDR4-3200 memory clock rather than DDR4-3600. Both Intel and AMD systems were tested with a Corsair MP600 SSD, although the AMD system used the drive in PCIe 4.0 mode, while the Intel rig was limited to PCIe 3.0.

In all cases, an RTX 2080 was used to run GPU tests with Nvidia GeForce Game Ready Driver 442.19. The latest UEFI images have been loaded on all motherboards.

A few application-specific notes before we start: I have included MATLAB results here. MATLAB favors Intel by default for reasons that we cover in more detail in this article. I tested the AMD CPUs with the “Cripple AMD” behavior both enabled (the default) and disabled.

We have also added a significant prosumer/workstation workload. Puget Systems sells its own expanded benchmark suite for various applications, including DaVinci Resolve. These tests are demanding: the 8K DaVinci Resolve suite requires a GPU with up to 20 GB of VRAM, so we couldn’t run it. The free trial version of DaVinci Studio was used, which affects performance in an H.264 benchmark according to Puget. However, the results we report are valid for this version of the application, and the same workload ran on the same software on every CPU we tested. I also have data from Agisoft Metashape, but by late Sunday it became clear to me that I would have to re-check those results.

Special thanks to Antonio Bosi, who designed our Maya 2020 Arnold Render CPU benchmark by modifying a scene from his existing test suite. Antonio maintains his own website with a range of Arnold render tutorials, personal artwork, and 3D models to download. The optimized version of the benchmark scales about four percent better than the standard version and, at 1.4x, delivered the strongest rendering uplift we saw for the 3990X over the 3970X.

We also downloaded a number of Blender scenes for test rendering, all of which were used in the open Blender movie “Spring”. Above and below are two screenshots of representative animation frames:

Test results

The tests included: 7-Zip, Blender (standalone benchmark and full application), Cinebench R15 and R20, HandBrake 1.3.1, Indigo Bench, Maya 2020, Neat Bench, POV-Ray 3.7, and a Qt compilation benchmark with MSVC 2019.

In these results we see two different performance profiles. In some applications, generally rendering applications, the 3990X is 1.3x to 1.4x faster than the 3970X. Some tests, such as V-Ray, suggest even greater scaling. We benchmarked a number of rendering applications to demonstrate that existing software can take advantage of the extra cores in many cases. In some cases, like Arnold Render, the 3990X even proves cost-effective against the 10980XE, which takes 3.63 times longer to render our test scene and costs 25 percent as much. We will also examine a renderer that does not scale in our SMT section.
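The Arnold comparison can be checked with some quick arithmetic. The sketch below uses the figures from this review (a 3.63x speedup, $4,000 versus $1,000 CPUs); the function name and the $2,000 figure for the rest of the system are hypothetical, included only to show how shared platform cost changes the picture.

```python
def perf_per_dollar_ratio(speedup, price_fast, price_slow, shared_cost=0.0):
    """Performance-per-dollar of the faster system relative to the slower one.

    speedup     -- how many times faster the expensive CPU finishes the job
    shared_cost -- cost of components common to both builds (an assumption)
    """
    cost_ratio = (price_fast + shared_cost) / (price_slow + shared_cost)
    return speedup / cost_ratio


if __name__ == "__main__":
    arnold_speedup = 3.63  # the 10980XE takes 3.63x longer on our test scene

    cpu_only = perf_per_dollar_ratio(arnold_speedup, 4000, 1000)
    # Hypothetical: $2,000 of identical RAM, storage, GPU, etc. in both builds.
    whole_system = perf_per_dollar_ratio(arnold_speedup, 4000, 1000,
                                         shared_cost=2000)

    print(f"CPU price only:      {cpu_only:.4f}")
    print(f"with shared platform: {whole_system:.4f}")
```

On CPU price alone the two chips land close to parity; once the cost of the rest of the machine is counted, the faster chip pulls ahead. The value of render time saved is not modeled here.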

Outside of rendering applications, the 3970X is generally the better choice, especially since it plays more nicely with the standard version of Windows 10. For rendering applications, the improved performance can pay off for creatives who have money to burn.

I chose to use slideshows for this article due to the amount of data, but some of our results don’t fit well in this format for various reasons. Our MATLAB benchmark was provided by Intel to test the performance of the Core i9-10980XE. The following table summarizes MATLAB performance on AMD hardware compared to Intel.

MATLAB is an application that does not scale beyond 64 threads on Windows 10 Pro. As a result, the 3990X with SMT enabled is slower than the 3970X, because its base clock is lower even in 32-core mode. We have seen several examples of this.

The Blender 1.0Beta2 benchmark is based on an older version of Blender (2.79) and runs slower than the current version (2.81). It also crashes on the 10980XE for unknown reasons (the 10980XE does not have this problem in actual use). Blender is a solid win for the 3990X, and although we don’t see our best scaling in this renderer, the 3990X renders between 25 and 35 percent faster than the 3970X in these scenes.

SMT scaling

Our standard test suite examined performance scaling between the 3970X and 3990X and generally found that the 3970X is the stronger option for the typical user, with specific gains for the 3990X in certain workloads. Now we turn to the question of SMT scaling under Windows 10.

Tests included: Blender, DaVinci Resolve (Puget Systems), Keyshot 9, Maya 2020, Maxwell Render 4.2.

There are only a handful of applications that suffer a performance penalty when SMT is enabled, but Maxwell Render 4.2 definitely does. Unfortunately, the “Benchwell” scene included in Maxwell Render 4 no longer loads in Maxwell Render 5, so I could not test whether the same problem occurs in the latest trial version. Turning SMT off gives the 3990X a win over the 3970X, but not by enough to justify the CPU’s cost.

Other tests, however, scaled consistently and benefited from having SMT enabled. We rendered an extensive set of Blender scenes (Junk Shop, Spring, Agent 327, and Mr. Elephant), all of which respond differently to SMT. I ran a number of other test renders that are not shown here in order to keep the results clear. The overall performance improvement in Blender for the professional-grade scenes available through Blender Cloud is a very consistent 1.3x to 1.4x. Keyshot scales a little less, at around 1.2x to 1.25x.

Intel doesn’t win DaVinci Resolve 16 outright, but it does win the price/performance category. The 3990X lands close to the 3970X, and I don’t know that many people would pay four times as much for a 1.16x performance increase. The outcome may depend on the specific codec and media settings you edit with, as the 3990X shows major improvements over the 10980XE in some tests.

Intel continues to make a case for its own usefulness and relevance in these workloads. The 3970X and 3990X have opened up a real lead, but they are not a blowout in every situation.

Overclocking performance

Seventeen years ago, literally to the day, on February 10, 2003, AMD launched the Athlon XP 2500+, 2800+, and 3000+. To overclock the 3000+ (and because we all knew at the time that it couldn’t match the Pentium 4 at stock clocks), I took it outside and pushed the CPU to 2.6 GHz, 1.2 times its stock clock. I never did it again until last weekend. This article was originally supposed to be published on Friday, but I’m glad it is actually running today, because it is fun when the dates line up this perfectly. The 3990X is a pretty good overclocking CPU, if my single sample is anything to go by, and the fact that this article runs on the same day is just the icing on the cake.

How did I do it? I put the whole system outside in 12F / -11C air. I actually experimented first with testing the system indoors, placing a secondary cooling fan and the CPU cooler’s fan assembly against a window so that the cooler could draw in outside air directly. Then I shut off the bathroom vent and sealed the room so the temperature would drop. This worked up to a point, but did not provide the cooling I wanted. Solution: overclocking outdoors.

It also sounds better than “overclocking in the bathroom”.

I started my tests at 3.7 GHz and 1.4 V on all cores, but that was too much voltage for the Asus Zenith II Extreme. The motherboard’s overcurrent protection (OCP) would trip partway through the stress tests. Lowering the voltage to 1.35 V led to better results. I ran the complete Blender Benchmark 1.0Beta2 suite at 3.6, 3.7, 3.8, and 3.9 GHz all-core, lowering the voltage with every step.

At 3.9 GHz and 1.3275 V, I decided to jump. I knew that a 4 GHz all-core would not break the 32K mark, and that is where I had to land to reach the (now second-highest) score. I chose 4.1 GHz and she POSTed … but my score was still too low. Since I wrote my article on Friday, the world record had been claimed by someone with a 5.3 GHz overclock, and I knew I wouldn’t be able to match that, but I thought I might be able to finish in second place.

At this point, it is approximately 1:30 a.m. on Sunday. Anyone passing by would have seen a remarkable sight: a brilliant star (because naturally both the motherboard and the memory are covered in LEDs) shining in my front yard. If they had come closer, they would have been amazed at the sturdy, nondescript, and unexpectedly valuable nightstand keeping around $6,000 in computer equipment away from everything that computer equipment should never touch. Without a case. Without screws. With said hardware in a position that could be described as “precarious use of the available space”.

I thought about it. I thought about testing extremely expensive hardware outdoors at night. I thought about snow and wind, and I thought about the fact that I was running significantly less voltage than I had first used to hold a stable 3.7 GHz OC.

If you want to overclock well, you have to understand it as an art. Systems don’t become unstable at random. There is an order and a hierarchy. Systems have to POST, boot, and then run benchmarks. The longer and stricter the test, the greater the likelihood that the overclock is stable. I knew that 3.7 GHz all-core was stable enough to withstand a number of tests and that 3.9 GHz was stable in Blender. However, I also knew that I couldn’t be far from tripping the motherboard’s overcurrent protection. The CPU idled at 6°C in the frozen wasteland of my front yard, but under load it already reached 67°C. Overclocked CPUs are often far more thermally sensitive than their stock counterparts, and 67°C was more than I expected.

Every time you push a CPU to the outer edge of the envelope (and here it could mean anything from a standard cooler to LN2), you’re dancing on a pinhead. It’s a gamble that you can tweak the CPU just enough to get a test result without sacrificing performance in a way that negates the net effect of your improvement.

CPU power consumption increases with clock speed, but it increases much more with voltage. I gambled that I had reduced the VRM load enough to handle a 4.3 GHz all-core at 1.3275 V, even though the machine had struggled at 4 GHz and 1.4 V.
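The reasoning behind that gamble can be approximated with the textbook first-order relation for CMOS dynamic power, P ∝ C·V²·f. This is a rough sketch, not something measured in this review: it ignores leakage and temperature effects, and the function name is hypothetical.

```python
def relative_dynamic_power(freq_ghz, volts, ref_freq_ghz, ref_volts):
    """Dynamic power relative to a reference point, using P ~ C * V^2 * f.

    The switched capacitance C is the same chip in both cases, so it cancels.
    Leakage power and temperature dependence are deliberately ignored.
    """
    return (freq_ghz / ref_freq_ghz) * (volts / ref_volts) ** 2


if __name__ == "__main__":
    # Reference point from the text: 4 GHz at 1.4 V.
    ratio = relative_dynamic_power(4.3, 1.3275, 4.0, 1.4)
    print(f"4.3 GHz @ 1.3275 V vs 4.0 GHz @ 1.4 V: {ratio:.3f}x")

    # Same clock bump with no voltage drop, for comparison.
    clock_only = relative_dynamic_power(4.3, 1.4, 4.0, 1.4)
    print(f"4.3 GHz @ 1.4 V vs 4.0 GHz @ 1.4 V:    {clock_only:.3f}x")
```

Under this model, the lower voltage more than offsets the higher clock, so the 4.3 GHz setting would draw slightly less dynamic power than 4 GHz at 1.4 V. That is consistent with the idea that dropping voltage bought headroom on the VRM.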

Remember, this is a CPU clock approximately 1.26 times higher than the clock the 3990X regularly settles on, and the goal was to run it on 64 CPU cores at the same time. All-core 4.3 GHz × 64 cores = 275.2 GHz. 0.275 THz. Yes, it starts with a decimal point. I don’t care.

I felt like Han Solo reaching for the Millennium Falcon’s hyperdrive levers, if Han had been the biggest fucking nerd on earth. My hands were freezing, my ears were numb, my front yard sounded like a hair dryer, and the rig had already startled a passing dog. It was time to see what she had. I typed “43”, hit “Save and Exit”, and kept my fingers crossed. She POSTed. Booted.

And set second-place world scores in Cinebench R20 and Cinebench R15, although I’m still working through the submission process for HWBot.

I’ve been a reviewer for 18.5 years. I’ve tested systems worth over $10,000. I had never tested a CPU that could be called the second fastest in the world. Even though I know that my own record will soon be broken by 3990X owners with LN2 and exotic cooling systems that don’t depend on the weather, and even though I know that the result is simply a benchmark, there is undoubtedly something cool about it.

I don’t think people will get 4.3 GHz overclocks out of the 3990X regularly, but lower clocks seem to be possible. The results above show that they can offer a sustained advantage, and the voltage required to maintain 3.7 GHz all-core is well below 1.3275 V, since I used that same voltage to reach 4.3 GHz. I leave it to manufacturers like Boxx to figure out the possibilities, but these test results imply that they may be good.

With the right workloads, overclocking the 3990X can make sense for the right buyer. My performance improved by 10 to 17 percent after moving from stock to 3.7 GHz all-core.

Conclusion

The 3990X is not for everyone. The scaling is not sufficient to objectively justify the price unless you shop in markets where price is no object. Even allowing for better scaling on Linux or Windows 10 Enterprise, it is unlikely that enough applications will benefit to make the chip an objective improvement for many buyers.

All of this is perfectly normal for products that sit at the top of a product stack. The Intel Xeon W-3265 is a 24-core chip with a 2.7 GHz base / 4.4 GHz boost. The Xeon W-3275 is a 28-core CPU with a 2.5 GHz base / 4.4 GHz boost. The W-3265 costs $3,349. The W-3275 costs $4,449. That is a 1.33-fold price increase for a 1.17-fold increase in core count. The Xeon Platinum 8280 is a 28-core CPU priced at $10,009; the Xeon Platinum 8270 is a 26-core CPU priced at $7,405. Nobody blinks when Intel prices parts this way, even though there is no workload on Earth where an 8280, with only two additional cores, provides a commensurate uplift over the 8270.
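The price and core-count ratios above are easy to reproduce. A short sketch, using only the prices and core counts quoted in this section (the helper name is hypothetical):

```python
# (cores, list price in USD) as quoted in the text
XEONS = {
    "W-3265": (24, 3349),
    "W-3275": (28, 4449),
    "Platinum 8270": (26, 7405),
    "Platinum 8280": (28, 10009),
}


def step_up(lower, upper, parts=XEONS):
    """Return (price ratio, core ratio) for moving from `lower` to `upper`."""
    lo_cores, lo_price = parts[lower]
    hi_cores, hi_price = parts[upper]
    return hi_price / lo_price, hi_cores / lo_cores


if __name__ == "__main__":
    for lower, upper in [("W-3265", "W-3275"),
                         ("Platinum 8270", "Platinum 8280")]:
        price_ratio, core_ratio = step_up(lower, upper)
        print(f"{lower} -> {upper}: "
              f"{price_ratio:.2f}x price for {core_ratio:.2f}x cores")
```

In both pairs the price climbs considerably faster than the core count, which is the pattern the 3990X is being compared against.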

But the 3990X doesn’t try to be everything to everyone. It is the laurel wreath. It is a victory lap. The 3970X is the CPU that takes on everything Intel has to offer. The 3990X is the one that closes the deal for AMD customers for whom money is no object.

As for its significance? It is the first time in 15 years that AMD has had a product that competes primarily for the “money is no object” segment. You have to go back to the era of dual-core Opteron and Athlon 64 FX, when AMD took on Prescott and Smithfield, to find a time when AMD could play for that position with this much confidence. Other reviewers with access to more expensive Xeons than I have confirmed that AMD wins benchmarks against $20,000 worth of Xeon CPUs in several areas. This is the kind of performance disparity that makes even the “money is no object” crowd take notice.

Well played.

