CPU Tests: Simulation

Simulation and Science have a lot of overlap in the benchmarking world; however, for this distinction we're separating the two segments mostly based on the utility of the resulting data. The benchmarks that fall under Science have a distinct use for the data they output – the tests in our Simulation section act more like synthetics, but at some level they are still trying to simulate a given environment.

DigiCortex v1.35: Link

DigiCortex is a pet project for the visualization of neuron and synapse activity in the brain. The software comes with a variety of benchmark modes, and we take the small benchmark which runs a 32k neuron/1.8B synapse simulation, similar to a small slug.

The results are given as a fraction of real-time, so anything above a value of one means the system is capable of real-time work. The benchmark offers a 'no firing synapse' mode, which in essence measures DRAM and bus speed; however, we take the firing mode, which adds CPU work with every firing.

The software originally shipped with a benchmark that recorded the first few cycles and output a result. While this made the benchmark last less than a few seconds on fast multi-threaded processors, slow dual-core processors could be running for almost an hour. There is also the issue of DigiCortex starting with a base neuron/synapse map in 'off mode', giving an artificially high result in the first few cycles as none of the nodes are yet active. We found that the performance settles down into a steady state after a while (when the model is actively in use), so we asked the author to add a 'warm-up' phase and to report the benchmark as the average over a second sample period.

For our test, we give the benchmark 20000 cycles to warm up and then take the data over the next 10000 cycles for the test – on a modern processor this takes 30 seconds and 150 seconds respectively. This is then repeated a minimum of 10 times, with the first three results rejected. Results are shown as a multiple of real-time.
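As a rough illustration of how those repeats are combined (a minimal sketch of the aggregation described above, not DigiCortex's or our actual harness; the `run_sample` callable is a hypothetical stand-in for one warm-up-plus-measurement pass):

```python
from typing import Callable, List

def aggregate_digicortex(run_sample: Callable[[], float],
                         runs: int = 10, discard: int = 3) -> float:
    """Repeat the measured sample and average the results.

    run_sample is assumed to perform the 20000-cycle warm-up and return
    the real-time multiple averaged over the next 10000 measured cycles.
    """
    results: List[float] = [run_sample() for _ in range(runs)]
    kept = results[discard:]              # first three results rejected
    return sum(kept) / len(kept)          # reported as a multiple of real-time
```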

(3-1) DigiCortex 1.35 (32k Neuron, 1.8B Synapse)

For users wondering why the 5800X wins: it seems that DigiCortex prefers single-chiplet designs, and the more cores the better. On the Intel side, the 10700 pulls a slight lead.

Dwarf Fortress 0.44.12: Link

Another long-standing request for our benchmark suite has been Dwarf Fortress, a popular management/roguelike indie video game, first launched in 2006 and still regularly updated today, with a Steam launch aimed for sometime in the future.

Emulating the ASCII interfaces of old, this title is a rather complex beast, which can generate environments subject to millennia of rule, famous faces, peasants, and key historical figures and events. The further you get into the game, depending on the size of the world, the slower it becomes as it has to simulate more famous people, more world events, and the natural way that humanoid creatures take over an environment. Like some kind of virus.

For our test we’re using DFMark. DFMark is a benchmark built by vorsgren on the Bay12Forums that gives two different modes built on DFHack: world generation and embark. These tests can be configured, but range anywhere from 3 minutes to several hours. After analyzing the test, we ended up going for three different world generation sizes:

  • Small, a 65x65 world with 250 years, 10 civilizations and 4 megabeasts
  • Medium, a 129x129 world with 550 years, 10 civilizations and 4 megabeasts
  • Large, a 257x257 world with 550 years, 40 civilizations and 10 megabeasts

DFMark outputs the time to run any given test, so this is what we use for the output. We loop the small test as many times as possible in 10 minutes, the medium test as many times as possible in 30 minutes, and the large test as many times as possible in an hour.
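To make the looping scheme concrete, here is a minimal sketch, assuming a hypothetical `run_test` callable that launches one DFMark world generation and returns the time DFMark reports (this is not the actual DFMark/DFHack interface):

```python
import time
from typing import Callable, List

def loop_for_budget(run_test: Callable[[], float], budget_seconds: float) -> float:
    """Run a test repeatedly until the time budget expires, then return
    the average per-run time in seconds (lower is better)."""
    times: List[float] = []
    start = time.monotonic()
    while time.monotonic() - start < budget_seconds:
        times.append(run_test())          # each call returns DFMark's reported time
    return sum(times) / len(times)

# Budgets as described above: 600 s (small), 1800 s (medium), 3600 s (large).
# small_avg = loop_for_budget(run_small_world_gen, 600)
```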

(3-2a) Dwarf Fortress 0.44.12 World Gen 65x65, 250 Yr
(3-2b) Dwarf Fortress 0.44.12 World Gen 129x129, 550 Yr
(3-2c) Dwarf Fortress 0.44.12 World Gen 257x257, 550 Yr

Dolphin v5.0 Emulation: Link

Many emulators are bound by single-thread CPU performance, and general reports tended to suggest that Haswell provided a significant boost to emulator performance. This benchmark runs a Wii program that ray traces a complex 3D scene inside the Dolphin Wii emulator. Performance on this benchmark is a good proxy for the speed of Dolphin's CPU emulation, which is an intensive single-core task using most aspects of a CPU. Results are given in seconds, where the Wii itself scores 1051 seconds.

(3-3) Dolphin 5.0 Render Test

210 Comments

  • dullard - Thursday, January 21, 2021 - link

    Yes, it is sad that even well respected PhDs in the field can't seem to understand that TDP is not total consumed power. Never has been, never will be. TDP is simply the minimum power to design your cooling system around.

    I actually think that Intel went in the right direction with Tiger Lake. It will do everyone a service to drop any mention of TDP solely into the fine print of tech documents because so many people misunderstand it.

    Yes, TSMC has a fantastic node right now with lower power that AMD is making good use of. Yes, that makes Intel look bad. Let's clearly state that fact and move on.

    Power usage matters for mobile (battery life), servers (cooling requirements and energy costs), and the mining fad (profits). Power usage does not matter to most desktop users.
  • dullard - Thursday, January 21, 2021 - link

    Also don't forget that we are talking about 12 or 28 seconds of extra power, after which it drops back down unless the motherboard manufacturer overrides it. The cost to desktop users for those few seconds is a fraction of a penny.
  • bji - Thursday, January 21, 2021 - link

    "minimum power to design your cooling system around" makes NO SENSE.

    You don't design any cooling system to handle the "minimum", you design it to handle the "maximum".

    It sounds like you've bought into Intel's convoluted logic for justifying their meaningless TDP ratings?
  • iphonebestgamephone - Thursday, January 21, 2021 - link

    Why are there low end and high end coolers then? Aren't the cheap ones for the minimum, in this case 65W?
  • Spunjji - Friday, January 22, 2021 - link

    dullard's comments are, indeed, a post-hoc justification in search of an audience.
  • dullard - Friday, January 22, 2021 - link

    Bji, no, that is not how engineering works. You need to know the failure limit on the minimum side. If your cooling system cannot consistently cool at least 65W, then your product will fail to meet specifications. That is a very important number for a system designer. Make a 60W cooling system around the 10700 chip and you'll have a disaster.

    You can always cool more than 65W and have more and/or faster turbos. There is no upper limit to how much cooling capability you can use. A 65W cooler will work, a 125W cooler will work, a 5000W cooler will work. All you get with better cooling is more turbo, more often. That is a selling point, but that is it - a selling point. It is the 65W number that is the critical design requirement to avoid failures.
  • edzieba - Friday, January 22, 2021 - link

    Minor correction on " Never has been, never will be": TDP and peak package power draw WERE synonymous once, for consumer CPUs, back when a CPU just ran at a single fixed frequency all the time. It's not been true for a very long time, but now persists as a 'widely believed fact'.
    Something being true only in very specific scenarios but being applied generally out of ignorance is pretty common in the 'enthusiast' world: RAM heatsinks (if you're not running DDR2 FBDIMMs they're purely decorative), m.2 heatsinks (cooling the NAND dies is actively harmful, cooling the controller was only necessary for a single model of OEM-only brown-box Samsung drives because nobody had the tool to tell the controller to not run at max power all the time), hugely oversized WC radiators (from the days when rad area was calculated assuming repurposed low-density-high-flow car AC radiators, not current high-density-low-flow radiators), etc.
    Even now "more cores = more better" in the consumer market, despite very few consumer-facing workloads spanning more than a handful of threads (and rarely maxing out more than a single core).
  • dullard - Friday, January 22, 2021 - link

    I'll give you credit there. I should have said "not since turbo" instead of "never has been". Good catch. I wish there was an edit button.
  • Spunjji - Friday, January 22, 2021 - link

    What's really sad is that you apparently prefer to write a long comment trying to dunk on the author, rather than read the article he wrote for you to enjoy *for free*.

    "I actually think that Intel went in the right direction with Tiger Lake"
    You think poorly.

    "Yes, TSMC has a fantastic node right now with lower power that AMD is making good use of. Yes, that makes Intel look bad. Lets clearly state that fact and move on."
    Aaaand there's the motivation for the sour grapes.
  • dullard - Friday, January 22, 2021 - link

    Spunjji, I must assume since you didn't have anything to actually refute what I said, that you have nothing to refute it and instead choose to bash the messenger. Thanks for backing me up!
