RAM and its effects in games redux

526christian 18 months ago

Sections of this post:

Technical Background

Bottlenecks and Misleading Benchmarks

Main Memory and Video Memory Capacity and Game Performance

Memory Channel Configurations and Game Performance

DIMM Speeds and Game Performance

Timings and Game Performance

Ranks and Game Performance

Updates

Recap, Reality Check, and Concluding Thoughts

This is a re-do of my previous RAM and its effects in games post from April 2016. Since then, many of the benchmarks have become effectively outdated. New hardware has been released, new games have come out, and new benchmarks have been done. Many of the benchmarks linked in that post were also misleading (more on that later). This post will better fit my original vision for that one. And this time, I'll keep it updated.

This post will be very long. In fact, it's just over 10,000 words long. I don't expect many to read through the body fully. However, even if you want to skip to the conclusion/summary at the end of the post, I recommend reading at least the following two or three sections first. Benchmarks for the 4th - 7th sections (channel configurations to ranks) are linked at the beginning of each section. Use the above list of sections to CTRL + F to them easily. Also know that nearly half of this post's body was written weeks before posting, so sorry if some of it reads a bit differently.

Now, with newer benchmarks, hardware, and games out, and having learned much more myself, let's talk about how main memory plays a role in your gaming experience.

Technical Background

Before moving on, it will help to know a bit about memory. Some of these terms will be used in the rest of the post, so bear with me.

RAM (usually referring to system memory, AKA main memory, sometimes simply called DRAM or SDRAM, but often just called memory in this post) is used for the operating system, applications, their instructions, and the data the applications work with. When you open an application, for example, it's copied into memory, where it can be worked with more readily. In games, memory is used for game code, audio, game state, and as a pipeline for sending information to the GPU.

As you probably already know, memory is found in the form of DIMMs (also called sticks or modules), which are PCBs with the DRAM chips and associated circuitry along with pins that are used for receiving signals and for sending or receiving data (these are also called the I/O pins). They are plugged into the motherboard in DIMM slots, where they are physically connected to the CPU via memory channels.

Memory channels are the links to the CPU. Each of them has a 64-bit wide data bus, where 64 bits are sent across it at a time. With recent platforms, memory channels are independent of one another, so data can be read or written in each channel at the same time, separately from one another. Having multiple independent channels improves the ability for multiple accesses to be carried out simultaneously. Older systems logically combined memory channels (or had the option to do so), but this is not done anymore (to my knowledge, and if it is, it's certainly not typical) for various reasons. In order to use memory channels, we need at least one module installed in each channel. For example, dual-channel configurations have at least one module installed in each of two channels, quad-channel configurations have at least one module installed in each of four channels, and so on. If you have more than 2 DIMM slots, see your motherboard manual for confirmation on where to install (slots are usually color coded, but check just to be sure).

The DRAM chips on DIMMs are organized into ranks. Ranks are groups of DRAM chips that work in unison. When memory is accessed, all the DRAM chips within the relevant rank work together to read or write data 64 bits (in total) at a time. Typically, a rank is physically composed of 8 DRAM chips, but it's also technically possible for a rank to consist of 16 DRAM chips.

In each DRAM chip are multiple banks, each of which can be carrying out different operations independent of each other. In each bank are arrays that work in unison, where we can find the memory cells. These memory cells are arranged in rows and columns, kind of like a spreadsheet or a chess board.

Modern CPUs include memory controllers built in. As you could guess by its name, the memory controller controls the flow of data, keeps the DRAM refreshed so data isn't lost, and is in charge of taking requests for data from the CPU and sending the necessary signals to the memory.

For a quick organization overview, it looks like this: CPU and memory controller ----(memory bus / channels)----> DIMMs > Ranks > DRAM chips > Banks > Arrays > Rows x Columns. If it's of any help visualizing it, here's an illustration from Memory Systems: Cache, DRAM, Disk.

When talking about main memory performance, there are two aspects that ought to be considered: Latency and bandwidth. Bandwidth in memory is how fast we can "push" or "pull" data to or from the memory. This is usually measured in gigabytes or megabytes per second (GB/s or MB/s). Latency can refer to a bunch of things in DRAM, but for the purpose of this post, I will talk about access latency — the time between when a read request is sent by the CPU and when the requested data is returned. While modern CPUs can execute instructions out-of-order, allowing them to keep executing instructions that have needed data immediately available, the access latency to main memory is long enough that useful work runs out, leading to wasted clock cycles. A longer access latency means more wasted cycles at the CPU when data is read from memory.

As system builders, we have two ways we can take advantage of the physical organization of memory to improve performance: channel and rank interleave. These two methods are independent of each other.

In channel interleave, blocks of memory 64 bytes in size are alternated between memory channels. One block goes in one channel, the next goes in the next channel, and so on for the number of memory channels used. Thanks to the principle of locality, memory accesses are thus distributed across the memory channels when using channel interleave. This provides a massive memory bandwidth improvement over when channel interleave isn't used, with the theoretical max bandwidth scaling by a factor of how many channels are used. The way of channel interleave normally matches the number of channels used, but when capacities are unequal among channels, memory is separated into sections: the largest capacity common to all channels is interleaved across all of them, and the remainder is interleaved across however many channels it spans. For example, if you have a quad-channel system and configuration with 16GB in two channels, 20GB in another, and 24GB in the last, 16GB from each of the four channels works in four-way channel interleave, 4GB from the third and fourth channels works in two-way channel interleave, and the last 4GB in the fourth channel works in 1-way channel interleave, i.e. none at all. Here's an illustration from the Memory Deep Dive series to give an idea of how this works. Intel-based systems do this (it's called Flex Mode), but I have no confirmation when it comes to AMD-based systems (if anyone can confirm/deny, please do). Taking a quick look at the manuals of recent AM4 motherboards, ASUS clearly says that the above happens, ASRock says you need the same capacity (and everything else) across channels, and MSI and Gigabyte are unclear when it comes to varying memory capacity. Odd. I heard from an AMD rep that AMD definitely does have its own asymmetric channel interleave just like Intel's Flex Mode, it just isn't given a name. But then what's the deal with the motherboard manuals? Still odd.
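
If it helps to see how the sections get split up, here's a rough Python sketch (a toy illustration of the scheme described above, not vendor code; the example capacities are just the ones from the paragraph):

```python
# Toy sketch: split per-channel capacities (in GB) into interleave sections,
# the way asymmetric ("Flex Mode"-style) channel interleave is described above.
def interleave_sections(channel_capacities):
    sections = []
    remaining = list(channel_capacities)
    while any(c > 0 for c in remaining):
        active = [c for c in remaining if c > 0]
        chunk = min(active)  # capacity common to every still-active channel
        # (ways of interleave, total GB covered by this section)
        sections.append((len(active), chunk * len(active)))
        remaining = [c - chunk if c > 0 else 0 for c in remaining]
    return sections

# The quad-channel example above: 16GB, 16GB, 20GB, and 24GB per channel.
print(interleave_sections([16, 16, 20, 24]))
# -> [(4, 64), (2, 8), (1, 4)] : 64GB in 4-way, 8GB in 2-way, 4GB with no interleave
```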

In rank interleave, blocks of memory are spread across multiple ranks, allowing for concurrent accesses to multiple ranks. Rank interleaving leads to more accesses to already open rows, in turn leading to the Row to Column Delay and Row Precharge time timings taking place much less often. This reduces the access latency on average. Rank interleaving comes in ways of powers of two (2, 4, 8, 16...), and requires at least the same number of ranks per channel used. Using a non-even number of ranks in any channel leaves you stuck with 1-way rank interleave - none at all. For example, for 2-way rank interleave, you need one dual-rank module or two single rank modules in the same channel. Similar ideas apply to higher ways of rank interleave as well. Rank interleave beyond 2-way brings minimal performance benefits.
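
For a rough picture of what "blocks spread across ranks" means, here's a toy Python model (my own simplification; the real interleave granularity and address bits are chosen by the memory controller and platform, not by this sketch):

```python
# Toy model: under 2-way rank interleave, consecutive blocks of the address space
# alternate between ranks, so nearby accesses can hit different ranks and overlap.
BLOCK_SIZE = 4096   # assumed granularity for this sketch only
RANKS = 2           # 2-way rank interleave

def rank_for_address(addr):
    return (addr // BLOCK_SIZE) % RANKS

for addr in (0x0000, 0x1000, 0x2000, 0x3000):
    print(hex(addr), "-> rank", rank_for_address(addr))
# With only one rank per channel (1-way), every access would land on the same rank.
```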

In memory specs, the two things to look at regarding performance are the transfer rate and timings. We see the transfer rate next to the generation of memory, separated by a dash (ex: DDRY-XXXX, where Y is the generation and the X's together represent the transfer rate), while the timings are one or two-digit numbers separated by dashes; in the BIOS/UEFI and in programs that show detailed info about your memory, they instead appear individually among a bunch of other timings. Ranks may also (rarely) be in specs, represented by the number of ranks followed by an R.

The transfer rate represents how many data transfers there can be per second, in megatransfers per second (MT/s). Remember, in current main memory, Double Data Rate is used, where data is transferred twice per clock cycle at the I/O bus. Half the transfer rate tells us the actual clock speed of the I/O bus, where the I/O pins work at that frequency (DDR4-2133 has an actual clock frequency of 1066 MHz, for example). Memory bandwidth scales linearly with transfer rates, and higher transfer rates / frequencies decrease access latency since the memory controller works at the memory frequency in current desktop CPUs and since they shorten the real time taken by timings measured in clock cycles (to get a timing in nanoseconds, use the equation (timing in clock cycles / transfer rate) * 2,000). Throughout this post, I will usually refer to memory frequency / transfer rate simply as "memory speed" or "DIMM speed".
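
To make those relationships concrete, here's a small Python helper (the example numbers are mine, and the bandwidth figure is the theoretical peak only, ignoring real-world efficiency):

```python
# Quick helpers for the relationships described above.
def io_clock_mhz(transfer_rate_mts):
    # DDR transfers data twice per I/O clock cycle.
    return transfer_rate_mts / 2

def timing_in_ns(timing_cycles, transfer_rate_mts):
    # The formula from the paragraph above: (cycles / transfer rate) * 2000.
    return timing_cycles / transfer_rate_mts * 2000

def peak_bandwidth_gbs(transfer_rate_mts, channels):
    # 64-bit (8-byte) bus per channel; theoretical peak only.
    return transfer_rate_mts * 8 * channels / 1000

print(io_clock_mhz(2133))           # ~1066 MHz actual I/O clock for DDR4-2133
print(timing_in_ns(15, 2133))       # CL15 at DDR4-2133 is ~14.1 ns
print(timing_in_ns(16, 3200))       # CL16 at DDR4-3200 is 10.0 ns
print(peak_bandwidth_gbs(3200, 2))  # dual-channel DDR4-3200: ~51.2 GB/s theoretical peak
```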

The timings are essentially the time it takes for different operations and actions to take place. From the perspective of the memory controller, they represent the minimum times the memory controller must wait between issuing two commands (albeit in different situations). The four timings shown in the specs when buying memory modules are the CAS latency (tCAS or tCL for short), the Row to Column Delay (tRCD for short), the Row Precharge time (tRP for short), and the Row Access Strobe latency (tRAS for short), respectively. There are over 50 timings out there, each applying to different situations. Ones not listed in specs are typically called subtimings or secondary timings (tertiary timings can also refer to ones that may or may not be modifiable in the UEFI and aren't in specs), while the ones listed in specs are often called primary timings. Side note: Since memory access in modern games is by nature more random than many other applications, tRCD and tRP will impact memory access latencies more often than in other applications.

As a reminder, when you buy a memory kit, its rated DIMM speed and timings are simply a guarantee that the DIMM(s) can achieve those specs when not held back by other hardware. The kit will also have an XMP profile that instantly tries to set the DIMMs to work at those specs when XMP (or a given motherboard's equivalent, such as A-XMP or DOCP) is enabled in the UEFI/BIOS.

By the way, if you want to know more about main memory, check out my document, Modern CPU Microarchitecture: An Overview which talks a bit about main memory, and/or my post called Main Memory: A Primer (this needs updating). In both of those, I talk about main memory purely technically. Be warned: they are much longer and even more complex than this section was.

Now that we have some necessary background (I think?), let's get on to the good stuff.

Bottlenecks and Misleading Benchmarks

In games, we have two main bottlenecks for our framerates: Our CPU, and our GPU. I've already talked about CPU/GPU bottlenecking in games before. If you haven't read that post, then I suggest doing so before moving on.

When we improve memory subsystem performance, especially access latency and possibly bandwidth (depending on the bandwidth utilization and pattern of utilization of running applications — memory bandwidth is a shared, finite resource), we improve overall CPU execution speeds. A better-performing memory subsystem means a better-performing CPU in practice. Just like with a better CPU, improving memory subsystem performance when GPU-bound does us effectively nothing in that GPU-bound moment as framerate depends on how quickly the GPU can render frames, not anything else hardware-wise. However, the potential is always there to see benefits from better memory performance when we see a bottleneck by the CPU.

Before moving on: it is occasionally suggested that we are memory-bound in games when not GPU-bound, this being the explanation for performance improvements seen when memory subsystem performance is improved. For a bottleneck to lie at main memory, it has to fall within one of two categories: bandwidth-bound or latency-bound. When we're bandwidth-bound, all available memory bandwidth is used, and the CPU's memory controller can't keep up with requests. Whenever you're bandwidth-bound, there would be no performance scaling with more cores or with the addition of SMT at all, even if the game has enough parallelism for us to otherwise see performance benefits. The frequency and severity with which we are bandwidth-bound will be driven not only by the main memory subsystem's bandwidth, but also by the bandwidth utilized and the pattern of utilization of a given game and other applications running on the system. When we're latency-bound, main memory access latency is the biggest setback. This situation mostly only occurs when code has poor locality and data is accessed in a random, non-sequential manner. Large amounts of memory bandwidth are used, but only a small part of it is for data/instructions that will be put to use, with large amounts of CPU cache being needlessly filled. A couple of examples of latency-bound applications are sparse matrix algebra and hash table lookups. In the case of modern AAA games, we know that, when we aren't GPU-bound, we certainly aren't spending the vast majority or all of the time memory-bound (assuming a given system doesn't have awful memory subsystem performance), as that would mean we would see little or no benefit from better-performing CPUs when not GPU-bound. But we may spend a decent part of non-GPU-bottlenecked time memory-bound.

Recent testing by a Reddit user, kokolordas15, in 5 games with an overclocked 6700K at 720p using Intel VTune indicates that, in modern AAA games, we spend at least 23% of the time memory-bound with DDR4-2133 in a dual-channel config, when not GPU-bound. Between DDR4-2133, DDR4-2666, and DDR4-3333, each game changed by about 4-5% in time spent memory-bound, improving as DIMM speed increased. Hitman spent the most time memory-bound at a whopping 41% with DDR4-2133 and 30% with DDR4-3333, and the Witcher 3 was close behind, spending 3-4% less of the time memory-bound. Project Cars, GTA 5, and CS:GO all spent roughly 25% of the time memory-bound with DDR4-2133. While some more testing would be nice, such time spent memory-bound isn't too surprising. After all, modern AAA games have much larger working sets than many in the past, while still having intrinsically more random memory access than many other applications since games have many variable scenarios. Not to mention the still-wide performance gap between CPUs and main memory, and the further stress usually placed on the memory subsystem by more cores and/or SMT in recent CPUs...

However, the CPU remains the most common bottleneck in-game hardware-wise when it's not the GPU. And, from the perspective of the GPU in a game, being memory bound is still kind of like being CPU bound - you're simply slowing down the CPU further. As such, I will be referring to situations when not GPU-bound (or effectively limited by software) as simply CPU-bound.

As you will later see, different games benefit to varying degrees from improving memory subsystem performance while CPU-bound. Different games will have different working set sizes and will vary in how often memory needs to be accessed, and how that is done from the perspective of the CPU. We see different methods of improving memory performance having varying scaling in framerates between games. This is related to, but not directly tied to, how much time the game tends to spend memory-bound when not GPU-bound.

As you might have noticed, in the past there have been many benchmarks pointing to minimal differences when it comes to memory performance. While it might be clear to some of you why these benchmarks shouldn't be used, I still see benchmarks such as these used in various tech forums.

Here are some examples of such misleading benchmarks:

Memory channels

http://www.gamersnexus.net/guides/1349-ram-how-dual-channel-works-vs-single-channel/Page-3

https://www.reddit.com/r/buildapc/comments/1fcs77/discussion_ram_single_vs_dual_channel_speed/

http://www.hardwaresecrets.com/does-dual-channel-memory-make-difference-in-gaming-performance/

DIMM speeds

https://www.youtube.com/watch?v=dWgzA2C61z4

http://techbuyersguru.com/gaming-ddr4-memory-2133-vs-26663200mhz-8gb-vs-16gb?page=1

https://www.youtube.com/watch?v=rwuE8IWQAu8

http://www.legitreviews.com/ddr4-memory-scaling-intel-z170-finding-the-best-ddr4-memory-kit-speed_170340/5

As for the listed benchmarks regarding memory channels,

The first two tests look at merely one and two games respectively. Already, this should not be used to represent how memory affects games as a whole. On top of this, the first one only looks at average framerates (which alone gives us only part of the story) and uses a low-mid range GPU with an overclocked i5. So not only is only one game tested, but framerates were also pretty darn likely to have been limited by the GPU anyway. The third one does test a variety of games, yet uses an i7 with a mid-range GPU at 1080p high+ settings, and only looks at average framerates (What about frametimes? Framerates in worst-case scenarios? Average framerates don't paint the whole picture). Not to mention, these benchmarks are all outdated today anyway.

As for the listed benchmarks regarding DIMM speeds,

The first one only tests two games and, as expected, uses an overclocked i7 with the graphics settings cranked up real high. These measly two benchmarks are clearly GPU-bound (a mainstream i7 only getting ~30 fps in two not-very-CPU-demanding games?), and, of course, only look at average framerates. The second one uses, you guessed it, an overclocked i7 + a 980 Ti at 1440p resolution with graphics settings cranked up high. The 980 Ti is by no means weak, but even it has its limits. It thankfully looks at minimum framerates as well, but those too can be misleading. The third one only looks at two games, and tests using a GTX 960 at high settings. Like I said before, we can't look at only a few games to represent games in general when investigating the effects of memory subsystem performance. The last one only looks at two games, using pre-loaded in-game benchmarks. It used an overclocked i7 and a 980 Ti, but thankfully only at 1080p very high settings. It also only measures average framerates, which, like I said, is misleading. While GPU bottlenecking would be less likely here than in the other benchmarks, the use of only two games and average framerates makes this a poor representation of memory performance's effects in games.

Why does being GPU-bound make a benchmark misleading when looking at the effects of main memory? Like I already explained, being GPU or CPU-bound for a given hardware configuration depends on a bunch of factors. On top of that, different people have different goals and expectations when it comes to framerates and visual quality. Because they are GPU-bound, these benchmarks can make the less-informed believe memory performance has minimal effect in games altogether, while the possibility remains that their specific game workloads could lead them to experience CPU-bound situations. It can also be argued that being CPU-bound makes benchmarks misleading too, as that doesn't necessarily mean users will be CPU-bound either. But being informed on CPU/GPU bottlenecking in gaming (again, see my post about it) and how memory affects the CPU counters this. Knowing how these things work, you won't be misled.

For this post, we're going to look at some benchmarks that each look at more than 4 games. Ideally, all benchmarks would use detailed frametime and framerate measurements such as those in Tom's Hardware and TechReport CPU and GPU reviews. However, this is not done by the vast majority of people doing game benchmarks. Many of the benchmarks used to make my main points look at not just average framerates, but the 1% and 0.1% lows as well. I tried to find as many as I could that used 1% and 0.1% lows or had frametime measuring, but in the case of timings and ranks, where the amount of testing is limited, I had little choice but to settle for average and/or average and minimum framerates. When I am talking about the results, games will be in bold to make it easier to separate them while reading. When calculating percentages, I round to the nearest whole number except when +/-1 from .5, and I am calculating by dividing the upgrade's result by the pre-upgrade's result (i.e. when comparing results with DDR4-2666 and DDR4-2133, I divided the DDR4-2666 result by the DDR4-2133 result).
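
For clarity, this is all the percentage math amounts to (the framerate numbers here are made up, just to show the calculation):

```python
# How the percentage differences in this post are calculated: the upgraded
# config's result divided by the pre-upgrade config's result.
def percent_gain(upgraded_fps, baseline_fps):
    return (upgraded_fps / baseline_fps - 1) * 100

# e.g. a made-up 96 fps average with DDR4-2666 vs 88 fps with DDR4-2133:
print(round(percent_gain(96, 88)))  # -> 9, i.e. a ~9% gain
```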

Main Memory and Video Memory Capacity and Game Performance

Because the nature of this section is different, I won't be listing the benchmarks all here as I will be using them for individual points.

Note: To my knowledge, there is no recent and in-depth testing with main memory or VRAM capacity in games. So, I worked with what I know and have for this section.

When you run out of main memory capacity, you automatically start using the page file (which lives on one of your storage drives), as long as you have it enabled. The page file can be thought of as something that "catches" data from an overflowing system memory. Having it disabled when you run out of memory will lead to application crashes. Windows puts pages of memory into the page file that are considered "unused". Take a game, for example, with browser tabs open in the background. If the game pushes you over the physical capacity of your memory, and you aren't currently using your browser for anything (such as listening to music or watching a walkthrough video), Windows will start moving memory the browser is using over to the page file. As long as nothing from the game is being moved over to the page file, you likely won't experience a performance impact in-game from using too much of your main memory. As such, it's hard to make statements about memory capacity needs for gamers, as everyone's memory consumption and use of it beyond games will vary from user to user and from time to time.

Side note: Remember that memory usage tends to increase alongside installed capacity, likely due to the OS's caching algorithms. Keep this in mind when asking others about their capacity utilization.

Semi-recent testing from Salazar Studio pointed towards 4GB of system memory occasionally being problematic (using a GTX 1070 and without intentionally pushing VRAM capacity, so lack of VRAM is an unlikely issue) or hurting performance in some recent games like GTA 5, but The Witcher 3, Cities Skylines, and Total War Warhammer did not run much worse with 4GB. It should be noted that not many details were provided involving what else was running in the background. Slightly older testing from OzTalksHW pointed towards 4GB leading to the same or slightly worse performance in games than 8GB, with stuttering or crashes when he was recording (so possibly, he was using way too much). Again, note that we are not given much detail as to what else was running alongside the games tested, and he used average, minimum, and maximum framerates like Salazar, and not average and 1% and 0.1% lowest framerates.

And, of course, 2GB of RAM is a thing of the past now.

One contributor to both your main memory consumption in a game and performance is graphics card video memory capacity. VRAM holds the data needed for rendering what you see on screen. When you run out of video memory, anything more "spills over" to your main memory, taking up further space, and hurting performance when the GPU needs to access it, as main memory has far lower bandwidth and much higher latency for the GPU to access. A higher VRAM capacity theoretically allows for higher visual details and quality (resolution, AA, texture quality, etc.) without impacting performance more negatively than would otherwise be experienced. The VRAM capacity that gamers often need is still sometimes debated, and the general consensus at different resolutions and with different-tier GPUs is always changing. This issue is a tricky one, as VRAM consumption is affected by graphics settings, resolution, the game, and the area of the game.

3DNews.ru's testing indicates that, at 1080p, 3GB and occasionally 4GB of VRAM can present an issue with stutter (0.1% lows) at very high / max settings in recent AAA games when you have 8GB of main memory. In their testing, using 16GB of system memory instead of 8GB reduced or even eliminated the impact. Semi-recent testing from computerbase also shows the GTX 1060 3GB's VRAM being problematic for frametimes at 1080p when cranking up texture quality in several AAA games, but not the RX 480's 4GB of VRAM or more. Keep in mind that, in practice, you can simply turn down graphics settings if you are running into VRAM capacity issues.

When you do happen to run out of memory from a single game and data from the game is being moved over to the page file, using an SSD may help reduce performance impacts. Remember that on top of a normally far greater data throughput in many workloads, SSDs have far lower (normally <1ms) access times than HDDs (12+ ms typical for 7,200 RPM). In 3Dnews.ru's testing, using an SSD instead of an HDD often led to much better 0.1% and 1% lowest framerates when running out of main memory ingame (aside from ME Andromeda which was the other way around in 0.1% lows, but this was likely a fluke). However, do remember that in many tests showing this difference, the graphics card was out of video memory, so this could be playing a role in performance effects using an SSD vs. HDD. Regardless, it can be said for sure that when running out of system memory is impacting your game performance, an SSD is the safest bet for performance.

Memory Channel Configurations and Game Performance

https://www.reddit.com/r/buildapc/comments/5agh8f/skylake_cpu_and_ram_gaming_impact_benchmarked/

https://youtu.be/u827fdCOCao?t=362

https://www.youtube.com/watch?v=sQz4XmEFbTU

In the first benchmarks, we see the effects of a dual-channel (with equal capacity in both for full 2-way channel interleave) vs single-channel configuration in 7 semi-recent AAA games using several emulated Skylake processors. On average, the i3 gained between 10% and 15% in average framerates and in both 1% and 0.1% lows. The i5-6500 gained between 15 and 20% in each of the three performance metrics. The i7-6700 gained between 20 and about 25% in framerates. Interestingly, the 4.5GHz i5 benefited the most overall from 2-way channel interleave, gaining only about 20% in average framerates, yet 30% in 1% lows and about 33% in .1% lows. The 4.5GHz i5 and i7 paired with 3000 MT/s memory instead of 2133 MT/s gained between 10 and 15% in average framerates, but between 15% and 20% in 1% and .1% lows. As those benefits weren't as large as the 4.5GHz i5's with 2133 MT/s memory, I think it's likely this indicates GPU bottlenecking limiting average benefits. While the single-channel tests used half the memory capacity, it is not likely this affected the results, as they wouldn't be running out of VRAM with the 1070, and they probably kept a minimal amount of background applications running. These benchmarks show a general trend that stronger CPUs (whether by higher clock frequency, more physical cores, or more logical cores thanks to SMT) see greater benefits from the use of 2-way channel interleave over 1-way.

With the i3, we see little benefit with a dual-channel configuration in ARMA 3 and RoTR. The i5-6500 also sees minimal benefit in RoTR. In general, the games least affected by channel interleave are ARMA 3 (the i5-6500 and i7-6700 see about 10% improvement in framerates, with the overclocked i5 gaining between 15 and 20% in framerates across the board), Rise of the Tomb Raider (for the i5-6500 and i3, but the overclocked i5 and the i7-6700 see great gains in 1% and .1% lows of between 20-30% and over 40%, respectively. Just a reminder, such high improvements in worst-case scenarios for framerates wouldn't be represented if only average framerates were looked at), and Total War: Attila (at most, 16% higher framerates were seen with dual channel, although each processor with 2133 MT/s memory saw at least 8% improvement in at least one framerate metric). The games that benefited more from the use of dual channel over single channel were Hitman and Project Cars. In Hitman, each CPU (i3-6100, i5-6500, i7-6700, 4.5GHz 6600K) saw between 25-40% higher framerates (average, 1%, .1% lows) across the board (the 4.5GHz i5 and the i7-6700 saw nearly 60% improvement in 1% lows!). In Project Cars, the i3 saw a nearly 20% benefit in average and 1% lowest framerates, the i5-6500 saw a nearly 25% improvement in each metric, and the 4.5GHz i5 and i7-6700 saw benefits reaching over 30%, with only the i7-6700's .1% lows coming in at nearly 25%.

While those tests were done with Skylake, Kaby Lake's memory subsystem is identical to Skylake's. Aside from Kaby Lake CPUs having different core clock speeds, those results should apply all the same.

Now, let's move on to the second benchmarks.

These benchmarks test 6 games - Overwatch, GTA 5, The Witcher 3, ARMA 3, Fallout 4, and Battlefield 1 using a R7 1700 running at 3.75GHz. While he only used an RX 470, he tested at both 720p low settings and 1080p high settings to show both a CPU-bound scenario and a typically GPU-bound scenario. Overwatch saw between 20% - 25% framerate gains across the board when GPU-bound. With DDR4-2133, GTA V saw a roughly 16% improvement in framerates across the board, while with DDR4-3200 it saw between 20% and 25% (1%, 0.1% lows) gains while GPU-bound. With DDR4-2133, the Witcher 3 saw 27% - 31% gains in framerates, and with DDR4-3200 it saw between 22% and 25% gains in framerate when CPU-bound. ARMA 3 experienced the lowest impact from a full dual-channel configuration over single-channel with Ryzen when CPU-bound, only between a few % and 7%. Memory speed impacts this game much more using AMD Ryzen. Fallout 4 saw about 13% gain in framerates with dual channel over single channel when CPU-bound. Battlefield 1, when CPU-bound, saw 25% (average) - 30% (1%, 0.1% lows) gains with dual-channel and DDR4-2133, and 13% (average), 30% (1% lows), and 25% (0.1% lows) with dual-channel over single-channel and DDR4-3200. Overall, one can typically expect around either 13%, 16%, 25%, or 30% gains in framerates when CPU-bound with a dual-channel over single-channel configuration using AMD Ryzen, depending on the game. The Witcher 3, Overwatch, and Battlefield 1 benefited the most from dual-channel, while ARMA 3 saw little and Fallout 4 and GTA V saw some benefit. In these benchmarks, a dual-channel, DDR4-2133 configuration typically allowed for performance either very similar to or even slightly outmatching a single-channel, DDR4-3200 configuration (aside from ARMA 3, of course). Other benchmarks using a Ryzen 5 1600 have shown this same performance relationship between dual-channel DDR4-2133 and single-channel DDR4-3200 configurations, as well.

Now on to the third benchmarks. These look at the difference between a dual- and single-channel configuration with a Core i7-6700K in 5 games using DDR4-2133 and a 980 Ti. However, the graphics settings were not stated (or if they were, I wouldn't know because I don't know Polish, sorry). Crysis 3 and GTA V saw little difference. Far Cry 4 saw a ~16% improvement with dual-channel. Rise of the Tomb Raider saw a large 26% (average) - 38% (minimum) improvement using dual-channel. The Witcher 3 saw a 20% increase in average framerates and a 16% increase in minimum framerates using a dual-channel configuration over single-channel. Keep in mind the possibility that they were GPU-bound in Crysis and GTA, and the possibility of the GPU limiting gains elsewhere, considering the results of the first benchmarks.

Side note: Remember that running with asymmetrical channel interleave (or sections of memory with different ways of channel interleave when you have an asymmetrical capacity between channels) can play a role in gaming performance when CPU-bound. This video gives us an example (see the 12GB and 24GB configurations' results) of it affecting performance, especially in Rise of the Tomb Raider. As long as data from the game is being placed where there is a lower level of channel interleave, the potential for performance impact, very small or otherwise, is there.

DIMM Speeds and Game Performance

Note: While I made a point of using benchmarks that look at 1% and 0.1% lows instead of just average framerates with or without minimum framerates, the first two don't use 1% and 0.1% lows. While I would have preferred to avoid these, there isn't much half-decent, recent testing with different DIMM speeds and Intel processors.

http://www.eurogamer.net/articles/digitalfoundry-2017-intel-core-i5-7600k-review

http://www.techspot.com/article/1171-ddr4-4000-mhz-performance/page3.html

https://www.youtube.com/watch?v=XOsYOASddeo

https://www.youtube.com/watch?v=tBNT8kvPsYg

https://www.youtube.com/watch?v=RZS2XHcQdqA

Finally, we can move on to DIMM speeds and their effects on performance. Let's, of course, start with the first benchmarks.

These benchmarks (scroll down a bit) show the differences between DDR4-3000, DDR4-2400, and DDR4-2133 with a Core i5-7600K, both stock and overclocked to 4.8GHz, in 7 semi-recent games. In Assassin's Creed Unity, there was a 4.6% increase in framerate between DDR4-2133 and DDR4-2400, and a 7% increase from DDR4-2400 to DDR4-3000. With the CPU overclocked, there was a 7% increase going to DDR4-2400 from DDR4-2133, and a 5% increase going to DDR4-3000 from DDR4-2400. Ashes of the Singularity saw a 4.5% increase going to DDR4-2400 from DDR4-2133, and a 7% increase going to DDR4-3000 from DDR4-2400. With the CPU overclocked, these gains were reduced to 3% and nearly 6% respectively. Crysis 3 demonstrated minimal difference between different DIMM speeds, with the largest difference being nearly 3%. The Division also showed little difference. In Far Cry Primal, there was a 6% gain in framerate going to DDR4-2400 from DDR4-2133, and a 12.5% gain going to DDR4-3000 from DDR4-2400. With the CPU overclocked to 4.8GHz, there was minimal difference between DDR4-2133 and DDR4-2400, but there was an 11% gain going to DDR4-3000 from DDR4-2400. In Rise of the Tomb Raider, there was a 6% framerate increase with DDR4-2400 over DDR4-2133, and a nearly 5% gain with DDR4-3000 over DDR4-2400. With the CPU overclocked, the differences were about the same. In The Witcher 3, there was a 7.4% gain going to DDR4-2400 from DDR4-2133, and a 6% increase going to DDR4-3000 from DDR4-2400. Oddly enough, with the CPU overclocked, there was a 16% gain going to DDR4-3000 from DDR4-2400, but a 4% gain going to DDR4-2400 from DDR4-2133. I'm inclined to believe this 16% difference is a fluke, considering it wasn't demonstrated with the CPU not overclocked. Looking at the frametimes in the video, there doesn't seem to be any real difference in sudden frametime increases between DIMM speeds.

In these benchmarks, 11%-13% gains in framerate were common with an upgrade to DDR4-3000 from DDR4-2133. Far Cry Primal showed the largest gain of nearly 19%, while The Division and Crysis 3 showed minimal differences between DIMM speeds.

Now on to the second benchmarks.

These benchmarks test the difference between many DIMM speeds between DDR4-2133 and DDR4-4000 using an i7-6700K and two 980 Tis in 6 semi-recent games. Keep in mind that timing differences were not stated. Do note that Steve tested at 1440p with "Ultra" settings or higher (AA). ARMA 3 saw a 10% difference between DDR4-2133 and DDR4-2400, a 4-5% difference between DDR4-2400 and DDR4-3000 and DDR4-3000 and DDR4-3600, and a 3.5-5% difference between DDR4-3600 and DDR4-4000. Black Ops 3 saw a 4.4% (average) and 5% difference between DDR4-2133 and DDR4-2400, a 3% (average) and 2.5% (minimum) difference between DDR4-2400 and DDR4-3000, a 4% (average) and 1.6% (minimum) difference between DDR4-3000 and DDR4-3600, and a ~2.5% difference between DDR4-3600 and DDR4-4000. Civilization Beyond Earth saw a 5.4% (average) and 2.4% (minimum) difference between DDR4-2133 and DDR4-2400, a tiny 3% (average) and 3.6% (minimum) between DDR4-2400 and DDR4-3000, a tiny 3% (average) and 2% (minimum) difference between DDR4-3000 and DDR4-3600, and minimal difference between DDR4-3600 and DDR4-4000. Fallout 4, however, saw much higher differences. It saw a 10% (average) and 21% (min) difference between DDR4-2133 and DDR4-2400, a 7.4% (average) and 6% (min) difference between DDR4-2400 and DDR4-3000, a 12% (average) and 15% (minimum) difference between DDR4-3000 and DDR4-3600, and a tiny 3% difference between DDR4-3600 and DDR4-4000. The Division saw very little difference between different DIMM speeds, only about a few % or so in minimum framerates each step up. The Witcher 3 saw an 8.4% (average) and 7% (min) framerate difference between DDR4-2133 and DDR4-2400, a small 5.5% (average) and 2.5% (min) difference between DDR4-2400 and DDR4-3000, a 4.2% difference in average framerates between DDR4-3000 and DDR4-3600, and a tiny 3% difference in average framerates between DDR4-3600 and DDR4-4000.

In these second benchmarks, the largest differences appear to be between DDR4-2133 and DDR4-2400. A ~15-18% difference between DDR4-2133 and DDR4-3000 is seen in 3 of the tested games. The benchmarks consistently showed next to no difference beyond DDR4-3600, which was not unexpected.

Now for the third benchmarks, which are more recent and include Ryzen.

These benchmarks look at the Core i7-7700K and the Ryzen 7 1800X with DDR4-2133, DDR4-2666, DDR4-2933, and DDR4-3200 in 9 games. In Mass Effect Andromeda, little difference was seen due to GPU bottlenecking.

With the Core i7-7700K,

Deus Ex saw a 5% (avg), 13% (1% lows), and 5% (0.1% lows) difference between DDR4-2133 and DDR4-2666, very small (largest 3%, 0.1% lows) differences between DDR4-2666 and DDR4-2933, and again very small 3% differences between DDR4-2933 and DDR4-3200. GTA 5 saw an 8% (avg) and 13% (1%, 0.1% lows) difference between DDR4-2133 and DDR4-2666, a 5.4% (avg), 6% (1% lows), and 5% (0.1% lows) difference between DDR4-2666 and DDR4-2933, and a ~3.3% (avg, 1% lows) difference between DDR4-2933 and DDR4-3200. Battlefield 1 saw an 8% (avg), 10% (1% lows), and 14% (0.1% lows) difference between DDR4-2133 and DDR4-2666, a 6% (1% lows) and 1%-2% (avg, 0.1% lows) difference between DDR4-2666 and DDR4-2933, and a 3% (1%, 0.1% lows) difference between DDR4-2933 and DDR4-3200. Hitman shows a 7.5% (avg), 5% (1% lows), and 4% (0.1% lows) difference between DDR4-2133 and DDR4-2666, a 7% (avg), 6% (1% lows), and 12.5% (0.1% lows) difference between DDR4-2666 and DDR4-2933, and a 6.5% (avg), 6% (1% lows), and 5.6% (0.1% lows) difference between DDR4-2933 and DDR4-3200. Mafia 3 demonstrates a 9% (avg) difference between DDR4-2133 and DDR4-2666, a 7.4% (avg), 6% (1% lows), and 7% (0.1% lows) difference between DDR4-2666 and DDR4-2933, and a 5% (avg), 7% (1% lows), and 8% (0.1% lows) difference between DDR4-2933 and DDR4-3200. Watch Dogs 2 saw a 20% (avg), 23.4% (1% lows), and 24% (0.1% lows) difference between DDR4-2133 and DDR4-2666, a 7.4% (avg), 3.4% (1% lows), and 4% (0.1% lows) difference between DDR4-2666 and DDR4-2933, and a 7% (avg) and 13% (1% lows) difference between DDR4-2933 and DDR4-3200. Overwatch saw a 16.6% (avg), 11.6% (1% lows), and 7% (0.1% lows) difference between DDR4-2133 and DDR4-2666, a 4% (avg) and <2% (1%, 0.1% lows) difference between DDR4-2666 and DDR4-2933, and a 3.5% (avg) difference between DDR4-2933 and DDR4-3200. Ashes of the Singularity saw a 6% (avg), 9% (1% lows), and 8% difference between DDR4-2133 and DDR4-2666, a 3% (avg), 4.6% (1% lows), and 5% (0.1% lows) difference between DDR4-2666 and DDR4-2933, and very little difference between DDR4-2933 and DDR4-3200.

In these results, we again often see the largest gap between DDR4-2133 and DDR4-2666, even when comparing DDR4-3200 and DDR4-2666. Oftentimes, the largest differences were seen in 1% and/or 0.1% lows, but occasionally these showed little difference while averages saw actual change. Moving up to DDR4-3200 over DDR4-2133 yielded framerate increases around 10%, 20%, 23%, and 25%, except for Mass Effect. Watch Dogs 2 showed an even larger increase of 36%-44%.

With the Ryzen 7 1800X,

Deus Ex: Mankind Divided saw an 8% (avg), 13% (1% lows), and 5% difference between DDR4-2133 and DDR4-2666, a small difference (3%) between DDR4-2666 and DDR4-2933, and another small difference (4.4%-3%) between DDR4-2933 and DDR4-3200. GTA 5 saw an 11% (avg) and 13% (1%, 0.1% lows) difference between DDR4-2133 and DDR4-2666, a ~5.5% difference between DDR4-2666 and DDR4-2933, and a ~3.3% difference (avg, 1% lows) between DDR4-2933 and DDR4-3200. Battlefield 1 demonstrated a 9%-10% difference (avg, 1% lows) between DDR4-2133 and DDR4-2666, a 3.4% (avg), 8% (1% lows), and 13% (0.1% lows) difference between DDR4-2666 and DDR4-2933, and a 4.6% (avg), 6.4% (1% lows), and 8% (0.1% lows) difference between DDR4-2933 and DDR4-3200. Hitman showed a 13% (avg), 7.6% (1% lows), and 4% (0.1% lows) difference between DDR4-2133 and DDR4-2666. The differences between the rest were effectively nothing. Mafia 3 showed a 17% (avg), 10% (1% lows), and 4% (0.1% lows) difference between DDR4-2133 and DDR4-2666, a 10% (avg) and 5.6% (0.1% lows) difference between DDR4-2666 and DDR4-2933, and a 16% (avg), 10.6% (1% lows), and 18% (0.1% lows) difference between DDR4-2933 and DDR4-3200. Watch Dogs 2 saw a 14% (avg), 8% (1% lows), and 10.5% (0.1% lows) difference between DDR4-2133 and DDR4-2666, a 7.5% (avg), 11% (1% lows), and 16.6% (0.1% lows) difference between DDR4-2666 and DDR4-2933, and a 6% (avg), 6.5% (1% lows), and 12% (0.1% lows) difference between DDR4-2933 and DDR4-3200. Overwatch saw a 13.4% (avg), 6.6% (1% lows), and 6% (0.1% lows) difference between DDR4-2133 and DDR4-2666, a 4.5% (avg), 4% (1% lows), and 3% (0.1% lows) difference between DDR4-2666 and DDR4-2933, and small differences of 3% or lower between DDR4-2933 and DDR4-3200. Ashes of the Singularity saw a 10% (avg), 13% (1% lows), and 19% (0.1% lows) difference between DDR4-2133 and DDR4-2666, a 9% (avg), 7% (1% lows), and 4% (0.1% lows) difference between DDR4-2666 and DDR4-2933, and a 5% (avg, 0.1% lows) and 4% (1% lows) difference between DDR4-2933 and DDR4-3200.

In these results, the gap between DDR4-2133 and DDR4-2666 appears to be the largest most often, normally about equal to or larger than the gap between DDR4-2666 and DDR4-3200. The framerate difference between DDR4-2133 and DDR4-3200 often lands around 13%, 18% - 21%, or 25%, except for Mass Effect, for which we can't say due to the GPU bottleneck. Mafia 3 showed an even larger increase of 29.4% - 38%, and Watch Dogs 2 also showed a larger increase of 27.4% (1% lows), 45% (0.1% lows), and 30% (avg).

Now for the fourth benchmarks. These compare DDR4-3200 and DDR4-2133 with a Ryzen 7 1700 at 3.75GHz in 16 games.

Thankfully, the uploader was nice enough to put the graphs in both raw data (framerate) and percentage form on imgur. Here are the framerate graphs, and here are the percentage graphs. As such, I won't be going over the results here since it would be easy for you to do so. In these benchmarks, the framerate improvement brought by DDR4-3200 only once went over ~20%, that being a whopping 43% improvement in 0.1% lows, something he found twice.

Finally, the last benchmarks. These look at the Ryzen 7 1700X at nearly 4 GHz in 7 games with DDR4-2133, DDR4-2666, DDR4-3200, and DDR4-3600 while paired with a GTX 1070 (note that this might affect some results). Timings are kept largely the same until DDR4-3600, which has 2-3 clock cycle higher timings.

Battlefield 1 experienced only very small changes in average framerate. It saw a 9% (1% lows) and 14% (.1% lows) difference between DDR4-2133 and DDR4-2666, a <3% difference between DDR4-2666 and DDR4-3200, and again a <3% increase going to DDR4-3600 from DDR4-3200. GTA 5 experienced an 11.4% (average), 10.5% (1% lows), and an 8.6% (0.1% lows) difference going to DDR4-2666 from DDR4-2133, a 6.5% (average), 11.5% (1% lows), and 13% (0.1% lows) difference between DDR4-2666 and DDR4-3200, and a minimal difference in 1% and 0.1% lows but a 4% difference in average framerates between DDR4-3200 and DDR4-3600. Mass Effect Andromeda saw a 12% increase in 0.1% lows going to DDR4-2666 from DDR4-2133, a 5.5% (1% lows) and 13% (0.1% lows) difference between DDR4-2666 and DDR4-3200, and a 16% (1% lows) and 10.4% (0.1% lows) difference between DDR4-3200 and DDR4-3600. Rise of the Tomb Raider saw a 6.4% (avg), 14% (1% lows), and 3.5% (0.1% lows) difference between DDR4-2133 and DDR4-2666, a 4.8% (avg), 18% (1% lows), and a 36.5% (0.1% lows) difference between DDR4-2666 and DDR4-3200, and a minimal increase going to DDR4-3600 from DDR4-3200. Mafia 3 demonstrated an 8% (avg), 9% (1% lows), and a 12% (0.1% lows) difference between DDR4-2133 and DDR4-2666, and minimal increases in framerate between DDR4-2666 and DDR4-3200 and between DDR4-3200 and DDR4-3600, except for the 8% increase in 1% lows between DDR4-3200 and DDR4-3600. Watch Dogs 2 saw a 5% (avg), 9% (1% lows), and 32% (0.1% lows, likely a fluke) difference between DDR4-2133 and DDR4-2666, a minimal increase in average framerates and a 4.6% (1% lows) and 6% (0.1% lows) difference between DDR4-2666 and DDR4-3200, and a 3-4% increase in average framerates and 0.1% lows between DDR4-3200 and DDR4-3600. Crysis 3 saw a 5% increase in average framerates and a <3% increase in 1% and 0.1% lows going to DDR4-2666 from DDR4-2133, a 5.4% (1% lows) and 5% (0.1% lows) difference between DDR4-2666 and DDR4-3200, and a regression in 1% and 0.1% lows (likely also a fluke) and no real increase in average framerates going to DDR4-3600 from DDR4-3200.

These benchmarks in general somewhat reflect the others with AMD Ryzen, although with a few odd results. In general, it shows the same decent increases going above DDR4-2133 seen in the other benchmarks, but mostly only in 1% and 0.1% lows (1% lows especially, as 0.1% lows at DDR4-2133 were oddly relatively lower than in the other benchmarks).

Timings and Game Performance

https://3dnews.ru/950757/page-2.html

https://www.pcper.com/reviews/Processors/Ryzen-Memory-Latencys-Impact-Weak-1080p-Gaming/Timings-timings-timings

While timings are often considered to be important when it comes to memory and gaming, there isn't actually much testing for it that isn't outdated or heavily outdated.

The first benchmarks use a Ryzen 7 1800X and a variety of DIMM speeds (DDR4-2133, DDR4-2400, DDR4-2666, DDR4-2933, DDR4-3200, DDR4-3466) with a variety of primary timing configurations for each speed except DDR4-3466, in 1-2 clock cycle increments, across 4 games. Yes, I know I said I would only use benchmarks with more than 4 games, but this section needed something more than what the second benchmarks provided. Testing with more than 4 games and with 1% and 0.1% lows would give us a better idea when it comes to performance. However, it is unclear if any timings beyond primary timings were modified. If I were to describe how each different timing/DIMM speed configuration compared at the same level of detail as everything else, this section would be unnecessarily long for something that isn't as important an issue as something like DIMM speeds. To summarize, we can say that a difference of +/- 3 or 4 clock cycles in timings (+/- 3-3-3-3 or 4-4-4-4) at the same DIMM speed has about the same effect as a +/- ~266 MT/s difference in DIMM speed with AMD Ryzen, according to 3DNews.ru. This goes not only for their gaming tests, but also their "application" tests.

In the second benchmarks (second graph on the page), a dual-rank DDR4-2666 DIMM is compared with a single-rank DDR4-2666 DIMM using a Ryzen 5 1600X in 8 games. Sadly, however, the games they tested are in general a bit dated. Anyway, using dual-rank over single-rank, Anno 2205 saw a 3% gain in both min and average framerates, AC Syndicate saw a 10% gain in both min and average framerates, Crysis 3 saw a tiny 1.5% gain in both min and average framerates, Dragon Age Inquisition saw a small 3% and 5% gain in min and average framerates respectively, F1 2015 saw an 11-12% gain in min and average framerates, Far Cry 4 saw a small 4.5% and 2% gain in min and average framerates respectively, StarCraft II LoTV saw a small 6% and 3% gain in min and average framerates respectively, and The Witcher 3 saw an 8% and 6.6% gain in minimum and average framerates respectively.

With no reliable, relevant evidence to strengthen or conflict with what is seen above, and with the first benchmarks above only looking at 4 games while the other looks at 8, it is hard to come to any conclusions here. Before making any statements about how timings play a role when CPU-bound in current systems and games, there should be more tests done on both Ryzen and Intel-based systems.

Ranks and Game Performance

https://www.golem.de/news/ram-overclocking-getestet-ryzen-profitiert-von-ddr4-3200-und-dual-rank-1704-127262.html

http://www.pcgameshardware.de/Ryzen-5-1600X-CPU-265842/Tests/R5-1500X-Review-Mainstream-1225280/3/

Currently, the only recent testing looking at ranks and game performance is with Ryzen processors. That said, let's move on to the first benchmarks.

The first benchmarks sadly only look at average framerates, but they look at 8 games and compare DDR4-2666 single-rank and dual-rank (same timings) along with DDR4-3200 (single-rank) with slightly lower timings. All games were tested at ~720p, although with an unstated GPU; it's likely they used a recent GPU, in which case it's doubtful they were ever GPU-bound. With a dual-rank DIMM (so 2-way rank interleave), Ashes of the Singularity and Watch Dogs 2 saw a 10% and 11% improvement in average framerates respectively, while Deus Ex Mankind Divided and F1 2016 saw a 7-8% gain in average framerates. Fallout 4 and GTA 5 saw a ~6.5% gain in average framerates with a dual-rank DIMM, and Dishonored 2 saw only a 5% gain in average framerates. Interestingly, Ashes of the Singularity and Watch Dogs 2 experienced nearly the same gain in framerates over the single-rank DDR4-2666 configuration as with DDR4-3200 and slightly lower timings, and F1 2016 saw nearly the same gain between both (only about 1% absolute in favor of DDR4-3200 over DDR4-2666 single-rank). Aside from those few games, DDR4-2666 dual-rank saw about half, or slightly more than half, of the improvement over DDR4-2666 single-rank that DDR4-3200 with lower timings did.

In the second benchmarks (second graph on the page), a dual-rank DDR4-2666 DIMM is compared with a single-rank DDR4-2666 DIMM using a Ryzen 5 1600X in 8 games. Sadly, however, the games they tested are in general a bit dated. Anyway, using dual-rank over single-rank, Anno 2205 saw a 3% gain in both min and average framerates, AC Syndicate saw a 10% gain in both min and average framerates, Crysis 3 saw a tiny 1.5% gain in both min and average framerates, Dragon Age Inquisition saw a small 3% and 5% gain in min and average framerates respectively, F1 2015 saw an 11-12% gain in min and average framerates, Far Cry 4 saw a small 4.5% and 2% gain in min and average framerates respectively, StarCraft II LoTV saw a small 6% and 3% gain in min and average framerates respectively, and The Witcher 3 saw an 8% and 6.6% gain in minimum and average framerates respectively.

Also on the page of the second benchmarks, DDR4-2666 single-rank vs DDR4-2400 dual-rank is compared. The lower-speed dual-rank won in their gaming tests (aside from Far Cry 4, where it lost by a few percent), performing usually about the same (Anno, Crysis, Dragon Age) or slightly better (3-6%, F1 2015, The Witcher 3, AC Syndicate).

Taking from both of these benchmarks, we can say that Ashes of the Singularity, Watch Dogs 2, AC Syndicate, and F1 2015 (2016 slightly less so) benefit the most from 2-way rank interleave vs. no (1-way) rank interleave, gaining about 11% in framerates from it alone. Most games seem to get about 5% - 8% in framerates, while a few like Anno 2205 and Crysis 3 see almost no benefit in framerates from rank interleave. We can also say that it's better to have a slightly lower (-266 MT/s) DIMM speed with rank interleave than to get a slightly higher DIMM speed but with no rank interleave when using AMD Ryzen.

Updates

7/27/2018 update:

In recent times, there has been a glaring lack of gaming benchmarks investigating the role of any of the above specifications / channel configurations that I feel are reliable enough to fit in this thread. Ever since the release of Intel's mainstream 8th gen CPUs, I've checked online almost every month for new benchmarks looking at memory speed, channel configurations, timings, and ranks. Only one, which looked at memory speeds with a Ryzen 1800X, mostly fit my requirements for this post. However, some others seem to do the job, but merely don't have comparative, summarizing data. These are those benchmarks:

https://www.youtube.com/watch?v=qBmElSVy4U8 (Dual vs. single channel with i7-7700K)

https://www.youtube.com/watch?v=80s5bIMQU-I (Dual vs. single channel with i5-8600K)

https://www.youtube.com/watch?v=yFAvef2kDQc (DDR4-2133 CL12 vs. DDR4-3000 CL16 with i5-8600K)

https://www.youtube.com/watch?v=uMgF1TWhhs8 (DDR4-2666 vs. DDR4-3200 with Ryzen 1700X)

There are also these benchmarks. I am mostly satisfied with them, but they compare 7 different configurations across 21 games. I'm way too lazy to calculate percentage differences and compare all of that, so I'm just putting them here.

https://www.youtube.com/watch?v=HsQAtsKX3mE (6 speeds + timing-optimized speed with Ryzen 1800X, note a GPU bottleneck is hit in some games)

11/2/2018 update:

Here are a couple more benchmarks:

https://www.youtube.com/watch?v=Vl5DQVmXbR0 (compares 2 x 8 GB and 4 x 4 GB at 3200 MT/s on a Ryzen 5 2600x. The effect might be from rank interleaving, but the only timing specified is CAS latency, and no subtimings, so be wary)

https://www.youtube.com/watch?v=PWnQx09MSao (a more up-to-date comparison of 4 GB and 8 GB, keep note of the 0.1% and 1% lows)

Also, here are a couple of benchmarks for Ryzen APUs, since I missed that last update:

https://www.gamersnexus.net/guides/3244-amd-r5-2400g-memory-kit-benchmarks-and-single-vs-dual-channel

https://www.anandtech.com/show/12621/memory-scaling-zen-vega-apu-2200g-2400g-ryzen/3

Recap, Reality Check, and Concluding Thoughts

I hope that this post was useful. Here's a quick recap:

It is best to have at least 8 GB of system memory. "Multitasking" may or may not be a problem when it comes to performance with only 8 GB, depending on what else is going on besides the game. Having a web browser with a lot of tabs doing nothing active while being pushed over 8 GB of memory usage might not cause a performance impact, as the memory used by the browser would simply be moved to the page file. However, if a game alone takes up more memory than you have, a performance impact could be expected due to the much slower storage drive having to be accessed. Not having enough VRAM will also hurt performance as the graphics card uses system memory, which not only slows things down at the GPU's end, leading to stutter and worse framerates, but also increases main memory consumption. It is a good idea to have at least 4GB of VRAM at 1080p "very high"+ settings to avoid running out, if playing current AAA games. As resolution increases, look towards making sure you get more as long as you're playing AAA games. If or when you do run out of main memory, it is a good idea to make sure you have an SSD and have the page file on it, rather than an HDD, as that might reduce the performance impacts considerably.

When it comes to memory subsystem performance, there is only an effect on gaming performance when CPU-bound. Whenever you are GPU-bound, there isn't an impact on performance no matter what you do with your memory (aside from changing capacity, of course). This is a crucial detail to remember when considering memory in a gaming machine. As such, the following information in this recap only applies to CPU-bound scenarios.
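To make that concrete, here is a minimal toy sketch in Python (made-up numbers and a hypothetical effective_fps helper of my own, not data from any benchmark above): the framerate you actually see is roughly capped by whichever side, CPU or GPU, is slower for a given scene, so memory tuning that raises the CPU-side ceiling only shows up once the GPU is no longer the cap.

    # Toy model of CPU-bound vs. GPU-bound framerates (illustrative numbers only).
    # The displayed framerate is roughly limited by the slower of the two sides.

    def effective_fps(cpu_fps, gpu_fps):
        """Framerate allowed by the slower component."""
        return min(cpu_fps, gpu_fps)

    # GPU-bound scene: faster memory raises the CPU-side ceiling, but the result doesn't change.
    print(effective_fps(cpu_fps=110, gpu_fps=75))   # 75
    print(effective_fps(cpu_fps=125, gpu_fps=75))   # still 75

    # CPU-bound scene: the same memory-driven CPU gain now shows up directly.
    print(effective_fps(cpu_fps=110, gpu_fps=160))  # 110
    print(effective_fps(cpu_fps=125, gpu_fps=160))  # 125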

We see dual-channel memory configurations providing some framerate benefit over single-channel configurations in games. Different games see different impacts from dual-channel. While some like ARMA 3 don't see much, others like Hitman, Project Cars, The Witcher 3, Rise of the Tomb Raider, and Battlefield 1 can see gains in the 20s to 30s of percent, which is certainly not something to scoff at. Evidence points to greater performance scaling with processors that have more physical cores or SMT, and higher clock speeds. For example, while the i3-6100 typically demonstrated small benefits (10 - 15% on average across 7 games), a Ryzen 7 at 3.75GHz, an i7-6700, and an overclocked i5 often saw framerate gains above 20%, sometimes much larger. Occasionally, these benefits showed up mostly in 1% and 0.1% lows, improving your experience in worst-case scenarios. With Ryzen processors, a dual-channel configuration using DDR4-2133 seems to allow for performance similar to or even outmatching a single-channel configuration with DDR4-3200. Currently, to my knowledge, there aren't any recent, non-misleading gaming benchmarks looking at memory channel configurations greater than dual-channel.
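For some rough context on why channel count matters (peak theoretical numbers only; real-world throughput is lower, and latency and access patterns matter just as much), each channel moves 64 bits, or 8 bytes, per transfer, so peak bandwidth is roughly channels x 8 bytes x transfer rate. A quick sketch, with the peak_bandwidth_gbs helper being my own naming, using the DDR4-2133 dual-channel vs. DDR4-3200 single-channel comparison from above:

    # Rough peak-bandwidth math: channels * 8 bytes per transfer * transfers per second (MT/s).
    # These are theoretical ceilings, not what games actually achieve.

    def peak_bandwidth_gbs(channels, mt_per_s):
        """Peak theoretical bandwidth in GB/s for a given channel count and transfer rate."""
        return channels * 8 * mt_per_s / 1000

    print(peak_bandwidth_gbs(2, 2133))  # ~34.1 GB/s, dual-channel DDR4-2133
    print(peak_bandwidth_gbs(1, 3200))  # ~25.6 GB/s, single-channel DDR4-3200
    print(peak_bandwidth_gbs(2, 3200))  # ~51.2 GB/s, dual-channel DDR4-3200

On paper, dual-channel DDR4-2133 has more raw bandwidth than single-channel DDR4-3200, which lines up with the behavior described above, though bandwidth alone doesn't tell the whole story.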

The use of higher DIMM speeds has a varying impact in different games. Some games benefit more in framerates than others. Ones like The Division and Titanfall 2 seem to experience little to no difference, while many fall into groups of 10 - 18% and 18 - 25% higher framerates when using DDR4-3200 / DDR4-3000 over DDR4-2133, and a few like Mafia 3 and Watch Dogs 2 might show even larger benefits. Sometimes, games show most of their framerate gains from higher DIMM speeds in the form of better 1% and/or 0.1% lowest framerates, giving a better experience in worst-case scenarios and generally taxing moments. There also does not appear to be much difference in performance scaling between Intel and AMD Ryzen-based systems (though Ryzen-based systems seem to see larger gains more often), although it should be noted that it is easier to become CPU-bound with Ryzen than with a recent-gen, competing Intel CPU. In terms of scaling across different DIMM speeds, if the differences between DDR4-2133 and DDR4-2666, and between DDR4-2666 and DDR4-3200, are any indication, we sometimes see scaling slow down past DDR4-2666 (whether in small amounts or in large amounts; there is plenty of both), and other times we don't really see that happen. In case anyone's wondering, yes, decent gains have been seen before from increasing memory speed on lower-end CPUs and on older DDR3 systems, not unlike those seen above.

The role of timings in performance in current systems and games isn't too clear. Current, limited evidence suggests that with Ryzen, a difference of 3-4 clock cycles in the primary timings (X-X-X-X) has about the same effect as a 266 MT/s difference, and that with a slightly larger timing difference, Intel systems benefit similarly in some games, almost not at all in a few, and much more (10% - 15%) in others. This difference between Intel and AMD Ryzen-based systems may actually be the case, or it may simply be because the benchmarks with the Intel system seen in this post looked at more games which would show such differences. Without other decent, recent benchmarks in games on either Intel or AMD Ryzen-based systems, I recommend not taking this as truth just yet. The benchmarks analyzed in this post also didn't mention whether they changed any subtimings, but it's likely they would have mentioned it if they did. I mean, why wouldn't you?
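One way to put timings and DIMM speed on a common footing (a simplification that ignores subtimings and real access patterns; the timing_ns helper is just my own naming): convert a timing from clock cycles into nanoseconds. DDR transfers twice per clock, so the memory clock is half the MT/s rating and one cycle lasts 2000 / (MT/s) nanoseconds.

    # Convert a memory timing from clock cycles to absolute time.
    # DDR transfers twice per clock, so one cycle lasts 2000 / (MT/s) nanoseconds.

    def timing_ns(cycles, mt_per_s):
        """Time in nanoseconds for a timing given in clock cycles at a given transfer rate."""
        return cycles * 2000 / mt_per_s

    print(round(timing_ns(12, 2133), 2))  # DDR4-2133 CL12 -> ~11.25 ns
    print(round(timing_ns(16, 3000), 2))  # DDR4-3000 CL16 -> ~10.67 ns
    print(round(timing_ns(14, 3200), 2))  # DDR4-3200 CL14 -> 8.75 ns

By this measure, the DDR4-2133 CL12 and DDR4-3000 CL16 kits from the benchmark linked earlier have similar CAS latency in absolute time, with the faster kit still ahead on bandwidth.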

With AMD Ryzen, rank interleave (requiring 2 or 4 ranks per channel, so 1 or 2 dual-rank DIMMs per channel) provides framerate benefits in games vs. no rank interleave ranging from ~11% in some games, to about equal to or slightly better than a 266 MT/s increase (4 - 8%, generally) in most, to next to nothing in a few. While there don't seem to be any relatively up-to-date benchmarks showing ranks and gaming on Intel systems, it wouldn't be too surprising if the differences were similar.

Let's do a quick reality check.

A 5% framerate improvement is equal to...

30 to 31.5 fps, 40 to 42 fps, 50 to 52.5 fps, 60 to 63 fps, 70 to 73.5 fps, 80 to 84 fps, 90 to 94.5 fps, 100 to 105 fps, 110 to 115.5 fps, 120 to 126 fps, 130 to 136.5 fps, and 140 to 147 fps. These differences can largely be ignored; they are effectively inconsequential in real life.

A 10% framerate improvement is equal to...

30 to 33 fps, 40 to 44 fps, 50 to 55 fps, 60 to 66 fps, 70 to 77 fps, 80 to 88 fps, 90 to 99 fps, 100 to 110 fps, 110 to 121 fps, 120 to 132 fps, 130 to 143 fps, and 140 to 154 fps. These differences aren't very noticeable, but are nice to have.

A 15% framerate improvement is equal to...

30 to 34.5 fps, 40 to 46 fps, 50 to 57.5 fps, 60 to 69 fps, 70 to 80.5 fps, 80 to 92 fps, 90 to 103.5 fps, 100 to 115 fps, 110 to 126.5 fps, 120 to 138 fps, 130 to 149.5 fps, and 140 to 161 fps. The differences here start getting noticeable, especially on the lower end. They aren't exactly big, though.

A 20% framerate improvement is equal to...

30 to 36 fps, 40 to 48 fps, 50 to 60 fps, 60 to 72 fps, 70 to 84 fps, 80 to 96 fps, 90 to 108 fps, 100 to 120 fps, 110 to 132 fps, 120 to 144 fps, 130 to 156 fps, and 140 to 168 fps. Framerate differences are getting to be quite decent.

A 25% framerate improvement is equal to...

30 to 37.5 fps, 40 to 50 fps, 50 to 62.5 fps, 60 to 75 fps, 70 to 87.5 fps, 80 to 100 fps, 90 to 112.5 fps, 100 to 125 fps, 110 to 137.5 fps, 120 to 150 fps, 130 to 162.5 fps, and 140 to 175 fps.

A 30% framerate improvement is equal to...

30 to 39 fps, 40 to 52 fps, 50 to 65 fps, 60 to 78 fps, 70 to 91 fps, 80 to 104 fps, 90 to 117 fps, 100 to 130 fps, 110 to 143 fps, 120 to 156 fps, 130 to 169 fps, and 140 to 182 fps. At this point, framerate differences are considerable.
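If anyone wants to extend these reality-check numbers to other percentages or base framerates, they're just base framerate x (1 + improvement); a quick Python sketch:

    # Reproduce the reality-check numbers: new framerate = base framerate * (1 + improvement).

    base_framerates = [30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140]

    for improvement in (0.05, 0.10, 0.15, 0.20, 0.25, 0.30):
        scaled = [round(fps * (1 + improvement), 1) for fps in base_framerates]
        print(f"{improvement:.0%}: {scaled}")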

Ultimately, choosing main memory for a gaming-centered PC is a question of pricing, budget, compatibility, upgradeability, and, of course, how much CPU-bound performance matters to you.

Currently, in the United States, we see 16GB of memory starting at about $122. At 16GB, kits with different capacities per module cost about the same. 8GB of memory starts at about $63. The cheapest 2 x 4GB kits start at about $68, $5 more than the cheapest single 8GB DIMMs. At both capacities, we see DDR4-2400 and DDR4-2133 being pretty much equivalently priced, so there's no reason not to get DDR4-2400 here. 8GB DDR4-2666 kits start at about $70 (mostly 1 x 8GB), while 8GB DDR4-2800 kits start at $73. 16GB ~DDR4-3000 kits are starting at about $133 right now, and 16GB DDR4-2666 kits are starting at $130. Prices for both 8GB and 16GB kits/modules start to rise quickly beyond DDR4-3200, making for much worse price/performance past that point given the performance seen in current benchmarks. This is ignoring minor sales on individual DIMMs/kits, of course. There are almost always such small sales, so you will often see something priced better than the rest at a given price point.

For gaming PCs in which all system memory is being bought new, I recommend always purchasing 2 x XGB memory kits when reasonably possible, as long as your motherboard has four DIMM slots. This offers not only better CPU-bound performance right away, but also good capacity upgradeability. If, say, you bought a 2 x 4GB kit and have four DIMM slots, only price can stop you from buying and installing a 2 x 16GB kit later to have 40GB of memory (really, how many people need that much anyway?). While an approach like that carries a risk of incompatibility when you upgrade, you can reduce this risk greatly by choosing a kit that is meant to work at similar timings, voltage, and DIMM speed; preferably, get a kit that uses the same DRAM chips as your current memory. If you have only 2 DIMM slots, a two-DIMM kit is still worth considering, but be more careful in your decision based on the information in this post: if you fill both DIMM slots, any upgrade means switching out one or both DIMMs, which could mean living with asymmetrical channel interleave ways and/or losing money (assuming you don't reuse the previous DIMM(s) or sell them for the same or a higher price).

When you decide on memory for gaming, do consider how often you will likely be CPU-limited. If you are using a 120Hz+ monitor, you should definitely look towards higher memory performance. Remember, the CPU is the ultimate framerate limiter, and pushing such framerates is difficult in some AAA games. Higher memory performance will ease CPU bottlenecking, allowing you to push higher maximum framerates and extend the useful lifespan of your CPU in games. Weigh the potential benefits against cost and any upgradeability or aesthetic limitations. Take into account how capable the CPU is in the games you want to play and in general; with something like an overclocked i7-8700K, it's going to be harder to become CPU-bound than with a Ryzen processor or a Pentium G4560. Also consider how well your CPU meets your performance expectations, for how long it will continue to, and whether you plan to upgrade the CPU before it no longer meets your preferences.

Comments:

TheShadowGuy 2 points 18 months ago

This is an extremely well-researched and useful compilation. Excellent work.

Rexper 1 point 18 months ago

If the video card's VRAM is full and some data is transferred to the DRAM instead, and the DRAM is also full, can that same data originally from the VRAM go to the page file on a storage device?

526christian submitter 2 points 18 months ago

I don't have any hard confirmation offhand since I've been away from my PC, but as far as I know, yes.

[comment deleted by staff]
526christian submitter 6 points 18 months ago

Possibly.

tragiktimes101 1 Build 1 point 18 months ago

I think it does.