Potential CPU optimisation for Ryzen CPUs


Mannymac

4 hours ago, Kevin Perry said:

This is a good article (I think) about the whys and wherefores of SMT/HT: https://bitsum.com/tips-and-tweaks/why-you-should-not-disable-hyper-threading-or-why-you-should/

The scheduler and the architecture have changed: previously, Windows had a "habit" of using logical cores on the same physical core rather than spreading threads out over logical cores across physical dies.

This article is from 2017. I read somewhere at the beginning of this year that Windows 10, since the spring 2020 update, has made significant changes under the hood that better support multicore CPUs and, more specifically, benefit AMD processors.
Here's another article from February 2020 (before the Windows 10 changes) that adds some info to your article: https://www.techjunkie.com/disable-hyperthreading/

From what I understand so far, disabling hyperthreading might be beneficial as long as the number of concurrent tasks does not exceed the number of physical cores. Once you cross that number, hyperthreading is probably more beneficial (also according to many benchmarks). With large projects in CbB I've seen all 32 of my logical cores in use, so I wonder whether disabling hyperthreading would help in that situation.

On 11/25/2020 at 6:51 PM, Mannymac said:

My Ryzen 3950X has 16 "real" cores / 32 logical ones.
Go into the config file under settings, set the max thread count to your number of real cores (in my case 16), and set the thread scheduling model to 2. Et voila!

Interesting finding!
I get that many things could play a part in the code under the hood and that CbB might still not be as efficient as possible with new AMD CPUs. Apart from that, however, when you use only 16 cores in CbB and have more than 16 processes (CbB + Windows + others) that need CPU time at once, Windows will still use the hyperthreaded cores for operating system tasks (and maybe for other software running alongside CbB), as long as you have not disabled hyperthreading in the BIOS. Maybe the non-CbB processes are limited most of the time, so CbB can still use the majority of the cores without hyperthreading. Or, when you set the number of cores, they are completely claimed by CbB, but that would mean the operating system has no cores to run on...

Anyway, I would like a scenario where I don't need to switch off hyperthreading and don't need to limit CbB to my physical cores (can't get rid of the gut feeling that more cores is better?), considering that I also use the PC for photo/video work, which is known to benefit significantly from hyperthreading.

Hopefully CbB can still be improved regarding multicore use.


You should run Xmeters in the taskbar of your computer.  I have it configured to show operating system stuff in orange, and the other color for application work. You'll quickly see that a CPU logical core is never dedicated to one or the other, but *very* dynamically mixes the workload.

Another thing is that it's quite interesting to see CbB go all out with Drum Replacement. It's actually quite beautiful to see how evenly loaded all 32 logical cores are during that process. 

https://entropy6.com/xmeters/


On 12/3/2020 at 10:47 AM, Teegarden said:

This article is from 2017. I read somewhere at the beginning of this year that Windows 10, since the spring 2020 update, has made significant changes under the hood that better support multicore CPUs and, more specifically, benefit AMD processors.
Here's another article from February 2020 (before the Windows 10 changes) that adds some info to your article: https://www.techjunkie.com/disable-hyperthreading/

From what I understand so far, disabling hyperthreading might be beneficial as long as the number of concurrent tasks does not exceed the number of physical cores. Once you cross that number, hyperthreading is probably more beneficial (also according to many benchmarks). With large projects in CbB I've seen all 32 of my logical cores in use, so I wonder whether disabling hyperthreading would help in that situation.

Interesting finding!
I get that many things could play a part in the code under the hood and that CbB might still not be as efficient as possible with new AMD CPUs. Apart from that, however, when you use only 16 cores in CbB and have more than 16 processes (CbB + Windows + others) that need CPU time at once, Windows will still use the hyperthreaded cores for operating system tasks (and maybe for other software running alongside CbB), as long as you have not disabled hyperthreading in the BIOS. Maybe the non-CbB processes are limited most of the time, so CbB can still use the majority of the cores without hyperthreading. Or, when you set the number of cores, they are completely claimed by CbB, but that would mean the operating system has no cores to run on...

Anyway, I would like a scenario where I don't need to switch off hyperthreading and don't need to limit CbB to my physical cores (can't get rid of the gut feeling that more cores is better?), considering that I also use the PC for photo/video work, which is known to benefit significantly from hyperthreading.

Hopefully CbB can still be improved regarding multicore use.

There is no processor-architecture-specific coding in CbB for multiprocessing, and I would wager the same is true of most other apps. It would be dangerous to do that, since performance would vary across different CPU models. To CbB all threads are equal.

My guess is that with threads on virtual cores you are at the mercy of the OS implementation, system load and the hyperthreading implementation, since these threads are not truly running on a dedicated core. This old post kind of sums up this behavior. My guess is that when the primary core is really busy due to a heavy load, the virtual core doesn't get a lot of time to run. In a mixing workload with multiple tracks, the full cycle cannot complete until all channels are complete, so if some cores are starved I can see it having a detrimental effect compared to simply running without hyperthreading.
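To make the "full cycle cannot complete until all channels are complete" point concrete, here is a minimal Python sketch. It is not how CbB is implemented; the function names, the 128-sample buffer at 48 kHz, and the per-thread timings are all hypothetical numbers chosen for illustration. It only shows that the cycle time is set by the slowest worker, so one starved thread can push the whole cycle past the buffer deadline.

```python
from typing import Sequence


def buffer_period_ms(buffer_samples: int, sample_rate_hz: int) -> float:
    """Time budget for one audio cycle, in milliseconds."""
    return 1000.0 * buffer_samples / sample_rate_hz


def cycle_time_ms(per_thread_ms: Sequence[float]) -> float:
    """The mix cycle finishes only when the slowest worker finishes."""
    return max(per_thread_ms)


def misses_deadline(per_thread_ms: Sequence[float],
                    buffer_samples: int,
                    sample_rate_hz: int) -> bool:
    """True if the slowest worker overruns the buffer period (a glitch)."""
    budget = buffer_period_ms(buffer_samples, sample_rate_hz)
    return cycle_time_ms(per_thread_ms) > budget


# Hypothetical timings: 16 workers that each need 1.5 ms per cycle.
balanced = [1.5] * 16
# Same load, but one worker shares a busy physical core and takes twice as long.
starved = [1.5] * 15 + [3.0]

print(misses_deadline(balanced, 128, 48000))
print(misses_deadline(starved, 128, 48000))
```

With these made-up numbers the budget is about 2.67 ms, so the balanced case fits and the starved case does not, even though the average load per thread barely changed.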

I could try and add an experimental mode where we filter out processing workloads on threads assigned to virtual cores. Or perhaps do this dynamically based on the actual CPU workload, i.e. if the workload is low, utilize the hyperthreaded cores, but as it gets closer to 80% or some other value, dial back to physical cores only. Who knows if that will help :)
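A rough Python sketch of the dynamic idea, purely as an illustration: the function name, its parameters, and the way the load figure is obtained are assumptions, and nothing here reflects actual CbB code. It picks the number of worker threads from the current load, using all logical cores below the threshold and only the physical cores at or above it.

```python
def worker_thread_count(cpu_load: float,
                        physical_cores: int,
                        logical_cores: int,
                        threshold: float = 0.80) -> int:
    """Pick how many worker threads to use for the next stretch of audio.

    cpu_load is a fraction between 0.0 and 1.0 (hypothetical input; how it
    is measured is left open). Below the threshold all logical cores are
    used; at or above it the engine falls back to physical cores only.
    """
    if not 0.0 <= cpu_load <= 1.0:
        raise ValueError("cpu_load must be between 0.0 and 1.0")
    if cpu_load < threshold:
        return logical_cores
    return physical_cores


# Example with a 16-core / 32-thread CPU such as the 3950X mentioned above.
print(worker_thread_count(0.35, physical_cores=16, logical_cores=32))
print(worker_thread_count(0.85, physical_cores=16, logical_cores=32))
```

A real implementation would probably want some hysteresis around the threshold so the thread count does not flip back and forth when the load hovers near 80%.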


26 minutes ago, Noel Borthwick said:

My guess is that with threads on virtual cores you are at the mercy of the OS implementation, system load and the hyperthreading implementation, since these threads are not truly running on a dedicated core. This old post kind of sums up this behavior. My guess is that when the primary core is really busy due to a heavy load, the virtual core doesn't get a lot of time to run. In a mixing workload with multiple tracks, the full cycle cannot complete until all channels are complete, so if some cores are starved I can see it having a detrimental effect compared to simply running without hyperthreading.

I could try and add an experimental mode where we filter out processing workloads on threads assigned to virtual cores. Or perhaps do this dynamically based on the actual CPU workload, i.e. if the workload is low, utilize the hyperthreaded cores, but as it gets closer to 80% or some other value, dial back to physical cores only. Who knows if that will help :)

Sounds interesting. I wonder if you can separate the OS cores from the cores used by the DAW, and if you can give certain (hyperthreaded) cores priority for the DAW tasks that have the biggest impact on pops, crackles and dropouts.
Maybe the others on this thread could give more feedback on your suggestions (I'm a PC hobbyist, but this goes above my head, unfortunately).

I don't know how it works between software developers, but wouldn't it be a good idea to get in touch with a Windows 10 development team that works on OS-CPU communication? That way you would understand better how MS deals with multicore CPUs and hyperthreading, and they might also be interested in understanding the needs of DAW developers/users.


1 hour ago, Teegarden said:

Sounds interesting. I wonder if you can separate the OS cores from the cores used by the DAW, and if you can give certain (hyperthreaded) cores priority for the DAW tasks that have the biggest impact on pops, crackles and dropouts.
Maybe the others on this thread could give more feedback on your suggestions (I'm a PC hobbyist, but this goes above my head, unfortunately).

I don't know how it works between software developers, but wouldn't it be a good idea to get in touch with a Windows 10 development team that works on OS-CPU communication? That way you would understand better how MS deals with multicore CPUs and hyperthreading, and they might also be interested in understanding the needs of DAW developers/users.

We've been close to Microsoft for over 20 years :)


16 hours ago, Noel Borthwick said:

We've been close to Microsoft for over 20 years :)

Great, so that's covered.
Exactly at the time of our discussion, AnandTech published a test investigating the performance of multithreading on Zen 3 and AMD Ryzen 5000/5. It still confuses me a bit. However, there is some interesting feedback in the comments section that gives more info and suggestions.

 

17 hours ago, Glenn Stanton said:

this is a product i use: Bitsum. Real-time CPU Optimization and Automation https://bitsum.com/ they have a free version with some of the functionality.

I've had the free version of Process Lasso (the Bitsum software you refer to) for a long time, but forgot about it and never used it. Maybe this is a good time to start using it... Any recommendations on how to use it to fine-tune CbB CPU optimization?

I also noticed another program on their website, ParkControl, which could be useful too, but its functionality also seems to be included in Process Lasso.

"With ParkControl, we revealed hidden CPU settings that control core parking, and wrote about how CPU core parking and frequency scaling can affect performance of real-world CPU loads. Put simply, these power saving technologies come with a performance trade-off, so they should be disabled when maximum performance is desired.

Both ParkControl and Process Lasso offer a power profile, Bitsum Highest Performance, that is pre-configured for ultimate performance. In this power plan, your CPU always remains ready to execute new code. Core parking is disabled and the CPU never drops below its nominal (base) frequency.

Since you probably don’t want to be in this power plan all the time, we include automation to switch the active power plan when specific applications or games are running (Performance Mode), or only when the user is active (IdleSaver).

Process Lasso also allows for specific power profiles to be associated with an application in case you want to use different power plans.

Finally, the IdleSaver feature of Process Lasso will switch to a more conservative power plan when you go idle. Similarly, ParkControl has a function called Dynamic Boost that is essentially the opposite of IdleSaver – it raises to a more aggressive power plan when the system is active."

If it works as advertised it is a very handy addon: I've used Power Buddy to manually switch to max performance when using photo or audio editing software. I regularly forget to switch it on or off...
Now this can be done automatically without having to think about switching to a more efficient power plan when the hard work is done and vice versa! 

 


  • 2 weeks later...

I have a 9900K (8 core/16 thread, 5.1 GHz) system with a custom waterblock loop (420 mm radiator / EK Magnitude waterblock) and fairly good overclocking capabilities. It is about 2 years old now and it has performed quite well in things like Folding@home and neural net training, besides audio.

I use Windows 10 Pro 20H2 build 19042.662.

I built a Reaktor ensemble that allows me to dial in the amount of load it uses on an audio track, from about 10% load to 90% load (as reported by the Reaktor CPU load indicator).

Although I normally don't worry much about audio settings, when I need some extreme performance, I find that hyperthreading might get in the way. In the case of using this Reaktor tool, it is immediately obvious - but that is probably because it is the same process running in the hyperthreads, so sharing CPU resources between the two logical cores is not optimal.

Here are some images of the results.

The first is showing 8 copies of the tool set at load = 60%. I hear audio glitches sporadically at this setting, but not when the load is 50%.

 

[Image: Reaktor stress test, all cores]

 

Next, if I set the CPU affinity of Cakewalk to use only the even-numbered logical cores, I can run the load up to 90% with no glitches at all.
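For anyone who wants to set this affinity from a script or enter it as a number, here is a small Python sketch that computes the bitmask for "even logical processors only". The function name is made up for illustration, and it assumes the usual enumeration where logical processors 2k and 2k+1 are the two hyperthreads of physical core k; check your own topology before relying on it.

```python
def even_logical_cpu_mask(logical_cpus: int) -> int:
    """Affinity bitmask with one bit set per even-numbered logical CPU.

    Bit n of the mask corresponds to logical processor n. Assumes that
    logical processors 2k and 2k+1 share physical core k.
    """
    mask = 0
    for cpu in range(0, logical_cpus, 2):
        mask |= 1 << cpu
    return mask


# 8 cores / 16 threads, as on the 9900K described above.
print(hex(even_logical_cpu_mask(16)))
```

For 16 logical processors this gives 0x5555, i.e. the bit pattern 0101010101010101.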

 

[Image: Reaktor stress test, even threads only]

 

I use the 'Ultimate Performance Power Plan':

[Image: Ultimate Performance power plan]

 

And I use MSI Util to set all interrupts to MSI mode (but I don't think this matters very much; I don't see much difference when normal interrupts are on):

 

 

[Image: MSI Utility]

 

Here are my driver settings:

 

[Image: driver settings]

 

and Aud.ini:

 

[Image: Aud.ini]

 

And a short Latency Mon run:

 

[Image: LatencyMon]

Edited by Jim Hurley
Added Windows version number

Yes. I often run Reaktor in standalone mode, using this batch file to put eight copies each on their own core. (Reaktor only uses a single core for its audio.)

In an even more extreme case I can select just the best performing cores, as most CPUs these days have cores of varying quality.

I have 4 or 5 good ones and 3 poorer performing ones; I believe that is fairly common in the silicon lottery. I don't believe my CPU is above average in any regard.

The '/high' means set high priority.

Quote

rem 
rem Start 8 versions of Reaktor 6 standalone, one in each physical core, running in the even hyperthreads
rem
rem Jim Hurley 10 Feb 2019
rem
start "CPU  0               1" /high /affinity 0x0001 "C:\Program Files\Native Instruments\Reaktor 6\Reaktor 6.exe"
start "CPU  2             100" /high /affinity 0x0004 "C:\Program Files\Native Instruments\Reaktor 6\Reaktor 6.exe"
start "CPU  4           10000" /high /affinity 0x0010 "C:\Program Files\Native Instruments\Reaktor 6\Reaktor 6.exe" 
start "CPU  6         1000000" /high /affinity 0x0040 "C:\Program Files\Native Instruments\Reaktor 6\Reaktor 6.exe" 
start "CPU  8       100000000" /high /affinity 0x0100 "C:\Program Files\Native Instruments\Reaktor 6\Reaktor 6.exe"
start "CPU 10     10000000000" /high /affinity 0x0400 "C:\Program Files\Native Instruments\Reaktor 6\Reaktor 6.exe"
start "CPU 12   1000000000000" /high /affinity 0x1000 "C:\Program Files\Native Instruments\Reaktor 6\Reaktor 6.exe" 
start "CPU 14 100000000000000" /high /affinity 0x4000 "C:\Program Files\Native Instruments\Reaktor 6\Reaktor 6.exe"
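The hex values after /affinity are single-bit masks, one bit per logical processor. As a hedged illustration of how they are derived, this Python sketch regenerates the same masks and prints equivalent start lines. The helper names are hypothetical, the window titles are simplified compared to the batch file above, and it again assumes the even-numbered logical processors are one hyperthread per physical core.

```python
REAKTOR_EXE = r"C:\Program Files\Native Instruments\Reaktor 6\Reaktor 6.exe"


def single_cpu_mask(cpu: int) -> int:
    """Affinity mask that selects exactly one logical processor."""
    return 1 << cpu


def start_command(cpu: int, exe: str) -> str:
    """Build a cmd.exe 'start' line pinning one process to one logical CPU."""
    mask = single_cpu_mask(cpu)
    return f'start "CPU {cpu}" /high /affinity 0x{mask:04X} "{exe}"'


# One line per even-numbered logical processor on a 16-thread CPU.
for cpu in range(0, 16, 2):
    print(start_command(cpu, REAKTOR_EXE))
```

To pin to a hand-picked set of "good" cores instead, the same idea applies: pass only the logical processor numbers you want to the loop.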

 

Edited by Jim Hurley