
A Look at RX Vega 64 Efficiency


Wolfenstein 2: The New Colossus Results:

Since I collected this data first, I want to go through it first. As I stated earlier, I accidentally missed recording a run in this game for the 130 FPS target at the Undervolt +50% Power Limit (UV +50) configuration. This is not the only issue with this data, which you should see when comparing the results for the Stock Default run to the other two Default runs. After looking back at the OCAT data, I determined I made a procedural mistake that caused the Stock Default run to show better performance than the others. I was able to avoid this mistake in the later testing, and it is likely not something that will affect your experience if you play the game.

The way OCAT works is as a front-end for PresentMon, which is what actually captures the frame time data. To collect this data it hooks into every application it can and then saves the frame time information to a file, marking which application it is for. A summary file is also created by OCAT, identifying things like average frame rate for an application, and when I looked at this I saw that the Radeon Settings application was also being recorded. (I have a script to remove unwanted applications from the raw CSVs, so I never saw these entries in the data itself.) For The New Colossus I had left the Radeon Settings utility for managing the GPU and FRTC open in the background, and apparently rendering it was impacting the performance of the game. I did not have it open for the Stock Default run, which I believe is why its average frame rate was higher and why the frame times were also smoother for that run, compared to the others.
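For anyone wanting to do similar clean-up, here is a minimal sketch of filtering a recording down to one application. It is not the script used for this article. It assumes the CSV has an `Application` column, as PresentMon recordings normally do, and the executable names and values in the sample are invented for illustration.

```python
import csv
import io

def keep_application(rows, wanted):
    """Return only the rows whose Application column matches `wanted` (case-insensitive)."""
    wanted = wanted.lower()
    return [row for row in rows if row["Application"].lower() == wanted]

# Tiny invented sample in the general shape of a PresentMon/OCAT recording.
# The process names and timings are assumptions, not data from the article.
sample = io.StringIO(
    "Application,MsBetweenPresents,MsBetweenDisplayChange\n"
    "game.exe,7.7,7.6\n"
    "RadeonSettings.exe,16.7,16.7\n"
    "game.exe,7.9,8.0\n"
)

rows = list(csv.DictReader(sample))
game_rows = keep_application(rows, "game.exe")
print(len(rows), "rows recorded,", len(game_rows), "kept for the game")
```

Filtering like this hides the extra application from the frame time data, which is also why the stray entries only showed up in the summary file.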

I did not realize this issue before uninstalling The New Colossus, so I could not re-do the impacted runs, but I am going to act under the assumption the impact was consistent across the impacted runs, and consider the Stock Default run as the outlier.

Here is a video of the run I did. As this is actual gameplay, there were variances between each run, but they are roughly consistent, especially as the enemy placement was the same between each run.

 

 

 

 

The first graphs I have for you to look at show the average power use for each run in the sets. The label at the base of each column is the average power value, while the box plots are of the frame time data and align with the right axis marked Frame Rate (FPS). The crossed circle marks the average FPS while the center line of each box plot marks the median, so half the data is below and half above.
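To make the difference between the crossed circle and the center line concrete, here is a small sketch that converts frame times into frame rates and reports both an average and a median. The frame times are invented, and the choice to compute the average as frames divided by total time is an assumption, since the graphs may have been built from a different kind of average.

```python
import statistics

def fps_summary(frame_times_ms):
    """Summarize a list of frame times (milliseconds) as frame rates (FPS)."""
    per_frame_fps = [1000.0 / ms for ms in frame_times_ms]
    total_seconds = sum(frame_times_ms) / 1000.0
    return {
        # Assumption: average FPS is frames rendered divided by elapsed time.
        "average": len(frame_times_ms) / total_seconds,
        # The median splits the per-frame rates in half.
        "median": statistics.median(per_frame_fps),
    }

# Invented frame times: mostly 10 ms and 20 ms frames plus one very fast frame.
summary = fps_summary([10.0, 10.0, 20.0, 20.0, 5.0])
print(summary)
```

A few slow frames pull the average down more than the median, which is one reason the two markers can sit apart on a box plot.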

 

 

In these graphs we can see there are some differences between the Stock, UV, and UV +50 sets, though the differences are not necessarily as significant as one might want to see. The difference between similar runs between the sets is at most around 10 W, which is not that much, but we do see a flatter trend at the higher frame rates for both the UV and UV +50 sets.

This next set of graphs shows the average clock speeds of the different runs, box plots of the clock speeds, box plots of the frame rate, and the core P-states. There is a lot of information here, so I hope I constructed the graphs well enough to be understood. Here we see some more differences between the sets as the Stock runs show a greater variability to the clock speeds compared to the UV or UV +50 runs. What I believe is noteworthy is how the UV and UV +50 runs reported having higher average clock speeds compared to Stock. This makes sense as an undervolt should reduce the heat produced by the chip, thereby also reducing thermal throttling.

 

 

While I do have measurements for the temperature of the GPU, this information is not all that valuable here. The drivers have a temperature target of 75 °C, so this is where the temperature will trend toward when under any appreciable load. Instead the fan speed, which is also measured, is more valuable data to consider. The fan speed is going to be directly related to the heat produced by the GPU because its speed determines how quickly the heat is removed. The faster the fan speed, the more heat you know was being removed, and this is true regardless of what the temperature of the GPU is.

 

 

Though the result for the 130 FPS target for Stock is unusual, what we can see on these graphs is the Stock runs had approximately the same or higher average fan speeds than the UV or UV +50 runs, which is what we would expect from an undervolt. Some of the UV +50 runs had a higher fan speed than the UV runs, which makes sense if the increased power limit was being taken advantage of, allowing the GPU to pull more power if it needs it, but it is not particularly evident here.

The next graphs I am going to share plot the GPU clock speed against the power use, with these first three just looking at the averages of these values. This is interesting to see because while the power use might be similar between them, we can clearly see the clock speeds were able to be higher for the UV and UV +50 sets. This supports the idea that undervolting can allow for increased performance without increasing power use. Of course this graph does not indicate how successful the GPU was at achieving and maintaining these frame rates, but I do have such graphs that I can share later.

 

 

While averages are a useful characteristic of data, sometimes you want more information than they can provide, which is why I made this next group of graphs. These graph the individual measurements of clock speed and power use, grouped and colored by the specific run. Instead of showing specific points I am using hexagon bins, which are easier to see and reduce the number of points on the graph. The rug plots along the axes, however, still show the specific values. These plots put a line against the axis that, if continued out, would intersect the point on the graph, so you can see how the GPU clock and ASIC power are distributed between the runs by their coloring. Finally, there is a regression line based on all of the data, not any one run, on top of the hexagon bins. It is a polynomial regression of degree five, which means it assumes the relationship between the GPU Clock and ASIC Power is a polynomial with the highest power being five. Why five? Because I know it needs to be an odd number, so that it roughly follows a lower-left to upper-right trend (with an even number both ends of the curve would head the same way, as on a parabola), and five instead of three because five allows for a greater sensitivity to the data. (I should mention that while I do have a BS in mathematics, I earned it without ever taking a statistics course, so I am self-taught here.)
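As an illustration of what a degree-five polynomial regression does, here is a minimal least-squares sketch using only the Python standard library. It is not the code behind the graphs. The clock and power numbers are invented, and rescaling the clock speed to roughly -1 to 1 before fitting is my own assumption, made to keep the arithmetic well behaved.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit. Returns coefficients, lowest power first."""
    n = degree + 1
    # Normal equations: sum(x^(i+j)) * c_j = sum(y * x^i) for each row i.
    power_sums = [sum(x ** k for x in xs) for k in range(2 * degree + 1)]
    rows = []
    for i in range(n):
        rhs = sum(y * x ** i for x, y in zip(xs, ys))
        rows.append([power_sums[i + j] for j in range(n)] + [rhs])
    # Gaussian elimination with partial pivoting on the augmented matrix.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for k in range(col, n + 1):
                rows[r][k] -= factor * rows[col][k]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(rows[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (rows[i][n] - tail) / rows[i][i]
    return coeffs

def polyval(coeffs, x):
    """Evaluate the polynomial at x using Horner's method."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result

# Invented data: clocks from 1000 to 1600 MHz and a made-up power curve in watts.
clocks = [1000 + 30 * i for i in range(21)]
ts = [(c - 1300) / 300 for c in clocks]            # rescaled clock, about -1..1
powers = [150 + 60 * t + 40 * t ** 3 for t in ts]  # hypothetical relationship

coeffs = polyfit(ts, powers, 5)
print([round(c, 3) for c in coeffs])
```

Because the invented data follows a cubic exactly, the fitted fourth- and fifth-degree coefficients come out at essentially zero. With real, noisy measurements the higher terms are what give the curve its extra sensitivity.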

 

 

The regression line should roughly follow what we saw in the graphs of the averages before, and ideally we want things to be low (meaning less power was used) and to the right (meaning a higher clock speed). We are able to see the inflection points, where the curve switches from bending one way to bending the other, on these graphs thanks to the regression, and for the Stock set it looks to be around 1250 MHz to 1300 MHz. The position of the inflection point tells us when the power draw starts increasing greatly as the clock speed increases, so the farther to the right the better, as that means the frequency can be higher before power demand spikes. For the UV and UV +50 sets the inflection point looks to be around 1300 MHz to 1350 MHz. This kind of shift is nice to see.
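The inflection points here were read off the graphs by eye, but one could also locate them numerically. The sketch below looks for places where the second derivative of a polynomial changes sign, which is the usual definition of an inflection point. The example polynomial is invented and is not one of the fitted curves.

```python
def derivative(coeffs):
    """Coefficients (lowest power first) of the derivative polynomial."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def polyval(coeffs, x):
    """Evaluate the polynomial at x using Horner's method."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def inflection_points(coeffs, lo, hi, steps=1000):
    """Find x in [lo, hi] where the second derivative changes sign."""
    second = derivative(derivative(coeffs))
    found = []
    last_x, last_v = None, 0.0
    for i in range(steps + 1):
        x = lo + (hi - lo) * i / steps
        v = polyval(second, x)
        if v == 0.0:
            continue  # wait for the next nonzero sample to confirm a sign change
        if last_v != 0.0 and (last_v > 0) != (v > 0):
            a, b, fa = last_x, x, last_v
            for _ in range(60):  # bisection narrows the bracket
                mid = (a + b) / 2
                fm = polyval(second, mid)
                if fa * fm <= 0:
                    b = mid
                else:
                    a, fa = mid, fm
            found.append((a + b) / 2)
        last_x, last_v = x, v
    return found

# Invented example: p(t) = t^3 - 3t^2, whose second derivative is 6t - 6.
points = inflection_points([0.0, 0.0, -3.0, 1.0], -2.0, 4.0)
print(points)
```

If the fit was done on a rescaled clock value, any point found this way has to be converted back to MHz before comparing it against the graphs.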

Looking at the UV and UV +50 regression lines we also see an interesting difference in behavior. For the UV line we see an increase to the curve of the line around 1500 MHz while the increase to the curve on the UV +50 line starts more around 1425 MHz. The slope of the UV +50 line past this point is less severe, so while the power draw might start increasing sooner, it also does not increase as rapidly compared to the UV set.

I said earlier I have graphs to show how well the GPU could achieve and maintain the frame rate targets, so here they are, though I am actually showing you the display rate instead of frame rate. The reason for using display rate is that this measurement tells you about the time between frames going to the display, and so it better pertains to the experience you will have than the time between the Present() calls that the frame time measurements record.
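A quick sketch of the conversion involved: both measurements are times in milliseconds, so a rate is simply 1000 divided by the value. The numbers below are invented to show how uneven Present() timing can still produce an even display rate. Skipping non-positive values is an assumption on my part, on the basis that they would mark frames that never reached the display.

```python
def to_rate(ms_values):
    """Convert millisecond intervals to rates per second, skipping non-positive values."""
    return [1000.0 / ms for ms in ms_values if ms > 0]

# Invented intervals: the Present() calls alternate between short and long gaps,
# while the frames reach the display at an even pace.
ms_between_presents = [5.0, 11.0, 5.0, 11.0]
ms_between_display_change = [8.0, 8.0, 8.0, 8.0]

frame_rates = to_rate(ms_between_presents)
display_rates = to_rate(ms_between_display_change)
print("frame rate:  ", [round(r, 1) for r in frame_rates])
print("display rate:", [round(r, 1) for r in display_rates])
```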

 

 

The spiky line is a frequency plot for all of the OCAT MsBetweenDisplayChange data I have for the respective set, which means the line shows approximately how many measurements there are at the specific value. We are interested in seeing how wide these spikes are, especially at the lower frame rates, more than their height, though at higher frame rates the height is important. A sudden drop would indicate the GPU was having greater difficulty in producing frames at the targeted rate, but what I cannot explain is why in the Stock runs there were drops followed by increases. It looks like this occurs near changes in P-state, but I am not sure why that would have an impact. It does not seem to occur as often or as dramatically with the UV or UV +50 sets.
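For a sense of what a frequency plot is counting, here is a minimal sketch that groups display rate values into bins and counts how many fall in each. The bin width of 1 FPS and the sample values are assumptions for illustration, and the actual graphs may have used different binning.

```python
from collections import Counter

def frequency_counts(rates, bin_width=1.0):
    """Count how many rate values fall into each bin, keyed by the bin's lower edge."""
    counts = Counter(int(rate // bin_width) for rate in rates)
    return {b * bin_width: counts[b] for b in sorted(counts)}

# Invented display rates clustered around a 60 FPS target, with one slow frame.
rates = [59.6, 60.1, 60.4, 60.9, 61.2, 45.3]
counts = frequency_counts(rates)
for lower_edge, count in counts.items():
    print(f"{lower_edge:5.1f} FPS: {'#' * count}")
```

A narrow, tall cluster means the GPU held the target closely, while a wide spread of occupied bins means the display rate wandered.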

If you do not believe me that the display rate values are better, here are some graphs showing the difference between MsBetweenPresents and MsBetweenDisplayChange. As you can see, MsBetweenDisplayChange, the display rate I was referring to, shows much smoother lines, which matches what I visually experienced. By the way, the points have an opacity of 10%. I did this because where there are a lot of points, they will appear black, while those removed from the main group, the outliers, are less visible. I feel this is a better way to identify the outliers than trying to draw in the percentile lines I do when showing single runs in a larger graph.
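The reason low opacity works this way can be shown with a little arithmetic. Assuming the usual way semi-transparent points are layered, each point lets 90% of what is beneath it show through, so a stack of n points has a combined opacity of 1 - 0.9^n. This is a sketch of that formula, not a description of the plotting software used.

```python
def stacked_opacity(alpha, n):
    """Combined opacity of n overlapping points, each drawn with opacity `alpha`."""
    return 1.0 - (1.0 - alpha) ** n

for n in (1, 5, 10, 25, 50):
    print(f"{n:2d} overlapping points -> {stacked_opacity(0.10, n):.1%} opaque")
```

A lone outlier stays at 10% and is faint, while a few dozen overlapping points are already close to solid black.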

 

 

 

 

Also, in case you were interested in seeing the frequency plots without the runs being combined into a single line, here they are separated by run:

 

 

 

 

That pretty much covers the data from The New Colossus, so time to move on to the next game.




  1. RX Vega 64 Efficiency - Introduction
  2. RX Vega 64 Efficiency - Procedure
  3. RX Vega 64 Efficiency - Wolfenstein 2: The New Colossus Results
  4. RX Vega 64 Efficiency - Middle-earth: Shadow of War Results
  5. RX Vega 64 Efficiency - Killing Floor 2 Results
  6. RX Vega 64 Efficiency - All Graphs and P-State Tables Part 1
  7. RX Vega 64 Efficiency - Conclusion
© 2001-2018 Overclockers Club ® Privacy Policy