NVIDIA GeForce RTX 3080 Ti FE Review
NVIDIA GeForce RTX 3080 Ti FE: Introduction
Hello and welcome to another review! Today we are taking a look at another of NVIDIA's Ampere-based graphics cards, one that aims to deliver RTX 3090 performance with a bit of a price adjustment. The RTX 3080 Ti is a stop-gap to keep AMD at bay, and using the same GA102 silicon found in the RTX 3080 and RTX 3090, it has the means to do just that. It takes a small slice out of the GA102 GPU, dropping two SMs from the RTX 3090's configuration, yet still houses a staggering 10240 CUDA cores! With only a small reduction in compute performance, the biggest change is NVIDIA cutting the VRAM in half: 12GB instead of the 24GB used previously. This choice not only reduces manufacturing cost, but also puts the RTX 3080 Ti more in line with previous generations' flagships like the GTX 1080 Ti and RTX 2080 Ti. It is a more realistic and practical configuration geared towards hardcore gamers rather than professional content creators.
The Ampere generation brings architectural improvements and extra benefits not found in the pre-RTX generations, such as ray-tracing and DLSS support, thanks in part to the inclusion of RT and Tensor cores. Since ray-tracing has now been fully implemented into DirectX 12 Ultimate and the Vulkan API, game developers no longer have to rely on special implementations. Game engines like Epic's Unreal Engine 4 bring this to a wide range of titles, with NVIDIA DLSS now baked right into the engine as a seamless plugin. Those who discounted AI upscaling previously may be surprised by the advancements made since NVIDIA introduced the technology with the last generation's RTX 20 series Turing architecture. NVIDIA DLSS is certainly something to consider in graphically intense games, as it can significantly raise the frame rate with minimal sacrifice to image quality.
Now that AMD and NVIDIA have both released the majority of their higher-end lineups, it is time to see how the RTX 3080 Ti compares to everything else released so far. In this review, I benchmarked the GPU across multiple systems, games, and resolutions, both with and without ray-tracing. We will also go over temperatures, power consumption, and overclocking. This is another one of those monstrous reviews with lots of topics, so let's get started!
NVIDIA GeForce RTX 3080 Ti FE: Ampere
For those who missed the Ampere product launch last summer, here is a recap of the generational improvements. With this Ampere generation (2nd Gen RTX), a lot of minor architectural improvements have been made over the Turing GPUs. The RTX 3080 Ti packs a whopping 10240 CUDA cores, compared to the 4352 of NVIDIA's Turing flagship RTX 2080 Ti, which was very impressive for a single die in its time. The Streaming Multiprocessors (SM), RT cores, and Tensor cores have all been reworked to maximize the gains of this Ampere GPU generation. Combined with a massive 28.3 billion transistors (GA102), these changes make for major performance uplifts that fundamentally change how game developers can utilize these optimizations in future AAA games. NVIDIA's marketing touts 1.5x the performance of the RTX 2080 Ti in traditional rasterization and a 1.9x increase in performance per watt over Turing in ray-tracing applications, which gives a small preview of what to expect in this review.
Each SM contains 128 CUDA cores, four third-generation Tensor cores, a 256KB register file, four texture units, one second-generation Ray Tracing core, and 128KB of L1/shared memory. The memory subsystem of the GA102 GPU consists of twelve 32-bit memory controllers (384-bit in total for the RTX 3080 Ti), each paired with 512KB of L2 cache, bringing the total to 6MB of L2 cache for the full GA102 GPU.
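The memory-controller figures above roll up neatly into the card's headline bandwidth number. As a rough sketch (the 19 Gbps per-pin GDDR6X data rate is the published spec for this card and is an assumption of this example):

```python
# Rough sketch: how twelve 32-bit controllers translate into the
# 384-bit bus and its peak memory bandwidth. The 19 Gbps per-pin
# rate is the published GDDR6X speed for this card (an assumption
# of this example, not taken from the review itself).
CONTROLLERS = 12
BUS_WIDTH_BITS = CONTROLLERS * 32      # 384-bit total bus
DATA_RATE_GBPS = 19                    # effective Gbps per pin
L2_PER_CONTROLLER_KB = 512

bandwidth_gb_s = BUS_WIDTH_BITS * DATA_RATE_GBPS / 8
l2_total_mb = CONTROLLERS * L2_PER_CONTROLLER_KB / 1024

print(f"{BUS_WIDTH_BITS}-bit bus -> {bandwidth_gb_s:.0f} GB/s peak")
print(f"L2 cache total: {l2_total_mb:.0f} MB")
# -> 384-bit bus -> 912 GB/s peak
# -> L2 cache total: 6 MB
```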
The Ampere generation includes a 5th-generation NVDEC (NVIDIA's hardware-based decoder) with new improvements, among them AV1 hardware decoding, while H.265 encoding support now extends to 8K and 12-bit 4:4:4 chroma. This is a big improvement over the previous generation, and HEVC (H.265) is finally reaching mass adoption now that streamers and content creators can share high-quality videos without waiting hours to encode. Encoding will be quicker, and YouTube will soon see an increase in 8K as well as 4K content thanks to NVIDIA GPUs off-loading most of the encoding work from professional applications like Blackmagic DaVinci Resolve and Adobe Premiere. Let's not forget OBS streaming software either!
One of the major architectural improvements over Turing is to the SM itself, with programmable shader throughput increased to two shader calculations per clock. The 32-bit floating-point (FP32) pipeline now has dual datapaths; when fully utilized, this is best described as double the throughput from the same number of CUDA cores. In the Turing generation, each SM partition had two primary datapaths, but only one could process FP32 while the other was limited to integer math. With Ampere, FP32 work flows down both datapaths, so each SM now executes 128 FP32 operations per clock (32 per partition). In practice, the total uplift will be less due to the nature of this improvement.
Integer math has been left unchanged, meaning some applications will only see an increase purely from the extra CUDA cores, as INT32 does not have dual datapaths on Ampere. As game engines better optimize for this change, the performance gained from the doubled FP32 paths will only increase over time.
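The per-SM numbers above can be rolled up into the card's theoretical FP32 figure. A minimal sketch, assuming the published 1665 MHz boost clock of the RTX 3080 Ti Founders Edition (that clock is an assumption of this example, not stated in the review):

```python
# Sketch: how 128 FP32 ops per SM per clock become the headline
# TFLOPS figure. An FMA (fused multiply-add) counts as two
# floating-point operations, hence the factor of 2.
SMS = 80                     # SM count for the RTX 3080 Ti
FP32_PER_SM_PER_CLK = 128    # 4 partitions x 32 FP32 ops per clock
BOOST_CLOCK_GHZ = 1.665      # published boost clock (assumed here)

cuda_cores = SMS * FP32_PER_SM_PER_CLK            # 10240
tflops = cuda_cores * 2 * BOOST_CLOCK_GHZ / 1000  # 2 ops per FMA

print(f"{cuda_cores} CUDA cores -> {tflops:.1f} TFLOPS FP32")
# -> 10240 CUDA cores -> 34.1 TFLOPS FP32
```

As the review notes, real-world gains land below this theoretical doubling, since any INT32 work occupies one of the two datapaths.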
Next, let's talk about the RT core, which off-loads the math involved in bounding-box and triangle intersection for ray-tracing. The Ampere RT core has been revamped with an improved triangle solver and is now in its second iteration (Gen 2). Intersection throughput for bounding boxes has been doubled, bringing the total to 58 RT TFLOPS. On top of this improvement, new acceleration hardware has been introduced to handle motion-blurred objects, leading to up to an 8x performance increase when ray-tracing them.
GDDR6X (G6X) memory, introduced with the RTX 30 series, is a major improvement over GDDR6: using four-level PAM4 signaling, each transmitted symbol now carries two bits instead of the single bit of the traditional two-level signaling used in most graphics memory designs today, doubling the data rate per clock. This allows NVIDIA to avoid the extremely expensive HBM2 memory in favor of a proven, cheaper method with higher production yields and similar overall bandwidth. It isn't without trade-offs, mainly that GDDR6X is very sensitive to temperature and frequency. Simply sending four voltage levels down the wire would not work on its own, so the signal is first encoded before entering the memory pipeline, allowing a clear, clean signal to come out the other end. This also means memory overclocking will be underwhelming, as the memory is less forgiving once the frequency is raised past NVIDIA's specifications. I would go as far as to say that memory overclocking GDDR6X is not a viable option for most end-users without serious consideration of alternative cooling methods.
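The idea behind PAM4 can be shown with a toy encoder. This is purely illustrative (the level values and bit-pair ordering here are assumptions for the sketch, not the actual GDDR6X line coding, which is more involved as the paragraph above notes):

```python
# Toy illustration of PAM4 signaling: four voltage levels let each
# transmitted symbol carry two bits, doubling data per clock versus
# two-level (NRZ) signaling. Level values are arbitrary for the demo;
# bit pairs use a Gray-code ordering so adjacent levels differ by
# one bit.
PAM4_LEVELS = {(0, 0): 0.0, (0, 1): 1/3, (1, 1): 2/3, (1, 0): 1.0}

def encode(bits):
    """Pack a flat bit list into PAM4 symbols, two bits per symbol."""
    return [PAM4_LEVELS[(bits[i], bits[i + 1])]
            for i in range(0, len(bits), 2)]

symbols = encode([1, 0, 0, 1, 1, 1, 0, 0])
print(f"8 bits -> {len(symbols)} symbols")  # half as many symbols
# -> 8 bits -> 4 symbols
```

With two-level signaling, those same 8 bits would need 8 symbols, which is exactly why PAM4 doubles the effective data rate at the same signaling frequency.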
NVIDIA GeForce RTX 3080 Ti FE: Closer Look
Now that all the generational architectural improvements have been covered, it's time to break into this box and see what is inside! Besides a new title on the outside, not much has changed in terms of packaging from the rest of the Ampere Founders Editions. Opening it up is a simple procedure of cutting the tape and laying the box down. From there you are free to lift the top, and you will be greeted with an RTX 3080 Ti just waiting to be installed!
By now this new heat sink design has been seen in every bit of marketing material over the last year. The RTX 3080 Ti takes a different approach, forgoing the massive 3-slot cooler used for the RTX 3090 in favor of the same cooler as the RTX 3080 Founders Edition. That is perhaps a misstep in terms of cooling potential: when overclocked, the GPU pulls 400 watts, which is nothing to sneeze at. But I am getting ahead of myself; check out the overclocking section of this review for more on that topic.
Anyone who has seen a Founders card in passing over the years will remember NVIDIA's blower-style cooler from the original Titan cards, which wasn't special but worked well enough. That became an issue once bigger GPUs with higher power draw came around, requiring a better cooling solution, so NVIDIA moved to the dual-fan design found in the RTX 20 series. Rising cooling requirements forced yet another change. This newest cooler design also uses two fans, but with a fundamental difference: one pushes air out of the I/O bracket while the other exhausts upwards into the case itself. This design relies on a case with high airflow, especially around the rear exhaust. While the sides do have a set of fins, these are mostly passive, since the frame blocks the most direct airflow from the fans.
One of my main concerns previously was the heat the graphics card dumps directly over the system memory. This turns out to be a double-edged sword. I have personally dealt with system memory overheating due to poor airflow in one of my test cases, but I have also experienced the opposite in my other test system, with excellent temperatures for all components and a negligible impact on the system memory. This cooler is designed to exhaust a large amount of heat, and it is up to the end-user to provide a good amount of airflow in the case. The hot air is pushed straight up in the hope that the case exhaust fans will whisk it away quicker than if it passively exited the case. This cooler design allows for a smaller footprint at the possible expense of other computer components.
NVIDIA has put great care into the design of this cooler, and it shows in the attention to detail that turns a hunk of metal into a work of art. The RTX 3080 Ti operates at 350 watts (stock settings) under full load, and luckily the design elements discussed above dissipate this thermal load without issue. The card doesn't thermal throttle under these conditions, staying around 74-78 °C with room to spare. That heat has to go somewhere, though, so a good case with plenty of airflow is highly suggested.
Included with every Founders Edition is the Micro-Fit 3.0 to dual 8-pin PEG adapter. NVIDIA has made the 12-pin connector publicly available for anyone to use, although it is clear by now that mass adoption will never happen; I can't think of one partner card that used this new connector. One concern is the warranty card, which states that the warranty may become void if any other adapter is used, and I understand why that worries people. My only advice for those who want to use something else is to make sure the wires are 18AWG, or you might be in for a surprise. SeaSonic and Corsair both sell (or sold) a direct PSU to Micro-Fit 3.0 cable. The SeaSonic cable I am using is rated for 9A / 16AWG, supporting 645 watts in total, well past the maximum 400 watts this RTX 3080 Ti allows. Honestly, if the connector weren't in the middle of the frame, I would prefer it over a triple 8-pin PEG design.
The I/O section has undergone a radical change from previous generations due to the heatsink design. Below the large open area are the monitor display connections. The push for 4K content has been growing for a long time now and has reached mass adoption, so it is no surprise that 8K is next, as TV manufacturers are eager to sell you more things. While the push for higher image fidelity is welcome, I think it will be a long time before graphics cards can properly support that resolution, at least not without the help of AI upscaling.
8K and HDR are the biggest marketing pushes right now, but the bandwidth needed to display that content is nowhere close to being ready. Because the DisplayPorts found on all Ampere cards are version 1.4a rather than 2.0, PC gamers are left behind on the highest possible refresh rates and image quality. Those waiting for 4K refresh rates above 144Hz or 8K 60Hz gaming must instead use the single HDMI 2.1 connector. Do not think HDMI 2.1 is the full solution either, as this revision has its own bandwidth limits: for example, 8K output must fall back on 4:2:0 chroma subsampling or rely on Display Stream Compression (DSC) because of the limited cable bandwidth. Those are only two examples of the current limitations.
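A back-of-the-envelope calculation shows why DisplayPort 1.4a runs out of headroom. This sketch ignores blanking overhead, and the effective link rates used here are the commonly quoted payload figures after line encoding (assumptions of this example):

```python
# Uncompressed video bandwidth = width x height x refresh x bits
# per pixel. Link rates below are effective payload figures:
# DP 1.4a HBR3 after 8b/10b encoding, HDMI 2.1 FRL after 16b/18b.
DP_1_4A_GBPS = 25.92
HDMI_2_1_GBPS = 42.67

def video_gbps(width, height, hz, bits_per_pixel=30):
    """Raw bandwidth for 10-bit RGB (30 bpp), ignoring blanking."""
    return width * height * hz * bits_per_pixel / 1e9

need = video_gbps(3840, 2160, 144)   # 4K at 144Hz, 10-bit color
print(f"4K 144Hz 10-bit needs ~{need:.1f} Gbps "
      f"(DP 1.4a carries {DP_1_4A_GBPS} Gbps)")
```

Even before blanking overhead, the raw stream exceeds what DP 1.4a can carry, which is exactly where DSC or chroma subsampling has to step in.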