Intel Architecture Day 2021 Announces Arc Branding for GPUs, XeSS, and Alder Lake CPU Details
Posted: 02:45PM
Author: Guest_Jim_*
This week Intel has been sharing a lot of information on future products at its Architecture Day 2021, including its upcoming discrete GPUs to challenge AMD's and NVIDIA's duopoly, and the Alder Lake CPU with hybrid architecture. Starting with the GPUs, Intel has decided to brand this product line as Arc, not to be confused with Ark, the company's database for all of its processor specifications. This brand name will cover the consumer hardware as well as related software and services. While we can expect more information on the products this year, unfortunately the release of these Xe HPG graphics cards will not be until Q1 2022.
In addition to the Arc branding, we also have the first codenames for the GPUs, with DG2 being renamed Alchemist. Its successors will be Battlemage, Celestial, and Druid, and while these names certainly evoke high fantasy games, they are also alphabetical, so we can all have some fun guessing code names for future letters. All of the Arc GPUs will support the DirectX 12 Ultimate features, including accelerated ray tracing, variable rate shading, and mesh shaders.
The Anandtech article provides more information than the Intel press release, such as the introduction of the Xe Core and Render Slice. Each Xe Core consists of 16 Vector Engines that can process 256 bits per cycle and 16 Matrix Engines that can handle 1024 bits per cycle, paired with the appropriate caches and possibly some other specialized math processors. These Matrix Engines are likely analogous to NVIDIA's Tensor Cores, but interestingly, as the Anandtech article points out, they could offer twice the throughput of the Tensor Cores, though the described vector throughput would match NVIDIA's Ampere architecture. It is the vector math capability that matters most to gaming though, so how this decision turns out in the end could be quite interesting.
Above the Xe Cores is the Render Slice. One Slice is made up of four Xe Cores, each with its own Ray Tracing Unit and texture sampler, along with geometry and rasterization frontends and two pixel backends. The way the RT Units work may prove interesting, as NVIDIA and AMD have taken different approaches on where to focus accelerating the ray tracing workload, which has an impact on performance and optimization potential. The Alchemist GPU will consist of eight Render Slices connected by a memory fabric, with that fabric containing an L2 cache.
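The unit counts described above multiply out as follows. This is a back-of-the-envelope sketch using only the figures quoted in this article. The assumption that a 256-bit vector engine handles eight 32-bit values per cycle (256 / 32) is mine, not an Intel statement, and the constant names are made up for illustration.

```python
# Figures quoted in the article for the Alchemist GPU.
RENDER_SLICES = 8
XE_CORES_PER_SLICE = 4
VECTOR_ENGINES_PER_CORE = 16
MATRIX_ENGINES_PER_CORE = 16
VECTOR_ENGINE_BITS = 256

# Assumption: one FP32 value occupies 32 bits of a vector engine's width.
FP32_BITS = 32

xe_cores = RENDER_SLICES * XE_CORES_PER_SLICE
vector_engines = xe_cores * VECTOR_ENGINES_PER_CORE
matrix_engines = xe_cores * MATRIX_ENGINES_PER_CORE
rt_units = xe_cores  # one Ray Tracing Unit per Xe Core
fp32_lanes = vector_engines * (VECTOR_ENGINE_BITS // FP32_BITS)

print(f"Xe Cores:       {xe_cores}")
print(f"Vector Engines: {vector_engines}")
print(f"Matrix Engines: {matrix_engines}")
print(f"RT Units:       {rt_units}")
print(f"FP32 lanes:     {fp32_lanes}")
```

Under those assumptions the full Alchemist configuration works out to 32 Xe Cores, 512 Vector Engines, and 4096 FP32 lanes.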
The Alchemist GPUs will be built on TSMC's N6 process, which is an improved version of the N7 process AMD has been using for its latest CPUs and GPUs. One of the differences between these two processes is the replacement of additional DUV layers with EUV, though without changing design rules or tools. Performance between N6 and N7 should be very similar, but N6 is simpler to work with and offers somewhat higher densities as well.
Also shared, though I have had a hard time finding an Intel press release or similar on it, was information on XeSS, the AI-assisted super-sampling technology to compete with NVIDIA's DLSS and AMD's FSR. Of those two already available technologies, XeSS appears to be more similar to DLSS in function as it will also apply AI to assist in upscaling and uses temporal super samples. However, it is also like FSR as it will work on non-Intel GPUs, provided they support DP4a instructions, and will be open source. The way it will work is to first apply a jitter to the scenes so sampling positions are not always at the center of each pixel, collect multiple frames to then feed into its AI-assisted upscaling system, ultimately producing a final image. On Arc GPUs it will use Xe Matrix Extension (XMX) cores to accelerate the AI work, but GPUs lacking these but with the previously mentioned DP4a instructions will still be able to run the upscaling system. The Arc GPUs should be faster, but it is not a vendor-locked technology. It will be interesting to see how AMD and NVIDIA may respond to this, as well as the larger video game industry as both this and FSR can be enabled for a wide variety of GPUs.
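For anyone unfamiliar with DP4a, the instruction takes two 32-bit words, treats each as four packed 8-bit integers, multiplies them pairwise, and adds the sum of the products to a 32-bit accumulator. The Python below only emulates that arithmetic to show what the instruction computes. The function names are illustrative, it assumes the signed 8-bit variant, it does not wrap the accumulator at 32 bits the way hardware would, and it says nothing about how XeSS itself uses the instruction.

```python
import struct


def pack_int8x4(values):
    """Pack four signed 8-bit integers into one 32-bit word (little-endian)."""
    return struct.unpack("<I", struct.pack("<4b", *values))[0]


def unpack_int8x4(word):
    """Split a 32-bit word back into four signed 8-bit integers."""
    return struct.unpack("<4b", struct.pack("<I", word))


def dp4a(a_word, b_word, acc):
    """Emulate DP4a: 4-element int8 dot product added to an accumulator."""
    a = unpack_int8x4(a_word)
    b = unpack_int8x4(b_word)
    return acc + sum(x * y for x, y in zip(a, b))


a = pack_int8x4([1, 2, 3, 4])
b = pack_int8x4([5, 6, 7, 8])
print(dp4a(a, b, 10))  # 1*5 + 2*6 + 3*7 + 4*8 + 10
```

Because one instruction retires four multiply-adds, quantized neural network layers can run reasonably quickly even without dedicated matrix hardware, which is presumably why it serves as the fallback path.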
Turning to the CPU news now, Intel has shared the first in-depth look at Alder Lake, the company's first performance hybrid architecture. For quite some time now, mobile devices have been utilizing a hybrid design that pairs high performance cores with high efficiency cores. Much like boosting technologies on CPUs, the idea is to activate the high performance cores that drain more power when the task demands that extra performance or will be done fast enough to save on power, while running other tasks on the high efficiency cores to reduce power usage. The Performance or P-core architecture in Alder Lake is named Golden Cove while the Efficiency or E-core architecture is named Gracemont. Besides their power-performance profiles, a difference between these is the P-cores offer multithreading while the E-cores do not. The designs targeting different markets will each use eight E-cores that will be combined with two P-cores for low power mobile designs, six P-cores for higher performance mobile designs, and eight P-cores for the desktop market.
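To put those configurations into core and thread counts, here is a quick sketch. It assumes the P-core multithreading is two threads per core, which the article does not state explicitly, and the configuration labels are just shorthand for the three markets mentioned above.

```python
THREADS_PER_P_CORE = 2  # assumption: two-way multithreading on P-cores
THREADS_PER_E_CORE = 1  # E-cores do not offer multithreading

# (P-cores, E-cores) per market, as described in the article.
CONFIGS = {
    "low-power mobile": (2, 8),
    "higher-performance mobile": (6, 8),
    "desktop": (8, 8),
}


def totals(p_cores, e_cores):
    """Return (total cores, total hardware threads) for a hybrid design."""
    cores = p_cores + e_cores
    threads = p_cores * THREADS_PER_P_CORE + e_cores * THREADS_PER_E_CORE
    return cores, threads


for name, (p, e) in CONFIGS.items():
    cores, threads = totals(p, e)
    print(f"{name}: {p}P + {e}E = {cores} cores, {threads} threads")
```

With that assumption the three designs come out to 10 cores and 12 threads, 14 cores and 20 threads, and 16 cores and 24 threads respectively.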
According to Intel, the Gracemont microarchitecture can provide 40% greater single-threaded performance at the same power as the Skylake architecture, or the same performance at 40% less power. Four E-cores combined offer 80% more performance while taking less power than just two Skylake cores. The Golden Cove P-cores offer about 19% more performance than the current Cypress Cove architecture of 11th Gen processors when run at the same frequency. These P-cores also feature the new Advanced Matrix Extensions (AMX) to significantly speed up matrix multiplication, an important upgrade for work in machine learning.
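The two Gracemont claims describe different points on the power curve, so they translate into different performance-per-watt gains. The sketch below is just the arithmetic on Intel's quoted figures, with Skylake normalized to 1.0 for both performance and power. The function name is mine, and none of this is a measurement.

```python
def perf_per_watt_gain(perf_ratio, power_ratio):
    """Relative performance-per-watt versus a baseline of 1.0 / 1.0."""
    return perf_ratio / power_ratio


# 40% more performance at the same power.
same_power = perf_per_watt_gain(1.40, 1.00)
# Same performance at 40% less power.
same_perf = perf_per_watt_gain(1.00, 0.60)

print(f"Same power: {same_power:.2f}x performance per watt")
print(f"Same perf:  {same_perf:.2f}x performance per watt")
```

Cutting power by 40% at equal performance works out to roughly 1.67x the efficiency, a bigger gain than the 1.40x from raising performance at equal power.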
Though I compared the hybrid design to traditional boosting earlier, the two are quite different technologies, and a hybrid architecture is much more complex as logic is required to place tasks onto the appropriate cores. To that end, Intel has created its Thread Director to work with the operating system so the correct core is used from the start and no time is wasted moving a task. By working with Microsoft, the Windows 11 thread scheduler will be able to make its decisions based on more information than the Windows 10 scheduler has. For example, the new scheduler can understand performance modes, instruction sets, and a notion of priority, so tasks that need more performance can bump others off of threads. Anandtech lists the loading order from Intel as the P-cores first, then all of the E-cores, and only then the logical/SMT threads of the P-cores. The Intel Thread Director will be an embedded microcontroller that feeds information from the CPU back to Windows 11 so it can move threads as appropriate.
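The loading order described above can be illustrated with a toy model. To be clear, this is not Intel's Thread Director or the Windows 11 scheduler, both of which weigh far more information. It only shows where the Nth busy thread would land if the order were followed strictly, and the function names and tuple layout are invented for the example.

```python
def placement_order(p_cores, e_cores):
    """Hardware threads in fill order: P-cores, then E-cores, then P-core SMT."""
    order = [("P", i, "primary") for i in range(p_cores)]
    order += [("E", i, "primary") for i in range(e_cores)]
    order += [("P", i, "smt") for i in range(p_cores)]
    return order


def place(n_tasks, p_cores=8, e_cores=8):
    """Return the hardware threads that n_tasks busy threads would occupy."""
    return placement_order(p_cores, e_cores)[:n_tasks]


# On the 8P + 8E desktop design, ten busy threads fill every P-core
# and spill onto two E-cores before any P-core runs a second thread.
for slot in place(10):
    print(slot)
```

In this model the SMT siblings of the P-cores are only touched once the seventeenth thread arrives on the desktop part.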
Besides the hybrid architecture and new core designs, Alder Lake will also support DDR5-4800, DDR4-3200, LPDDR5-5200, and LPDDR4X-4266 memory, allowing it to work in a variety of platforms. From the CPU it will offer 16 lanes of PCIe 5.0 and four lanes of PCIe 4.0, while the chipset will offer 12 lanes of PCIe 4.0 and 16 of PCIe 3.0. Alder Lake-based products are to be shipping later this year.
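For a sense of scale on those PCIe lane counts, the sketch below estimates the raw bandwidth of each group of lanes. The per-lane signalling rates (8, 16, and 32 GT/s) and the 128b/130b encoding come from the PCIe generations themselves rather than from Intel's presentation, and the results are theoretical one-direction figures before protocol overhead, not what any device will actually achieve.

```python
# Per-lane signalling rate in GT/s for each PCIe generation.
GT_PER_S = {"3.0": 8, "4.0": 16, "5.0": 32}
# PCIe 3.0 and later send 128 payload bits in every 130 bits on the wire.
ENCODING_EFFICIENCY = 128 / 130


def link_bandwidth_gb_s(generation, lanes):
    """Approximate one-direction bandwidth in GB/s for a group of lanes."""
    per_lane = GT_PER_S[generation] * ENCODING_EFFICIENCY / 8  # bits -> bytes
    return per_lane * lanes


# (source, generation, lanes) as listed in the article.
LINKS = [
    ("CPU", "5.0", 16),
    ("CPU", "4.0", 4),
    ("chipset", "4.0", 12),
    ("chipset", "3.0", 16),
]

for source, gen, lanes in LINKS:
    bandwidth = link_bandwidth_gb_s(gen, lanes)
    print(f"{source}: PCIe {gen} x{lanes} ~ {bandwidth:.1f} GB/s")
```

The 16 lanes of PCIe 5.0 come to about 63 GB/s in each direction, double what the same lane count delivers on PCIe 4.0.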
Source: Intel [1] (Arc Press Release), [2] (Intel Arc Webpage), Anandtech [1] (A Sneak Peek at the Xe-HPG GPU Architecture), [2] (Alder Lake, Golden Cove, and Gracemont Detailed), and Techspot (XeSS Information)
