EACH year at Intel Architecture Day, Intel's architects look forward to disclosing the architectural innovations they have been working on for imminent products.
In an editorial, Raja M. Koduri, Intel Corporation's senior vice president and general manager of the Accelerated Computing Systems and Graphics Group, said that this year's event, the company's third, was the most exciting one yet.
Unveiling the biggest shifts in Intel architectures in a generation, the event offered the first in-depth look at Alder Lake, Intel's first performance hybrid architecture, with two new generations of x86 cores and the intelligent Intel Thread Director workload scheduler; Sapphire Rapids, Intel's new standard-setting data centre architecture, with its new Performance-core and various accelerator engines; a new discrete gaming graphics processing unit (GPU) architecture; new infrastructure processing units (IPUs); and Ponte Vecchio, its tour-de-force data centre GPU architecture with Intel's highest ever compute density.
"These architectural breakthroughs set the stage for our next era of leadership products, starting soon with Alder Lake. The breakthroughs we disclosed also demonstrate how architecture will satisfy the crushing demand for more compute performance as workloads from the desktop to the data centre become larger, more complex and more diverse than ever.
"Our architects are working hard, combining Intel's unique and rich selection of scalar, vector, matrix and spatial compute engines, to create hybrid computing architectures that deliver non-linear gains on our customers' most demanding workloads."
Here are the highlights of the presentation:
Efficient-core: A highly scalable x86 microarchitecture for addressing compute requirements across the entire spectrum of Intel customers' needs, from low-power mobile applications to many-core microservices. Compared with Skylake, Intel's most prolific CPU microarchitecture, the Efficient-core is said to deliver 40 per cent more single-threaded performance at the same power, or the same performance while consuming less than 40 per cent of the power. For throughput performance, four Efficient-cores deliver 80 per cent more performance than two Skylake cores running four threads while still consuming less power, or the same throughput performance while consuming 80 per cent less power.
Performance-core: This x86 core is not only the highest performing CPU core Intel has ever built, but it also delivers a step function in CPU architecture performance that will drive the next decade of compute. It was designed as a wider, deeper and smarter architecture to expose more parallelism, increase execution parallelism, reduce latency and increase general purpose performance. It also helps support large data and large code footprint applications. Performance-core provides a geomean improvement of about 19 per cent across a wide range of workloads over Intel's current 11th Gen Intel Core architecture (Cypress Cove core) at the same frequency.
Targeted at data centre processors and the evolving trends in machine learning, Performance-core brings dedicated hardware, including Intel's new Advanced Matrix Extensions (AMX), to perform matrix multiplication operations for an order-of-magnitude performance gain: a nearly eightfold increase in artificial intelligence acceleration. This is architected for software ease of use, leveraging the x86 programming model.
Intel Thread Director: Intel's unique approach to scheduling was developed to ensure Efficient-cores and Performance-cores work seamlessly together, dynamically and intelligently assigning workloads from the start and optimising the system for maximum real-world performance and efficiency. With intelligence built directly into the core, Intel Thread Director works seamlessly with the operating system to place the right thread on the right core at the right time.
Alder Lake: Reinventing the multicore architecture, Alder Lake will be Intel's first performance hybrid architecture with the new Intel Thread Director. This is Intel's most intelligent client system-on-chip (SoC) architecture, featuring a combination of Efficient-cores and Performance-cores, scaling from ultra-mobile to desktop, and leading the industry transition with multiple industry-leading I/O and memory technologies. Products based on Alder Lake will begin shipping this year.
Xe HPG and Alchemist SoC: A new discrete graphics microarchitecture is designed to scale to enthusiast-class performance for gaming and creation workloads. The Xe HPG microarchitecture features a new Xe-core, a compute-focused programmable and scalable element, and full support for DirectX 12 Ultimate. New matrix engines inside the Xe-cores (referred to as Xe Matrix eXtensions, XMX) accelerate artificial intelligence workloads such as XeSS, a novel upscaling technology that enables high-performance and high-fidelity gaming. Xe HPG-based Alchemist SoCs (formerly code-named DG2) will be coming to market in the first quarter of 2022 under the new Intel Arc brand.
Sapphire Rapids: Combining Intel's Performance-cores with new accelerator engines, Sapphire Rapids sets the standard for next-generation data centre processors. At the heart of Sapphire Rapids is a tiled, modular SoC architecture that delivers significant scalability while still maintaining the benefits of a monolithic CPU interface, thanks to Intel's EMIB multi-die interconnect packaging technology and advanced mesh architecture.
Infrastructure Processing Unit: Intel introduced Mount Evans, its first dedicated ASIC-based IPU, along with Oak Springs Canyon, a new FPGA-based IPU reference platform. With an Intel IPU-based architecture, cloud service providers (CSPs) can maximise data centre revenue by offloading infrastructure tasks from CPUs to IPUs, which allows them to rent 100 per cent of their server CPUs to customers.
Xe HPC, Ponte Vecchio: The most complex SoC Intel has ever built and a great example of the company's IDM 2.0 strategy come to life, Ponte Vecchio takes advantage of several advanced semiconductor processes, Intel's EMIB technology and its Foveros 3D packaging. With this product, Intel is bringing to life its moon-shot project, the 100 billion-transistor device that delivers industry-leading FLOPs and compute density to accelerate artificial intelligence, high performance computing and advanced analytics workloads. At Architecture Day, Intel showed that its early Ponte Vecchio silicon is already demonstrating leadership performance, setting an industry record in both inference and training throughput on a popular AI benchmark. Its A0 silicon is already providing greater than 45 TFLOPS FP32 throughput, greater than 5 TBps memory fabric bandwidth and greater than 2 TBps connectivity bandwidth. Ponte Vecchio, as with Intel's other Xe architectures, will be enabled by oneAPI, the company's open, standards-based, cross-architecture and cross-vendor unified software stack.
"Looking back at just the past year, technology was at the heart of how we all communicated, worked, played and coped through the pandemic. Enormous computing power proved crucial. Looking ahead, we face a massive demand for compute – potentially a 1,000x need by 2025. That 1,000-times boost in four years is Moore's Law to the power of five," Koduri added.
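The arithmetic behind that comparison can be checked quickly. The sketch below assumes the common reading of Moore's Law as a doubling roughly every two years; that rate is an assumption here, as the quote itself does not define it. On that reading, four years of Moore's Law gives a fourfold gain, and raising that to the fifth power lands close to the 1,000-times figure:

```latex
\[
\underbrace{2^{4/2}}_{\text{Moore's Law over four years}} = 4,
\qquad
4^{5} = 2^{10} = 1024 \approx 1000 .
\]
```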
Intel's CEO, Pat Gelsinger, also an architect, stated at Architecture Day: "We face daunting compute challenges that can only be solved through revolutionary architectures and platforms … Our talented architects and engineers made possible all this technology magic".