Intel Unveils Meteor Lake Architecture: Intel 4 Heralds the Disaggregated Future of Mobile CPUs
by Gavin Bonshor on September 19, 2023 11:35 AM ESTDuring the opening keynote at Intel's Innovation event in San Jose, Chief Executive Officer Pat Gelsinger unveiled a score of details about the upcoming Meteor Lake client platform. Intel's Meteor Lake marks the beginning of a new era for the chipmaker, as they move away from the chaotic Intel 7 node and go into a rollout of their Foveros 3D packaging with EUV lithography for their upcoming client mobile platform. Meteor Lake uses a tiled, disaggregated chiplet architecture for its client-centric processors for the first time, changing the very nature of Intel's consumer chips going forward. And, according to Intel, all of these changes have allowed them to bring some significant advancements to the mobile market.
Intel's first chiplet-based consumer CPU breaks up the common functions of a modern CPU into four individual tiles: compute, graphics, SoC, and an I/O tile. Within the makeup of the compute tile is a new pair of cores, a P-core named Redwood Cove and a new E-core called Crestmont. Both these cores promise IPC gains over their previous counterparts, but perhaps the most interesting inclusion is a new type of E-core embedded directly into the SoC tile, which Intel calls 'Low Power Island.' These new LP E-cores are designed with the idea that light workloads and processes can be taken off the more power-hungry compute tile and offloaded onto a more efficient and lower-powered tile altogether. Other major additions include a first-for-Intel Neural Processing Unit (NPU), which sits within the SoC tile and is designed to bring on-chip AI capabilities for workloads and inferencing, paving the way for the future.
With Meteor Lake, Intel is aiming to put themselves in a more competitive position within the mobile market, with notable improvements to compute core hierarchy, Intel's Xe-LPG Arc-based graphics tile looking to bolster integrated graphics capabilities, and an NPU that adds various AI advantages. Meteor Lake also sets the scene for Intel and modular disaggregation, with Foveros 3D packaging set to become a mainstay of Intel's processor roadmap for the future, with the Intel 4 process making its debut and acting as a stepping stone to what will become Intel's next mainstay node throughout its fabs, Intel 3.
Intel Meteor Lake: Intel 4 Using Foveros 3D Packaging
Intel's Meteor Lake architecture is not just another iteration in the long line of processor advancements; according to the company it's a revolutionary leap forward. Eloquently put by the executive VP and GM of Intel's Client Computing Group (CCG), Michelle Johnston Holhaus, at Intel's Tech Tour 2023 in Penang, Malaysia, she remarked that Intel had reached an inflection point in their client roadmap. Unveiling more details about Meteor Lake at the Intel Technology Tour in Malaysia, the architecture is an advancement from its current client processors regarding performance as we advance through Intel's '5 nodes in 4 years' roadmap.
Meteor Lake is built upon Intel's disaggregated architecture pushed through by Foveros packaging. This is designed to optimize both performance and energy efficiency. The architecture itself consists of four unique and distinct tiles connected via Intel's Foveros 3D packaging technology. This includes the compute tile, which is built on Intel 4, while the graphics tile is built on TSMC's N5 node. The other two tiles Intel implements within Meteor Lake are the SoC tile which acts as the central hub through the embedded NOC. This is the first time Intel has used a Network-on-Chip (NOC) on their client processors, which is a simplified approach to the NOC on their existing Agilex FPGAs. While on Agilex, the NOC is individualized into different NoC targets and switches within the substructure of the NoC, on Meteor Lake, it directly connects to the I/O fabric through the IoC, which then goes into the I/O fabric. The NOC itself is directly connected to the graphics tile, compute tile, and other components within the SoC.
This modular approach allows for a ground-up and scalable power management architecture that supports disaggregation, enabling each tile to function independently. This disaggregated design prioritizes performance by negating bandwidth bottlenecking through things like the I/O on a monolithic design and targets enhanced power efficiency. Perhaps the most notable element of disaggregation is that Intel can select specific silicon processes for each tile and isn't limited to one process node. On top of the power efficiency and package area benefits of a tiled architecture, it is cheaper for Intel to manufacture CPUs with fewer masks through EUV, but it allows Intel to scale out new IP into future products while keeping the same base, which is another cost-saving factor (for Intel).
Compared to mobile Raptor Lake, which was done using Multi-Chip Packaging (MCP), Meteor Lake uses Foveros BGA packaging and offers low-power die-to-die interconnects, which Intel confirmed has a small power penalty of between 0.15 and 0.3 picojoules (pJ) communicating from tile to tile. Some of the advantages of Foveros include better customizability through tiling, which allows Intel to manufacture chips and implement specific tiles and IP depending on the grade of the chip, etc., low power with more I/O, or high-end tiles with all the latest gadgets and gizmos. With the Intel 7 node not being as viable as they would have hoped, Intel promises higher wafer yield on Intel 4, which uses less wafer space for logic-based silicon.
Power management is done using a scalable power management system that supports the independent functioning of each tile. Coordination between multiple power management controllers (PMC) and system software is designed to be optimized for various workloads. Intel's Meteor Lake architecture also introduces a new scalable fabric to improve energy efficiency and extend bandwidth in areas that have been previously bottlenecked, such as I/O.
Touching more on the power controllers within the Meteor Lake architecture, Intel has integrated independent Power Management Controllers within each of the tiles. As part of the disaggregation of using Foveros, each tile needs to be independently power managed, and using PMCs on the NOC, the I/O fabric, as well as each tile allows power management to be agnostic depending on the number of the cores on each package.
Meteor Lake itself represents a monumental architectural shift, not just a mere incremental update, as it represents the most significant architectural transformation in client processors in four decades. This is because it's the first client processor to be made using chiplets instead of a monolithic design. The architecture is designed to be the cornerstone of Intel's strategy to push PC innovation for the next decade. Looking at some of the finer details of Intel's Meteor Lake architecture, it uses Intel's Foveros packaging technology, which uses 3D chip stacking to tackle the pitfalls of traditional 2D chip layouts.
As we can see from the above disclosure from Intel at Hot Chips 2023, the top and bottom layers have bumps for interconnecting each die together. Using the Foveros FDI packaging offers a low-voltage complementary metal oxide semiconductor (CMOS) interface, which means the power circuitry can be run with lower voltages and, thus, with a lower power envelope. Another benefit of FDI is both synchronous and asynchronous signaling, which means signal transmission can handle fully duplexed data blocks.
Constructing a Meteor Lake SoC includes a package substrate, which is the foundation for the base tile to sit onto, which uses the Foveros Die Interconnect (FDI). The base tile has a 36µm die to die pitch with metal layering and a 0.15 to 0.3 pJ operating power at 2 GHz; this may fluctuate or vary depending on voltages, amps, and frequency. As it's a base tile that isn't an active chiplet itself, its only function is to serve as a base for all of the various logic, and metal layers for the chiplets to be placed upon.
The makeup of Intel's Meteor Lake architecture uses four distinct tiles to create a Meteor Lake CPU. This includes a Compute, SoC, GPU, and an I/O tile, all with distinct uses, capabilities, and flexibility in regard to IP. Power management also gets a revamp. With disaggregation through Intel's Foveros packaging technology, each tile in Meteor Lake requires its own power management. Intel's solution is a hierarchical power management system that uses Power Management Controllers on the NoC, IO fabric, and each individual tile.
Over the next few pages, we'll give an insight into each of the four tiles, what each tile brings to the table, and more about the various technologies driving innovation through Meteor Lake.
107 Comments
View All Comments
PeachNCream - Thursday, September 21, 2023 - link
Nice trolling lemur! You landed like an entire page of nerd rage this time. You're a credit to your profession and if I could give you an award for whipping dead website readers into a frenzy (including regulars who have seen you do this for years now) I would. Congrats! 10/10 would enjoy again.IUU - Thursday, September 28, 2023 - link
Intel does not need to do anything about its architecture to to match or surpass m3. It just needs to build its cpus on a similar node. Which is not happening anytime soon, thus perpetuating the illusion of efficiency of apple cpus.Two things more. First it is hilarious to compare the prowess of Intel on designing cpus to that of Apple. Apple has long time "building" machines like a glorified Dell borrowing cpus from IBM or Intel and only recently understood the scale and effort needed to design your silicon by improving on ARM designs.
Secondly, it is misguided to say that if a cpu needs 10 times more wattage on the same node to achieve 2 or 3 times the performance is less efficient. This is not how physics works . If Intel built their cpus on N3 of tsmc they would be 2 or 3 times faster best case scenario. Wattage does not scale linearly with performance. This is the same as saying that a car that has 10 times the power would be 10 times faster. Lololol.
Apple designs good cpus recently , but all the hype about its efficiency is just hype. Even if we assume the design is totally coming from Apple , which it doe not, being a very good modification at best, it does not even build its nodes. By large its efficiency is TSMC efficiency. If it were not for TSMC Apple would be non existent on the performance charts.
Silma - Tuesday, September 19, 2023 - link
TLDR:- Intel 4 < TSMC N6
- To not be late, Intel 3 must arrive within 3 months,which is highly doubtful, since Intel 4 isn't even shipping yet
- I assume Intel 3 < TSMC N6, otherwise, why bother enriching the competition?
- Parts of the new tech stack looks promising, but Intel refrains from any real performance claims, or any comparison with offerings from AMD or Apple.
- Did Intel announce another architecture for desktop computers, probably more similar to that of AMD, e.g. perhaps many performance tiles plus one cache tile?
Drumsticks - Tuesday, September 19, 2023 - link
Maybe. Or maybe TSMC6 is cheaper, and Intel doesn't need the power savings or area savings of I4 over TSMC6 for what the non-compute tiles need to accomplish. It's not exactly uncommon to see the SoC / IO tile on a lower node, doesn't AMD do the same thing?Roy2002 - Tuesday, September 19, 2023 - link
Intel 4 and 3 are basically the same with the same device density as 3 is enhanced 4. I assume it has slightly higher density value than TSMC 5nm and performance is slightly better. Let's see.kwohlt - Tuesday, September 19, 2023 - link
Intel 4 is not library complete. It can't be used for the SoC tile.sutamatamasu - Tuesday, September 19, 2023 - link
I wonder if current processor have an dedicated NPU, then what the heck happen with GNA?It still in there or they're remove it?
Exotica - Tuesday, September 19, 2023 - link
Intel should've either implemented TB5 in Meteor Lake or waited until after Meteor Lake shipped to announce TB5. Because as cool and impressive as meteor lake seems, for some of us, it's already obsolete in that it makes no sense to buy a TB4 laptop/PC and instead wait on TB5 silicon to hit the market.FWhitTrampoline - Tuesday, September 19, 2023 - link
Why use TB4 or USB4/40Gbs and have to deal with the extra latency and bandwidth robbing overhead compared to PCI-SIG's OCuLink that's just pure PCIe signalling delivered over an external OCuLink Cable. OCuLink and PCIe requires no extra protocol encapsulation and encoding/decoding steps at the PCIe link stage so that's lower latency there compared to USB4/TB4 and later generations that have to have extra encoding/decoding of any PCIe protocol packets to send that out over TB4/USB4. And for external GPUs 4 lanes of PCIe 4.0 connectivity can provide up to 64Gbs of bandwidth over an OCuLink port/cable and OCuLonk ports can be 8 PCIe lanes and wider there.Once can obtain an M.2/NVMe slot to OCuLink adapter and get an external OCuLink connection of up to 64Gbs as long as the M.2 is 4, PCIe 4.0 lanes wide and no specialized controller chip required on the MB to drive that. And GPD on their Handhelds offers a dedicated OCuLiink port and an external portable eGPU that supports OCuLink or USB4/40Gbs-TB interfacing. TB5 and USB4-V2 will take years to be adopted whereas OCuLink is just PCIe 3.0/4.0 there delivered over an external cable.
Exotica - Tuesday, September 19, 2023 - link
Unlike thunderbolt, Occulink doesn't have hotplugging, meaning your device must be connected at cold boot. Not so good for external storage needs.