Even as 2024 marks the rapid arrival of the AI PC (and an AI button on the new Windows keyboard), the introduction of AI acceleration into Intel PC processors was not a foregone conclusion.
“It was hotly debated,” says Tim Wilson, who led the system-on-chip (SoC) development for Intel Core Ultra, code-named Meteor Lake. The question a few years back, he explains, was, “Why dedicate so much silicon to this NPU? AI is amorphous – we don’t know how they’ll use it.”
Today, there’s plenty for AI to help you with. “To see how that landed [amid] the AI explosion was really serendipitous,” Wilson says. “To remember those conversations and then see the payoff – it’s very gratifying.”
Part of the reason Wilson and team were able to introduce new AI functionality for Intel Core Ultra was a complete rethink in the way a processor is made. Rather than building one big chip using a traditional monolithic design, they went for a more efficient tile-based architecture.
More Complex Means Less Complicated (No, Really)
This isn’t the first tile-based design from Intel, nor is Intel the only company to do it, but existing Intel deployments target servers and high-end desktops. Core Ultra delivers disaggregated design to mainstream client devices.
“The same forces that were driving our explorations of disaggregated design in Lakefield and Ponte Vecchio were being felt in client as well,” Wilson explains. Lakefield was launched back in 2020 and targeted thin-and-light laptops, as well as foldable and dual-screen PCs. Ponte Vecchio goes in the opposite direction, powering data centers and supercomputers around the world. Both leverage a chiplet design to deliver performance and power efficiency.
“The technology is mature enough and the need is there. We wanted to bring it to mainstream and at the same time bring in major new capabilities like AI.”
While this was not uncharted territory, the approach was new.
“We’ve taken chips and stacked them before, combining isolated chips. For Meteor Lake, we broke things apart in ways we haven’t before. It was a blank sheet program,” he says. “We wanted to leverage disaggregation strategically and for the benefit of the product. Not just ‘Can we do it?’ But ‘Why would we do it?’”
This meant methodically reimagining all the tiles to create a new modular design. Wilson says it’s a complexity that breeds simplicity. Over the past 20 years, chips have expanded to include more – more cores, then more functionality in the uncore, then memory, graphics and display capabilities. Today, SoCs can have roughly 100 intellectual property (IP) blocks, and optimizing them all simultaneously on a single piece of silicon becomes increasingly difficult.
“It forces simplification at the tile level. Breaking the tiles apart means you can choose the right transistor for the right tile – high-performance transistors for the CPU, high density for the GPU and low power for the SoC blocks,” he says. “The tiles can be easily swapped, adapting chip capabilities to different requirements. A new, scalable fabric means all blocks within the SoC can get full memory bandwidth when needed.”
Intel's Foveros packaging technology allows for a tightly packed and stacked chip that uses a blend of processes. Rather than co-optimizing everything for the same process, you can mix and match – cutting-edge Intel 4 for the cores and an established, battle-tested process technology for IO, which doesn’t need all that performance.
When 20 Minds Are Better than One
Rethinking chip architecture from the ground up is no small feat. Intel Core Ultra is Intel’s biggest SoC design change in 40 years, and Wilson, a 22-year Intel veteran, says it took a unique team with a massive breadth of expertise.
“We brought in a larger and more diverse team to build this product,” Wilson says. “It required us to work more closely with several IP teams, and there were more groups involved in design and development than we’ve had before. You’d be hard-pressed to find a more inclusive team, I think.”
With so many firsts, day-to-day problem-solving often meant wading into uncharted territory. In addition to a brand-new disaggregated design wrapped in Foveros packaging, there’s the new 3D performance hybrid architecture (three kinds of processor cores), Intel Arc graphics integrated right into the SoC and significant improvements in power efficiency. Throw in the fact that this platform is the first built on the latest Intel 4 process and it’s easy to believe Wilson when he says there was no shortage of fresh challenges, and that the work itself was inspiring to be a part of.
“It’s really satisfying to spend a lot of time with a lot of people and make an impact in the industry, and to then see that work reflected in the products that my friends and family are using every day,” he says. (Wilson is now working with an entirely new team on the server side – he transitioned to a new role as General Manager of XEG after Intel Core Ultra launched.)
A project like Intel Core Ultra takes years and, to some extent, an ability to predict the future – like the decision to incorporate an NPU for AI tasks. These kinds of debates are the beating heart of innovation, though, and Wilson says that the team tackled the execution the same way they did the architecture: They broke it down.
“Ultimately it came down to relationships. The smartest person in the world is going to fail if they haven’t built good relationships, and Meteor Lake’s success is in the trust between the people and teams,” he says. “A great idea might be attributed to one person, but it’s the culmination of 20 different people working through a problem.”