Intel Discloses New Details On Meteor Lake VPU Block, Lays Out Vision For Client AI
by Ryan Smith on May 29, 2023 9:00 AM EST - Posted in
- CPUs
- Intel
- Machine Learning
- Movidius
- VPU
- NPU
- Meteor Lake
While the first systems based on Intel's forthcoming Meteor Lake (14th Gen Core) platform are still at least a few months out – and thus just a bit too far out to show off at Computex – Intel is already laying the groundwork for Meteor Lake's launch. For this year's show, in what's very quickly become an AI-centric event, Intel is using Computex to lay out their vision of client-side AI inference for the next generation of systems. This includes some new disclosures about the AI processing hardware that will be in Intel's Meteor Lake silicon, as well as what Intel expects OSes and software developers will do with the new capabilities.
AI, of course, has quickly become the operative buzzword of the technology industry over the last several months, especially following the public introduction of ChatGPT and the explosion of interest in what's now being termed "Generative AI". So, as in the early adoption stages of other major new compute technologies, hardware and software vendors alike are still in the process of figuring out what can be done with this new technology, and what hardware designs best power it. And behind all of that… let's just say there's a lot of potential revenue waiting in the wings for the companies that succeed in this new AI race.
Intel for its part is no stranger to AI hardware, though it's certainly not a field that normally receives top billing at a company best known for its CPUs and fabs (and in that order). Intel's stable of wholly-owned subsidiaries in this space includes Movidius, who makes low-power vision processing units (VPUs), and Habana Labs, responsible for the Gaudi family of high-end deep learning accelerators. But even within Intel's rank-and-file client products, the company has been including some very basic, ultra-low-power AI-adjacent hardware in the form of their Gaussian & Neural Accelerator (GNA) block for audio processing, which has been part of the Core family since the Ice Lake architecture.
Still, in 2023 the winds are clearly blowing in the direction of adding even more AI hardware at every level, from the client to the server. So for Computex Intel is disclosing a bit more on their AI efforts for Meteor Lake.
Meteor Lake: SoC Tile Includes a Movidius-derived VPU For Low-Power AI Inference
On the hardware side of matters, the big disclosure is that, as we have long suspected, Intel is baking more powerful AI hardware into the disaggregated Meteor Lake SoC. Previously documented in some Intel presentations as the "XPU" block inside of Meteor Lake's SoC tile (the middle tile), Intel is now confirming that this XPU is a full AI acceleration block.
Specifically, the block is derived from Movidius’s third-generation Vision Processing Unit (VPU) design, and going forward, is aptly being identified by Intel as a VPU.
The amount of technical detail Intel is offering on the VPU block for Computex is limited – we don't have any performance figures or an idea of how much of the SoC tile's die space it occupies. But Movidius's most recent VPU, the Myriad X, incorporated a fairly flexible neural compute engine that gives the VPU its neural network capabilities. The engine on the Myriad X is rated for 1 TOPS of throughput, though almost six years and several process nodes later, Intel is almost certainly aiming far higher for Meteor Lake.
As it's part of the Meteor Lake SoC tile, the VPU will be present in all Meteor Lake SKUs. Intel will not be using it as a feature differentiator, à la ECC or even integrated graphics; it will instead be a baseline feature available on all Meteor Lake-based parts.
The purpose of the VPU, in turn, is to provide a third option for AI processing, sitting between the other two. For high-performance needs there is the integrated GPU, whose sizable array of ALUs can provide relatively copious amounts of processing for the matrix math behind neural networks. Meanwhile, the CPU will remain the processor of choice for simple, low-latency workloads that either can't afford to wait for the VPU to be initialized, or where the size of the workload doesn't justify the effort. That leaves the VPU in the middle, as a dedicated but low-power AI accelerator to be used for sustained AI workloads that don't need the performance (and the power hit) of the GPU.
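To make that division of labor concrete, here is a toy dispatch policy in Python that mirrors the tiering Intel describes. The flags and the policy itself are purely our own illustration – Intel has not published any scheduling heuristics for Meteor Lake.

```python
from enum import Enum

class Device(Enum):
    CPU = "cpu"  # lowest startup latency, always available
    GPU = "gpu"  # highest throughput, highest power draw
    VPU = "vpu"  # dedicated low-power inference block

def pick_device(latency_critical: bool, sustained: bool, heavy: bool) -> Device:
    """Toy policy mirroring the CPU/GPU/VPU tiering described above."""
    if latency_critical and not heavy:
        # Too small or too urgent to justify spinning up an accelerator.
        return Device.CPU
    if heavy:
        # Burst performance wins despite the power cost.
        return Device.GPU
    if sustained:
        # E.g. camera effects running for the length of a video call.
        return Device.VPU
    return Device.CPU

# Background segmentation during a long video call lands on the VPU:
print(pick_device(latency_critical=False, sustained=True, heavy=False))
```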
It's also worth noting that, while not explicitly shown in Intel's diagrams, the GNA block will also remain in Meteor Lake. Its specialty is ultra-low-power operation, so it is still needed for things like wake-on-voice, as well as for compatibility with existing GNA-enabled software.
Past that, there's a lot we don't know about the Meteor Lake VPU. The fact that it's even called a VPU and includes Movidius tech implies a design focused on computer vision, similar to Movidius's discrete VPUs. If that's the case, then the Meteor Lake VPU may excel at processing visual workloads, but lack performance and flexibility in other areas. And while today's disclosure from Intel quickly brings to mind questions about how this block will compare in performance and functionality to AMD's Xilinx-derived Ryzen AI block, those are questions that will have to wait for another day.
For now, at least, Intel feels that they are well positioned to lead the AI transformation in the client space. And they want the world – developers and users alike – to know.
The Software Side: What to Do With AI?
As noted in the introduction, hardware is only one half of the equation when it comes to AI-accelerated software. Even more important than what to run it on is what to do with it, and that is something that Intel and its software partners are still working on.
At a most basic level, including a VPU provides additional, energy-efficient performance for executing tasks that are already more-or-less AI driven today on some platforms, such as dynamic noise suppression and background segmentation. In that respect, the inclusion of a VPU is catching up with smartphone-class SoCs, where the likes of Apple's Neural Engine and Qualcomm's Hexagon NPU provide similar acceleration today.
But Intel has their eyes on a much larger prize. They want both to move what are currently server AI workloads to the edge – in other words, to move AI processing to the client – and to foster entirely new AI workloads.
What those will be, at this point, remains to be seen. Microsoft laid out some of its own ideas last week at its annual Build conference, including the announcement of a Copilot function for Windows 11. And the OS vendor is also laying some groundwork for developers with its Open Neural Network Exchange (ONNX) Runtime.
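For developers, targeting whatever accelerator happens to be present generally means going through an abstraction layer like ONNX Runtime rather than programming the VPU directly. Below is a minimal sketch of how that might look. The provider names are real ONNX Runtime identifiers, but how (or whether) Meteor Lake's VPU will surface through them – assumed here to be via the OpenVINO execution provider with a hypothetical "NPU" device type – is our speculation, not anything Intel or Microsoft has confirmed.

```python
import numpy as np
import onnxruntime as ort

# See which execution providers this ONNX Runtime build actually offers.
available = ort.get_available_providers()

# Preference order mirrors the tiering described earlier: try a dedicated
# accelerator first, then the GPU, then fall back to the CPU. The "NPU"
# device_type for the OpenVINO provider is hypothetical.
preferred = [
    ("OpenVINOExecutionProvider", {"device_type": "NPU"}),  # assumed VPU path
    ("DmlExecutionProvider", {}),                           # DirectML on the GPU
    ("CPUExecutionProvider", {}),                           # universal fallback
]
providers = [(name, opts) for name, opts in preferred if name in available]

session = ort.InferenceSession("model.onnx", providers=providers)

# Run one inference; the input name and shape depend on the model.
dummy = np.zeros((1, 3, 224, 224), dtype=np.float32)
outputs = session.run(None, {"input": dummy})
```

The appeal of this approach for Intel is clear enough: if applications express their AI workloads through a common runtime, shifting them onto the VPU requires no per-application changes.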
To some degree, the entire tech world is at a point where it has a new hammer, and everything is starting to look a whole lot like a nail. Intel, for its part, is certainly not removed from that, as even today's disclosure is more aspirational than it is about specific benefits on the software side of matters. But these truly are the early days for AI, and no one has a good feel for what can or cannot be done. Certainly, there are a few nails that can be hammered.
To that end, Intel is looking to foster a "build it and they will come" ecosystem for AI in the PC space: provide the hardware across the board, work with Microsoft to provide the software tools and APIs needed to use the hardware, and see what new experiences developers can come up with – or alternatively, what workloads they can shift onto the VPU to reduce power usage.
Ultimately, Intel is expecting that AI-based workloads are going to transform the PC user experience. And whether or not that entirely happens, it's enough to warrant the inclusion of hardware for the task in their next generation of CPUs.
Comments
hansmuff - Wednesday, May 31, 2023
Processing of audio and video. Look at NVIDIA Broadcast software. Great enhanced mic recording and background blur/replacement that exceed the capabilities of any "meeting" software I've used (like Teams or Cisco Meetings). Surely, that's a limited field, but for people working from home it's a really nice enhancement.
jmlocatelli - Wednesday, May 31, 2023
Or for example, ask Excel to build a worksheet based on a natural language description. No more use of the insane amount of buttons. It will allow users to be a lot more productive.
hansmuff - Wednesday, May 31, 2023
I really hate to disagree with you on that because it's a great use, but I think cloud services are going to want their cut of that transaction. Yeah, Windows will record your voice, but it'll be sent to an MS cloud service, and the solution sent back to you. For a price.
nandnandnand - Thursday, June 1, 2023
Wait until we see how many square millimeters are devoted to the VPU. Damn near nothing, I expect. Laptops have power constraints just like phones.
As far as the applications go, they exist. More will exist after the inference hardware is widespread for a few years.
tipoo - Friday, June 2, 2023
Where you don't want to peg your CPU cores to 100% when dedicated silicon could trivially handle it at low power?
jjjag - Friday, June 2, 2023
Missing the point - to move some of these capabilities TO the edge devices. ALL edge devices. Keeping them in the DC means high access latencies and unrealistic response times. We expect our PCs to do the same things as our cell phones, but much better and faster since they are bigger and more powerful. The use cases are literally detailed in the slides above. Your avatar's face in a game can show real-time expressions from your camera. The possibilities are endless; once the hardware is in place, people will find a cool way to use it. Also, nobody wants their laptop running at 100W – the battery life is not acceptable. Dedicated accelerators are more power efficient.