
I assume it's still x86-64?

What actually makes it an AI platform? Some tight integration of an Intel Arc GPU, similar to the Apple M-series processors?

They claim 2-5x performance for some AI workloads. But aren't they still limited by memory? The same limitation as always in consumer hardware?

I don't think it matters much whether you're limited by an Nvidia GPU with at most ~16 GB or by some new Intel processor with similar memory.

Nice to have more options though. Kinda wish the Intel Arc GPU would be developed into an alternative for self-hosted LLMs. 70B models can be quite good but are still difficult / slow to self-host.
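To put rough numbers on the memory point: a back-of-the-envelope sketch, assuming weight storage dominates and adding a guessed 20% overhead for KV cache and activations (that fraction and both function names are illustrative assumptions, not measurements of any real runtime).

```python
def weight_memory_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits(params_billion: float, bits_per_weight: float,
         device_gib: float, overhead_fraction: float = 0.2) -> bool:
    """Rough check: weights plus an assumed overhead must fit in device memory.

    overhead_fraction is a guess standing in for KV cache and activations.
    """
    needed = weight_memory_gib(params_billion, bits_per_weight) * (1 + overhead_fraction)
    return needed <= device_gib


if __name__ == "__main__":
    for bits in (16, 8, 4):
        gib = weight_memory_gib(70, bits)
        print(f"70B at {bits}-bit: ~{gib:.1f} GiB of weights, "
              f"fits in 16 GiB: {fits(70, bits, 16)}")
```

Under these assumptions even a 4-bit 70B model needs roughly twice the memory of a 16 GB card, which is why the weights end up split across devices or spilled to system RAM.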



These processors have an NPU (Neural Processing Unit), which is supposed to accelerate some small local neural networks. Nvidia RTX GPUs have much more powerful NPUs, so it's more about laptops without a discrete GPU.


And as far as I can see, it's a total waste of silicon. Anything running in it will anyway be so underpowered that it doesn't matter. It'd be better to dedicate the transistors to the GPU.

The latest Ryzen mobile CPU line didn't improve performance compared to its predecessor (the integrated GPU is actually worse), and I think the NPU is to blame.


If you ask NVIDIA, inference should always run on the GPU. If you ask anybody else designing chips for consumer devices, they say there's a benefit to having a low-power NPU that's separate from the GPU.


Okay, yeah, and those manufacturers’ opinions are both obvious reflections of market position, independent of the merits. What do people who actually run inference say?

(Also, the NPUs usually aren't any more separate from the GPU than tensor cores are separate from an Nvidia GPU, they are integrated with the CPU and iGPU.)


If you're running an LLM there's a benefit in shifting prompt pre-processing to the NPU. More generally, anything that's memory-throughput limited should stay on the GPU, while the NPU can aid compute-limited tasks to at least some extent.

The general problem with NPUs for memory-limited tasks is either that the throughput available to them is too low to begin with, or that they're usually constrained to formats that will require wasteful padding/dequantizing when read (at least for newer models) whereas a GPU just does that in local registers.
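The compute-limited vs. memory-limited split above can be sketched with a simple roofline-style estimate. This assumes each weight is read once per forward pass and costs about 2 FLOPs per token, and the device numbers in the example (40 TFLOPS, 100 GB/s) are made up for illustration, not specs of any real NPU or GPU.

```python
def arithmetic_intensity(tokens_in_batch: int, bytes_per_weight: float) -> float:
    """FLOPs performed per byte of weights read.

    Assumes ~2 FLOPs (multiply + add) per weight per token, and that the
    weights are streamed from memory once for the whole batch of tokens.
    """
    return 2.0 * tokens_in_batch / bytes_per_weight


def bound_by(tokens_in_batch: int, bytes_per_weight: float,
             peak_flops: float, mem_bandwidth: float) -> str:
    """Return 'compute' or 'memory' depending on which limit is hit first."""
    machine_balance = peak_flops / mem_bandwidth  # FLOPs the device can do per byte moved
    intensity = arithmetic_intensity(tokens_in_batch, bytes_per_weight)
    return "compute" if intensity >= machine_balance else "memory"


if __name__ == "__main__":
    # Hypothetical device: 40 TFLOPS peak, 100 GB/s memory bandwidth, 4-bit weights.
    peak, bw, bpw = 40e12, 100e9, 0.5
    print("prompt processing (512 tokens at once):", bound_by(512, bpw, peak, bw))
    print("token generation (1 token at a time):", bound_by(1, bpw, peak, bw))
```

With these made-up numbers, prompt processing comes out compute-bound and token-by-token generation memory-bound, which matches the argument for offloading only the former to an NPU.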


Depends on how big the NPU is and how much power/memory the inference model needs.


But like... what, for example? As a normal Windows PC user, what kind of software can I run that will benefit from that NPU at all?


We don't ask that question. In reality everything is done in the cloud. Maybe they package some camera app that applies snapchat-like filters with NPUs, but that's about the extent of it.

Jokes aside: they really seem to do some things like live captions and translations. Pretty sure you could also do these things on the iGPU or CPU at a higher power draw.

https://blogs.windows.com/windows-insider/2024/12/18/releasi...



No, for sure, but afaik you get all of those features even if you don't have an NPU. And even if you do have one, it's unclear to me which of them actually use the NPU for extra power and which just run on the CPU. Like, the thing that is missing for me is "this is the thing that you can only do on a Copilot PC and it's not available otherwise".


Try searching for something like "My mouse pointer is too small"

https://x.com/rfleury/status/2007964012923994364


Incredible. 100% typical Microsoft though. I'm a "veteran" Windows/Xbox developer and none of this surprises me.


They're going to find a way to accelerate the Windows start menu with it.


Oh boy, instead of building an efficient index or optimizing the start menu or its built-in web browser, they're adding more power usage to make the computer randomly guess what I want returned since they still can't figure out how to return search results of what you actually typed.


God I hope so


It is another way Microsoft has tried to cater to OEMs as a means of bringing PC sales back to the glory days of exponential growth, especially under the Copilot+ PC branding, which is nowadays still siloed into Windows on ARM.

In fairness, NPUs can use fewer hardware resources than a general-purpose discrete GPU and are thus better suited to laptop workloads. However, we all know that if a discrete GPU is available, there is no technical reason not to use it, assuming enough local memory is available.

Ah, and NPUs are yet another thing that GNU/Linux folks will have to reverse engineer, as on Windows/Android/Apple OSes they are exposed via OS APIs, and there is as yet no industry standard for them.



That is not an industry standard that works across vendors in an OS and GPU agnostic way, which is why Khronos has started a new standard effort.

https://www.khronos.org/events/building-the-foundation-for-a...


Windows Recall?


1) tick AI checkbox 2) ??? 3) profit


Are we calling tensor cores NPUs now?


How did we end up with Tensor Cores and a Tensor SoC from two different companies?


The same way we ended up with both Groq and Grok branded LLMs

Maybe these people aren't that creative....



