> and what you're talking about in term of "just the hardware needed" is so far beyond consumer hardware it's funny.
Not as big a gap as you might expect. The M4 chip (as used in iPads) has "28 billion transistors built using a second-generation 3-nanometer technology" - https://www.apple.com/newsroom/2024/05/apple-introduces-m4-c...
Apple don't sell M4 chips separately, but the best guess I've seen puts the cost to Apple in the $120 range. Certainly it can't exceed the list price of the cheapest Mac mini containing one (US$599).
As bleeding-edge tech, those are expensive transistors, but even so, ten such chips would have enough transistors for 256 GB of RAM plus all the compute each chip already has. Actual RAM is much cheaper than that.
10x the price of the cheapest Mac mini is $6k… but you could then save about $400 by getting a Mac Studio with 256 GB of RAM instead. The maximum power consumption (of that desktop, configured with double that: 512 GB of RAM) is 270 W, and that's an absolute upper bound: if you're doing inference you're probably using only a fraction of the compute, because inference is RAM limited, not compute limited.
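To make the arithmetic explicit, here's a quick back-of-envelope sketch in Python. The Mac Studio price and the 819 GB/s bandwidth figure are my assumptions (roughly the published M3 Ultra numbers); treat it as illustrative, not authoritative:

    # Back-of-envelope for the paragraph above.
    mac_mini_price = 599                # cheapest M4 Mac mini, USD list price
    ten_minis = 10 * mac_mini_price
    print(f"10x cheapest Mac mini: ${ten_minis:,}")  # $5,990, i.e. ~$6k

    mac_studio_256gb = 5_599            # assumed list price, 256 GB Mac Studio config
    print(f"Saving vs Mac Studio: ${ten_minis - mac_studio_256gb:,}")  # ~$400

    # Why inference is RAM limited: a dense model streams every weight from
    # memory once per generated token, so bandwidth caps tokens/second no
    # matter how much compute sits idle.
    bandwidth_gbps = 819                # assumed unified-memory bandwidth, GB/s
    model_bytes = 200e9                 # e.g. a ~200 GB quantised model in RAM
    print(f"Upper bound: {bandwidth_gbps * 1e9 / model_bytes:.1f} tokens/s")  # ~4.1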
That ~$6k is also very close to the price of this phone, which I think is a silly phone, but it's a phone, it exists, and it's this price, and that's all that matters: https://www.amazon.com/VERTU-IRONFLIP-Unlocked-Smartphone-Fo...
But regardless, I'd like to emphasise that these chips aren't even trying to be good at LLMs. Not even Apple's Neural Engine is really trying to do that: NPUs (like the Neural Engine) are all focused on what AI looked like it was going to be several years back, not what current models are actually like today. (And given how fast this moves, it's not even clear to me that they were wrong, or that they should be optimised for what current models look like today.)
> Fairly sure most Windows laptops are still sold with 8GB RAM and basically 512MB of VRAM (probably less), practically the same thing for Android phones.
That sounds exceptionally low even for budget laptops. The only examples I can find are the sub-€300 budget range and refurbished devices.
For phones, the limit is that there's currently very little market for this, not that it's an inconceivable challenge. Same deal as thermal imaging cameras in this regard; despite the niche market, those are sold as consumer devices.
> I was thinking of building a local LLM powered search engine but basically nobody outside of a handful of techies would be able to run it + their regular software.
This has been standard database tooling for a while already: vector databases, RAG, etc.
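For illustration, here's a minimal local semantic-search core of the kind those tools package up. The model name is an assumption (any small sentence-embedding model would do), and it runs comfortably on a laptop CPU:

    # Embed documents once, then rank by cosine similarity against the query.
    # This is the core that vector databases / RAG pipelines productise.
    import numpy as np
    from sentence_transformers import SentenceTransformer

    docs = [
        "The Mac Studio supports up to 512 GB of unified memory.",
        "Thermal imaging cameras are sold as consumer devices.",
        "NPUs were designed around pre-LLM neural network workloads.",
    ]

    model = SentenceTransformer("all-MiniLM-L6-v2")  # small (~80 MB) embedding model
    doc_vecs = model.encode(docs, normalize_embeddings=True)

    def search(query, k=2):
        # Unit-length vectors, so a dot product is cosine similarity.
        q = model.encode([query], normalize_embeddings=True)[0]
        scores = doc_vecs @ q
        return [(docs[i], float(scores[i])) for i in np.argsort(-scores)[:k]]

    print(search("how much RAM can a desktop Mac have?"))

A real product would persist the vectors (SQLite, a vector DB, etc.) and hand the top hits to a local LLM, but the ranking core really is about this small.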