
I’ve lived in my rental for ~15 years now (rent control) - to be honest, if I’d known when I moved in how long I’d be here, I’d have paid for some upgrades. It’s not equity, but I do still live here.

Do you plan on moving any time soon?

The best time to plant a tree was ten years ago. The second best time is now.


Oh I know. Started tackling a couple things already, and it does make a difference.

The two things I really can’t do on my own but absolutely would are replacing the windows and putting in solar.


This article repeatedly cites revenue growth numbers as an indicator of Nvidia and Apple’s relative health, which is a very particular way of looking at things. By way of another one, Apple had $416Bn in revenue, which was a 6% increase from the prior year, or about $25Bn, or about all of Nvidia’s revenue in 2023. Apple’s had slow growth in the last 4 years following a big bump during the early pandemic; their 5 year revenue growth, though, is still $140Bn, or about $10Bn more than Nvidia’s 2025 revenues. Nvidia has indeed grown like a monster in the last couple years - 35Bn increase from 23-24 and 70Bn increase from 24-25. Those numbers would be 8% and 16% increases for Apple respectively, which I’m sure would make the company a deeply uninteresting slow-growth story compared to new upstarts.
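To spell out that arithmetic, here's a quick back-of-envelope sketch in Python using the approximate figures above (nothing here beyond what's already stated):

    # Rough figures from above, in billions of USD
    apple_revenue = 416          # Apple's annual revenue
    nvidia_gain_23_24 = 35       # Nvidia's YoY revenue increase, '23 -> '24
    nvidia_gain_24_25 = 70       # Nvidia's YoY revenue increase, '24 -> '25

    # Nvidia's absolute gains expressed as growth rates on Apple's revenue base
    for gain in (nvidia_gain_23_24, nvidia_gain_24_25):
        print(f"${gain}Bn on a ${apple_revenue}Bn base = {gain / apple_revenue:.1%}")
    # -> about 8.4% and 16.8%: enormous in absolute dollars, "slow growth" as a percentage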

I get why the numbers are presented the way they are, but it always gets weird when talking about companies of Apple’s size - percent increases that underwhelm Wall Street correspond to raw numbers that most companies would sacrifice their CEO to a volcano to attain, and sales flops in Apple’s portfolio mean they only sold enough product to supply double-digit percentages of the US population.


I agree. People confuse relative with absolute numbers.

And ironically, Apple acts like a small contender the moment they feel some heat, after what seemed like a decade of relatively easy wins everywhere.

So finally there is a company that gives Apple some much needed heat.

That’s why, in absolute terms, I side with NVIDIA, the smaller contender in this case.

PS: I had one key moment in my career when I was at Google and a speaker mentioned the unit “NBU”. It stands for next billion units.

That was ten years ago, and it started my mental journey into large-scale manufacturing and production, including all the processes involved.

The fascination never left. It was a mind-bender for me, and I totally get why people miss just how large everything is.

At Google it was just a milestone expected to be hit - not once but, as the word "next" indicates, multiple times.

It has been mind-blowing and eye-opening to me ever since, and a fantastic inspiration when thinking about software, development, and marketing.


Did Google ever ship a billion units of any hardware? I can't think of anything substantial.

Apple hit 3 billion iPhones in mid-2025.


How did you get into large scale manufacturing and production? Was it a career switch? Downsides? It too fascinates me. Any book recommendations?

It’s also strange because I highly doubt Google has manufactured a billion physical units of anything. Most of their consumer hardware is designed and built by partners, including Pixel.

>> I highly doubt Google has manufactured a billion physical units of anything

Technically, there are billions of transistors in every tensor chip manufactured by Google


Even all Pixel and Nexus models combined must be far short of a billion. Apple just hit 3 billion iPhones last year.

I think the parent comment said "mental journey", not a real one, although it would be good to get more insights.

Waiting on OP's response too - fascinating.

US tech companies aren’t built to be like 3M is/was, with their hands in infinite pies.

The giant conglomerates in Asia seem more able to do it.

Google has somewhat tried, but then famously kills almost everything, even things that could be successful as smaller businesses.


I think there's something about both the myth of the unicorn and of the hero founder/CEO in tech that forces a push towards legibility and easy narratives for a company - it means that, to a greater degree than other industries, large tech companies are a storytelling exercise, and "giant corporate blob that sprawls into everything" isn't a sexy story, nor is "consistent 3% YoY gains," even when that's translating into "we added the GDP of a medium-sized country to our cash pile again this year."

Every time a CEO or company board says "focus," an interesting product line loses its wings.


It's because of the storytelling needed for Wall Street. It's the only way to get sky-high revenue multiples - selling a dream - because if you're a conglomerate all you can do is sell the P&L; it's like selling an index. If you have a business division that does exceedingly well compared to the rest, you make more money by spinning it off.

I think Asian companies are much less dependent on public markets and have strong private control (chaebols in South Korea, for example - Samsung, LG, Hyundai, etc.).

If you look at US companies that are under "family control" you might see a similar sprawl - Cargill, Koch; I'd even put Berkshire in this class. Even though it's not "family controlled" in the literal sense, it's still associated with two men and not a professional CEO.


I think this is more a result of big US tech being extremely productive (within their main competency).

Yeah, it is insane what areas and products companies like Mitsubishi, Samsung, IHI or even Suntory are involved in.

Think Google has done a pretty good job at that actually! Consider their various enterprises that weren't killed:

* Search/ads

* YouTube

* Android/Play, Chrome, Maps

* Google Cloud, Workspace

* Pixel, Nest, Fitbit

* Waymo, DeepMind

* Google Fiber

They're not a conglomerate like Alibaba but they're far from a one-trick pony, either :)


Because shares are no longer about investing in a company that is making healthy margins and has a solid business, that will pay you a decent dividend in return for your investment.

Shares are a short-term speculative gamble; you buy them in the hope that the price will rise and then you can sell them for a profit. Sometimes the gap between these two events is measured in milliseconds.

So the only thing that matters to Wall St is growth. If the company is growing then its price will probably rise. If it's not, it won't. Current size is unimportant. Current earnings are unimportant (unless they are used to fund growth). Nvidia is sexy, Apple is not, despite all the things you say (which are true).


> Nvidia has indeed grown like a monster in the last couple years - 35Bn increase from 23-24 and 70Bn increase from 24-25.

Worryingly for Nvidia, Apple is producing products that people want and that are provenly useful, so the vast majority of its value is solid, and the revenue stream for the fabs Apple uses is solid too.

Nvidia, on the other hand, is producing tangible things of value - GPUs - but they are now largely used in unproven technologies (when stacked against the lofty claims) that barely more than a few seem to want, so Nvidia's revenue stream seems flimsy at best in the AI boom.

The only proven revenue stream Nvidia has (had?) is GPUs for display and visualisation (gaming, graphics, and non-AI non-crypto compute, etc.)


Calling AI an unproven market is a wild statement. My mother and every employed person around me is using AI backed by Nvidia GPUs in some way or the other on a daily basis.

The AI market is running on VC and hype fumes right now, costing way more than it brings in. Add to that the circular financing, well, statements, in the hundreds of billions of dollars that are treated as contracts instead of empty air, and compare that to Apple, where the money is actually there and profitable, and the comparison makes sense.

It may still be profitable for TSMC to use NVidia to funnel all the juicy VC game money to themselves, but the statement about proven vs unproven revenue stream is true. It'll be gone with the hype, unless something truly market changing comes along quickly, not the incremental change so far. People are not ready to pay the full costs of AI, it's that simple right now.


Unproven in the sense that it'll become 'super intelligent' and so on.

For a statistical word salad generator that is _generally_ coherent, sure it's proven.

But other claims, such as replacing all customer service roles [1], to the lament of customers [2] - and the fact that a number of companies are now re-hiring staff they sacked because 'AI would make them redundant' [3] - still make me strongly assert that generative AI isn't the trillion-dollar industry it is trying to market itself as.

Sure, it has a few tricks and helps in a number of cases, and is therefore useful in those cases, but it isn't the 'earth-shattering mass-human-redundancy' technology that colossally stupid amounts of circular investment are being poured into - which, I argue, puts fabs that are mostly, if not solely, dedicating themselves to AI in a precarious position when the AI bubble collapses.

[1] https://www.cxtoday.com/contact-center/openai-ceo-sam-altman...

[2] https://www.thestreet.com/technology/salesforce-ai-faces-bac...

[3] https://finance.yahoo.com/news/companies-quietly-rehiring-wo...


It might matter that Nvidia sells graphics cards and Apple sells computers and computer-like devices with cases and peripherals and displays and software and services. TSMC is responsible for a much larger proportion of Nvidia's product than Apple's.

I'm not even sure how to compare revenue, whether relative or absolute, when Nvidia is deeply involved in multiple deals that have all the signs of circular financing scams.

I am of the opinion that Nvidia's hit the wall with their current architecture in the same way that Intel has historically with its various architectures - their current generation's power and cooling requirements are requiring the construction of entirely new datacenters with different architectures, which is going to blow out the economics on inference (GPU + datacenter + power plant + nuclear fusion research division + lobbying for datacenter land + water rights + ...).

The story with Intel around these times was usually that AMD or Cyrix or ARM or Apple or someone else would come around with a new architecture that was a clear generation jump past Intel's, and most importantly seemed to break the thermal and power ceilings of the Intel generation (at which point Intel typically fired their chip design group, hired everyone from AMD or whoever, and came out with Core or whatever). Nvidia effectively has no competition, or hasn't had any - nobody's actually broken the CUDA moat, so neither Intel nor AMD nor anyone else is really competing for the datacenter space, so they haven't faced any actual competitive pressure against things like power draws in the multi-kilowatt range for the Blackwells.

The reason this matters is that LLMs are incredibly nifty, often useful tools that are not AGI and also seem to be hitting a scaling wall, and the only way to make the economics of, eg, a Blackwell-powered datacenter make sense is to assume that the entire economy is going to be running on it, as opposed to some useful tools and some improved interfaces. Otherwise, the investment numbers just don't make sense - the gap between what we see on the ground (how LLMs are used, and the real but limited value they add) and the actual full cost of providing that service with a brand new single-purpose "AI datacenter" is just too great.

So this is a press release, but any time I see something that looks like an actual new hardware architecture for inference, and especially one that doesn't require building a new building or solving nuclear fusion, I'll take it as a good sign. I like LLMs, I've gotten a lot of value out of them, but nothing about the industry's finances adds up right now.


> I am of the opinion that Nvidia's hit the wall with their current architecture

Based on what?

Their measured performance on the things people care about keeps going up, and their software stack keeps getting better and unlocking more performance on existing hardware.

Inference tests: https://inferencemax.semianalysis.com/

Training tests: https://www.lightly.ai/blog/nvidia-b200-vs-h100

https://newsletter.semianalysis.com/p/mi300x-vs-h100-vs-h200... (only H100, but vs AMD)

> but nothing about the industry's finances add up right now

Is that based just on the HN "it is lots of money so it can't possibly make sense" wisdom? Because the released numbers seem to indicate that inference providers and Anthropic are doing pretty well, and that OpenAI is really only losing money on inference because of the free ChatGPT usage.

Further, I'm sure most people heard the mention of an unnamed enterprise paying Anthropic $5000/month per developer on inference (!!). If a company is that cost-insensitive, is there any reason why Anthropic would bother to subsidize them?


> Their measured performance on things people care about keep going up, and their software stack keeps getting better and unlocking more performance on existing hardware

I'm more concerned about fully-loaded dollars per token - including datacenter and power costs - than about "does the chip go faster." If Nvidia couldn't make the chip go faster, there wouldn't be any debate; the question right now is "what is the cost of those improvements?" I don't have the answer to that, but the numbers going around for the costs of new datacenters don't give me a lot of optimism.

> Is that based just on the HN "it is lots of money so it can't possibly make sense" wisdom?

OpenAI has $1.15T in spend commitments over the next 10 years: https://tomtunguz.com/openai-hardware-spending-2025-2035/

As far as revenue, the released numbers from nearly anyone in this space are questionable - they're not public companies, we don't actually get to see inside the box. Torture the numbers right and they'll tell you anything you want to hear. What we _do_ get to see is, eg, Anthropic raising billions of dollars every ~3 months or so over the course of 2025. Maybe they're just that ambitious, but that's the kind of thing that makes me nervous.


> OpenAI has $1.15T in spend commitments over the next 10 years

Yes, but those aren't contracted commitments, and we know some of them are equity swaps. For example "Microsoft ($250B Azure commitment)" from the footnote is an unknown amount of actual cash.

And I think it's fair to point out the other information in your link "OpenAI projects a 48% gross profit margin in 2025, improving to 70% by 2029."


> "OpenAI projects a 48% gross profit margin in 2025, improving to 70% by 2029."

OpenAI can project whatever they want, they're not public.


They still have shareholders who can sue for misinformation.

Private companies do have a license to lie to their shareholders.


Sounds like the railway boom... I mean, bond scams.

> Yes, but those aren't contracted commitments, and we know some of them are equity swaps.

It's worse than not contracted. Nvidia said in their earnings call that their OpenAI commitment was "maybe".


The fact that there's an incestuous circle between OpenAI, Microsoft, Nvidia, AMD, etc., where they provide massive promises to each other for future business, is nothing short of hilarious.

The economics of the entire setup are laughable and it's obvious that it's a massive bubble. The profit that'd need to be delivered to justify the current valuations is far beyond what is actually realistic.

What moat does OpenAI have? I'd argue basically none. They make extremely lofty forecasts and project an image of crazy growth opportunities, but is that going to ever survive the bubble popping?


I still don't really understand this "circle" issue. If I fix your bathroom and in return you make me a new table, is that an incestuous circle? Haven't we both just exchanged value?

The circle allows you to put an arbitrary "price" on those services. You could say that the bathroom and table are $100 each, so your combined work was $200. Or you could claim that each of you did $1M work. Without actual money flowing in/out of your circle, your claims aren't tethered to reality.

You don’t think real money is changing hands when Microsoft buys Nvidia GPUs?

What about when Nvidia sells GPUs to a client and then buys 10% of their shares?

Their shares will be based on the client's valuation, which in public markets is externally priced. If not in public markets it is murkier, but will be grounded in some sort of reality so Nvidia gets the right amount of the company.

My point was that's an indirect subsidy. NVIDIA is selling at a discount to prop up their clients.

It's a soft version of money printing basically. These firms are clearly inflating each other's valuations by making huge promises of future business to each other. Naively, one would look at the headlines and draw the conclusion that much more money is going to flow into AI in the near future.

Of course, a rational investor looks at this and discounts the fact that most of those promises are predicated on insane growth that has no grounding in reality.

However, there are plenty of greedy or irrational investors, whose recklessness will affect everyone, not just them.


For Nvidia shares: converting cash into shares in a speculative business while guaranteeing increasing demand for your product is a pretty good idea, and probably doesn't have any downsides.

For the AI company being bought: I wouldn't trust these shares or valuations, because the money invested is going on GPUs and back to Nvidia.


GPUs are supply-constrained and prices aren't declining that fast, so why do you expect the token price to decrease? I think the supply issue will resolve in 1-2 years, as they now have a good prediction of how fast the market will grow.

Nvidia is literally selling GPUs at a 90% profit margin and still everything is out of stock, which is practically unheard of.


> Further, I'm sure most people heard the mention of an unnamed enterprise paying Anthropic $5000/month per developer on inference

I haven't and I'd like to know more.


>Further, I'm sure most people heard the mention of an unnamed enterprise paying Anthropic $5000/month per developer on inference

Companies have wasted more money on dumber things so spending isn't a good measure.

And what about the countless other AI companies? Anthropic has one of the top models for coding, so that's like saying there wasn't a problem before the dot-com bubble because Amazon was doing fine.

The real effect of AI is measured in the rising profits of the customers of those AI companies; otherwise you're just looking at the shovel sellers.


> Is that based just on the HN "it is lots of money so it can't possibly make sense" wisdom?

I mean, the amount of money invested across just a handful of AI companies is currently staggering, and their respective revenues are nowhere near where they need to be. That’s a valid reason to be skeptical. How many times have we seen speculative investment of this magnitude? It’s shifting entire municipal and state economies in the US.

OpenAI alone is currently projected to burn over $100 billion by what? 2028 or 2029? Forgot what I read the other day. Tens of billions a year. That is a hell of a gamble by investors.


The flip side is that these companies seem to be capacity constrained (although that is hard to confirm). If you assume the labs are capacity constrained, which seems plausible, then building more capacity could pay off by allowing labs to serve more customers and increase revenue per customer.

This means the bigger questions are whether you believe the labs are compute constrained, and whether you believe more capacity would allow them to drive actual revenue. I think there is a decent chance of this being true, and under this reality the investments make more sense. I can especially believe this as we see higher-cost products like Claude Code grow rapidly with much higher token usage per user.

This all hinges on demand materialising when capacity increases, and margins being good enough on that demand to get a good ROI. But that seems like an easier bet for investors to grapple with than trying to compare future investment in capacity with today's revenue, which doesn't capture the whole picture.


I am not someone who would ever be considered an expert on factories/manufacturing of any kind, but my (insanely basic) understanding is that typically a “factory” making whatever widgets or doodads is outputting at a profit or has a clear path to profitability in order to pay off a loan/investment. They have debt, but they’re moving towards the black in a concrete, relatively predictable way - no one speculates on a factory anywhere near the degree they do with AI companies currently. If said factory’s output is maxed and they’re still not making money, then it’s a losing investment and they wouldn’t expand.

Basically, it strikes me as not really apples to apples.


Consensus seems to be that the labs are profitable on inference. They are only losing money on training and free users.

The competition requiring them to spend that money on training and free users does complicate things. But when you just look at it from an inference perspective, looking at these data centres like token factories makes sense. I would definitely pay more to get faster inference of Opus 4.5, for example.

This is also not wholly dissimilar to other industries where companies spend heavily on R&D while running profitable manufacturing. Pharma, semiconductor, and hardware companies like Samsung or Apple all do this. The unusual part with AI labs is the ratio and the uncertainty, but that's a difference of degree, not kind.


> But when you just look at it from an inference perspective, looking at these data centres like token factories makes sense.

So if you ignore the majority of the costs, then it makes sense.

Opus 4.5 was released on November 25, 2025. That is less than 2 months ago. When they stop training new models, then we can forget about training costs.


I'm not taking a side here - I don't know enough - but it's an interesting line of reasoning.

So I'll ask, how is that any different than fabs? From what I understand R&D is absurd and upgrading to a new node is even more absurd. The resulting chips sell for chump change on a per unit basis (analogous to tokens). But somehow it all works out.

Well, sort of. The bleeding edge companies kept dropping out until you could count them on one hand at this point.

At first glance it seems like the analogy might fit?


Someone else mentioned it elsewhere in this thread, and I believe this is the crux of the issue: this is all predicated on the actual end users finding enough benefit in LLM services to keep the gravy train going. It's irrelevant how scalable and profitable the shovel makers are; to keep this business afloat long term, the shovelers - ie the end users - have to make money using the shovels. Those expectations are currently ridiculously inflated, far beyond anything in the past.

Invariably, there's going to be a collapse in the hype, the bubble will burst, and an investment deleveraging will remove a lot of money from the space in a short period of time. The bigger the bubble, the more painful and less survivable this event will be.


Inference costs scale linearly with usage. R&D expenses do not.

That's not to mention that Dario Amodei has said that their models actually have a good return, even when accounting for training costs [0].

[0] https://youtu.be/GcqQ1ebBqkc?si=Vs2R4taIhj3uwIyj&t=1088


> Inference costs scale linearly with usage. R&D expenses do not.

Do we know this is true for AI?


It’s pretty much the definition of fixed costs versus variable costs.

You spend the same amount on R&D whether you have one hobbyist user or 90% market share.
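A toy illustration of that split - all of the numbers below are invented purely to show the shape of fixed vs. variable costs, not anyone's real economics:

    # Hypothetical lab economics: fixed R&D plus per-user inference cost
    R_AND_D = 2_000_000_000        # fixed: spent whether you have 1 user or 100M
    INFERENCE_COST_PER_USER = 40   # variable: scales roughly linearly with usage
    REVENUE_PER_USER = 100

    for users in (1_000_000, 10_000_000, 100_000_000):
        cost = R_AND_D + INFERENCE_COST_PER_USER * users
        margin = REVENUE_PER_USER * users - cost
        print(f"{users:>11,} users: R&D is {R_AND_D / cost:.0%} of total cost, "
              f"margin {margin / 1e9:+.1f}Bn")
    # R&D dominates total cost at small scale and shrinks toward a rounding error
    # as usage grows, while the per-user inference margin stays positive throughout.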


Yes. R&D is guaranteed to fall as a percentage of costs eventually. The only question is when, and there is also a question of who is still solvent when that time comes. It is competition and an innovation race that keeps it so high, and it won't stay so high forever. Either rising revenues or falling competition will bring R&D costs down as a percentage of revenue at some point.

Yes, but eventually may be longer than the market can hold out. So far R&D expenses have skyrocketed and it does not look like that will be changing anytime soon.

That's why it is a bet, and not a sure thing.

>Consensus seems to be that the labs are profitable on inference. They are only losing money on training and free users.

That sounds like “we’re profitable if you ignore our biggest expenses.” If they could be profitable now, we’d see at least a few companies just be profitable and stop the heavy expenses. My guess is it’s simply not the case or everyone’s trapped in a cycle where they are all required to keep spending too much to keep up and nobody wants to be the first to stop. Either way the outcome is the same.


This is just not true. Plenty of companies will remain unprofitable for as long as they can in the name of growth, market share, and beating their competition. At some point it will level out, but while they can still raise cheap capital and spend it to grow, they will.

OpenAI could put in ads tomorrow and make tons of money overnight. The only reason they don't is competition. But when they start to find it harder to raise capital to fund their growth, they will.


> I mean the amount of money invested across just a handful of AI companies is currently staggering and their respective revenues are no where near where they need to be. That’s a valid reason to be skeptical.

Yes and no. Some of it just claims to be "AI". Like the hyperscalers are building datacenters and ramping up, but not all of it is "AI". The crypto bros have rebadged their data centers as "AI".


> The crypto bros have rebadged their data centers into "AI"

That the previous unsustainable bubble is rebranding into the new one is maybe not the indicator of stability we should be hoping for.


> (at which point Intel typically fired their chip design group, hired everyone from AMD or whoever, and came out with Core or whatever)

Didn't the Core architecture come from the Intel Pentium M Israeli team? https://en.wikipedia.org/wiki/Intel_Core_(microarchitecture)...


Correct. Core came from Pentium M, which actually came from the Israeli team who took the Pentium 3 architecture, and coupled this with the best bits from the Pentium 4

Yeah, that bit was pure snark - point was Intel’s gotten caught resting on their laurels a couple times when their architectures get a little long in the tooth, and often it’s existential enough that the team that pulls them out of it isn’t the one that put them in it.

I think that's an overly reductive view of a very complicated problem space, with the benefit of hindsight.

If you wanted to make that point, Itanium or 64-bit/multi-core desktop processing would be better examples than Core.


Yes, and the newest Panther Lake too!

https://techtime.news/2025/10/10/intel-25/


> and the only way to make the economics of, eg, a Blackwell-powered datacenter make sense is to assume that the entire economy is going to be running on it, as opposed to some useful tools and some improved interfaces.

And I'm still convinced we're not paying real prices anywhere. Everyone is still trying to get market share, so prices are going to go up when this all needs to sustain itself. At that point, which use cases become too expensive, and does that shrink its applicability?


What about TPUs? They are more efficient than Nvidia GPUs, a huge amount of inference is done with them, and while they are not literally being sold to the public, the whole technology should be influencing Nvidia's next steps just like AMD influenced Intel.

TPUs can be more efficient, but are quite difficult to program for efficiently (difficult to saturate). That is why Google tends to sell TPU-services, rather than raw access to TPUs, so they can control the stack and get good utilization. GPUs are easier to work with.

I think the software side of the story is underestimated. Nvidia has a big moat there and huge community support.


My understanding is that all of Google's AI is trained and run on quite old but well-designed TPUs. For a while the issue was that developing these AI models still needed flexibility, and customised hardware like TPUs couldn't accommodate that.

Now that the model architecture has settled into something a bit more predictable, I wouldn't be surprised if we saw a little more specialisation in the hardware.


> The reason this matters is that LLMs are incredibly nifty often useful tools that are not AGI and also seem to be hitting a scaling wall

I don't know who needs to hear this, but the real breakthrough in AI that we have had is not LLMs, but generative AI. LLMs are just one specific case. Furthermore, we have hit absolutely no walls. Go download a model from Jan 2024, another from Jan 2025, and one from this year and compare. The improvement in how good they have gotten is exponential.


> exponential

Is this the second most abused English word (after 'literally')?

> a model from Jan 2024, another from Jan 2025 and one from this year

You literally can't tell whether the difference is 'exponential', quadratic, or whatever from three data points.

Plus, it's not my experience at all. Since DeepSeek, I haven't found that models one can run on consumer hardware have gotten much better.


I’ve heard “orders of magnitude” used more than once to mean 4-5 times

In binary 2x is one order of magnitude

exactly!

I've been wondering about this for quite a while now. Why does everybody automatically assume that I'm using the decimal system when saying "orders of magnitude"?!


I'd argue that 100% of all humans use the decimal system, most of the time. Maybe 1 to 5% of all humans use another system some of the time.

Anyway, there are 10 types of people, those who understand binary and those who don't.


Because, as xkcd 169 says, communicating badly and then acting smug when you're misunderstood is not cleverness. "Orders of magnitude" refers to a decimal system in the vast majority of uses (I must admit I have no concrete data on this, but I can find plenty of references to it being base-10 and only a suggestion that it could be something else).

Unless you've explicitly stated that you mean something else, people have no reason to think that you mean something else.


There is a lot of talking past each other when discussing LLM performance. The average person whose typical use case is asking ChatGPT how long they need to boil an egg for hasn't seen improvements for 18 months. Meanwhile if you're super into something like local models for example the tangible improvements are without exaggeration happening almost monthly.

Random trivia questions are answered much better in my case.

> The average person whose typical use case is asking ChatGPT how long they need to boil an egg for hasn't seen improvements for 18 months

I don’t think that’s true. I think both my mother and my mother-in-law would start to complain pretty quickly if they got pushed back to 4o. Change may have felt gradual, but I think that’s more a function of growing confidence in what they can expect the machine to do.

I also think “ask how long to boil an egg” is missing a lot here. Both use ChatGPT in place of Google for all sorts of shit these days, including plenty of stuff they shouldn’t (like: “will the city be doing garbage collection tomorrow?”). Both are pretty sharp women but neither is remotely technical.


>go download a model

GP was talking about commercially hosted LLMs running in datacenters, not free Chinese models.

Local is definitely still improving. That’s another reason the megacenter model (NVDA’s big line up forever plan) is either a financial catastrophe about to happen, or the biggest bailout ever.


GPT 5.2 is an incredible leap over 5.1 / 5

5.2 is great if you ask it engineering questions, or questions an engineer might ask. It is extremely mid, and actually worse than the o3/o4-era models, if you start asking it trivia like whether the I-80 tunnel on the Bay Bridge (Yerba Buena Island) is the largest bore in the world. Don't even get me started on whatever model is wired up to the voice chat button.

But yes, it will write you a flawless, physics-accurate flight simulator in Rust on the first try. I've proven that. I guess what I'm trying to say is Anthropic was eating their lunch at coding, and OpenAI rose to the challenge, but if you're not doing engineering tasks their current models are arguably worse than older ones.


But how many are willing to fork over $20 or so a month to ask simple trivia questions?

In addition to engineering tasks, it's an ad-free answer box; outside of cross-checking things or browsing search results, it's totally replaced Google/search engine use for me. I also pay for Kagi for search. In the last year I've been able to fully divorce myself from the Google ecosystem besides Gmail and Maps.

My impression is that software developers are the lion's share of people actually paying for AI, but perhaps that's just my bubble worldview.

According to OpenAI it's something like 4.2% of the use. But this data is from before Codex added subscription support and I think only covers ChatGPT (back when most people were using ChatGPT for coding work, before agents got good).

https://i.imgur.com/0XG2CKE.jpeg


The execs I've talked to are paying for it to answer capex questions, to act as a sounding board for decision making, and, perhaps most importantly, to craft/modify emails for tone and content. In the Bay Area particularly, a lot of execs are foreign, with English as their second language, and LLMs can cut email generation time in half.

I'd believe that but I was commenting on who actually pays for it. My guess is that most individuals using AI in their personal lives are using some sort of free tier.

Yes 95% are unpaid

how is “GPT 5.2 is good” a response to “downloadable models aren’t relevant”?

> Go download a model from Jan 2024, another from Jan 2025 and one from this year and compare.

I did. The old one is smarter.

(The newer ones are more verbose, though. If that impresses you, then you probably think members of parliament are geniuses.)


Yeah, agreed - there were some minor gains, but new releases are mostly benchmark-overfit sycophantic bullshit that is only better on paper and horrible to use. The more synthetic data they add, the less world knowledge the model has and the more useless it becomes. But at least they can almost mimic a basic calculator now /s

For API models, OpenAI's releases have regularly not been an improvement for a long while now. Is Sonnet 4.5 better than 3.5 outside the pretentious agentic workflows it's been trained for? Basically impossible to tell; they make the same braindead mistakes sometimes.


> I am of the opinion that Nvidia's hit the wall with their current architecture

Google presented TPUs in 2015. NVIDIA introduced Tensor Cores in 2018. Both utilize systolic arrays.

And last month NVIDIA pseudo-acquired Groq including the founder and original TPU guy. Their LPUs are way more efficient for inference. Also of note Groq is fully made in USA and has a very diverse supply chain using older nodes.

NVIDIA architecture is more than fine. They have deep pockets and very technical leadership. Their weakness lies more with their customers, lack of energy, and their dependency on TSMC and the memory cartel.


Underrated acquisition. Gives NVIDIA a whole lineup of inference-focused hardware that iirc can retrofit into existing air cooled data centres without needing cooling upgrades. Great hedge against the lower-end $$$-per-watt and watt-per-token competition that has been focused purely at inference.

Also a hedge from the memory cartel as Groq uses SRAM. And a reasonable hedge in case Taiwan gets blockaded or something.

Thanks for this. It put into words a lot of the discomfort I’ve had with the current AI economics.

We've seen this before.

In 2001, there were something like 50+ OC-768 hardware startups.

At the time, something like 5 OC-768 links could carry all the traffic in the world. Even exponential doubling every 12 months wasn't going to get enough customers to warrant all the funding that had poured into those startups.

When your business model bumps into "All the <X> in the world," you're in trouble.


Especially when your investors are still expecting exponential growth rates.


What do I care if there's no profit in LLM's..

I just want to buy ddr5 and not pay an arm and a leg for my power bill!


> which is going to blow out the economics on inference

At this point, I don't even think they do the envelope math anymore. However much money investors will be duped into giving them, that's what they'll spend on compute. Just gotta stay alive until the IPO!


Remember that without real competition, Nvidia has little incentive to release something 16x faster when they could release something 2x faster 4 times.

You’re right, but Nvidia enjoys an important advantage Intel had always used to mask their sloppy design work: the supply chain. You simply can’t source HBM at scale because Nvidia bought everything; TSMC N3 is likewise fully booked, and between Apple and Nvidia their 18A is probably already far gone. And if you want to connect your artisanal inference hardware together, then congratulations, Nvidia is the leader here too and you WILL buy their switches.

As for the business side, I’ve yet to hear of a transformative business outcome due to LLMs (it will come, but not there yet). It’s only the guys selling the shovels that are making money.

This entire market runs on sovereign funds and cyclical investing. It’s crazy.


For instance, I believe call centers are in big trouble, and so are specialized contractors (like those prepping for an SOC submission, etc.).

It is, however, actually funny how bad e.g. the amazon chatbot (Rufus) is on amazon.com. When asked where a particular CC charge comes from, it does all sorts of SQL queries into my account, but it can't be bothered to give me the link to the actual charges (the page exists and solves the problem trivially).

So, maybe, the callcenter troubles will take some time to materialize.


Based on conversations I've had with some people managing GPUs at scale in the datacenters, inference is an afterthought. There is a gold rush for training right now, and that's where these massive clusters are being used.

LLMs are probably a small fraction of the overall GPU compute in use right now. I suspect in the next 5 years we'll have full Hollywood movies (or at least the special effects) being generated entirely by AI.


Hollywood studios are breathing their last gasps now. Anyone will be able to use AI to create blockbuster-type movies; Hollywood's moat around that is rapidly draining.

Have you... used any of the video generators? Nothing they create makes any goddamn sense; they're a step above those fake acid trip simulators.

> Nothing they create make any goddamn sense,

I wouldn’t be that dismissive. Some have managed to make impressive things with them (although nothing close to an actual movie, even a short).

https://www.youtube.com/watch?v=ET7Y1nNMXmA

A bit older: https://www.youtube.com/watch?v=8OOpYvxKhtY

Compared to two years ago: https://www.youtube.com/watch?v=LHeCTfQOQcs


The problem with all of these, even the most recent one, is that they have the "AI look". People have tired of this look already, even for short adverts; if they don't want five minutes of it, they really won't like two hours of it. There is no doubt the quality has vastly improved over time, but I see no sign of progress in removing the "AI look" from these things.

My feeling is the definition of the "AI look" has evolved as these models progressed.

It used to mean psychedelic weird things worthy of the strangest dreams or an acid trip.

Then it meant strangely blurry with warped alien script and fifteen fingers, including one coming out of another’s second phalanx

Now it means something odd, off, somewhat both hard to place and obvious, like the CGI "transparent" car (is it that the 3D model is too simple, looks like a bad glass sculpture, and refracts light in squares?) and ice cliffs (I think the lighting is completely off, and the colours are wrong) in Die Another Day.

And if that’s the case, then these models have covered far more ground in far less time than it took computer graphics and CGI.


What changed my whole perspective on this a few months ago was Google's Genie 3 demo: https://www.youtube.com/watch?v=PDKhUknuQDg

They have really advanced the coherency of real-time AI generation.


Have you seen https://www.youtube.com/watch?v=SGJC4Hnz3m0

It's not a feature-length movie, but I'm not sure there's any reason why it couldn't be, and it's not technically perfect but it's pretty damn good.


Anybody has had the ability to write the next great novel for a while now, but few succeed.

There are lots of very good relatively recent novels on the shelf at the bookstore. Certainly orders of magnitude more than there are movies.

The other thing to compare is the narrative quality. I find even middling books to be of much higher quality than blockbuster movies on average. Or rather I'm constantly appalled at what passes for a decent script. I assume that's due to needing to appeal to a broad swath of the population because production is so expensive, but understanding the (likely) reason behind it doesn't do anything to improve the end result.

So if "all" we get out of this is a 1000x reduction in production budgets which leads to a 100x increase in the amount of media available I expect it will be a huge win for the consumer.


Anyone with a $200M marketing budget.

Throw it on YouTube and get a few key TikTokers to promote it.

It's so weird how they spend all this money to train new models and then open-source them. It's a gold rush, but Nvidia is getting all the gold.

> I am of the opinion that Nvidia's hit the wall with their current architecture

Not likely since TSMC has a new process with big gains.

> The story with Intel

Was that their fabs couldn’t keep up, not their designs.


If Intel's original 10nm process and Cannon Lake had launched within Intel's original timeframe of 2016/17, it would have been class leading.

Instead, they couldn't get 10nm to work and launched one low-power SKU in 2018 that had almost half the die disabled, and stuck to 14nm from 2014-2021.


> nothing about the industry's finances add up right now

Nothing about the industry’s finances, or about Anthropic and OpenAI’s finances?

I look at the list of providers on OpenRouter for open models, and I don’t believe all of them are losing money. FWIW Anthropic claims (iirc) that they don’t lose money on inference. So I don’t think the industry or the model of selling inference is what’s in trouble there.

I am much more skeptical of Anthropic and OpenAI’s business model of spending gigantic sums on generating proprietary models. The latest Claude and GPT are very, very good, but not enough better than the competition to justify the cash spend. It feels unlikely that anyone is gonna “winner takes all” the market at this point. I don’t see how Anthropic or OpenAI survive as independent entities, or how current owners don’t take a gigantic haircut, other than by Sam Altman managing to do something insane like reverse-acquiring Oracle.

EDIT: also feels like Musk has shown how shallow the moat is. With enough cash and access to exceptional engineers, you can magic a frontier model out of the ether, however much of a douche you are.


It's become rather clear from the local LLM communities catching up that there is no moat. Everyone is still just barely figuring out how these nifty data structures produce such powerful emergent behavior; there isn't any truly secret sauce yet.

> local LLM communities catching up that there is no moat.

They use Chinese open LLMs, but the Chinese companies do have a moat: training datasets, some non-open-source tech, and salaried talent, which one would need serious investment in if one decided to bootstrap a competitive frontier model today.


I’d argue there’s a _bit_ of secret sauce here, but the question is if there’s enough to justify valuations of the prop-AI firms, and that seems unlikely.

> but nothing about the industry's finances add up right now.

The acquisitions do. Remember Groq?


That may not be a good example because everyone is saying Groq isn't worth $20B.

They were valued at $6.9B just three months before Nvidia bought them for $20B, triple the valuation. That figure seems to have been pulled out of thin air.

Speaking generally: it makes sense for an acquisition price to be at a premium to valuation, between the dynamics where you have to convince leadership it's better to be bought than to keep growing, and the expected risk posed by them as competition.

Most M&As aren't done by value investors.


Maybe it was worth the other $13.1B to make sure their competitors couldn't get them?

Looks like Confer is hosting its own inference: https://confer.to/blog/2026/01/private-inference/

> LLMs are fundamentally stateless—input in, output out—which makes them ideal for this environment. For Confer, we run inference inside a confidential VM. Your prompts are encrypted from your device directly into the TEE using Noise Pipes, processed there, and responses are encrypted back. The host never sees plaintext.

I don’t know what model they’re using, but it looks like everything should be staying on their servers, not going back to, eg, OpenAI or Anthropic.


That is a highly misleading statement: the GPU runs with real weights and real unencrypted user plaintext, since it has to multiply matrices of plain text, which is passed on to the supposedly "secure VM" (protected by Intel/Nvidia promises) and encrypted there. In no way is it e2e, unless you count the GPU as the "end".

It is true that nVidia GPU-CC TEE is not secure against decapsulation attacks, but there is a lot of effort to minimize the attack surface. This recent paper gives a pretty good overview of the security architecture: https://arxiv.org/pdf/2507.02770

So what you are saying is that all the TEE and remote attestation and everything might work for CPU-based workflows, but they just don't work with the GPU, which is effectively unencrypted, so anyone can read it from there?

Edit: https://news.ycombinator.com/item?id=46600839 - this comment says that the GPU has such capabilities as well, so I am interested in what you were referring to in the first place.


> Looks like Confer is hosting its own inference

Even so, you're still exposing your data to Confer, and so you have to trust them that they'll behave as you want. That's a security problem that Confer doesn't help with.

I'm not saying Confer isn't useful, though. e2ee is very useful. But it isn't enough to make me feel comfortable.


> you're still exposing your data to Confer

They use a https://en.wikipedia.org/wiki/Trusted_execution_environment and iiuc claim that your client can confirm (attest) that the code they run doesn't leak your data, see https://confer.to/blog/2026/01/private-inference/

So you should be able to run https://github.com/conferlabs/confer-image yourself and get a hash of that and then confer.to will send you that same hash, but now it's been signed by Intel I guess? to tell you that yes not only did confer.to send you that hash, but that hash is indeed a hash of what's running inside the Trusted Execution Environment.

I feel like this needs diagrams.
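In lieu of a diagram, here's a rough sketch of what the client-side check amounts to, as I understand it. This is illustrative only - the file name, the shape of the "quote", and the signature check are stand-ins (real attestation uses a hardware-rooted certificate chain, not an HMAC), not Confer's or the vendor's actual API:

    import hashlib
    import hmac

    def measure_image(path: str) -> str:
        # Hash the VM image you built yourself from the published source
        with open(path, "rb") as f:
            return hashlib.sha256(f.read()).hexdigest()

    def vendor_signature_ok(quote: dict, vendor_key: bytes) -> bool:
        # Stand-in for the real hardware-vendor signature verification
        expected = hmac.new(vendor_key, quote["measurement"].encode(), "sha256").hexdigest()
        return hmac.compare_digest(expected, quote["signature"])

    def attestation_matches(quote: dict, local_measurement: str, vendor_key: bytes) -> bool:
        # Accept only if (1) the quote is genuinely signed by the hardware vendor and
        # (2) the measurement it reports equals the hash of the image you built locally
        return vendor_signature_ok(quote, vendor_key) and quote["measurement"] == local_measurement

    # Usage sketch (names are hypothetical):
    # local = measure_image("confer-image.bin")
    # ok = attestation_matches(quote_from_server, local, vendor_public_key)

The point being: the trust chain is "I can rebuild the image and hash it myself" plus "the hardware vendor vouches that this exact hash is what's actually running", rather than "I trust confer.to's word for it".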


As I read it, the attestation is simply that the server is running a particular kernel and application in the Secure Enclave using the hardware’s certification. That does not attest that there is no sidechannel. If exfiltration from the TEE is achieved, the attestation will not change.

To put it another way, I am quite sure that a sufficiently skilled (or privileged: how do you know the manufacturer is not keeping copies of these hardware keys?) team could sit down with one of these enclave modules and figure out how to get the memory image (or whatever) out without altering the attested signature.


That's the main selling point of TEE though, isn't it? That what your hypothetical team could do, can't be done?

I don’t believe for a minute that it can’t be done even with physical access. Perhaps it’s more difficult.

> I feel like this needs diagrams.

And there's the problem.

All of that stuff is well and good, but it seems like I have to have a fair degree of knowledge and technical skill, not to mention time and effort, to confirm that everything is as they're representing. And it's time and effort I'd have to expend on an ongoing basis.

That's not an expectation I could realistically meet, so in practice, I still have to just trust them.


In most of modern life we trust experts to some degree. I couldn't explain DH key exchange off the top of my head, and I don't know if I'll ever understand elliptic curves, but I see that most of the cryptographic community understands them as good methods for many problems, and if lots of experts who otherwise argue about anything will agree - "well yes, of course DH is good for key exchange, but that's beside the point and djb is still wrong about florbnitz keys" - then it's likely DH is indeed good for key exchange.

If everyone had to understand every detail to trust in tech, we would not have nuclear plants or coast around on huge flammable piles of charged lithium.


In theory, you trust the "crowd" (rather than the hosting entity) because if they don't do what they said, the "crowd" should make a noise about it and you would know.

That’s true, but it’s still a distinct threat model from “we use the API of a company run by one of the least trustworthy humans on the planet.” We can talk through side channel attacks and whatnot, but we’re discussing issues with Confer’s implementation, not trusting a different third party.

We'll add that link to the toptext as well. Thanks!

(It got submitted a few times but did not get any comments - might as well consolidate these threads)


I’m not disagreeing with this necessarily, but I do think a lot of people underestimate the costs of actually doing on-prem to a professional standard. You’ll almost certainly have to hire a dedicated team to manage your hardware, and you’re off in the woods as far as most of the rest of the world’s operating stack - an awful lot assumes you’re on EKS with infinite S3 and ECR available. It’s doable, but it’s not drag & drop - the cloud providers are expensive, but they are providing a lot.

That audiosynth link is a fine thread for anyone lamenting the loss of civility on the modern internet.

What's interesting about this post is that it would be completely unsurprising to most people in most parts of the world, as well as most people here 100 years ago. The discomfort so many people in this thread feel with receiving kindness and the devaluing of the gift of companionship, gratitude, and a good story is a testament to how far off human norms our society is these days.

I struggle with this a lot myself - I've had to overcome deep feelings of inadequacy and insecurity because I never valued what I offered people just by being present and because I always felt like I had to "square the books" - that every interaction or relationship was an exchange that needed to square its ledgers. The best thing for my mental health has been to become comfortable just being with people and accepting their continued presence and return to my company as something that doesn't require an exchange - that human companionship and kindness are things we enjoy as people, not services to be itemized and accounted for.

One interesting note on this is that it's not just an "old world" phenomenon - if you look at many old farmhouses across the midwest, they've got a separate covered section that's accessible from outside the house - the assumption was that travelers passing by could and would spend the night there, and often the owners would have supplies of oatcakes or other durable foodstuffs for travelers, because everyone who traveled needed somewhere to stay or something to eat at some point, and reciprocity could be assumed when the homeowner traveled themselves in the future.


Is that really true? Would a black man be able to receive this type of kindness 100 years ago?

So, 1926? My feeling is: probably, from some white people. And, these areas had non-white people. The GP said Midwest, so we can think of the region roughly from Indiana to Nebraska, and North to the border. While there were "sundown towns" in those areas, where racists banded together to keep black people out of the town after dark, there were also people outside those towns with diverse beliefs and attitudes towards black people. In that region, the harshest laws regarding race were probably in Missouri. In any event, a black man was probably at greater risk than a white man of suffering an attack by a stranger, but I would expect there also would be people of all races willing to extend hospitality.

Outside of America, sure.

The first time I walked past a homeless person on a smart phone it took a minute to process - phones are effectively free at this point.

(The first time I walked past a homeless person using a VR headset, on the other hand, was a fucking trip.)


That sounds like a Silicon Valley bit.

That show didn’t hit Black Mirror levels of existentially uncomfortable, but man, I recognized too many of those scenes.

I picked up a binocular microscope a bit back and it’s one of my favorite nerd purchases - it’s 20x, which is enough to be interesting, but not enough to require a lot of sample prep, and the binocular setup gives you depth perception, so it feels like you’re “there” with whatever you’re looking at. I’ve mucked around with microscopes in the past, but the binocular is genuinely just fun.

I used to have a Chase card, but decided I didn’t like dealing with them, so I closed the account.

And then they bought my bank, and I had a Chase account again. I gave them a chance and they were still awful, so I switched banks.

A couple years after that, they bought my bank, and I had a Chase account again. This isn’t a duplicate paragraph, it happened again. I gave them a chance again, and they were still awful, so I switched banks again (to a credit union, which, fingers crossed).

Now I guess I get to do business with Chase again, which is neat. I’m happy to be part of an economy where I can vote with my dollar like this.


Far be it from me to get in the way of someone protesting megabank centralization, but...

I have to imagine that this bank relationship will be different from those previous acquisitions? I never interacted with Goldman Sachs for the duration I've had my Apple Card—the relationship is entirely with Apple and their iOS app. I don't imagine that to be much different when Chase is the issuer.


I have a couple store brand cards backed by Chase (like the Amazon one) and they are basically exactly like any other first party Chase card. They show up on the Chase dashboard. You call Chase customer service for issues. Payments happen through Chase. The card benefits are all provided by Chase. The only difference is that there's Amazon printed in front and the points the card earns aren't regular Chase points. Of course their relationship with Apple could be different, but I doubt it'll be anything like the Goldman one. Remember that Goldman didn't have a consumer business at all before they got into the partnership.

"Of course their relationship with Apple could be different, but I doubt it'll be anything like the Goldman one" I can nearly guarantee this is completely wrong. It will be just like Goldman. That's literally the point of getting the card, is do everything through Apple and Goldman is just the backbone and support (through Apple's UI). Apple's contract with Goldman was until 2030, they wouldn't let Goldman out of it unless they were able to find a new partner with same or better experience and terms. They could have choose Synchrony, etc if Chase didn't agree. And clearly Chase was willing to concede to Apple since they don't even offer high yield savings, but agreed to for Apple since thats part of their existing offerings. Right now Apple decides the Promos for Apple Card, and at most maybe Chase will have a little more involvement in those but probably not. So in realty the only real world change is probably better customer support than Sachs, and Chase may not offer higher risks customers credit or as much like Goldman did and finally they may not offer the very low APR's that Apple card is known for, for it's best customers.

AFAIK in the current setup, only the UI is Apple's; any time you message customer service or open a dispute, that is handled directly by Goldman employees, and decisions to raise/lower your credit limit are also made by Goldman. All of those parts will be handled by Chase now.

Have to disagree there. The iOS Wallet app is all Apple, and that's what you use for checking your statements and making payments, which is great. But any time I have to actually interact with support, that is 100% Goldman. When you go through the iMessage chat feature it's all Goldman on the other end, not Apple at all. And I've had to deal with them far too many times, and each time was absolutely terrible - because of Goldman, but also because of the platform Apple gives them for support: you send a message and have to wait for a text response. So it feels like a chat, but unlike a chat the responses can take a while, more like email; but then if they respond and you don't respond very fast, it closes the conversation out, and your next message triggers a "how can we help" and you start all over, which is unlike email. So it's the worst of chat and the worst of email. Hopefully that is revamped with the new relationship. And if you have to call, you get Sachs, not Apple. Chase isn't great, but I expect much better than Sachs.

My problem with Chase and PNC is that their "fraud detection" seems to be random die rolls or monkeys throwing darts at my picture on a wall. I love the 5 minutes of anxiety after each and every purchase where I wonder if it will just randomly fail.

We’ll find out, but it still sort of sticks in the craw to be stuck doing business with them again.

What made you not like dealing with them?

Maybe I’m a passive user, but my credit cards are autopay set it and forget it. As someone with no opinion on cc companies, curious what gave you such a bad impression.


I’ve had the good fortune of working with actually good banks with actually good customer service, and Chase always sticks out for how indifferent they are. If you don’t need anything beyond the app, sure, they’re probably fine - although I seem to remember them shoving ads in my transaction feed - but I’ve never had luck getting ahold of a human with any variety of autonomy or agency when I needed one.

They have very strict AML controls as well. For example, if you regularly withdraw a few thousand in cash to pay contractors for a house project, they may close your account.

From what I've heard, they have an obnoxious fraud workflow.

It is possible to get locked out of your card abroad and not have a way to unlock it.


I’ve disputed fraud a couple of times on my Chase cards. It was.. fine? Uneventful and simple.

Their problem is with false positives they find, not true positives you find. My application for a credit card was somehow flagged as fraudulent. Chase repeatedly asked for additional forms of ID, then told me the scans I sent were illegible. (The scans were fine; I think they just needed an excuse.) I went to a branch with the physical documents, and they said they couldn't look at them. The branch put me in an office and called the same telephone support, with the same result. I eventually gave up.

I guess I'm lucky they rejected me before any money changed hands. I've heard horror stories from people with significant assets at their bank, locked out until an actual lawsuit (the letter from a lawyer didn't work) finally got their attention. I think it's like Google support, usually fine but catastrophic when it's not.


> The branch put me in an office and called the same telephone support, with the same result.

As far as I can tell, going to a branch of a big bank to address a problem nowadays is similar to going to a cellphone store for tech support. All they can really do is call the same hotline or fill out the same webform you’d have access to at home.


Anecdotal, but I just came back from 2 weeks abroad and didn’t have to take any action to continue using my Chase card. Also, I believe the conversion rate was better than the local one.

Did you have First Republic? RIP. I miss that bank so much.

Yup, that was one of them. Chase botched transferring the bill pay over after spending 3 months saying they’d transfer bill pay and then told me over the phone they hadn’t said that while I was looking at the current page on their website that said they’d transfer bill pay.

First Republic was great - they were the reason I didn’t go to a credit union sooner. Schwab also has amazing customer service, if you’re still searching for a new home and aren’t willing to make the leap to a CU.


>Now I guess I get to do business with Chase again, which is neat. I’m happy to be part of an economy where I can vote with my dollar like this.

You were unusually unlucky. The US has a very decentralized banking system, with thousands of institutions. The Big Four (JPM, BAC, C, WFC) have under 50% of total deposits; the comparable figure for Canada's Big Five is ~85%.


Creeping fees if you accidentally pass a withdrawal limit, not to mention abysmal interest rates for parking your money there.

Just want to point out that people have said it’s been very very hard to get Goldman Sachs to dispute a transaction, so I’m hoping that Chase will offer better service than Goldman as Goldman’s actual service is supposedly not very good.

What is awful though? Like what was wrong. I have one of their Amazon cards and their app is decent and their card has all the standard functionality.

Chase spelled my name incorrectly on my account. It took several phone calls over 2-3 months to get it fixed. Humorously awful, I guess.

Same and Same. Closed account because of bad experiences. Later they bought my bank. Raised prices 10x+ and lowered services. Got out ASAP.

Switch to Android and you should be safe.

Sounds like a recursive function
