Someone tried this; I saw it on one of the Reddit AI subs. They were training a local model on whatever they could find that was written before $cutoffDate.
I think this is a meta-allusion to the theory that human consciousness developed recently, i.e. that people who lived before [written] language were not conscious because they did not actually think in language. It's a potentially useful thought experiment, because we've all grown up not only knowing highly expressive languages, but also knowing how to read and write.
However, primitive languages were... primitive. Were they primitive because people didn't know or understand the nuances their languages lacked? Or were those simply things that didn't get communicated (effectively)?
Of course, spoken language predates writing, which is part of the point. We know an individual can have a "conscious" conception of an idea if they communicate it, but that consciousness was limited to the individual. Once we have written language, we can perceive a level of communal consciousness of certain ideas. You could say that the community itself had a level of shared consciousness.
With GPTs regurgitating digestible writing, we've come full circle in terms of proving consciousness, and some are wondering... "Gee, this communicated the idea expertly, with nuance and clarity... but is the machine actually conscious? Does it think independently of the world, or is it merely a kaleidoscopic reflection of its inputs? Is consciousness real, or an illusion of complexity?"
I’m not sure why it’s so mind-boggling that people in the year 1225 (Thomas Aquinas) or 1756 (Mozart) were just as creative and intelligent as we moderns are. They simply had different opportunities than we have now. And what some of them did with those opportunities is beyond anything a “modern” person can imagine doing in the same circumstances. _A lot_ of free time over winter in the 1200s for certain people. Not nearly as many distractions either.
Saying early humans weren’t conscious because they lacked complex language is like saying they couldn’t see blue because they didn’t have a word for it.
Well, Oscar Wilde argues in “The Decay of Lying” that there were no stars before an artist could describe them and draw people’s attention to the night sky.
The basic assumption he attacks is that “there is a world we discover” vs “there is a world we create”.
It is a hard paradigm shift, but there is certainly reality in a “shared picture of the world”, and convincing people of a new point of view has real implications for how the world appears in our minds and what we consider “reality”.
It should be almost obligatory to state which definition of consciousness one is talking about whenever the topic comes up, because I, for example, don't see what language has to do with our ability to experience qualia.
Is it self-awareness? There are animals that can recognize themselves in a mirror, and I don't think all of them have a form of proto-language.
Web search often tanks the quality of MY output these days too. Context clogging seems a reasonable description of what I experience when I try to use the normal web.
THIS. I do my best work after a long vigorous walk and contemplation, while listening to Bach sipping espresso. (Not exaggerating much.) If I go on HN or slack or ClickUp or work email, context is slammed and I cannot do /clear so fast. Even looking up something quick on the web or an LLM causes a dirtying.
I feel the same. LLMs using web search ironically seem to have less thoughtful output. Part of the reason for using LLMs is to explore somewhat novel ideas, and with web search they align too strongly to the results rather than to the overall request, making them a slow search engine.
That makes sense. They're doing their interpretation on the fly, for one thing. For another, just because they now have data that is 10 months more recent than their cutoff, they don't have any of the intervening information. That's gotta make it tough.
Web search is super important for frameworks that are not (sufficiently?) in the training data. o3 often pulls info from Swift forums to find and fix obscure Swift concurrency issues for me.
In my experience none of the frontier models I tried (o3, Opus 4, Gemini 2.5 Pro) was able to solve Swift concurrency issues, with or without web search. At least not sufficiently for Swift 6 language mode. They don’t seem to have a mental model of the whole concept and how things (actors, isolation, Tasks) need to play together.
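For the curious, a minimal toy sketch (mine, not from any model) of the kind of thing Swift 6 language mode flags, where actors, isolation, and Sendable all have to play together:

    // A non-Sendable reference type sent into an actor while the
    // caller keeps using it: Swift 6 strict concurrency rejects this.
    final class Counter {          // mutable class, not Sendable
        var value = 0
    }

    actor Store {
        func bump(_ counter: Counter) {
            counter.value += 1
        }
    }

    func run() async {
        let counter = Counter()
        let store = Store()
        // error: sending 'counter' risks causing data races, because
        // the caller still touches it on the next line.
        await store.bump(counter)
        print(counter.value)
    }

The right fix depends on intent (make Counter an actor, make it a Sendable value type, or keep it within one isolation domain), which is exactly the "how things play together" judgment the models seem to lack.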
I haven't tried ChatGPT web search, but my experience with Claude web search is very good. It's actually what sold me and made me start using LLMs as part of my day to day. The citations they leave (I assume ChatGPT does the same) are killer for making sure I'm not being BSd on certain points.
It depends on the question. I was having a casual chat with my dad and we wondered how Apple's revenue was split amongst products; it was just something to chat about, so I didn't check.
On the other hand, I got an overview of Postgres RLS and I checked the majority of those citations since those answers were going to be critical.
That’s interesting. I use the API, and there are zero citations with Claude, ChatGPT, and Gemini. Only Kagi Assistant gives me some, which is why I prefer it when researching facts.
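Could it be that citations only come back when the server-side web search tool is explicitly enabled? A rough sketch of what I mean against the Anthropic messages endpoint; the tool type string and model id are from memory of their docs, so treat both as assumptions to verify:

    import Foundation

    // Assumption: enabling Anthropic's server-side web search tool is
    // what produces cited results; a plain messages call never cites.
    func askWithWebSearch() async throws {
        let body: [String: Any] = [
            "model": "claude-sonnet-4-20250514",   // model id: check current docs
            "max_tokens": 1024,
            "messages": [["role": "user",
                          "content": "How is Apple's revenue split among products? Cite sources."]],
            "tools": [["type": "web_search_20250305",  // assumed tool version string
                       "name": "web_search",
                       "max_uses": 3]]
        ]
        var req = URLRequest(url: URL(string: "https://api.anthropic.com/v1/messages")!)
        req.httpMethod = "POST"
        req.addValue(ProcessInfo.processInfo.environment["ANTHROPIC_API_KEY"] ?? "",
                     forHTTPHeaderField: "x-api-key")
        req.addValue("2023-06-01", forHTTPHeaderField: "anthropic-version")
        req.addValue("application/json", forHTTPHeaderField: "content-type")
        req.httpBody = try JSONSerialization.data(withJSONObject: body)
        let (data, _) = try await URLSession.shared.data(for: req)
        print(String(data: data, encoding: .utf8) ?? "")  // citations arrive in the content blocks
    }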
What software do you use? The native Claude app? What subscription do you have?
Completely opposite experience here (with Claude). Most of my googling is now done through Claude: it can find, digest, and compile information much quicker and better than I'd do myself. Without web search you're basically asking an LLM to pull facts out of its ass; good luck trusting those results.
It still is: not all queries trigger web search, and it takes more tokens and time to do research. ChatGPT will confidently give me outdated information, and unless I know it’s wrong and ask it to research, it won’t know it is wrong. Having a more recent knowledge base can be very useful (for example, knowing who the president is without looking it up, or referencing newer Node versions instead of old ones).
The problem, which perhaps only looks easy to fix, is that the model will choose solutions that are a year old, e.g. treating database/logger versions from December '24 as new and usable in a greenfield project despite newer quarterly LTS releases superseding them. I try to avoid humanizing these models, but couldn't training/post-training teach them to actually respect a timestamp fed in via the system prompt? I've begged models to choose "new" dependencies after $DATE, but they all still snap back to 2024.
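The plumbing half of that is trivial, for what it's worth; the open question is whether post-training makes the model respect it. A sketch of the trivial half (the prompt wording and the idea of dating the system prompt are my own):

    import Foundation

    // Inject the real current date so the model can, in principle,
    // tell "a year old" from "new". Whether it respects this is the
    // open question above.
    let formatter = DateFormatter()
    formatter.dateFormat = "yyyy-MM-dd"
    let today = formatter.string(from: Date())

    let messages: [[String: String]] = [
        ["role": "system",
         "content": "Today's date is \(today). Your training data ends before this date. When recommending dependency versions, assume newer LTS releases may have superseded the ones you remember."],
        ["role": "user",
         "content": "Pick a logger version for a greenfield project."]
    ]
    // `messages` then goes in the body of an OpenAI-style
    // chat-completions request.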
The biggest issue I can think of is code recommendations with out-of-date versions of packages. Maybe the quality of code has deteriorated in the past year and scraping GitHub is not as useful to them anymore?
Knowledge cutoff isn’t a big deal for current events. Anything truly recent will have to be fed into the context anyway.
Where it does matter is for code generation. It’s error-prone and inefficient to try teaching a model how to use a new framework version via context alone, especially if the model was trained on an older API surface.
Still relevant, as it means a coding agent is more likely to get things right without searching. That saves time and money, and improves the accuracy of results.
It absolutely is. Even in coding, for example, new design patterns or language features aren't easy to leverage from context alone.
Web search enables targeted info to be "updated" at query time. But it doesn't get used for every query and you're practically limited in how much you can query.
Isn’t this an issue with, e.g., Cloudflare removing a portion of the web? I’m all for it from the perspective of people not having their content repackaged by an LLM, but it means that web search can’t check all sources.
Right now nothing affects the underlying model weights. They are computed at enormous expense during pretraining, adjusted incrementally during post-training, and then left untouched until the next frontier model is built.
Being able to adjust the weights will be the next big leap IMO, maybe the last one. It won't happen in real time but periodically, during intervals which I imagine we'll refer to as "sleep." At that point the model will do everything we do, at least potentially.
I had 2.5 Flash refuse to summarise a URL that had today's date encoded in it because "That web page is from the future so may not exist yet or may be missing" or something like that. Amusing.
2.5 Pro went ahead and summarised it (though it completely ignored a # reference and so summarised the wrong section of a multi-topic page; that's a different problem).
A funny result of this is that GPT-5 doesn't understand the modern meaning of "vibe coding" (maximising LLM code generation); it thinks it's "a state where coding feels effortless, playful, and visually satisfying" and offers content about adjusting IDE settings and templating.
Maybe OpenAI has a terribly inefficient data ingestion pipeline? (Wild guess.) Basically, taking in new data is tedious, so they do it infrequently and keep using old data for training.
Compare that to:
Gemini 2.5 Pro: knowledge cutoff Jan 2025 (3 months before release)
Claude Opus 4.1: knowledge cutoff Mar 2025 (4 months before release)
https://platform.openai.com/docs/models/compare
https://deepmind.google/models/gemini/pro/
https://docs.anthropic.com/en/docs/about-claude/models/overv...