People should be demanding consistency and traceability from model vendors, checked by some tool, perhaps like this one. It may tell you when the vendor changed something, but is there otherwise any recourse?
Agreed! FWIW I am attempting to create an open-source wiki/watchdog eval platform -- weval.org -- so we can all keep an eye on LLMs, their biases, and their general competencies without relying on the AI providers marking their own homework. I really believe this needs to exist to express our needs and hold model creators to account, especially as model drift and manipulation become a risk.
I have a lot of respect for organizations that get a lot done with Microsoft technologies. I think your perspective could be thought of as the benefits of vertical integration and vendor lock in. These do help people get things done!
In the academic and open source world those things are fought against because you don't want to be at the mercy of the software developer in the context of certain rights.
I think for every negative you mention on either side, a corresponding positive could be found. And like many things on the net, you're not wrong, but you're not necessarily talking about the same kinds of things.
My remaining complaints about Microsoft are the inflexibility of their solutions, which impose abstractions that just don't work for many organizations, and the generally viral nature of software sales, of which they are one of many vendors with similar issues -- though Oracle is the worst, of course.
Perfectly valid points. I've worked in academia, and their insistence on non-Microsoft technologies was helpful in certain fields where openness and long-term reproducibility are critical.
The downside is that this produces a microcosm of obscure technologies that can have... strange effects on industry. Some FAANG-like companies have a habit of hiring only recent graduates, so their entire staff is convinced that what they saw at their University is how everybody else does things.
It leads to a Silicon Valley clique that has a fantastically distorted perspective of the rest of the world.
Some comments I've seen here on HN are downright hilarious to anyone from the "rest of the world", such as:
"Does anyone still use Windows Server!?" -- yes, at least 60% of all deployed servers world wide, and over 80% in many industries.
"Supports all popular directory servers such as OpenLDAP, ApacheDS, Kopano, ..."
-- hello!? Active Directory! Have you heard of it!? It's something like 95% of all LDAP deployments no matter how you count it! The other 5% is Oracle Directory and/or Novell eDirectory, and then all of the rest put together is a rounding error.
I thought this was a very good read about many of the issues that are faced without having any ground truth to reason against. It is interesting how many different ways people have developed to work around missing information, and the marginal improvements they make in some benchmarks.
This also assumes that non-human-written code will be of any use to humans, and no one has shown that to be possible; so far it is all humans patching it up.
This has been the problem with higher-level natural language programming for years. I really wonder what people are doing if they don't see this core issue that precludes its use.
It makes me wonder if some people writing code just cannot think in terms of code?
I imagine it is very slow if you always have to think in a human language and then translate each step into a programming language.
When people describe being in flow state, I think what is happening is they are more or less thinking directly in the programming language they are writing. No translation step, just writing code
LLM workflows completely remove the ability to achieve that imo
CSS or Tailwind has always been a tough one for me. I have banks of flashcards to help me remember stuff (align-items, justify-content, grid-template-columns, etc.). Even with all that effort and many projects of practice, though, I've never had things click.
LLM assisted programming, however? – instant flow state. Instead of thinking in code I can think in product, and I can go straight from a pencil sketch to describing it as a set of constraints, and then say, "make sure it's ARIA compliant and responsive", and 95% of the work is done.
I feel similarly about configuration heavy files like Nginx or something. I really don't care to spend my time reading documentation, I'd rather copy paste the entire docs into the context window and then describe what I want in English.
Also good for SQL. And library code for a one off tool or API. And Bash scripting.
A lot of people hedge this kind of sober insight against their personal economic goals, making all manner of unfalsifiable claims of adequacy in some context. It is refreshing to try to deal with the issues separately, and I think a lot of people miss how insufficient this is compared to traditional methods in every case I've heard of so far.
There are definitely dumb errors that are hard for human reviewers to find because nobody expects them.
One concrete example is confusing value and pointer types in C. I've seen people try to cast a `uuid` variable into a `char` buffer to, for example, memset it, by doing `(char *)&uuid`. It turned out, however, that `uuid` was not a value type but rather a pointer, and so this ended up just blasting the stack: instead of taking the address of the uuid storage, it takes the address of the pointer to the storage. If you're hundreds of lines deep and are looking for more complex functional issues, it's very easy to overlook.