The irony is that Jane Street hires from prestigious Indian schools too, for pretty obscene salaries. Those salaries get hyped and celebrated in the newspapers.
Can someone elucidate how using a full-blown browser is an improvement over using, say, markitdown / pandoc / whatever? Given that most useful coding docs sites are static (made with Sphinx or MkDocs or whatever).
The problem I see with MCP is very simple. It uses JSON as the format, and that's nowhere near as expressive as a programming language.
Consider a Python function signature:

list_containers(show_stopped: bool = False, name_pattern: Optional[str] = None, sort: Literal["size", "name", "started_at"] = "name")

It doesn't even need docs.
Now convert this to a JSON schema and the input is already about 4x larger.
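For a rough sense of the blow-up, here is roughly what that one signature turns into as a tool definition (written out as a Python dict; the field layout follows the common OpenAI-style function-calling schema, so treat the exact shape as an assumption):

list_containers_tool = {
    "type": "function",
    "function": {
        "name": "list_containers",
        "description": "List containers.",
        "parameters": {
            "type": "object",
            "properties": {
                "show_stopped": {
                    "type": "boolean",
                    "description": "Include stopped containers.",
                    "default": False,
                },
                "name_pattern": {
                    "type": ["string", "null"],
                    "description": "Optional name filter.",
                },
                "sort": {
                    "type": "string",
                    "enum": ["size", "name", "started_at"],
                    "default": "name",
                },
            },
            "required": [],
        },
    },
}

And all of that goes into the prompt for every request that exposes the tool.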
And when generating output, the LLM will generate almost 2x more tokens too, because JSON. It's easier for it to get confused.
And consider that the flow of calling Python functions and using their output to call other tools, etc. is seen 1000x more often in their fine-tuning data, whereas JSON tool-calling flows are rare and practically only exist in the instruction-tuning phase. And I'm sure the instruction tuning also contains even more complex code examples where the model has to execute complex logic.
Then there's the whole issue of composition. To my knowledge there's no way an LLM can do this in one response:
vehicle = call_func_1()
if vehicle.type == "car":
    details = lookup_car(vehicle.reg_no)
elif vehicle.type == "motorcycle":
    details = lookup_motorcycle(vehicle.reg_no)
The reason to use the LLM is that you don't know ahead of time that the vehicle type is only a car or motorcycle, and the LLM will also figure out a way to detail bicycles and boats and airplanes, and to consider both left and right shoes separately.
The LLM can't just be given this function, because it's specialized to just the two options.
You could have it do a feedback loop of rewriting the Python script after running it, but what's the savings at that point? You're wasting tokens talking about cars in Python when you already know it's a ski, and the LLM could ask directly for the ski details without writing a script to do it in between.
But "the" problem with MCP? IMVHO (Very humble, non-expert) the half-baked or missing security aspects are more fundamental. I'd love to hear updates about that from ppl who know what they're talking about.
Wasn't there a tool-calling benchmark by the Docker guys which concluded that Qwen models are nearly as good as GPT? What is your experience with it?
Personally I am convinced JSON is a bad format for LLMs and that code orchestration in a Python-ish DSL is the future. But local models are pretty bad at code gen too.
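To make "code orchestration" concrete, here's a minimal sketch of what I have in mind, with made-up tool names and no real sandboxing: the host exposes tools as plain Python functions, the model emits a short script in one response, and the host executes it.

# Hypothetical tools, exposed to the model as ordinary Python functions.
def list_containers(show_stopped=False, name_pattern=None, sort="name"):
    # Stand-in data; a real implementation would query the container runtime.
    return [{"name": "db-1", "stopped": True}] if show_stopped else []

def restart_container(name):
    print(f"restarting {name}")

TOOLS = {"list_containers": list_containers, "restart_container": restart_container}

# What the model would emit in a single response:
model_script = """
for c in list_containers(show_stopped=True):
    restart_container(c["name"])
"""

# The host runs it; an illustration only, not a real sandbox.
exec(model_script, {"__builtins__": {}}, dict(TOOLS))

The point is that the loop and the conditional logic live in the generated code, instead of being spread across several JSON tool-call round trips.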
People are used to the `click` way, where you define args as function parameters. It's a little more verbose, but it helps that click is a very established library which also provides many other things needed by CLI tools.
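For reference, the usual click shape looks like this (the greet example itself is made up, but the decorator style mirrors click's docs):

import click

@click.command()
@click.option("--count", default=1, help="Number of greetings.")
@click.argument("name")
def greet(count, name):
    """Greet NAME the given number of times."""
    for _ in range(count):
        click.echo(f"Hello, {name}!")

if __name__ == "__main__":
    greet()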
There's also `typer`, from the creator of `fastapi`, which relies on type annotations. I have not had the opportunity to use it.
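Going by its docs, the typer version of the same thing leans entirely on the annotations; a sketch I haven't battle-tested:

import typer

def greet(name: str, count: int = 1):
    """Greet NAME the given number of times; the default value turns count into a --count option."""
    for _ in range(count):
        typer.echo(f"Hello, {name}!")

if __name__ == "__main__":
    typer.run(greet)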
At this point you're just flexing that you have a 96 GiB machine. (Average developer machines are more like 16 GiB.)
But that's not the point. If every dependency follows the same philosophy, the costs (compile time, binary size, dependency supply chain) add up very quickly.
Not to mention that in big organizations you have to track each 3rd-party and transitive dependency you add to the codebase (for very good reasons).
I can write and have written hand-tuned assembly when every byte is sacred. That’s valuable in the right context. But that’s not the common case. In most situations, I’d rather spend those resources on code ergonomics, a flexible and heavily documented command line, and a widely used standard that other devs know how to use and contribute to.
And by proportion, that library would add an extra 0.7 bytes to a Commodore 64 program. I would have cheerfully "wasted" that much space for something a hundredth as nice as Clap.
I’ve worked in big organizations and been the one responsible for tracking dependencies, their licenses, and their vulnerable versions. No one does that by hand after a certain size. Snyk is as happy to track 1000 dependencies as 10.
96? It sounds more like 64 to me, which is probably above average but not exactly crazy. I've had 64 GB in my personal desktop for years, and most laptops I've used in the past 5 years or so for work have had 32 GB. If it takes up 1/4700 of memory, I don't think it changes things much. Plus, argument parsing tends to be done right at the beginning of the program and completely unused again by the time anything else happens, so even if the parsing itself is inefficient, it seems like maybe the least worrisome place I could imagine to optimize for developer efficiency over performance.