That's reasonable in many cases, but I've had situations like this for senior UI and frontend positions where the interviewers don't ask UI or frontend questions at all, and instead ask their pet low-level questions. Some even snort that asking UI questions is softball, or that frontend devs "use whatever". It's like, yeah, no wonder your UI is shit and now you're hiring to clean it up.
We need a plugin to automatically detect AI posts; I'm basically skipping reading or clicking most links now because so much of it is generated word soup.
> Environmental regulations are a win. Unfortunately there is a large segment of the population that doesn't believe something ...
You aren't wrong, but let's be honest: a lot of that is manufacturing simply moving to China and taking the pollution with it. Specific to lead in gasoline, yes, it's great we no longer do that.
Manufacturing output hit an all-time high in the US in 2024.
There are fewer manufacturing jobs, and manufacturing is a smaller share of the total economy because other sectors grew, but it would presumably need to have become genuinely cleaner for industrial pollution to have merely stayed flat despite that output growth.
The switch from coal to gas would be a major cleanup for any process that uses electricity, for example.
I'll notice this with TV documentaries and segments on news channels quite frequently as well. I have the "GeoGuessr gene" and I'm decently well travelled, so I spot this stuff all the time. One particular pet peeve of mine is movies or shows meant to be set in medieval Europe where the "forest" is actually a plantation of North American native trees such as Sitka spruce.
Sorry if this is an easily answerable question, but by "open" does that mean we can download this and use it totally offline, now or in the future once we have capable hardware? Seems like a great thing to archive if the world falls apart (said half-jokingly).
Sure. Someone on /r/LocalLLaMA was seeing 12.5 tokens/s on dual Strix Halo 128GB machines (which would run you $6-8K total?) at 1.8 bits per parameter. It performs far below the unquantized model, so it wouldn't be my personal pick for a one-local-LLM-forever setup, but it is compelling because it has image and video understanding. You lose those features if you choose, say, gpt-oss-120B.
Also, that's with no context, so it would get slower as the context fills (I don't think K2.5 uses the Kimi-Linear KDA attention mechanism, so it's sub-quadratic but not their lowest-cost attention).
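For a rough sense of why ~1.8 bits per parameter is the operative number there, here's a back-of-the-envelope sketch (the ~1T total parameter count is my assumption based on the K2 family, not a quoted spec):

    # Does a ~1T-parameter model at 1.8 bits/param fit in two 128GB Strix Halo boxes?
    total_params = 1.0e12        # assumed total parameter count (K2-family scale)
    bits_per_param = 1.8         # quantization level from the reddit report
    weight_gb = total_params * bits_per_param / 8 / 1e9
    print(f"~{weight_gb:.0f} GB of weights vs. 256 GB of combined memory")
    # -> ~225 GB, so it just barely fits, with little headroom left for KV cache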
Yes, but the hardware to run it decently is gonna cost you north of $100k, so hopefully you and your bunkermates allocated the right amount to this instead of guns or ammo.
Are the software and drivers for networking LLMs across Strix Halo machines there yet? I was under the impression a few weeks ago that it's veeeery early stages and terribly slow.
llama.cpp with rpc-server doesn't require a lot of bandwidth during inference. There is a loss of performance.
For example, using two Strix Halo machines you can get 17 or so tokens/s with MiniMax M2.1 Q6. That's a 229B-parameter model with a ~10B active set (7.5GB at Q6). The theoretical maximum speed with 256GB/s of memory bandwidth would be 34 tokens/s.
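The rough math behind that ceiling, since for a MoE at decode time it's the active-expert weights that have to stream from memory for every token (a sketch using the numbers above; real runs also pay for KV-cache reads and RPC overhead):

    # Bandwidth-bound decode estimate: each generated token streams the active weights once.
    mem_bandwidth_gb_s = 256    # Strix Halo memory bandwidth
    active_weights_gb = 7.5     # ~10B active params at Q6, per the numbers above
    ceiling_tok_s = mem_bandwidth_gb_s / active_weights_gb
    print(f"theoretical ceiling ~ {ceiling_tok_s:.0f} tokens/s")  # ~34
    # Observed ~17 tok/s is roughly half that ceiling once attention, KV cache,
    # and the RPC hop are included.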
You are right, but I think at the moment a lot of people are confusing "software engineering" with "set up my React boilerplate with Tailwind and unit tests", and AI is just way better at that sort of rote thing.
I've never felt comfortable with the devs who just want a Jira ticket spelling out exactly what to do. That's basically what AI/LLMs can do pretty well.