
GPT-4 reportedly runs on 8 x 220B params [1], so each constituent model is about 220B params(?). Local LLMs can be good for some tasks, but they are much slower and less capable than the size of model and hardware that OpenAI brings to their APIs. Even running a 7B model on the CPU in GGML is much slower than the gpt-3.5-turbo API, in my experience on a 12th-gen Intel i7 laptop (a rough timing sketch follows the footnote).

[1] GPT-4 is 8 x 220B params ≈ 1.76T params: https://news.ycombinator.com/item?id=36413296
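
A minimal sketch of how you might measure that gap yourself, assuming llama-cpp-python and the pre-1.0 openai Python client; the model path, thread count, and API key are placeholders, and the "usage" fields follow those libraries' response formats:

    # Rough tokens/sec comparison: local quantized 7B GGML on CPU vs gpt-3.5-turbo.
    # The model path and API key below are placeholders.
    import time
    import openai
    from llama_cpp import Llama

    PROMPT = "Explain what a hash table is in two sentences."

    # Local: quantized 7B model, CPU only.
    llm = Llama(model_path="./models/7B/ggml-model-q4_0.bin", n_threads=8)
    t0 = time.time()
    local = llm(PROMPT, max_tokens=128)
    print(f"local: {local['usage']['completion_tokens'] / (time.time() - t0):.1f} tok/s")

    # API: gpt-3.5-turbo via the pre-1.0 openai client.
    openai.api_key = "sk-..."  # placeholder
    t0 = time.time()
    api = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": PROMPT}],
        max_tokens=128,
    )
    print(f"api:   {api['usage']['completion_tokens'] / (time.time() - t0):.1f} tok/s")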



It's been well documented by now that the number of parameters does not necessarily translate to a better model. My guess is that OpenAI has learned a thing or two from the endless papers published daily, and that your "instance" of the model is not what it seems. They likely have a workflow that picks the most suitable model for your prompt. Some people may get a 13B variant because it is "good enough" to produce a common answer to a common prompt. Why waste precious compute on a prompt that is common? Would it not be feasible to collect the top worldwide prompts and train a small model that can answer them? Why would OpenAI spend precious compute time on the typical user's "write a short story about..."?

I would guesstimate that the great majority of prompts are trash. People playing with a toy and amusing themselves. The platform sends those to the trash models.

For the other tiny percentage who produce a prompt the size of a paragraph, using the techniques published by OpenAI themselves, they likely get the higher-tier models. This is also why I believe many are recently complaining about the quality of the outputs: when your chat history is filled with "have a waifu pretend to be my girlfriend", whatever memory the model maintains will be poisoned by the quality of your past prompts.

Garbage in, garbage out. I am certain that the #1 priority for OpenAI/Microsoft is lowering the cost of each prompt while satisfying the majority.

The majority is not on HN.


> It's been well documented by now that the number of parameters does not necessarily translate to a better model.

That's certainly true, but it's hard to deny the quality of GPT-4. If the issue is the training data, let's just use their training data; it's not like they had to close up shop for using restricted data.

I think the issue is more on the financial side: it must have been extremely expensive to train GPT-4. Open-source projects don't have that kind of money right now.

I'll finance open-source models once they are actually good, or show a realistic promise of reaching that level of quality on consumer hardware. Until then, open source will keep doing what open source does.

I've never bought any kind of subscription or paid API costs to OpenAI, but if GPT-4 finally reaches the point where I feel it's a lot better than just good enough, I'll happily pay for it (while still keeping an eye out for open-source models that fit my hardware).


Picking the best model based on the prompt seems to be the best way to simplify the task they are doing.


It does seem like a good approach, though it implies they can understand the context of the prompt being entered. Has anyone tackled this kind of context-sensitive model routing? It sounds useful, but likely not straightforward.
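
For what it's worth, the routing idea is easy to prototype with cheap heuristics. Here's a toy sketch; it is purely illustrative, not how OpenAI actually does it, and the tier names, patterns, and thresholds are made up:

    # Toy prompt router: cheap heuristics decide which model tier serves a prompt.
    # Tier names and patterns are hypothetical; a real system would likely use a
    # learned classifier over the prompt (and possibly the chat history).
    import re

    CASUAL_PATTERNS = [
        r"write (me )?a (short )?(story|poem)",
        r"pretend to be",
        r"tell me a joke",
    ]

    def pick_tier(prompt: str) -> str:
        """Route a prompt to a model tier."""
        p = prompt.lower()
        if any(re.search(pat, p) for pat in CASUAL_PATTERNS):
            return "small-13b"   # common/casual prompts -> cheap model
        if len(p.split()) > 150 or "```" in prompt:
            return "large"       # long or code-heavy prompts -> big model
        return "medium"          # everything else

    if __name__ == "__main__":
        for prompt in ["write a short story about a cat",
                       "Prove that the halting problem is undecidable."]:
            print(pick_tier(prompt), "<-", prompt)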


https://mpost.io/phi-1-a-compact-language-model-outpaces-gpt...

A 1-billion-parameter model beats the 175-billion-parameter GPT-3.5.

OpenAI wants us all to drink the kool-aid.



