
I'm confused why there is a 7b and an 8b version: https://ollama.com/library/deepseek-r1/tags



These are distillation fine-tunes of two different models:

- Qwen2.5 7B
- Llama3.1 8B

Though the sizes are similar, they will probably have different strengths and weaknesses based on their lineage.


Thanks.

I'm running the Qwen distillation right now and it's amazing.



