Thank you!

Do you know if these actually preserve the structure of Gemma 3n that makes the model more memory efficient on consumer devices? I feel like the modified inference architecture described in the article is what makes this possible, but it probably needs additional software support.
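One rough way to check, as a sketch: pull the repo's config.json and see whether it still identifies as a gemma3n model and keeps the architecture-specific fields, rather than looking like a plain re-export. (The key names I'm grepping for below are guesses on my part, not confirmed against this checkpoint.)

    # pip install huggingface_hub
    import json
    from huggingface_hub import hf_hub_download

    # Fetch the converted checkpoint's config from the Hub.
    path = hf_hub_download("mlx-community/gemma-3n-E4B-it-bf16", "config.json")
    with open(path) as f:
        config = json.load(f)

    # A faithful conversion should still report a gemma3n model type and keep
    # fields tied to the memory-saving tricks (per-layer embeddings, shared
    # KV layers); the exact key names here are assumptions.
    print(config.get("model_type"))
    for key in sorted(config):
        if "per_layer" in key or "shared" in key:
            print(key, "=", config[key])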

But given that they were uploaded a day ago (together with the blog post), maybe these are actually the real deal? In that case, I wish Google could just link to these instead of to https://huggingface.co/mlx-community/gemma-3n-E4B-it-bf16.

Edit: Ah, these are just non-MLX models. I might give them a try, but they're not what I was looking for. Still, thank you!

That's a great question, and unfortunately beyond my technical competency in this area. I fired up LM Studio when I saw this HN post and noticed it had updated its MLX runtime [0] for gemma3n support, then went looking for an MLX version of the model and found that one.

[0]: https://github.com/lmstudio-ai/mlx-engine
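For anyone who wants to poke at it outside LM Studio, here's a minimal sketch using the mlx-lm Python package (assuming your installed version is new enough to know the gemma3n architecture):

    # pip install mlx-lm
    from mlx_lm import load, generate

    # Downloads the converted weights from the Hub on first use.
    model, tokenizer = load("mlx-community/gemma-3n-E4B-it-bf16")

    prompt = "Explain what makes Gemma 3n memory efficient."
    if tokenizer.chat_template is not None:
        # Wrap the prompt in the model's chat format.
        prompt = tokenizer.apply_chat_template(
            [{"role": "user", "content": prompt}],
            add_generation_prompt=True,
        )

    print(generate(model, tokenizer, prompt=prompt, max_tokens=200))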