The Gemma 3 models are trained with distillation and achieve superior performance to Gemma 2
for both pre-trained and instruction finetuned versions. In particular, our novel post-training recipe
significantly improves the math, chat, instruction-following and multilingual abilities, making Gemma3-
4B-IT competitive with Gemma2-27B-IT and Gemma3-27B-IT comparable to Gemini-1.5-Pro across
benchmarks. We release all our models to the community.
The Gemma 3 models are trained with distillation and achieve superior performance to Gemma 2 for both pre-trained and instruction finetuned versions. In particular, our novel post-training recipe significantly improves the math, chat, instruction-following and multilingual abilities, making Gemma3- 4B-IT competitive with Gemma2-27B-IT and Gemma3-27B-IT comparable to Gemini-1.5-Pro across benchmarks. We release all our models to the community.
Really cool