Wasn't there a tool-calling benchmark by the Docker guys which concluded that Qwen models are nearly as good as GPT? What is your experience with it?
Personally, I am convinced JSON is a bad format for LLMs and that code orchestration in a Python-ish DSL is the future. But local models are pretty bad at code gen too.
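To make the contrast concrete, here is a rough sketch of the two styles. The tools (`get_weather`, `send_message`) and both runners are made up for illustration and are not from any particular framework or benchmark.

```python
import json

# Hypothetical tools, invented for this example.
def get_weather(city: str) -> int:
    return {"Paris": 18, "Oslo": 7}[city]

def send_message(text: str) -> str:
    return f"sent: {text}"

TOOLS = {"get_weather": get_weather, "send_message": send_message}

# Style 1: JSON tool calls. One call per model turn, and the model has to
# copy intermediate results into the next call's arguments itself.
def run_json_call(raw: str):
    call = json.loads(raw)
    return TOOLS[call["name"]](**call["arguments"])

# Style 2: the model emits a short Python-ish script. Variables and
# comprehensions carry intermediate results without extra round trips.
def run_script(source: str) -> dict:
    # Emptying __builtins__ only limits name lookup; it is NOT a real sandbox.
    env = {"__builtins__": {}, "min": min, **TOOLS}
    exec(source, env)
    return env

json_result = run_json_call(
    '{"name": "get_weather", "arguments": {"city": "Oslo"}}'
)

script = """
temps = {c: get_weather(c) for c in ["Paris", "Oslo"]}
coldest = min(temps, key=temps.get)
result = send_message(coldest)
"""
script_result = run_script(script)["result"]
```

The script version does three tool calls plus a comparison in one generation, where the JSON version would need a turn per call. The catch is the one already mentioned: the model has to write correct code for this to pay off.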