I think they also based their expectations on release cycles and update speed. Anthropic is known for a more conservative release cycle and incremental updates. Google, on the other hand, has accelerated recently. It also seems that other actors are better at benchmark cheating ;)