You're right, using a doubly-recursive algorithm [1] for `fib` is a terribly naive and uncharacteristic way to write it in any language, including Julia. But it's a wonderful proxy for the cost of a function call. It's also quite scientific: there is an unambiguous test for whether an implementation is correct, and every language must use the same doubly-recursive scheme.
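To make the "proxy for the cost of a function call" point concrete, here is a minimal sketch of the doubly-recursive scheme. It is written in Python purely for illustration and is not the benchmark suite's actual code; `fib_calls` is a hypothetical helper added only to count how many calls the scheme makes.

```python
def fib(n):
    """Doubly-recursive Fibonacci: every call with n >= 2 spawns two more calls."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def fib_calls(n):
    """Hypothetical helper: total number of calls fib(n) makes, itself included."""
    return 1 if n < 2 else 1 + fib_calls(n - 1) + fib_calls(n - 2)


if __name__ == "__main__":
    print(fib(20))        # value computed by the naive scheme
    print(fib_calls(20))  # number of function calls needed to compute it
```

Each call does only a comparison and at most one addition, so under this scheme the running time should be dominated by call overhead rather than arithmetic.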
It all depends on what you want to measure. The whole point of the micro-benchmark suite is to test very specific language primitives. I'd argue that the current set of benchmarks is more valuable for that than an "expert" implementation would be, since the latter may end up simply testing the cleverness or resourcefulness of the expert.
The issue of primitive performance seems to be buried behind how the algorithms are implemented, which BLAS is running, compilation for the server architecture, etc. One could instead measure the performance of 'very specific language primitives' directly, on those primitives themselves. Stripping out confounding factors feels fundamental.
Of course, such a benchmark might not have the same marketing hue as claiming that Julia is 553 times faster than Matlab at parsing an integer, for example.
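As a rough sketch of what measuring a primitive directly could look like, here is integer parsing timed in isolation with Python's standard `timeit` module. The helper name `time_per_call`, the input string, and the loop counts are all assumptions chosen for illustration; nothing here comes from the benchmark suite under discussion.

```python
import timeit


def time_per_call(stmt, setup="pass", number=100_000, repeat=5):
    """Hypothetical helper: best-of-`repeat` time in seconds for one run of `stmt`."""
    best = min(timeit.repeat(stmt, setup=setup, number=number, repeat=repeat))
    return best / number


# Time the primitive itself (parsing a decimal string) ...
parse_cost = time_per_call("int(s)", setup="s = '12345'")
# ... and a near-empty statement, to estimate the timing loop's own overhead.
baseline = time_per_call("s", setup="s = '12345'")

if __name__ == "__main__":
    print(f"int(s): {parse_cost * 1e9:.1f} ns per call")
    print(f"loop overhead: {baseline * 1e9:.1f} ns per call")
```

Taking the minimum over several repeats is one common way to reduce noise from other processes, and subtracting the baseline gives a crude estimate of the primitive's own cost.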
1. https://github.com/JuliaLang/julia/blob/64409a0cae8b52d3f795...