In planning future versions of elm-benchmark, I wonder what the most useful information from a benchmarking run is. I suspect that it may be most useful to provide one metric which says “it’s about this fast” and one which says “it will probably be about this fast in the future.”
I’m currently thinking that in future versions, we may change to:
- *it’s about this fast* is the median of runs in the current population. To my mind, this is the best indicator of the current population, since it’s actually present! (See the footnote for a caveat, though.)
- *it will probably be about this fast* is a prediction interval. In summary: “given the current population, we are 95% confident that a new run would fall between this upper and lower bound.”
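To make the two metrics concrete, here is a rough sketch in plain JavaScript (elm-benchmark itself is Elm; this just illustrates the statistics, and uses a normal approximation rather than the exact t-distribution):

```javascript
// "it's about this fast": the median of the observed runs.
function median(runs) {
  const sorted = [...runs].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[mid];
}

// "it will probably be about this fast": a 95% prediction interval.
// z = 1.96 is the normal approximation; fine for the hundreds of runs
// a benchmark typically collects.
function predictionInterval95(runs) {
  const n = runs.length;
  const mean = runs.reduce((a, b) => a + b, 0) / n;
  const variance =
    runs.reduce((acc, x) => acc + (x - mean) ** 2, 0) / (n - 1);
  const s = Math.sqrt(variance);
  // The sqrt(1 + 1/n) factor accounts for the spread of a *new* point,
  // not just the uncertainty in the mean (which would give a narrower
  // confidence interval instead).
  const halfWidth = 1.96 * s * Math.sqrt(1 + 1 / n);
  return [mean - halfWidth, mean + halfWidth];
}
```

The difference between the two intervals matters: a confidence interval shrinks toward zero width as you collect more runs, while a prediction interval stays about as wide as the underlying noise, which is what makes it a reasonable answer to “how fast will it be next time?”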
I’m interested in two questions:
- Current users of elm-benchmark: would this be intuitive and useful for you? How are you interpreting the data, currently?
- Current users of benchmarking tools in other languages: what stands out to you in those tools as especially helpful? Especially bad?
Footnote: What’s a run?
Runs are technically the mean time of a bunch of function executions. This is because browsers’ performance.now used to offer 50µs resolution, but the resolution is now much coarser because of the response to the Spectre vulnerabilities. MDN has the deets, as always. By taking the mean, we can measure performance below the resolution threshold, no matter where it lies, as long as our total sampling time is significantly above the browsers’ resolution.
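The trick can be sketched in a few lines of JavaScript. This is not elm-benchmark’s implementation, and `timePerExecution` is a hypothetical helper; it just shows how averaging a batch gets you under the timer’s resolution:

```javascript
// Time a whole batch of calls with performance.now, then divide by the
// batch size to estimate the per-call cost.
// (In Node, `performance` is a global; in browsers it lives on `window`.)
function timePerExecution(fn, count) {
  const start = performance.now();
  for (let i = 0; i < count; i++) {
    fn();
  }
  const elapsed = performance.now() - start; // ms for the whole batch
  // Even if `elapsed` is only resolved to, say, 0.1ms, dividing by a
  // large `count` yields a per-call estimate well below that resolution.
  return elapsed / count;
}
```

The important knob is `count`: it has to be large enough that the total elapsed time is comfortably above the clock’s resolution, or the quantization error dominates the estimate.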