I told a bunch of people that elm-benchmark would have these two features in version 2, but that release was already so full that I held them back from its first cut. Now they’re in!
There are two nice benefits here:
- You can examine the data that elm-benchmark collected for you. Humans have really good heuristics for things being off visually, and looking at a plot really kicks the funny-detector on.
- We can now reliably measure “heavy” (read: fewer than 100,000 runs/second) functions. These tend to have big spikes and dips, and chopping off outliers lets us get a better read on what’s actually going on.
As before, you can find out how to use elm-benchmark at http://package.elm-lang.org/packages/BrianHicks/elm-benchmark/latest
Enough talk, now we plot!
This shows a heavy function (calling length on a long list). We’ve found and eliminated a bunch of outliers, so the goodness of fit has recovered nicely. Before this change it was hovering around 80%.
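If you want a feel for why dropping outliers helps the fit, here is a rough sketch in Python (not Elm, and not elm-benchmark's actual code). It assumes each sample is a (runs, elapsed milliseconds) pair, uses Tukey's interquartile-range fences on time per run as a stand-in for whatever rule the library really applies, and measures goodness of fit as R² of a least-squares line. Every name and number in it is made up for illustration.

```python
from statistics import mean, quantiles


def fit_line(points):
    """Least-squares line through (runs, elapsed_ms) points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in points) / sxx
    return slope, my - slope * mx


def goodness_of_fit(points):
    """R squared of the least-squares line: 1.0 means a perfect fit."""
    slope, intercept = fit_line(points)
    my = mean(y for _, y in points)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in points)
    ss_tot = sum((y - my) ** 2 for _, y in points)
    return 1 - ss_res / ss_tot


def drop_outliers(points, k=1.5):
    """Keep points whose time per run lies inside Tukey's IQR fences.

    Assumption: this rule is illustrative, not the library's own.
    """
    per_run = [elapsed / runs for runs, elapsed in points]
    q1, _, q3 = quantiles(per_run, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [p for p, t in zip(points, per_run) if low <= t <= high]


# Hypothetical samples: roughly 2 ms per run with a little jitter,
# plus one big spike at 5 runs.
samples = [(n, n * (2.05 if n % 2 == 0 else 1.95)) for n in range(1, 11)]
samples[4] = (5, 30.0)

kept = drop_outliers(samples)
print("fit before:", goodness_of_fit(samples))
print("fit after: ", goodness_of_fit(kept))
```

A single spike is enough to drag the fit down, and removing it lets the line settle back onto the rest of the data, which is the same effect the plot shows.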
We take the same approach when there are multiple benchmarks in a comparison. In these cases, lower means faster, and we also assign each series a color. I think these colors are reasonable for colorblind folks, but please open an issue if they’re not and I’ll try something else!
You can also see that lighter functions have fewer outliers. The calculations here do not change nearly as much as with heavy functions, but everything gets a little more reliable.
Scale benchmarks get more colors. Right now we only have a few colors, after which the originals will start repeating. If you run into this, you are probably running very large scale benchmarks, which the library is not really designed for. If that describes you, please open an issue.
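The repeating behavior is just what you get from cycling through a fixed palette. Here is a tiny Python sketch of the idea; the palette values and the `series_color` name are hypothetical, not the colors or code elm-benchmark uses.

```python
# Hypothetical three-color palette; the real library's colors differ.
PALETTE = ["#1b9e77", "#d95f02", "#7570b3"]


def series_color(index):
    """Pick a color for the series at `index`, wrapping around the palette."""
    return PALETTE[index % len(PALETTE)]


# With three colors, the fourth series reuses the first color.
colors = [series_color(i) for i in range(5)]
print(colors)
```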