Elm-benchmark 2.0.0

elm-benchmark 2.0.0 is out! Installation instructions are in the README. This is a recommended upgrade for all users of elm-benchmark; it fixes a number of bugs from the 1.x releases as well as introducing some new goodies.

Much Nicer UI

A picture says a thousand words, so…

[Image: the new elm-benchmark UI]

Simplified API

Before, we had an API inspired by Json.Decode.map2..8. It turns out that this made it really easy to get yourself into bad situations! For example, you could create a comparison of two groups, or of a group and a single benchmark. This didn’t make any sense, and nonsensical comparisons would just dump JSON on the user. This got confusing! So I redid the API to be closer to elm-test:

let
    target =
        Dict.singleton "a" 1
in
Benchmark.benchmark "Dict.get" <|
    \_ -> Dict.get "a" target

Now everything in the given function will be benchmarked. This has a slight overhead, but that’s ok! elm-benchmark has always favored consistency over accuracy, and this smooths out a lot of the rough edges where you would get inconsistent or confusing results.
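To show where that snippet lives, here is a minimal sketch of a complete benchmark file. It assumes the `describe` grouping function and the `Benchmark.Runner.program` entry point as described in the README; the module layout and the `suite` name are just illustrative choices.

```elm
module Main exposing (main)

import Benchmark exposing (Benchmark, benchmark, describe)
import Benchmark.Runner exposing (BenchmarkProgram, program)
import Dict


suite : Benchmark
suite =
    let
        -- Built once, outside the benchmarked function,
        -- so the setup cost is not part of the measurement.
        target =
            Dict.singleton "a" 1
    in
    describe "Dict"
        [ benchmark "get" <|
            -- Everything inside this function is what gets timed.
            \_ -> Dict.get "a" target
        ]


main : BenchmarkProgram
main =
    program suite
```

Anything you do not want measured (like building `target`) goes in the `let`, outside the `\_ -> ...` function.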

Next, compare has changed slightly:

Benchmark.compare "initialize"
    "Hamt"
    (\_ -> Hamt.initialize 100 identity)
    "core"
    (\_ -> Array.initialize 100 identity)

We also have a way to compare a series of benchmarks, scale. Let’s benchmark Array.Hamt.initialize at sizes from 10^0 through 10^5 (six sizes in total):

List.range 0 5
    |> List.map ((^) 10)
    |> List.map (\n -> ( toString n, \_ -> Hamt.initialize n identity ))
    |> Benchmark.scale "Hamt.initialize"

Be careful with this one, as it can create very heavy benchmarks! We will make sure you get good results, but it can take a while! If you end up using scale in any significant way, please open an issue or let me know some other way. I’m not 100% sure this is the best API for this, and I want to improve it based on actual use.

New Sampling Methods

Running a benchmark twice should take twice as long as running it once. This means that we have a dependent variable (runtime) and an explanatory variable (sample size), and we can combine those to make a trend line that’s resilient to outliers and outside interference!

You mostly don’t need to care about this, except that it means we only need to present two numbers: the runs per second, and how well our trend line fits the data. This makes measurements both more consistent and more accurate, and it lets us simplify the UI.
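If you are curious how two numbers can fall out of a trend line, here is a rough sketch of the idea. This is not elm-trend's code or elm-benchmark's internals: it is a plain ordinary least-squares fit, and the `TrendSketch` module, the `Fit` record, and the assumption that runtimes are in milliseconds are all made up for illustration.

```elm
module TrendSketch exposing (Fit, fit, runsPerSecond)


type alias Fit =
    { slope : Float
    , intercept : Float
    , rSquared : Float
    }


mean : List Float -> Float
mean xs =
    List.sum xs / toFloat (List.length xs)


{-| Fit `runtime = slope * sampleSize + intercept` to a list of
( sampleSize, runtime ) points. Gives `Nothing` when there are fewer
than two points or when every sample size is the same.
-}
fit : List ( Float, Float ) -> Maybe Fit
fit points =
    let
        xs =
            List.map Tuple.first points

        ys =
            List.map Tuple.second points

        meanX =
            mean xs

        meanY =
            mean ys

        -- Sums of squared deviations and cross deviations.
        sxx =
            List.sum (List.map (\x -> (x - meanX) ^ 2) xs)

        syy =
            List.sum (List.map (\y -> (y - meanY) ^ 2) ys)

        sxy =
            List.sum (List.map (\( x, y ) -> (x - meanX) * (y - meanY)) points)
    in
    if List.length points < 2 || sxx == 0 then
        Nothing
    else
        let
            slope =
                sxy / sxx
        in
        Just
            { slope = slope
            , intercept = meanY - slope * meanX
            , rSquared =
                -- Perfectly flat data is trivially a perfect fit.
                if syy == 0 then
                    1
                else
                    (sxy * sxy) / (sxx * syy)
            }


{-| The slope is milliseconds per run (an assumption of this sketch),
so runs per second is 1000 divided by it.
-}
runsPerSecond : Fit -> Float
runsPerSecond { slope } =
    1000 / slope
```

For example, `fit [ ( 1, 2 ), ( 2, 4 ), ( 3, 6 ) ]` gives a slope of 2, an intercept of 0 and an `rSquared` of 1, which `runsPerSecond` turns into 500 runs per second. Least squares like this is sensitive to outliers, which is exactly why a robust estimator is attractive.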

We’re using elm-trend under the covers. Right now it’s using the quick estimator, but in future versions I want to get us back to using the robust estimator (it’s better against outliers).


This is good work and I thank you for your effort :yellow_heart: :yellow_heart: :yellow_heart: :yellow_heart: :yellow_heart:
