Hi. This question may be more about JavaScript than Elm, but I think the answer could help improve the Elm compiler.
I want to know why fibTopLevel performs better than fibLetIn on Google Chrome and Firefox (probably on other browsers too).
fibLetIn n =
    let
        helper m acc acc2 =
            case m of
                0 ->
                    acc

                1 ->
                    acc2

                _ ->
                    helper (m - 1) acc2 (acc + acc2)
    in
    helper n 0 1
fibHelper n acc acc2 =
    case n of
        0 ->
            acc

        1 ->
            acc2

        _ ->
            fibHelper (n - 1) acc2 (acc + acc2)


fibTopLevel n =
    fibHelper n 0 1
The only difference between them is whether the helper function is bound at the top level or in a let … in expression. The generated JavaScript is also almost the same:
var $author$project$Main$fibLetIn = function (n) {
    var helper = F3(
        function (m, acc, acc2) {
            helper:
            while (true) {
                switch (m) {
                    case 0:
                        return acc;
                    case 1:
                        return acc2;
                    default:
                        var $temp$m = m - 1,
                            $temp$acc = acc2,
                            $temp$acc2 = acc + acc2;
                        m = $temp$m;
                        acc = $temp$acc;
                        acc2 = $temp$acc2;
                        continue helper;
                }
            }
        });
    return A3(helper, n, 0, 1);
};
var $author$project$Main$fibHelper = F3(
    function (n, acc, acc2) {
        fibHelper:
        while (true) {
            switch (n) {
                case 0:
                    return acc;
                case 1:
                    return acc2;
                default:
                    var $temp$n = n - 1,
                        $temp$acc = acc2,
                        $temp$acc2 = acc + acc2;
                    n = $temp$n;
                    acc = $temp$acc;
                    acc2 = $temp$acc2;
                    continue fibHelper;
            }
        }
    });

var $author$project$Main$fibTopLevel = function (n) {
    return A3($author$project$Main$fibHelper, n, 0, 1);
};
The generated code is extremely similar. Maybe the nested function definition or the size of the outer function is tripping up the JITs. Very hard to debug!
You may be measuring the performance of the garbage collector here. The amount of work the fib function does is incredibly tiny: it manages to calculate the 100th Fibonacci number nearly 6 million times per second! On a 3 GHz machine, that would mean around 500 CPU cycles per call, which works out to approximately 5 cycles per loop body. Considering most instructions take more than 1 cycle to complete, this should already be as fast as you can get in a “naive” way (Elm is not exactly known for being close to the metal; there is not much we can do).
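To spell that arithmetic out (a quick sanity check using the figures above, and treating fib 100 as roughly 100 loop iterations):

cyclesPerCall : Float
cyclesPerCall =
    -- 3 GHz divided by ~6 million calls per second ≈ 500
    3.0e9 / 6.0e6


cyclesPerLoopBody : Float
cyclesPerLoopBody =
    -- ~500 cycles spread over ~100 loop iterations ≈ 5
    cyclesPerCall / 100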
I made a similar comparison using Benchmark.scale here, but for the sake of not running into Infinity, I chose the harmonic instead of the Fibonacci series (it works with your original functions as well; I just wanted to avoid floating-point weirdness).
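The setup was roughly like the following (a sketch rather than the exact code behind the link; the function names and the sizes are illustrative):

import Benchmark exposing (Benchmark)
import Benchmark.Runner exposing (BenchmarkProgram, program)


harmonicHelper : Float -> Float -> Float
harmonicHelper n acc =
    if n <= 0 then
        acc

    else
        harmonicHelper (n - 1) (acc + 1 / n)


harmonicTopLevel : Float -> Float
harmonicTopLevel n =
    harmonicHelper n 0


harmonicLetIn : Float -> Float
harmonicLetIn n =
    let
        helper m acc =
            if m <= 0 then
                acc

            else
                helper (m - 1) (acc + 1 / m)
    in
    helper n 0


suite : Benchmark
suite =
    let
        sizes =
            [ 1, 10, 100, 1000, 10000 ]

        scaled name f =
            Benchmark.scale name
                (List.map (\n -> ( String.fromInt n, \() -> f (toFloat n) )) sizes)
    in
    Benchmark.describe "harmonic"
        [ scaled "top level" harmonicTopLevel
        , scaled "let in" harmonicLetIn
        ]


main : BenchmarkProgram
main =
    program suite

Benchmark.scale runs the same function at several input sizes, which is what makes it possible to see where the timings start growing linearly with the work done.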
Two things stood out:

- It takes a while until the runs/second behave like you would expect: runtime does not seem to increase 10x when there are fewer than 1000 iterations happening.
- Starting at around that same point, the runs/second for both versions are roughly the same.
So when doing fewer than 1000 runs, we do not seem to be benchmarking our function much at all; it's pretty much just the overhead of calling things in JavaScript. The one difference is that the fibLetIn version also allocates a new function object (a “closure”) on the heap on every call, which seems to take the majority of the time when n is small.
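If you want to keep the let but pay for the closure only once, one option is to evaluate the let when the top-level value is initialized and return the wrapper function from it (a sketch; fibLetInHoisted is an illustrative name, and whether the JITs then treat it like the top-level version is something you would have to measure):

fibLetInHoisted : Int -> Int
fibLetInHoisted =
    let
        helper m acc acc2 =
            case m of
                0 ->
                    acc

                1 ->
                    acc2

                _ ->
                    helper (m - 1) acc2 (acc + acc2)
    in
    -- The let is evaluated once, when this top-level value is
    -- initialized, so the closure for helper should be allocated
    -- a single time rather than on every call.
    \n -> helper n 0 1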
Internally, elm-explorations/benchmark basically does the same thing @jreusch did, but it’s hard to “tune” that scaling factor at the library level because you have three basic kinds of benchmarks:
- math stuff (fib, harmonic)
- data structures (think tuning Array.get or Dict.get for performance)
- people's app functions (update or similar)

elm-explorations/benchmark is currently tuned to give good results for data structures, since that's the primary optimization opportunity core team members see in Elm at the moment (cf. Robin's work to optimize Array for 0.19.0).
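For reference, a benchmark in that second category might look like this (a sketch; the collection size of 1000 and the index 500 are arbitrary):

import Array exposing (Array)
import Benchmark exposing (Benchmark)
import Dict exposing (Dict)


arr : Array Int
arr =
    Array.fromList (List.range 0 999)


dict : Dict Int Int
dict =
    Dict.fromList (List.map (\n -> ( n, n )) (List.range 0 999))


getSuite : Benchmark
getSuite =
    -- compare runs both thunks and reports their relative speed
    Benchmark.compare "get at index 500"
        "Array.get"
        (\() -> Array.get 500 arr)
        "Dict.get"
        (\() -> Dict.get 500 dict)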
That means when you say…
So when doing fewer than 1000 runs, we do not seem to be benchmarking our function much at all; it's pretty much just the overhead of calling things in JavaScript.
… the benchmark harness should be taking care of that for you. In this case, it looks like we start to get on a better track for a linear fit around 128 iterations of harmonic; that is, doubling the work done doubles the runtime.
I think there may be an opportunity to find a better algorithm for sample size. If you want to explore this yourself, Benchmark.LowLevel exposes the functions you’d need to create a new runner.
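A rough starting point could look like the following; this assumes Benchmark.LowLevel's operation and sample work the way I remember from the docs, so double-check the signatures before relying on it:

import Benchmark.LowLevel as LowLevel exposing (Operation)
import Task exposing (Task)


fibOperation : Operation
fibOperation =
    -- wrap the code under test (fibTopLevel from above) as an Operation
    LowLevel.operation (\() -> fibTopLevel 100)


-- Run the operation n times; the Task should resolve to the total
-- elapsed time in milliseconds.
timeRuns : Int -> Task LowLevel.Error Float
timeRuns n =
    LowLevel.sample n fibOperation

A custom runner could then grow n until the total runtime crosses some threshold and derive runs/second from the resulting fit.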
I haven’t done this myself, though, since it’s a lot of work to make sure the math checks out. A bad benchmark can be really misleading. I have also put my work on elm-explorations/benchmark on hold because the time spent in DOM stuff far outweighs algorithmic complexity stuff for most apps.
Oh, also: when you see less than 99% goodness of fit and the benchmark completes very quickly, it's a good indication that tuning the benchmark may give more accurate results.