Re: Performance Optimization - Running in optimized mode

A few days ago @jxxcarlson wrote the thread Performance Optimization, about how a small change in the way equality is written in the source code had a huge impact on performance for a function that was called many, many times.

The example given was the following:

Before

    ensureNonBreakingSpace : Char -> Char
    ensureNonBreakingSpace char =
        if char == ' ' then
            nonBreakingSpace

        else
            char

After

    ensureNonBreakingSpace : Char -> Char
    ensureNonBreakingSpace char =
        case char of
            ' ' ->
                nonBreakingSpace
    
            _ ->
                char

The explanation was that the comparison done in the first example called _Utils_eqHelp, which is a very slow function compared to a simple === comparison in JavaScript.
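To give an idea of the cost difference, here is a simplified sketch of what a structural equality helper in the spirit of _Utils_eq / _Utils_eqHelp has to do, compared to a single === comparison. This is my own illustration, not the actual kernel code:

```javascript
// Simplified sketch of structural equality (NOT the real _Utils_eq).
// It has to recurse through objects, while === on two primitives is a
// single machine-level comparison.
function structuralEq(x, y) {
  if (x === y) {
    return true; // fast path: equal primitives or the same reference
  }
  if (x instanceof String) {
    // a Char compiled without --optimize behaves like a String object,
    // so it has to be unwrapped before comparing
    return y instanceof String && x.valueOf() === y.valueOf();
  }
  if (typeof x !== "object" || x === null || typeof y !== "object" || y === null) {
    return false;
  }
  // recurse over every field of the two objects
  for (const key of Object.keys(x)) {
    if (!structuralEq(x[key], y[key])) {
      return false;
    }
  }
  return true;
}

console.log(structuralEq({ a: 1, b: { c: 2 } }, { a: 1, b: { c: 2 } })); // true
console.log(structuralEq(new String(" "), new String(" ")));            // true
console.log(new String(" ") === new String(" "));                       // false
```

Even with the fast path, every call pays for the extra checks and possible recursion, which adds up in a function called very often.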

I have looked at the compiled output, and I am very confused about the result and explanations, because from what I can see, this only applies when Elm is compiled without --optimize.

Here is the JS output for both code versions, with and without --optimize.

// Before version, without --optimize
var $author$project$Review$Rule$ensureNonBreakingSpace = function (_char) {
	return _Utils_eq(
		_char,
		_Utils_chr(' ')) ? $author$project$Review$Rule$nonBreakingSpace : _char;
};

// Before version, with --optimize
var $author$project$Review$Rule$ensureNonBreakingSpace = function (_char) {
	return (_char === ' ') ? $author$project$Review$Rule$nonBreakingSpace : _char;
};

// After version, without --optimize
var $author$project$Review$Rule$ensureNonBreakingSpace = function (_char) {
	if (' ' === _char.valueOf()) {
		return $author$project$Review$Rule$nonBreakingSpace;
	} else {
		return _char;
	}
};

// After version, with --optimize
var $author$project$Review$Rule$ensureNonBreakingSpace = function (_char) {
	if (' ' === _char) {
		return $author$project$Review$Rule$nonBreakingSpace;
	} else {
		return _char;
	}
};

Without --optimize, there is a big difference in the output, and I believe there is a huge difference in performance. But with --optimize, both versions are almost identical and neither uses _Utils_eq.

I know someone who tried using this type of comparison in a parser and saw a 16x performance boost, but they were benchmarking without --optimize. This sparked discussions about how we could build tooling to improve performance by transforming the Elm code or the JS output. Once we ran with --optimize, however, there was no difference. I am now wondering whether the comments in that thread and the related benchmarks were based on the non-optimized output.

I would really like to understand whether writing code like the “after” example would improve the performance of the code I write. As the author of a tool that I’d like to make as fast as possible, this is valuable information. I am guessing there are situations where changing how the Elm code is written makes a difference in the output, but I don’t think this is one of them.

I would love to have misunderstood something, and for there to be easy performance wins, so please let me know if I did.


That’s what happens when I run it myself too, I guess. In general, if you want to draw any conclusions about performance, you must use --optimize; it really changes performance, and especially for Char.

The general principle still holds: type information is not available during code generation, which means that unless it’s clear from the AST that === is safe (because either argument is a literal), Elm will emit the custom eq instead of ===.

Without --optimize, a Char is really a wrapper around a String. Conceptually:

type Char = Char String

-- the API ensures that it's always (sort of; Unicode, you know)
-- a one-character string

Thus neither argument would be a literal. But with --optimize this kind of single-field wrapper type is removed (this happens in general, though it’s hardcoded for Char), and then during code generation we actually do have a literal. The specialization between normal and --optimize mode happens here on the JS side and here in the compiler, and the generation of == happens here.

So the slowdown disappears when using --optimize. But in this case the problem was so bad that it influenced development, so the fix still has value here, I think. It also shows how easy it is to forget to turn --optimize on, especially during development; e.g. elm reactor and Ellie don’t support it.

That makes more sense to me. Thank you for the explanation and the links, those were very interesting 🙂
I figured the type information was not available, as several people said, but I didn’t understand how the change still managed to improve the optimized output.

I have tried a version where the compared value is a Char defined as a separate top-level value:

ensureNonBreakingSpace : Char -> Char
ensureNonBreakingSpace char =
    if char == someChar then
        nonBreakingSpace

    else
        char


someChar : Char
someChar =
    ' '

and the resulting JS for the function does use _Utils_eq:

var $author$project$Review$Rule$ensureNonBreakingSpace = function (_char) {
	return _Utils_eq(_char, $author$project$Review$Rule$someChar) ? $author$project$Review$Rule$nonBreakingSpace : _char;
};

So, to summarize my understanding and learnings:

  1. The compiler will optimize equality when you compare against literal values (as with pattern matching), but not when you compare against variables or expressions, because at that point it no longer has the values’ type. So there are cases where this optimization can still make sense, just not this one.

  2. You should run benchmarks in --optimize mode when trying to make performance improvements. If you notice lag during development, try running with --optimize; things may be much faster. Maybe our tooling could offer the option to run with --optimize to make that easier.

I’ll add an example of how fast the optimized output is: in elm-review, we take the user’s configuration in Elm, compile it, then load and run the resulting application (which, put simply, returns a bunch of information about the project’s source code). Even though compiling with --optimize takes longer, the whole process ends up much faster than without it: something like 1.5s vs 3s for small projects, with the gap increasing with the size of the analyzed project.

