List.map optimization options

A few hours ago I looked at the implementation of List.map, which looks like this:

map : (a -> b) -> List a -> List b
map f xs =
  foldr (\x acc -> cons (f x) acc) [] xs

however, for me it seems that foldr is not tail-call optimizable. So I tried another version:

map f =
    List.foldl (\x acc -> (::) (f x) acc) []
        >> List.reverse

The additional List.reverse does not seem to be a problem on Firefox as the following animation shows:

ezgif.com-gif-maker(1)

On Chromium this is slightly different and the benefits of the foldl version seem to start for lists larger than 500. Below that the original List.map seems to perfom better, but less than 5%. However on Firefox the results seem to be constantly better on average 25%.

Step 2

Only for curiosity, I added a local List.map1 function that does the same as List.map but it uses a Kernel function … The patches look like this:

-- ~/.elm/0.19.1/packages/elm/core/1.0.5/src/List.elm

...

map1 : (a -> result) -> List a -> List result
map1 =
  Elm.Kernel.List.map1

...
// ~/elm/0.19.1/packages/elm/core/1.0.5/src/Elm/Kernel/List.js

...

var _List_map1 = F2(function(f, xs)
{
	for (var arr = []; xs.b ; xs = xs.b) // WHILE_CONSES
	{
		arr.push(f(xs.a));
	}
	return _List_fromArray(arr);
});

...

And boom, on Chromium I got a performance boost up to 250%, seems to be constant, tested for list lengths from 5 to 5000 … On Firefox, the results seem to be similar, however it is dropping from 250% to 60% on lists with a 5000 elements…

Conclusion

I have no idea which optimizations might be performed by the compiler with longer pipes of List.maps, etc. Although I had tested it with pipes also and it seems to be promising. But maybe this is an optimization that can be added to the core libs, not only into List …

Appendix

module Main exposing (..)

import Benchmark exposing (..)
import Benchmark.Runner exposing (BenchmarkProgram, program)

main : BenchmarkProgram
main =
    program suite

map f =
    List.foldl (\x acc -> (::) (f x) acc) []
        >> List.reverse

suite : Benchmark
suite =
    let
        sampleList =
            List.repeat 1000 0 |> List.indexedMap (\i _ -> i)
    in
    Benchmark.compare "Mapping"
        "List.map"
        (\_ -> List.map ((*) 2) sampleList )
        "custom map"
        (\_ -> map ((*) 2) sampleList )
6 Likes

screenshot_10

1 Like

screenshot_1000

1 Like

nice work! :sunny:

I’m using elm-optimize-level-2 for applying this sort of optimizations - I don’t know if you’re aware of the project, but it’s in the same space as the exploration you’re doing, and your work could probably find a home there!

Thanks for mentioning this project, I was playing around with it on an earlier stage, some time ago … This is a good idea, I will have a deeper look and create an issue, maybe they are already using this kind of optimization …

1 Like

Cool! Yeah, an alternative List.map implementation was brought up a little while ago which might be interesting to compare: A faster `List.map` for Elm

That’s currently in elm-optimize-level-2, though I still need to do a release for it.

2 Likes

I have seen, that elm-optimize-level-2 already does this kind of optimization as native F2 function … haven’t been aware of … I will use it definitely in the future …

I played around a bit with different kinds of function definitions, and it turns out (I guess its quite obvious) that the simple pattern :

fn : List a -> List b -> List b
fn input output =
  case input of
    [] -> List.reverse output

    x :: xs ->
      fn xs (doSomething x :: output)

generates the most performant code … I tried this out with different List-functions and for different sizes. You can try this out here, checkout the project GitHub - andre-dietrich/elm-blist and with cd benchmark and elm reactor you can try it out by yourself.

However, I tried to reimplement map, filter, indexedMap, partition, and unzip and I got performance boosts up to 300% and more (at least on Chromium). Firefox and Web as an Safari-alternative performed also better … Well in this case 100% means that it is two times faster compared to the original implementation, so 300% will be 4 times faster… I hope I am not mixing something up … However I did not check the Memory-consumption so far…

If I am not making a horrible mistake somewhere … It would be great (at least from the point of green-IT) to update the elm-core functions accordingly …

filter

1 Like

indexedMap

1 Like

map

1 Like

partition

1 Like

unzip

1 Like

Nice! You might try it out against the implementation in elm-optimize-level-2, which is here:

The elm-opt-2 version is able to get away with not having to do the List.reverse at the end. (though of course we’re in JS land)

2 Likes

I created two composed tests one for the original core functions and one for the elm optimized output

  1. Main
  2. Main

The source-code is here: elm-blist/gh-pages at master · andre-dietrich/elm-blist · GitHub

It seems that elm-opt-2 is performing better (the core function in this case are the optimized functions):

Note: I used elm-optimize-level-2 to generate the code, but after looking on the generated source-code, I could not find the new List-core functions, so I added them manually.