Faster `Dict.intersect`

Hey Elm community, I did a fast!
How fast? Much fast! Very wow!

I wrote a faster implementation for Dict.intersect (currently living at elm-generic-dict/Intersect.elm at c4ed642d0bdfd82e35a6b74092763a741ee111b5 · miniBill/elm-generic-dict · GitHub).

With small dictionaries of approximately the same size this code is either the same speed or ~10% faster than the one in elm/core.

With larger dictionaries, especially if the right one is much smaller, it can be up to 40x faster (not 40%, 40x). If the right dictionary is empty, my code runs in constant time, whereas elm/core takes time proportional to the size of the left dictionary (left and right refer to the first and second argument: Dict.intersect left right).

What’s the catch? The catch is that the code in elm/core is one line and obviously correct, my approach is approximately 100 lines and would require some more comments to convince the reader about correctness.

Does it make sense to try and get it merged in elm/core? Maybe! It is a tradeoff, more code for (MUCH) better speed. In that repository I also have other tradeoffs (like a 16 line version which is faster than core but slower than this and it’s probably more palatable for elm/core).

Could it be even faster? Maybe! One approach would be to try and special-case empty leaves in the case statements (this would need benchmarking), the other would be to use kernel code for mutating the accumulator and final build process (the code only prepends to a list, then consumes it in order).

I’m considering publishing it as part of a fast-dictionaries package, but I’d like to gather community feedback before going ahead. I think it should be possible to also make diff and union faster, and I’m definitely eyeing keeping track of the size to make it O(1).

Thanks to the many people who already gave feedback and helped me make this FAST, in no particular order @robin.heggelund @jfmengels @avh4 @Janiczek and everyone else I’ve forgotten about.

EDIT: if forks of Elm want to experiment with the code go right ahead! The code is MIT licensed but I can consider relicensing if needed (the intersect code is wholly mine).

19 Likes

This topic was automatically closed 10 days after the last reply. New replies are no longer allowed.