Yesterday, I found myself helping to explain folds on the Slack. Someone was trying to understand the difference between foldr
and foldl
, again. This is not unexpected: Those new to functional programming need to learn how to stop thinking in imperative loops and start using higher level functions like folds.
As always, I want to illustrate how to think about the folds like so (this is math notation, not code):
foldr⊕,z ([a, b, c, d]) = (a ⊕ (b ⊕ (c ⊕ (d ⊕ z))))
foldl⊕,z ([a, b, c, d]) = ((((z ⊕ a) ⊕ b) ⊕ c) ⊕ d)
These definitions make it clear how one groups to the right, the other to the left. You can see that for many operations (like append, +, *) with the appropriate z value, these two produce the same result.¹
It’s also pretty easy for people to grasp why, when folding over a list, walking the list from the head (as system must), foldl is efficient, becoming a simple loop, and foldr is recursive, to the depth of the list.²
This is not how it is in Elm. In Elm we have:
foldr ⊕ z [a, b, c, d] = (a ⊕ (b ⊕ (c ⊕ (d ⊕ z))))
foldl ⊕ z [a, b, c, d] = (d ⊕ (c ⊕ (b ⊕ (a ⊕ z))))
Which I argue is a little harder to see what’s going on, breaks the foldr / foldl equivalence, and requires a few more steps to see why one is a plain loop and not the other.
I presume the choice was made to avoid the other difficulty in learning and using folds: The order of the arguments to the combining function.
Based on the mathematical definitions, Haskell’s functions have these type signatures:
foldr :: (a -> b -> b) -> b -> [a] -> b
foldl :: (b -> a -> b) -> b -> [a] -> b
Even I, after over a decade of Haskell coding, still have to stop and think for a second when coding if I need to flip my combining function’s args or not.³
You might think that Elm’s definitions would eliminate that mental bump because there’s only one signature to remember:
foldr : (a -> b -> b) -> b -> List a -> b
foldl : (a -> b -> b) -> b -> List a -> b
But I find it doesn’t: I still have to stop and think for a second about which way things are going, is that the direction I want, can/should I use the other fold, and is my combining function defined with the arguments in the needed order. I bet you do, too.
Which leads us to consider: Which form of foldl
is better?
Many useful combining functions match a -> b -> b
:
Set.insert : a -> Set a -> Set a
List.(::) : a -> List a -> List a
And common things one writes are often in this form:
\item model -> { model | filter = Dict.insert item True filter }
\(k, v) -> Dict.update k (Just << Maybe.Extra.unwrap [v] ((::) v))
These are reduction operations. In general one usually doesn’t care what order the list is traversed, and it is conventional for the reduction operations to have the form (a -> b -> b)
. Hence, foldl
and foldr
taking the same combining function is useful.
On the other hand, there are common combining operations that generally have the form a -> a -> a
⁴
Dict.union : Dict k v -> Dict k v -> Dict k v
(++) : a -> a -> a
When using these, there is an important order to the first two arguments. And so, in Elm, choosing foldr
vs. foldl
not only chooses the order or traversal, but also the order of arguments to the combining function.
concat listOfLists = foldr (++) [] listOfLists
oddConcat listOfLists = foldl (++) [] listOfLists
> concat[ [1, 2, 3], [4], [], [5, 6] ]
[1,2,3,4,5,6]
> oddConcat [ [1, 2, 3], [4], [], [5, 6] ]
[5,6,4,1,2,3]
For many (most?) uses, the ordering that foldr
uses is natural (it is left-biased). However, it is often both conceptually, and efficiently, desirable to use foldl
- It makes sense to “proceed down the list” merging the elements. Hence, foldl
’s swapped combining function arguments makes sense.
Given both are useful, just pick one, right? Perhaps. In Haskell, it is common to use the function flip
, and it is part of the standard library, always available unqualified. It isn’t uncommon to see things like:
foldl' (flip Set.insert) Set.empty flagList
Since it is very easy to use a combining function of either form, with either fold, Haskell chooses the order it has, and it works just fine for all uses.
But Elm doesn’t have flip
, and it seems small utility higher order functions are eschewed, so if foldl
were defined as in Haskell, you’d have to write:
List.foldl (\s f -> Set.insert f s) Set.empty flagList
Reading that use of Set.insert
requires a higher mental burden than the modest flip
. You have to stop for a moment when you read that lambda. Which, I’m guessing, is what Elm’s definition of foldl
is trying to avoid.
So, both?
Which leads me to this proposal. In the next major upheaval of the core libraries, define:
foldr : (a -> b -> b) -> b -> List a -> b
foldl : (b -> a -> b) -> b -> List a -> b
reduce : (a -> b -> b) -> b -> List a -> b
Where reduce
is a left fold, but with the combining function in the order shown.
I know that we like having a minimal core, and that two ways to do something is not “the Elm way”… but, hear me out:
Now the library addresses two situations that programmer’s encounter:
- Reduction operations - for these
reduce
is the natural choice both because of the common combining function definitions, and from efficiency point of view. - Merging operations - for which there are two natural choices,
foldr
andfoldl
, that match the associative meaning of these operations, and work well for merging functions.
If nothing else, even if this proposal goes nowhere, I hope you find this exploration of the order of arguments to folds enlightening.
— Mark
¹ Technically, for monoids, which are by definition associative and have a zero element that is commutative, foldr and foldl always produce the same result. I hesitate to say “monoid”, as it seems to worry people I’m getting too Haskelly… but “monoid” is what the concept is, and programmers of all types use monoids day in and day out without seeing the common pattern because we don’t name it. So - let’s not be scared of odd words: “monoid” is what it is!
² The Elm implementation of foldr
will, after a depth of about 4k, decide to instead reverse the list, and then use foldl
for the rest. It also has unrolled the recursion 4× so that the maximum depth it 1000. Of course, these details don’t belong in a discussion of folds for those new to functional programming.
³ The mnemonic I use, and teach, is “In foldr the z slides in from the right. With foldl it slides in from the left.”
⁴ Operations that are part of a monoid. But I’m trying hard not to say that word in the main part of the post!