Elm Compiler in Elm - Update

I finally managed to (hopefully) solve the stack overflow problems when running my port of the Elm compiler on other browsers than Safari.

With the new version I can compile the big module of the elm-default-tailwind-modules package in my Chrome browser, and I can compile the compiler with itself, too :rofl: There still might be problems with other big projects, though, so please tell me if you happen to encounter a stack overflow or other errors.

If you’re interested in the details of the changes I applied, feel free to ask.

26 Likes

In the Elm Slack I was asked what I had changed. I’m answering here because the posts stay longer.

Here is a short list:

  • In the original compiler, types like Parser, Decoder, Result, Tracker are implemented using a technique which is called (I think) “Church Encoding”. I wanted to add something like a stack-safe loop function to some of these types, but didn’t know how to do this in the Church Encoding, so I changed the defiinition of these types. For Decoder and Tracker this was all I had to do to solve the stack overflow problems there.

  • For Parser, Result and my IO monad I added the said “loop” function and used them where needed. I also used them to add some optimized fold and traversal functions to Result and IO.

  • In some places, the Elm compiler builds a graph (of nodes and edges) and then uses a library to dermine the so-called “strongly connected components” of such a graph. I had ported the code from the Haskell library to Elm, but as it turned out, that hasn’t been stack-safe. I had to re-implement the graph algorithm from scratch.

  • Finally, there were some functions where I had to manually try to make them tail-recursive, sometimes using a so-called continuation passing style: the functions constrainDecls and solve from the type checker, and the function checkDecls from the nitpicker. I was also lucky that Jeroen Engels showed me in his pull request how to make my implementation of the MVector type tail recursive. Thank you again @jfmengels.

This is just a very high-level description of my changes. If you want to know more, I’ll try to find some time to describe some points in more detail.

I was also asked whether these changes had any impact on performance, but after some very rudimentary testing I can’t see any substantial difference.

14 Likes

Thank you for making this even better and finding ways to not exceed stack limits on Firefox and Chromium.

Do you maybe have a basic introduction to this maybe-named “church encoding” technique?
Is it similar to defunctionalization, where instead of stacking function executions one returns records that are then handled in later runs, like e.g. in elm/parser loop?

Hey Marc, I am currently handicapped because of a cold, but I’ll try to answer as soon as possible…

Sorry Marc, I need more time to recover. Just posting this to keep the thread open…

Hi Marc, now I’m back again :slightly_smiling_face:

Instead of giving you just a few links I decided to explain this with a brief, little example. I hope that I’m not writing too much again, but here’s the first part:

Part 1: Either as an Algebraic Data Type

Let’s define the Haskell type Either as an Algebraic Data Type, in Elm called “Custom Type”:

module Either exposing (Either(..))

type Either a b
    = Left a
    | Right b

This type looks pretty much like Elm’s Result type, but in Result x a it is clear that x is an error type and a is the result type. In Either a b the types a and b are equivalent, one isn’t “better” than the other.

Remark: In Haskell, Either is often used like Elm’s Result type with Left as the error case and Right as the result case, but in the following example we treat both as equal.


To create values of type Either we use the constructor functions Left and Right:

import Either exposing (Either(..))

eitherList : List (Either Int Char)
eitherList =
    [ Left 1
    , Right 'a'
    , Right 'z'
    , Left 9
    , Right 'p'
    ]

To process values of type Either we use case expressions:

eitherToString : Either Int Char -> String
eitherToString anEither =
    case anEither of
        Left i ->
            "Int: " ++ String.fromInt i

        Right c ->
            "Char: " ++ String.fromChar c

Finally, we can put all this togther in a little program:

import Html exposing (Html)

main : Html msg
main =
    Html.text (String.join ", " (List.map eitherToString eitherList))

and if we run this we get

Int: 1, Char: a, Char: z, Int: 9, Char: p

Here’s an Ellie with the code: (https://ellie-app.com/t5jVrfnPnMfa1)

The second part will follow soon…

Edit: changed “Abstract” to “Algebraic Data Type”.

Part 2: Either as an Opaque Type

Now suppose we want to make the type Either in module Either opaque.

If we do so, we can’t use the constructors Left and Right in other modules anymore, and we also can’t use them to pattern match in case expressions. This means we need some helper functions.


For creating values of type Either we can define and expose the following functions:

left : a -> Either a b
left =
    Left

right : b -> Either a b
right =
    Right

Now the code of our previous example looks like this:

import Either exposing (Either, left, right)

eitherList : List (Either Int Char)
eitherList =
    [ left 1
    , right 'a'
    , right 'z'
    , left 9
    , right 'p'
    ]

Easy.


But what about processing values of type Either in case expressions like our previous example?

eitherToString : Either Int Char -> String
eitherToString anEither =
    case anEither of
        Left i ->
            "Int: " ++ String.fromInt i

        Right c ->
            "Char: " ++ String.fromChar c

Such a case expression takes an Either value and turns it into a new result type. If I name the result type z, we need a function with a signature like:

someFun : Either a b -> z

But that’s not enough. The caller of this function also has to tell us how to turn an a value into a z value in the Left case, and how to turn a b into a z in the Right case. So we need to pass two more functions to someFun:

someFun : (a -> z) -> (b -> z) -> Either a b -> z

In Haskell, this function is called “either”. Here’s the implementation in Elm:

either : (a -> z) -> (b -> z) -> Either a b -> z
either leftFun rightFun anEither =
    case anEither of
        Left a ->
            leftFun a

        Right b ->
            rightFun b

Using this function, we can rewrite our eitherToString function like this:

eitherToString : Either Int Char -> String
eitherToString anEither =
    either
        (\i -> "Int: " ++ String.fromInt i)
        (\c -> "Char: " ++ String.fromChar c)
        anEither

The rest of the code doesn’t have to change. Here’s an Ellie with the new code: (https://ellie-app.com/t5kDfyrsxRPa1)

The next part will follow soon…

Part3: Either in Church (?) Encoding

Ok, now that the type Either is opaque, we can change its implementation.


Notice: I always thought that the new implementation I’ll show you in this part was called “Church encoding”, but I’m not so sure anymore. I learned the name a while back when I experimented with so-called “Free Monads”. You can see a Haskell implementation of Free Monads as an Algebraic Data Type in this module: Control.Monad.Free. Back then I noticed there’s another module called Control.Monad.Free.Church which implements Free Monads as a function in the so-called “Church encoding”. This is where I got the name from.

But as I said, I think that I’ve been using the wrong term. When Marc asked his question above, I looked at the Wikipedia article for “Church encoding”. At the bottom there’s a link to the so called “Scott encoding”, and I now think that this is a better name.

So thank you Marc for helping me recognize this mistake.


The new implementation is related to the either function from the last part:

either : (a -> z) -> (b -> z) -> Either a b -> z

This could be read as follows: if we have an Either a b value, we can turn two functions (a -> z) and (b -> z) into a z value.

The so-called “Scott encoding” (!) follows exactly this interpretation. It defines the type Either as:

type Either a b
    = Either ((a -> z) -> (b -> z) -> z)

Unfortunately, this isn’t valid Elm code.

In Haskell we could define an Either type like this:

newtype Either a b =
    Either (forall z. (a -> z) -> (b -> z) -> z)

The forall keyword allows us to hide the type parameter z, but in Elm we have to make it explicit, so we have to change the type Either to:

type Either z a b
    = Either ((a -> z) -> (b -> z) -> z)

With this new definition, the helper functions can be implemented as follows:

left : a -> Either z a b
left a =
    Either (\leftFun _ -> leftFun a)

right : b -> Either z a b
right b =
    Either (\_ rightFun -> rightFun b)

either : (a -> z) -> (b -> z) -> Either z a b -> z
either leftFun rightFun (Either eitherFun) =
    eitherFun leftFun rightFun

As a client of module Either, we have to add the new type parameter z to the Either type:

eitherList : List (Either z Int Char)
eitherList =
    [ left 1
    , right 'a'
    , right 'z'
    , left 9
    , right 'p'
    ]

Normally we can leave the type parameter z unspecified as in the code above.

Only when we call the either function, the type parameter must match the result type:

either : (a -> z) -> (b -> z) -> Either z a b -> z

All four z types must be the same.

This turns the eitherToString function from the last part into:

eitherToString : Either String Int Char -> String
eitherToString anEither =
    either
        (\i -> "Int: " ++ String.fromInt i)
        (\c -> "Char: " ++ String.fromChar c)
        anEither

Notice that the type parameter z is String in this case, because we turn the Either value into a String.


The rest of the code doesn’t have to change. Here’s an Ellie with the new code: (https://ellie-app.com/t5KPkjJG9Lxa1)

Stay tuned for more…

2 Likes