About the Ergonomics of Applicative JSON Decoding

I have a feeling that many Elm projects pull in NoRedInk/elm-json-decode-pipeline as a dependency. Even the docs for elm/json point to it. And for a good reason: It gives an intuitive API for decoding large records:

import Json.Decode as Decode exposing (Decoder)
import Json.Decode.Pipeline exposing (required, requiredAt)

type alias User =
    { email : String
    , name : String
    }

userDecoder : Decoder User
userDecoder =
    Decode.succeed User
        |> required "email" Decode.string
        |> requiredAt [ "profile", "name" ] Decode.string

But is this really a good idea? I have two problems with this approach:

  1. This adds an additional library dependency for something quite trivial.
  2. It introduces new terminology that is different from elm/json.

| elm/json | NoRedInk/elm-json-decode-pipeline |
| --- | --- |
| field | required |
| at | requiredAt |
| succeed | hardcoded |
| field + maybe + Decode.map (Maybe.withDefault x) | optional |
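
To make the last row of the table concrete, here is a minimal sketch of an optional-with-default decoder built from elm/json alone. The name withDefault is hypothetical and belongs to neither package. The behaviour is only an approximation: as far as I can tell, the pipeline's optional fails when the field is present with a non-null value of the wrong type, whereas Decode.maybe turns every failure into Nothing.

```elm
import Json.Decode as Decode exposing (Decoder)


-- Hypothetical helper (not part of elm/json or the pipeline package):
-- decode the field `key` with `decoder`, or use `fallback` if that fails.
withDefault : a -> String -> Decoder a -> Decoder a
withDefault fallback key decoder =
    -- Decode.maybe wraps success in Just and maps any failure to Nothing
    Decode.maybe (Decode.field key decoder)
        |> Decode.map (Maybe.withDefault fallback)
```

For example, `withDefault "n/a" "nick" Decode.string` run against `{}` yields the fallback, because the missing field makes the inner decoder fail.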

When I first learned Elm, the functions in elm/json quickly made sense to me and I got a feeling for how they could be composed, but NoRedInk/elm-json-decode-pipeline always felt like magic, and it wasn’t easy for me to understand how it really works. Now that I have learned much more about functional programming and Haskell, it’s easy to see that a decode pipeline is just a very specific form of applicative functor composition, and you can easily do it yourself:

userDecoder : Decoder User
userDecoder =
    Decode.succeed User
        |> Decode.map2 (|>) (Decode.field "email" Decode.string)
        |> Decode.map2 (|>) (Decode.at [ "profile", "name" ] Decode.string)

This doesn’t look very pretty yet, but it looks better if we give a proper name to Decode.map2 (|>):

andMap : Decoder a -> Decoder (a -> b) -> Decoder b
andMap =
    Decode.map2 (|>)

userDecoder : Decoder User
userDecoder =
    Decode.succeed User
        |> andMap (Decode.field "email" Decode.string)
        |> andMap (Decode.at [ "profile", "name" ] Decode.string)

And andMap is already implemented in elm-community/json-extra!
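
For anyone who wants to try it, here is the same decoder as a self-contained sketch with a sample run. The names sampleJson and decoded and the sample data are made up for illustration. andMap is defined locally here, so elm-community/json-extra is not needed.

```elm
import Json.Decode as Decode exposing (Decoder)


type alias User =
    { email : String
    , name : String
    }


-- Apply a decoded function to a decoded argument.
andMap : Decoder a -> Decoder (a -> b) -> Decoder b
andMap =
    Decode.map2 (|>)


userDecoder : Decoder User
userDecoder =
    Decode.succeed User
        |> andMap (Decode.field "email" Decode.string)
        |> andMap (Decode.at [ "profile", "name" ] Decode.string)


-- Illustrative input, not taken from any real API.
sampleJson : String
sampleJson =
    """{"email":"ada@example.com","profile":{"name":"Ada"}}"""


decoded : Result Decode.Error User
decoded =
    Decode.decodeString userDecoder sampleJson
    -- Ok { email = "ada@example.com", name = "Ada" }
```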

I like the final version much better than the decode pipeline. It only needs elm/json and a one-line helper function. The terminology is consistent with “non-pipeline” style decoders and it’s easy to type and read. Why don’t we recommend this as the way to go? And if we were convinced that this is a good idea, andMap should be added to elm/json. Then we would no longer need any additional libraries for basic JSON decoding :tada:


I like the applicative style too. Like you mentioned,

The terminology is consistent with “non-pipeline” style decoders.

this part is highly valuable to me.

Adding to that,

andMap : Decoder a -> Decoder (a -> b) -> Decoder b

when used with |>, effectively reads with its arguments flipped:

|> andMap
--> Decoder (a -> b) -> Decoder a -> Decoder b

This is exactly what we see in elm/parser too, as |=

(|=) : Parser (a -> b) -> Parser a -> Parser b

Personally it really helped me to comprehend “applicative workflow” in functional programming.

If we are accustomed to this style, the whole idea is nicely portable between elm/json, elm/parser, other libraries of “parser”-nature (including my ymtszw/elm-xml-decode) and even beyond Elm!
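
As a small illustration of that portability, the same one-liner works for Maybe, since core also provides Maybe.map2. This is only a sketch, and the names maybeAndMap and pair are hypothetical.

```elm
-- The andMap pattern, specialised to Maybe instead of Decoder.
maybeAndMap : Maybe a -> Maybe (a -> b) -> Maybe b
maybeAndMap =
    Maybe.map2 (|>)


-- Builds a tuple only if every step is a Just.
pair : Maybe ( Int, String )
pair =
    Just Tuple.pair
        |> maybeAndMap (Just 1)
        |> maybeAndMap (Just "x")
    -- Just ( 1, "x" )
```

If any argument is Nothing, the whole pipeline is Nothing, just as a decode pipeline fails when any field decoder fails.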


I mean, if we had that operator in elm/json that would be cool, too (it’s <*> in Haskell).

userDecoder : Decoder User
userDecoder =
    Decode.succeed User
        |= Decode.field "email" Decode.string
        |= Decode.at [ "profile", "name" ] Decode.string

Maybe it could be a good idea to also allow custom packages to define their own version of |= so it can be used in things like yaml decoders.

It should only be possible to define (|=) with a type like this, where Type can be anything:

Type (a -> b) -> Type a -> Type b

Another library that might benefit from this “de-sugaring” is elm-graphql.

I’m a die-hard fan of the continuation style of JSON decoding (despite its wide row output in elm-format), so I would love to see less of a community reliance on elm-json-decode-pipeline. Yes it’s magical, but too magical.

I agree that this difference in terminology is undesirable.

excellent idea. hope this will happen :bowing_man:


I think andMap should definitely be included in elm/json. It is very basic and very much needed. Most of the time, the only reason I used a third-party library for JSON decoding was that my records were getting bigger.


Love your explanation, but still pipeline is much more readable. When I knew little Elm, I could immediately use it. It reads great. Your version reads quite techie.


I like using the continuation style (monadic style, like >>= in Haskell) since it does not depend on the order in which fields are defined in the type alias (User in this example).
Similarly to andMap, I usually create a helper function, do (to resemble do-notation, but you can of course use another name).

do : Decoder a -> (a -> Decoder b) -> Decoder b
do a b =
    Decode.andThen b a

userDecoder : Decoder User
userDecoder =
    do (Decode.field "email" Decode.string) <| \email ->
    do (Decode.at [ "profile", "name" ] Decode.string) <| \name ->
    Decode.succeed
        { email = email
        , name = name
        }

The biggest problem I have with this code is that elm-format is going to mess it up and indent every line. There is an open issue about it, but for now I just don’t run elm-format on those files.


I agree with this - taking custom operators out of Elm has helped to keep code readable. It’s a minor nuisance to write |> andMap instead of |=, but I still prefer the more explicit form. I would not even mind if the special symbols were dropped from elm-parser.

The OP is a neat observation though - I do use elm-json-decode-pipeline. I’m not worried about the dependency, but I do like the consistency of this idea and it makes it more obvious to me what is going on.


Dropping the extra operators from elm-parser would probably be valuable politically as well…

Personally, I would like to keep the |= operator, because it is part of one of the fundamental type classes used in functional programming. I agree that too many operators make code difficult to read (and Haskell can be a great example of that ^^) but the semantics of |= are clearly defined by applicative functors. I would even say that the operator can be quite intuitive once you get the hang of it: It’s like <|, but both arguments (and the return value) are wrapped in a functor. It is a useful operation in a lot of different scenarios because it is a very general operation.
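
The "like <|, but wrapped" reading can be spelled out without any custom operator. In this sketch the name apply is hypothetical, standing in for what |= would do for decoders; it is the same operation as andMap with the arguments in the other order.

```elm
import Json.Decode as Decode exposing (Decoder)


type alias User =
    { email : String
    , name : String
    }


-- (<|) : (a -> b) -> a -> b, lifted so that the function, the argument
-- and the result are each wrapped in a Decoder.
apply : Decoder (a -> b) -> Decoder a -> Decoder b
apply =
    Decode.map2 (<|)


userDecoder : Decoder User
userDecoder =
    apply
        (apply (Decode.succeed User)
            (Decode.field "email" Decode.string)
        )
        (Decode.at [ "profile", "name" ] Decode.string)
```

Written in prefix form the nesting gets noisy quickly, which is presumably why an infix operator (or the |> andMap spelling) is attractive here.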


I think one reason why NoRedInk/elm-json-decode-pipeline feels like magic is the fact that the application operators seem to go in the wrong direction when you first look at it. When you create a user without decoders, the application operator points to the left:

User
    <| email
    <| name

But when you build a user within a decoder, it points to the right:

Decode.succeed User
    |> required "email" Decode.string
    |> requiredAt [ "profile", "name" ] Decode.string

Explain that to a newcomer :grin:

But if you use the |= operator, you can just say “it’s like the <| operator, but the arguments are wrapped in a decoder”:

Decode.succeed User
    |= Decode.field "email" Decode.string
    |= Decode.at [ "profile", "name" ] Decode.string

Why don’t we recommend this as the way to go?

type alias User =
  { name : String
  , address : String
  }

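If for some reason you decide to switch the order in which the fields are defined in the record, you will introduce a bug that is silent and very difficult to catch: both fields are Strings, so the decoder still compiles, but the values end up in the wrong fields.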
Given the guarantees that Elm strives to make, this bug is doubly dangerous.

It happened to us at least once in production, which is why we generally avoid json-decode-pipeline.

Any solution that the community recommends as THE way to do decoding should IMHO address this.
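
Here is a minimal sketch of how that bug shows up, using the andMap helper from earlier in the thread. The sample JSON and the name decoded are made up for illustration. The only change from a working version is that the two fields in the type alias have been swapped.

```elm
import Json.Decode as Decode exposing (Decoder)


-- The fields were reordered here; the decoder below was not touched.
type alias User =
    { address : String
    , name : String
    }


andMap : Decoder a -> Decoder (a -> b) -> Decoder b
andMap =
    Decode.map2 (|>)


-- Still compiles: the User constructor now takes address first,
-- but the pipeline still supplies the "name" value first.
userDecoder : Decoder User
userDecoder =
    Decode.succeed User
        |> andMap (Decode.field "name" Decode.string)
        |> andMap (Decode.field "address" Decode.string)


decoded : Result Decode.Error User
decoded =
    Decode.decodeString userDecoder
        """{"name":"Ada","address":"12 Main St"}"""
    -- Ok { address = "Ada", name = "12 Main St" }
```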


This is true for both decode-pipeline AND the applicative style in this thread. Both depend on record constructor functions, which are sensitive to field definition order.

One solution to that problem is what @albertdahlin proposed; “continuation” style (analogous to Haskell’s “do”-style)

The style binds the resulting values to explicitly named variables and leads smoothly to record literal syntax, so it is less sensitive to field order.

I love this too, but because the current elm-format handles it poorly, we cannot simply adopt the style.

It is also possible to use explicitly-named variables without continuation style:

succeed (\name address -> { name = name, address = address })
    |> andMap (field "name" string)
    |> andMap (field "address" string)

though the variables are not closely associated with their field decoders, since they are vertically apart. If their types are the same, shuffling errors can still occur.

My take is that “continuation” style is the most robust in the long term, though it needs good support from elm-format. Applicative style and decode pipelines are not so different in their semantics and are a matter of preference, I would say. I do prefer the |> andMap style since it requires less knowledge of additional APIs. Reintroducing |= (for a specific type of function) would be cool indeed, but I doubt it will happen.


There’s some resistance to permitting an --exclude option in elm-format proper, as seen in this GitHub issue (which suggests a command-line workaround).

Perhaps IDEs could offer more flexibility than a global “format on save” option.

@albertdahlin is my boss so yeah, I know what he’s talking about. :slight_smile:
I agree with what you wrote. I use that style in my personal project and yes, the biggest drawback is elm-format.

However, the point I wanted to make was different: Elm tries to solve problems “in batches”, so that the whole picture can be addressed instead of just a specific issue.

A solution that leaves a major issue open, like the one discussed in this thread, is unlikely to be embraced by core.

Beyond that, the main issue with encoders and decoders is that you have to write them at all, and it’s entirely possible that Evan is waiting for a solution to that before officially supporting any change.

Having out-of-the-box official support for applicative style would be nice indeed.


The challenge of matching each decoder to the record field it fills is a common one. I have also seen production bugs with mismatched values as the root cause. I find continuation style really intriguing; it seems to solve that problem pretty effectively.

Challenges with continuation style in elm-graphql

One note about elm-graphql, as far as I can tell it’s not possible to do continuation style chains because SelectionSets represent both the request to be made, and the underlying decoder. That means I can’t define SelectionSet.andThen, because that would mean the query I send depends on the response, so it’s a chicken and egg problem (I can’t make the request unless I have the response).

Intellij Inlay Hints

I’ve been working on some intellij-elm contributions, and one cool possibility is using the Inlay Hint functionality of the Intellij SDK. We could use this to make it easier to tell if there’s a mismatch in an applicative pipeline.

I’m currently working on some basic inlays for intellij-elm. Here’s an example of what they look like for basic arguments:

[image: screenshot of inlay hints for basic arguments]

If you’re interested, you can see the work in progress and my notes on it:

Using Intellij Inlay Hints for Applicative Pipelines

I think that this could really help with the problem of keeping track of which field corresponds to which part of an applicative pipeline. Here’s an example of what that might look like (imagine the {--} comments looking like the inlay hints from the above screenshot).

type alias User =
    { email : String
    , name : String
    }

userDecoder : Decoder User
userDecoder =
    Decode.succeed User
        |> {- email: -} andMap (Decode.field "email" Decode.string)
        |> {- name: -} andMap (Decode.at [ "profile", "name" ] Decode.string)

The same hints could be used for Decode.map3 (or mapN), and it could be generalized to work with elm-graphql, elm/parser, and other applicative pipelines.
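
For reference, this is the two-field decoder in mapN form, the kind of call such hints could presumably annotate as well. It is a sketch only; the positional arguments of Decode.map2 have to line up with the field order of User in exactly the same way as in the pipeline.

```elm
import Json.Decode as Decode exposing (Decoder)


type alias User =
    { email : String
    , name : String
    }


userDecoder : Decoder User
userDecoder =
    Decode.map2 User
        -- first argument of User: email
        (Decode.field "email" Decode.string)
        -- second argument of User: name
        (Decode.at [ "profile", "name" ] Decode.string)
```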

It seems like a promising idea to me, but I’d love to hear what people think of the idea.


I use just one function from json-extra: andMap. And I use it all the time. Decode pipeline’s genius was to build on this, but for the rest the basic library is enough, and easier for other Elm users to read.