Generic decoding of json to an elm record

I think I finally learnt the intricacies of JSON decoding by spending a lot of time (4 days) decoding a complex JSON payload into my elegant Elm data model. The JSON structure sent by the server is very dynamic, unoptimised and poorly designed. Though I had the pleasure of learning it and getting my head around it, I still think the JSON decoding exercise is very verbose and difficult to grasp. I understand the reasoning behind this design, which is tied to the language design, but I think it could be a major stumbling block for many beginners who want to adopt Elm at work (especially those coming from JavaScript or Java).

When I was searching around for some solutions, I did hit this one - How to write a generic JSON decoder?

However, this did not serve my purpose and I ended up writing the regular Json decoder. The set of decoders I wrote for the json is alone 300 lines of code! I think improving this situation will go a long way in Elm’s adoption.

Is it possible to write a generic decoder that decodes any JSON into a record of records or a list of records, recursively and automatically mapping JSON primitive types to Elm primitive types? For example, some JSON like this:

{
    "data": {
        "posts": [
            {
                "id": 1,
                "likes": 10,
                "content": "this is post no 1",
                "author": {
                    "id": "jmex",
                    "name": "James Max"
                },
                "comments": [
                    {
                        "id": 1,
                        "comment": "comment 1",
                        "user": "achrist"
                    }
                ]
            }
        ]
    }
}

This should get decoded into an elm record like this

decoded = {
    data = {
        posts = [
          { id = 1
          , likes = 10
          , content = "this is post no 1"
          , author = 
                { id = "jmex"
                , name = "James Max"
                }
          , comments = 
            [ { id = 1
              , comment = "comment 1"
              , user = "achrist"
              }
            ]
          }
        ]
    }
}

If I can get this type of generic Elm record, then I can run it through my own function to transform it into the data model I have defined, with much less code. Any ideas from the Elm experts here?

I’m curious: how would you do that with much less code?

Short (Unuseful) answer

You cannot do this with records. Elm no longer lets you add or remove record fields, so a record's shape cannot be built up dynamically. It simply isn’t allowed by the language. See https://elm-lang.org/blog/compilers-as-assistants for why this was removed in Elm 0.16.

Even if it were still possible, I doubt you’d want to deal with the absolutely painful experience of working with those types. Doing what you’re talking about would get hairy enough that your 300 lines of code would sound tame.

Long answer

As I mentioned, you can’t use records. Instead, use Dict. Additionally, you’d have to represent json values correctly which is annoying, but doable.

I’ve included a full example below.

But my question is this: are you sure you want to go down this road? Knowing and writing your schema upfront can be really helpful. It’s why Protocol Buffers, Cap’n Proto, FlatBuffers, and even GraphQL are successful. Sometimes schemas are complex, which means you’ll need complex code.

Anyway, here’s an example that should work. Just know that if you go this route, you’ll end up writing a lot more Maybe.withDefault, Maybe.map to handle the unknowns.

import Dict
import Json.Decode as Decode


{-| A json object is really just a map/dict from a string to other json values.
-}
type alias JsonObject =
    Dict.Dict String JsonValue


{-| A json array is a list of Json values
-}
type alias JsonArray =
    List JsonValue


{-| This is the meat of the representation.
-}
type JsonValue
    = JBool Bool
    | JInt Int
    | JFloat Float
    | JString String
    | JArray JsonArray
    | JObject JsonObject



-- Decoders


jsonObject : Decode.Decoder JsonObject
jsonObject =
    Decode.dict jsonValue


jsonArray : Decode.Decoder JsonArray
jsonArray =
    Decode.list jsonValue


{-| Decodes a json value.

The order matters here. Ints are not actually a thing in javascript, so Int 
must come before float.

EDIT: I originally had mentioned bool don't exist either, but I was wrong about
this.

-}
jsonValue : Decode.Decoder JsonValue
jsonValue =
    Decode.oneOf
        [ Decode.bool |> Decode.map JBool
        , Decode.int |> Decode.map JInt
        , Decode.float |> Decode.map JFloat
        , Decode.string |> Decode.map JString
        , -- The next two must be lazy since they are recursive.
          Decode.lazy (\_ -> jsonArray) |> Decode.map JArray
        , Decode.lazy (\_ -> jsonObject) |> Decode.map JObject
        ]
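
To make the cost of this approach concrete, here is a minimal sketch of reading a single value back out of the generic representation. It repeats the type from above so it stands alone; the module name and the helpers `field`, `firstItem`, `toInt` and `firstPostLikes` are hypothetical names made up for this illustration, and the shape `data.posts[0].likes` is taken from the JSON in the question.

```elm
module GenericJsonExample exposing (JsonValue(..), field, firstPostLikes, jsonValue)

import Dict exposing (Dict)
import Json.Decode as Decode


type JsonValue
    = JBool Bool
    | JInt Int
    | JFloat Float
    | JString String
    | JArray (List JsonValue)
    | JObject (Dict String JsonValue)


{-| Same idea as the decoder above, written as one self-recursive value.
Int is tried before Float so whole numbers stay Ints.
-}
jsonValue : Decode.Decoder JsonValue
jsonValue =
    Decode.oneOf
        [ Decode.map JBool Decode.bool
        , Decode.map JInt Decode.int
        , Decode.map JFloat Decode.float
        , Decode.map JString Decode.string
        , Decode.map JArray (Decode.list (Decode.lazy (\_ -> jsonValue)))
        , Decode.map JObject (Decode.dict (Decode.lazy (\_ -> jsonValue)))
        ]


{-| Look up a key. Gives Nothing if the value is not an object
or the key is missing.
-}
field : String -> JsonValue -> Maybe JsonValue
field key value =
    case value of
        JObject dict ->
            Dict.get key dict

        _ ->
            Nothing


{-| First element of an array, if the value is a non-empty array.
-}
firstItem : JsonValue -> Maybe JsonValue
firstItem value =
    case value of
        JArray items ->
            List.head items

        _ ->
            Nothing


{-| Unwrap an Int, if the value is one.
-}
toInt : JsonValue -> Maybe Int
toInt value =
    case value of
        JInt n ->
            Just n

        _ ->
            Nothing


{-| Reads data.posts[0].likes. Every step can fail, so every step is a Maybe.
-}
firstPostLikes : JsonValue -> Maybe Int
firstPostLikes root =
    field "data" root
        |> Maybe.andThen (field "posts")
        |> Maybe.andThen firstItem
        |> Maybe.andThen (field "likes")
        |> Maybe.andThen toInt
```

Running `Decode.decodeString jsonValue` on the example JSON and passing the result to `firstPostLikes` should give `Just 10`. Note that this sketch has no case for JSON `null`, so a payload containing nulls would fail to decode; you would need an extra constructor for that.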

This is not a direct answer to the question, but when I need code to decode arbitrary structures I find this useful:
https://noredink.github.io/json-to-elm/

I could break things down into pieces to customize processing by level, and it was quite nice. You don’t have to write the boilerplate, but you also don’t go down the hole of dealing with a genericized solution which you then also end up maintaining. YMMV, of course.

Tried to capture some thoughts on valid use cases for generic json decoders here:

https://discourse.elm-lang.org/t/how-to-write-a-generic-json-decoder/3179/14

This doesn’t seem like one of those cases.

I had tried this before posting, and it did not work for me. As you rightly pointed out, the code for transforming the JSON into my Elm data model was far more complex than using JSON decoders, since I had to do a lot of branching with case ... of and Maybe because of the very generic types. So I reverted to the regular JSON decoders.

I understand the rationale behind removing dynamic updates of records. I was just wondering whether this could be a core language feature like JSON.parse and JSON.stringify in JavaScript, implemented in one of the core kernel modules, so that it could produce a record like the one above even though we developers don’t have the ability to create a dynamic record on the fly.

Once you have a lot of experience writing JSON decoders, maybe this is not an issue, especially for simple, straightforward JSON structures. However, when you are a beginner and new to functional programming, understanding how decoders work is really difficult. IMO, when you have an Elm record like the one I described, it’s much easier to think about and write a function that transforms it into your data model.

I can just use List.map*, Maybe.map*, Maybe.withDefault, and return plain Elm types from my transformation functions, instead of using Decode.map* and Decode.andThen branching and returning chained decoders.

Thanks, I looked at json-to-elm. However, my decoding requirements were far more complex than that: going through an array of objects inside an object and extracting a few fields from it to add to the parent object’s Elm type, determining the max of the array based on a predicate, sorting the resulting Elm list based on that, and so on. So I had to create a bunch of custom decoders and massage the data with Json.Decode.andThen.
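
For readers wondering what that kind of massaging can look like, here is a rough sketch against the posts JSON from the first post. It is not the actual code from my project: the `Post` and `Comment` shapes, the derived fields `commenters` and `latestCommentId`, and the sort order are all made up for illustration. It also gets by with Decode.map alone, since nothing here needs to fail halfway; Decode.andThen would come in when a later step depends on an earlier value.

```elm
module PostDecoder exposing (Comment, Post, postDecoder, postsDecoder)

import Json.Decode as Decode exposing (Decoder)


type alias Comment =
    { id : Int, user : String }


{-| The target model keeps only what the view needs; two of the fields
are derived from the nested comments array.
-}
type alias Post =
    { id : Int
    , likes : Int
    , commenters : List String
    , latestCommentId : Maybe Int
    }


commentDecoder : Decoder Comment
commentDecoder =
    Decode.map2 Comment
        (Decode.field "id" Decode.int)
        (Decode.field "user" Decode.string)


{-| Plain function that builds a Post from already-decoded pieces.
All the "massaging" lives here, outside of decoder land.
-}
toPost : Int -> Int -> List Comment -> Post
toPost id likes comments =
    { id = id
    , likes = likes
    , commenters = List.map .user comments
    , latestCommentId = List.maximum (List.map .id comments)
    }


postDecoder : Decoder Post
postDecoder =
    Decode.map3 toPost
        (Decode.field "id" Decode.int)
        (Decode.field "likes" Decode.int)
        (Decode.field "comments" (Decode.list commentDecoder))


{-| Decodes data.posts and sorts the result by likes, highest first.
-}
postsDecoder : Decoder (List Post)
postsDecoder =
    Decode.at [ "data", "posts" ] (Decode.list postDecoder)
        |> Decode.map (List.sortBy (\post -> negate post.likes))
```

The decoders only describe where the pieces are, and an ordinary function like `toPost` does the reshaping with List.map and friends.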

It sounds like what you want is not a generic decoder, but an implicit decoder. Elm already has this actually, but only for flags and ports.

i.e. you can do:

type alias Flags = { foo : String, bar : Int }

init : Flags -> ( Model, Cmd Msg )
init flags =
    ( { foo = flags.foo }, Cmd.none )

One issue with the current implementation is that it throws a runtime exception if types don’t match, rather than returning a Result, which might be okay for JS values that you pass in yourself, but is not so good for web APIs. That could be fixed though.
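
If the runtime exception is a concern, one way around it today is to take the flags as a raw Json.Decode.Value and decode them yourself, which gives you a Result to handle. A minimal sketch, assuming the same `foo` and `bar` fields as above; `flagsDecoder`, the `Model` alias and the view are hypothetical and only there to make the example complete:

```elm
module Main exposing (main)

import Browser
import Html exposing (Html, text)
import Json.Decode as Decode


type alias Flags =
    { foo : String, bar : Int }


{-| The model remembers whether the flags decoded, so the view can show
a friendly error instead of the program failing at startup.
-}
type alias Model =
    Result String Flags


flagsDecoder : Decode.Decoder Flags
flagsDecoder =
    Decode.map2 Flags
        (Decode.field "foo" Decode.string)
        (Decode.field "bar" Decode.int)


{-| Flags arrive as an opaque JSON value and are decoded explicitly.
-}
init : Decode.Value -> ( Model, Cmd msg )
init rawFlags =
    ( Decode.decodeValue flagsDecoder rawFlags
        |> Result.mapError Decode.errorToString
    , Cmd.none
    )


view : Model -> Html msg
view model =
    case model of
        Ok flags ->
            text ("foo = " ++ flags.foo)

        Err message ->
            text ("Bad flags: " ++ message)


main : Program Decode.Value Model ()
main =
    Browser.element
        { init = init
        , view = view
        , update = \_ model -> ( model, Cmd.none )
        , subscriptions = \_ -> Sub.none
        }
```

This is of course the explicit decoder again, just applied to flags, so it trades the implicit convenience for a Result.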

I kinda do think that this could make things easier for many, but I’m pretty sure it’s not going to happen, as Evan has already stated that he doesn’t want JSON to be a core part of the language and have special compiler support.

Exactly, what I want is an implicit decoder supported by the core language. I read Evan’s post and I think it makes sense, especially since this would introduce runtime exceptions into Elm, given that the JSON returned by a web API can change at any time.

What I’m trying to say is that completely dynamic records are just not mathematically possible, implicit or otherwise. You’d have to be able to define a function like so:

parseJson: String -> {a | ???}

At compile time, you’d have to be able to fill in ???. Unfortunately, the type of ??? is “literally anything”. You’d end up with a dynamic language like Python, Ruby, or JavaScript.

That’s the main reason some people like those languages more than statically typed languages like Elm. As much as it sucks to say, this is the cost of using Elm. Just decide whether the benefit you’re getting from defining the schema upfront is worth it when choosing Elm. Personally, I think the benefit WAY outweighs the cost.

Yes. I completely understand now that this is impossible. I am already sold on Elm: its strict static types give you strong guarantees, and the compiler makes refactoring a breeze. It is very good for iterative development, and you get very high confidence when refactoring.

I came to Elm from a Ruby background and like many others was confused by JSON decoders. Now that I understand them, I wish I could use something like JSON decoders in my Ruby code, particularly when dealing with data that can be in multiple shapes :sweat_smile:

Not necessarily. There could be a serializable supertype, and the compiler could throw errors if it is not clear from the code what type the function would return. These errors would be very similar to the errors thrown if you attempt to ignore the Flags type.

This kind of a facility would take care of a large set of use-cases. The current decoders could still be used for endpoints that return data that changes shape or for some special cases. They would be the fallback mechanism.

Anyway, this has been discussed before and it will not be implemented in any foreseeable future for very good reasons.

I have already filed a couple of bugs with the REST API maintainer who is throwing such inefficient JSON at me. Coding in Elm is actually forcing me to think in terms of a good data model for the response JSON.

For sure this is one of the harder things to wrap your head around at first when using Elm.

But then you realize–it’s not a bug, it’s a feature!

Also, I use Atom and Elmjutsu, which automagically builds JSON encoders and decoders from type definitions (which you can then tweak with andThen, etc. to your heart’s content).

How do you auto generate Encoder + Decoder with elmjutsu? I’m looking at the docs, but I don’t see it:

If you look at the last gif in the section on “Special completions”, it seems to demonstrate it, I think.
I cannot link to the section, but here is the image: https://i.github-camo.com/9ec11fdf14da312fa7fec8d90b5405851ec101b3/68747470733a2f2f6769746875622e636f6d2f68616c6f68616c6f7370656369616c2f61746f6d2d656c6d6a757473752f626c6f622f6d61737465722f696d616765732f636f6e7374727563742d66726f6d2d747970652d616e6e6f746174696f6e2d332e6769663f7261773d74727565