Some data formats are extensible: in addition to a fixed set of fields, they may allow additional fields, not known in advance, to be included in the JSON.
The case I am looking at just now is the OpenAPI spec, which allows extensions in the form of fields starting with x-. So if you had some record type definition, you could, for code generation purposes, specify what type something is to be mapped to in the target language with an extra field like this: x-elm-type: String. There can be any number of these additional x- fields, for any purpose, as a general mechanism for making the spec extensible. Their values can also be JSON objects or arrays.
So my question is: how do you write a JSON decoder that can handle any valid JSON, decoding it into a Dict?
My first attempt was to use the Json.Decode.dict function and try to decode the values as Strings. The example given in its docs is:
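For reference, the documented example is along these lines (quoted from memory of the elm/json docs, so treat the exact text as approximate). The second definition, firstAttempt, is an added illustration rather than part of the docs. It shows why decoding every value as a String stops working once an x- field holds an array or object; the field names in it are made up.

```elm
import Dict exposing (Dict)
import Json.Decode as Decode


-- The shape of the example in the Json.Decode.dict documentation:
-- every value in the object is decoded with the same decoder.
docsExample : Bool
docsExample =
    Decode.decodeString (Decode.dict Decode.int) "{ \"alice\": 42, \"bob\": 99 }"
        == Ok (Dict.fromList [ ( "alice", 42 ), ( "bob", 99 ) ])


-- Hypothetical first attempt: decode every field as a String.
-- This produces an Err, because the value of "x-tags" is an array.
firstAttempt : Result Decode.Error (Dict String String)
firstAttempt =
    Decode.decodeString (Decode.dict Decode.string)
        "{ \"x-elm-type\": \"String\", \"x-tags\": [ \"a\", \"b\" ] }"
```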
With decoders, I find that it’s impossible to give good advice without knowing both the format coming in and what you want to do with the result in Elm. Ultimately a decoder is about mapping a concept in JSON to a concept in your Elm program’s data structures, and it depends on both of those two ends equally.
For instance, if all you’re going to do with these extended fields is remember them and re-encode them as JSON later, you can use Json.Decode.value to get the fields as unparsed JSON blobs. You can do that with
If you want to use the values in Elm, though, this doesn’t help very much. You could try to decode them into some completely generic JSON data structure in Elm, but that’s unlikely to be of much help (I wrote an article about that a while back).
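A minimal sketch of such a decoder, assuming the name decodeDict that the later reply uses. The body is a reconstruction, not the original listing.

```elm
import Dict exposing (Dict)
import Json.Decode as Decode exposing (Decoder, Value)


-- Decode a JSON object into a Dict keyed by field name.
-- Each value is kept as an opaque Json.Decode.Value, so strings, numbers,
-- arrays and nested objects are all accepted and can be re-encoded later.
decodeDict : Decoder (Dict String Value)
decodeDict =
    Decode.dict Decode.value
```

Because Value is the same type that Json.Encode works with, the captured fields can be written back out unchanged.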
So I guess my question is: Suppose you could do anything you wanted with your decoder. What would its type be and how would you use it in Elm?
I want to decode the { name : String, age : Int } part which will always be the same and known in advance. I also want to decode the x-superpower part, which is optional and can be any number of additional x- fields of any type.
It is true that those x- fields are perhaps not so useful, since they do not map to static types in my Elm program. In the first instance, all I may end up doing with them is displaying them in some UI, so for that reason at least I would like to capture them: what their names are and what their values are.
In that case I think I’d try using the decodeDict decoder I wrote above and filtering the result so that the only keys are the x-* fields. Out of the box Elm doesn’t have good support for formatting JSON values as strings, but it looks like you can use https://package.elm-lang.org/packages/ThinkAlexandria/elm-pretty-print-json/latest/ for that.
So the plan is to use a generic Dict decoder that lifts out the x- fields, plus a specific Decoder for the fixed-format fields that are expected, which ignores any x- fields. I can then combine the two using Decode.map2 to get what I am after. That gives me a direction to get started with, anyway.
This would work too if all the other fields have a predictable prefix. The solution I proposed has the advantage that it would capture all fields other than a known set (name, age).
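That plan could be sketched roughly as follows. The type and decoder names (Fixed, Person, fixedDecoder, extensionsDecoder, personDecoder) are invented for illustration and are not from the thread.

```elm
import Dict exposing (Dict)
import Json.Decode as Decode exposing (Decoder, Value)


-- The part of the record that is always present and known in advance.
type alias Fixed =
    { name : String
    , age : Int
    }


-- The fixed part plus whatever x- extension fields came with it.
type alias Person =
    { fixed : Fixed
    , extensions : Dict String Value
    }


-- Reads only the known fields; any other fields in the object are ignored.
fixedDecoder : Decoder Fixed
fixedDecoder =
    Decode.map2 Fixed
        (Decode.field "name" Decode.string)
        (Decode.field "age" Decode.int)


-- Reads every field as a raw Value, then keeps only the keys starting with "x-".
extensionsDecoder : Decoder (Dict String Value)
extensionsDecoder =
    Decode.dict Decode.value
        |> Decode.map (Dict.filter (\key _ -> String.startsWith "x-" key))


-- Both decoders run against the same JSON object and are combined with map2.
personDecoder : Decoder Person
personDecoder =
    Decode.map2 Person fixedDecoder extensionsDecoder
```

Running both decoders over the same object works because a decoder built from Decode.field only looks at the fields it names.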
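A variant that captures everything outside a known set, rather than matching on a prefix, might look like this. The names knownFields and otherFieldsDecoder are hypothetical.

```elm
import Dict exposing (Dict)
import Json.Decode as Decode exposing (Decoder, Value)
import Set exposing (Set)


-- Field names handled by the fixed-format decoder.
knownFields : Set String
knownFields =
    Set.fromList [ "name", "age" ]


-- Keeps every field whose name is not in knownFields, whatever its prefix.
otherFieldsDecoder : Decoder (Dict String Value)
otherFieldsDecoder =
    Decode.dict Decode.value
        |> Decode.map (Dict.filter (\key _ -> not (Set.member key knownFields)))
```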
I was intrigued to take a look at the pretty printer, as it also must do generic JSON decoding. It bypasses having a generic model to describe the JSON, and goes straight to building the pretty printable document. However, the structure is similar to what I came up with:
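The original listing is not reproduced here. As a rough idea of the general shape, a generic decoder into a hand-rolled JSON model could be written like the sketch below. The Json type and its constructor names are assumptions for illustration, not the code from the post or from the pretty printer package.

```elm
import Dict exposing (Dict)
import Json.Decode as Decode exposing (Decoder)


-- A generic model covering every kind of JSON value.
type Json
    = JsonNull
    | JsonBool Bool
    | JsonNumber Float
    | JsonString String
    | JsonArray (List Json)
    | JsonObject (Dict String Json)


-- Try each kind of JSON value in turn. Arrays and objects recurse,
-- so the self-reference is wrapped in Decode.lazy.
jsonDecoder : Decoder Json
jsonDecoder =
    Decode.oneOf
        [ Decode.null JsonNull
        , Decode.map JsonBool Decode.bool
        , Decode.map JsonNumber Decode.float
        , Decode.map JsonString Decode.string
        , Decode.map JsonArray (Decode.list (Decode.lazy (\_ -> jsonDecoder)))
        , Decode.map JsonObject (Decode.dict (Decode.lazy (\_ -> jsonDecoder)))
        ]
```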
Interestingly, that article was a result of me doing this same exercise, and writing just about the same code as you did, while trying to figure out if I could make a better decoder API. Doing that work and then trying to use the resulting Json values really drove home for me the idea that a decoder isn’t simply bringing JSON into Elm, but bringing JSON into your Elm program. A converter of JSON into a generic Json data structure is pretty simple to write, but doesn’t buy you very much – you still need to turn Json into data structures your program understands, and that transformation is basically just as difficult as the original problem.
That’s why I now try to steer people away from generic decoding as a first step, and instead steer them towards deciding what they’d want the native Elm representation of the JSON to be if they could choose anything, and then writing a decoder directly from JSON to that.
There are a couple of situations where a generic decoder can be useful:
If you wanted to validate arbitrary JSON against an arbitrary JSON Schema. The schema format of course has its own fixed schema, which you would decode and write a program around.
If you wanted to write a UI to help a user understand and work with arbitrary JSON. Say to visualise it or search it.
Programs that take JSON and try to infer its schema, or automatically map it to a data model. For example, Main