How to JSON.decode a recursive data type?


#1

I have a recursive data type that I’d like to encode to, and decode from, JSON.

Here’s a simplified version of the data type:

module RecursiveJson exposing (..)

import Json.Decode as Decode exposing (Decoder)
import Json.Encode as Encode


type Tree
    = Branches (List Tree)
    | Leaf Int

Branches [Leaf 1, Branches [Leaf 2, Leaf 3, Branches [Leaf 4]]] should encode as the following, modulo whitespace:

{
  "children": [
    {"value": 1},
    {"value": 2},
    {"value": 3},
    {"children": [{"value": 4}]}
  ]
}

The choice of this encoding, where the choice of Branches or Leaf is encoded by the presence or absence of an object property ("children"), is motivated by concern over the encoding size.

The application uses this encoding to cache values in localStorage, in order to avoid a series of slow API calls to the Dropbox API the next time the page is visited, and I’m running up against the localStorage browser limit, for Dropbox accounts that have tens of thousands of files, and therefore of nodes in the tree.

The program repeats the API calls when the data doesn’t fit in the cache, but I’d like to maximize the chance that it’s able to use the cached data, by making this data relatively small. (In fact, the actual code uses "v" and "c" instead of "value" and "children"; this makes a difference. I’ve considered running the stringified JSON through gzip, which would let me move back to the mnemonic property names, but don’t I currently want to introduce that complexity.)

Here’s the encoder and decoder:

encode : Tree -> Encode.Value
encode tree =
    case tree of
        Branches children ->
            Encode.object
                [ ( "children", Encode.list <| List.map encode children )
                ]

        Leaf value ->
            Encode.object
                [ ( "value", Encode.int value ) ]


decode : Decoder Tree
decode =
    Decode.field "children" (Decode.maybe (Decode.list decode))
        |> Decode.andThen
            (\children ->
                case children of
                    Just c ->
                        Decode.succeed <| Branches c

                    Nothing ->
                        Decode.field "value" Decode.int
                            |> Decode.andThen (Decode.succeed << Leaf)
            )

Compiling this with elm 0.18.0 produces the error message:

-- BAD RECURSION -------------- /Users/osteele/code/banyan/src/RecursiveJson.elm

`decode` is defined directly in terms of itself, causing an infinite loop.

25| decode : Decoder Tree
26|>decode =

[snipThis text will be blurred

Maybe you DO want a recursive value? To define `decode` we need to know what
`decode` is, so let’s expand it. Wait, but now we need to know what `decode` is,
so let’s expand it... This will keep going infinitely!

To really learn what is going on and how to fix it, check out:
<https://github.com/elm-lang/elm-compiler/blob/0.18.0/hints/bad-recursion.md>

(Factoring decodeChildren out from decode, to turn the direct recursion into indirect recursion, creates maybe the best compiler error message I’ve ever seen, but still an error message.)

I believe the compiler is mistaken here. decode could recurse infinitely, but it’s guarded by Decode.maybe, so it will only do this if the data recurses as well — which is possible for hand-crafted JavaScript objects, but not for those produced by encode.

Whether or not the compiler is mistaken — how can I solve either (1) the immediate problem of writing a decoder for this encoding, or (2) the higher-level problem of implementing a simple storage-efficient caching mechanism for this data type? I considered writing a custom string generator and parser, but I’d prefer to use JavaScript’s JSON for this since it already handles hierarchical data.


#2

You need to use Json.Decode.lazy for recursive data structures (essentially just wrap recursive decoder calls in Decode.lazy). That will only expand the decoder as far as is necessary.


#3

Thank you for your gentle help with my reading skills.


#4

Gonna bookmark this one because I know I’ll need this at some point.


#5

Here’s a version without Elm.Decode.lazy, generated with this:

decodeTree =
   Dec.field "Constructor" Dec.string |> andThen decodeTreeHelp

decodeTreeHelp constructor =
   case constructor of
      "Branches" ->
         decode
            Branches
               |> required "A1" (Dec.list decodeTree)
      "Leaf" ->
         decode
            Leaf
               |> required "A1" Dec.int
      other->
         Dec.fail <| "Unknown constructor for type Tree: " ++ other

encodeTree a =
   case a of
      Branches a1->
         object
            [ ("Constructor", Enc.string "Branches")
            , ("A1", (Enc.list << (List.map encodeTree)) a1)
            ]
      Leaf a1->
         object
            [ ("Constructor", Enc.string "Leaf")
            , ("A1", Enc.int a1)
            ]

#6

I had a similar problem working with unknown JSON structures (json schema parser).
You can see real world example of it here


#7

There’s a library for working with JSON in Elm :slight_smile:
http://package.elm-lang.org/packages/1602/json-value/latest


#8

Yes, thanks! I wrote mine a year before this lib was created. :slight_smile: but it was closed source.
Now I will probably refactor my code to use this library and contribute missing part.