I have a recursive data type that I’d like to encode to, and decode from, JSON.
Here’s a simplified version of the data type:
module RecursiveJson exposing (..)
import Json.Decode as Decode exposing (Decoder)
import Json.Encode as Encode
type Tree
= Branches (List Tree)
| Leaf Int
Branches [Leaf 1, Branches [Leaf 2, Leaf 3, Branches [Leaf 4]]]
should encode as the following, modulo whitespace:
{
"children": [
{"value": 1},
{"value": 2},
{"value": 3},
{"children": [{"value": 4}]}
]
}
The choice of this encoding, where the choice of Branches
or Leaf
is encoded by the presence or absence of an object property ("children"
), is motivated by concern over the encoding size.
The application uses this encoding to cache values in localStorage
, in order to avoid a series of slow API calls to the Dropbox API the next time the page is visited, and I’m running up against the localStorage
browser limit, for Dropbox accounts that have tens of thousands of files, and therefore of nodes in the tree.
The program repeats the API calls when the data doesn’t fit in the cache, but I’d like to maximize the chance that it’s able to use the cached data, by making this data relatively small. (In fact, the actual code uses "v"
and "c"
instead of "value"
and "children"
; this makes a difference. I’ve considered running the stringified JSON through gzip, which would let me move back to the mnemonic property names, but don’t I currently want to introduce that complexity.)
Here’s the encoder and decoder:
encode : Tree -> Encode.Value
encode tree =
case tree of
Branches children ->
Encode.object
[ ( "children", Encode.list <| List.map encode children )
]
Leaf value ->
Encode.object
[ ( "value", Encode.int value ) ]
decode : Decoder Tree
decode =
Decode.field "children" (Decode.maybe (Decode.list decode))
|> Decode.andThen
(\children ->
case children of
Just c ->
Decode.succeed <| Branches c
Nothing ->
Decode.field "value" Decode.int
|> Decode.andThen (Decode.succeed << Leaf)
)
Compiling this with elm 0.18.0 produces the error message:
-- BAD RECURSION -------------- /Users/osteele/code/banyan/src/RecursiveJson.elm
`decode` is defined directly in terms of itself, causing an infinite loop.
25| decode : Decoder Tree
26|>decode =
[snipThis text will be blurred
Maybe you DO want a recursive value? To define `decode` we need to know what
`decode` is, so let’s expand it. Wait, but now we need to know what `decode` is,
so let’s expand it... This will keep going infinitely!
To really learn what is going on and how to fix it, check out:
<https://github.com/elm-lang/elm-compiler/blob/0.18.0/hints/bad-recursion.md>
(Factoring decodeChildren
out from decode
, to turn the direct recursion into indirect recursion, creates maybe the best compiler error message I’ve ever seen, but still an error message.)
I believe the compiler is mistaken here. decode
could recurse infinitely, but it’s guarded by Decode.maybe
, so it will only do this if the data recurses as well — which is possible for hand-crafted JavaScript objects, but not for those produced by encode
.
Whether or not the compiler is mistaken — how can I solve either (1) the immediate problem of writing a decoder for this encoding, or (2) the higher-level problem of implementing a simple storage-efficient caching mechanism for this data type? I considered writing a custom string generator and parser, but I’d prefer to use JavaScript’s JSON for this since it already handles hierarchical data.