Json decoder runtime error

I’m making a JSON decoder which decodes structures such as:

["foo", {"type":"p", "content": ["bar", "baz"]}]

This is my original version:

That compiles just fine but throws a JavaScript error(!):

TypeError: decoder is undefined

The next commit in the gist fixes the error:

But, AFAICS, that should be equivalent code. What’s going on here?

Thanks!

(discourse is not making things easier by expanding the gists…)

In 0.18, getting recursive values to play nice takes some work. I’ve written a bit about what is going on in this particular case in this blogpost.

I hope this will help figure out why this doesn’t work.

Since this is a fairly simple structure, I believe the following will work:

import Json.Decode as Decode exposing (Decoder)

type Elem
    = Para (List Elem)
    | Chunk String

elemDecoder : Decoder Elem
elemDecoder =
    Decode.oneOf
        [ Decode.map Chunk Decode.string
        , Decode.map Para (Decode.field "content" <| Decode.list (Decode.lazy <| \_ -> elemDecoder))
        ]

The addition of lazy makes it so that the reference back to itself is wrapped in a function, so there is always at least one function call involved in getting back around to itself. This prevents the “cycle” error you may have encountered initially.

Thanks for the help, Ilias. The gist has two commits: the first is broken, and the second fixes it. However, the only thing that was needed was inlining some functions; lazy wasn’t needed.

This is the fixing diff: https://gist.github.com/alicebob/84083c48310cdb984fbe493d50aac905/revisions#diff-f1ab166876eeb993a8f1f1d6a7a1c37d
It basically goes from oneOf [func1, func2] to oneOf [code from func1, code from func2]. It does surprise me that this solves the problem.

The definitions of elemDecoder, elemDecoderType, and elemDecoderTypeHelp form a cycle, and the compiler has emitted the first two value definitions in an unlucky order (see the JS output below; the definition of elemDecoder references elemDecoderType before it’s defined). This is a known issue.

So when you inline the definition of elemDecoderType, you avoid the unlucky ordering. And in this case using lazy isn’t the only solution because the cycle includes a function, elemDecoderTypeHelp (which can be safely referenced before its definition). If the cycle were formed of only values, then you’d want to use lazy as @ilias described.

var _user$project$Main$elemDecoder = _elm_lang$core$Json_Decode$oneOf(
    {
        ctor: '::',
        _0: _user$project$Main$elemDecoderChunk,
        _1: {
            ctor: '::',
            _0: _user$project$Main$elemDecoderType,
            _1: {ctor: '[]'}
        }
    });
var _user$project$Main$elemDecoderType = A2(
    _elm_lang$core$Json_Decode$andThen,
    _user$project$Main$elemDecoderTypeHelp,
    A2(_elm_lang$core$Json_Decode$field, 'type', _elm_lang$core$Json_Decode$string));

I would still recommend wrapping the call from elemDecoderType back to elemDecoder in lazy. The fact that the inlined version works is basically “luck”: I’ve seen this work in one place and break in tests, so for now it’s just safer to err on the side of, well, safety.

This will be fixed in 0.19, though, and shouldn’t require lazy for the case you’ve described. So, coolbeans!
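
The recommendation to wrap the reference back to elemDecoder in lazy could look roughly like the sketch below. The definition names come from the JS output earlier in the thread; the body of elemDecoderTypeHelp and the "unknown type" failure message are assumptions, since the gist itself isn’t reproduced here. The second lazy, around elemDecoderType, goes beyond what was recommended: it is an extra assumption meant to defer the forward reference that the JS output shows being evaluated too early.

```elm
module Main exposing (..)

import Json.Decode as Decode exposing (Decoder)


type Elem
    = Para (List Elem)
    | Chunk String


elemDecoder : Decoder Elem
elemDecoder =
    Decode.oneOf
        [ elemDecoderChunk

        -- Assumption (not from the thread): defer the forward reference so
        -- elemDecoderType is only looked up when a value is actually decoded.
        , Decode.lazy (\_ -> elemDecoderType)
        ]


elemDecoderChunk : Decoder Elem
elemDecoderChunk =
    Decode.map Chunk Decode.string


elemDecoderType : Decoder Elem
elemDecoderType =
    Decode.field "type" Decode.string
        |> Decode.andThen elemDecoderTypeHelp


{-| Hypothetical body: dispatch on the "type" field.
-}
elemDecoderTypeHelp : String -> Decoder Elem
elemDecoderTypeHelp tag =
    case tag of
        "p" ->
            -- The reference back to elemDecoder is wrapped in lazy,
            -- as recommended above.
            Decode.map Para
                (Decode.field "content"
                    (Decode.list (Decode.lazy (\_ -> elemDecoder)))
                )

        _ ->
            Decode.fail ("unknown type: " ++ tag)
```

With this shape, no top-level value reads another value in the cycle while it is being constructed, so the result should not depend on the order in which the compiler emits the definitions.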

Got it, thanks for the explanation. Nice to know why something which looks like shuffling bits around fixes the problem.

I spent a bunch of time looking at the theoretical issues around cycles after getting burned by a function indirectly referencing a list which contained the function.

The only safe way for the compiler to fix the case in this thread is either to build in some knowledge about decoders so it can special-case them, or to disallow such cycles entirely. lazy works because the function passed to it, which references the decoder, won’t get called until the decoder is actually used, by which time all of the construction will have finished. But if the compiler doesn’t know that about lazy, it has to assume that the function could be called immediately when it is passed, and if the function references something that hasn’t yet been fully constructed, “boom”. Without such special knowledge, the only safe cycles (and cyclic clusters) consist only of functions, since the language definition makes clear that constructing functions does not invoke functions. Various special cases (e.g., decoders and constructors, which would enable lists of functions within a cycle) can be handled with some care, but it’s tricky if one really wants to detect the errors at compile time while still being generous about what can compile.

Mark
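
To make the point about lazy concrete, here is a minimal sketch of how a lazy-like helper can be written with andThen. The names myLazy, Tree, and treeDecoder are hypothetical, and this is an illustration rather than a claim about how the library implements lazy.

```elm
module LazySketch exposing (..)

import Json.Decode as Decode exposing (Decoder)


{-| A lazy-like helper. Constructing the decoder only stores the thunk;
andThen calls it when a value is actually being decoded.
-}
myLazy : (() -> Decoder a) -> Decoder a
myLazy thunk =
    Decode.succeed ()
        |> Decode.andThen thunk


type Tree
    = Node (List Tree)


{-| The self-reference sits inside a lambda, so building treeDecoder never
reads treeDecoder before it has been fully constructed.
-}
treeDecoder : Decoder Tree
treeDecoder =
    Decode.map Node (Decode.list (myLazy (\_ -> treeDecoder)))
```

Nothing in the type of myLazy tells the compiler that the thunk is deferred, which is why it would need special knowledge to accept cycles like this as safe.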
