Elm-edn, de- and encoding Extensible Data Notation

I’ve published a package (robx/elm-edn, also on GitHub) for decoding and encoding EDN.

What’s EDN, and why?

EDN (Extensible Data Notation) is a way to encode data; think of it as an alternative to JSON. It’s closely related to Clojure, where it’s the standard data format. There’s a slightly too enthusiastic article arguing for EDN over JSON here; it does give a decent overview of the differences, though. Here are two example messages:

#siren/status             ; a tag, applied to the following map
    { :state "play"       ; (keyword, string) map entry
    , :songid "505"
    , :elapsed 11.342     ; a floating point number
    , :duration 198.896
    , :volume 30          ; an integer
    }

; compact message, two-element vector
#siren/playlist[{:pos 0 :track"01.mp3"}{:pos 1 :track"02.mp3"}]

I made this for a project that needs to communicate over a websocket connection with a Go backend. (Not published yet; it’s a small card game.) I didn’t like the usual ways of encoding sum types in JSON: either oneOf [ tryThis, or, tryThat ], or looking at an in-data “type” field, or both. I then tried Protobuf, which ended up too noisy in both client and server code. With EDN, you can encode this information somewhat canonically in the tags (#siren/status and #siren/playlist above).
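For comparison, here is a minimal sketch of the two JSON approaches mentioned above, using Elm’s Json.Decode. The Message type and the field names ("state", "tracks", "type") are made up for illustration and are not taken from the game or from siren.

```elm
module SumTypeJson exposing (Message(..), messageByShape, messageByType)

import Json.Decode as D exposing (Decoder)


-- Hypothetical sum type, for illustration only.
type Message
    = Status String
    | Playlist (List String)


-- Approach 1: try each shape in turn; the first decoder that succeeds wins.
messageByShape : Decoder Message
messageByShape =
    D.oneOf
        [ D.map Status (D.field "state" D.string)
        , D.map Playlist (D.field "tracks" (D.list D.string))
        ]


-- Approach 2: read an in-data "type" field, then pick the matching decoder.
messageByType : Decoder Message
messageByType =
    D.field "type" D.string
        |> D.andThen
            (\t ->
                case t of
                    "status" ->
                        D.map Status (D.field "state" D.string)

                    "playlist" ->
                        D.map Playlist (D.field "tracks" (D.list D.string))

                    _ ->
                        D.fail ("unknown message type: " ++ t)
            )
```

With these definitions, D.decodeString messageByType applied to the JSON text {"type":"status","state":"play"} gives Ok (Status "play"). In both variants the discriminating information lives inside the payload, which is the part an EDN tag moves out of the data.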

The package

The package is live and tested and seems to work fine both for that game and alicebob/siren, an mpd web frontend. (That’s what the example messages above are derived from.) Here’s how it’s used there: Mpd.elm.

The API is modeled closely on Json.Decode/Json.Encode. There’s probably some room for making more use of EDN’s features to write a more “give me what’s there” instead of “this is what I expect” kind of API, but that’s maybe for the future.

Maybe the most interesting parts of the API are related to tags:

type ID
    = UserID Int
    | Email String

decodeString
    (list <| tagged
        [ ( "my/uid",   map UserID int   )
        , ( "my/email", map Email string )
        ])
    """(#my/uid 1, #my/email "alice@example.com", #my/uid 334)"""
--> Ok <|
-->     [ UserID 1
-->     , Email "alice@example.com"
-->     , UserID 334
-->     ]

encode <| list <|
    List.map (\id -> case id of
        UserID i -> mustTagged "my/uid" (int i)
        Email e  -> mustTagged "my/email" (string e)
    ) [ UserID 5, Email "alice@example.com" ]
--> """(#my/uid 5 #my/email "alice@example.com")"""

Random notes

There are definitely some rough edges still:

  • parse failure error messages could be improved
  • Encode is quite minimal-effort
  • I’m not that happy with mustTagged and friends. The issue is that symbols, keywords and tags are basically strings, but not every string is a valid one. They are typically constants, so you want to decide once and for all that they’re fine. I didn’t find a better solution than the must* functions, which assert that a string has the appropriate format and call Debug.crash otherwise.
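
To make the trade-off around the must* functions concrete, here is a rough sketch of the kind of check involved. This is not the code from elm-edn: the Tag type, fromString, tagName, and the simplified rule (prefix/name, each part starting with a letter and continuing with letters, digits, '-', '_' or '.') are assumptions for illustration. Real EDN symbols allow more characters than this.

```elm
module TagCheck exposing (Tag, fromString, tagName)

import Char


-- Opaque wrapper: a Tag can only be obtained through fromString.
type Tag
    = Tag String


isLetter : Char -> Bool
isLetter c =
    Char.isLower c || Char.isUpper c


isNameChar : Char -> Bool
isNameChar c =
    isLetter c || Char.isDigit c || c == '-' || c == '_' || c == '.'


-- A name is non-empty, starts with a letter, and continues with name characters.
isName : String -> Bool
isName s =
    case String.uncons s of
        Just ( first, rest ) ->
            isLetter first && String.all isNameChar rest

        Nothing ->
            False


-- Accept exactly "prefix/name" (simplified rule, see the note above).
fromString : String -> Maybe Tag
fromString s =
    case String.split "/" s of
        [ prefix, name ] ->
            if isName prefix && isName name then
                Just (Tag s)

            else
                Nothing

        _ ->
            Nothing


tagName : Tag -> String
tagName (Tag s) =
    s
```

Here fromString "my/uid" returns a Just, while fromString "my uid" returns Nothing. A Maybe-returning constructor like this is safe but awkward for constants, since every call site has to handle a Nothing that can never occur. A must*-style wrapper removes that noise by crashing on Nothing, at the cost of moving the check to runtime.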

Next, it was fun to build a complete parser with elm-tools/parser. For the moment I’m sticking with a hacked version of it that supports look-ahead, because that seems to permit more elegant parsers. But see the very helpful discussion here: Parsing puzzle, avoid look-ahead?

elm-verify-examples is pretty awesome!

And I’m not quite sure where I’m at with EDN itself. It has some really nice aspects, but also some warts:

  • The spec is not well maintained; it seems that Clojure’s reader is the actual reference. My first version was faithful to the spec but failed to parse quite a few details of go-edn’s output.
  • Clojure leaks into it a bit, e.g. I find the 16-bit Unicode escapes weirdly limited. No emoji character constants?

Anyway, maybe someone finds it interesting and/or useful?

Recently I worked on a project that used a triplestore (EAV model) to store data about entities. I did not know EDN at the time, but it looks much more appropriate for transacting data in this structure. I hope to be able to give it a try in future projects!