Where to put encoders, decoders and generators?

Which of the following two designs do you prefer, and why?

Option 1: With the data type

```elm
module Point3d

-- origin, coordinates, translateBy...

encode : Point3d -> Value

decoder : Decoder Point3d

generator : Generator Point3d
```

```elm
module Triangle3d

-- vertices, area, rotateAround...

encode : Triangle3d -> Value

decoder : Decoder Triangle3d

generator : Generator Triangle3d
```

This approach is nice and simple and easy to understand, but introduces a fair bit of coupling. Each module is now tied to a particular data format (JSON), and would depend on third-party packages if this pattern is extended to things like fuzzers.
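As a rough sketch of what Option 1 might look like in practice (assuming the Elm 0.19 `Json.Encode`/`Json.Decode` API and a simple record-backed `Point3d`; the actual opensolid representation may differ):

```elm
module Point3d exposing (Point3d, decoder, encode)

import Json.Decode as Decode exposing (Decoder)
import Json.Encode as Encode exposing (Value)


type Point3d
    = Point3d { x : Float, y : Float, z : Float }


-- Encode a point as a three-element JSON array [x, y, z]
encode : Point3d -> Value
encode (Point3d { x, y, z }) =
    Encode.list Encode.float [ x, y, z ]


-- Decode a three-element JSON array back into a Point3d
decoder : Decoder Point3d
decoder =
    Decode.map3 (\x y z -> Point3d { x = x, y = y, z = z })
        (Decode.index 0 Decode.float)
        (Decode.index 1 Decode.float)
        (Decode.index 2 Decode.float)
```

Note that this version of `encode` pattern-matches on the constructor, so it only works inside the `Point3d` module itself; that is exactly the coupling question being discussed.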

Option 2: In their own modules

```elm
module Geometry.Encode

point3d : Point3d -> Value

triangle3d : Triangle3d -> Value
```

```elm
module Geometry.Decode

point3d : Decoder Point3d

triangle3d : Decoder Triangle3d
```

```elm
module Geometry.Random

point3d : Generator Point3d

triangle3d : Generator Triangle3d
```

This approach is a bit more complex and the functionality is arguably a bit buried, but it decouples things nicely. These modules could be in their own separate packages with their own dependencies if necessary, which would make it easy to add things like a Geometry.Fuzz module later.

Thoughts?


In most cases I go with Option 1 and place them in the same module as the type. The only other thing I’ve done is create an Encoder module that has every encoder in it, and so on.

Definitely option 1. I also include the type that’s being encoded/decoded in the same module.

It’s impossible to encode a type without also having access to that type, so in practice you will always have the encoder and the type itself in the same space. That kind of sets the foundation for type-centric modules (type X, the X decoder, the X encoder, and the X helpers, all together in one module), which leads to encoders and decoders in the same space.

I haven’t done option 2, so I guess I can’t speak from experience. But I would think stuff should be together if it gets used together. That way, for any particular part of your code, you can see it and also see its related parts nearby. That doesn’t seem to be what option 2 encourages, since for any given X and Y, I couldn’t assume that the X encoder and the Y encoder get used together.


For context, option 2 is how I currently have things set up in opensolid/geometry, and I think those encoders/decoders are likely to be used together. But as part of my current transition of opensolid/geometry to ianmackenzie/elm-geometry, I wondered if that was actually the best approach or if another way would be better.

True, but that doesn’t actually mean the encoders have to be in the same module as the type (unless encoding relies on private implementation details). In general it should be possible to implement encoders in a separate module in the same package, or even in a separate package that depends on the package defining the type.
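To illustrate, an encoder living in its own module can be built entirely on the type’s public API. A hedged sketch, assuming `Point3d` publicly exposes a `coordinates` accessor returning a `( Float, Float, Float )` tuple (the exact accessor name in opensolid/geometry may differ):

```elm
module Geometry.Encode exposing (point3d)

import Json.Encode as Encode exposing (Value)
import Point3d exposing (Point3d)


-- Uses only Point3d's public API (an assumed `coordinates`
-- accessor), so this module could live in the same package
-- or in a separate package that depends on it.
point3d : Point3d -> Value
point3d point =
    let
        ( x, y, z ) =
            Point3d.coordinates point
    in
    Encode.list Encode.float [ x, y, z ]
```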


I prefer #1. I often find that I need to write several helper functions for a particular type e.g. encode, decode, a fn to update an attr, fn to return an attr as human label, etc. Having all these in a module for a particular type makes them easier to find and groups them nicely.

We started with option 2 in our app. The code base felt way less organised that way; for example, we didn’t know where to put an associated function for a type because it didn’t fit in the Encoders or Decoders modules we had.

Evan has encouraged building modules around data structures, which clearly favors option 1. I think that’s the way to go, as there are multiple good reasons to follow this practice, as explained in this talk, specifically around 5:21.

If you go for (1), you could probably put your fuzzers inside a dedicated module in tests. By adding ../src to the source directories in the tests configuration, you still have access even to private modules. The issue then is that you can’t access them from the tests of other packages built on top of it.

It would be nice to have a way to declare modules that are exposed as tests dependencies.

Yet we have Json.Encode.list instead of List.encode, so it can probably make sense sometimes to group your encoders/decoders together.

The encoders and decoders for builtin types are in the json modules so that there will be less upgrade friction if the json modules are moved into a separate package at some point. I think this particular sort of API planning strategy factors in less when writing modules for an application.

@luke That makes sense. It could also be interpreted as building modules around protocols. And I guess (let me know, Ian, if I’m wrong) Ian is probably thinking of adding multiple protocols at some point, like Protocol Buffers, since 3D data tends to grow fast and JSON might not be appropriate for some related projects.

That’s a very interesting question, thanks Ian for asking!


This is a good explanation. Look at me, with my single protocol bias :flushed:. Thanks for the thoughtful response!

I think option 2 makes more sense here. In my mind opensolid/geometry is a package about geometry - I expect the types and functions to be about doing geometrical operations. Encoding/decoding/generating/fuzzing/etc. are separate concerns that will often be depended on elsewhere.


I’ve been doing option 2, so I can import Json.Encode or Decode at module scope as appropriate.

Thanks all for the thoughtful responses! I should have mentioned up front that I was personally interested in library design - I think that for an app, putting encoders/decoders/generators with the type probably is the clear choice. But for packages where you have to worry a bit more about how many dependencies you pull in, I don’t think it’s quite as obvious.

More generally, I think JSON may not always be the default serialization format, so it feels wrong to ‘bless’ it by having Point3d.encode etc.

All that said, after some discussions on Slack (thanks @ilias!) it looks like the answer for now might be “just leave JSON support out and see if it’s actually needed”. (If anyone is currently relying on the encoding/decoding functionality built into opensolid/geometry right now and is keen to see that continue being supported in the upcoming ianmackenzie/elm-geometry, let me know!)


In the last step of a previous user study (shortened demo with 6 steps here: http://mm17-otis.pizenberg.fr/#/study), I’m converting all annotations to JSON and sending them to my server. For simplicity, I was using the encoders already baked into opensolid/geometry. Building my own encoders is quite easy though, so I could have gone that way, and I probably should have (encoding integers instead of floats, for example).
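For instance, a hand-rolled encoder that rounds pixel coordinates to integers before encoding (a hypothetical sketch using the Elm 0.19 `Json.Encode` API, not anything from opensolid/geometry) might look like:

```elm
import Json.Encode as Encode exposing (Value)


-- Encode a 2D annotation point as integer pixel coordinates,
-- producing a smaller payload than full-precision floats
encodePixel : { x : Float, y : Float } -> Value
encodePixel { x, y } =
    Encode.object
        [ ( "x", Encode.int (round x) )
        , ( "y", Encode.int (round y) )
        ]
```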

So I’m fine with removing JSON support from elm-geometry, but it could save some time to have some ready-to-use JSON interop elsewhere, such as in an elm-geometry-interop or similar package.