Moving from "Similar" to "Same"


#1

I think it’s a pretty common piece of advice (let me know if I’m wrong!) in the Elm community not to confuse “similar” with “the same”: don’t abstract, generalize, or optimize too early, and don’t think too far into the future, even if that leads to a lot of similar or superficially identical code (writing extra simple/“boilerplate” code is cheap, and well worth the awesomeness of code readability, type safety, and, uh, “refactorability”). I think this is great advice, not just for Elm, and I try my best to stick to these principles.

With that in mind, though, there may come a time when the similarities between your types and the functions you perform with them really start to add up, and you wonder whether you could refactor your code to reflect that and make your codebase easier to understand and maintain.

My question is twofold:

  1. What are your heuristics/rules of thumb for deciding that it might be time to consider some kind of abstraction (or other refactoring) over those types/functions/modules?
  2. If you decide it is time, what are some approaches you might try? I’m sure they vary wildly by the exact situation, but are there certain patterns you’ve come across?

#2

Maybe the answer to question 1 is just “Do you have a good idea for what to do”?


#3

The situation I was in that made me think of this question:

I have an app with a couple of types, Thing1 and Thing2. They are both currently opaque custom types of the form type Something = Something { ... } (is there a term for that?). The records for each type define a lot of the same fields and a lot of similar fields (e.g. { ... name : String, items : List Item1, ... } vs { ... name : String, items : List Item2, ... }), but also a few different fields.

They are also conceptually related, their modules share a lot of almost-identical functions, and the Pages, view functions and Msgs they touch share plenty of similarities.


#4

Take a look at how Article is implemented in the elm-spa-example.

If you can extract the common fields and do something meaningful with them maybe you can have some kind of an abstraction for what’s common.


#5

Can you be more specific about Thing1 and Thing2? It’s cool to paste the whole definitions.

I can think of one really nice path and one path that looks neat but isn’t worth it in the long run, but I’d need to see more of the types you have in mind to see how the paths actually relate to your situation.


#6

Thanks for your help! I wasn’t sure whether to post that comment about Thing1 and Thing2, because I’ve kind of restructured it since so it’s not as relevant, but I thought some context might be helpful.

I wanted to hear from people their approaches for dealing with situations like this, because it’s something I’ve come across a few times, but I’ve never been able to put my finger on what triggers the feeling, or how I came to a particular solution. Sorry if the question was too vague, or if I posted it in the wrong topic.

Do you think you’d be able to elaborate on the two options at all, though? It sounds really intriguing! One thing I’d considered was exposing the record type, and using extensible records to define functions that work on both types. That didn’t feel like the “right” solution, though.


#7

I’ll outline three paths in order of worst to best (in my opinion!). I feel pretty strongly that these are good recommendations, but they are based on my personal experience with each one, and I am sure other people think differently or have different scenarios than I do.

Extensible records :warning:

I wasn’t even thinking of this. I personally find it leads to a mess in the long run. Making sure different types stay in sync structurally is not really very easy in this world because the dependency goes like this:

Thing1   Thing2
     \   /
   sharedFunc

So you change Thing1, update sharedFunc, and then realize that you cannot change Thing2 in the same way. I think this is not so fun to see in practice.

The literature on extensible records had big dreams for them being useful for this sort of thing. They had not been implemented in OCaml or SML or Haskell, so I was excited to see how they would work in Elm. I just have not seen them pan out in practice though. You have to think real hard and things still feel misaligned and messy in the end. So overall, I think extensible records are useful for making type inference on .x expressions easy, but a red herring for pretty much everything else.
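To make the dependency concrete, here is a minimal sketch of what the extensible-record approach might look like. The module name, the field names beyond name and items, and the summary function are all hypothetical, and the record aliases are exposed rather than opaque, which this approach requires.

```elm
module SharedFunc exposing (Thing1, Thing2, summary)

-- Hypothetical record shapes, loosely based on the fields mentioned
-- earlier in the thread. Only `name` and `items` come from the thread.


type alias Thing1 =
    { name : String
    , items : List String
    , color : String
    }


type alias Thing2 =
    { name : String
    , items : List Int
    , weight : Float
    }


-- Accepts any record that has at least a `name` and an `items` list,
-- so it works on both Thing1 and Thing2.
summary : { r | name : String, items : List item } -> String
summary thing =
    thing.name ++ " (" ++ String.fromInt (List.length thing.items) ++ " items)"
```

If Thing1 later needs items to be something other than a List, summary either stops accepting Thing1 or has to change in a way that Thing2 must follow. That is the coupling described above.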

Types with Holes :warning:

At some point I got into the idea of just adding more type variables to my types in the compiler. So instead of type AST = ... I had type AST loc var tipe doc = ..., allowing me to fill in different parts differently depending on context.

Thing1 == GenericThing a b c d e == Thing2

This is the path that I think looks neat, but I have found it is not worth it in the end. In my case, each phase of the AST was similar, and at the time, the Haskell community online was really excited about generic traversals of generic data structures. The trouble is that all of my traversals were specific. They relied on certain details interacting with other details. So in the end, I had done a bunch of work to have less code, but I ended up with code that was more complex and more frail to changes. Touching code about parsing meant messing up totally unrelated traversals two phases later. I eventually just made four separate AST types and the code got simpler and faster.

So I think this pattern is attractive in that it seems to promise less code, but I found that it led to code that was much harder to understand and modify. Instead of being messy like the extensible records approach, I found the result here was complex.
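Applied to the Thing1 and Thing2 from the question, a "types with holes" version might look roughly like the sketch below. GenericThing, its type variables, and the extra field are assumptions made up for illustration, not anything from the compiler code mentioned above.

```elm
module GenericThing exposing (GenericThing, Thing1, Thing2, itemCount)

-- Hypothetical: one generic record whose varying parts are type variables.


type alias GenericThing item extra =
    { name : String
    , items : List item
    , extra : extra
    }


-- Each concrete type is just a particular way of filling in the holes.
type alias Thing1 =
    GenericThing String { color : String }


type alias Thing2 =
    GenericThing Int { weight : Float }


-- A traversal that really is generic is easy to write...
itemCount : GenericThing item extra -> Int
itemCount thing =
    List.length thing.items
```

The trouble described above starts with the functions that are not generic: anything that needs to know what item or extra actually is ends up tied to one instantiation anyway, while still paying for the extra type variables everywhere.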

Nesting Types :white_check_mark:

I try to find a subset of information that makes sense as a type of its own. So let’s say Thing1 and Thing2 had shared fields about their location. I would see if a Location type made sense. Are there helper functions specifically relevant to it? Is it exactly the same in both cases? Maybe it corresponds to some idea that is true about your overall system? If it seems like a solid concept, I would consider making a module around it.

Now you have a structure like this:

    Location
      /   \
Thing1   Thing2

So everyone gets the benefit of the Location type. They can share decoders and helper functions. If you change something about Thing1 it does not ruin any code that works on Thing2. They are just separate.
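As a rough sketch of that structure, assuming a Location made of latitude and longitude (the fields, JSON keys, and decoder names here are hypothetical, and everything is kept in one module only for brevity):

```elm
module Things exposing (Location, Thing1, Thing2, locationDecoder, thing1Decoder, thing2Decoder)

import Json.Decode as Decode exposing (Decoder)


-- The shared concept. In a real project this would likely get its own module.
type alias Location =
    { latitude : Float
    , longitude : Float
    }


locationDecoder : Decoder Location
locationDecoder =
    Decode.map2 Location
        (Decode.field "latitude" Decode.float)
        (Decode.field "longitude" Decode.float)


-- Thing1 and Thing2 stay separate opaque types that each contain a Location.
type Thing1
    = Thing1 { name : String, location : Location, items : List String }


type Thing2
    = Thing2 { name : String, location : Location, items : List Int }


thing1Decoder : Decoder Thing1
thing1Decoder =
    Decode.map3
        (\name location items ->
            Thing1 { name = name, location = location, items = items }
        )
        (Decode.field "name" Decode.string)
        (Decode.field "location" locationDecoder)
        (Decode.field "items" (Decode.list Decode.string))


thing2Decoder : Decoder Thing2
thing2Decoder =
    Decode.map3
        (\name location items ->
            Thing2 { name = name, location = location, items = items }
        )
        (Decode.field "name" Decode.string)
        (Decode.field "location" locationDecoder)
        (Decode.field "items" (Decode.list Decode.int))
```

Both decoders reuse locationDecoder, but changing the items of Thing1 touches nothing that Thing2 depends on.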

The risk here is that a Location that looks the same in both cases today will only be similar in the future. If that happens, many people just start making Location more complex rather than pushing it back into Thing1 and Thing2. So the risk is that if you draw these lines wrong, you end up with a bunch of optional fields that are actually not optional. They are contextual! People think “oh, I’ll just make a little edit. This type surely exists for a good reason.” And when folks realize that they have all these optional fields that are actually contextual, they may try to get to Location a b c or { a | location : Location } to “fix” things. Now someone spent a lot of time getting that to work, but no one else on the team can understand or modify it easily anymore and you are gonna have all the problems discussed in the sections above.
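A hypothetical illustration of that drift, and of what backing out might look like (all field names here are invented for the example):

```elm
module LocationDrift exposing (Location, Thing1Location, Thing2Location)

-- Drifted version: fields needed in only one context were added to the
-- shared type as Maybe values. They look optional but are really contextual.


type alias Location =
    { latitude : Float
    , longitude : Float
    , altitude : Maybe Float -- only ever set for Thing1
    , timezone : Maybe String -- only ever set for Thing2
    }


-- Backing out: push the differences back into each thing, so that every
-- field is required in the context where it actually applies.


type alias Thing1Location =
    { latitude : Float
    , longitude : Float
    , altitude : Float
    }


type alias Thing2Location =
    { latitude : Float
    , longitude : Float
    , timezone : String
    }
```

The two replacement types are similar, not the same, and that is fine: neither one needs a Maybe to describe a field that is always present in its own context.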

So I give this a :white_check_mark: because it seems to work out the most often of any technique I know of. It has risks when it comes to misidentifying the same and similar where you just have to know how to back out of the situation without leaving messy debris. This is why I try to emphasize the same/similar distinction a lot!


In the end, these are just my opinions based on the kinds of code I have written. I prioritize “easy to understand and modify” in my code, and I have found that sometimes that means having code that is similar :man_shrugging:


#8

Thank you for a considered and insightful reply! I have definitely gone down the Types with (too many) Holes path before, and regretted it :slight_smile:.


#9

I just realized that the only way to be sure you’re not confusing “the same” with “similar” is to wait until something needs to change. Then you will know whether the change affects only one instance (similar) or whether both have to change (the same).

So even if it might not be desirable to actually wait until a change, this thought experiment might be useful. What sorts of things do I expect to change? Will those changes affect all instances or just one?

