Change records to be extensible by default in Elm

In Elm one can define record types in two different ways:

-- strict
type alias User = { name : String, age : Int }

-- extensible
type alias User a = { a | name : String, age : Int }

Now if I write a function that takes a User using the regular (strict) record type definition, the compiler only accepts values that contain no properties other than name and age.
In the second (extensible) version it accepts any record that contains those properties along with zero or more other properties.
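
A minimal sketch of the difference, assuming a project with elm/html installed. The names UserLike, describeStrict, describeOpen and admin are hypothetical and exist only for this illustration:

```elm
module Main exposing (main)

import Html exposing (Html, text)


-- Closed ("strict") alias: exactly these two fields.
type alias User =
    { name : String, age : Int }


-- Extensible alias: at least these two fields, plus whatever `a` holds.
type alias UserLike a =
    { a | name : String, age : Int }


describeStrict : User -> String
describeStrict user =
    user.name ++ " (" ++ String.fromInt user.age ++ ")"


describeOpen : UserLike a -> String
describeOpen user =
    user.name ++ " (" ++ String.fromInt user.age ++ ")"


main : Html msg
main =
    let
        -- Hypothetical record with one extra field.
        admin =
            { name = "Ada", age = 36, role = "admin" }
    in
    -- `describeStrict admin` would be rejected by the compiler,
    -- because `admin` carries the extra `role` field.
    text (describeOpen admin)
```

Both functions have the same body; only the annotation decides whether the extra role field is allowed.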

My Proposal: Change Elm in a future version so that record types are always extensible and can contain other properties.

Here’s why:

  • the second version is more flexible but introduces a type variable, is more verbose and harder to grasp for beginners
  • the constraint of the second version (requiring at least the properties that were defined) is enough to do anything you want to do if that type is passed to a function, and the function does not need to know or care about other properties on the record. They are hidden away and the function cannot access them (like the current behavior with extensible records). When do you ever say: “I want my function to not accept records that contain more properties than I am asking for, even though I can’t access them anyway”?
  • compatibility: This is a big one. Say I am publishing a library which exports a function that requires the User record with name and age. Say that in the next version of the library the function does not need the age anymore and only takes the name. This should be a compatible change because the constraint on the input data is loosened. Yet currently the compiler will complain, and the library author is forced to release a new major version even though the constraint requires less than before. Of course the library author could have used an extensible record to begin with, but those things are hard to anticipate up front, so why not make that the default behavior?
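
The compatibility point can be sketched as follows. The module name Greeting and the function greet are hypothetical, and the comments only describe which call sites keep type-checking, not how any release tooling would classify the change:

```elm
module Greeting exposing (greet)

-- Version 1 of a hypothetical library function, using a closed record:
--
--     greet : { name : String, age : Int } -> String
--
-- Version 2 no longer needs `age`. With closed records the signature becomes
--
--     greet : { name : String } -> String
--
-- and a caller passing { name = "Ada", age = 36 } stops compiling.
--
-- Had version 1 been written with an extensible record, going from
--     { a | name : String, age : Int }   to   { a | name : String }
-- would leave such call sites valid, as with the version below.


greet : { a | name : String } -> String
greet user =
    "Hello, " ++ user.name
```

With this signature, greet { name = "Ada", age = 36 } and greet { name = "Ada" } are both accepted.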

What does everyone think about that? Can anyone think of a reason or situation where my proposal is not desirable? Looking forward to discussion.


This might seem sensible at first, but what about functions that return a record? Your idea would mean that a function

createUser : Arguments -> { name : String, age : Int }

would now be interpreted as

createUser : Arguments -> { a | name : String, age : Int }

which in the current system means that createUser guarantees that a can be any record type the caller of the function wants. For example, I could call this function when I need { name : String, age : Int, address : String }, but of course createUser can’t do that because it cannot magically create an address. In fact, functions with extensible records only in the return type are impossible to write in Elm (except with Debug.todo).
(I use the current syntax because it allows distinctions that your new syntax cannot express.)

You might propose that the meaning of {a | name : String, age : Int } in a return type position should be changed, but this produces (at least one) new problem:

In a function like increaseAge : { a | age : Int } -> { a | age : Int } we want to express that a does not change and that the caller can choose a; i.e. this function can increase the age field of any record we give to it and does not change fields other than age. Because a is implicit in your syntax, we can no longer express that a remains the same. So either a is implicitly constant whenever the same record is used in multiple positions (which would be strange, because in increaseAge the caller can choose the value of a for the return type, whereas in createUser above the caller cannot make that choice), or in your syntax the type of increaseAge would have to be { a | age : Int } -> { b | age : Int }, which means that the type of increaseAge no longer guarantees that it leaves the other record fields unchanged.
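
For reference, one possible implementation of increaseAge in today's Elm. This is a sketch: the module name Age is hypothetical and the body is only one way to satisfy the signature:

```elm
module Age exposing (increaseAge)

-- The same type variable `a` appears on both sides, so whatever extra
-- fields the caller passes in come back out with the same types.


increaseAge : { a | age : Int } -> { a | age : Int }
increaseAge person =
    -- Record update copies every field and replaces only `age`.
    { person | age = person.age + 1 }
```

Calling increaseAge { name = "Ada", age = 36 } yields a record that still has name = "Ada" and now has age = 37.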


Here’s a use case where I think extensible by default records is undesirable.

Suppose I have two vector math modules. One for 2d vectors and one for 3d vectors. Each defines an add function.

module Vector2 exposing (..)

type alias Vector2 = { x : Float, y : Float }

add : Vector2 -> Vector2 -> Vector2

module Vector3 exposing (..)

type alias Vector3 = { x : Float, y : Float, z : Float }

add : Vector3 -> Vector3 -> Vector3

If I then try adding two 3d vectors together, there is a risk of using the wrong add function and ending up with a Vector3 that didn’t get the previous z values added together.

v1 = { x = 4, y = 3, z = 2 }
v2 = { x = 1, y = 5, z = 6 }

-- Oops, I used Vector2 instead of Vector3 and the compiler won't catch this mistake
a = Vector2.add v1 v2

I have never used extensible records and I don’t need all records to be implicitly extensible. Also I don’t think it’s very common for a function in a package to be updated to no longer require certain record fields.

Maybe I’m not understanding @ChristophP’s desired use case, but you can already define functions that take any record with certain fields, without explicitly defining an extensible record, like so:

doSomethingWithXY : { a | x : Float, y : Float } -> Float

Functions with a type signature like that will take any record with the defined fields, even if they have extra fields. Here’s an example on Ellie: https://ellie-app.com/9xP6WYjjwZSa1

My point is that, if extensible records are the default, it can lead to bugs that the compiler would otherwise have caught, such as Vector2.add being used with two Vector3 values, resulting in the z coordinates not being added.


Sorry, I was replying to the OP, my post wasn’t specifically referring to what you said, I must have clicked the wrong “reply”. I agree with you that you lose some type safety, it’s a trade off.

You didn’t reply to my message but I misunderstood and thought you had since you also used x and y fields in your example. My mistake!


They were the first field names that popped into my head. I’ve clarified my original post!

I like your example. I bet you work with vectors quite a bit in the cool games you have made.

I see there’s a possibility of using the function that does not add the z component. To prevent that sort of thing, the vector type could still be turned into an opaque type, which I think would be a good alternative.

I think having records require only a minimum structure would be practical, but it’s a good example that you mentioned.
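
A rough sketch of the opaque-type alternative. The helpers fromXYZ and toRecord are hypothetical names chosen for this illustration:

```elm
module Vector3 exposing (Vector3, add, fromXYZ, toRecord)

-- Opaque: the constructor is not exposed, so other modules cannot
-- treat a Vector3 as a plain record with x and y fields.


type Vector3
    = Vector3 { x : Float, y : Float, z : Float }


fromXYZ : Float -> Float -> Float -> Vector3
fromXYZ x y z =
    Vector3 { x = x, y = y, z = z }


add : Vector3 -> Vector3 -> Vector3
add (Vector3 a) (Vector3 b) =
    Vector3 { x = a.x + b.x, y = a.y + b.y, z = a.z + b.z }


toRecord : Vector3 -> { x : Float, y : Float, z : Float }
toRecord (Vector3 v) =
    v
```

Because a Vector3 value is no longer a record, passing it to a record-based Vector2.add is a type error regardless of whether records are extensible by default.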


True, I know that it’s entirely possible. But my suggestion was to reduce the syntactic noise, drop the type variable, make records a bit more flexible, and keep the convenience of type aliases.

I see that return types are a bit different, thanks for bringing that up. I wouldn’t really see it as “any record the caller function wants” but more like “a record that contains (at least) the following properties”. The “at least” part is kind of redundant, since you as a caller can of course only access properties that you know about. So my proposal doesn’t really help with return types, but the way I see it, it doesn’t hurt them either.

You might not want to interpret a return type of { a | name : String, age : Int } as “any record the caller of the function wants”, but that is precisely what this return type currently means (well, the record the caller wants needs to have the name and age fields). Look at this Ellie, for example: https://ellie-app.com/9xSPvpGrz6ja1. You will notice that this compiles, even though we call createUser to recreate our model of type Model, and Model has an extra field that the return type of createUser does not specify. This shows that the caller can choose what a stands for (here: the record type { extraField : String }). (Of course, it is impossible to actually write such a function createUser honestly, hence I had to use Debug.todo in the implementation.)

The Ellie also contains the second function increaseAge which shows why we want it to work like this. We don’t want increaseAge to simply drop the additional fields, so its return type a has to be the same as the argument type a and the argument type is something the caller chooses. So it is a good thing that the function type can specify that a does not change and if we want the types to work consistently, we also have to be able to choose a if the record type occurs only as a return type.