Language Idea: Limit number of parameters in custom types to three

robin.heggelund · September 23, 2018, 7:41pm

In Elm 0.19 you can no longer create tuples with more than three elements. The reasoning behind this is good. Tuples do little to tell you what each parameter means, so one should prefer records to tuples when there are a bunch of parameters.

However, this restriction does not apply to custom types. This seems odd, as I’d think the same arguments apply.

I’ve written a blog post about this: https://dev.to/skinney/language-idea-limit-custom-types-to-three-arguments-27p1

What do people think?

rgrempel · September 23, 2018, 9:19pm

One thing to consider is that a recalcitrant user would be able to work around the restriction you describe. Consider a custom type that looks like this:

type Color
    = RGBA Int Int Int Float
    | HSLA Float Float Float Float

You’re probably thinking that this should be turned into something like:

type Color
    = RGBA { red : Int, green : Int, blue : Int, alpha : Float }
    | HSLA { hue : Float, saturation : Float, lightness : Float, alpha : Float }

Or, perhaps this:

type Color
    = RGBA RGBAValue
    | HSLA HSLAValue

type alias RGBAValue =
    { red : Int
    , green : Int
    , blue : Int
    , alpha : Float
    }

type alias HSLAValue =
    { hue : Float
    , saturation : Float
    , lightness : Float
    , alpha : Float
    }

However, a truly recalcitrant user could turn it into this instead:

type Color
    = RGBA ( Int, ( Int, ( Int, Float ) ) )
    | HSLA ( Float, ( Float, ( Float, Float ) ) )

In other words, as long as you allow a plain-old-tuple, a determined user can nest as many as they like. However, perhaps most users would consider the extra commas and brackets to be sufficiently ugly that they would move in your preferred direction instead.
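To make the trade-off concrete, here is a minimal sketch of what reading a value back out looks like in the nested-tuple version versus the record version. The module name, the Nested/Record prefixes, and the two helper functions are invented for illustration so that both variants can live in one file.

```elm
module ColorAlpha exposing (NestedColor(..), RecordColor(..), nestedAlpha, recordAlpha)

-- Nested-tuple workaround: legal in 0.19 because no single tuple has more than three elements.
type NestedColor
    = NestedRGBA ( Int, ( Int, ( Int, Float ) ) )
    | NestedHSLA ( Float, ( Float, ( Float, Float ) ) )


-- Record version: every field is named.
type RecordColor
    = RecordRGBA { red : Int, green : Int, blue : Int, alpha : Float }
    | RecordHSLA { hue : Float, saturation : Float, lightness : Float, alpha : Float }


-- The reader has to count positions to find the alpha channel.
nestedAlpha : NestedColor -> Float
nestedAlpha color =
    case color of
        NestedRGBA ( _, ( _, ( _, alpha ) ) ) ->
            alpha

        NestedHSLA ( _, ( _, ( _, alpha ) ) ) ->
            alpha


-- The record pattern names the one field that is needed.
recordAlpha : RecordColor -> Float
recordAlpha color =
    case color of
        RecordRGBA { alpha } ->
            alpha

        RecordHSLA { alpha } ->
            alpha
```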

robin.heggelund · September 23, 2018, 10:27pm

zinggi57 on Slack also mentioned the lack of pattern matching on records as a concern, which is something I’d overlooked.

christian · September 24, 2018, 12:06am

I absolutely do not support this at all. The loss of greater than 3-way tuples was a substantial detriment to pattern matching groupings of values (especially if you want to make an ad-hoc grouping of several values that are in no other way related, in the same way you define a lambda function when you do not need a named one).

Removing the ability to have more than 3 parameters in a value constructor of a custom type seems completely arbitrary. Who is to say that 4 is too many? Why not also prohibit function names that are less than 5 characters long? Surely a 4 character function name does little to tell you what the function does. Or 4-level nested records? Surely you should use a flat data model. Is it really the job of the compiler to try to eliminate what you might consider “code smells”?

I cannot speak for anyone other than myself and perhaps my team members at work, but I use Elm because I see the value in things like a static type system, pure functions, pattern matching, and controlled side effects. These features are conducive to writing reusable, easily-refactorable, and generally error-free code. I do not use it because I need to be policed about having too many parameters in my types (or functions, or tuples, or records…).

Moreover, as mentioned, your suggestion to use records instead would be tenable if records actually supported pattern matching in things like case statements. Unfortunately they do not, so this is really a non-starter. Pattern matching is one of Elm’s best features, which I raved about in our production use writeup last year. But using it very often necessitates either tuples or union/custom types.

robin.heggelund · September 24, 2018, 3:42am

in the same way you define a lambda function when you do not need a named one

The difference being that you cannot inspect a lambda. You never pull data out of a lambda, only run it. Since tuples are something you have to read back out again, being nudged to name those things makes sense.

Removing the ability to have more than 3 parameters in a value constructor of a custom type seems completely arbitrary.

It isn’t though. It’s a common theme in books and best practice tips to get the number of function arguments down to three, preferably one or two, because after that it becomes very difficult to remember what order the arguments go in, especially if they’re all the same type. Both Uncle Bob’s Clean Code and Effective C# come to mind, though I’m sure there are others. The same argument applies to tuples and custom types.

Why not also prohibit function names that are less than 5 characters long?

Functions are, amongst other things, often used as getters and setters. x is a perfectly valid name in a record to represent the x coordinate, so x would also be a perfectly fine function name for a getter function.

Or 4-level nested records? Surely you should use a flat data model.

This would actually limit what you could do with the language. My proposal (with the exception of pattern matching in records, which I did say I forgot to account for) would only make it harder to let code grow without thought.

While a flat data model does make the code easier in general, it can be easier to reason about a nested structure.

There are also performance reasons to nest instead of keeping things flat.

Is it really the job of the compiler to try to eliminate what you might consider “code smells”?

On this point I’m sure we are both thinking yes. Elm is statically typed and enforces pure functions because the compiler also serves as a strict linter.

Or would you say that it isn’t a code smell when every function in your program can accept any type? Or that it isn’t a code smell that every function in your program can perform side effects?

I’m sure that we both are using Elm because it eliminates a bunch of code smells enabled in other languages. What we’re discussing now is if a custom type which contains 4 or more parameters should be named or not.

would be tenable if records actually supported pattern matching in things like case statements. Unfortunately they do not, so this is really a non-starter.

What if records had better pattern matching support?

Finally I would like to add that there are languages where custom types support at most one parameter, and if you wanted to store more in it you would have to use tuples or records. It could very well be that three is too big a number and that it should be two or even just one.

I do agree that pattern matching on records would have to be better for my suggestion to be feasible, which is something I’ve admitted to overlooking.

Warry · September 24, 2018, 7:50am

I made a similar proposal a few months ago (A type proposal) where I suggested having at most 1 parameter in union types, BUT still expanding the constructor function for records and tuples, like:

type MyType a
  = Const
  | Value a
  | Record { foo : String }
  | Record2 { foo : String, bar : Int }
  | Tuple (a, a)

-- Type Constructors:
Const   : MyType a
Value   : a      -> MyType a
Record  : String -> MyType a
Record2 : String -> Int -> MyType a
Tuple   : a -> a -> MyType a
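
For comparison, a rough sketch of how the expanded constructor for Record2 can be approximated in Elm as it stands, using a hand-written helper. The module name and the record2 helper are made up for illustration; under the proposal this helper would not need to be written by hand.

```elm
module MyTypeSketch exposing (MyType(..), record2)

-- Each variant carries at most one parameter, as in the proposal.
type MyType a
    = Const
    | Value a
    | Record2 { foo : String, bar : Int }


-- Hand-written stand-in for the proposed expanded constructor:
-- takes the fields positionally and builds the record.
record2 : String -> Int -> MyType a
record2 foo bar =
    Record2 { foo = foo, bar = bar }
```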

rupert · September 24, 2018, 9:50am

This change would likely cause me no issues, as I already tend to switch to records at a low threshold. For example, the restrictions on tuples have not required me to change any of my code in the move to 0.19.

That said, there is something about this heavy handed approach I feel is off target - were overly large tuples really a big problem before 0.19? I think I could have found tens of other issues that I would have prioritised higher.

My personal preference would be to make these things warnings instead of compile errors. Or make them warnings in one compiler release first, before making them errors in a subsequent one, dependent on community feedback on how well the rule has been received.

madasebrof · September 24, 2018, 12:44pm

Hmm.

This is interesting, as it gets to a) what is so great about Elm, and b) what drives some folks crazy about Elm:

The idea that someone else can “tell them what is good for them”.

I had no idea that 0.19 restricted tuples to 3. I think there is maybe one function where we return a 4 tuple, which we could obviously change to return a record. (Ironically, I remember the last time I looked at that function I spent about 5 minutes wrapping my head around all of the return values!)

I don’t think we ever use a case statement with more than 3 in a tuple to pattern match on.

Just looking through our codebase, we do indeed have multiple times where we are passing a custom type variant that has more than 3 parameters.

Clearly, it wouldn’t be rocket science to change a custom type variant from | CustomTypeA String Int Int TypeAliasB to | CustomTypeA TypeAliasC, where:

type alias TypeAliasC =
    { name : String
    , count : Int
    , id : Int
    , typeAliasB : TypeAliasB
    }

So, in short, I could go either way. If it was part of the language, I’d work around it, and it would probably force us to write code that is a bit easier to follow. At the moment, I can’t think of a showstopper reason not to do it.

But on this note, I will say that I think limiting the use of type variables in function signatures may be a good thing! I think one could possibly make the argument that, outside of the core library, you don’t actually need to use type variables. (I haven’t fully thought this through, and am not sure there is a real mechanism to do this, but maybe just a best-practices in the Elm guide.)

christian · September 24, 2018, 1:30pm

There are numerous examples in @rtfeldman’s SPA Example that use type variables in function signatures. Type variables are necessary to make certain pieces of code reusable.

I have to agree with @rupert here. Suggestions like this seem like an unnecessarily heavy-handed approach to language design. If you make a claim that the ability to do X is a problem, then you should make a strong case for it by including both examples of the problem repeatedly occurring and also thoroughly researching how X is currently used. Perhaps there’s a valid use case you are overlooking? What I see here instead is claims that X is a problem and suggestions to jump straight to evisceration immediately afterwards.

rupert · September 24, 2018, 3:35pm

I think perhaps you are joking or trolling us… ?

madasebrof · September 24, 2018, 4:21pm

Who is “us”? I am one of us.

Have you ever read any of my other posts???

Trolling??? For real?

A) as I said, I hadn’t really thought it through fully.

B) I actually think you could make a fairly strong argument for not needing type variables outside of certain core functions.

I’ve been coding daily in Elm for the last 18 months, and have written tens of thousands of lines of code. I’ve never used a type variable, and was more just postulating that for much of what one uses type variables for, one could use custom types (union types) instead.

Also, having seen much code that uses type variables, I find it generally makes the code much more confusing than it would be if someone used custom types. Also, I just happened to notice that in the newest version of elm-visualization, they were looking to remove type variables from a bunch of function signatures.

Again, for certain libraries it may just not be possible. Just trying to have a discussion.

No need to go there, please.

joelq · September 24, 2018, 6:00pm

@madasebrof I’m curious to hear more about your thoughts on type variables. I didn’t want to derail this conversation on limiting custom type parameters so I created a separate thread for discussing type variables: https://discourse.elm-lang.org/t/the-use-and-over-use-of-type-variables/

madasebrof · September 24, 2018, 6:21pm

Just looking through the code for @rtfeldman’s SPA Example.

It would be fairly easy to refactor using only custom types instead of type variables. Basically, you’d just have to move the global Model and global Msg to a top-level module, then import that where you need it.

That way, you can use Msg instead of msg. If you needed to refer to a local Msg (there are 7 unique versions of the Msg type in the SPA example), you’d just have to refer to them specifically as needed, e.g. Article.Feed.Msg, Page.Article.Editor.Msg, etc.

That’s how I would design it in the first place. Again–not throwing stones, just to explain why what I said made sense to me!

If I get time, I’ll rebuild the Elm SPA example just to show what I’m talking about.

Also, not advocating for this to be a thing. I was just throwing it out there, as I think the more explicit your code is, the easier it is for someone else to understand what you are doing!


itsgreggreg · September 24, 2018, 6:30pm

In an application that I am working on we have a pagination “component” that is used in many lists. It takes in a (Int -> msg) (note the lowercase msg) as a message constructor from the enclosing module to produce when either a number or arrow button is pressed. Without type variables, you’d have to intercept this specific message from the pagination update but forward all of the rest of its messages on to it.

Because this msg must come from outside, it must be specified. And because it must be specified, it cannot be forgotten in your enclosing update function. If you take these two parts away, it becomes less clear how to use the component. Your application will compile just fine and you will get no feedback as to why your button clicks do nothing.
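
A minimal sketch of what such a pagination view could look like, assuming the elm/html package. The module name, function names, and argument order are hypothetical and not taken from the application described above.

```elm
module Pagination exposing (view)

import Html exposing (Html, button, div, text)
import Html.Attributes exposing (disabled)
import Html.Events exposing (onClick)


-- The caller supplies `toMsg`, so this module never needs to know
-- the enclosing module's concrete Msg type.
view : (Int -> msg) -> Int -> Int -> Html msg
view toMsg currentPage pageCount =
    div []
        (List.range 1 pageCount
            |> List.map (pageButton toMsg currentPage)
        )


-- One numbered button; clicking it produces the caller's own message.
pageButton : (Int -> msg) -> Int -> Int -> Html msg
pageButton toMsg currentPage page =
    button
        [ onClick (toMsg page)
        , disabled (page == currentPage)
        ]
        [ text (String.fromInt page) ]
```

An enclosing module with a message variant such as GoToPage Int (again a made-up name) would call Pagination.view GoToPage model.page 10. Leaving out the first argument is a type error, which is the "cannot be forgotten" property described above.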

rupert · September 24, 2018, 7:14pm

The trouble with mailing lists is that the tone can easily be read the wrong way. It was meant to be taken as a light-hearted remark, so please don’t take any offence.

madasebrof · September 24, 2018, 7:16pm

No worries!!!

bChiquet · September 26, 2018, 10:52pm

I would like to agree with this, and share a story about what I have now come to think about as elmsplaining.

When 0.19 was announced, I made a library which was initially a replacement for some of elm-monocle use cases, but has other interesting properties. Two use cases for this lib are accessing nested records easily, which may not be a great thing, but also embedding full components (such as datepickers) from libraries in your code in what I feel was a better way.

As 0.19 came out, I had a week of vacation, and wanted to dedicate some of this time to publishing a nice datepicker to showcase that work. However, this lib was never published, because back then, the date library was gone, and rather than happily shipping my code, I spent that time reading that what I needed was actually not necessary, some more time debating about why my need was actually legitimate, and eventually I ended up with nothing. Eventually, a third-party date lib gained momentum, but I was too annoyed with the time I had spent justifying the existence of my needs to people I don’t know to keep trucking. I eventually significantly scaled down my involvement in this community besides my work.

My point here is that, while a broken language may be detrimental to its users, a language and a community built around overzealous scrutiny of its users, and decisions built on partial knowledge of what users do is also detrimental.

I also don’t want people cargo culting on Evan’s decisions and telling me that if I’m running into trouble because of those, then, certainly, I must be wrong and should be glad to get educated. This is an egregious appeal to authority, which also works as an inhibitor to useful feedback on language changes.

Please, stop telling people how they should work. If anything, provide your own work as an example instead.

nmsmith · September 27, 2018, 4:55am

I’m of the opinion we should stick to the topic of this thread, and try to determine in what ways limiting tuples and custom types (specifically multi-parameter data constructors) would be beneficial or detrimental. Debating whether language developers should have the authority to force you to write code in a particular way is a wholly separate issue (and tends to lead towards personal attacks).

I don’t personally understand why “unnamed fields can harm readability” (the issue implicitly at hand) is considered by some to be a contentious proposition. It’s generally accepted that program comprehension is a serious challenge in software engineering, and we also know that people are human and tend to do what “works” in the short term to get a job done, and we don’t always go back and neaten things up afterwards. If people reach for large tuples or data constructors by default (gradually contributing to future comprehension issues), then it seems worthwhile to close off this path and instead provide another equally powerful alternative that is more comprehensible. I’d encourage us all to see if we can find such an alternative.

Records are one alternative that has been proposed. It appears the biggest problem with the thread’s proposition to replace multi-parameter data constructors with records is their support for pattern matching: you can’t use literals in a record pattern. This seems like something that could be implemented in a future compiler release if we truly need it.
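
To illustrate the limitation, here is a small sketch that contrasts the two styles. The type names, constructor names, and helper functions are invented for this example. With positional parameters the literals go straight into the pattern; with a record payload the pattern can only bind field names, so the comparison moves into the branch body.

```elm
module BlackCheck exposing (Named(..), Positional(..), isBlackNamed, isBlackPositional)

type Positional
    = Rgba Int Int Int Float


type Named
    = NamedRgba { red : Int, green : Int, blue : Int, alpha : Float }


-- Literal patterns are allowed on positional constructor arguments.
isBlackPositional : Positional -> Bool
isBlackPositional color =
    case color of
        Rgba 0 0 0 _ ->
            True

        _ ->
            False


-- A record pattern only binds names, so the values are checked
-- with ordinary comparisons after destructuring.
isBlackNamed : Named -> Bool
isBlackNamed color =
    case color of
        NamedRgba { red, green, blue } ->
            red == 0 && green == 0 && blue == 0
```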

Speaking more generally about the idea of deprecating both (large) tuples and multi-parameter data constructors: if pattern matching support is the only real issue that needs to be tackled, then we’ve reduced our original problem down to “how can we improve pattern matching on values which aren’t grouped in tuples or data constructors?”. Here’s another issue from the Slack thread which reveals a requirement:

At work, we use multi-way tuples in our case statements, for evaluating complex conditions against values that are not grouped in any other way.

That sounds like a good use-case for large tuples to me, and I can see why they might be useful here. However if we dig down a bit it seems like the real problem statement is “I need to pattern match on multiple unrelated values”. Maybe we can deliver this without tuples per se. A suggestion on the Slack thread is to support comma-separated values in case expressions. I guess this concept is similar to that of a tuple, but it is only part of the syntax of a case expression; it’s not a first class value. Comma-separated values seem worthy of discussion.
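
As a sketch of how this use case looks in Elm 0.19, the example below matches on four otherwise unrelated values by nesting a pair inside a 3-tuple, since a 4-tuple is no longer accepted. The Role type, the canEdit function, and its rules are all made up for illustration.

```elm
module Access exposing (Role(..), canEdit)

type Role
    = Admin
    | Member
    | Guest


-- The tuple exists only to feed the case expression; it is never stored.
canEdit : Role -> Bool -> Bool -> Bool -> Bool
canEdit role isOwner isLocked isArchived =
    case ( role, isOwner, ( isLocked, isArchived ) ) of
        -- Admins can edit anything that is not archived.
        ( Admin, _, ( _, False ) ) ->
            True

        -- Members can edit their own unlocked, unarchived items.
        ( Member, True, ( False, False ) ) ->
            True

        _ ->
            False
```

A comma-separated case expression, as suggested on Slack, would cover this situation without building the intermediate tuple.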

These are the challenges and potential solutions that I think are important to discuss if we can first agree that “unnamed fields can harm readability” is a reasonably true proposition that is worthwhile acting on.

robin.heggelund · September 27, 2018, 5:04am

Couldn’t have said it better myself, @nmsmith. I’m planning to write a blog post on this when I have the time, hopefully this Sunday.

nmsmith · September 29, 2018, 2:27pm

I’ve done a bit of background reading and it turns out that PureScript allows comma-separated sequences of values in case expressions. This seems like a useful alternative to big tuples and big data constructors if you need to match on multiple disparate values. It alleviates the concern of having to use a verbose record-based alternative. PureScript doesn’t actually have built-in tuples.

I’m just documenting this as something that might be implemented alongside any restrictions to tuples or data constructors.
