Earlier I suggested this is not exactly like extensible records for two reasons:
- That the type of value @X "hi" would be the open type [ a | @X String ]
- There needs to be a special rule for case expressions.
Both of these differences appear in your write up, which makes sense. The fact that the general mechanism is the same does not account for practical differences like this when it comes to implementation and error messages. That’s the point I was trying to make above.
Example
The fact that @X "hi" has an open type has particularly big consequences with error messages. Take this function:
toOutput input =
    case input of
        @A a -> @Quick a
        @B b -> @Brown b
        @C c -> @Fox c
        @D d -> @Jump d
        @E e -> @Lazy e
        @F f -> @Dog f
        @G g -> @How g
        @H h -> @Much h
        @I i -> @Wood i
        @J j -> @Could j
        @K k -> @Wood k
        @L l -> @Chuck l
        @M m -> @Chuck m
        @N n -> @If n
        @O o -> @Wood o
        @P p -> @Coud p
        @Q q -> @Chuck q
        @R r -> @Wood r
        @S s -> @Could s
        @T t -> @Chuck t
        @U u -> @Wood u
        @V v -> @Could v
        @W w -> @Check w
        @X x -> @Wood x
        @Y y -> @Brown y
        @Z z -> @Fox z
Where is the typo? Is there one typo? Zero? Multiple?
And are there any type errors here? How would you even begin to figure that out?
With a closed ADT, the compiler can underline the exact constructor that has a typo or type error, with or without a type annotation.
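For comparison, here is a minimal sketch of the closed version. The three-constructor Input and Output types are made up for illustration and are much smaller than the alphabet example, but the shape is the same. Because every constructor has to be declared up front, a misspelled one is an unknown name rather than a new member of an open type.

```elm
module Output exposing (Input(..), Output(..), toOutput)

-- Hypothetical, trimmed-down types used only for illustration.


type Input
    = A String
    | B String
    | C String


type Output
    = Quick String
    | Brown String
    | Fox String


toOutput : Input -> Output
toOutput input =
    case input of
        A a ->
            Quick a

        B b ->
            Brown b

        C c ->
            -- Writing `Fx c` here instead would be reported as an unknown
            -- constructor at this exact spot, because `Fx` is not declared
            -- in any type. No annotation on toOutput is needed for that.
            Fox c
```

With the open version, the same misspelling would simply widen the inferred result type, so there is nothing wrong at that spot for the compiler to point at.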
It is easy to say “this is no problem if people just add the type annotation” but this is not fully true:
1. You can only give an error for the whole case branch, not for the specific constructor. So with a branch with let x = ... in x, instead of underlining Coud directly, it is going to say “there is something wrong with the type of this let or this let body.” If the branch is 10 lines or 20 lines, this is significantly less specific. So even in the best case, the error message is much less specific.
2. Say you intend for @Whatever x to have type [ a | @Whatever Int ] but in a large case branch someone ends up using the constructor with a String value. With the current design, you get the error message directly under Whatever x, but with the proposed design you can only say “this branch does not match the type annotation”, again getting 10 or 20 line chunks on large case branches.
3. My experience very strongly suggests that if a program can be written, it will be written. People will be looking at 400 and 600 line case expressions hunting for a typo or type error with associated data. At that point, it is fair to say “the error messages are not very good” and there is no real way to get the quality back besides not using the feature.
In the past, we took out the ability to change the type of record fields in the record update syntax specifically because of problems (1) and (2) where you couldn’t get good specificity, particularly with unannotated cases which are not uncommon in practice. Even after restricting the design of records, it’s still hard to underline the specific field name that has a typo.
I hope this establishes the error message quality issues clearly.
Tradeoffs
Say we have this BEFORE and AFTER code, where we are getting the best case error messages for both open and closed union types:
-- BEFORE

type Output
    = Quick String
    | Brown String
    | Fox String
    | Jump String
    | Lazy String
    | Dog String
    | How String
    | Much String
    | Wood String
    | Could String
    | Chuck String
    | If String

toOutput : Input -> Output
toOutput input =
    ...

-- AFTER

type alias Output
    = @Quick String
    or @Brown String
    or @Fox String
    or @Jump String
    or @Lazy String
    or @Dog String
    or @How String
    or @Much String
    or @Wood String
    or @Could String
    or @Chuck String
    or @If String

toOutput : Input -> Output
toOutput input =
    ...
To my eye, the AFTER looks harder to understand and comes with a bunch of downsides that will come up in practice a lot. Conservatively, let’s say 50% of users don’t think about all the ways the error messages differ between the BEFORE and AFTER, and many are not writing type annotations, especially beginners. The result is that error messages are worse in practice, and there is nothing that really can be done about it.
The best path to deliver them good quality is to strongly recommend against using this feature at all. Furthermore, beginners would be seeing person A recommending this feature highly and person B recommending against it strongly. What should they do? Should they try both? Is this question important to making the website or game they set out to make? Do teams need to argue about it in their style guide on features to use or not? (Any team I’ve been on that uses C++ or Haskell has had a style guide banning specific features specifically because there are so many trade-offs with extensions, macros, features, etc. The reason these style guides are so common is that there is a real tension between “make working code that others can read and modify easily” and having lots of ways to express the same thing.)
So based on my understanding of the design, it seems like some aspects of error message quality can be addressed (e.g. maybe with the pattern part of case branches) but that there are still significant tradeoffs in error message quality in other areas.
Thoughts
I could be wrong about things here, but it feels like this kind of feature is a bit risky for Elm. I try to prioritize ease-of-learning and error message quality very highly. So while I am open to the idea that someone could figure out how to make open union types strong on those points, I would not be comfortable running that experiment in Elm with the information I have at this point.
It seems like a lot of cool things could be done if the core design of the language was “always use open union types” and there just weren’t closed unions except when you say [ X Int | Y String ]. That would also mean that @Foo could be written as Foo without clashing with another language feature. It’d look a lot cleaner, and it may end up leading towards a different style of typed functional programming that many people could be into. (Lots of people value flexibility and having many ways to do things! E.g. people who prefer Ruby over Python! So even if the error messages are never as good, many people have priorities where that is a worthwhile tradeoff.)
So it feels to me like something worth exploring independently to get a feeling for the full implications in a setting where the culture, best practices, libraries, etc. can all evolve in a coherent way, with flexibility prioritized a bit higher than other things.