Best practice for guarded types?

I’m not actually sure what the best terminology is here: a guarded type, or a restricted type? The idea is to wrap a basic type in an opaque custom type and expose a lower-case constructor function for it, so that only instances meeting certain guard conditions can ever be created. An example:

module Unitary exposing (Unitary, unitary, value) -- Unitary is opaque.

{-| Numbers between zero and one. -}
type Unitary
    = Unitary Float

unitary : Float -> Maybe Unitary
unitary val = 
    if val < 0 then
        Nothing
    else if val > 1 then
        Nothing
    else
        Unitary val |> Just

value : Unitary -> Float
value (Unitary val) = 
    val

So this can be quite annoying to use because the constructor returns a Maybe. An alternative might be:

unitary : Float -> Unitary
unitary val =
    clamp 0.0 1.0 val |> Unitary

half = unitary 0.5

When I define the value half, I know it will work. If the constructor returned a Maybe, I would have to include failure-handling code for Nothing while knowing that branch will never run; this way I don’t have to. Also, in 0.19 Debug.crash is not an option for the never-followed failure branch.

On the other hand, some bugs may be silently masked by the default value.

This technique is only viable in situations where a reasonable default can be assumed. What about if the guard type was over strings, and the strings must have a minimum length? The caller passes a string in of length 9, but this particular constructor needs 10 chars minimum. Padding the string out with whitespace would seem wrong - but not impossible. What if it were a regular expression match? Mapping the input to the nearest regex match would definitely seem wrong and not easy to do either.

I also note that in the case where constructors return a Maybe, some common values can be provided by the module implementing the guard type, since that module does have access to the upper case constructor:

zero = Unitary 0.0
half = Unitary 0.5
one = Unitary 1.0

This can work well if the guard type is an enum - all the enum values can be listed out.

type Fruit
    = Fruit String

apple = Fruit "Apple"
orange = Fruit "Orange"
banana = Fruit "Banana"

What if I want to help the caller understand why something could not be created? Instead of Maybe perhaps I should use Result? Maybe is for things that are optional, Result is for things that can result in errors. If the caller passes 2.0, a good response might be Err "The input must be between 0.0 and 1.0, but you gave the value 2.0.".

Would love to hear your views on this, or alternative ideas.

Always return Maybe, or allow defaults if the case is right for it?
Maybe vs Result?

Opaque type! They’re great for the reasons you describe.

This rule seems fine for new code! It can get a little bit nicer by defining the error as a custom type instead of a String:

type Problem
    = BelowRange
    | AboveRange

Then you don’t need to worry about translating the string. Buuuut if “out of range” is the only way this can fail, it’s maybe fine to return a Maybe. For example, check out String.toInt and String.toFloat. Both return a Maybe (Maybe Int and Maybe Float respectively) and it works just fine.
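
As a rough sketch of how the Problem type could plug into the Unitary module from the original question (the function name fromFloat is an assumption made for illustration, not something anyone in the thread proposed):

```elm
module Unitary exposing (Problem(..), Unitary, fromFloat, value)

-- Numbers between zero and one. The constructor is not exposed.


type Unitary
    = Unitary Float



-- Why an input was rejected.


type Problem
    = BelowRange
    | AboveRange



-- Hypothetical name: a constructor function that reports which bound failed.


fromFloat : Float -> Result Problem Unitary
fromFloat val =
    if val < 0 then
        Err BelowRange

    else if val > 1 then
        Err AboveRange

    else
        Ok (Unitary val)


value : Unitary -> Float
value (Unitary val) =
    val
```

With this shape, a caller can pattern match on Err BelowRange or Err AboveRange instead of parsing an error string.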

Dict.Dict is an opaque type, but it’s not just a single-constructor wrapper around a basic type, which is why I think there may be terminology specific to this kind of thing?

This is a really great question. I had to make this decision in at least two of my projects recently.

The first was in my calculator app. I have a Rational type and you can’t create one with a zero denominator. In that case I returned a Maybe Rational. See here. I also provide two alternative ways to make rationals that never fail, see the zero and fromInt functions.

The second was in my rater app. I have a Rating type and there I used a default if the arguments led to an invalid rating. See here.

I think there are a variety of APIs that can work and all the suggestions so far seem reasonable. The choice you make depends on the trade-offs you’re willing to make I guess.

Good point. It also defers the decision of how you’d want to write the error message. You’d write it differently if it’s for logging purposes as opposed to user facing.
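
To illustrate deferring the wording, here is a small sketch in which the same Problem value is rendered two ways. The module name, both function names, and the message texts are made up for this example:

```elm
module UnitaryMessages exposing (Problem(..), toLogMessage, toUserMessage)

-- Repeated here so the sketch is self-contained.


type Problem
    = BelowRange
    | AboveRange



-- Friendly wording for a form field.


toUserMessage : Problem -> String
toUserMessage problem =
    case problem of
        BelowRange ->
            "Please enter a number that is at least 0."

        AboveRange ->
            "Please enter a number that is at most 1."



-- Terse wording for logs, including the offending input.


toLogMessage : Float -> Problem -> String
toLogMessage input problem =
    case problem of
        BelowRange ->
            "Unitary: input " ++ String.fromFloat input ++ " is below 0"

        AboveRange ->
            "Unitary: input " ++ String.fromFloat input ++ " is above 1"
```

The module that guards the type only has to say what went wrong; each caller decides how to phrase it.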

The compiler source code calls it a “box” (VarBox).

This does not have a special name as far as I know.

(Compilers have things like boxed and unboxed integers, but I would not use those terms outside of a compiler.)

The more general idea is using module boundaries to enforce invariants. Sometimes you cannot achieve what you want with just a custom type (e.g. decimals between zero and one) but you can create a module such that any Fraction behaves as expected. As long as the module is implemented correctly, all the uses outside the module will work as expected.
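
A minimal sketch of that idea, assuming a hypothetical Fraction module whose function names (fromFloat, multiply, complement, value) are invented for illustration. The point is that every exposed function either checks the invariant or preserves it:

```elm
module Fraction exposing (Fraction, complement, fromFloat, multiply, one, value, zero)

-- The constructor stays private, so outside code cannot build a Fraction directly.


type Fraction
    = Fraction Float



-- The only way in from an arbitrary Float: check the range first.


fromFloat : Float -> Maybe Fraction
fromFloat val =
    if val >= 0 && val <= 1 then
        Just (Fraction val)

    else
        Nothing


zero : Fraction
zero =
    Fraction 0


one : Fraction
one =
    Fraction 1



-- The product of two numbers in the range 0 to 1 is also in that range,
-- so no re-check is needed.


multiply : Fraction -> Fraction -> Fraction
multiply (Fraction a) (Fraction b) =
    Fraction (a * b)



-- 1 - a stays in the range 0 to 1 whenever a does.


complement : Fraction -> Fraction
complement (Fraction a) =
    Fraction (1 - a)


value : Fraction -> Float
value (Fraction a) =
    a
```

Code outside the module can combine these freely and never has to re-validate the range.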

I’m not sure if there’s a short name for “using module boundaries to enforce invariants” besides things like opaque types and exposed values (i.e. the tools you use to enforce invariants)

Choosing the wrong name can sometimes make things more confusing than having no name at all, and my instinct is that the important thing here is more about module boundaries.

Depending on the specifics of the use case, I would say some or all of the techniques you’ve identified!

For the example of unitary, I think clamp is a great option since it’s likely to be useful and also self-explanatory. I would also include a function that returns a Maybe or Result so that clients can optionally check their input. zero and one also sound like good options for this module.
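
One possible shape for such a module, as a sketch only. The names clamped and fromFloat are assumptions chosen for this example:

```elm
module Unitary exposing (Unitary, clamped, fromFloat, one, value, zero)


type Unitary
    = Unitary Float



-- Never fails: out-of-range input is pulled to the nearest bound.


clamped : Float -> Unitary
clamped val =
    Unitary (clamp 0 1 val)



-- For clients who want to know whether their input was in range.


fromFloat : Float -> Maybe Unitary
fromFloat val =
    if val >= 0 && val <= 1 then
        Just (Unitary val)

    else
        Nothing



-- Common values that are known to be valid.


zero : Unitary
zero =
    Unitary 0


one : Unitary
one =
    Unitary 1


value : Unitary -> Float
value (Unitary val) =
    val
```

Giving the clamping version its own name keeps the silent adjustment visible at the call site.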

For enum-like modules, it may make sense for factory functions to return a Maybe or to return some default value. It all depends on the exact structure being modeled!

I guess you could also call it an “abstract data type” or a “class of objects whose logical behavior is defined by a set of values and a set of operations”. The way to define an abstract data type in Elm is by using modules. You define an opaque type, a set of operations for that opaque type and you expose the type and the operations to the users of the module.

In my view, the best practice is entirely dependent on the use-case. The reason we have both Maybe and Result is because in some cases you care about the type of failure and in some cases you don’t.

The right tool depends on what you want to use it for.

Seems like in Haskell they are called singletons - a type with only one constructor.

Hopefully helpful, possibly irrelevant academic comments about names.

Singletons in Haskell are something different: they’re types with only one inhabitant. They’re used to reflect values to/from the type level (e.g. lifting a program-level number to be a type-level number).

The constructors-that-also-do-extra-work you’re using are typically called “smart constructors”, in my experience. Smart constructors are maybe most commonly used for hash-consing.

These ‘guarded types’ are also sometimes called ‘refined types’. A common format (which seems to line up with the intended uses here) is the ‘subset refinement type’, x:T where e, i.e., those x of type T such that e evaluates to true. (The nomenclature is confusing.)

Liquid Haskell does subset refinement type checking statically, and in principle such a tool could be built for Elm, too. The kinds of dynamic checking you’re thinking about are common in languages where runtime failures are allowed (e.g., Racket). In Elm, you could write a smart constructor that used Debug features to allow those runtime failures, at least during development.
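
As a sketch of that last idea, a development-only smart constructor might look like the following. The name unsafeUnitary is hypothetical, and because it uses Debug.todo the module will not compile with --optimize:

```elm
module Unitary exposing (Unitary, unsafeUnitary, value)


type Unitary
    = Unitary Float



-- Development-only: crashes at runtime on out-of-range input
-- instead of returning a Maybe.


unsafeUnitary : Float -> Unitary
unsafeUnitary val =
    if val >= 0 && val <= 1 then
        Unitary val

    else
        Debug.todo
            ("unsafeUnitary: "
                ++ String.fromFloat val
                ++ " is outside the range 0 to 1"
            )


value : Unitary -> Float
value (Unitary val) =
    val
```

This trades the compile-time guarantee for convenience, so it probably only makes sense for constants written by hand during development.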

If you are providing a default, maybe make it explicit such as withDefault.
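
A sketch of what that could look like for the Unitary example. The names fromFloat and withDefault (as a function of this module) are assumptions for illustration:

```elm
module Unitary exposing (Unitary, fromFloat, half, value, withDefault)


type Unitary
    = Unitary Float


fromFloat : Float -> Maybe Unitary
fromFloat val =
    if val >= 0 && val <= 1 then
        Just (Unitary val)

    else
        Nothing



-- The caller picks the fallback, so the default is visible at the call site.


withDefault : Unitary -> Float -> Unitary
withDefault fallback val =
    Maybe.withDefault fallback (fromFloat val)


half : Unitary
half =
    Unitary 0.5


value : Unitary -> Float
value (Unitary val) =
    val
```

For example, Unitary.withDefault Unitary.half 2.0 evaluates to the same value as Unitary.half, and the reader of that line can see which default was chosen.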

This Stack Exchange comment gave me a critical insight that had been missing since I started learning to program.

Using Domain Driven Design, the author creates smart constructors to help build algebraic data type abstractions, effectively modelling domain entities throughout his data.

This is what I want to do in Elm and would really like to understand how to create smart constructors. Can we do this as elegantly as in Haskell?

Interesting, as this article was posted recently to the Slack chat, perhaps you saw it too?

https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/

@rupert

An excellent article. Thanks :pray:

Is it possible to implement similar smart constructors in Elm? To parse, not validate…?

Yeah, going with the example in the article, here is the NonEmpty list implemented in Elm https://github.com/mgold/elm-nonempty-list/blob/4.0.2/src/List/Nonempty.elm#L93

Thanks. I’m somewhat surprised that smart constructors aren’t really addressed in the Elm community.

It’s also very surprising that the idea of ‘Parse, don’t validate’ hasn’t found its way into discussion about Elm best practices.

Especially if Elm supports the capability to create smart constructors as naturally as in Haskell.

I think they are widely used, especially in libraries, but they don’t always use the name smart constructors. Some other terms that may be useful when searching for the pattern are Opaque Types and Phantom Types; both are great for getting the most out of the type system and preventing errors through it.

As @antew says, this idea is widely used in Elm and we generally call it opaque types. So the List -> NonEmpty example translates across quite easily.

I never heard it expressed as ‘Parse, don’t validate’ before, and that seems a good rule of thumb made easy to remember. The idea seems to be, don’t validate to Bool, because the underlying data will still need to be error checked every time it is used after the validation. Instead parse to something that does not allow the illegal states, and then the code that follows will be able to remain a pure function.
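
A small sketch of the contrast, using a hand-rolled NonEmpty type. This is an illustration written for this thread, not the code of the linked library:

```elm
module NonEmpty exposing (NonEmpty, fromList, head, isNonEmpty)

-- A list with at least one element: the first element is always present.


type NonEmpty a
    = NonEmpty a (List a)



-- Validation: answers yes or no, but the caller is still holding a plain List
-- and must handle the empty case again wherever the list is used.


isNonEmpty : List a -> Bool
isNonEmpty list =
    not (List.isEmpty list)



-- Parsing: the check happens once and the result type records what was learned.


fromList : List a -> Maybe (NonEmpty a)
fromList list =
    case list of
        [] ->
            Nothing

        first :: rest ->
            Just (NonEmpty first rest)



-- No Maybe needed here, because the type rules out the empty case.


head : NonEmpty a -> a
head (NonEmpty first _) =
    first
```

Compare List.head, which has to return a Maybe because a plain List carries no evidence of the earlier check.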

I was curious to try this out on the kind of refined types we are talking about on this thread. A slightly contrived example, suppose I wanted a function to compute to-the-power-of but in ints:

powerOf : Int -> Int -> Maybe Int

The result has to be a Maybe because the power could be negative. 2 ^ -1 = 0.5, and 0.5 is not an Int. So let’s create a refined type that only allows non-negative integers (the Positive type below accepts zero as well) and see how it works out:

type Positive
    = Positive Int

positive : Int -> Maybe Positive
positive val = 
    if val >= 0 then
        Positive val |> Just
    else
        Nothing

powerOf : Int -> Positive -> Int
powerOf val (Positive pow) =
    val ^ pow

So it seems like it will work out nicely. I was wondering if I would end up having to re-check that pow is really positive, or deal with the result of (^) being a Maybe or Result: code branches that I know would never be followed, but the compiler does not. If there were such code branches, they would need to return some default wrong answer, or could be made to crash with Debug.todo (the 0.19 replacement for Debug.crash), or with some other hack when compiling with --optimize, which rejects the Debug module.
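
For completeness, here is a sketch of how a caller might use this, with the Maybe handled once where the raw Int enters. The module name and the two example values are invented for illustration:

```elm
module Power exposing (Positive, example, outOfRange, positive, powerOf)


type Positive
    = Positive Int


positive : Int -> Maybe Positive
positive val =
    if val >= 0 then
        Just (Positive val)

    else
        Nothing


powerOf : Int -> Positive -> Int
powerOf val (Positive pow) =
    val ^ pow



-- The exponent is parsed once; powerOf itself needs no failure branch.
-- Evaluates to Just 1024.


example : Maybe Int
example =
    positive 10 |> Maybe.map (powerOf 2)



-- A negative exponent is rejected before powerOf is ever called.
-- Evaluates to Nothing.


outOfRange : Maybe Int
outOfRange =
    positive -3 |> Maybe.map (powerOf 2)
```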

I got lucky in this case, because the (^) function has a bug:

pow : Int -> Int -> Int
pow x y =
  x ^ y

pow 3 -1  
0.3333333333333333 : Int -- Uh oh!