Phantom types in practice

Hi! :wave: I wrote a post on a real-world application of phantom types:


In my last post, I described how integrating with a filtering API for querying a database gave me an opportunity to apply mutually recursive types. In this post, I’ll explain another unusual technique it led me to use, phantom types, and how to apply it.

The filtering API I mentioned represents its criteria using lists of conditions that satisfy the filter. The specification describes:

  • The possible conditions available
  • The data types the filter supports
  • Which conditions apply to which types

Here’s a chart based on a simplified version of the specification:

          Is   Less than   Greater than   One of   Contains
String    ✓    ✓           ✓              ✓        ✓
Int       ✓    ✓           ✓              ✓
Time      ✓    ✓           ✓
Bool      ✓

The possible conditions could be modeled as a type in Elm this way:

type Condition value
    = Is value
    | IsLessThan value
    | IsGreaterThan value
    | Contains value
    | IsOneOf (List value)

Having defined our Condition type, we might choose to only expose type aliases for the data types supported by the filter, so that these are the only options available to client code:

type alias StringCondition =
    Condition String
   
type alias IntCondition =
    Condition Int
    
type alias TimeCondition =
    Condition Time.Posix
    
type alias BoolCondition =
    Condition Bool

And to let us change the implementation of our Condition type without changing its interface, we might expose functions that construct Conditions rather than exposing the Condition type’s variants themselves:

is : value -> Condition value
is theValue =
    Is theValue
    
isLessThan : value -> Condition value
isLessThan theValue =
    IsLessThan theValue
    
isGreaterThan : value -> Condition value
isGreaterThan theValue =
    IsGreaterThan theValue

contains : value -> Condition value
contains theValue =
    Contains theValue
    
isOneOf : List value -> Condition value
isOneOf theValues =
    IsOneOf theValues

This is a good start to an API for constructing Conditions. But there’s another constraint we haven’t accounted for. Not every condition is valid for every supported data type:

aWeirdCondition : IntCondition
aWeirdCondition =
    -- How can an Integer "contain" another Integer?
    contains 148

anotherWeirdCondition : BoolCondition
anotherWeirdCondition =
    -- Isn't this crazy though?
    isOneOf [True, False]

These Conditions aren’t just illogical; they’re also invalid. The chart I showed earlier allows only certain subsets of the possible variants for each type. The remaining combinations can’t be included in requests to the filter API.

Our problem, then, is that it’s possible to instantiate invalid Conditions using our exposed constructor functions. How might we make these invalid states impossible to represent at the type level?

One possible solution: type-specific constructor functions

One obvious solution is to simply write constructor functions for StringConditions, IntConditions, TimeConditions, and BoolConditions, covering only the variants each is compatible with. This would look something like this:

stringIs : String -> StringCondition
stringIs theValue =
    Is theValue
    
intIs : Int -> IntCondition
intIs theValue =
    Is theValue
    
timeIs : Time.Posix -> TimeCondition
timeIs theValue =
    Is theValue
    
boolIs : Bool -> BoolCondition
boolIs theValue =
    Is theValue
    
stringIsLessThan : String -> StringCondition
stringIsLessThan theValue =
    IsLessThan theValue
    
intIsLessThan : Int -> IntCondition
intIsLessThan theValue =
    IsLessThan theValue
    
-- etc...

This does enforce the constraint on the allowed combinations of conditions and data types. But the downside is that there are now 13 functions to write instead of 5, functions that will need to be remembered and made sense of in client code. And that’s just in our simplified example; in the actual specification for the filter I was working with, there would be 44.

Introducing phantom types

Another solution to this problem is to use so-called “phantom” types. A phantom type gets its name from its declaration of a type parameter that, counterintuitively, is not used inside its definition:

type Phantom notUsed
    = Phantom

Usually, type parameters are used to allow a type to be instantiated in terms of another type. List’s definition as List a allows us to use the List API to work with Lists of any type of data.
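For contrast with the phantom declaration above, here is a small sketch of an ordinary type parameter. The Pair type and its helpers are hypothetical and not part of the filter code; the point is only that the parameter a appears inside the variant, so it determines what data a value carries.

```elm
module PairExample exposing (Pair(..), first, swap)

-- Hypothetical type for illustration: the parameter `a` is used
-- inside the variant, so it describes the data the value holds.


type Pair a
    = Pair a a


first : Pair a -> a
first (Pair x _) =
    x


swap : Pair a -> Pair a
swap (Pair x y) =
    Pair y x
```

A Pair Int really contains Ints. A Phantom Int, by comparison, contains nothing at all; its parameter exists only for the type checker.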

The type parameters in a phantom type serve a different purpose: they allow us to write functions that will only accept that type with certain parameters. Here’s an example use case that can be improved by this technique:

type Door
    = Door LockState
    
type LockState
    = Locked
    | Unlocked
    
type Room
    = Room

This model of a Door can have a LockState of Locked or Unlocked. We’d like to write a function openDoor that maintains the constraint that only an Unlocked door will return the Room it leads to.

With the current model of Door, this is the best we can do:

openDoor : Door -> Maybe Room
openDoor door =
    case door of
        Door Locked ->
            Nothing
        Door Unlocked ->
            Just Room

We’ve modeled the possible states of a door correctly, but since Door Locked and Door Unlocked are both valid values of the same Door type, the type system can’t tell them apart. That’s why openDoor has to return a Maybe Room instead of a Room.

Let’s change the LockState in the definition of Door into a type parameter, to make Door a phantom type:

type Door lockState
    = Door

Let’s also make Locked and Unlocked into separate types with one variant each:

type Locked
    = Locked
    
type Unlocked
    = Unlocked

Now we can write a function openDoor that will only accept a Door Unlocked as an argument, and will return a Room :

openDoor : Door Unlocked -> Room
openDoor door =
    Room

If we instantiated a Door Locked and attempted to pass it to openDoor, the compiler would produce this error:

lockedDoor : Door Locked
lockedDoor =
    Door
    
thisWontWork : Room
thisWontWork =
    openDoor lockedDoor
    
----

This `lockedDoor` value is a:

    Door Locked

But `openDoor` needs the 1st argument to be:

    Door Unlocked

Before, Door Locked was a possible value of the Door type. After our changes, Door Locked is a type itself, one that can be used in a function signature.
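Because the lock state now lives in the type, state transitions can be written as functions between those types. Here is a minimal sketch, assuming two hypothetical helpers, unlock and lock, that aren’t part of the example above:

```elm
module DoorExample exposing
    ( Door
    , Locked
    , Room(..)
    , Unlocked
    , lock
    , lockedDoor
    , openDoor
    , thisWorks
    , unlock
    )


type Door lockState
    = Door


type Locked
    = Locked


type Unlocked
    = Unlocked


type Room
    = Room


lockedDoor : Door Locked
lockedDoor =
    Door


-- Hypothetical helper: the only way to turn a locked door
-- into an unlocked one.
unlock : Door Locked -> Door Unlocked
unlock _ =
    Door


-- Hypothetical helper: the reverse transition.
lock : Door Unlocked -> Door Locked
lock _ =
    Door


openDoor : Door Unlocked -> Room
openDoor _ =
    Room


-- Compiles, because `unlock` produces the `Door Unlocked`
-- that `openDoor` requires.
thisWorks : Room
thisWorks =
    openDoor (unlock lockedDoor)
```

At runtime every door is the same Door value. The distinction between the two states exists only in the type annotations, which is enough for the compiler to reject openDoor lockedDoor.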


Let’s apply this technique to our Condition problem. My solution was to categorize the variants into groups based on their compatibility with different types:

Identity   Comparison                       Substring   Multiple-value
Is         Is less than, is greater than    Contains    Is one of

The compatibility of these groups with a particular type of Condition can be represented by the type parameters comparison, substring, and multiple. The identity group needs no parameter, because every supported type allows Is. Adding these parameters to the declaration makes Condition a phantom type:

type Condition value comparison substring multiple
    = Is value
    | IsLessThan value
    | IsGreaterThan value
    | Contains value
    | IsOneOf (List value)

Now, we can define single-value types to use in the definition of Condition type aliases and constructor functions:

type SubStringAllowed
    = SubStringAllowed

type SubStringNotAllowed
    = SubStringNotAllowed

type ComparisonAllowed
    = ComparisonAllowed

type ComparisonNotAllowed
    = ComparisonNotAllowed

type MultipleValueAllowed
    = MultipleValueAllowed

type MultipleValueNotAllowed
    = MultipleValueNotAllowed

Now, let’s define our type aliases in terms of which groups of variants each data type is compatible with. These read more or less like a plain English description of which variants are allowed:

type alias StringCondition =
    Condition String ComparisonAllowed SubStringAllowed MultipleValueAllowed
   
type alias IntCondition =
    Condition Int ComparisonAllowed SubStringNotAllowed MultipleValueAllowed
    
type alias TimeCondition =
    Condition Time.Posix ComparisonAllowed SubStringNotAllowed MultipleValueNotAllowed
    
type alias BoolCondition =
    Condition Bool ComparisonNotAllowed SubStringNotAllowed MultipleValueNotAllowed

And now let’s fix the type signatures of our constructor functions, using concrete types for the values of the type parameters we want to restrict:

is : value -> Condition value comparison substring multiple
is theValue =
    Is theValue
    
isLessThan : value -> Condition value ComparisonAllowed substring multiple
isLessThan theValue =
    IsLessThan theValue
    
isGreaterThan : value -> Condition value ComparisonAllowed substring multiple
isGreaterThan theValue =
    IsGreaterThan theValue

contains : value -> Condition value comparison SubStringAllowed multiple
contains theValue =
    Contains theValue
    
isOneOf : List value -> Condition value comparison substring MultipleValueAllowed
isOneOf theValues =
    IsOneOf theValues

Now these invalid Conditions will produce type errors:

aWeirdCondition : IntCondition
aWeirdCondition =
    contains 148

anotherWeirdCondition : BoolCondition
anotherWeirdCondition =
    isOneOf [True, False]
    
----

This `contains` call produces:

    Condition Int ComparisonAllowed SubStringAllowed MultipleValueAllowed

But the type annotation on `aWeirdCondition` says it should be:

    IntCondition
    
...

This `isOneOf` call produces:

    Condition Bool ComparisonNotAllowed SubStringNotAllowed MultipleValueAllowed

But the type annotation on `anotherWeirdCondition` says it should be:

    BoolCondition
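For comparison, here is a sketch of combinations that should still compile. It is a trimmed copy of the definitions above, limited to String and Int to stay short. The values nameContains and ageIsOneOf and the helper variantName are hypothetical names added for illustration.

```elm
module ConditionExample exposing (..)

-- Trimmed copy of the phantom-typed Condition from this post,
-- limited to String and Int so the sketch stays short.


type Condition value comparison substring multiple
    = Is value
    | IsLessThan value
    | IsGreaterThan value
    | Contains value
    | IsOneOf (List value)


type ComparisonAllowed
    = ComparisonAllowed


type SubStringAllowed
    = SubStringAllowed


type SubStringNotAllowed
    = SubStringNotAllowed


type MultipleValueAllowed
    = MultipleValueAllowed


type alias StringCondition =
    Condition String ComparisonAllowed SubStringAllowed MultipleValueAllowed


type alias IntCondition =
    Condition Int ComparisonAllowed SubStringNotAllowed MultipleValueAllowed


contains : value -> Condition value comparison SubStringAllowed multiple
contains theValue =
    Contains theValue


isOneOf : List value -> Condition value comparison substring MultipleValueAllowed
isOneOf theValues =
    IsOneOf theValues


-- Valid: String conditions allow the substring group.
nameContains : StringCondition
nameContains =
    contains "elm"


-- Valid: Int conditions allow the multiple-value group.
ageIsOneOf : IntCondition
ageIsOneOf =
    isOneOf [ 18, 21, 65 ]


-- Hypothetical helper: it leaves all three phantom parameters as
-- type variables, so it accepts every kind of Condition.
variantName : Condition value comparison substring multiple -> String
variantName condition =
    case condition of
        Is _ ->
            "is"

        IsLessThan _ ->
            "isLessThan"

        IsGreaterThan _ ->
            "isGreaterThan"

        Contains _ ->
            "contains"

        IsOneOf _ ->
            "isOneOf"
```

Functions that don’t care about the restrictions, like variantName here, can keep the phantom parameters generic. Only the constructor functions need to pin them to concrete types.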

Things to consider

Keep an eye on your module interface

The guarantees that phantom types offer are not secure unless you pay attention to which parts of your modules you expose.

In this example, we exposed:

  • The type aliases StringCondition, IntCondition, etc.
  • The constructor functions like is, isGreaterThan, etc.

And we did not expose:

  • The Condition type
  • The single-value types like SubStringAllowed
  • The variants of the Condition type, like Is and IsGreaterThan

This prevented client code from instantiating invalid Conditions directly.
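As a rough sketch of what such an interface might look like, here is a trimmed module declaration. The module name Condition and the reduced set of aliases and functions are assumptions made to keep the example short; the real module would list every alias and constructor function.

```elm
module Condition exposing
    ( IntCondition
    , StringCondition
    , contains
    , is
    )

-- Not in the exposing list: the Condition type, its variants, and
-- the single-value marker types. Client code can only build values
-- through `is` and `contains`.


type Condition value comparison substring multiple
    = Is value
    | IsLessThan value
    | IsGreaterThan value
    | Contains value
    | IsOneOf (List value)


type ComparisonAllowed
    = ComparisonAllowed


type SubStringAllowed
    = SubStringAllowed


type SubStringNotAllowed
    = SubStringNotAllowed


type MultipleValueAllowed
    = MultipleValueAllowed


type alias StringCondition =
    Condition String ComparisonAllowed SubStringAllowed MultipleValueAllowed


type alias IntCondition =
    Condition Int ComparisonAllowed SubStringNotAllowed MultipleValueAllowed


is : value -> Condition value comparison substring multiple
is theValue =
    Is theValue


contains : value -> Condition value comparison SubStringAllowed multiple
contains theValue =
    Contains theValue
```

With this interface, a client module can’t write Contains 148 directly, because the Contains variant isn’t exposed. It has to go through contains, whose signature carries the restriction.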

Consider developer experience

Although their power to restrict instances of a type can be helpful, phantom types can potentially produce compiler errors that are hard to understand. This is a risk, so consider whether a developer will be able to make sense of them without an understanding of the internals of your module or reading notes in your documentation.

In my case, the constructor functions I exposed included documentation comments describing the constraints and the purpose of the type parameters that a developer would see in error messages.


Phantom types can be helpful when certain instances of a type need to be restricted from use with certain functions, but duplicating the type and its helper functions would be impractical. They’re not a tool to reach for regularly, but they can be valuable to keep in mind when other options for maintaining constraints fall short.


Originally posted at https://dmalashock.com/elm-phantom-types/
