How to represent many subsets of an enum?

Something I am working on at the minute, and I would appreciate some thoughts on the best way to model this.

I am generating stubs for AWS services. Each service has a set of errors it can return. For example:

ServiceException
ResourceNotFoundException
ResourceConflictException
TooManyRequestsException
InvalidParameterValueException
PolicyLengthExceededException
PreconditionFailedException

Now each endpoint within the service might only be able to return some of these errors. It would be more accurate if each endpoint's type only allowed the errors it can return, and not the full set.

Anyway, here are the options I have thought of so far:

  1. Have each endpoint's return type allow the full set:
type ServiceError
    = ServiceException
    | ResourceNotFoundException
    | ResourceConflictException
    | TooManyRequestsException
    | InvalidParameterValueException
    | PolicyLengthExceededException
    | PreconditionFailedException

endpoint1 : Input1 -> Request (Result ServiceError Response1)
endpoint2 : Input2 -> Request (Result ServiceError Response2)

Problem with this is, suppose endpoint1 can only produce ResourceNotFoundException and not the other errors, its return type is overly wide.

  2. Have each endpoint's return type allow just the errors it can return. This seems to need an error type per endpoint:
type Endpoint1Error
    = Endpoint1ErrorResourceNotFoundException

service1 : Input1 -> Request (Result Endpoint1Error Response1)

type Endpoint2Error
    = Endpoint2ErrorServiceException
    | Endpoint2ErrorResourceNotFoundException
    | Endpoint2ErrorResourceConflictException
    | Endpoint2ErrorTooManyRequestsException

service2 : Input2 -> Request (Result Endpoint2Error Response2)

Problem with this approach is the need to have an error type per endpoint. This removes the ability to andThen a series of endpoint calls together without mapping the errors onto some other error type that includes all the possibilities (ServiceError and a load of error mapping functions). I also had to prefix all the constructors with EndpointX to make their names unique.

  3. Do something with phantom types. Seems overly complicated for this.

  4. Maybe just return the error name as a String. It's simple, and the full set of strings could be supplied as constants to reduce coding errors when writing an error matching case statement:

case error of 
    AWS.Lambda.policyLengthExceededException -> ...

    _ -> ...

Any better ideas on how to have lots of types, each of which represents a subset of a set of enum constants?
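To make the cost of option 2 concrete, here is a minimal sketch of the mapping it requires before two calls can be chained. Everything here is an assumption for illustration: Request is dropped so that the endpoints return a plain Result, the stand-in endpoint bodies are invented, and widen1, widen2 and callBoth are hypothetical names.

```elm
module ErrorWidening exposing (..)

-- The wide error type, trimmed to four cases for brevity.


type ServiceError
    = ServiceException
    | ResourceNotFoundException
    | ResourceConflictException
    | TooManyRequestsException


type Endpoint1Error
    = Endpoint1ErrorResourceNotFoundException


type Endpoint2Error
    = Endpoint2ErrorServiceException
    | Endpoint2ErrorResourceNotFoundException
    | Endpoint2ErrorResourceConflictException
    | Endpoint2ErrorTooManyRequestsException



-- One mapping function per endpoint error type (hypothetical names).


widen1 : Endpoint1Error -> ServiceError
widen1 err =
    case err of
        Endpoint1ErrorResourceNotFoundException ->
            ResourceNotFoundException


widen2 : Endpoint2Error -> ServiceError
widen2 err =
    case err of
        Endpoint2ErrorServiceException ->
            ServiceException

        Endpoint2ErrorResourceNotFoundException ->
            ResourceNotFoundException

        Endpoint2ErrorResourceConflictException ->
            ResourceConflictException

        Endpoint2ErrorTooManyRequestsException ->
            TooManyRequestsException



-- Stand-in endpoints with invented behaviour, only so the sketch can be run.


endpoint1 : Int -> Result Endpoint1Error Int
endpoint1 n =
    if n < 0 then
        Err Endpoint1ErrorResourceNotFoundException

    else
        Ok (n + 1)


endpoint2 : Int -> Result Endpoint2Error String
endpoint2 n =
    if n > 100 then
        Err Endpoint2ErrorTooManyRequestsException

    else
        Ok (String.fromInt n)



-- Chaining only typechecks once both error types are widened to ServiceError.


callBoth : Int -> Result ServiceError String
callBoth n =
    endpoint1 n
        |> Result.mapError widen1
        |> Result.andThen (\m -> endpoint2 m |> Result.mapError widen2)
```

The mapping functions are mechanical, so a code generator could emit them, but every chained call still has to pass through Result.mapError.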


I believe this is one of the common shortcomings when working with custom types (sum types) in languages which support them. Essentially we want to “narrow down” from wider variants to some specifics, but there is no obvious way to do this.

Usually we end up prefixing them by modules (i.e. errors per endpoint in @rupert's case) and marshalling between them. In languages where common behaviors can be abstracted away (i.e. typeclasses and whatnot) it settles down to an acceptable amount of repetition, but in Elm more boilerplate code will certainly get in our way.

I am also interested in good solutions to this.

On #4, I don’t believe you can match against constants as you’re suggesting, you can only match against literals AFAIK. I would love to be corrected on that though.
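A minimal sketch of the usual workaround, assuming the error names are exposed as String constants: compare with == in an if/else chain instead of a case expression. The module name, the constants and describe are hypothetical, not part of any real AWS package.

```elm
module MatchConstants exposing (..)

-- Hypothetical constants standing in for the generated error names.


policyLengthExceededException : String
policyLengthExceededException =
    "PolicyLengthExceededException"


resourceNotFoundException : String
resourceNotFoundException =
    "ResourceNotFoundException"



-- A case branch would treat the constant's name as a fresh variable,
-- so the comparison is written with == instead.


describe : String -> String
describe error =
    if error == policyLengthExceededException then
        "policy too long"

    else if error == resourceNotFoundException then
        "not found"

    else
        "unhandled: " ++ error
```

The trade-off is that the compiler cannot check such a chain for exhaustiveness, so the final else branch is always needed.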

I don’t have any good suggestions for how to do it in Elm unfortunately. In Scala, each of the options in a tagged union is also a class in itself, so you can actually include the same option in multiple tagged unions, ala:

sealed trait ErrorEnumOne
sealed trait ErrorEnumTwo

case class CommonErrorCaseOne(...) extends ErrorEnumOne with ErrorEnumTwo
case class ExclusiveErrorCaseTwo(...) extends ErrorEnumTwo

def dealWithErrorOne(error: CommonErrorCaseOne) ...
def dealWithErrorTwo(error: ExclusiveErrorCaseTwo) ...

// errorOne: ErrorEnumOne and errorTwo: ErrorEnumTwo are hypothetical values
errorOne match {
    case error: CommonErrorCaseOne => dealWithErrorOne(error)
}

errorTwo match {
    case error: CommonErrorCaseOne => dealWithErrorOne(error)
    case error: ExclusiveErrorCaseTwo => dealWithErrorTwo(error)
}

It’s a capability that comes in handy precisely for defining errors :slight_smile:

For Scala 3, they’re also adding the ability for you to do anonymous inline unions, ala:

def doOperation(...): Success | ErrorOne | ErrorTwo = { ... }

Which you match on as any other union.

One way I could do it, and not a nice way, is to define an Either type (which is really the same as Result).

type Either a b
    = A a
    | B b

type ServiceException = ServiceException
type ResourceNotFoundException = ResourceNotFoundException
type TooManyRequestsException = TooManyRequestsException


fun : Result (Either ServiceException (Either ResourceNotFoundException TooManyRequestsException))  Whatever

By stacking the Eithers up into a tree, I can use them to list all the possibilities. It's really not a practical solution.

Scala being a JVM language, can you not just use the normal Java exceptions mechanism?

In Java it would be:

Response1 endpoint1(Args args) : throws TooManyRequestsException, InvalidParameterValueException, PolicyLengthExceededException

(Do I need that :? Too long since I wrote Java…)

You can use exceptions, though they’re not checked as they are in Java, so you can’t tell which exceptions a function might throw by looking at its types. Although that’s also true in Java, if you count all the runtime exceptions that also aren’t checked.

So if you want strict error checking, Scala pushes for sum types.

I thought of a way of doing it, but it seems a little contrived and I’m not sure it really adds much that is useful.

type ErrorCode
    = ServiceException
    | ResourceNotFoundException
    | ResourceConflictException
    | TooManyRequestsException
    | InvalidParameterValueException
    | PolicyLengthExceededException
    | ErrBadCode


type EndpointError
    = ByCode ErrorCode


endpointError : ErrorCode -> EndpointError


endpoint : EndpointInput -> Request (Result EndpointError EndpointResponse)


processError : EndpointError -> a -> Result EndpointError a -> Result EndpointError a

The idea is that each endpoint has its own error type, EndpointError here. It will also have a function to build that type, endpointError, and the constructor will not be exposed, making it an opaque type. To handle errors, you have to use this constructor function to build an instance of the error you want to handle, and supply a default value a to use when that error is matched. It's a bit like Result.withDefault, but it wraps the result with an Ok when the error is matched, so that you can chain these together to handle all the error cases of interest. Since the constructor function can fail by being given a value from the enum that is not valid for the endpoint, another error had to be inserted to allow this: ErrBadCode. An ErrBadCode is to be interpreted as a runtime error pointing to a bug in your code.

It produces a meaningful error if you try to handle an error result that the API spec says should not be allowed to happen. It also makes it easy to map the error onto a common type, ErrorCode, simply by lifting the value beneath the ByCode constructor. Makes it easy to chain multiple API calls together and coalesce errors into a common type, when you don’t want to handle them, just pass them up the call stack.
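The signatures above leave the implementations open, so the following is only one possible reading, not the author's code. It assumes this endpoint may return just ResourceNotFoundException and TooManyRequestsException (the allowed list is invented), and it assumes processError should surface ErrBadCode whenever it is asked to handle a code the endpoint cannot return. toErrorCode is a hypothetical name for the lifting function.

```elm
module Endpoint exposing (EndpointError, ErrorCode(..), endpointError, processError, toErrorCode)


type ErrorCode
    = ServiceException
    | ResourceNotFoundException
    | ResourceConflictException
    | TooManyRequestsException
    | InvalidParameterValueException
    | PolicyLengthExceededException
    | ErrBadCode



-- Opaque: the ByCode constructor is not in the exposing list.


type EndpointError
    = ByCode ErrorCode



-- Assumption: the codes this particular endpoint is allowed to return.


allowed : List ErrorCode
allowed =
    [ ResourceNotFoundException, TooManyRequestsException ]


endpointError : ErrorCode -> EndpointError
endpointError code =
    if List.member code allowed then
        ByCode code

    else
        ByCode ErrBadCode



-- Lift to the common type so errors from several endpoints can be coalesced.


toErrorCode : EndpointError -> ErrorCode
toErrorCode (ByCode code) =
    code



-- Replace a matching error with `Ok default`; pass anything else through.
-- Asking to handle a code outside `allowed` yields ErrBadCode (assumed reading).


processError : EndpointError -> a -> Result EndpointError a -> Result EndpointError a
processError handled default result =
    if handled == ByCode ErrBadCode then
        Err handled

    else
        case result of
            Err err ->
                if err == handled then
                    Ok default

                else
                    result

            Ok _ ->
                result
```

Several processError calls can be piped one after another, each one turning a single error case into an Ok, and whatever remains can be passed up with Result.mapError toErrorCode.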

Thing is, I bet AWS services can return errors that are not described in their API specs - given my experience of the fidelity of the specs so far. So for a first pass, I think I will just use a String representation and side-step the issue. Give it some consideration on a future release if something more typed would actually be better.

Shame about not being able to pattern match against constants. I can see why though - they are not distinguishable from pattern match variables:

val = "someVal"

fun x = 
    case x of
        val -> ... -- Is val the constant or a variable???

Compiler gives this error:

The name `val` is first defined here:

154| val =
     ^^^^
But then it is defined AGAIN over here:

160|         val ->

I don’t think there’s a good way to solve this with where Elm stands currently, but I do think that it’s a good fit for extensible unions/polymorphic variants. Essentially those are to sum types what records and extensible record types are to product types (imagine how limiting Elm would be without records and only with standard product types i.e. type ProductType = ProductType Int String Int, that’s basically the current situation with sum/union types).

I think, unlike the cornucopia of other possible type extensions to Elm’s type system, the addition of polymorphic variants is something that both fills in a theoretic hole in Elm’s type system, and provides a lot of help for real world Elm code.

In particular, apart from solving this representation issue, polymorphic variants offer a solution for (but not limited to):

  • Compiler support to remind you to update decoders with new variants (you add a new variant to a custom type MyCustomType and then forget that you have a decoder of String -> Maybe MyCustomType that you now need to update as well, which Elm’s current compiler cannot help with because MyCustomType appears as an output rather than an input)
  • Solving the annoying issue of unifying error types when you have two functions f : A -> Result Error0 Output and g : A -> Result Error1 Output that you want to use together
  • Unifying the NoMap, OutMsg, and Translator patterns for modularization of Elm code under a single framework (which looks like NoMap but without its downsides)

And I think that the same limitations that Elm puts on extensible records (namely no adding or deletion of keys) can also be used with polymorphic variants to limit the complexity cost they might bring to the language (as opposed to e.g. how they’re used in OCaml where they can quickly balloon into monstrosities).

Just a seed to plant in the collective minds of the Elm community.

If folks are interested I can write up a more detailed brain dump of what polymorphic variants are and why I think polymorphic variants are a uniquely good fit for Elm.

I would be interested in more information. My initial impression is that I don’t understand how the proposal helps with the first bullet point, and the 2nd and 3rd seem straightforward, even if they need a little boilerplate.

Go on then, interested to see what this looks like as imaginary Elm code. Also, would it work with type inference?


Briefly, yes! You can maintain global inference and fast compile speeds.

Cool, looks like there’s interest for a longer explanation. I’ll probably post it as a separate thread and tag the two of you (@dta and @rupert) because it’ll be quite long.

I’m kinda curious as to what the phantom type solution would look like. I think I may want/need a similar solution to you with a side project I’m working on as having separate errors will probably make sense.


type ServiceError a
    = ServiceException
    | ResourceNotFoundException
    | ResourceConflictException
    | TooManyRequestsException
    | InvalidParameterValueException
    | PolicyLengthExceededException
    | PreconditionFailedException

type Endpoint1 
    = Endpoint1

endpoint1 : Input1 -> Request (Result (ServiceError Endpoint1) Response1)
endpoint1 input = 
    ...

The consumers of the endpoint1 function will thus know that the result has certain guarantees, in this case, that only a specific error set will be expected.

Unfortunately phantom types give you no lift here.

The problem is if you have a second endpoint endpoint2 that can only error out with some subset of ServiceError a. Phantom types don’t let you change the number of variants that ServiceError has, which is what @rupert ultimately needs AFAICT. In fact, in this case phantom types end up being the same thing as namespacing each error with the endpoint name (i.e. the equivalent of Endpoint1Error).

@wolfadex The problem here is actually very similar to the problem that faces modularizing of large Elm codebases, which runs into the problem of how to address the fact that different modules have different subsets of messages that are being returned. Your choices here (in Elm as it stands without polymorphic variants) are likewise very similar, and can be thought of as being completely analogous to NoMap, OutMsg, and Translator.

NoMap: Basically don’t try to have different endpoints return different error types and just unify everything under a single error type and accept that the type is “too wide.” (Basically both choices 1 and 4 among @rupert’s choices, just a matter of how “wide” we want to go)

OutMsg: Split up your error type into a type containing your “core” errors that are common to all your endpoints and your endpoint-specific error types and have each endpoint now return Result (Either CoreError EndpointSpecificError) SuccessType. This only really works if such a “core” exists for all endpoints. (Not currently listed)

Translator: Construct a single overarching error type covering all error cases and write out error types for every endpoint, then create functions embedding each endpoint-specific error type into your overarching error type (what @rupert calls choice 2). This also ends up being equivalent to what @rupert is calling his stacking solution (because you need to have functions that both inject a type into and convert between various permutations of Eithers).
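The OutMsg-style split is the one option not shown elsewhere in the thread, so here is a minimal sketch of it. Which errors count as “core” is an assumption made purely for illustration, as are the stand-in endpoint bodies and the isRetryable helper; Request is dropped to keep the example small.

```elm
module CoreAndSpecific exposing (..)


type Either a b
    = A a
    | B b



-- Assumed split: errors every endpoint can return...


type CoreError
    = ServiceException
    | TooManyRequestsException



-- ...and errors specific to one endpoint.


type Endpoint1Error
    = ResourceNotFoundException


type Endpoint2Error
    = ResourceConflictException
    | PolicyLengthExceededException



-- Stand-in endpoints with invented behaviour.


endpoint1 : Int -> Result (Either CoreError Endpoint1Error) Int
endpoint1 n =
    if n < 0 then
        Err (B ResourceNotFoundException)

    else if n > 100 then
        Err (A TooManyRequestsException)

    else
        Ok n


endpoint2 : Int -> Result (Either CoreError Endpoint2Error) Int
endpoint2 n =
    if n == 0 then
        Err (B ResourceConflictException)

    else
        Ok (n * 2)



-- Core errors can be handled once, whatever the endpoint-specific type is.


isRetryable : Either CoreError specific -> Bool
isRetryable err =
    case err of
        A TooManyRequestsException ->
            True

        A ServiceException ->
            False

        B _ ->
            False
```

Code that only cares about the core errors, such as isRetryable, works for every endpoint, while the B side stays as narrow as each endpoint allows. Chaining endpoint1 and endpoint2 with andThen would still need their B sides mapped to a common type.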


The only thing that comes to my mind that would allow expressing the kind of thing @rupert seems to want is dependent types.

It would require support for something like:

type alias Endpoint1Error = 
    e : ServiceError | e == ResourceNotFoundException

This would allow the same thing as the phantom types above but in a way where you can have a subset of the tags (values) of ServiceError.

Thanks for the explanation - where do these terms come from? Scala? Haskell? ReasonML? The term OutMsg is the only one I have heard of in the Elm realm previously. In Elm we sometimes talk about the out message pattern, where you have an update function that may return an additional value:

update : Msg -> Model -> (Model, Cmd Msg, OutMsg)

-- Or
update : Msg -> Model -> (Model, Cmd Msg, Maybe OutMsg)

Where the out message tells the caller about some event of interest, such as a state change in this module instance. For example, I use it in my auth module to inform the caller when the auth state changes from LoggedOut to LoggedIn and so on.

Interestingly for me, the Translator pattern is what I used in my code generator. Each code generator module has its own error type. I also defined a common error type, which consists of a String and an error code. The errors in this case are very similar to compiler errors - they are aimed at being descriptive and helpful to the user to solve a problem with their code, and there can be more than one of them. The error code is unique to each type of error, and is also used as a key into an error catalogue. The error catalogue is a site I built with elm-pages, it has a section for each kind of error that gives more detailed background on the error, examples of the error, issues around it, how to fix it, that sort of thing. The idea is that in the UI I can link off to the error catalogue for more detailed context sensitive help.

I touched on some of these ideas here:

And got a package out of it for results with multiple errors:
https://package.elm-lang.org/packages/the-sett/elm-error-handling/latest/

For that project, having one error type per module worked out pretty well. Also I always handle all errors in the same way, and each code gen module provides the same function to convert into the common format, so it's pretty easy to hook everything together. For these AWS API stubs, having one error type for each endpoint would feel like too much, given that there are tens to a hundred endpoints in each module. That said, it is generated code so creating lots of boilerplate isn’t an issue. I’m more thinking it will be a pain for the user of the API to wade through such a bloated interface.

I quite like the Translator pattern though - either you deal with an error right away, or else you translate it into a less specific form and pass it up. Often a String that you can log is good enough. The errors that you don’t deal with can usually be either thought of as runtime errors that signify a bug in your code, or unrecoverable system errors like a 500 response from some service - you just want to do your best to log the string for the attention of the technical team.

In case you were interested in full on phantom types, you can do it like this (not sure it’s worth it, but it typechecks):

module AWS.Phantom exposing (Thrown, NotThrown)
{- Not exposed -}

type Thrown
    = Thrown

type NotThrown
    = NotThrown

module AWS exposing (AwsError, PossibleErrors, foo, bar)

import Dict exposing (Dict)
import AWS.Phantom exposing (..)


type AwsError rec
    = AwsError String

type alias PossibleErrors =
    { foo : NotThrown, bar : NotThrown, baz : NotThrown }

{-| These are the actual services. The type indicates which errors can be thrown -}
bar : Int -> Result (AwsError { a | baz : Thrown, bar : Thrown }) Int
bar n =
    Debug.todo "not implemented"


foo : Int -> Result (AwsError { a | foo : Thrown, bar : Thrown }) Int
foo n =
    Debug.todo "not implemented"

module AWS.ErrorHandler exposing (Handler, newHandler, FooError, handleFoo, handleBar, handleBaz, handle)

import Dict exposing (Dict)
import AWS.Phantom exposing (..)
import AWS exposing (PossibleErrors)

{-| These make it nicer to read -}
type alias Handled =
    Thrown

type alias NotHandled =
    NotThrown


{-| This type encapsulates error handling logic -}
type Handler resultType errors
    = Handler (Dict String (String -> resultType))

{-| The api here follows a builder pattern -}
newHandler : Handler b PossibleErrors
newHandler =
    Handler Dict.empty

type FooError
    = FooError


{-| Each possible error gets a function. These can each pass custom metadata to user code -}
handleFoo : (FooError -> b) -> Handler b { a | foo : NotHandled } -> Handler b { a | foo : Handled }
handleFoo fn (Handler handler) =
    Handler (Dict.insert "FooError" (parseFooErrorStr >> fn) handler)

{-| They each track which errors are handled in the handler -}
handleBar : (FooError -> b) -> Handler b { a | bar : NotHandled } -> Handler b { a | bar : Handled }
handleBar fn (Handler handler) =
    Debug.todo "not implemented"


handleBaz : (FooError -> b) -> Handler b { a | baz : NotHandled } -> Handler b { a | baz : Handled }
handleBaz fn (Handler handler) =
    Debug.todo "not implemented"

{-| Finally we convert the type into a function that handles an error. At this point we assert that the two records need to match - we have provided a handler for each error. -}
handle : AwsError a -> Handler b a -> b
handle (AwsError error) (Handler handler) =
    case Dict.get (parseErrorType error) handler of
        Just fn ->
            fn error

        Nothing ->
            Debug.todo "an exercise for the reader"


parseErrorType : String -> String
parseErrorType e =
    Debug.todo "not implemented"


parseFooErrorStr : String -> FooError
parseFooErrorStr e =
    Debug.todo "not implemented"

And then finally, an example of usage:

myTask =
    case AWS.bar 2 |> Result.andThen (\n -> AWS.foo n) of
        Ok n ->
            n

        Err err ->
           ErrorHandler.newHandler
              |> ErrorHandler.handleFoo (always 3) 
              |> ErrorHandler.handleBar (always 3) 
              |> ErrorHandler.handleBaz (always 3)
              |> ErrorHandler.handle err

Ah sorry, I was just referring to the oft-linked post https://medium.com/@rchaves/child-parent-communication-in-elm-outmsg-vs-translator-vs-nomap-patterns-f51b2a25ecb1 and drawing analogies between each of the three approaches there and similar approaches for error handling.

Ah that’s really cool! It looks like you’re essentially hacking in some rudimentary polymorphic variants/extensible unions with some of the same underlying ideas as what’s going on here: https://github.com/natefaubion/purescript-variant by emulating algebraic datatypes with higher order functions.

I stand corrected. Phantom types give you more than I thought (although things get painful if you try to mix and match this, and I can’t think of a reasonable implementation of "an exercise for the reader" other than Debug.crash, because if you expose the Maybe from the get-go you get an API that’s equivalent to just having exposed the dictionary directly, at which point all the type stuff doesn’t really matter)!