Can an Elm program ever be untyped?

Elm has polymorphism, but we usually cannot create polymorphic values. An example:

foldr : (a -> b -> b) -> b -> List a -> b

To call this, I need a list of something:

foldr (\w acc -> w ++ " " ++ acc) "" ["words", "to", "join", "with", "spaces"]

Its polymorphic, but I needed to give it some values to do anything useful, and that means that the program ended up fully typed at runtime. Ok, I don’t have to give it concrete values immediately, the polymorphism could continue up the call stack as many times as we like, but ultimately somewhere has to provide real values.

Also worth noting that the empty list can be polymorphic (List a); in some sense all empty lists are equivalent despite their possible differences in type.


There is a way in Elm that I can create a non-empty value that has a variable type:

num : number
num = 1

Elm does not require that Float has a decimal point in it, so when I write an integer it can be Int or Float until the type checker constrains it one way or the other.

But will that always happen ?


Elm Programs can have arbitrary type variables. You might know this from having done some fancy thing with Programs in the past. It is possible for unconstrained type variables to bubble all the way up to the Program level, and they can then escape the Program entirely there:

main : Program () (Model number) Msg

So now I can have:

type alias Model number =
    { count : number }

And I can really instantiate that with a variably typed value, see num above.

But can I ever instantiate it with a value that is not actually an Int?

untypedModel : Bool -> Model number
untypedModel isInt =
    if isInt then
        { count = 1 }
    else 
        { count = 1.0 }

But that won’t pass typechecking, since both branches must have the same type, they unify to Model Float and that is not the same as Model number:

Something is off with the 2nd branch of this `if` expression:

23|         { count = 1.0 }
            ^^^^^^^^^^^^^^^
The 2nd branch is a record of type:

    { count : Float }

But the type annotation on `untypedModel` says it should be:

    Model number

I also tried passing in a number as a flag:

main : Program number (Model number) Msg

And the type checker also rejected that:

Your `main` program wants an unspecified type from JavaScript.

55| main =
    ^^^^
But type variables like `number` cannot be given as flags. I need to know
exactly what type of data I am getting, so I can guarantee that unexpected data
cannot sneak in and crash the Elm program.

If we took an Elm program and type checked it, then replaced all occurences of values with number type that remain with Int, I think the result would be an Elm program that still type checked and had the exact same runtime behaviour as the original.

I don’t believe there are any other ways in Elm that we can create instances of values that are untyped.

Therefore, even though Elm programs can be polymorphic, they can never be untyped.

I just wanted to check that my reasoning here is sound, and that there is nothing I have missed ? no sneaky workarounds ? No devious type shenanigans that would falsify that assertion ?

Ellie in case you wanted to fiddle around with it: https://ellie-app.com/xVfKVgBKs3Xa1

Does this count since we never define msg?

main : Html msg
main =
  Html.text "Hello"

No, since msg here is effectively empty - I count it in the same category as List a, polymorphic but you never actually create an untyped value of anything.

Doing something like:

let
   l : number
   l = [ 1, 2, 3 ]
in
List.any (\x -> x > 2) l

Is not what you’re looking for right? In this case l really is of type List number and we consume it without ever resolving its type. I’ve even done a bit more with l than was strictly needed, I could have done something like List.length l, but that would feel like I was deliberately ‘throwing away the values’. In the example above I don’t throw away the values. Though, the example above is a bit contrived in that I could have written l : Int to no ill effect.

Good example, thanks. The result of this is Bool. So what is interesting about this example is that it uses number in a computation, but the result type is Bool. So the use of values that are not strictly typed is only temporary, and does not escape this context into the wider program.

But as you say, change number to Int and it works just the same. I think changing values from number to Int should always be safe, after type checking. After type checking so that number is not unified to Float.

To be more precise about what an ‘untyped program’ would be: One that can run with values that can have more than one type, but are somehow coerced into being untyped (represented as a type variable or their types made to dissapear somehow), yet the program still runs. So this program is almost that, but in reality the only type it can actually run over is Int.

So the use of values that are not strictly typed is only temporary, and does not escape this context into the wider program.

Although I could have done List.maximum l, which would have resulted in a Maybe number.

Ultimately you’re going to do something with the number, since we don’t have a generic String.fromX or even String.fromNumber.

So this program is almost that, but in reality the only type it can actually run over is Int.

Except that l : List Float would also work.

But I get that this example is not quite hitting the potential problem you are thinking of.

So this program is almost that, but in reality the only type it can actually run over is Int or Float, but only one of those in any given version of the program!

I think it is an imaginary problem, and Elms type system is completely sound.

1 Like

I think the way to summarise this is: If a value is of type number then it really is safe to use it as either an Int or a Float. Similarly “empty” values, such as List a really are safe to use as though they are any kind of list (because they don’t have any a values in them).

There is one way we can sort of hide types in Elm, and that is “Jeremys Interface” (oh no!, not again…). The single function version of which is:

create : (opsTyp -> typ) -> ((rep -> () -> typ) -> opsRep -> opsTyp) -> (rep -> opsRep) -> (rep -> typ)
create constructor mapOps impl rep =
    let
        raise : rep -> () -> typ
        raise rep_ () =
            constructor (mapOps raise (impl rep_))
    in
    raise rep ()

This lets us return a value that contains rep but without exposing rep in the return type.

I believe the correct term for create is a hylomorphism which is an unfold and fold fused together or generate and interpret in one go.

type Counter
    = Counter CounterIF

type alias CounterIF =
    { value : Int
    , inc : () -> Counter
    }

type Rep
    = Rep Int

new : Int -> Counter
new n =
    let
        make : Rep -> Counter
        make rep =
            Counter
                { value =
                    case rep of
                        Rep x ->
                            x
                , inc =
                    \() ->
                        case rep of
                            Rep x ->
                                make (Rep (x + 1))
                }
    in
    make (Rep n)

Rep never appears in the resulting API. Only Counter does. So the outside world can’t name Rep, construct it, or inspect it. From the outside we know it exists but that is all.

This is existential typing:

∃ t . Interface(t)

There exists some type t, but you are not allowed to know what it is.
You only get operations defined by the interface.

And not ad-hoc typing:

∀ t . If t has property P, then operations exist

Behaviour depends on which type t actually is.

Which is overloading / typeclass / polymorphic dispatch.

I was argued against so vehemently on Slack that Jeremys Interface is ad-hoc typing and not existential, that I gave in to believing that to be true. If it were true, then we really could create new typeclasses in Elm and get polymorphic dispatch. Turns out it isn’t true and I was right all along to identify it as existential typing. No loop holes here!

I think it’s worth starting with reminder that elm is in theory based on a type system design that is mathematically proven to be sound.

There are 3 theoretical properties we can clasify typesestems with:

  • Soundness - is everything that is invalid proven to be invalid
  • Completeness - can everything that is valid be proven to be valid
  • Decidability - is type checker total (guaranteed to fail or succeed in finite time)

You can think about these as 3 distinct platonic ideals of a type checking process.

To rephrase the question with this in mind if I understand it correctly is to say "Is Elm’s type system sound?".

And the answer is that in theory it should be 100% sound. But in practice there could be bugs in implementation which make it unsound. Like this one which is a real bug present in current version of elm

> (floor 2) ^ -1
0.5 : Int

With soundness out of the way, the next question should be “What does untyped mean?”.
The way I understand type theory as it applies from philosophy (Russell) beyond to the practical applications in logic and computer science algorithms is that it’s not really an additive.

The type theory doesn’t say that there shall be types. It rather says that there are types and the two options you have is to talk about them or not.

So more practically that implies that there is no language without types. Nothing can be untyped. To ignore types means to consireder just a single type. So if we say by “untyped” we mean something like dynamic language lacking a type checking step we don’t talk about programing without types but programing with single type only.

Under this interpretation you can very easily program without types in Elm or any other language. For instance you can write your whole program around type like this:

type TheOnly
    = Int Int
    | String String
    | Record (List String TheOnly)
    | Pair TheOnly TheOnly

if you write the whole program with this you’re essentially programming without static type checking. Now your whole program is using just a single type and all errors that would be compile times errors if you would use type system that language gives you are now runtime concern.

1 Like

Nice and comprehensive presentation! I like the observation that all types could be shifted to the run-time even in a statically typed program. The sillier example often given is that you can write a dynamic language interpreter in the static language. Your example is more to the point.

I just want to layer on: Type variables are, themselves, types. All types must unify. So even when there are type variables the program is still “typed” in the practical sense that the program will exhibit the behavior you expect (unlike TypeScript :nauseated_face:).

HOWEVER, continuing in the spirit of articulating “infirmities” or ways one might undermine the type system.

Lack of equatable

There is a small lie in the core library in that (==) and (/=) are typed as a -> a -> Bool. That type signature promises “give me any two Elm values and I can tell you whether they are equal”, but you cannot compare all functions i.e. always 1 == always 2 throws a run-time error.

That could be solved by introducing a new constrained type variable like equatable. See PureScript’s Eq typeclass definition for a fun overview of things that would be members. I suspect adding a check for equatable would slow down compilation / feedback loops which can hurt quality more than it helps.

Total / Partial

Elm doesn’t check functions for totality so we can write recursive functions that blow the stack and can use that as a loophole to write nonsense. :smiley:

fromResult : Result e a -> a
fromResult result = 
  case result of
    Ok a -> a
    _ -> fromResult result

There’s an Unwrap library that does this and I have seen this technique used internally in some other libraries.

If I remember (its been like 8 years) Idris distinguishes between total and partial functions and only lets total functions call other total functions, and checks for totality by verifying that recursive calls always work on a “smaller” version of the input. Dhall - See the Safety First Section does something similar I think.

For anyone not familiar, this is not actually a design problem or weakness with Elm. If you want a Turing Complete language you have to have some holes (or at least, I think that is true - I’m not a computer scientist so listen to me at your own peril).

All non-functions can be encoded to String/JSON and Parsed

  • Given
    • we can serialize things without functions in them to JSON, and parse them back out with a potential failure i.e. Result.
    • and we can use Unwrap.result that produces a stack overflow if there is an Err and otherwise produces the relevant value
  • Then you could actually represent heterogeneous data in a Dict String Json and then decode the results without leaking result to the consumer

You could probably use this trick to model a composable form system where each field or sub-form could serialize its state to a JSON and deserialize it back out. This could yield an Api that reduces boiler plate in declaring forms but gives you the :wrapped_gift: present run-time :collision: explosions.

3 Likes

@john_s you’re absolutely correct.

  1. Type variables come from Hindley–Milner you can specifically check algorithm w which has proovable properties we’re looking for here.
  2. The way elm works around ad-hoc polymorphism (operator overloading) with things like number or comparable is already a bit weird. It’s almost like type classes but not quite. But there is tension between simplicity and generality.
  3. Halting problem is undecidable. It can be formulated as answer to a question “Is it possible to implement a program that could in finite time analyze any other Turing complete program and determine if it will ever finish” as “no”. The program doing the analysis is what we would call type checker.
    • Idris works around this partially by having analysis for subsets of the problem (it can prove this for some programs, not all)
    • Dhall chooses different approach - it’s not turing complete. The language itselfs makes programs where this proof is not possible illegal. But it’s not general purpose language so that’s fine tradeoff.
    • Even type systems can be turing complete and thus undecidable. For instance typescript’s typesystem is undecidable

As a bonus there is a historical note. There Actually was a problem in elm back in a Signal days. This might be closest thing I know about to what I feel @rupert might have been interested in. The problem was that parametric polymorphism doesn’t play well with side-effects and must be restricted in certain cases to avoid problems. This is documented in Polymorphic references - Counterexamples in Type Systems and this old compiler issue

1 Like

Interesting, but no, not what I was originally interested in. What got me thinking was the observation that an Elm program itself can have free type variables:

main : Program () (Model a) Msg
main = Platform.worker ...

a is an unbound type variable, so the type of the program overall is not completely grounded in concrete type instances. So I wondered, can that lead to a program that is in some sense untyped to be created and run? I think it can but it does not in any way compromise your 3 properties of soundness, completeness and decidability.

Here is such a program:

module Main exposing (main)

import Platform


{-| A Church numeral is polymorphic in its “carrier” type `a`. -}
type alias Church a =
    (a -> a) -> a -> a


type alias Model a =
    { count : Church a }


type Msg
    = Inc


main : Program () (Model a) Msg
main =
    Platform.worker
        { init = init
        , update = update
        , subscriptions = subscriptions
        }


init : () -> ( Model a, Cmd Msg )
init () =
    ( { count = zero }
    , Cmd.none
    )


zero : Church a
zero f x =
    x


inc : Church a -> Church a
inc n f x =
    f (n f x)


update : Msg -> Model a -> ( Model a, Cmd Msg )
update msg model =
    case msg of
        Inc ->
            ( { model | count = inc model.count }
            , Cmd.none
            )


subscriptions : Model a -> Sub Msg
subscriptions _ =
    Sub.none

I wanted to also write a view function for this, that applies the count function to (+) 0 to derive the actual count as an Int and then String.fromInt. But that would mean that the view function needs a Model Int, spoiling my attempt at a polymorphic program.

I think there might actually be a way to write the view function using Jeremys Interface to hide the Int implementation as an existential. Possibly. No there isn’t.

Anyway, its all still within the framework of sound, complete and decidable.

Here is program which is doesn’t have msg specialized which runs:

main : Html msg
main = Html.text "hello"

And here is a program which has unbound type in main by design (using phantom type variable). Not only that, update function says it’s changes this a to b when ever update is called.

module Main exposing (main)

import Browser
import Html exposing (Html)
import Html.Attributes
import Html.Events
import Html.Lazy


type Model a
    = Model
        { text : String
        , lazyText : String
        }


initialModel : Model a
initialModel =
    Model
        { text = ""
        , lazyText = ""
        }


type Msg
    = GotText String
    | GotLazyText String


update : Msg -> Model a -> Model b
update msg (Model model) =
    Model <|
        case msg of
            GotText text ->
                { model | text = String.filter Char.isDigit text }

            GotLazyText text ->
                { model | lazyText = String.filter Char.isDigit text }


view : Model a -> Html Msg
view (Model model) =
    Html.div []
        [ Html.p [] [ Html.text "These inputs only allow you to type digits. However, it doesn’t really work for the lazy one." ]
        , Html.label []
            [ Html.text "Regular: "
            , viewText model.text |> Html.map GotText
            ]
        , Html.hr [] []
        , Html.label []
            [ Html.text "Lazy: "
            , Html.Lazy.lazy viewText model.lazyText |> Html.map GotLazyText
            ]
        ]


viewText : String -> Html String
viewText text =
    Html.input
        [ Html.Events.onInput identity
        , Html.Attributes.value text
        ]
        []


main : Program () (Model a) Msg
main =
    Browser.sandbox
        { init = initialModel
        , view = view
        , update = update
        }

But could this be a problem in any way?


If you look at some idealized model of what elm is as a language than what you’re looking for couldn’t exist. It gets a bit messier when we look at practical implementation which confronts us with some practical engineering decisions which are sort of imposed onto a language. Like that elm is targeting JavaScript which is itself dynamic language with its own specification which for declares that all numbers are IEEE 754 floating points. But it might be always worth to consider “what the thing could be” as well as “what it really is” because that clarifies what is inherent property of some design and what is behavior or constraint imposed by the platform.

say you have this program:

nothing : Maybe a
nothing = Nothing

What is the type of Nothing? Well it’s Maybe a. So it’s not specialized to any concrete type yet. So by design the language needs to handle values which don’t have concrete type yet like maybe.

type Maybe a = Just a | Nothing

What will this be in computer memory? Well there are at least 2 reasonable options that will have different properties. First it could be something like this C type

typedef enum {
    Just_Tag,
    Nothing_Tag,
} Maybe_Tag;

typedef struct Maybe {
    Maybe_Tag tag;
    void *data; // <--- maybe holds a pointer/reference to the data
} Maybe;

If the container has a pointer to the data then every maybe has the same size in the memory. Maybe Int as well as Maybe HugeRecordWithTonOfData both will have just a tag and a pointer. to what ever the inner type is.

But maybe you’re implementing low level language and wrapping everything in pointers like this is not acceptable. When you say Maybe Int or Maybe User you don’t want to go through this pointer indirection. But then you have a problem - what is the size of the nothing we defined above? you don’t know.

What I mean to show there is that how types map onto what real values in real memory are and depends on things other than types. With most designs having type with free parameter doesn’t mean the value in real memory is any less real than value whose type is specialized. However there are language designs where this matters.

In Elm most of these decisions are made for you by the fact that you’re targeting JS. If you compile to JS objects there will be JS references living in heap memory maintained by garbage collector of the JS VM, that’s simply given. Elm doesn’t need to do the work to make sure every value is specialized because VM doesn’t require it.

If you look at say Haskell there the story is different. Haskell has some rules. For instance if you say

main = putStrLn 42

it’s not clear what 42 literal is. Is it Int, Integer, Float…? Compiler has to make some decision. In this case it desides that it needs to generate some code for something this literal could be and because it you didn’t need to tell it what primitive type it must be it should not in theory matter. And it generates code for 32 bit integer.

Elm doesn’t have to make that decisions. The difference between Int and Float in elm exists primary for correctness purposes. But under the hood for performance reasons these are just javascript numbers. and that’s where the bug with n ^ -1 for n being Int happens.

The reason why I’m not addressing your question directly is because I’m trying to adjust the question to something that is more answerable. And it seems to me that the most important step is to make clear distinction between what type checker is doing and what its purpose is on one hand and what code generation is doing on the other. Type checker validates if program makes sense, code generator just spits out the implementation. And the goal is to have correct program generated. One code is generated types no longer matter.


btw: Elm type system is not really complete in the strictest sense. Like demonstrated with infinite recursion or equality check on functions.

I agree that compiling to Javascript means that the Int/Float distinction goes away. Even if the compiler gets it wrong, as per the bug:

> (floor 2) ^ -1
0.5 : Int

This actually still works just fine, since its all just javascript number under the hood. I could add 1 : Int to the above and get 1.5 : Int as the answer.

I would not include details of the javascript implementation when reasoning about Elms type system though. For example, we can imagine an Elm that compiles to some other target where Int and Float are distinct, say 64-bit signed ints and 64-bit CPU float (or IEEE). For that reason, I am thinking about the semantics of Elm the language as distinct from Elm the implementation as a compiler + runtime.

The (^) operator which is (^) : number -> number -> number in Elm would need to be implemented differently on this imagined runtime with distinct Int and Float. The best interpretation given Elms goal of zero runtime errors would seem to be:

> (floor 2) ^ -1
0 : Int -- Wrong answer, like we do with divide by zero.

(Observation: This can be implemented with “branchless” CPU logic. Extend the sign bit to make a word that is all 1s or 0s, calculate the pow function, AND with sign extended word to mask wrong answers to 0)

The (^) operator which is (^) : number -> number -> number in Elm would need to be implemented differently on this imagined runtime with distinct Int and Float. The best interpretation given Elms goal of zero runtime errors would seem to be:

absolutely. Actually almost all operators would need to be figured out. My favorite one is ++ for appendable type. This one also performs some runtime reflections.

So yeah if we stay purely in the domain of the design of elm type system - Program with unbound type parameters doesn’t mean the code can’t be generated. All of these a and b which leak all the way to the main are for all practical purposes Never.

One more related OT comes to mind. There is this thing called “polymorphic recursion”. In elm such type could look like:

type Nested a = Thing a (Nested (List a)) | Epsilon

For languages which perform monomorphization it’s impossible to generate code for types like this.

If a here was replaced by Never, you would not be able to apply the church function. The church functions just apply some some arbitrary function N times, for church of N.

type alias Church a =
    (a -> a) -> a -> a

So the a must really be an unbound type var in the program I gave above - it is a truly polymorphic program.

I belive you’re mistaken. Try this.

module Main exposing (main)

import Platform


{-| A Church numeral is polymorphic in its “carrier” type `a`. -}
type alias Church a =
    (a -> a) -> a -> a


type alias Model a =
    { count : Church a }


type Msg
    = Inc


main : Program () (Model Never) Msg
main =
    Platform.worker
        { init = init
        , update = update
        , subscriptions = subscriptions
        }


init : () -> ( Model a, Cmd Msg )
init () =
    ( { count = zero }
    , Cmd.none
    )


zero : Church a
zero f x =
    x


inc : Church a -> Church a
inc n f x =
    f (n f x)


update : Msg -> Model a -> ( Model a, Cmd Msg )
update msg model =
    case msg of
        Inc ->
            ( { model | count = inc model.count }
            , Cmd.none
            )


subscriptions : Model a -> Sub Msg
subscriptions _ =
    Sub.none

it’s the program you supplied above. all I did is change main annotation to use Never instead of a. It should compile fine.

This is how substitutions work in HM. it sees some constraints like a <> Never and it performs unification where a is substituted by Never which type checks. So the type of the program is Program () (Model Never) Msg.

When ever there is any free type variable in the type annotation of main, it’s indeed effectively a Never. Or in actuality it’s free type variable waiting to by substituted by Never type.