If Elm had threads..?

There is this intriguing snippet in the docs for elm/core Process:

Right now, this library is pretty sparse. For example, there is no public API for processes to communicate with each other. This is a really important ability, but it is also something that is extraordinarily easy to get wrong!

I think the trend will be towards an Erlang style of concurrency, where every process has an “event queue” that anyone can send messages to. I currently think the API will be extended to be more like this:

type Id exit msg
spawn : Task exit a -> Task x (Id exit Never)
kill : Id exit msg -> Task x ()
send : Id exit msg -> msg -> Task x ()

One thing missing, how does a spawned task receive messages ? Well it could do it by either polling or blocking:

poll : Task x (Maybe msg) -- Nothing if none available
recieve : Task x msg -- Blocks till there is one

And that completes the API into a minimal, non-TEA like, concurrency model that could be backed by a real threaded Elm runtime. Not in the browser obviously, since Javascript there is single threaded, I am interested in exploring this topic hypothetically, as a thought experiment.

What is the right concurrency model for multi-threaded Elm ?

We could have mutable Dicts, like Haskells MVar, software transactional memory (STM), something else ?

Actor model does seem like the most natural fit to Elm, as TEA itself is very actor like.

I have written before that Actor model is bad for Elm, but to be clear, that is when talking about purely single threaded Elm. My view is that it makes single threaded programs harder than is strictly necessary to understand and modify, and that a better structure is achieved through functional abstraction and shared flat models sliced up using extensible records, as opposed to defunctionalising, out messages, and actor model.

But that is not to say that I believe Actor model is a bad concurrency model, and when real parallelism is available, it is arguably the safest. There is no shared mutable state, eliminating an entire class of race conditions, and multi-threaded programming can be very hard, so we appreciate concepts like this that can make it simpler and safer to do.

A downside of Actor model is that it is coarse grained parallelism. You don’t have things like parallel for loops, or the flexibility of coroutines to turn sequential looking code into parallel without necessarily reorganizing it around an Actor model.

Actor model is well suited to more static parallel pipelines. In my view that is actually a good thing - in a really high performance system you want to think of CPU cores themselves as Actors, and deliberately create a system that maps onto the hardware, as opposed to creating thousands of threads, on per job, and losing the benefits of going parallel to task switching overheads.

I have create a more TEA-like Actor model API, and the built a simulation of it running in Elm, to get a feel for what working with it would be like.

I will follow up with that next, but now my wife wants to watch our series on TV, so later… but if anyone has some ideas on how best to introduce multi-threading into Elm or even concurrent distributed Elm oper many machines, I would be interested to hear them.

Web workers can be multi-threaded, so this discussion is potentially applicable to browsers

That is true. But web workers also come with some restrictions that make them a little awkward.

send : Id exit msg → msg → Task x ()

With this API we could send any msg. It could be a function, or a partially applied function. Passing a closure is not something that is natively supported by web workers, at the Javascript level I mean. We could have a restricted API for web workers:

send : Id exit msg → Value → Task x ()

But also, one way of representing a closure is to have an object with a pointer to the function to apply, followed by the suspended arguments to call it with:

(+) 1 is:

{ fp : function_pointer → (+)
, args : [ 1 ]
}

If we have the same Elm program inside the web worker, or at least some shared subset of a program between web workers and main JS process, and instead of function pointers, we had some other label that uniquely identifies them, such as fully qualified name:

“some/package.Module.fun” – Do we need the version too ?

Then that is good enough as a function pointer. We could fully represent a closure and pass around functions, with the right compiler and runtime support. :grimacing:

But yeah, its a legitimate part of this exploration. :+1:

So here is my proposed API for creating TEA-like Actors as processes:

module Actor.Core exposing (..)

type Process msg = Process

type alias Actor flags model msg =
    { init : flags -> ( model, Cmd msg )
    , update : msg -> model -> ( model, Cmd msg )
    , subscriptions : model -> Sub msg
    }

spawn : Actor flags model msg -> flags -> Task x (Process msg)
self : Task x (Process msg)
kill : Process msg -> Task x ()

spawn creates effectively a Platform.worker and give you a process handle for it. self exists just for the purpose of being able to get the process handle of the main program thread.

No event messaging model yet, this is just the core actor lifecycle stuff.

And here is a simple point-to-point messaging API for sending message to Actors. @pd-andy pointed me at the Gleam API for this, and this is pretty much that API 1:1 translated to Elm.

The trick here is that whilst actors have their own internal msg type, the mail boxes (Subject) are themselves typed. This removes the type tangle that you would get into if you directly message actors - we message typed mailboxes instead.

A Selector is both a filter and a type mapper:

module Actor.P2P exposing (..)

type Subject a = Subject

type Selector msg a = Selector

subject : Task x (Subject a)
send : Subject a -> a -> Task x ()
all : Subject a -> Selector a a
selector : Selector msg msg
selectMap : Subject a -> (a -> msg) -> Selector msg b -> Selector msg b
map : (a -> b) -> Selector msg a -> Selector msg b
filter : (a -> Bool) -> Selector msg a -> Selector msg a
orElse : Selector msg a -> Selector msg a -> Selector msg a
withTimeout : Time -> a -> Selector msg a -> Selector msg a
select : Selector msg a -> Task x a
subscribe : Selector msg a -> (a -> parentMsg) -> Sub parentMsg

Example: Suppose you have two subjects — one carrying Int payloads, one carrying String. You want a single selector that listens to both subjects and produces Combined values:

type AppMsg
      = NumberMsg Int
      | TextMsg String
      | Combined String  -- what the selector ultimately produces

intSubject : Subject AppMsg Int
-- codec: (NumberMsg, \m -> case m of NumberMsg n -> Just n; _ -> Nothing)

strSubject : Subject AppMsg String
-- codec: (TextMsg, \m -> case m of TextMsg s -> Just s; _ -> Nothing)

mySelector : Selector AppMsg Combined
mySelector =
    selector                                           -- Selector AppMsg AppMsg (accepts any msg)
        |> selectMap intSubject (\n -> Combined (String.fromInt n))  -- intercept ints
        |> selectMap strSubject (\s -> Combined s)                   -- intercept strings
        |> map identity                                -- (or further transforms)

Another example, parent creates a subject, spawns a child, and listens for replies:

init =                                                                                                                                                             
      P2P.subject                                                                                                                                                    
          |> Task.andThen                                                                                                                                            
              (\replySubject ->                                                                                                                                      
                  Core.spawn childActor { replyTo = replySubject }
                      |> Task.map (\_ -> replySubject)
              )

subscriptions replySubject =
      P2P.subscribe (P2P.all replySubject) HandleReply


-- Child receives the subject as a flag and sends messages back

childActor : Actor { replyTo : Subject String } model msg
childActor =
      { init =
          \flags ->
              P2P.send flags.replyTo "hello from child"
      , ...
      }

Could you get most of the power of this API with something much more simple like a new function called Task.async : (() -> Task x a) -> Task x a that runs the provided function on a different thread?

That would work. The task provided would be handed off to another thread as a one-off job.

But if we also had these to receive message as Tasks:

poll : Task x (Maybe msg) -- Nothing if none available
recieve : Task x msg -- Blocks till there is one

And a little loop helper for creating Tasks that loop:

type Step state result                                                                                                                        
      = Loop state
      | Done result                                                                                                                             
                                                                  
                                                                                                                                                
loop : state -> (state -> Task x (Step state result)) -> Task x result
loop state callback =                                                                                                                         
      callback state                                              
          |> Task.andThen
              (\step ->
                  case step of
                      Loop newState ->
                          loop newState callback

                      Done result ->
                          Task.succeed result
              )                         

Then this handed off Task could be a long running process in a loop, and we pass messages to it, actor model style.

I think at a practical level some type tangling issues would arise from:

send : Id exit msg -> msg -> Task x ()

The msg type of the process and the actual messages themselves have the same type. That is where the typed mailbox idea is beneficial, as an intermediary between Actors with different msg types, that do not want to know about each others internal msg types, just the surfaced types of the mailboxes.

The main difference with the TEA-like Actor model API is not one of what is possible to do with that API compared with a Task based one, just that it has the familiar TEA-like shape to it. I am thinking it is for long running processes and where a bit more structure is beneficial to the design.

Room for both - quick async hand off of Tasks would be useful in many situations.

If I were using Actors in Elm, it’d probably be to benefit from the distributed aspect of Actors over local concurrency. I mean, when do you really lack performances in Elm (especially in the browser) ? It’s usually too high level.

Something in the spirit of Cloudflare’s Actors with Durable Objects. It offers great primitives for storage and scheduling and strong garanties for concurrency GitHub - cloudflare/actors: An easier way to build with Cloudflare Durable Objects · GitHub It’s very much alpha, but I’ve played with it a bit and it’s amazing for anything realtime, and everything multi-tenancy.

I suppose it depends what you are comparing it to. Browser no, but if running under NodeJS say, I might like to do some crazy thing like writing a high performance threaded server process, and I think that would not perform super well compared with alternatives. So I am imagining what a hypothetical alternative might look like, where scaling up with threads is possible.

Cloudflare’s Actors - I read the README but it did not really address the fundamential question of what is this and what does it do ? This is some way of writing Actors that you deploy into the cloud to give you a scalable distributed system, is that it ?

Another aspect of Actors is what the BEAM provides - reliability. Fault tolerance and supervision trees that restart failed processes and so on. Actor model has strong isolation so is key to making this possible. I don’t know the APIs to make use of it though, what would that look like in Elm ?

But yes, lets talk about distributed concurrency too. I’ve never been a BEAM developer but I think that also offers distributed actor model too, or am I mistaken about that ?

I think its also worth mentioning that the gap between local CPU and network distributed concurrency is not as wide as it used to be.

At one time CPUs were single core, and we built distributed systems to go parallel over a network, ethernet maybe.

Now CPUs can have > 128 cores on them. The cores are connected together by a network, just not an ethernet one. The difference is that there is shared memory, but that is connected to the network too (kinda, not the same network and via the cache hierarchy) - so not entirely unlike random access distributed storage.

You may have heard the term “data centre on a chip” in conjunction with modern CPU architectures. Here is an interesting talk about it: Uncomplex: Modern Hardware for Better Software - InfoQ

tldr; You can build distributed system accross an ethernet network that perform poorly compared to more tightly integrated systems accross the CPU network, and you can eliminate a lot of unnecessary complexity by doing so.

The project I linked is in a research phase, but the foundations are Durable Objects: What are Durable Objects? · Cloudflare Durable Objects docs It basically gives you a SQLite DB for each uniquely created Object that is identifiable globally. They had globally distributed compute (workers) and strongly consistent storage (R2/KV), so Durable Objects is mostly context and scheduling over their existing platform. Most of my Actor knowledge comes from Akka (Scala). That’s obviously a billion time more customizable, but I’m very impressed with the final API Cloudflare was able to offer. It’s a brutally simple way to get into distributed Actors with virtually no knowledge, configuration or even installation needed.

BTW, A drawback of this approach is that a Task can only run Tasks, it cannot run a Cmd.

Unless we had something like this:

module Taskify exposing(..)

cmd : Cmd msg -> Task x msg
sub : Sub msg -> Task x msg -- Blocks till sub fires
poll : Sub msg -> Task x (Maybe msg) -- Polls

But that won’t work as a Cmd can be fire and forget and never return anything, unless the Task there is implemented with a timeout.

This is not an accident! Elm v0.15 through v0.17 were explicitly inspired by Erlang! Evan used to talk about it a fair amount, and it’s touched on in the release posts.

OTP’s gen_server has a callback for initialising the state from an argument passed in at boot, and a callback for handling a message which takes a state and returns a new state and optionally a side effect. It’s easy to see how this evolved into TEA.

This is not at all how one would use any actor system I’m familiar with. Hardware is not used as the base for the design and actors/coroutines are certainly not used to model data-flow pipelines. Instead the workload is split the other direction, with an actor/coroutine being used for each item of work. This results in a great number of them being run. The original designers of Erlang have often said that an interesting program will be running at least hundreds of thousands of actors/coroutines concurrently.

Using actors the way you’ve described results in poor performance due to heavy copying of memory. Perhaps more importantly, it sacrifices the main advantage of programming in languages like Gleam, Erlang, Elixir, Go, etc, where one can write their program entirely in the single-threaded style and then effortlessly make it concurrent by running multiple copies of it.

As Joe Armstrong said: It’s hard to write a server that handles a thousand concurrent requests. It’s easy to write a web server that handles a single request and running a thousand instances of it.

Could you expand on what you mean by this? Thanks!

The Gleam API that @rupert shared is very high level and considered to be part of the application framework, which is likely why is seems rather complex. It does a great deal more things than the low level API you’ve shared there.

This high level API is built on top of a set of low level primitives, the equivalent to that Task.async would be the spawn : (() → a) → CoroutineId function.

Interesting. I have never been a BEAM programmer so much of this is unknown to me.

But actor model and coroutines are not the same thing - actor model is an architecture based around isolated entities passing messages to each other, and coroutines is a programming model that allows functions to be suspended and resumed.

I think I am right in claiming that actors on the BEAM are implemented as coroutines ? They can be stopped and restarted, and have light weight executions contexts, making swapping which actor is running quite fast. So indeed, an interesting program could have thousands of them, and they should and do scale well.

I think it is coroutines rather than actors that let you take a single threaded program, and essentially parallelise it without changing its structure too much ? Actors are an architecture, so you have to follow the architecture to use them, which could significantly alter how an existing program works.


The highest performance actor model systems I have worked with do align with CPU hardware, and these are custom systems for high frequency trading. That is the kind of hardware aware design that I know about, and its what John Ohara is talking about in the talk I posted a link too.

Heavy copying of memory is avoided by passing pointers to events, not the events themselves. Obviously that is unsafe, but also quite normal in C++ or whatever lower level language these kinds of systems are usually built in - care must be taken to treat them like they are immutable.

I am fairly confident the same trick could be done in a hypothetical Elm runtime with parallel actors, with ref counts on messages, and messages that are immutable by Elm semantics, and therefore do not need to be copied to be safely shared accross threads.

The real issue is that the actor model forces you to structure parallelism around message passing boundaries, which changes how you express problems.

For example in Kotlin you could write:

val results = (1..N).map { i ->
    async { doWork(i) }
}.awaitAll()

In the Actor model it would be more like:

Coordinator Actor
→ sends N messages
Workers
→ process
→ send results back
Coordinator
→ merges results

So the tendency is more towards coarse grained chunking of parallel activities into actors, and less towards using more localized and composable parallel structures in code.

That said - no reason at all why Array.map in Elm could not be compiled down into something that runs in parallel like the Kotlin example above. Not sure I want to think about what the compiler heuristics for deciding when to automatically parallelize look like though!

Any thoughts on what an API for this would look like in Elm ?

I did not state it explicitly, but my intention in the P2P API above, is that each message is delivered to only one actor - its P2P, not PubSub.

So if we can make N actors, but all feed off 1 typed mailbox, those N can process the events simultaneously - like the web server you describe.

Perhaps we just need:

spawnN : Int -> Actor flags model msg -> flags -> Task x (List (Process msg))

Thankfully you’re mistaken here, it’s exactly the same with the actor model. Here’s what that would look like in idiomatic Elixir, for example.

results = 1..n
  |> Enum.map(&Task.async(fn -> do_work(&1) end)
  |> Task.await_many()

Both the future based API and the actor based concurrency systems are capable of expressing APIs such as these. The main difference is that with futures/tasks the final return type is Future (List t), but with actors it’s just regular List t. There’s no infectious future type! This means that all code composes and there’s no split between sync code and async code.

Typically one would create actors as needed.

handleMessage state msg =
  Process.spawn \_ -> doSomeWork state msg

Spawning some fixed number of actors typically is done when you want to reduce the amount of concurrency, preventing it from going high dynamically.

Where is the actor model here though ? This looks like coroutines, but I do not see an actor and message passing ? But perhaps that is implicitly there, I just do not know how to read it.

Looking in Elixir docs I see this example, which looks more actor like to me, receiving a message and processing it.

defmodule Example do
  def explode, do: exit(:kaboom)

  def run do
    Process.flag(:trap_exit, true)
    spawn_link(Example, :explode, [])

    receive do
      {:EXIT, _from_pid, reason} -> IO.puts("Exit reason: #{reason}")
    end
  end
end

Here is a revised messaging API. This one combines P2P and a separate PubSub that I had, into a single API for Channels.

Subject got renamed to Channel and a phantom type param added for OneToOne versus Broadcast semantics. This is a common split in many messaging APIs.

I like how send and publish are distinguished on the sending side by phantom types, but on the receving side it does not matter, Selector construction is the same for both kinds of messages.

Better ?

module Actor.Channel exposing (..)

type Channel c a = Channel
type OneToOne = OneToOne
type Broadcast = Broadcast
type Selector msg a = Selector

oneToOne : Task x (Channel OneToOne a)
broadcast : Task x (Channel Broadcast a)
send : Channel OneToOne a -> a -> Task x () -- or Cmd msg
publish : Channel Broadcast a -> a -> Task x () -- or Cmd msg
all : Channel c a -> Selector a a
selector : Selector msg msg
selectMap : Channel c a -> (a -> msg) -> Selector msg b -> Selector msg b
map : (a -> b) -> Selector msg a -> Selector msg b
filter : (a -> Bool) -> Selector msg a -> Selector msg a
orElse : Selector msg a -> Selector msg a -> Selector msg a
withTimeout : Time -> a -> Selector msg a -> Selector msg a
select : Selector msg a -> Task x a
subscribe : Selector msg a -> (a -> parentMsg) -> Sub parentMsg

I think you’re right that the actors here are a bit hidden. I looked up the docs for Task and found three paragraphs that explain it well, I think:

Tasks spawned with async can be awaited on by their caller process (and only their caller) as shown in the example above. They are implemented by spawning a process that sends a message to the caller once the given computation is performed.

One of the common uses of tasks is to convert sequential code into concurrent code with Task.async/1 while keeping its semantics. When invoked, a new process will be created, linked and monitored by the caller. Once the task action finishes, a message will be sent to the caller with the result.

Task.await/2 is used to read the message sent by the task.

So to summarise @lpil’s example:

  • the lambda is running in a new spawned process / actor
  • there are no messages from “main” → “spawned”. The necessary data is simply copied over via scope. (But you also could await more messages in the spawned task, if you wanted)
  • once the function ran to completion, it’s result value is wrapped in a message and sent back “spawned” → “main”
  • “main” then has a receive to wait for the message within the Task.await
  • so there are actors and there is message passing - it’s just under the hood of a nice API to hide reference tracking mostly I think (EDIT: and also a kind of “error handling”. Missed that earlier, whoops…)