I would like to better understand how pure functional programming languages handle side effects in general. I understand that in Elm, side effects are delegated to the runtime environment in the form of commands. The runtime takes care of the “dirty jobs” (such as generating random numbers, etc.) and can, in turn, call the update function with the results. The update function itself remains purely functional. Great example of ‘functional core, imperative shell’. So far, so good.
Since programs without any side effects are rarely useful, all pure functional languages must have a way to execute side effects nonetheless. Unfortunately, my brain refuses to properly understand monads in Haskell. I know monads are the way Haskell handles effects, but even without really understanding them: monads in Haskell are still purely functional (?), and Haskell must also have some mechanism to calculate a random number while still remaining purely functional.
Can someone help explain to my (imperative and object-oriented trained) brain how side effects are achieved in pure functional languages other than Elm?
Monads in Elm:
Elm doesn’t actually require you to use Monads in the same way as Haskell or other functional programming languages. In functional programming, Monads are often used to “chain” operations together, similar to how chaining works in object-oriented JavaScript with promises. For example, in JavaScript, you can chain .then(onSuccess).catch(onError) on a promise. In object-oriented programming, methods return objects (often this), which can be used to call additional methods (OOP chaining).
In Elm, this “chaining” concept is similar but not exactly the same. Elm uses Data Types and functions like andThen to chain operations, which is conceptually similar to Monads. For example, you might see something like:
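For instance, a small hedged sketch using Maybe.andThen from elm/core (safeDiv and the parsing logic are just illustrative):

```elm
-- Chain computations that can each fail; the chain
-- short-circuits to Nothing at the first failure.
safeDiv : String -> String -> Maybe Float
safeDiv a b =
    String.toFloat a
        |> Maybe.andThen
            (\x ->
                String.toFloat b
                    |> Maybe.andThen
                        (\y ->
                            if y == 0 then
                                Nothing

                            else
                                Just (x / y)
                        )
            )
```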
While andThen is technically the Monad function in Elm, the broader concept of chaining operations is often associated with the |> operator. In Haskell, pipeline-style chaining is written with the & operator (plain $ is just function application), and monadic chaining specifically uses the >>= (bind) operator.
Managed Effects in Elm:
Effects in Elm are managed by the runtime, which means that creating a Task (like an async operation) doesn’t actually do anything until it’s handled by the Elm event loop. This can be a bit tricky to understand at first because you don’t see effects happening immediately when you create a Task—they only run when the event loop processes them.
The key idea behind managed effects in Elm is that they force you to explicitly handle all possible outcomes (success or failure) of a computation, so your program can never reach an inconsistent state. In Elm, if a function can fail, you must account for that failure explicitly, and the compiler will make sure you handle all possible cases. This approach minimizes errors and ensures reliability, and it’s a lot more powerful than try...catch, essentially because errors are treated as data instead of as early exits.
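As a hedged sketch of “errors as data”: with elm/http, the result of a request arrives as a Result, and the compiler refuses to compile an update that forgets either branch (Model, Msg and the URL here are just illustrative):

```elm
import Http

type alias Model =
    { status : String }

type Msg
    = GotText (Result Http.Error String)

fetch : Cmd Msg
fetch =
    Http.get { url = "/data.txt", expect = Http.expectString GotText }

update : Msg -> Model -> ( Model, Cmd Msg )
update msg model =
    case msg of
        GotText (Ok text) ->
            ( { model | status = "Loaded: " ++ text }, Cmd.none )

        GotText (Err _) ->
            -- the compiler will not let us forget this branch
            ( { model | status = "Request failed" }, Cmd.none )
```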
Effect in TypeScript:
If you’re familiar with TypeScript, you might want to check out the effect-ts library. It’s a nice way to work with effects in a functional programming style, but without using too much jargon, making it a good transition for those coming from JavaScript/TypeScript.
in a nutshell:
chaining in Js: obj.fn1().fn2() with a DOT operator .
chaining in Elm: data |> fn1 |> fn2 with a PIPE operator |>
chaining in Haskell: data & fn1 & fn2 with the & operator, or m >>= fn1 >>= fn2 with the BIND operator >>= for monads (amongst many other operators dedicated to chaining in Haskell). Monads in Haskell also open the world of do notation, where values are bound with the left arrow <- operator (which is just syntactic sugar for >>=), very much like await in Js. Haskell developers want to offer this API as much as Js devs want to offer Promises, for the exact same reason: syntactic sugar to reason about async. This doesn’t have a correspondence in Elm.
monad in Js: the Promise API
monad in Elm: functions like andThen: (a -> X b) -> X a -> X b
monad in Haskell: everything associated with the >>= (bind) operator and do notation
effects in Js: try { } catch() {}
effects in Elm : Task Error result (which is a Monad)
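To make those last lines concrete, a hedged Elm sketch: a Task is only a description of an effect; handing it to the runtime with Task.attempt is what turns the eventual outcome (success or failure) into a Msg for update:

```elm
import Task
import Time

type Msg
    = GotTime (Result Never Time.Posix)

-- nothing happens when this value is defined; the runtime
-- runs the task once update returns this Cmd
askForTime : Cmd Msg
askForTime =
    Task.attempt GotTime Time.now
```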
The clarifications on monads and chaining above are great, but I don’t think they answer tomka’s original question, which I would like to know as well: In Elm side effects are achieved via the runtime interacting with the pure update function. How are side effects achieved in other pure functional languages like Haskell?
Every time I search for the answer I mostly get more information about how side effects are represented, when I’m looking for how they are executed. Is it just that all effects in the program are ultimately represented in the single IO monad returned from main?
Thank you for your very detailed and helpful explanations.
As @blaix pointed out, it’s still unclear to me how effects are handled in purely functional languages other than Elm. I understand that in Elm, the runtime environment generated by the compiler and included in the JS output does the job and calls the pure update function of the Elm program, which is also included in the JS file. Although both live in the same JS file, they are logically separated. A task with a side effect is therefore executed by the runtime environment, which is logically not part of the pure Elm program.
Who actually performs the task in other purely functional languages, such as e.g. Haskell? Even if it’s a monad in Haskell, it’s still a task that involves a side effect and therefore can not be executed by the Haskell program itself?
For one, Elm is quite strict about separating “Elm” code and the rest of the system. The boundary for these is a bit more porous in Haskell, since you have a FFI that can directly call C functions.
Conceptually though, you can think of the whole Haskell program as residing inside a massive Task IOError () (in Elm parlance). So instead of having an update function receiving messages, a Haskell program runs by executing this task. That task can be composed of other subtasks via an andThen-like mechanism (though Haskell has some neat syntactic sugar for this):
in Elm you would write:
myFun fileName =
    File.readFile fileName
        |> Task.andThen
            (\contents ->
                File.writeFile (fileName ++ ".output") (countLineLength contents)
            )
        |> Task.andThen
            (\res ->
                case res of
                    Ok _ -> Console.put "Success"
                    Err _ -> Console.put "Error"
            )
in (pseudo) Haskell you could write
myFun fileName = do
    contents <- File.readFile fileName
    res <- File.writeFile (fileName ++ ".output") (countLineLength contents)
    case res of
        Ok _ -> Console.put "Success"
        Err _ -> Console.put "Error"
I’m not an expert but maybe I could add a few interesting bits to this discussion.
Here’s what I currently understand:
IO is a constraint: it’s a “contract” you should abide by, because it allows differentiating pure functions from impure functions (as a reminder, a pure function will always give you back the same result for a given input; an impure function may not). As far as I understand, IO represents the possibility of any possible computation, so if a function returns an IO of something, “anything can happen” basically.
So the deal is that you are allowed to run functions requiring an IO context only if you are already in an IO “context”. It’s a little bit similar (but much more constraining) to the mechanism of a JavaScript promise: you can only await an async function if you are already in an async “context”.
But we can still break this “contract” for debugging purposes because it’s a convention, like we do by logging inside pure functions in Elm with Debug.log.
So, we could in theory print to the console (the simplest example of a side effect) in Haskell without this requirement, like we do in Python, but we’d lose all the typical guarantees you’d expect from your FP program, so we don’t do that (and thus we conform to this designed constraint).
It’s a bit difficult to demonstrate what’s going on in Haskell when printing because the IO setup is quite involved, but we could look at its cousin PureScript, another purely functional language.
Here, to log something to the console, it is required by the function signature that we are in an Effect context (it’s similar to IO in Haskell)
With regards to Elm, the TEA pattern is a stronger form of constraint, whereas IO is still a constraint but much “looser”.
I can recommend the book Get Programming with Haskell if you want to learn more. I found it approachable and very much enjoyed reading it (doing the exercises is essential).
‘do notation’ is fancy syntactic sugar for repeatedly applying the ‘bind’ (andThen) function in a way that makes it look like a pure function returning a value for whatever monad context it’s in. You can see this in the structural difference between the two examples you quoted: the Elm version has to repeatedly capture output in a lambda, while Haskell can assign it using the back arrow “<-”.
So the interesting part is what ‘bind’ actually does under the hood - and this is defined differently for each monad (similar to how we have different Maybe.andThen vs. Task.andThen).
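For example, this is essentially how Maybe.andThen is defined in elm/core; each monad supplies its own version of this plumbing (Task.andThen instead has to thread a pending effect through):

```elm
andThen : (a -> Maybe b) -> Maybe a -> Maybe b
andThen callback maybe =
    case maybe of
        Just value ->
            callback value

        Nothing ->
            Nothing
```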
In the IO case, I believe Haskell’s laziness means that the operation is stored in a “thunk” within the context (as though it’s a partially applied function), and evaluation is started whenever it decides the value will be needed.
So depending on the operation, this might be right away, or it might be carried around with other thunks, waiting until the last minute to start.
Contrast this with Elm where we know the task will start at the next TEA loop, or procedural languages where promises will be queued for next available opportunity.
Yes, the book seems very good to me. I read a little bit in the section on IO, and combined with the discussion that happened so far, it clarified quite a few things. Thanks for the recommendation.
I’ll try to summarize my thoughts:
Every purely functional language must provide some way to allow effects, as the language would otherwise not be practically usable (without IO or state). Purely functional code must be separated from impure code to ensure the language deserves the label “pure.” However, there must be a way to handle impure code.
In Elm, effects are implemented using Commands, which are executed by the runtime. The result is returned to the pure update function through messages. The purely functional code resides within the Elm program, while the impure code is in the runtime environment.
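A minimal hedged sketch of that loop with elm/random (Roll, GotFace and Model are illustrative names): update stays pure and merely describes the effect as a Cmd; the runtime performs it and feeds the result back as a message:

```elm
import Random

type alias Model =
    { face : Int }

type Msg
    = Roll
    | GotFace Int

update : Msg -> Model -> ( Model, Cmd Msg )
update msg model =
    case msg of
        Roll ->
            -- ask the runtime for the number; no side effect happens here
            ( model, Random.generate GotFace (Random.int 1 6) )

        GotFace n ->
            ( { face = n }, Cmd.none )
```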
In Haskell, this separation is achieved through pure functions and, for example, IO Actions like main, which are not pure. Impure code has a special data type that “marks” it.
So you never know when your IO is going to run? Could it even be in a different order to the one in which you andThened your IO operations? I suppose later tasks will likely consume the outputs of prior ones in the order you chain them, so should execute as expected since their outputs will be needed to evaluate the next one.
I often get confused in javascript about the order of flow control when promises get involved. Before I first learned about Elm I tried making a UI with Angular, and found that I quickly got to the point where I had no idea of the sequence in which things would run - AND it mattered. Seemed to matter to much of how Angular worked and it was just… confusing. Honestly, asynchronous code is actually easier to write in a more bare metal multi-threaded language like Java even.
Chaining Tasks in Elm is nice. Currently doing some backend code in Elm and chaining tasks as Procedures with brian-watkins/elm-procedure is just as nice.
This becomes a really interesting question for tasks that are IO () - that is, all side effect and no return value - things like writing to the console and making database changes. If they are only run ‘when needed’, when is that?
Theoretically you do know when it’s going to run - when main() is run, and yes, they could run in unexpected order even if not waiting on return values.
Haskell has various escape hatches (e.g. through annotations, the sequence/seq mechanisms, and the unsafe unsafePerformIO function) to give you some control over when things happen and in what order. Things like database libraries often include some kind of ‘run’ function that uses these so that you can chain up operations and then guarantee when they are started and in what order.
p.s. I’m not a Haskell expert by any means, this is just based on the intuition I built up trying to use it on the backend. If anyone else wants to pipe in with corrections please let me know!
Just like Elm having a “runtime”, Haskell’s compiler (GHC) compiles your program with what’s called a “non-trivial runtime system” (RTS) which takes care of doing a lot of the things that runtimes do (like scheduling, memory allocations, storage etc.). A very hand-wavy approximation is to say this runtime system takes care of actually “running” your Haskell program – by which, I mean that it “runs” your main function which has to be an IO action (not too dissimilar from Task). And in running your main function, any other function that it encounters in the path will also be run, often in non-deterministic ways but following a set of rules for evaluation and how you describe the sequence of IO functions inside a given function/context.
There have been a lot of replies putting parts of the puzzle together, but perhaps things haven’t been completely tied up for you…
As to alternate “pure” functional languages other than Elm, there aren’t very many; the common ones are Haskell and PureScript, both quite a bit more complex than Elm, but both handling side effects in generally the same way. Rather than trying to teach you these languages, I’m going to express the whole way in which they work in terms of Elm code and see if that does it for you…
First, these languages have something like Elm’s Task type that wraps actions which would be side effects if they were performed in the language itself, so they are “performed” by the runtime instead. The difference is that, unlike Elm’s Task wrapper, which can return an indication of failure, Haskell and PureScript assume that failures will be handled by an entirely different mechanism: exceptions. Since that is entirely different, I am going to suppose that they do it the Elm way and that there is an IO e a type in these languages rather than just an IO a type…
So, these IO/Task types represent an action or through the andThen (bind in Haskell/PureScript) mechanism can represent a chain of actions performed in a pre-determined sequence. The functional magic is how this is done, as follows:
There is a flow of a kind of state from action to action from the beginning to end of the chain of actions given to “main”.
In Haskell, this chain of actions passes a kind of state called realWorld, which has its own unique type RealWorld. In Elm it could be defined as type RealWorld = RealWorld, with the only instance of this type defined as realWorld = RealWorld. This type has a single constructor with no payload, so it doesn’t need a memory representation (it doesn’t have one in Haskell; the code generation in PureScript turns it into just a void call; if created in Elm it would have a JavaScript object representation, but that doesn’t really matter), and it is actually a phantom type, meaning it is never actually used.
Now, if we were defining this IO e a Type in Elm as an action passing this phantom state, we might define it as follows:
type IO e a = IO (RealWorld -> ( RealWorld, IOResult e a ))
type IOResult e a = Good a | Bad e
where RealWorld is the phantom state type defined above (Elm requires naming it concretely, since a free type variable on the right-hand side wouldn’t compile).
In order to be something like a monad, we then need the ability to wrap something in the IO e a Type, either to indicate success or failure, so we might define the following:
return : a -> IO e a -- this might better be called "wrap" or "success" rather than "return"
return a = IO (\ s -> (s, Good a))
failure : e -> IO e a
failure e = IO (\ s -> (s, Bad e))
Other than the ability to wrap actions as per the above, monadic actions require the ability to chain/andThen which could be defined as follows:
andThen : (a -> IO e b) -> IO e a -> IO e b
andThen iof (IO sf) =
    IO <| \s0 ->
        let
            ( s1, ior ) = sf s0
        in
        case ior of
            Bad e -> ( s1, Bad e )
            Good a ->
                let
                    (IO iorsltf) = iof a
                in
                iorsltf s1
This will run actions as long as we keep andThen’ing more of them, but when it encounters the first error, it will abort the rest of the chain and return the error Type e for that first error.
Now all we need is the ability to run an action (which may, and very likely will, consist of a chain of actions). Haskell and PureScript would call this perform, and Elm also has a perform for actions that can never fail; since we are including the possibility of errors in our action chains, we’ll call this version attempt. For a language like Haskell, where only one of these is ever supposed to be used, by the main chain, it returns the a type if successful and handles the error type e by special processing. It could be defined as follows:
attempt : IO e a -> (e -> a) -> a
attempt (IO iof) onError =
    let
        ( _, ior ) = iof realWorld
    in
    case ior of
        Bad e -> onError e -- turns the error into a proper output, for instance
        Good a -> a
The error handling could be whatever the “pure” language allows, correcting the error as shown or calling some sort of a panic/error function with a message, etc.
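Putting the pieces together, a usage sketch of this toy IO e a (using only the return, failure, andThen and attempt defined above; unlike real IO, everything here is pure):

```elm
program : IO String Int
program =
    return 2
        |> andThen (\n -> return (n * 21))
        |> andThen
            (\n ->
                if n > 0 then
                    return n

                else
                    failure "not positive"
            )

result : Int
result =
    attempt program (\_ -> 0)

-- result == 42
```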
Now, Haskell (and likely PureScript) allows breaking the purity of the language in certain ways, such as letting one call unsafePerformIO in more places than just the main chain, but it is unsafe because it breaks type purity in certain ways. Elm doesn’t allow that. Instead of returning the final a result directly (usually a unit/() in Haskell/PureScript), as is done above, Elm wraps the whole chained action as a command/Cmd msg, with a Message to be passed back on completion containing the non-error result of the action; the chain is performed by the runtime, which passes the Message back. In this way, Elm allows many calls of the perform/attempt functions with any returned type, yet maintains its purity.
Many Elm programmers don’t use andThen chains as much as they could and instead have a huge chain of Messages to be sorted out by the update function, with any action that needs to be followed by another calling yet another perform/attempt to trigger yet another Message, and so on, until the chain of actions is completed by issuing a Cmd.none command.
There are a few features missing from this implementation, such as the ability to map from one error type to another, to map different result types, etc., but it should be sufficient to understand how side-effect actions are built up and composed…
So there are a few things we can see from this exercise. First, even in Haskell with lazy evaluation, actions always run in the order they were programmed to perform, because they are part of a chain of actions with a “something” passed from action to action. Second, actions passing that “realWorld” something are functional combinators that pass a state from action to action via functions, so they fit completely into the pure functional programming world, similar to Alonzo Church’s definition of Church numerals as combinator compositions of chains of functions.
I hope that helps your understanding of this; I didn’t understand all of this myself until recently even though I have been about an intermediate level Haskell programmer for years, but my present understanding came about as I have been digging into the implementation of the Elm compiler written in Haskell…