I would like to better understand how pure functional programming languages handle side effects in general. I understand that in Elm, side effects are delegated to the runtime environment in the form of commands. The runtime takes care of the “dirty jobs” (such as generating random numbers, etc.) and can, in turn, call the update function with the results. The update function itself remains purely functional. Great example of ‘functional core, imperative shell’. So far, so good.
Since programs without any side effects are rarely useful, all pure functional languages must have a way to execute side effects nonetheless. Unfortunately, my brain refuses to properly understand monads in Haskell. I know monads are the way Haskell handles effects, but even without really understanding them: monads in Haskell are still purely functional (?), and Haskell must also have some mechanism to calculate a random number while still remaining purely functional.
Can someone help explain to my (imperative and object-oriented trained) brain how side effects are achieved in pure functional languages other than Elm?
Monads in Elm:
Elm doesn’t actually require you to use Monads in the same way as Haskell or other functional programming languages. In functional programming, Monads are often used to “chain” operations together, similar to how chaining works in object-oriented JavaScript with promises. For example, in JavaScript, you can chain .then(onSuccess).catch(onError) on a promise. In object-oriented programming, methods return objects (often this), which can be used to call additional methods (OOP chaining).
In Elm, this “chaining” concept is similar but not exactly the same. Elm uses Data Types and functions like andThen to chain operations, which is conceptually similar to Monads. For example, you might see something like:
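For instance, a small hedged sketch using Maybe.andThen from elm/core (safeDiv and the parsing logic are just illustrative):

```elm
-- Chain computations that can each fail; the chain
-- short-circuits to Nothing at the first failure.
safeDiv : String -> String -> Maybe Float
safeDiv a b =
    String.toFloat a
        |> Maybe.andThen
            (\x ->
                String.toFloat b
                    |> Maybe.andThen
                        (\y ->
                            if y == 0 then
                                Nothing

                            else
                                Just (x / y)
                        )
            )
```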
While andThen is technically the Monad function in Elm, the broader concept of chaining operations is often associated with the |> operator. In Haskell, pipeline-style chaining is written with the & operator (plain $ is just function application), and monadic chaining specifically uses the >>= (bind) operator.
Managed Effects in Elm:
Effects in Elm are managed by the runtime, which means that creating a Task (like an async operation) doesn’t actually do anything until it’s handled by the Elm event loop. This can be a bit tricky to understand at first because you don’t see effects happening immediately when you create a Task—they only run when the event loop processes them.
The key idea behind managed effects in Elm is that they force you to explicitly handle all possible outcomes (success or failure) of a computation, so your program can never reach an inconsistent state. In Elm, if a function can fail, you must account for that failure explicitly, and the compiler will make sure you handle all possible cases. This approach minimizes errors and ensures reliability, and it’s a lot more powerful than try...catch, essentially because errors are treated as data instead of as early exits.
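As a hedged sketch of “errors as data”: with elm/http, the result of a request arrives as a Result, and the compiler refuses to compile an update that forgets either branch (Model, Msg and the URL here are just illustrative):

```elm
import Http

type alias Model =
    { status : String }

type Msg
    = GotText (Result Http.Error String)

fetch : Cmd Msg
fetch =
    Http.get { url = "/data.txt", expect = Http.expectString GotText }

update : Msg -> Model -> ( Model, Cmd Msg )
update msg model =
    case msg of
        GotText (Ok text) ->
            ( { model | status = "Loaded: " ++ text }, Cmd.none )

        GotText (Err _) ->
            -- the compiler will not let us forget this branch
            ( { model | status = "Request failed" }, Cmd.none )
```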
Effect in TypeScript:
If you’re familiar with TypeScript, you might want to check out the effect-ts library. It’s a nice way to work with effects in a functional programming style, but without using too much jargon, making it a good transition for those coming from JavaScript/TypeScript.
in a nutshell:
chaining in Js: obj.fn1().fn2() with a DOT operator .
chaining in Elm: data |> fn1 |> fn2 with a PIPE operator |>
chaining in Haskell: data & fn1 & fn2 with the & operator, or m >>= fn1 >>= fn2 with the BIND operator >>= for monads (amongst many other operators dedicated to chaining in Haskell). Monads in Haskell also open the world of do notation, where values are bound with the left arrow <- operator (which is just syntactic sugar for >>=), very much like await in Js. Haskell developers want to offer this API as much as Js devs want to offer Promises, for the exact same reason: syntactic sugar to reason about async. This doesn’t have a correspondence in Elm.
monad in Js: the Promise API
monad in Elm: functions like andThen: (a -> X b) -> X a -> X b
monad in Haskell: everything associated with the >>= (bind) operator and do notation
effects in Js: try { } catch() {}
effects in Elm : Task Error result (which is a Monad)
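To make those last lines concrete, a hedged Elm sketch: a Task is only a description of an effect; handing it to the runtime with Task.attempt is what turns the eventual outcome (success or failure) into a Msg for update:

```elm
import Task
import Time

type Msg
    = GotTime (Result Never Time.Posix)

-- nothing happens when this value is defined; the runtime
-- runs the task once update returns this Cmd
askForTime : Cmd Msg
askForTime =
    Task.attempt GotTime Time.now
```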
The clarifications on monads and chaining above are great, but I don’t think they answer tomka’s original question, which I would like to know as well: In Elm side effects are achieved via the runtime interacting with the pure update function. How are side effects achieved in other pure functional languages like Haskell?
Every time I search for the answer I mostly get more information about how side effects are represented, when I’m looking for how they are executed. Is it just that all effects in the program are ultimately represented in the single IO monad returned from main?
Thank you for your very detailed and helpful explanations.
As @blaix pointed out, it’s still unclear to me how effects are handled in purely functional languages other than Elm. I understand that in Elm, the runtime environment generated by the compiler and included in the JS output does the job and calls the pure update function of the Elm program, which is also included in the JS file. Although both live in the same JS file, they are logically separated. A task with a side effect is therefore executed by the runtime environment, which is logically not part of the pure Elm program.
Who actually performs the task in other purely functional languages, such as e.g. Haskell? Even if it’s a monad in Haskell, it’s still a task that involves a side effect and therefore can not be executed by the Haskell program itself?
For one, Elm is quite strict about separating “Elm” code and the rest of the system. The boundary for these is a bit more porous in Haskell, since you have a FFI that can directly call C functions.
Conceptually though, you can think of the whole Haskell program as residing inside a massive Task IOError () (in Elm parlance). So instead of having an update function receiving messages, a Haskell program runs by executing this task. That task can be composed of other subtasks via an andThen-like mechanism (though Haskell has some neat syntactic sugar for this):
in Elm you would write:
myFun fileName =
    File.readFile fileName
        |> Task.andThen
            (\contents ->
                File.writeFile (fileName ++ ".output") (countLineLength contents)
            )
        |> Task.andThen
            (\res ->
                case res of
                    Ok _ -> Console.put "Success"
                    Err _ -> Console.put "Error"
            )
in (pseudo) Haskell you could write
myFun fileName = do
    contents <- File.readFile fileName
    res <- File.writeFile (fileName ++ ".output") (countLineLength contents)
    case res of
        Ok _ -> Console.put "Success"
        Err _ -> Console.put "Error"
I’m not an expert but maybe I could add a few interesting bits to this discussion.
Here’s what I currently understand:
IO is a constraint: it’s a “contract” you should abide by, because it allows differentiating pure functions from impure functions (as a reminder, a pure function will always give you back the same result for a given input; an impure function may not). As far as I understand, IO represents the possibility of any possible computation, so if a function returns an IO of something, “anything can happen” basically.
So the deal is that you are allowed to run functions requiring an IO context only if you are already in an IO “context”. It’s a little bit similar (but much more constraining) to the mechanism of a JavaScript promise: you can only await an async function if you are already in an async “context”.
But we can still break this “contract” for debugging purposes because it’s a convention, like we do by logging inside pure functions in Elm with Debug.log.
So, we could in theory print to the console (the simplest example of a side effect) in Haskell without this requirement, like we do in Python, but we’d lose all the typical guarantees you’d expect from your FP program, so we don’t do that (and thus we conform to this designed constraint).
It’s a bit difficult to demonstrate what’s going on in Haskell when printing because the IO setup is quite involved, but we could look at its cousin PureScript, another purely functional language.
Here, to log something to the console, it is required by the function signature that we are in an Effect context (it’s similar to IO in Haskell)
With regards to Elm, the TEA pattern is a stronger form of constraint, whereas IO is still a constraint but much “looser”.
I can recommend the book Get Programming with Haskell if you want to learn more. I found it approachable and very much enjoyed reading it (doing the exercises is essential).
‘do notation’ is fancy syntactic sugar for repeatedly applying the ‘bind’ (andThen) function in a way that makes it look like a pure function returning a value for whatever monad context it’s in. You can see this in the structural difference between the two examples you quoted: the Elm version has to repeatedly capture output in a lambda, while Haskell can assign it using the back arrow “<-”.
So the interesting part is what ‘bind’ actually does under the hood - and this is defined differently for each monad (similar to how we have different Maybe.andThen vs. Task.andThen).
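For example, this is essentially how Maybe.andThen is defined in elm/core; each monad supplies its own version of this plumbing (Task.andThen instead has to thread a pending effect through):

```elm
andThen : (a -> Maybe b) -> Maybe a -> Maybe b
andThen callback maybe =
    case maybe of
        Just value ->
            callback value

        Nothing ->
            Nothing
```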
In the IO case, I believe Haskell’s laziness means that the operation is stored in a “thunk” within the context (as though it’s a partially applied function), and evaluation is started whenever it decides the value will be needed.
So depending on the operation, this might be right away, or it might be carried around with other thunks, waiting until the last minute to start.
Contrast this with Elm where we know the task will start at the next TEA loop, or procedural languages where promises will be queued for next available opportunity.
Yes, the book seems very good to me. I read a little bit in the section on IO, and combined with the discussion that happened so far, it clarified quite a few things. Thanks for the recommendation.
I’ll try to summarize my thoughts:
Every purely functional language must provide some way to allow effects, as the language would otherwise not be practically usable (without IO or state). Purely functional code must be separated from impure code to ensure the language deserves the label “pure.” However, there must be a way to handle impure code.
In Elm, effects are implemented using Commands, which are executed by the runtime. The result is returned to the pure update function through messages. The purely functional code resides within the Elm program, while the impure code is in the runtime environment.
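A minimal hedged sketch of that loop with elm/random (Roll, GotFace and Model are illustrative names): update stays pure and merely describes the effect as a Cmd; the runtime performs it and feeds the result back as a message:

```elm
import Random

type alias Model =
    { face : Int }

type Msg
    = Roll
    | GotFace Int

update : Msg -> Model -> ( Model, Cmd Msg )
update msg model =
    case msg of
        Roll ->
            -- ask the runtime for the number; no side effect happens here
            ( model, Random.generate GotFace (Random.int 1 6) )

        GotFace n ->
            ( { face = n }, Cmd.none )
```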
In Haskell, this separation is achieved through pure functions and, for example, IO Actions like main, which are not pure. Impure code has a special data type that “marks” it.
So you never know when your IO is going to run? Could it even be in a different order to the one in which you andThened your IO operations? I suppose later tasks will likely consume the outputs of prior ones in the order you chain them, so should execute as expected since their outputs will be needed to evaluate the next one.
I often get confused in javascript about the order of flow control when promises get involved. Before I first learned about Elm I tried making a UI with Angular, and found that I quickly got to the point where I had no idea of the sequence in which things would run - AND it mattered. Seemed to matter to much of how Angular worked and it was just… confusing. Honestly, asynchronous code is actually easier to write in a more bare metal multi-threaded language like Java even.
Chaining Tasks in Elm is nice. Currently doing some backend code in Elm and chaining tasks as Procedures with brian-watkins/elm-procedure is just as nice.
This becomes a really interesting question for tasks that are IO () - that is, all side effect and no return value - things like writing to the console and making database changes. If they are only run ‘when needed’, when is that?
Theoretically you do know when it’s going to run - when main() is run, and yes, they could run in unexpected order even if not waiting on return values.
Haskell has various escape hatches (e.g. through annotations, the sequence/seq mechanisms, and the unsafe unsafePerformIO function) to give you some control over when things happen and in what order. Things like database libraries often include some kind of ‘run’ function that uses these so that you can chain up operations and then guarantee when they are started and in what order.
p.s. I’m not a Haskell expert by any means, this is just based on the intuition I built up trying to use it on the backend. If anyone else wants to pipe in with corrections please let me know!
Just like Elm having a “runtime”, Haskell’s compiler (GHC) compiles your program with what’s called a “non-trivial runtime system” (RTS) which takes care of doing a lot of the things that runtimes do (like scheduling, memory allocations, storage etc.). A very hand-wavy approximation is to say this runtime system takes care of actually “running” your Haskell program – by which, I mean that it “runs” your main function which has to be an IO action (not too dissimilar from Task). And in running your main function, any other function that it encounters in the path will also be run, often in non-deterministic ways but following a set of rules for evaluation and how you describe the sequence of IO functions inside a given function/context.
There have been a lot of replies putting parts of the puzzle together, but perhaps things haven’t been completely tied up for you…
As to alternate “pure” functional languages other than Elm, there aren’t very many; the common ones are Haskell and PureScript, both quite a bit more complex than Elm, but both handling side effects in generally the same way. Rather than trying to teach you these languages, I’m going to express the whole way in which they work in terms of Elm code and see if that does it for you…
First, these languages have something like Elm’s Task type that wraps actions which would be side effects if they were performed in the language itself, so they are “performed” by the runtime instead. The difference is that, unlike Elm’s Task wrapper, which can return an indication of failure, Haskell and PureScript assume that failures will be handled by an entirely different mechanism: exceptions. Since that is entirely different, I am going to suppose that they do it the Elm way and that there is an IO e a type in these languages rather than just an IO a type…
So, these IO/Task types represent an action or through the andThen (bind in Haskell/PureScript) mechanism can represent a chain of actions performed in a pre-determined sequence. The functional magic is how this is done, as follows:
There is a flow of a kind of state from action to action from the beginning to end of the chain of actions given to “main”.
In Haskell, this chain of actions passes a kind of state called realWorld, which has its own unique type RealWorld. In Elm it could be defined as type RealWorld = RealWorld, with the only instance of this type defined as realWorld = RealWorld. This type has a single constructor with no payload, so it doesn’t need a memory representation (it doesn’t have one in Haskell; the code generation in PureScript turns it into just a void call; if created in Elm it would have a JavaScript object representation, but that doesn’t really matter), and it is actually a phantom type, meaning it is never actually used.
Now, if we were defining this IO e a Type in Elm as an action passing this phantom state, we might define it as follows:
type IO e a = IO (RealWorld -> ( RealWorld, IOResult e a ))
type IOResult e a = Good a | Bad e
where RealWorld is the phantom state type defined above (Elm requires naming it concretely, since a free type variable on the right-hand side wouldn’t compile).
In order to be something like a monad, we then need the ability to wrap something in the IO e a Type, either to indicate success or failure, so we might define the following:
return : a -> IO e a -- this might better be called "wrap" or "success" rather than "return"
return a = IO (\ s -> (s, Good a))
failure : e -> IO e a
failure e = IO (\ s -> (s, Bad e))
Other than the ability to wrap actions as per the above, monadic actions require the ability to chain/andThen which could be defined as follows:
andThen : (a -> IO e b) -> IO e a -> IO e b
andThen iof (IO sf) =
    IO <| \s0 ->
        let
            ( s1, ior ) = sf s0
        in
        case ior of
            Bad e -> ( s1, Bad e )
            Good a ->
                let
                    (IO iorsltf) = iof a
                in
                iorsltf s1
This will run actions as long as we keep andThen’ing more of them, but when it encounters the first error, it will abort the rest of the chain and return the error Type e for that first error.
Now all we need is the ability to run an action (which may, and very likely will, consist of a chain of actions). Haskell and PureScript would call this perform, and Elm also has a perform for actions that can never fail; since we are including the possibility of errors in our action chains, we’ll call this version attempt. For a language like Haskell, where only one of these is ever supposed to be used, by the main chain, it returns the a type if successful and handles the error type e by special processing. It could be defined as follows:
attempt : IO e a -> (e -> a) -> a
attempt (IO iof) onError =
    let
        ( _, ior ) = iof realWorld
    in
    case ior of
        Bad e -> onError e -- turns the error into a proper output, for instance
        Good a -> a
The error handling could be whatever the “pure” language allows, correcting the error as shown or calling some sort of a panic/error function with a message, etc.
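Putting the pieces together, a usage sketch of this toy IO e a (using only the return, failure, andThen and attempt defined above; unlike real IO, everything here is pure):

```elm
program : IO String Int
program =
    return 2
        |> andThen (\n -> return (n * 21))
        |> andThen
            (\n ->
                if n > 0 then
                    return n

                else
                    failure "not positive"
            )

result : Int
result =
    attempt program (\_ -> 0)

-- result == 42
```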
Now, Haskell (and likely PureScript) allows breaking the purity of the language in certain ways, such as letting one call unsafePerformIO in more places than just the main chain, but it is unsafe because it breaks type purity in certain ways. Elm doesn’t allow that. Instead of returning the final a result directly (usually a unit/() in Haskell/PureScript), as is done above, Elm wraps the whole chained action as a command/Cmd msg, with a Message to be passed back on completion containing the non-error result of the action; the chain is performed by the runtime, which passes the Message back. In this way, Elm allows many calls of the perform/attempt functions with any returned type, yet maintains its purity.
Many Elm programmers don’t use andThen chains as much as they could and instead have a huge chain of Messages to be sorted out by the update function, with any action that needs to be followed by another calling yet another perform/attempt to trigger yet another Message, and so on, until the chain of actions is completed by issuing a Cmd.none command.
There are a few features missing from this implementation, such as the ability to map from one error type to another, to map different result types, etc., but it should be sufficient to understand how side-effect actions are built up and composed…
So there are a few things we can see from this exercise. First, even in Haskell with lazy evaluation, actions always run in the order they were programmed to perform, because they are part of a chain of actions with a “something” passed from action to action. Second, actions passing that “realWorld” something are functional combinators that pass a state from action to action via functions, so they fit completely into the pure functional programming world, similar to Alonzo Church’s definition of Church numerals as combinator compositions of chains of functions.
I hope that helps your understanding of this; I didn’t understand all of this myself until recently even though I have been about an intermediate level Haskell programmer for years, but my present understanding came about as I have been digging into the implementation of the Elm compiler written in Haskell…