I get really frustrated when I ask any community “How would you do X?” and get back responses of the form “Don’t do X.”
So let me first say that I personally like:
- Your accumulating record approach. I do not love the tuple approach. While it is clever and succinct, I really prefer records to tuples on the basis that names are good and nested tuples are noisy and hard for me (personally) to read.
- Named functions which can be defined in a let binding if you want to preserve locality and avoid polluting the module top-level.
Next, I beg your forgiveness as I give an answer of the form “You probably don’t need monadic binds as much as you think, so this may not be that big of a problem.”
Before working professionally in a language with do notation, I was one of the people that silently (or maybe not silently) wished Elm offered it. I have come to deeply dislike do notation after spending:
- 6 months writing Scala 30% of the time
- 18 months writing Haskell 20% of the time (Elm the other 80%)
- 24 months writing PureScript 95% of the time and Haskell 5%.
do notation was abused more often than not at the three companies where I worked. Other engineers with far more professional Haskell / PureScript experience likely have a more mature perspective, but I think that my grievances speak directly to your question.
andThen, or the monadic bind, in my opinion should only be used to express that one computation depends on the results of a prior computation. If there is NO dependency then, while do can be used for convenience, it is not truthful / declarative.
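To make the distinction concrete, here is a minimal Haskell sketch using Maybe. The helper safeDiv and both function names are made up for illustration; the only library call is readMaybe from Text.Read.

```haskell
import Text.Read (readMaybe)

-- Hypothetical helper: division that fails on a zero divisor.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv n d = Just (n `div` d)

-- Genuine dependency: safeDiv cannot run until the parse has produced
-- its number, so a bind (>>=, i.e. andThen) is the honest tool.
parseThenDivide :: String -> Maybe Int
parseThenDivide input = readMaybe input >>= safeDiv 100

-- No dependency: the two parses know nothing about each other, so an
-- applicative combination says exactly that and nothing more.
parseBoth :: String -> String -> Maybe (Int, Int)
parseBoth a b = (,) <$> readMaybe a <*> readMaybe b
```

In the first function the second step needs the value from the first. In the second function neither step needs the other, so reaching for a bind there would claim a dependency that does not exist.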
In surveying over 300K lines of PureScript / Haskell code across 4 large applications, I believe the most genuinely needed monadic binds I ever saw in any properly sized function was 4, and the average was 1!!! In other cases,
- No Monad was even necessary because it collapsed to identity!!!
- Functor was the real operation.
- Applicative was the real operation.
- The engineer did not understand the State monad or the State monad transformer.
- The function was too large.
I have some PureScript/Haskell pseudocode below. I intentionally use parentheses rather than $ (which is like <|) and # (which is like |>), and use map rather than <$> and <#>, to try to make it a little more readable for Elm developers. Apologies for the mishmash of code style (for anyone looking to potentially hire me in the future as a Haskell/PureScript engineer: this is not how I write Haskell/PureScript).
1. No Monadic Bind Needed
noWorkDone = do
  x <- someResult
  pure x

-- the above is basically like the following in Elm
noWorkDone =
    someResult |> Result.andThen Ok

-- so... um... this doesn't do anything! It unwraps the result and then re-wraps it
-- without doing any work
noWorkDone = someResult
2. Functor Masquerading as Monad
actuallyMap = do
  x <- someResult
  pure $ f x

-- in Elm this mistake would be written
actuallyMap =
    someResult |> Result.andThen (\r -> Ok (f r))

-- which can be simplified to just map
actuallyMap =
    Result.map f someResult
3. Applicative Masquerading as Monad
actuallyApplicative = do
  a <- aResult
  b <- bResult
  c <- cResult
  pure { a, b, c }

-- the above can be simplified in PureScript
actuallyApplicative =
  lift3 (\a b c -> { a, b, c }) aResult bResult cResult

-- or in Elm
actuallyApplicative =
    Result.map3 (\a b c -> { a = a, b = b, c = c }) aResult bResult cResult
3.a Combinations of Functor and Applicative
The function below looks like it is using 3 monadic binds. But it is really only using one monadic bind.
mostlyApplicative = do
  a <- aResult
  b <- bResult
  c <- f a b
  pure { a, b, c }

-- re-written in PureScript
mostlyApplicative = do
  Tuple a b <- lift2 Tuple aResult bResult
  map (\c -> { a, b, c }) (f a b)

-- in Elm... this is messy and should be re-framed but I am just trying to make
-- a point of showing only one real `andThen` needed
mostlyApplicative =
    Result.map2 Tuple.pair aResult bResult
        |> Result.andThen (\( a, b ) -> Result.map (\c -> { a = a, b = b, c = c }) (f a b))
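For anyone who wants to run this rather than squint at pseudocode, here is a hedged Haskell sketch of the same idea using Either. The Summary record, the division-based f, and the names withDo and withOneBind are all invented for illustration; liftA2 comes from Control.Applicative.

```haskell
import Control.Applicative (liftA2)

-- Hypothetical record standing in for { a, b, c }.
data Summary = Summary { a :: Int, b :: Int, c :: Int }
  deriving (Eq, Show)

-- Hypothetical dependent step: fails when the divisor is zero.
f :: Int -> Int -> Either String Int
f _ 0 = Left "division by zero"
f x y = Right (x `div` y)

-- Three arrows in do notation ...
withDo :: Either String Int -> Either String Int -> Either String Summary
withDo aResult bResult = do
  x <- aResult
  y <- bResult
  z <- f x y
  pure (Summary x y z)

-- ... but only one real bind: the independent inputs are paired
-- applicatively, and only f actually needs their values.
withOneBind :: Either String Int -> Either String Int -> Either String Summary
withOneBind aResult bResult =
  liftA2 (,) aResult bResult
    >>= \(x, y) -> fmap (Summary x y) (f x y)
```

Both functions should behave identically for Either; the second just makes it visible that f is the only step that depends on earlier results.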
4. Misunderstanding State
mostlyState = do
  textFromUser <- readLine
  currentState <- State.get
  let newText = currentState.accumulatingText <> textFromUser
  State.put (currentState { accumulatingText = newText })

-- the code above performs a get only to then call put which makes the update appear
-- effectful/monadic when the state update is really pretty pure. There is only one
-- monadic bind.
mostlyState = do
  textFromUser <- readLine
  State.modify (\state -> state { accumulatingText = state.accumulatingText <> textFromUser })
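A runnable approximation of the two versions, as a rough sketch only: it assumes the transformers package (Control.Monad.Trans.State.Strict), uses a made-up Editor record, and takes the user's text as an argument instead of reading a line so the example stays pure.

```haskell
import Control.Monad.Trans.State.Strict (State, execState, get, modify, put)

-- Hypothetical state record for illustration.
newtype Editor = Editor { accumulatingText :: String }
  deriving (Eq, Show)

-- get followed by put: reads like two effectful steps.
appendWithGetPut :: String -> State Editor ()
appendWithGetPut textFromUser = do
  currentState <- get
  put currentState { accumulatingText = accumulatingText currentState <> textFromUser }

-- modify: the same update, stated as one pure function over the state.
appendWithModify :: String -> State Editor ()
appendWithModify textFromUser =
  modify (\s -> s { accumulatingText = accumulatingText s <> textFromUser })
```

Running either with execState against the same starting Editor should give the same final state, which is the point: the get/put pair adds ceremony, not behavior.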
Now to Your Sample
I cannot be certain from your code example of the exact semantics and intention of the code. I also cannot tell which Parser library you are using, so I cannot make a recommendation for what you should do. However, making some serious assumptions, I might try to “stay inside of the Parser monad” (rather than breaking out of the monad by running). I might then write this with a single Parser.andThen because, the way I have written this, there is only one data dependency: closeTagParser needs to know the open tag's name to validate an appropriate close.
import Parser exposing ((|.), (|=), Parser)
import Parser.Extras as Extras

type alias Tag =
    { openTag : OpenTag, contents : List Content }

type Content
    = TaggedContent Tag
    | TextContent String

type alias OpenTag =
    { name : String, attributes : List ( String, String ) }

openTagParser : Parser OpenTag
openTagParser =
    Debug.todo "Not implemented"

-- Parse a close tag, failing if the tag's name does not match the open tag
closeTagParser : OpenTag -> Parser ()
closeTagParser { name } =
    Parser.symbol "</"
        |. Parser.token name
        |. Parser.symbol ">"

contentParser : Parser Content
contentParser =
    Debug.todo "Not implemented"

tagParser : Parser Tag
tagParser =
    openTagParser
        |> Parser.andThen
            (\openTag ->
                Parser.succeed (\contents -> { openTag = openTag, contents = contents })
                    |= Extras.many contentParser
                    |. closeTagParser openTag
            )
So I think that if you only use andThen when there is a genuine data dependency, and if you keep your functions “right-sized”, then you will avoid most problematic nesting. That said:
- right-sized is an obnoxious value judgment.
- What constitutes problematic nesting is a matter of taste. I actually didn’t think your original example looked bad so…
- There are obviously going to be cases where this is not true and where there are genuinely five or more monadic binds in a function. I suspect that you will be happy with your suggested approach of:
  - Decomposing with named functions (and defining them in the let of a function to prevent module top-level scope pollution).
  - Using your record approach to accumulate interim results.