Quite some time ago, back in March 2022, there was this discussion on the Elm slack:
@SiriusStarr
am I missing something or is there nolookAheadforParser
@megapctr
what would it do?
@SiriusStarr
IfpinlookAhead psucceeds (either consuming input or not) the whole parser behaves likepsucceeded without consuming anything (parser state is not updated as well). Ifpfails,lookAheadhas no effect, i.e. it will fail consuming input ifpfails consuming input.
@SiriusStarr
Checks if a parser succeeds without actually consuming the input
While the discussion continued, no concrete solution emerged.
The problem interested me and I had the time to explore it. After many trials and errors I eventually came up with the following code (published in this Ellie):
import Parser as P
lookAhead : P.Parser a -> P.Parser ()
lookAhead parser =
P.oneOf
[ P.oneOf
[ parser
|> P.backtrackable
|> P.andThen (\_ -> P.commit ())
|> P.andThen (\_ -> P.problem "")
, P.succeed
(parser
|> P.backtrackable
|> P.map (\_ -> ())
)
]
|> P.backtrackable
, P.succeed
(P.succeed ())
]
|> P.andThen identity
As you can easily see, this function solves the given task.
![]()
No, kidding aside. The code works, but it has a couple of flaws, not least the poor readability. In the meantime I found a much better way to implement a lookAhead parser. I wrote about it in this Discourse post: How to build interesting parsers.
However, apparently my old code is still being used, and from time to time, someone asks how it works.
So, if you just want to use a lookAhead parser, then I suggest to use the one I talked about in the linked Discourse post. I published it in my pithub/elm-parser-extra package.
But if you want to understand the code shown above, keep on reading.
Introduction
In the following text, I assume a certain basic knowledge of Elm parsers. You should understand the parser combinators andThen and oneOf and know what it means for a parser to be backtrackable or not. If you donβt understand what Iβm writing here, feel free to ask in the comments.
Iβll explain the code line by line, roughly in the order in which Elm would execute it: from top to bottom and from inside to outside. In each section, I first show the Elm code with line numbers and then the results of running the code, line by line:
Code:
1 {some parser}
2 {more}
Results:
# β Code β β
ββββΌβββββββββββββββββββββββββββββββββββββΌββββββββββββββββββββββββββββββββββββ€
1 β {some parser} β {some parser results} β
2 β {more} β {more results} β
There are two use cases for the lookAhead parser: if the given parser succeeds, the lookAhead parser should succeed, too, but without consuming any input, and if the given parser fails, the lookAhead parser should fail, too. Iβll name these two use cases βSuccessβ and βFailureβ and show the results of each parser for those two use cases:
Code:
1 {some parser}
2 {more}
Results:
# β Code β Success β Failure β
ββββΌβββββββββββββββββββββββββββββββΌβββββββββββββββββββββΌβββββββββββββββββββββ€
1 β {some parser} β {success result 1} β {failure result 1} β
2 β {more} β {success result 2} β {failure result 2} β
Running a parser can either succeed or fail. Internally, those two states are represented in the elm/parser package by this type:
type PStep context problem value
= Good Bool value (State context)
| Bad Bool (Bag context problem)
Iβll show the results of running the parsers in a similar, but slightly shorter way:
- if the parser succeeds:
Good Bool value - if the parser fails:
Bad Bool problem
For the Bools Iβll use True if the corresponding parser is backtrackable and False if it isnβt.
Side note:
This is the opposite of whatelm/parseruses internally, but I find it easier to understand it this way.
I hope the notation becomes clear when you see it in actionβ¦
Letβs go
We start with the only one thing we have in the beginning: the parser given to the lookAhead function:
Code:
1 parser
Results:
# β Code β Success β Failure β
ββββΌβββββββββββββββββββββββββββββββββββββΌβββββββββββββββββΌβββββββββββββββββββ€
1 β parser β Good ? a β Bad ? e β
This means that running the code in line 1 (in this case the parser given as an argument to the lookAhead function) yields the following results:
-
In the βSuccessβ use case:
Good: in the βSuccessβ use case, the parser in line 1 succeeds (this is exactly how the βSuccessβ use case is defined)?: we donβt know whether the given parser is backtrackable or nota: the parser is of typeP.Parser a, so I use βaβ to represent the parsed value
-
In the βFailureβ use case:
Bad: in the βFailureβ use case, the parser in line 1 fails (this is exactly how the βFailureβ use case is defined)?: we donβt know whether the given parser is backtrackable or note: I use βeβ to represent the error that the given parser produces in the βFailureβ use case
Two Kinds of Bad
In the βSuccessβ use case, the parser might have consumed part of the input. The only way to undo that is to turn success into failure. We can do this with P.andThen and P.problem:
Code:
1 parser
2 |> P.andThen (\_ -> P.problem "")
Results:
# β Code β Success β Failure β
ββββΌβββββββββββββββββββββββββββββββββββββΌβββββββββββββββββΌβββββββββββββββββββ€
1 β parser β Good ? a β Bad ? e β
2 β |> P.andThen (\_ -> P.problem "") β Bad ? "" β Bad ? e β
OK, now we have a parser which doesnβt consume anything, but we get a Bad result for both use cases, and we need to be able to distinguish between them. How can we do that?
![]()
Idea 1:
Could we just use a special error text?
No. The behavior of a parser doesnβt depend on the error value.
Idea 2:
Could we make one backtrackable and the other not?
Yes, this could work, so letβs try itβ¦
First, we need a defined state for the backtrackable flag. We donβt know whether the given parser is backtrackable or not, so letβs make it definitely backtrackable first:
Code:
1 parser
2 |> P.backtrackable
Results:
# β Code β Success β Failure β
ββββΌβββββββββββββββββββββββββββββββββββββΌβββββββββββββββββΌβββββββββββββββββββ€
1 β parser β Good ? a β Bad ? e β
2 β |> P.backtrackable β Good True a β Bad True e β
Now we can modify the βSuccessβ use case by making it (and only it) not backtrackable with P.andThen and P.commit, and then turn success into failure as we did above:
Code:
1 parser
2 |> P.backtrackable
3 |> P.andThen (\_ -> P.commit ())
4 |> P.andThen (\_ -> P.problem "")
Results:
# β Code β Success β Failure β
ββββΌβββββββββββββββββββββββββββββββββββββΌβββββββββββββββββΌβββββββββββββββββββ€
1 β parser β Good ? a β Bad ? e β
2 β |> P.backtrackable β Good True a β Bad True e β
3 β |> P.andThen (\_ -> P.commit ()) β Good False () β Bad True e β
4 β |> P.andThen (\_ -> P.problem "") β Bad False "" β Bad True e β
Fine. We accomplished the first goal: to undo the input consumption in both use cases. Plus, we are still able to distinguish between them.
![]()
But is that really true?
Tell Bad From Bad
Letβs see how can we turn the two kinds of Bad into different results. For now, it doesnβt matter what these results are exactly, so Iβll leave the details open and just use
{result S}for the result in the βSuccessβ use case{parser S}for the parser used to generate theresult Svalue{result F}for the result in the βFailureβ use case{parser F}for the parser used to generate theresult Fvalue
We can change any backtrackable Bad using P.oneOf. The backtrackable Bad is the Bad of the βFailureβ use case, so we use {parser F} in P.oneOf:
Code:
1 P.oneOf
2 [ parser
3 |> P.backtrackable
4 |> P.andThen (\_ -> P.commit ())
5 |> P.andThen (\_ -> P.problem "")
6 , {parser F}
7 ]
Results:
# β Code β Success β Failure β
ββββΌβββββββββββββββββββββββββββββββββββββΌβββββββββββββββββΌβββββββββββββββββββ€
2 β parser β Good ? a β Bad ? e β
3 β |> P.backtrackable β Good True a β Bad True e β
4 β |> P.andThen (\_ -> P.commit ()) β Good False () β Bad True e β
5 β |> P.andThen (\_ -> P.problem "") β Bad False "" β Bad True e β
β β β β
6 β {parser F} β β {result F} β
β β β β
1 β P.oneOf [ {2-5}, {6} ] β Bad False "" β {result F} β
I hope the notation is clear:
-
line 6 is only relevant in the βFailureβ use case, so I left the βSuccessβ column empty
-
P.oneOfin line 1 combines the parsers of lines 2-5 and of line 6. I wrote this in the βcodeβ column as β{2-5}β and β{6}β respectively.
Now to the βSuccessβ use case. Weβll use another P.oneOf, but before we can do this we have to make the Bad of the βSuccessβ use case backtrackable again:
Code:
1 P.oneOf
2 [ P.oneOf
3 [ parser
4 |> P.backtrackable
5 |> P.andThen (\_ -> P.commit ())
6 |> P.andThen (\_ -> P.problem "")
7 , {parser F}
8 ]
9 |> P.backtrackable
10 , {parser S}
11 ]
Results:
# β Code β Success β Failure β
ββββΌβββββββββββββββββββββββββββββββββββββΌβββββββββββββββββΌβββββββββββββββββββ€
3 β parser β Good ? a β Bad ? e β
4 β |> P.backtrackable β Good True a β Bad True e β
5 β |> P.andThen (\_ -> P.commit ()) β Good False () β Bad True e β
6 β |> P.andThen (\_ -> P.problem "") β Bad False "" β Bad True e β
β β β β
7 β {parser F} β β {result F} β
β β β β
2 β P.oneOf [ {3-6}, {7} ] β Bad False "" β {result F} β
9 β |> P.backtrackable β Bad True "" β {result F} β
β β β β
10 β {parser S} β {result S} β β
β β β β
1 β P.oneOf [ {2-9}, {10} ] β {result S} β {result F} β
Great. Using two nested P.oneOf, we manage to undo the input consumption in both use cases and still can return different results for them.
What should these results be?
Final Results
Letβs start with the βSuccessβ use case. If the given parser succeeds, the lookAhead parser should succeed, too, so we need a Good result.
I wanted the result to be backtrackable (Good True), because this seems to be appropriate for a parser that is conducting a test (lookAhead) rather than actually parsing something.
For the value of the result, we canβt use the value of the given parser, because we loose it when we turn success into failure with P.problem. Therefore we simply use the βunitβ value ().
So the desired result is Good True (), which can be easily achieved with P.succeed:
Code:
1 P.succeed ()
Results:
# β Code β Success β Failure β
ββββΌβββββββββββββββββββββββββββββββββββββΌβββββββββββββββββΌβββββββββββββββββββ€
1 β P.succeed () β Good True () β β
And now the βFailureβ use case. If the given parser fails, the lookAhead parser should fail, too, so we need a Bad result.
This is simple: we can just use the given parser. In the βFailureβ use case, it returns the result Bad ? e. (As a bonus for reusing the given parser, the lookAhead parser returns the same error βeβ in the βFailureβ use case.)
But we have to make two modifications: first, we want the result to be backtrackable as in the βSuccessβ use case, giving us Bad True e.
Second, we have to match the type of the βSuccessβ use case parser: P.Parser (). The given parser is of type P.Parser a, so we have to P.map it to the desired type:
Code:
1 parser
2 |> P.backtrackable
3 |> P.map (\_ -> ())
Results:
# β Code β Success β Failure β
ββββΌβββββββββββββββββββββββββββββββββββββΌβββββββββββββββββΌβββββββββββββββββββ€
1 β parser β β Bad ? e β
2 β |> P.backtrackable β β Bad True e β
3 β |> P.map (\_ -> ()) β β Bad True e β
(Line 3 changes the type of the parser.)
Good. Now that we know what weβd like to return, letβs put this into the placeholders of the previous code!
All Together Now
Hereβs the code from above with the placeholders replaced by the parsers we just found:
Code:
1 P.oneOf
2 [ P.oneOf
3 [ parser
4 |> P.backtrackable
5 |> P.andThen (\_ -> P.commit ())
6 |> P.andThen (\_ -> P.problem "")
7 , parser
8 |> P.backtrackable
9 |> P.map (\_ -> ())
10 ]
11 |> P.backtrackable
12 , P.succeed ()
13 ]
Lines 7-9 contain the parser for the βFailureβ use case, line 12 the parser for the βSuccessβ use case.
Results:
# β Code β Success β Failure β
ββββΌβββββββββββββββββββββββββββββββββββββΌβββββββββββββββββΌβββββββββββββββββββ€
3 β parser β Good ? a β Bad ? e β
4 β |> P.backtrackable β Good True a β Bad True e β
5 β |> P.andThen (\_ -> P.commit ()) β Good False () β Bad True e β
6 β |> P.andThen (\_ -> P.problem "") β Bad False "" β Bad True e β
β β β β
7 β parser β β Bad ? e β
8 β |> P.backtrackable β β Bad True e β
9 β |> P.map (\_ -> ()) β β Bad True e β
β β β β
2 β P.oneOf [ {3-6}, {7-10} ] β Bad False "" β Bad True e β
11 β |> P.backtrackable β Bad True "" β Bad True e β
β β β β
12 β P.succeed () β Good True () β β
β β β β
1 β P.oneOf [ {2-11}, {12} ] β Good True () β Good True () β
Hmm. Do you see the problem? The βSuccessβ use case works, but the βFailureβ use case doesnβt. It doesnβt yield the desired result.
Why is this happening?
The problem is that the P.oneOf in line 1 acts on every backtrackable Bad result. If you look at the results of line 11, you can see that we have backtrackable Bad results in both use cases, and therefore we get the result of line 12 in both use cases
It wouldnβt help if we chose a not backtrackable Bad as the result of line 9, because line 11 would make it backtrackable again.
![]()
And Then Bool Comes to the Rescue
The only thing we can do is to use a Good result in the βFailureβ use case, which wouldnβt be changed by the P.oneOf in line 1. We then have to turn it into the desired Bad result in a second step.
Changing a Good result into something else can be done with P.andThen. So the idea is to add P.andThen at the end of the code and - depending on the value of the input parameter - return the desired result.
What should we pass to P.andThen? Why not simply pass βTrueβ in the βSuccessβ use case and βFalseβ in the βFailureβ use case? Then we can use an if expression to return the desired result.
Letβs see how this looks:
Code:
1 P.oneOf
2 [ P.oneOf
3 [ parser
4 |> P.backtrackable
5 |> P.andThen (\_ -> P.commit ())
6 |> P.andThen (\_ -> P.problem "")
7 , P.succeed False
8 ]
9 |> P.backtrackable
10 , P.succeed True
11 ]
12 |> P.andThen
13 (\success ->
14 if success then
15 P.succeed ()
16 else
17 parser
18 |> P.backtrackable
19 |> P.map (\_ -> ())
20 )
In line 7, we simply return False. The parser for the final result in the βFailureβ use case is in lines 17-19. Analogously, for the βSuccessβ use case, we return True in line 10, and put the parser for the final result in line 15.
Results:
# β Code β Success β Failure β
ββββΌβββββββββββββββββββββββββββββββββββββΌβββββββββββββββββΌβββββββββββββββββββ€
3 β parser β Good ? a β Bad ? e β
4 β |> P.backtrackable β Good True a β Bad True e β
5 β |> P.andThen (\_ -> P.commit ()) β Good False () β Bad True e β
6 β |> P.andThen (\_ -> P.problem "") β Bad False "" β Bad True e β
β β β β
7 β P.succeed False β β Good True False β
β β β β
2 β P.oneOf [ {3-6}, {7} ] β Bad False "" β Good True False β
9 β |> P.backtrackable β Bad True "" β Good True False β
β β β β
10 β P.succeed True β Good True True β β
β β β β
1 β P.oneOf [ {2-9}, {10} ] β Good True True β Good True False β
β β β β
15 β P.succeed () β Good True () β β
β β β β
17 β parser β β Bad ? e β
18 β |> P.backtrackable β β Bad True e β
19 β |> P.map (\_ -> ()) β β Bad True e β
β β β β
12 β {1-11} |> P.andThen {13-19} β Good True () β Bad True e β
OK, this looks good: the result of the P.oneOf in lines 1-11 (the first step) is Good True True in the βSuccessβ use case and Good True False in the βFailureβ use case. We change this in the P.andThen in lines 12-20 (the second step) into the desired results.
And indeed, the code works! It passes all tests!
Butβ¦ weβre not finished, yet.
Code Golfing
The code works, but it isnβt exactly the code I came up with. If I remember correctly, I never had a version that looked like the current code. So letβs see how we can transform the current code into the final version.
The current code roughly has the following structure:
β Step 1 β Combinator β Step 2 β
ββββββββββββββββββββββββββββββββββΌβββββββββββββββΌβββββββββββββββββββββββββ€
β Parser a -> Parser Bool β >> P.andThen β Bool -> Parser () β
We can interpret the first step as a function from Parser a to Parser Bool. The second step is a function from Bool to Parser (). Both functions are combined with >> P.andThen to finally yield a function from Parser a to Parser ().
Letβs first generalize the input of the first step and the output of the second step:
Parser aβinputParser ()βoutput
β Step 1 β Combinator β Step 2 β
ββββββββββββββββββββββββββββββββββΌβββββββββββββββΌβββββββββββββββββββββββββ€
β input -> Parser Bool β >> P.andThen β Bool -> output β
Now the first step is a function from a general input type to Parser Bool, the second step is a function from Bool to output, so the combination of both steps is a function from input to output.
If we wouldnβt return a Parser Bool but only a Bool in the first step, we could use the forward function composition operator >> alone:
Parser BoolβBool>> P.andThenβ>>
β Step 1 β Combinator β Step 2 β
ββββββββββββββββββββββββββββββββββΌβββββββββββββββΌβββββββββββββββββββββββββ€
β input -> Bool β >> β Bool -> output β
Now the first step is a function which processes the input parameter into a True or a False value. The second step takes this Bool value and returns either a value for the True case or another value for the False case.
Normally I wouldnβt split a function like that into two steps:
(\input -> ..... True ..... False .....)
>>
(\flag -> if flag then trueValue else falseValue)
Iβd rather directly use trueValue instead of True and falseValue instead of False:
\input -> ..... trueValue ..... falseValue .....
Then I wouldnβt need a second step at all:
β Step 1 β Combinator β Step 2 β
ββββββββββββββββββββββββββββββββββΌβββββββββββββββΌβββββββββββββββββββββββββ€
β input -> output β β β
Unfortunately, as we have seen above, we canβt use the falseValue (the parser for the βFailureβ use case) directly. Therefore we had to introduce a second step, and thereβs no way around it. But we can still omit the Bools, even with a two-step implementation:
β Step 1 β Combinator β Step 2 β
ββββββββββββββββββββββββββββββββββΌβββββββββββββββΌβββββββββββββββββββββββββ€
β input -> output β >> β output -> output β
In code:
(\input -> ..... trueValue ..... falseValue .....)
>>
identity
By using trueValue and falseValue directly, we can simply use the identity function as the second step.
Can we do this with our Parser code, too?
Yes we can. We just have to reverse the transformations we did aboveβ¦
First, we re-introduce Parsers, both around the return type of the first step and in the combinator:
outputβParser output>>β>> P.andThen
β Step 1 β Combinator β Step 2 β
ββββββββββββββββββββββββββββββββββΌβββββββββββββββΌβββββββββββββββββββββββββ€
β input -> Parser output β >> P.andThen β output -> output β
Second, we re-introduce the real input and output types:
inputβParser aoutputβParser ()
β Step 1 β Combinator β Step 2 β
ββββββββββββββββββββββββββββββββββΌβββββββββββββββΌβββββββββββββββββββββββββ€
β Parser a -> Parser (Parser ()) β >> P.andThen β Parser () -> Parser () β
(Iβll briefly talk about the weird result type of the first step in the closing remarksβ¦)
This gives us the final version:
Code:
1 P.oneOf
2 [ P.oneOf
3 [ parser
4 |> P.backtrackable
5 |> P.andThen (\_ -> P.commit ())
6 |> P.andThen (\_ -> P.problem "")
7 , P.succeed
8 (parser
9 |> P.backtrackable
10 |> P.map (\_ -> ())
11 )
12 ]
13 |> P.backtrackable
14 , P.succeed
15 (P.succeed ())
16 ]
17 |> P.andThen identity
In lines 8-10, I replaced False with falseValue, which is the parser for the βFailureβ use case. In line 15, I replaced True with trueValue, the parser for the βSuccessβ use case. And in line 17, I replaced the second step with the identity function.
Results:
# β Code β Success β Failure β
ββββΌβββββββββββββββββββββββββββββββββββββΌβββββββββββββββββΌβββββββββββββββββββ€
3 β parser β Good ? a β Bad ? e β
4 β |> P.backtrackable β Good True a β Bad True e β
5 β |> P.andThen (\_ -> P.commit ()) β Good False () β Bad True e β
6 β |> P.andThen (\_ -> P.problem "") β Bad False "" β Bad True e β
β β β β
8 β parser β β Bad ? e β
9 β |> P.backtrackable β β Bad True e β
10 β |> P.map (\_ -> ()) β β Bad True e β
β β β β
7 β P.succeed {8-10} β β Good True {8-10} β
β β β β
2 β P.oneOf [ {3-6}, {7-11} ] β Bad False "" β Good True {8-10} β
13 β |> P.backtrackable β Bad True "" β Good True {8-10} β
β β β β
15 β P.succeed () β Good True () β β
β β β β
14 β P.succeed {15} β Good True {15} β β
β β β β
1 β P.oneOf [ {2-13}, {14-15} ] β Good True {15} β Good True {8-10} β
17 β |> P.andThen identity β Good True () β Bad True e β
This still works, and this is the code I came up with back then ![]()
Itβs no wonder no one understood that codeβ¦ ![]()
Closing Remarks
You might be puzzled by the result type of the first step: Parser (Parser ()). A parser that creates another parser ???
In lines 7 and 14, we simply wrap the parsers for the βFailureβ and βSuccessβ use cases with P.succeed. Therefore, the resulting Parser (Parser ()) doesnβt do any parsing at all. Itβs only use is to yield a Good result in both use cases, as described above.
The second step of the final version is
step2 =
P.andThen identity
I didnβt show the type signature of this function on purpose. You can use the Elm REPL or maybe your IDE to show you the type.
You can use this pattern with every type that has an andThen function, for example with Maybe, Result, or Json.Decode.Decoder. Those types are often called βMonadsβ in functional programming, and the pattern shown is called join.
The combinator used to connect the two steps is
combinator f1 f2 =
f1 >> P.andThen f2
Itβs type signature is also interesting, and again, you can create such a function for other types with an andThen function, too. In Haskell, the function is written as >=> and called βLeft-to-right composition of Kleisli arrowsββ¦ ![]()
Thank you for reading this far. If you have any questions or comments, feel free to leave them below. You can also find me as @Pit on the Elm slack.