Parsers with Error Recovery

rupert · September 23, 2020, 12:02pm

Anyway, I think I now have this working the way I want it to. I have played around with various ideas, and the main insight I have had is that the recovery tactic should not be associated with individual tokens. The aim is not to error-correct individual tokens, but simply to get the parser back to a place where it can continue, whilst taking note of the problem that occurred.

Originally, I had the idea of passing down the error handler on each Parser building block, in a similar way to how inContext works in Parser.Advanced. So I had:

type Parser context problem value
    = Parser
        (RecoveryTactic problem
         ->
            { pa : PA.Parser context problem (Outcome context problem value)
            , onError : RecoveryTactic problem
            }
        )

The problem with this is that the recovery tactic would often be used in the wrong situation. Take parsing a list of integers from input such as:

[1sfd, 2, 3, 4]

That would fail on parsing 1sfd as an Int. But if recovery means skipping ahead to the next comma and then retrying the int parser, that is also going to fail, because what follows the comma is whitespace, not an int. The recovery tactic needs to be put across a larger piece of the parser, which will be retried in its entirety:

(PR.succeed identity
    |> PR.ignore PR.spaces
    |> PR.keep (PR.int ExpectingInt InvalidNumber)
    |> PR.ignore PR.spaces
    |> PR.ignore (PR.symbol "," ExpectingComma)
)
    |> PR.forwardThenRetry [ ',' ] ExpectingComma Recovered

So I was able to get rid of the complicated context passing error handling mechanism, and just have the Parser type like this:

type alias Parser context problem value =
    PA.Parser context problem (Outcome context problem value)

The recovery tactic described in various papers is to first back up to a known start symbol, then scan ahead to a sentinel symbol, and try to continue after that. This is implemented as:

https://github.com/the-sett/parser-recoverable/blob/master/src/Parser/Recoverable.elm#L608

And can be summarised by this pseudo-code:

forwardThenRetry parser =
    loop, starting with empty warning list []
        oneOf [ try backtrackable parser
                    |> ensure any warnings from previous loop iterations are kept
              , scan ahead for a sentinel token
                    |> if no characters consumed then
                           fail
                       else
                           try again
                              |> adding a warning about what was skipped over
              ]
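
For illustration, the pseudo-code could be sketched directly against Parser.Advanced roughly as below. This is not the code in the linked repository: the module name, the Problem constructors (including Skipped), the ( warnings, value ) result pair and the names retryAfterSkipping, scanAhead and item are all assumptions made up for the sketch, and it returns a plain tuple rather than the package's Outcome type.

```elm
module RecoverySketch exposing (Problem(..), item, retryAfterSkipping)

import Parser.Advanced as PA exposing ((|.), (|=))


-- Hypothetical problem type; Skipped records the text that was stepped over.
type Problem
    = ExpectingInt
    | InvalidNumber
    | ExpectingComma
    | Skipped String


type alias Parser a =
    PA.Parser Never Problem a


-- Try `parser`. If it fails, rewind, skip up to and including the next
-- sentinel character, record a warning, and try `parser` again.
retryAfterSkipping : List Char -> Problem -> Parser a -> Parser ( List Problem, a )
retryAfterSkipping sentinels noProgress parser =
    PA.loop []
        (\warnings ->
            PA.oneOf
                [ -- backtrackable, so a failure here lets oneOf fall through
                  PA.backtrackable parser
                    |> PA.map (\value -> PA.Done ( List.reverse warnings, value ))
                , scanAhead sentinels noProgress
                    |> PA.andThen
                        (\skipped ->
                            if String.isEmpty skipped then
                                -- nothing left to skip, so give up
                                PA.problem noProgress

                            else
                                PA.succeed (PA.Loop (Skipped skipped :: warnings))
                        )
                ]
        )


-- Chomp everything before the next sentinel, plus the sentinel itself if present.
scanAhead : List Char -> Problem -> Parser String
scanAhead sentinels noProgress =
    PA.getChompedString
        (PA.succeed ()
            |. PA.chompWhile (\c -> not (List.member c sentinels))
            |. PA.oneOf
                [ PA.chompIf (\c -> List.member c sentinels) noProgress
                , PA.succeed ()
                ]
        )


-- The list-item parser from earlier in the post, written with Parser.Advanced.
item : Parser Int
item =
    PA.succeed identity
        |. PA.spaces
        |= PA.int ExpectingInt InvalidNumber
        |. PA.spaces
        |. PA.symbol (PA.Token "," ExpectingComma)
```

With this sketch, retryAfterSkipping [ ',' ] ExpectingComma item plays the role of the forwardThenRetry call shown earlier. The loop can only continue after at least one character has been skipped, which is what keeps it from spinning forever at the end of the input.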

Some tidying up and documentation and I will put it out as a new package.
