Parsers with Error Recovery

Anyway, I think I have now got this working how I want it to. I have played around with various ideas, and the main insight I have had is that the recovery tactic should not be associated with tokens. The aim is not to error-correct individual tokens, but simply to get the parser back to a place where it can continue, whilst taking note of the problem that did occur.

Originally, I had the idea of passing down the error handler on each Parser building block, in a similar way to how inContext works in Parser.Advanced. So I had:

type Parser context problem value
    = Parser
        (RecoveryTactic problem
         ->
            { pa : PA.Parser context problem (Outcome context problem value)
            , onError : RecoveryTactic problem
            }
        )

The problem with this is that the recovery tactic would often be used in the wrong situation. If parsing a list of integers with a contribution from :cat: say:

[1sfd, 2, 3, 4]

That would fail on parsing 1sfd as an Int, but recovering by skipping ahead to the "," and then re-trying the int parser is also going to fail, because what follows the comma is whitespace, not an int. The recovery tactic needs to be put across a larger piece of the parser, which will be re-tried in its entirety:

(PR.succeed identity
    |> PR.ignore PR.spaces
    |> PR.keep (PR.int ExpectingInt InvalidNumber)
    |> PR.ignore PR.spaces
    |> PR.ignore (PR.symbol "," ExpectingComma)
)
    |> PR.forwardThenRetry [ ',' ] ExpectingComma Recovered

So I was able to get rid of the complicated context passing error handling mechanism, and just have the Parser type like this:

type alias Parser context problem value =
    PA.Parser context problem (Outcome context problem value)
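The Outcome type is not shown here. As a minimal sketch, assuming it distinguishes a clean result, a result that needed recovery, and no result at all, it could look something like the following. The module name, the constructor names and the map helper are illustrative guesses, not taken from the package source:

```elm
module OutcomeSketch exposing (Outcome(..), map)

import Parser.Advanced as PA


{-| Hypothetical shape (an assumption, not the package's definition):
a clean result, a result produced after recovering from some problems,
or no result at all.
-}
type Outcome context problem value
    = Success value
    | Partial (List (PA.DeadEnd context problem)) value
    | Failure (List (PA.DeadEnd context problem))


{-| Transform the value, leaving any recorded problems untouched.
-}
map : (a -> b) -> Outcome context problem a -> Outcome context problem b
map f outcome =
    case outcome of
        Success value ->
            Success (f value)

        Partial problems value ->
            Partial problems (f value)

        Failure problems ->
            Failure problems
```

The point of a shape like this is that the underlying PA.Parser can succeed even when the input was bad, with the problems carried alongside the value rather than aborting the parse.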

The recovery tactic described in various papers is to first back up to a known start symbol, then scan ahead to a sentinel symbol, and try to continue after that. This is implemented as:

https://github.com/the-sett/parser-recoverable/blob/master/src/Parser/Recoverable.elm#L608

And can be summarised by this pseudo-code:

forwardThenRetry parser =
    loop, starting with empty warning list []
        oneOf
            [ try backtrackable parser
                |> ensure any warnings from previous loop iterations are kept
            , scan ahead for a sentinel token
                |> if no characters consumed then
                       fail
                   else
                       try again
                           |> adding a warning about what was skipped over
            ]
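To make the loop concrete, here is a rough sketch written directly against Parser.Advanced rather than the new package. It is not the linked implementation: skipThenRetry, the Problem type and the Skipped constructor are made-up names, it handles a single sentinel character only, it consumes the sentinel while skipping, and it returns the warnings in a plain tuple instead of an Outcome.

```elm
module RecoverySketch exposing (Problem(..), item, skipThenRetry)

import Parser.Advanced as PA exposing ((|.), (|=))


{-| Hypothetical problem type for the list-of-ints example.
-}
type Problem
    = ExpectingInt
    | InvalidNumber
    | ExpectingComma
    | Skipped String


{-| One list item: an int followed by a comma, with optional spaces.
-}
item : PA.Parser c Problem Int
item =
    PA.succeed identity
        |. PA.spaces
        |= PA.int ExpectingInt InvalidNumber
        |. PA.spaces
        |. PA.symbol (PA.Token "," ExpectingComma)


{-| Retry `parser` as a whole, skipping past the sentinel after each failure.
Returns the warnings collected along the way together with the value.
Fails if the parser fails and no sentinel is left in the input.
-}
skipThenRetry : Char -> x -> (String -> x) -> PA.Parser c x a -> PA.Parser c x ( List x, a )
skipThenRetry sentinel noSentinel toWarning parser =
    let
        -- Chomp up to and including the sentinel, so at least one
        -- character is consumed whenever this succeeds.
        skipPastSentinel =
            PA.getChompedString
                (PA.chompWhile (\ch -> ch /= sentinel)
                    |. PA.chompIf (\ch -> ch == sentinel) noSentinel
                )

        step warnings =
            PA.oneOf
                [ -- backtrackable lets oneOf fall through to the
                  -- recovery branch even if parser consumed input.
                  PA.backtrackable parser
                    |> PA.map (\value -> PA.Done ( List.reverse warnings, value ))
                , skipPastSentinel
                    |> PA.map (\skipped -> PA.Loop (toWarning skipped :: warnings))
                ]
    in
    PA.loop [] step
```

With these definitions, PA.run (skipThenRetry ',' ExpectingComma Skipped item) "1sfd, 2," should give Ok ( [ Skipped "1sfd," ], 2 ): the first attempt fails, everything up to and including the first comma is skipped and recorded, and the whole item parser is then re-tried from the whitespace before the 2.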

Some tidying up and documentation and I will put it out as a new package.
