Anyway, I think I have now got this working how I want it to. I have played around with various ideas, and the main insight I have had is that the recovery tactic should not be associated with tokens. The aim is not to error-correct individual tokens, but simply to get the parser back to a place where it can continue, whilst taking note of the problem that occurred.
Originally, I had the idea of passing down the error handler on each Parser building block, in a similar way to how inContext works in Parser.Advanced. So I had:
    type Parser context problem value
        = Parser
            (RecoveryTactic problem
             ->
                { pa : PA.Parser context problem (Outcome context problem value)
                , onError : RecoveryTactic problem
                }
            )
The problem with this is that the recovery tactic would often be applied in the wrong place. Take parsing a list of integers from input such as:

    [1sfd, 2, 3, 4]

That would fail on parsing 1sfd as an Int, but if recovery skips ahead to ',' and then re-tries only the int parser, that is also going to fail, because what follows the comma is whitespace, not an int. The recovery tactic needs to be put across a larger piece of the parser, which will be re-tried in its entirety:
    (PR.succeed identity
        |> PR.ignore PR.spaces
        |> PR.keep (PR.int ExpectingInt InvalidNumber)
        |> PR.ignore PR.spaces
        |> PR.ignore (PR.symbol "," ExpectingComma)
    )
        |> PR.forwardThenRetry [ ',' ] ExpectingComma Recovered
So I was able to get rid of the complicated context-passing error handling mechanism, and just have the Parser type like this:
    type alias Parser context problem value =
        PA.Parser context problem (Outcome context problem value)
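The Outcome type is not spelled out in this post. Purely as an illustration, here is a hypothetical shape it could take, distinguishing a clean parse, a parse that succeeded after recovery, and an outright failure. The constructor names and the describe helper are assumptions made for this sketch and may not match the package.

```elm
module OutcomeSketch exposing (Outcome(..), Parser, describe)

import Parser.Advanced as PA


-- Hypothetical shape, for illustration only; the real definition may differ.
type Outcome context problem value
    = Success value
    | Partial (List (PA.DeadEnd context problem)) value
    | Failure (List (PA.DeadEnd context problem))


-- The simplified parser type from the post: an ordinary Parser.Advanced
-- parser whose result is an Outcome.
type alias Parser context problem value =
    PA.Parser context problem (Outcome context problem value)


-- Hypothetical helper: summarise an outcome for a caller, such as an editor
-- that reports warnings but still uses the recovered value.
describe : Outcome context problem value -> String
describe outcome =
    case outcome of
        Success _ ->
            "parsed cleanly"

        Partial warnings _ ->
            "parsed with " ++ String.fromInt (List.length warnings) ++ " warning(s)"

        Failure errors ->
            "failed with " ++ String.fromInt (List.length errors) ++ " error(s)"
```

The point of the middle case is that a recovered parse still carries a value, so the caller can keep going while reporting what went wrong.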
The recovery tactic described in various papers is to first back up to a known start symbol, then scan ahead to a sentinel symbol, and try to continue after that. This is implemented at:
https://github.com/the-sett/parser-recoverable/blob/master/src/Parser/Recoverable.elm#L608
And can be summarised by this pseudo-code:
    forwardThenRetry parser =
        loop, starting with empty warning list []
            oneOf [ try backtrackable parser
                        |> ensure any warnings from previous loop iterations are kept
                  , scan ahead for a sentinel token
                        |> if no characters consumed then
                               fail
                           else
                               try again
                                   |> adding a warning about what was skipped over
                  ]
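The pseudo-code can be turned into a minimal sketch directly on top of Parser.Advanced. This is not the package's implementation (that is at the link above). The simplified Outcome type, the skipPast helper, the String payload on Recovered, and the argument order are all assumptions made for illustration. It also uses the |. and |= operators from Parser.Advanced rather than PR.ignore and PR.keep.

```elm
module RecoverySketch exposing (Outcome(..), Problem(..), example, forwardThenRetry)

import Parser.Advanced as PA exposing ((|.), (|=))


-- Problem names follow the post; the String payload on Recovered is assumed.
type Problem
    = ExpectingInt
    | InvalidNumber
    | ExpectingComma
    | Recovered String


-- Simplified stand-in for the package's Outcome type (hypothetical).
type Outcome value
    = Success value
    | Partial (List Problem) value


type alias Parser value =
    PA.Parser Never Problem value


-- Loop with a list of warnings. First try the whole parser, backtrackable so
-- that a failure falls through to the second branch. Otherwise skip past a
-- sentinel, record a warning about the skipped text, and go round again.
forwardThenRetry : List Char -> Problem -> (String -> Problem) -> Parser a -> Parser (Outcome a)
forwardThenRetry sentinels noMatch toWarning parser =
    PA.loop []
        (\warnings ->
            PA.oneOf
                [ PA.backtrackable parser
                    |> PA.map (\value -> PA.Done (toOutcome (List.reverse warnings) value))
                , skipPast sentinels noMatch
                    |> PA.map (\skipped -> PA.Loop (toWarning skipped :: warnings))
                ]
        )


-- Chomp up to and including the next sentinel character. Requiring the
-- sentinel means every successful skip consumes at least one character, which
-- plays the role of the "no characters consumed then fail" check.
skipPast : List Char -> Problem -> Parser String
skipPast sentinels noMatch =
    PA.getChompedString
        (PA.succeed ()
            |. PA.chompWhile (\c -> not (List.member c sentinels))
            |. PA.chompIf (\c -> List.member c sentinels) noMatch
        )


toOutcome : List Problem -> a -> Outcome a
toOutcome warnings value =
    if List.isEmpty warnings then
        Success value

    else
        Partial warnings value


-- The "int followed by a comma" group from earlier in the post.
intThenComma : Parser Int
intThenComma =
    PA.succeed identity
        |. PA.spaces
        |= PA.int ExpectingInt InvalidNumber
        |. PA.spaces
        |. PA.symbol (PA.Token "," ExpectingComma)


example : Result (List (PA.DeadEnd Never Problem)) (Outcome Int)
example =
    PA.run (forwardThenRetry [ ',' ] ExpectingComma Recovered intThenComma) "1sfd, 2, 3, 4"
```

On the list input from earlier, the first attempt at the group should fail on 1sfd, the skip should consume everything up to and including the first comma, and the retry should then succeed on the following element, so the result should be a Partial carrying one Recovered warning rather than a hard failure.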
Some tidying up and documentation and I will put it out as a new package.