Parsers with Error Recovery

So I feel like I am now making some good progress with this. I have taken Matt’s TolerantParser as a starting point; the current WIP is here:

https://github.com/the-sett/parser-recoverable

Example

Don’t get too excited, but I created a little example to start trying it out:

cd examples
elm-live src/Main.elm

Parsing Outcomes

I didn’t like that the result of run is a double-wrapped Result: the outer one carries DeadEnds, but the inner one carries only a list of problems.

type alias Parser context problem data =
    Parser.Advanced.Parser context problem (Result (List problem) data)

outcome : Result (List (DeadEnd c x)) (Result (List x) data)
outcome = Parser.Advanced.run someParser someInput

So I changed the inner error representation to be (DeadEnd c x) too. The only problem with that is that Parser.Advanced has no getContext function, so I could not give a context stack. Errors raised inside the extended parser will therefore not have the full context. Perhaps there is a way to do it by carrying around a second copy of the context stack with the parser? I guess I’ll try to fix that when it actually looks like it is needed.

The other thing is that I didn’t think Result was the right output for this parser to give. If the parser does recover successfully it will still yield some AST, but it should also give errors for the bits it had to gloss over to get it. So the outcome should be more like a tuple.

I decided to use a custom type rather than (List (DeadEnd c x), Maybe data), to avoid the case where there are no errors and no data!

{-| Describes the possible outcomes from running a parser.

    - `Success` means that the parsing completed with no syntax errors at all.
    - `Partial` means that the parsing was able to complete by recovering from
    syntax errors. The syntax errors are listed along with the parsed result.
    - `Failure` means that the parsing could not complete, so there is no parsed
    result, only a list of errors.

-}
type Outcome context problem value
    = Success value
    | Partial (List (DeadEnd context problem)) value
    | Failure (List (DeadEnd context problem))

The run function is then:

run : Parser c x a -> String -> Outcome c x a
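As a rough illustration of how calling code might consume an Outcome, here is a minimal sketch. The Outcome type is copied from above so that the module compiles on its own; the helper names toMaybe and toErrors are hypothetical and are not claimed to exist in the library.

```elm
module OutcomeSketch exposing (Outcome(..), toErrors, toMaybe)

import Parser.Advanced exposing (DeadEnd)


{-| Copied from the post so that this sketch is self-contained.
-}
type Outcome context problem value
    = Success value
    | Partial (List (DeadEnd context problem)) value
    | Failure (List (DeadEnd context problem))


{-| Hypothetical helper: the parsed value, if parsing produced one.
-}
toMaybe : Outcome c x a -> Maybe a
toMaybe outcome =
    case outcome of
        Success val ->
            Just val

        Partial _ val ->
            Just val

        Failure _ ->
            Nothing


{-| Hypothetical helper: every error that was reported, possibly none.
-}
toErrors : Outcome c x a -> List (DeadEnd c x)
toErrors outcome =
    case outcome of
        Success _ ->
            []

        Partial errs _ ->
            errs

        Failure errs ->
            errs
```

An editor could feed toMaybe into whatever consumes the AST and show toErrors as diagnostics. Unlike the tuple representation, there is no constructor that holds neither a value nor any errors.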

Recovery Tactics

Similar to the TolerantParser, I defined a set of actions to guide the parser when it fails.

{-| Describes the possible ways the parser should act when it encounters
something that it cannot parse.

    - `Fail` stop parsing and return a `Failure` outcome.
    - `Warn` ignore the error, but add a problem and use a `Partial` outcome.
    - `Ignore` ignore the error and continue with a `Success` outcome.
    - `ChompForMatch` try chomping to find a matching character. If successful
    add a problem but continue with a `Partial` outcome. If this does not work
    then `Fail`.

-}
type RecoveryTactic x
    = Fail
    | Warn x
    | Ignore
    | ChompForMatch (List Char) x

The default behaviour is to Fail.
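To make the ChompForMatch idea concrete, here is a minimal sketch written directly against Parser.Advanced. It is not the library's implementation. The name chompToMatch is hypothetical, and the sketch assumes that the matching character itself is left unconsumed and that reaching the end of input without a match counts as failure.

```elm
module ChompSketch exposing (chompToMatch)

import Parser.Advanced as PA exposing ((|=), Parser)


{-| Hypothetical helper: skip characters until one of `matches` is next in
the input, and return the skipped text. If the input runs out first, fail
with the supplied problem.
-}
chompToMatch : List Char -> x -> Parser c x String
chompToMatch matches prob =
    PA.succeed Tuple.pair
        -- Consume everything that is not one of the matching characters.
        |= PA.getChompedString
            (PA.chompWhile (\ch -> not (List.member ch matches)))
        -- At end of input means no match was found; otherwise one is next.
        |= PA.oneOf
            [ PA.map (always False) (PA.end prob)
            , PA.succeed True
            ]
        |> PA.andThen
            (\( skipped, found ) ->
                if found then
                    PA.succeed skipped

                else
                    PA.problem prob
            )
```

A recovering parser could run something like this after a failure, record the skipped text as a problem, and then carry on with a Partial outcome.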

The current recovery tactic can be attached to a parser with this function:

withRecovery : RecoveryTactic x -> Parser c x a -> Parser c x a

The idea is that this will be passed down to all subsequent parsers (chained with ignore, keep, map, andThen, and so on), and not just for one particular token. So if parsing a List Int, but some of the Ints are expressions, the parser could keep its strategy of chomping to the end of the list or next comma, in the event that it sees a malformed expression. Not totally sure this is the right thing, but it feels right for now.

Feedback to the Editor

This is what I am going to work on next, by evolving the problem type.

When a string gets chomped to recover, I will add the start position of the string and the string itself to the problem. The idea is that an editor can check whether a problem overlaps the current cursor position and, if so, it knows what String to cut out of the source and can offer context-sensitive suggestions to replace it with.
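The overlap check described here could look something like the sketch below. The ChompedProblem record and the overlapsCursor function are hypothetical shapes, not something the library defines. The sketch assumes 1-based row and column positions as reported by elm/parser, and that the chomped text sits on a single line.

```elm
module EditorFeedbackSketch exposing (ChompedProblem, overlapsCursor)


{-| Hypothetical problem payload: where the chomped text started and what
it was.
-}
type alias ChompedProblem =
    { row : Int
    , col : Int
    , chomped : String
    }


{-| True when the cursor sits on or directly after the chomped text.
Assumes the chomped text does not span more than one line.
-}
overlapsCursor : { row : Int, col : Int } -> ChompedProblem -> Bool
overlapsCursor cursor prob =
    (cursor.row == prob.row)
        && (cursor.col >= prob.col)
        && (cursor.col <= prob.col + String.length prob.chomped)
```

Chomped text that spans several lines would need an end position as well, or the check could be done on offsets instead of rows and columns.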