Parsers with Error Recovery

I’ve been trying to collect my thoughts on this, feel free to chip in if anything chimes for you!

I think this is good advice, as recovery may need to understand the context of an error in order to recover successfully. If error reporting is already precise and gives good context, the code will likely be in a good place to implement recoveries. As I figure out the recovery strategies, I may need to revisit the context and adapt it to that purpose.

High Level Approach

Parsing is quite hard, and so are good error reporting and error recovery. It seems prudent not to tackle all this complexity in a single pass. Fortunately, the way elm/parser is structured supports this well, as does Matt’s TolerantParser, since Parser is simpler than Parser.Advanced, which in turn is simpler than TolerantParser. So…

  1. Write a Parser that accepts the input; it does not matter if the error reporting is not so great. Just get the shape of the parser right.

  2. Switch up to Parser.Advanced, using type alias Parser a = Parser.Advanced.Parser Never Parser.Problem a to get started. Then start designing the Context, paying particular attention to areas where a recovery could be possible.

  3. Look at error recovery.
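
As a rough illustration of where step 2 might end up, here is a minimal sketch of a Parser.Advanced parser with custom context and problem types. The module name, the Context and Problem constructors and the toy "let name = 123" grammar are all made up for this example; only the Parser.Advanced functions themselves come from elm/parser.

```elm
module Sketch exposing (Context(..), Parser, Problem(..), declaration)

import Parser.Advanced as PA exposing ((|.), (|=))
import Set


-- Hypothetical context type: one constructor per region of the grammar
-- where a recovery might later be attempted.
type Context
    = InDeclaration


-- Hypothetical problem type, replacing Parser.Problem.
type Problem
    = ExpectingLet
    | ExpectingName
    | ExpectingEquals
    | ExpectingInt
    | InvalidInt


type alias Parser a =
    PA.Parser Context Problem a


-- Parses input such as "let x = 42" into ( "x", 42 ).
-- Any dead end raised inside carries InDeclaration on its context stack.
declaration : Parser ( String, Int )
declaration =
    PA.inContext InDeclaration <|
        (PA.succeed Tuple.pair
            |. PA.keyword (PA.Token "let" ExpectingLet)
            |. PA.spaces
            |= PA.variable
                { start = Char.isLower
                , inner = Char.isAlphaNum
                , reserved = Set.fromList [ "let" ]
                , expecting = ExpectingName
                }
            |. PA.spaces
            |. PA.symbol (PA.Token "=" ExpectingEquals)
            |. PA.spaces
            |= PA.int ExpectingInt InvalidInt
        )
```

Starting with Never and Parser.Problem, as in step 2, keeps the switch mechanical; the custom types can then be introduced one region at a time.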

Chunking, Error positions and Global Context

Since I chunked ahead of using a Parser, rather than trying to parse everything in a single pass and recover to the start of chunks, each Parser will not be starting from the true line 1. So I at least need to pass the start line around to add to the row.

I could do this at the end, if I am only marking positions in error, by post-processing the list of DeadEnds.** In my case, I want to record source positions all through the AST on successful parses too, so that errors in later stages of the compiler pipeline can also tie back to the source accurately. (I could also post-process the AST to add in the start line, but why make an extra pass if it’s not really needed?)

At first I was trying to use Parser.Advanced.inContext to carry this global context around. Then I realised there is no getContext function, so how do I get the info back deeper in the parser? So I now think of the Parser context as a local context, and a separate global context can just be passed around as function parameters:

type alias TopLevelContext = 
    { startLine : Int
    , ...
    }

someParser : TopLevelContext -> Parser AST
someParser global = 
    ...
    |= someChildParser global

someChildParser : TopLevelContext -> Parser AST
...
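
One way the start line could be folded into AST positions during parsing is sketched below. The Located record and the located helper are hypothetical names invented for this example, TopLevelContext is trimmed down to the one field that matters here, and startLine is assumed to be the 1-based line of the original file on which the chunk begins.

```elm
module Located exposing (Located, TopLevelContext, located)

import Parser.Advanced as PA exposing ((|=))


-- Trimmed-down version of the global context, for illustration only.
type alias TopLevelContext =
    { startLine : Int }


-- Hypothetical wrapper that attaches a source position to any parsed value.
type alias Located a =
    { row : Int, col : Int, value : a }


-- Runs the given parser and records where it started, shifting the
-- chunk-relative row so that it refers to the original file.
-- Assumes both the parser rows and startLine are 1-based.
located : TopLevelContext -> PA.Parser c x a -> PA.Parser c x (Located a)
located global parser =
    PA.succeed
        (\( row, col ) value ->
            { row = row + global.startLine - 1
            , col = col
            , value = value
            }
        )
        |= PA.getPosition
        |= parser
```

Wrapping a child parser as located global someExpression would then yield a value whose row already points at the right line of the full source.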

** Actually, I just realized I have to do it at the end for errors, since there is no mapDeadEnds function that I could use to modify the line numbers on the fly. No big deal, at least I can insert the correct line numbers into the successful AST during parsing.
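
The post-processing of DeadEnds could look roughly like the sketch below. The module and function names are made up for illustration, and startLine is assumed to be the 1-based line of the original file on which the chunk begins. The rows stored in the context stack are shifted as well, since they are chunk-relative too.

```elm
module Offsets exposing (offsetDeadEnds)

import Parser.Advanced as PA


-- Shift every dead end (and every frame on its context stack) from
-- chunk-relative rows to rows in the original file.
offsetDeadEnds : Int -> List (PA.DeadEnd c x) -> List (PA.DeadEnd c x)
offsetDeadEnds startLine deadEnds =
    List.map (offsetDeadEnd startLine) deadEnds


offsetDeadEnd : Int -> PA.DeadEnd c x -> PA.DeadEnd c x
offsetDeadEnd startLine deadEnd =
    let
        -- Both values are assumed 1-based, hence the "- 1".
        shift row =
            row + startLine - 1
    in
    { deadEnd
        | row = shift deadEnd.row
        , contextStack =
            List.map
                (\frame -> { frame | row = shift frame.row })
                deadEnd.contextStack
    }
```

It could be applied to the result of running a chunk parser with something like PA.run chunkParser chunkText |> Result.mapError (offsetDeadEnds global.startLine), where chunkParser and chunkText stand in for whatever the real pipeline uses.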