Error-tolerant Elm parser (for editor tooling)

dpren · April 5, 2018, 12:07am

The problem of coming up with a high-quality, standard way for editor tools to query source code is tricky stuff that takes time to get right. This post is about just a piece of that puzzle.

One main piece of info that IDE tools need is a project-wide map of module info (exports, types, locations, etc.). I’m aware this is something that’s still being worked out.

But an area I don’t think anyone is working on, and that I think the community could greatly benefit from, is a smaller, specialized parser for dealing with incomplete or invalid syntax as a user is typing. The idea is that, unlike the compiler, which bails on errors, this parser would try to recover from errors and keep filling out as much of the AST as possible.

It would enable editor features like:

completions, usages, and type info for locally scoped definitions
“as you type” linting that can reason about incomplete code, making errors more relevant and less distracting
allowing elm-format to still work in the presence of errors
type-directed autocomplete

I imagine this tool operating in a fine-grained manner on single declarations as they change in an open file. It could work alongside other tools that produce more coarse structured output.
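To make the declaration-level idea concrete, here is a minimal sketch in Python (chosen only for illustration, not Elm or Haskell). It assumes that any line starting in column 0 begins a new top-level declaration, and it uses a deliberately naive stand-in for a real parser. All names (`Declaration`, `split_declarations`, `parse_declaration`, `parse_file`) are hypothetical. The point is only that an error in one declaration stays local and the rest of the file still yields results.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Declaration:
    start_line: int        # 1-based line where the declaration starts
    source: str            # raw text of the declaration
    name: Optional[str]    # name recovered from the head, if any
    error: Optional[str]   # error message, or None if the toy parser accepted it


def split_declarations(source: str) -> List[Tuple[int, str]]:
    """Split on lines that start in column 0 (assumed synchronization point)."""
    chunks = []
    for number, line in enumerate(source.splitlines(), start=1):
        if line and not line[0].isspace():
            chunks.append((number, [line]))   # column-0 text starts a new chunk
        elif chunks:
            chunks[-1][1].append(line)        # continuation of the current chunk
    return [(start, "\n".join(lines).rstrip()) for start, lines in chunks]


def parse_declaration(start_line: int, text: str) -> Declaration:
    """Toy parser: 'name args = body' with balanced parentheses in the body."""
    head, sep, body = text.partition("=")
    words = head.split()
    name = words[0] if words else None
    if not sep:
        return Declaration(start_line, text, name, "expected '='")
    if not body.strip():
        return Declaration(start_line, text, name, "missing body")
    if body.count("(") != body.count(")"):
        return Declaration(start_line, text, name, "unbalanced parentheses")
    return Declaration(start_line, text, name, None)


def parse_file(source: str) -> List[Declaration]:
    """Parse every declaration independently so one error cannot hide the rest."""
    return [parse_declaration(s, t) for s, t in split_declarations(source)]
```

In an editor, only the chunk under the cursor would need to be re-parsed on each keystroke, while the results for untouched declarations could be reused.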

I’m interested in taking on a project like this. It would require me to learn more advanced parsing techniques, but I’m up for the challenge.

Before I go further, I’d like to hear what others think of this and how it fits in with the larger picture.

zoul · April 5, 2018, 6:01am

FWIW, here’s a great intro to error-tolerant parsing for readers’ context (by Swift’s Joe Groff).

klazuka · April 15, 2018, 3:03pm

I am the author of an Elm language plugin for IntelliJ, and I recently worked on parse error recovery. You might want to look at that for an example, although I must admit that there was some trial-and-error on my part, and the grammar is not as clean as I would like it to be.

My parser was built using the GrammarKit parser generator. GrammarKit is commonly used for custom language support in the IntelliJ (WebStorm) family of IDEs. One unique thing about IntelliJ is that the editor continually parses the text buffer into an AST. Most of the IDE features operate on this AST, and since the input is being actively edited, parse error recovery has to be very good.

GrammarKit’s equivalent concepts for the synchronization points discussed in Joe Groff’s blog post would be pin and recoverWhile. Message me on Slack if you want to talk about it any further.
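For readers unfamiliar with those two concepts, here is a rough Python sketch of the general idea. It is not GrammarKit's syntax or API; the toy grammar (`IDENT '=' NUMBER` statements separated by `;`) and every function name are assumptions made for illustration. "Pinning" means that once the parser has matched up to a chosen token (here `=`), it commits to the rule and emits a node even if the rest is missing. "Recovering while" means skipping tokens after the rule for as long as a predicate holds, so parsing resumes at a known boundary.

```python
def parse_statement(tokens, pos):
    """Rule: IDENT '=' NUMBER, pinned on '='. Returns (node or None, new_pos)."""
    start = pos
    node = {"name": None, "value": None, "error": None}
    if pos < len(tokens) and tokens[pos].isidentifier():
        node["name"] = tokens[pos]
        pos += 1
    else:
        return None, start            # pin not reached: fail without a node
    if pos < len(tokens) and tokens[pos] == "=":
        pos += 1                      # pin reached: the rule is now committed
    else:
        return None, start
    if pos < len(tokens) and tokens[pos].isdigit():
        node["value"] = int(tokens[pos])
        pos += 1
    else:
        node["error"] = "expected number after '='"   # partial node is kept
    return node, pos


def recover_while(tokens, pos, predicate):
    """Skip tokens for as long as the predicate holds."""
    while pos < len(tokens) and predicate(tokens[pos]):
        pos += 1
    return pos


def parse_program(tokens):
    nodes, pos = [], 0
    while pos < len(tokens):
        node, pos = parse_statement(tokens, pos)
        if node is not None:
            nodes.append(node)
        # Resynchronize on the next ';' whether or not the rule succeeded.
        pos = recover_while(tokens, pos, lambda t: t != ";")
        pos += 1                      # consume the ';' (or step past the end)
    return nodes
```

With the input `a = 1 ; b = ; c = 3 ; + + ; d = 4`, the incomplete `b =` still produces a node carrying an error, the junk `+ +` is skipped, and the statements after each problem are parsed normally.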

luke · April 22, 2018, 11:40pm

I saw this project, tree-sitter, mentioned while reading some progress updates on the Atom team’s new editor work.

avh4 · May 1, 2018, 8:19pm

elm-format already does this to a small degree, and I expect to add more lenient parsing features to elm-format in the future. elm-format will also soon have an AST output mode, which is meant to be useful to editor plugins and refactoring tools. How can we better align those goals for elm-format with the needs of IDE plugins?

klazuka · May 1, 2018, 9:44pm

@avh4 that’s good to hear.

For IntelliJ plugin authors, using an external process to do the parsing is not really an option. But there were several other editor/IDE plugin authors who were talking about consolidating on a language server to do this sort of work. See #elm-language-server in Slack. You may also want to ask around in #editors-and-ides, which is somewhat active.

dpren · May 2, 2018, 12:06am

@avh4 I’d love to collaborate on this. I’ve been experimenting with megaparsec’s recovery feature using elm-format’s parser.

An AST output mode would be great for non-Haskell tools. Considering Haskell-based tools, I think it would make sense to move the parser into a separate library that both elm-format and others could import and use in-memory.

As @klazuka mentioned, #elm-language-server is a thing. One effort already underway is gyzerok/elm-language-server, which I believe is aiming to put elmjutsu’s code on a Node server. My goal is to write a language server from scratch in Haskell using this improved parser, with a focus on performance, accuracy, and maintainability in the long run.

system · June 1, 2018, 12:06am

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.
