If anyone is curious to learn a bit of Rust, or has advice on using Rust, nom, or parsing in general, I would welcome feedback and contributions. I am quite new to this. It is hugely satisfying to write a parser; it fits very well with a red/green testing workflow. That said, there is quite a lot to cover, so any help would be great.
Isn’t this much easier to do if you first build or find a formal grammar for the language?
Probably. I don’t have any experience with them. I did google around and didn’t find anything official, but there are definitely some grammar definitions out there that people have made for various tools.
I wrote an Elm parser in Rust myself, using the lalrpop library (my parser). It can parse pretty much any Elm code, but doesn’t do much besides that.
When starting my project I considered nom, but at the time the documentation was not good enough for me and I found the error messages confusing. I went with Lalrpop because I’m familiar with parser generators, it has neat error messages, and its grammar syntax is close to BNF. This was a fantastic help when starting out.
I don’t have much experience with Rust outside of that one project, so take what I say with a grain of salt.
First, you should use clippy. It gives tips on how to improve your code and is especially useful for beginners. For example, in your code clippy flags a lot of `if vec.len() == 0` where you could write `if vec.is_empty()`. I also see that clippy is a bit iffy with nom macros.
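To make the clippy suggestion concrete, here is a minimal sketch. The `describe` function is a made-up helper for illustration, not something from the project being discussed.

```rust
/// Hypothetical helper: summarises a list of parsed items.
fn describe(items: &[&str]) -> String {
    // Clippy's `len_zero` lint flags `items.len() == 0`;
    // `is_empty()` says the same thing and states the intent directly.
    if items.is_empty() {
        String::from("nothing parsed")
    } else {
        format!("parsed {} item(s)", items.len())
    }
}
```

Both spellings behave identically here; the lint is about readability rather than correctness.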
This is an Elm forum, so I feel uncomfortable giving more Rust tips. Otherwise I would tell you how to put your test code in its own module and how to use `include_str!` to test your parser over whole Elm files.
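For the curious, a rough sketch of what that could look like. `count_imports` is a hypothetical stand-in for a real parser entry point, and the fixture path in the comment is invented; the inline string is used so the snippet compiles without any file on disk.

```rust
/// Hypothetical stand-in for a real parser entry point: counts the
/// top-level `import` lines in Elm source text.
pub fn count_imports(source: &str) -> usize {
    source
        .lines()
        .filter(|line| line.starts_with("import "))
        .count()
}

// Compiled only under `cargo test`, so test code stays out of normal builds.
#[cfg(test)]
mod tests {
    use super::*;

    // With a real fixture on disk this could instead be, for example,
    // `const SOURCE: &str = include_str!("fixtures/Main.elm");`
    // (hypothetical path, resolved relative to this source file).
    const SOURCE: &str =
        "module Main exposing (main)\n\nimport Html\nimport Html.Attributes\n";

    #[test]
    fn counts_imports_in_whole_file() {
        assert_eq!(count_imports(SOURCE), 2);
    }
}
```

Since `include_str!` embeds the file contents at compile time as a `&'static str`, the test needs no file I/O at run time.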
I wrote a formal grammar of the Elm syntax to help with my own parser. I hope it can be useful to anyone, but again, it is mostly for myself and might not exactly match the grammar my parser actually uses (the grammar).
A formal grammar is more useful with a parser generator. Plus, the way I handle indentation is questionable at best and a typical workaround due to the architecture of parser generators.
A parser combinator library like nom is much closer to what the original Elm compiler uses. In there, I don’t see any special handling of indentation. The indentation detection code seems to live alongside the parsing code for `let` and `case` branches.
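To illustrate the idea of checking indentation inside the branch parser itself rather than in a separate pass, here is a rough sketch in plain Rust. This is not the Elm compiler's code and does not use nom; `split_branches` and its rules (first non-blank line sets the branch column, deeper lines continue the branch, a dedent ends the body) are assumptions made for this example.

```rust
/// Splits the body of a `case ... of` expression into branches.
/// Assumption for this sketch: the first non-blank line sets the branch
/// column, and any line indented deeper continues the current branch.
fn split_branches(body: &str) -> Vec<String> {
    // Number of leading whitespace bytes on a line.
    let indent_of = |line: &str| line.len() - line.trim_start().len();
    let mut branches: Vec<String> = Vec::new();
    let mut branch_column: Option<usize> = None;

    for line in body.lines().filter(|l| !l.trim().is_empty()) {
        let indent = indent_of(line);
        let column = *branch_column.get_or_insert(indent);
        if indent < column {
            // A dedent means the case body is over.
            break;
        }
        if indent == column {
            // A line at the branch column starts a new branch.
            branches.push(line.trim().to_string());
        } else if let Some(current) = branches.last_mut() {
            // A deeper line belongs to the branch being parsed.
            current.push(' ');
            current.push_str(line.trim());
        }
    }
    branches
}
```

A real parser would track columns on tokens rather than on raw lines, but the decision is the same: indentation is consulted at the point where a branch could start.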
A parser on its own is also disappointingly useless. The hardest part (but also the most fulfilling) is in fact using the resulting AST to do something (for example, code validity checking). I’m a little disappointed in what I made myself. Parsers are easy to test though, because of their very simple input -> output nature.
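As a small illustration of "doing something with the AST", here is a sketch of one validity check: reporting variables that are used without being in scope. The `Expr` type is a toy AST invented for this example, not the AST of my parser or any other, and it assumes a `let` binding is not visible inside its own value.

```rust
/// Toy expression AST, invented for this sketch.
enum Expr {
    Int(i64),
    Var(String),
    /// `let name = value in body`
    Let { name: String, value: Box<Expr>, body: Box<Expr> },
    Add(Box<Expr>, Box<Expr>),
}

/// Walks the tree, keeping a stack of names currently in scope.
fn check(expr: &Expr, scope: &mut Vec<String>, undefined: &mut Vec<String>) {
    match expr {
        Expr::Int(_) => {}
        Expr::Var(name) => {
            if !scope.contains(name) {
                undefined.push(name.clone());
            }
        }
        Expr::Let { name, value, body } => {
            // Assumption: the binding is not in scope for its own value.
            check(value, scope, undefined);
            scope.push(name.clone());
            check(body, scope, undefined);
            scope.pop();
        }
        Expr::Add(left, right) => {
            check(left, scope, undefined);
            check(right, scope, undefined);
        }
    }
}

/// Returns the names of variables used without a binding in scope.
fn undefined_vars(expr: &Expr) -> Vec<String> {
    let mut undefined = Vec::new();
    check(expr, &mut Vec::new(), &mut undefined);
    undefined
}
```

For `let x = 1 in x + y`, this reports `y` and nothing else. An unused-variable check would follow the same shape, with the scope stack also recording whether each name was ever looked up.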
If you have any questions about Rust in general, or parsing Elm, I am open to them. If you have questions about nom, you should ask in their chat channels (they are linked on the nom GitHub page).
Thank you for the reply. I will install clippy and follow its advice. I certainly have a long way to go with my Rust knowledge. Curiously little of the code base feels like proper Rust at the moment, given how custom the nom macros are.
I don’t have any experience with formal grammars and I don’t have sufficient interest to get deep on this one effort. Perhaps that is a recipe for a poor final product though. I can see that it is definitely relevant! Perhaps if there was an official grammar published for Elm I would have considered it more strongly.
My interest is in writing a linter. I’m not sure how far I will get. I think elm-analyse is an impressive project and hopefully the long-term future of linting in Elm, but it does not run very quickly, so I wanted to explore having something a bit quicker to detect unused variables and the like.
I don’t know whether to continue with my efforts or switch to using yours. Either way, thank you for the advice and for sharing your project.
You do you! My parser is itself an effort to learn Rust, so I do not fly higher than you! If you are writing that parser to learn Rust, you should not let my implementation deter you! The Rust community is as nice and friendly as the Elm community, so you should seek further information in specialized places (I am thinking of the nom Gitter). They know better than me when it comes to the feasibility and difficulty of making a parser with nom.
I’ll add that I had the exact same motivations to start my project! Though I have finished writing the parser, I still have miles to go before getting something more than just a cool module dependency graph! And also miles to go before having a usable, stable API for third parties to use.