Introducing elm-pratt-parser, a flexible parser for expressions with precedence and associativity rules

Have you struggled with precedence and associativity rules when parsing an expression using elm/parser?

I have!

So the last few months, I have slowly worked on a solution that hopefully would end the pain for most users.

I’m now quite pleased with the API, implementation and documentation, I don’t know what to remove anymore from it, so after a small round of feedback and a few versions iterations, I’m ready to share it:


elm-pratt-parser
Pratt / Top-Down Operator Precedence parsing in Elm
Face complex precedence and associativity rules without fear using elm/parser


This package was designed with the following goals:

  • Seamless integration in the elm/parser ecosystem
  • As user-friendly as possible
  • Minimalist but flexible API that should meet most users needs
  • Let most of the parsing control to users, including error messages
  • Parser.Advanced support
  • Low footprint (the package is currently 135 LOC without comments, whatever that means)
  • Tail-Call Elimination for left-associative operations
  • No backtracking
  • Reasonable documentation and test suite

If you are interested, read the README and play with this Ellie example that allows to dynamically change the precedences of a mathematical expression (use RELOAD to reset):

I intend to maintain this package as long as needed, as I use it myself, so feedback and bugs reports are very welcome, including about the documentation and API.

Thanks:

21 Likes

Looks really great! Nice focused API with great documentation. I like the idea of a ‘References’ section in the README, and kudos for having a thorough ‘Thanks’ section in this post - always great to highlight good work that’s been done by others.

1 Like

Cool!

I remember trying different ways to get this going in a nice way in the compiler. I am going to outline the approach I use now, just in case it is interesting. I have no idea if it would be useful for the particular goals of your project though. This info is just because I think it’s a fun problem!


I ended up parsing into the AST without sorting out the precedence and associativity, so the initial AST has a line like this:

data Expr
  = ...
  | Binops [(Expr, Name)] Expr

So x + y * z would become something like Binop [(_,"+"),(_,"*")] _ with the blanks fillled in better! However many operators you use, it always comes out as a flat list.

Later on, I have code that figures out if the Name of the operator is known. If so, I’ll have associativity and precedence information for it. So that lets me “add parentheses” to the list of expressions and operators. The resulting AST has a line like this:

data Expr
  = ...
  | Binop CanonicalName Expr Expr

I found that this process is nice because:

  1. You can always parse without backtracking. Just put the operators in a list.
  2. You do not need to know the precedence and associativity while parsing. Instead I can lookup + at the same time as List.map and all the other names.

Again, I’m just sharing this because it took me a while to figure out a good approach given the particulars of Elm, and maybe it is interesting information. May not be relevant though!

3 Likes

I believe the technique you describe is the one that you start to introduce in the elm/parser Math.elm example. This is also actually the first one I tried thanks to your indications (then I basically evaluated most of the known algorithms until I came to Pratt’s one).

However if I remember correctly, it became a lot more complicated once I wanted to be able to support expressions with rules like -2^2^3 == -(2^(2^3)) == -256, i.e. having a precedence of the exponentiation operator higher than the negation one (for example like in haskell, python, ruby, wolfram, julia, and more).

This is because unless I missed something, we cannot build a list consisting only of (Expr, OpName) anymore, as we cannot know what is the Expr value of Neg Expr unless we know the precedence and associativity rules.

I wonder by the way if this is not the cause of Elm’s behaviour with such expressions (in Elm, -2^2^3 == (-2)^(2^3) == 256).

As an aside, my parser never uses backtracking either, as it dynamically changes the list of parsers used based on the current expression precedence, but it has indeed to know the rules during parsing to achieve this (it could be possible to change them or add operators dynamically during parsing though, I would just have to allow updating the opaque Config type, but this is not supported now as I have not yet found use cases that couldn’t be solved better differently).

5 Likes

To better visualize precedences, I updated the Ellie example in my first post to allow to change dynamically the precedences of a mathematical expression (elm-ui is fun!).

For example, we can get the Elm behaviour discussed previously regarding negation vs exponentiation by increasing the negation precedence:

unary_minus_vs_exp

Also for those that like algorithms visualizations, I found a quite nice one for Pratt’s algorithm there:

Pratt parser in pictures

Ok, back to work now :sweat_smile:

8 Likes

This topic was automatically closed 10 days after the last reply. New replies are no longer allowed.