Here’s a little parsing problem I ran into while writing a parser for EDN (I’m planning to announce that package soon). I used elm-tools/parser, but couldn’t find a good solution without adding a look-ahead primitive. I’m curious whether there’s a nice way to solve this with delayedCommit, or in some other way.
Here’s a reduced version: we want to parse a nested data structure of Lisp-like lists and integers, where the list parentheses are “self-delimiting”, for lack of a better term. That is, a parenthesis needs no surrounding whitespace to separate it from its neighbors:
    type Thing
        = Number Int
        | Things (List Thing)

    (1 2 3) == ( 1 2 3 )    --> Things [ Number 1, Number 2, Number 3 ]
    ((1) 2) == ( (1)2 )     --> Things [ Things [ Number 1 ], Number 2 ]
    (()1()) == ( () 1 () )  --> Things [ Things [], Number 1, Things [] ]
With look-ahead, we can make a parser for numbers that ensures the number is delimited:
    import Parser as P exposing ((|.), (|=), Parser)

    -- run a parser, then rewind input
    lookAhead : Parser a -> Parser a

    sep : Parser ()
    sep =
        P.oneOf
            [ P.ignore P.oneOrMore (\c -> c == ' ')
            , lookAhead (P.oneOf [ P.symbol "(", P.symbol ")" ])
            ]

    number : Parser Thing
    number =
        P.succeed Number
            |= P.int
            |. sep
and put the whole thing together with a list parser:
    thing : Parser Thing
    thing =
        P.oneOf [ number, things ]

    things : Parser Thing
    things =
        P.succeed Things
            |. P.symbol "("
            |= P.repeat P.zeroOrMore thing
            |. P.symbol ")"
(This doesn’t quite work as written: it doesn’t consume all of the optional whitespace, and it’s missing some uses of lazy for the recursive definitions. Here’s a complete version.)
I think the core issue I’m running into is that I need the closing parenthesis both to terminate the number and to terminate the list. When I tried to do this without look-ahead, my number parser had to return both the number and the closing token, which made things … messy.
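To make the intended behavior concrete without depending on elm-tools/parser, here is a minimal hand-rolled sketch over a `List Char`. It is not the library-based solution I’m asking about: the names `parse`, `thing`, `number`, `items` and `skipSpaces` are made up for illustration, and accepting end of input as a delimiter after a number is an assumption on my part. The point is that `number` only peeks at the delimiter and leaves it in the remaining input, so the same `)` can go on to close the enclosing list.

```elm
module Sketch exposing (Thing(..), parse)

import Char


type Thing
    = Number Int
    | Things (List Thing)


-- Parse exactly one Thing; anything but spaces after it is an error.
parse : String -> Maybe Thing
parse input =
    case thing (skipSpaces (String.toList input)) of
        Just ( result, rest ) ->
            if List.isEmpty (skipSpaces rest) then
                Just result

            else
                Nothing

        Nothing ->
            Nothing


-- Drop leading spaces.
skipSpaces : List Char -> List Char
skipSpaces chars =
    case chars of
        ' ' :: rest ->
            skipSpaces rest

        _ ->
            chars


-- A Thing is either a parenthesized list or a number.
thing : List Char -> Maybe ( Thing, List Char )
thing chars =
    case chars of
        '(' :: rest ->
            items (skipSpaces rest) []

        c :: _ ->
            if Char.isDigit c then
                number chars 0

            else
                Nothing

        [] ->
            Nothing


-- Accumulate digits, then check the delimiter WITHOUT consuming it.
number : List Char -> Int -> Maybe ( Thing, List Char )
number chars acc =
    case chars of
        [] ->
            -- Assumption: end of input also delimits a number.
            Just ( Number acc, [] )

        c :: rest ->
            if Char.isDigit c then
                number rest (acc * 10 + Char.toCode c - Char.toCode '0')

            else if c == ' ' || c == '(' || c == ')' then
                -- Peek only: return `chars`, not `rest`, so the
                -- delimiter is still there for the caller.
                Just ( Number acc, chars )

            else
                Nothing


-- Collect list elements until the closing parenthesis.
items : List Char -> List Thing -> Maybe ( Thing, List Char )
items chars acc =
    case chars of
        ')' :: rest ->
            Just ( Things (List.reverse acc), rest )

        _ ->
            case thing chars of
                Just ( item, rest ) ->
                    items (skipSpaces rest) (item :: acc)

                Nothing ->
                    Nothing
```

With this sketch, `parse "( (1)2 )"` and `parse "((1) 2)"` should both give `Just (Things [ Things [ Number 1 ], Number 2 ])`, while something like `parse "(1a)"` gives `Nothing` because the number is not properly delimited.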
I hope I haven’t reduced the problem so far that it no longer illustrates the issue! I’m very curious whether you have suggestions for how to tackle this.