Useful Chompers

dwayne · August 16, 2023, 9:23am

Learn about useful chompers that can help you write better parsers.

benjamin-thomas · August 16, 2023, 9:54pm

Interesting post.

I love parser combinators, I’ve learnt a bit about them via OCaml (just dabbling).

One thing I noticed is that it’s pretty much the only time where I’d make sens to me to actually introduce a runtime error, to make the program fail loud and clear (when something goes horribly horribly wrong).

Because it seems to me that some operations can’t be expressed correctly by the type system when parsing.

For instance, in your decimal function, you use Maybe.withDefault to satisfy the type system’s constraints. But that constraint is a little bit artificial, because you do know that only floats will come out of chompDecimal.

decimal : Parser Float
decimal =
    chompDecimal
        |> P.mapChompedString (\s () -> String.toFloat s |> Maybe.withDefault 0)
        |> lexeme

So in this specific case, it feels to me that satisfying the type system actually gives no value (it’s extra work for zero benefit)

This is what I’d prefer to write for clarity. But even then I’m forced to represent that same branch that’ll actually never trigger.

decimal : Parser Float
decimal =
    chompDecimal
        |> P.getChompedString
        |> P.andThen
            (\s ->
                case String.toFloat s of
                    Just n ->
                        P.succeed n

                    Nothing ->
                        P.problem "impossible"
            )

Any thoughts?

dwayne · August 17, 2023, 3:07am

I understand your point and it would be nice if there was a way around it in Elm. But there isn’t so I don’t lose sleep over it.

When I was writing elm-natural I designed the API so that the Maybe.withDefault was hidden in cases where the user knew for certain they had valid input. See for e.g. fromSafeInt and fromSafeString.

So if you were to write a natural parser it could look like:

natural : Parser Natural
natural =
    chompNatural
        |> P.mapChompedString (\s () -> Natural.fromSafeString s)
        |> lexeme

BTW, I like your decimal version better because in the off chance the conversion does fail you get a clear parser error (assuming you improve the problem message).

benjamin-thomas · August 17, 2023, 5:18am

Of course, it’s not a “big” problem I wouldn’t loose sleep over it either

I thought I’d post because maybe someone knows a trick or two I could learn about.

You say you don’t know a way around this in Elm, but presumably there is in another language?

I’m wondering because from the type system’s perspective, all makes sens, but we are left with something not ideal in the end.

fromSafeString looks like a useful convention, I like it.

I thought about it a bit more and I think we can also generalize the problem like this:

{-| `fromMaybe` assumes that applying `f` will never return `Nothing`. 
If it does, the parsing fails.
-}
fromMaybe : (a -> Maybe b) -> a -> Parser b
fromMaybe f =
    f
        >> Maybe.map P.succeed
        >> Maybe.withDefault (P.problem "Bug: f produced Nothing!")


decimal : Parser Float
decimal =
    chompDecimal
        |> P.getChompedString
        |> P.andThen (fromMaybe String.toFloat)

Then I feel things are slightly better that way.

dwayne · August 17, 2023, 2:08pm

Yes, in Haskell you can do it with a partial function. For e.g. if you know for sure you’re dealing with a non-empty list at some point in your application then you can use head without having to handle a Maybe result.

head :: [a] -> a
head (x:_) = x

pit · August 19, 2023, 8:34pm

Thank you for the article, Dwayne.

If you want to improve performance even more, you could use the following approach:

Use Parser.getOffset and Parser.getSource and then String.dropLeft to get the string that hasn’t been parsed yet
Implement the checks of the chompers without using the Parser module, for example via loops with String.uncons
In the case of success, advance the parser with Parser.token, giving it the parsed text as the token

How to handle the failure case depends upon which error you want to generate at which position. One solution would be to advance the parser to the failing position with Parser.token like in the success case, followed by a Parser.chompIf isGood.

If you don’t need the error to be reported at the exact first failing position, I found that you can improve performance even more for chompers with a given lower bound (chompAtLeast, chompBetween, and chompExactly) by using String.all isGood for the part up to the lower bound instead of a loop.

This is just a very brief description. I can send you a sample implementation, if you like.

Thank you again for the article and the code.

dwayne · August 23, 2023, 10:33am

Thanks for finding these interesting ways to improve performance. I’d have to try them out myself at some point but also feel free to share your sample implementation.

pit · August 24, 2023, 5:48pm

Done

system · September 3, 2023, 5:49pm

This topic was automatically closed 10 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
How can I use elm/parser to make a Float from a String? Learn	6	1358	June 17, 2019
Video: Parser combinators under the hood Show and Tell	10	545	April 6, 2025
How to build interesting parsers Show and Tell	3	974	December 9, 2022
Weird thing I found in elm/parser Learn	2	615	August 14, 2020
New blogpost: parser combinators Show and Tell	3	435	April 10, 2024

Useful Chompers

Related topics