Elm Parser performance and help

I have been converting a time parser from RegEx to elm-tools/parser. It works now, but it is about 30% slower than the RegEx version.

Should I expect the Parser to be on par with RegEx?

I suspect the parser I created could be written more efficiently. Take the example below, which handles seconds. It has a smell, but I am not sure how to make it better.

If seconds are not present, the default is 0. Seconds can appear as “:00”, “:00.00…”, or “:00,00…” (the comma makes it a bit more difficult).

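-- twoInts and manyInts are helpers defined in the full source.
seconds : Parser Float
seconds =
    oneOf
        [ succeed
            (\seconds ms ->
                -- Rebuild "ss.fff" and convert it, falling back to 0.
                (toString seconds ++ "." ++ ms)
                    |> String.toFloat
                    |> Result.withDefault 0.0
            )
            |. symbol ":"
            |= twoInts
            |= oneOf
                [ succeed identity
                    |. oneOf [ symbol ".", symbol "," ]
                    |= manyInts
                , succeed "0"
                ]

        -- No seconds present: default to 0.
        , succeed 0
        ]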
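
For comparison, here is a minimal sketch of one alternative shape for this parser. It is not from the full source and has not been benchmarked, so treat it as a tidying idea rather than a known speedup. It assumes Elm 0.18 and elm-tools/parser 2.x. The names toSeconds and isSeparator are hypothetical helpers, and keep with Char.isDigit stands in for twoInts and manyInts, whose definitions are not shown here. The idea is to keep the digits as strings, so there is no Int to String round trip, and to accept either separator with a single predicate instead of a nested oneOf.

```elm
module Seconds exposing (seconds)

import Char
import Parser exposing ((|.), (|=), Count(..), Parser, ignore, keep, oneOf, oneOrMore, succeed, symbol)


-- Parses ":ss", ":ss.fff" or ":ss,fff"; yields 0 when no seconds are present.
seconds : Parser Float
seconds =
    oneOf
        [ succeed toSeconds
            |. symbol ":"
            |= keep (Exactly 2) Char.isDigit
            |= oneOf
                [ succeed identity
                    |. ignore (Exactly 1) isSeparator
                    |= keep oneOrMore Char.isDigit
                , succeed "0"
                ]
        , succeed 0
        ]


-- Hypothetical helper: joins the two digit strings and converts once.
toSeconds : String -> String -> Float
toSeconds whole fraction =
    String.toFloat (whole ++ "." ++ fraction)
        |> Result.withDefault 0


-- Hypothetical helper: either "." or "," may separate the fraction.
isSeparator : Char -> Bool
isSeparator c =
    c == '.' || c == ','
```

As in the version above, a separator with no digits after it makes the whole parser fail rather than fall back to a default.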


Full source

I believe @jxxcarlson has run into this before.

The root issue is that 0.18 will always box Char values so they can be distinguished from String at runtime. That is needed for toString to work properly, but it makes the parser much slower because it is boxing and unboxing when it looks at each Char.
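
To make the toString point concrete, here is a small sketch, assuming Elm 0.18 (the module name and the value rendered are made up for illustration). A Char and a one-character String have to render differently, so the runtime needs some way to tell them apart.

```elm
module CharVsString exposing (rendered)


-- In Elm 0.18, toString shows a Char in single quotes
-- and a String in double quotes.
rendered : List String
rendered =
    [ toString 'a' -- "'a'"
    , toString "a" -- "\"a\""
    ]
```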

There is a flag in my development version of Elm that switches to a more efficient runtime representation, so it will generate unboxed Char values. It should improve the performance of parsers quite a lot. So to answer your question:

Yes, but a little bit later.

In the meantime, even if you have 1000 times to parse, I don’t imagine this being observable. Those strings are pretty short, so I suspect it is 30% of something quite small compared to what humans can observe. So I think it makes sense to keep going the parser route for now, even if it is still a bit slower in future releases.

