100x faster parsing. Performant parsers for Lists of Strings and Dict

A package to create blazingly fast parsers :fire:

I had a list of strings and needed a parser to succeed with the longest matching occurrence.
A prefix tree is a clever data structure that can help with accomplishing this in as few steps as possible.
The package explains what else I tried and how this approach compares to the alternatives.

image

9 Likes

By coincidence I have spent the last month writing a package for Tries and will be publishing it soon:

Its a more full-featured implementation and has an API that is compatible with Dict over either List comparable or String keys.

Seems a few of us were needing Tries for various projects - text search in my case. Certainly a data structure that finds many uses in various efficient string algorithms.

3 Likes

Nice. At the time I looked into the rluiten/trie package but had to write my own implementation because I needed to use a custom traversal algorithm and that package doesn’t (and it shouldn’t) expose the value constructor of the Trie type.

Yes, I also started out with rluiten/trie buts its API is kind of weird like:

add : ( String, a ) -> String -> Trie a -> Trie a

Rather than the more obvious:

add : String -> a -> Trie a -> Trie a

So it seems to have been built to cater for a specific use case in rluitens other text search packages.

Also, as you say there is no custom traversal. This was a problem for me too as I wanted to implement a case insensitive search, and have more fuzzy searches in mind too.

The elm-scotland/tries package does not currently expose the Trie type. The reason for this is that I may want to change the implementation to a ternary tree with balancing heuristics, paper here:

But it does expose a variation on foldl that allows a traversal of the trie to be carried out with fine-grained control over how the traversal proceeds, as described by the Match type. This includes the ability to Break and terminate a traversal early. The signature for this is a bit wacky:

match :
    (Maybe comparable -> Maybe a -> context -> b -> ( b, context, Match comparable ))
    -> b
    -> context
    -> Trie comparable a
    -> b

There is a fairly extensive comment to explain how this works. If you get a chance would appreciate hearing if you think this provides enough flexibility to cover your use case - I will also take a look at your code myself as I am interested in creating a package that meets real needs.

Great work on the Parsers.

1 Like

Possibly tangential question: what is a trie in relation to a tree?

Probably the quickest way to explain is to refer you to the wikipedia page - it has a picture.

2 Likes

This topic was automatically closed 10 days after the last reply. New replies are no longer allowed.