BrianHicks/elm-string-graphemes 1.0.0

Hello all! I’ve just released a new library: BrianHicks/elm-string-graphemes. It does everything String does, except it operates on graphemes instead of bytes or characters. Observe:

import String.Graphemes

String.toList "🦸🏽‍♂️" --> [ '🦸', '🏽', '\u{200D}', '♂', '\u{FE0F}' ]

String.Graphemes.toList "🦸🏽‍♂️" --> [ "🦸🏽‍♂️" ]

Check it out at https://package.elm-lang.org/packages/BrianHicks/elm-string-graphemes/latest/. In particular, the README includes a primer on why this library is necessary, in case you haven’t worked much with the different levels of text. For example, the emoji above is one grapheme, but five characters and 17 bytes. If that doesn’t make sense yet, go read it!
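If you want to check those three levels yourself, here is a minimal sketch. It only uses `String.toList` and `String.Graphemes.toList` from the example above, plus `Bytes.Encode.getStringWidth`, which assumes you have `elm/bytes` installed. The module name `TextLevels` and the value names are made up for illustration.

```elm
module TextLevels exposing (byteCount, characterCount, graphemeCount)

import Bytes.Encode -- from elm/bytes (assumed to be a dependency)
import String.Graphemes


-- The emoji from the example above (hypothetical name).
hero : String
hero =
    "🦸🏽‍♂️"


-- Graphemes: what a reader perceives as a single symbol.
graphemeCount : Int
graphemeCount =
    List.length (String.Graphemes.toList hero) --> 1


-- Characters: String.toList yields one Char per Unicode code point.
characterCount : Int
characterCount =
    List.length (String.toList hero) --> 5


-- Bytes: width of the string when encoded as UTF-8 (4 + 4 + 3 + 3 + 3).
byteCount : Int
byteCount =
    Bytes.Encode.getStringWidth hero --> 17
```

Counting through `toList` keeps the sketch to functions shown in this post; it is not meant as the fastest way to measure a string.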

If you find any issues with the grapheme segmentation (e.g. places where it breaks improperly), please open an issue! I would also love it if we could get the parser to go even faster. I already took it from 0.1% of String.toList performance to between 1% and 2%, but can we get higher? Probably!
