I want to build a fuzzy search that tolerates accents and other special characters in its search string.
For example, if I have a string like
"Dès Noël, où un zéphyr haï me vêt de glaçons würmiens, je dîne d’exquis rôtis de bœuf au kir, à l’aÿ d’âge mûr, &cætera"
I’d like it to be converted to
"Des Noel, ou un zephyr hai me vet de glacons wurmiens, je dine d'exquis rotis de boeuf au kir, a l'ay d'age mur, &caetera"
There are a lot of edge cases that I can’t think of beforehand (’ → ', ÿ → y, etc.), and some of them even need to be translated to several characters (æ → ae)!
So my question is: is there a way to avoid doing this conversion manually in Elm? Is there some standard utility that I missed? Some third-party library? Should I use ports to JS (I’d rather not at this point)?
For the record, here’s the output of each of these libraries given my example string:
TLDR:
deburr and elm-string-normalize have the exact same behavior on this given input
string-extra leaves œ untouched and only partly succeeds on æ, turning it into a single a instead of ae
none of these are able to do ’ → ' but it’s okay I guess
I have no clue about the performance implications of each of these solutions
> import String.Deburr as String -- Fresheyeball/deburr (provides deburr)
> import String.Normalize as String -- kuon/elm-string-normalize (provides removeDiacritics)
> import String.Extra as String -- elm-community/string-extra (provides removeAccents)
> sample = "Dès Noël, où un zéphyr haï me vêt de glaçons würmiens, je dîne d’exquis rôtis de bœuf au kir, à l’aÿ d’âge mûr, &cætera"
"Dès Noël, où un zéphyr haï me vêt de glaçons würmiens, je dîne d’exquis rôtis de bœuf au kir, à l’aÿ d’âge mûr, &cætera"
: String
> String.deburr sample
"Des Noel, ou un zephyr hai me vet de glacons wurmiens, je dine d’exquis rotis de boeuf au kir, a l’ay d’age mur, &caetera"
: String
> String.removeDiacritics sample
"Des Noel, ou un zephyr hai me vet de glacons wurmiens, je dine d’exquis rotis de boeuf au kir, a l’ay d’age mur, &caetera"
: String
> String.removeAccents sample
"Des Noel, ou un zephyr hai me vet de glacons wurmiens, je dine d’exquis rotis de bœuf au kir, a l’ay d’age mur, &catera"
: String
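For the ’ → ' case that none of the libraries covers, one option is a small post-processing step on top of one of them. This is only a sketch: the module name `FuzzyKey` and the function `toSearchKey` are made up for illustration, and it assumes kuon/elm-string-normalize is installed and behaves as in the REPL session above.

```elm
module FuzzyKey exposing (toSearchKey)

import String.Normalize -- kuon/elm-string-normalize (provides removeDiacritics)


-- Hypothetical helper: fold a string into a plain form for fuzzy matching.
toSearchKey : String -> String
toSearchKey input =
    input
        -- Strip accents and expand ligatures (æ → ae, œ → oe), as shown above.
        |> String.Normalize.removeDiacritics
        -- The library leaves the typographic apostrophe (U+2019) alone,
        -- so map it to the plain ASCII apostrophe by hand.
        |> String.replace "\u{2019}" "'"
```

Going by the REPL output above, `toSearchKey sample` should give the target string from the beginning of the question. Any other character the library misses could be handled with one more `String.replace` line in the same pipeline.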