How do you get the nth character of a string

danielFHcode · October 5, 2024, 9:23pm

Looking through the standard library documentation there seems to be no method for getting the nth character of a string (Int -> String -> Maybe Char), and so if you want to get it you have to convert the string into an array (|> String.toList |> List.toArray) which is not optimal as that takes O(n), does anyone know if there is a way to get the nth character of a string in O(1)?

lydell · October 5, 2024, 10:14pm

Hi! You can do myString |> String.dropLeft (n - 1) |> String.uncons |> Maybe.map Tuple.first, but I don’t know how efficient it is.

You could also mention what you are trying to do. Maybe you don’t even need to get a char at a certain index if you change something else.

pit · October 6, 2024, 6:06am

FWIW: I used exactly the same code as Simon (lydell) in my port of the Elm compiler’s parser, and there it was fast enough to even parse >1MB files, so depending on what you are trying to do it might still be sufficient for your use case.

But as Simon said: maybe there’s another way to solve you problem?

danielFHcode · October 11, 2024, 12:40pm

Thank you both very much!
I originally stumbled across this problem while writing a parser in elm, which I was able to solve by using String.slice to get a 1-character long strings, so as you two suggested there was another solution to my problem, but regardless I thought it was odd there was no way to do this.
@lydle, regarding the efficiency of your solution it is certainly good enough for me as it is O(1) just like I wanted, so again thank you!
Finally, I just wanted to add that it is possible to make the method lydell suggested more generalized for getting the nth character of a string (as their method was only to get the last one):

get : Int -> String -> Maybe Char
get i str = String.slice i (i+1) str |> String.uncons |> Maybe.map Tuple.first

perkee · October 11, 2024, 9:04pm

@lydle, regarding the efficiency of your solution it is certainly good enough for me as it is O(1)

Not sure if it’s O(1) or not. It ends up calling String.prototype.slice in JS. It seems that JS slices by the codepoint (of which there can be, in the extreme case, 30 in any extant character and for which there is no theoretical limit) which simplifies things, but even then—a codepoint can stretch across many bytes. So you can’t tell where any codepoint begins without starting at the start of the string and parsing the bytes, right? Only thing that could affect that is if the JS string is implemented as a JS array which is itself implemented as a hashmap and then, yeah, you’d have O(1) lookup.

dta · October 13, 2024, 6:10am

JS strings operate on UTF-16 codepoints, so they are always 2 bytes. It should be constant time to index into a string, but as you point out you may wind up extracting only part of a character if you slice a single code point.

system · October 23, 2024, 6:10am

This topic was automatically closed 10 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Unexpected behaviour when slicing unicode strings Learn	4	552	October 9, 2018
Parse fixed length strings with elm/parser Learn	3	853	September 24, 2019
Access to javascript String.normalize() Request Feedback	12	2441	April 20, 2019
How to go around measuring text synchronously? Learn	10	1889	January 17, 2019
The encoding of the elm string type Request Feedback	8	1517	February 26, 2020

How do you get the nth character of a string

Related topics