Scala claims to have “effectively constant” lookup, add, and remove for their immutable
HashSet implementations - http://docs.scala-lang.org/overviews/collections/performance-characteristics.html
Does anyone know how Scala’s implementation managed to get those characteristics? I’m trying to determine if it would be possible to implement a collection like that (assuming the caller specified the hashing function) in Elm.
The docs say it uses a hash trie, which I suspect is more specifically a hash array mapped trie. Clojure also uses a HAMT map and describes performance characteristics as “effectively constant-time”. I’ve seen some good Clojure talks on the topic, although it seems I didn’t save links to any of them.
I thought HAMTs had logarithmic access times…maybe I’m mistaken?
Hash array mapped tries usually have a bunch of “slots” per level. So where a binary tree has branching factor of 2, a HAMT will usually have many more.
Array.Hamt uses 32, for example. So, with a good hashing function, the depth becomes
log32(n), which means that even for an array of 1024 elements, you only need 2 levels in the tree.
In practice, that’s “close enough” to call it constant time.
Yeah, the key word here is “effectively” (constant time). Another thing to mention is that n is never greater than 7 (32^6.4 maxes out an int).
I’m already exploring something like this for Elm, just need to find the time. An implementation for Elm already exists (https://github.com/Skinney/collections-ng) but there are bunch of things that can be improved to increase the performance, like a proper hash function and applying lessons learned from
Array.Hamt. Expect to read more about this soon.