Why must Set members be comparable?

rauxyo · February 23, 2019, 6:54pm

It’s impossible to create a Set of Sets because all Set creation functions require comparable types. Why is this? A Set does not have or need the ability to order its elements; it needs only to be able to check equality. Does this constraint belie an implementation detail such as a binary tree for efficient Set operations?

Here’s some previous discussion: https://github.com/elm/compiler/issues/310

Erkal_Selman · February 23, 2019, 7:48pm

this may not be an answer but there was an episode of elm town where they talk about this: https://elmtown.simplecast.fm/elm-019-improved-collections-robin-heggelund-hansen

ilias · February 23, 2019, 8:07pm

It needs the comparable to operate in O log n. If those runtime characteristics aren’t important to your use-case (small datasets, no tight loops, …), it’s not too hard to make a structure that only checks equality.

The easiest way to do this, I think, would be using pzp1997/assoc-list and having a type alias Set a = AssocList.Dict a ().

rauxyo · February 23, 2019, 8:58pm

Thanks. I came to that conclusion as I was writing my question. So I guess I would probably prefer to just allow Sets to be comparable as well, as was mentioned in the issue I linked and now this collection of issues: https://github.com/elm/compiler/issues/1008.

If a Set has a deterministic ordering of comparable elements, it should make sense for the set itself to be comparable, just like tuples of comparable types are also comparable.

rauxyo · February 23, 2019, 9:01pm

Thanks, @Erkal_Selman. I remember listening to that episode, but I don’t remember them discussing Sets. I’ll have to listen again sometime.

Viir · February 24, 2019, 6:52am

Why? The order for toList can be defined by the order of insertion. Why would elements need to be comparable then?

Erkal_Selman · February 24, 2019, 12:13pm

Here is what I’ve understood from the conversation in elm town:

Although the documentation says that you shouldn’t write code which depends on the order of the elements in a set, people do that.

For example, some coder may write the function

pick : Set a -> Maybe a
pick =
  Set.toList >> List.head

which picks an “arbitrary” element from a set.

This code should not break anything, if there is a new implementation of set.

For this reason they implement sets using an (internal) ordering.
And when they change the implementation of set, they make sure that this original ordering is respected.

The elements of a set must be comparable so that from that ordering the internal ordering (which is used when implementing sets) can be derived uniquely.

So all of it is for not breaking old, badly written code.

Erkal_Selman · February 24, 2019, 12:32pm

Did you see turboMaCk/any-set? It may solve your problem.

norpan · February 24, 2019, 8:22pm

It’s not just that. Let’s say you have two Sets, a and b, and a == b, then toList a == toList b should also hold, since you are allowed by referential transparency to substitute equal values.

Put perhaps it’s worth scrapping equality of sets.

mgold · February 24, 2019, 9:07pm

Yes, that’s exactly it.

Elm dictionaries are implemented as (self-balancing) binary search trees. Sets are actually just dictionaries with values of type (). If you haven’t seen “unit” before, it’s a type with only one value, also called (). Point is, sets don’t care about the value in the underlying dictionary, only that there is one.

rauxyo · February 24, 2019, 9:35pm

Thanks! That’s an interesting implementation.

Coming from a mathematical side, I want sets to be able to contain anything, but I definitely appreciate the logarithmic optimization here.

Hopefully the issues above can be solved.

rauxyo · February 24, 2019, 9:35pm

I hope that doesn’t happen.

robin.heggelund · February 25, 2019, 6:45am

For more information on how Elm’s data structures work, have a look at this video, where I did my best to explain them: https://www.youtube.com/watch?v=mmiNobpx7eI&t=

rauxyo · February 25, 2019, 4:47pm

Thanks, Robin. I hadn’t seen that one. The implementation of arrays was fascinating.

jc00ke · February 28, 2019, 6:37pm

I don’t think this is the case, unless toList imposes an ordering itself. List has a fundamental property that a Set does not share: ordering. Now, an OrderedSet, yes, I’d buy that if a and b are OrderedSets and a == b, then toList a == toList b. But the only way you can get to toList a == toList b is if toList orders the elements of a and b.

dta · February 28, 2019, 7:10pm

In elm, a == b implies f a == f b for any values of a, b, and f. So, toList will always return the elements in the same order, but that order may not be a sorted order.

robin.heggelund · February 28, 2019, 7:44pm

According to the documentation for Set.toList:

    Convert a set into a list, sorted from lowest to highest.

jc00ke · March 1, 2019, 5:30pm

I guess I was conflating math and programming. Both toList and fromList orders, so my unless branch fires

I guess I should have RTFM eh?

> import Set
> [2, 1, 0, -1, 0, 2] |> Set.fromList
Set.fromList [-1,0,1,2]
> [2, 1, 0, -1, 0, 2] |> Set.fromList |> Set.toList
[-1,0,1,2]

system · March 11, 2019, 5:30pm

This topic was automatically closed 10 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
New package to use custom-types as keys in sets/dicts Show and Tell	12	1182	September 19, 2022
Model a Set with at least one element Learn	7	775	December 25, 2019
API suggestion for dict/set in core; custom compare functions Request Feedback	21	2579	March 8, 2018
Introducing assoc-list, a Dict substitute with custom keys Request Feedback	6	2081	January 19, 2019
New Constrained Type Variable for `(==)` Request Feedback	12	1803	July 24, 2019

Why must Set members be comparable?

Related topics