Why must Set members be comparable?


#1

It’s impossible to create a Set of Sets because all Set creation functions require comparable types. Why is this? A Set does not have or need the ability to order its elements; it needs only to be able to check equality. Does this constraint belie an implementation detail such as a binary tree for efficient Set operations?

Here’s some previous discussion: https://github.com/elm/compiler/issues/310


#2

this may not be an answer but there was an episode of elm town where they talk about this: https://elmtown.simplecast.fm/elm-019-improved-collections-robin-heggelund-hansen


#3

It needs the comparable to operate in O log n. If those runtime characteristics aren’t important to your use-case (small datasets, no tight loops, …), it’s not too hard to make a structure that only checks equality.

The easiest way to do this, I think, would be using pzp1997/assoc-list and having a type alias Set a = AssocList.Dict a ().


#4

Thanks. I came to that conclusion as I was writing my question. So I guess I would probably prefer to just allow Sets to be comparable as well, as was mentioned in the issue I linked and now this collection of issues: https://github.com/elm/compiler/issues/1008.

If a Set has a deterministic ordering of comparable elements, it should make sense for the set itself to be comparable, just like tuples of comparable types are also comparable.


#5

Thanks, @Erkal_Selman. I remember listening to that episode, but I don’t remember them discussing Sets. I’ll have to listen again sometime.


#6

Why? The order for toList can be defined by the order of insertion. Why would elements need to be comparable then?


#7

Here is what I’ve understood from the conversation in elm town:

Although the documentation says that you shouldn’t write code which depends on the order of the elements in a set, people do that.

For example, some coder may write the function

pick : Set a -> Maybe a
pick =
  Set.toList >> List.head

which picks an “arbitrary” element from a set.

This code should not break anything, if there is a new implementation of set.

For this reason they implement sets using an (internal) ordering.
And when they change the implementation of set, they make sure that this original ordering is respected.

The elements of a set must be comparable so that from that ordering the internal ordering (which is used when implementing sets) can be derived uniquely.

So all of it is for not breaking old, badly written code.


#8

Did you see turboMaCk/any-set? It may solve your problem.


#9

It’s not just that. Let’s say you have two Sets, a and b, and a == b, then toList a == toList b should also hold, since you are allowed by referential transparency to substitute equal values.

Put perhaps it’s worth scrapping equality of sets.


#10

Yes, that’s exactly it.

Elm dictionaries are implemented as (self-balancing) binary search trees. Sets are actually just dictionaries with values of type (). If you haven’t seen “unit” before, it’s a type with only one value, also called (). Point is, sets don’t care about the value in the underlying dictionary, only that there is one.


#11

Thanks! That’s an interesting implementation.

Coming from a mathematical side, I want sets to be able to contain anything, but I definitely appreciate the logarithmic optimization here.

Hopefully the issues above can be solved.


#12

I hope that doesn’t happen.


#13

For more information on how Elm’s data structures work, have a look at this video, where I did my best to explain them: https://www.youtube.com/watch?v=mmiNobpx7eI&t=


#14

Thanks, Robin. I hadn’t seen that one. The implementation of arrays was fascinating.


#15

I don’t think this is the case, unless toList imposes an ordering itself. List has a fundamental property that a Set does not share: ordering. Now, an OrderedSet, yes, I’d buy that if a and b are OrderedSets and a == b, then toList a == toList b. But the only way you can get to toList a == toList b is if toList orders the elements of a and b.


#16

In elm, a == b implies f a == f b for any values of a, b, and f. So, toList will always return the elements in the same order, but that order may not be a sorted order.


#17

According to the documentation for Set.toList:

    Convert a set into a list, sorted from lowest to highest.

#18

I guess I was conflating math and programming. Both toList and fromList orders, so my unless branch fires :wink:

I guess I should have RTFM eh? :smile:

> import Set
> [2, 1, 0, -1, 0, 2] |> Set.fromList
Set.fromList [-1,0,1,2]
> [2, 1, 0, -1, 0, 2] |> Set.fromList |> Set.toList
[-1,0,1,2]

closed #19

This topic was automatically closed 10 days after the last reply. New replies are no longer allowed.