De-throning the List

John_Orford · March 30, 2018, 3:01pm

My idea was have both arrays and lists around at the same time, in an “array , list”.

You access the appropriate structure based on operation : )

Terribly wasteful, perhaps only optimal in niche situations…

endlessoblivion · April 2, 2018, 9:55am

In Java, the order of preference of data types is not really Array > List, it’s more like Stream > Iterable > Collection > List > Array. It’s quite a pain to work with arrays in Java, or to use them in API signatures.

MarkHamburg · April 2, 2018, 2:26pm

Here are a couple other statistics it seems would be useful to this discussion:

What is the statistical distribution of collection (list/array) lengths in the target class of programs?
How often are these collections constructed in a literal?
How often are these collections constructed through transformation of another such collection?
Alternatively, how often are these structures accumulated through custom logic not based on folds, maps, etc.

I didn’t include access patterns in the above list of questions because those seem likely to be influenced by the strengths and weaknesses of the existing use of lists but it would be interesting to find an unbiased way to look at this question as well.

I raised these questions because it dawned on me that benchmarks for 1000 element collections are interesting — speaking as someone who works with even larger collections right now — but could be misleading if program performance is dominated by working with lots of small collections. The performance on 100 ten element collections could be more relevant than the performance on a single thousand element collection. Of course, it’s awkward if the correct answer for the small case and the large case are different because it means that any default becomes problematic at some point.

But keep at it. Pushing hard on this question is likely to be revealing whatever the outcome.

Mark

rupert · April 5, 2018, 2:45pm

I wanted to point out that having a swap-able List/Array(/Other) data structure is already possible in Elm, by using a slightly more OO style of encoding. Compared with the union type approach of @gampleman above, it is also more flexible as arbitrary new implementations can be added after the fact, without changing the type.

People have mentioned that in Java there are various ‘interfaces’ that are implemented by the collections library in java.util. It is also possible to define ‘interfaces’ with separate ‘implementations’ in Elm.

Here is an example. I needed FIFO, LIFO and ordered buffer types for my search algorithms to produce depth first, breadth first and informed search strategies. The Buffer interface can be seen here:

github.com

the-sett/ai-search/blob/master/src/Search.elm#L146


zero, when performing an iterative search. For bounds that do not iteratively
increase, this can be ignored.
-}
type alias Limit state =
Int -> Node state -> Bool




{-| Defines the operations needed on state buffers that hold the pending search
states.
-}
type alias Buffer state buffer =
{ orelse : Node state -> buffer -> buffer
, head : buffer -> Maybe ( Node state, buffer )
, init : List (Node state) -> buffer
}




{-| This utility function is used to convert a comparison over states into a
comparison over search nodes.
-}
nodeCompare : Compare state -> Node state -> Node state -> Order

With the 3 implementations on lines 287 (fifo), 304 (lifo) and 314 (ordered).

It would be possible in a similar way to define interfaces for Iterable and Collection in Elm, with List and Array backed implementations. Iterable and Collection would then be standard interfaces that packages could use in APIs to say what they want without saying how it should be implemented.

robin.heggelund · April 5, 2018, 3:01pm

Please stay on topic. If people want to talk about interfaces, they can do so in another thread.

rupert · April 5, 2018, 3:54pm

Ok, but I don’t think it is off topic. I’m pointing out a different way that List/Array could be combined, which seems very relevant? But as you request, we’ll take any further discussion about interfaces to a new thread.

robin.heggelund · April 5, 2018, 4:48pm

The topic isn’t about combining , but replacing the list entirely.

ram9cc · April 7, 2018, 1:48am

No thank you - there are many data types to handle many different situations - why replace a linked list with an array in a language when it already has both and the history of computing science has provided both of us to them?

I would like to keep having choices.

robin.heggelund · April 11, 2018, 7:35am

List can still be in the std lib, but it shouldn’t be the default.

robin.heggelund · April 22, 2018, 11:54am

New part of the series out now. Look at the top post for link.

eelcoh · April 22, 2018, 4:02pm

Just a silly question: in your examples you use these type annotations:
find : (a -> Bool) -> List a

instead of
find : (a -> Bool) -> List a -> Maybe a

Why is that?

robin.heggelund · April 22, 2018, 6:44pm

Oversight on my part. Have fixed it now. Thanks for pointing it out

robin.heggelund · April 23, 2018, 5:38pm

New part published now, see top post for link.

dta · April 23, 2018, 9:00pm

I’ve been accessing this through the email interface, and so I didn’t realize you had been editing the top post. I was beginning to get the impression that you were making a point by having us click through a linked list of articles each time.

robin.heggelund · April 23, 2018, 9:11pm

I’ll keep that in mind for my next update

rupert · April 24, 2018, 1:19pm

In the latest post it points out that Set in Elm is ordered. Worth noting, in general Set implementations do not need to be ordered - an example being a Set based on hash tables.

If you were to build one of these ‘collections’ with Set semantics, how would you do that? The same collection implementation also supports List semantics, so they must have different constructors right? Would be interesting to see the constructor APIs.

Its an interesting piece of work you have done here, and on the whole I would be in favour of it. It will perform well enough, and is a real simplification.

robin.heggelund · April 24, 2018, 5:11pm

While Sets doesn’t necessarily have to be ordered to be a set, Elm’s Set is. I’m working on a HashSet for Elm right now, so there could be multiple Set implementations in the future.

The point is that Set’s are collections, and so there are multiple functions available in the List and Array API that also makes sense to implement for Set.

danneu · May 3, 2018, 10:53pm

Excellent series. I liked the approach of speaking in broader concepts and then applying them to the current state of Elm.

I think performance concerns (especially wrt defaults) are central to Elm because Elm runs in the web browser and thus slow clients (like low-end mobile devices) cannot be avoided.

And since defaults have such a big impact on what people use, we have situations like people preferring List even though their data access patterns don’t match the List’s niche. It’s easy for us to say “well, I don’t want to have to think about datastructures,” but that comes at the expense of every client having to pay the penalty for us, like reversing a large array of endless-scroll gallery images on every render on the UI thread.

Kotlin’s solution here was to provide built-in functions like listOf(1, 2, 3), arrayOf(1, 2, 3), setOf(1, 2, 3) instead of a collection literal. Though I think there is now a generic [1, 2, 3] literal that you must explicitly annotate so it knows which collection impl to use, else it won’t compile.

Just about every time I implement a custom datastructure in Elm, I end up using an Array for its performance properties. For example, any time I need a grid of game pieces or gallery images, I often need to index into it. When I have a sequence in Elm (and UIs in general), I often am appending to the end but want to iterate in insertion order, which is precisely the List’s worst case.

If Elm were to adopt the Array as the default collection, I wonder how much of its APIs would also change. For example, ctrl-f “List” on http://package.elm-lang.org/packages/elm-lang/html/2.0.0/Html. One downside of using Array currently is that you have to convert it to a List in the view on each render though I never took the time to see what kind of perf impact it can have on a real world app.

I look forward to see what comes of this discussion.

robin.heggelund · May 8, 2018, 10:40pm

The last pieces of the blog series are out now:

Part Boron: https://dev.to/skinney/de-throning-the-list-part-boron-185
Summary: https://dev.to/skinney/de-throning-the-list-summary-3f3c

rupert · May 9, 2018, 9:31am

I am a little confused by this post:

Since in that one you also bring Set into the picture, but in the earlier posts you only talked about List and Array. Are you also going to try and build a data structure which can replace Set too?

That is why, above, I was asking what the constructors would look like. That is, how do you build this thing with Array vs Set semantics?

Dethroned.list [ 1, 2, 2, 3]
Dethroned.set [ 1, 2, 2, 3]

But perhaps this is not your intention and the above post has given me an incorrect understanding.

Topic		Replies	Views
Typed Arrays for Elm Request Feedback	25	3820	March 14, 2018
Current State of General Purpose ElmPlus Request Feedback	24	1473	March 7, 2025
Proposal: Mutable Type Request Feedback	20	2500	June 25, 2020
Communicating in Types Request Feedback	8	418	May 18, 2025
Introducing assoc-list, a Dict substitute with custom keys Request Feedback	6	2096	January 19, 2019

De-throning the List

Related topics