RE: Deprecation of elm-lang/lazy library


#21

If only (LazyList) were exposed by the language, you’d be forced to create singleton lists.

That wouldn’t be very efficient. If there are enough cases of libraries that use Lazy, maybe it can be added back to core; that would make more sense than a lot of libraries needing to be included in elm-explorations because they must use Native (Kernel) code to implement Lazy within each library. If there are only two or three such cases, then perhaps Lazy doesn’t need to be exposed.


#22

It is lazy (but not memoizing); you can see the continuations for later exploration being created on lines 173 and 178. So it can search an infinite tree. I have tested this: I made sure this code is tail recursive, so it can loop all day without running out of stack (at least for depth-first search; breadth-first will obviously blow the heap pretty quickly).

Cycles will break it. I am assuming that anyone coming to this code to write a search will come to it with a good understanding of that anyway. The Step function may incorporate cycle-detection code specific to the problem at hand, or rely on the cost function to price out cycles.
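Here is a generic sketch of that pattern in Elm (not the actual code referenced above; all names here are illustrative): children are produced by thunks that are only forced when a node is popped, so untouched branches stay unevaluated and the recursive call remains in tail position:

type alias Node a =
    { value : a
    , children : () -> List (Node a)
    }

-- No cycle detection here, per the caveat above; a real Step function
-- would add that, or the cost function would price cycles out.
depthFirst : (a -> Bool) -> List (Node a) -> Maybe a
depthFirst isGoal stack =
    case stack of
        [] ->
            Nothing

        node :: rest ->
            if isGoal node.value then
                Just node.value

            else
                -- Forcing the thunk here creates the continuations for
                -- later exploration; the self-call is in tail position,
                -- so Elm compiles it to a loop.
                depthFirst isGoal (node.children () ++ rest)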


#23

Another interesting idea (that @evancz might prefer) would be to investigate what happens if we split laziness from memoization:

  • Laziness alone can be done using the \() -> stuffToCompute thunk pattern we already have.
  • Memoization could be added by a memoize function that uses a variant of the Y-combinator to add memoization to recursive calls to the same function.

It might be a bit difficult to implement, however. (Maybe someone smarter than me can figure it out.) The hard part is figuring out how the memoized results could be stored. Using only Elm, it would be intuitive to use a Dict, but that requires its keys to be comparable, which functions are not. I also cannot think of a sensible way to convert them to something comparable, which e.g. AllDict (eeue56/elm-all-dict) requires. Maybe it is possible to somehow gather all the arguments and combine them into a tuple to use as the key instead, but that would also fail if any argument is not comparable…
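For illustration, here is a minimal pure-Elm sketch of that idea (everything below is hypothetical, not an existing API): an open-recursive step function receives a self to use for recursive calls, and a Dict of cached results is threaded through every call explicitly, since pure Elm cannot hide the cache:

import Dict exposing (Dict)

-- The cache is threaded through every call: given the results so far,
-- produce the value plus the (possibly extended) cache.
type alias Memo comparable v =
    Dict comparable v -> ( v, Dict comparable v )

memoFix :
    ((comparable -> Memo comparable v) -> comparable -> Memo comparable v)
    -> comparable
    -> v
memoFix step key =
    Tuple.first (memoStep step key Dict.empty)

memoStep :
    ((comparable -> Memo comparable v) -> comparable -> Memo comparable v)
    -> comparable
    -> Memo comparable v
memoStep step key cache =
    case Dict.get key cache of
        Just value ->
            -- Cache hit: return the stored result unchanged.
            ( value, cache )

        Nothing ->
            let
                ( value, cache2 ) =
                    step (memoStep step) key cache
            in
            ( value, Dict.insert key value cache2 )

-- Usage: write fib against self, threading the cache by hand.
fib : Int -> Int
fib =
    memoFix
        (\self n cache ->
            if n < 2 then
                ( n, cache )

            else
                let
                    ( a, cache1 ) =
                        self (n - 1) cache

                    ( b, cache2 ) =
                        self (n - 2) cache1
                in
                ( a + b, cache2 )
        )

The explicit cache threading is the ergonomic cost of staying pure; hiding it is exactly what would require mutation.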

Writing that memoization code in a tiny snippet of native JS would, however, be straightforward (I think) using the new Map() class, which even accepts functions as keys.

One other drawback might be that it leverages the Y-combinator, which

  • might be somewhat slow because of the extra function-call indirection. (I did not perform any benchmarks.) This is probably not a problem, since using memoize in these contexts is a conscious choice, and the time you save by not having to re-compute answers should be much greater (otherwise you should not use memoize in that situation).
  • is not the most beginner-friendly to use. However, I think good documentation of how to use memoize would mitigate this (in essence: the function you implement with memoize receives a function as input, which it should call for its recursive calls, because that is the wrapped-with-memoization version of the function itself).
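To make that documentation point concrete, here is a tiny sketch of the calling convention (the memoize below is only a placeholder fixed point showing the shape; a real implementation would cache results where the comment indicates, which is exactly the hard part discussed above):

-- Placeholder: demonstrates the wrapping, but performs no caching.
memoize : ((a -> b) -> a -> b) -> a -> b
memoize step =
    let
        go x =
            -- a real memoize would consult/update a cache here
            step go x
    in
    go

fib : Int -> Int
fib =
    memoize
        (\self n ->
            -- call self, not fib, so recursion goes through the
            -- (would-be) memoized wrapper
            if n < 2 then
                n

            else
                self (n - 1) + self (n - 2)
        )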

#24

@Qqwy wrote:

  • Memoization could be added by a memoize function that uses a variant of the Y-combinator to add memoization to recursive calls to the same function.

I’ve thought about this a lot, and it really doesn’t seem to work out. We can create a non-memoizing Stream as per your first point and memoize it linearly after the fact, by mapping it with a function that converts it from the Union case of a thunk to an evaluated type; that works fine for some cases, but not when we need true recursion.

Storing things in any kind of Elm structure other than as a Union, as I mention above, costs O(log n) per access (O(n log n) overall) and will be terribly slow; and the Y-combinator doesn’t memoize, it just enables the recursion we already have, at a cost in execution speed and without any benefit.

Even if we could make it work with a combination of the Y-combinator and Unions, why jump through hoops and still not end up with the execution speed we want, when the memoize function in JavaScript (or any language) takes less than 10 lines of code, and we have established that it isn’t detrimental to memory management?

However, I’ll look into the Y-combinator and Unions solution and benchmark it if it works out.


#25

Here’s an API for a memoizing LazyList:

get : Int -> LazyList a -> ( Maybe a, LazyList a )

Like Random.step, it gives you back not only the answer, but also a new collection (which includes the cached results of the evaluated thunks) to use from then on instead of the previous collection. If you call the same function on the new collection with the same argument, it fetches the answer from the cache rather than reevaluating thunks.

As a teacher, I like what this API can teach on a conceptual level. Like Random.step, it demonstrates explicitly how state can evolve over a multi-step process without having to mutate anything in between! Rather than the caching functionality being hidden behind the curtain, it’s right there in the API so people can see what it’s doing.
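A minimal sketch of one way this API could be represented (this representation and these names are illustrative, not from any published package): an Array caching the values forced so far, plus a thunked tail that is only evaluated on demand:

import Array exposing (Array)

type Tail a
    = Tail (() -> Maybe ( a, Tail a ))

type LazyList a
    = LazyList (Array a) (Tail a)

get : Int -> LazyList a -> ( Maybe a, LazyList a )
get index ((LazyList cache _) as list) =
    if index < 0 then
        ( Nothing, list )

    else
        case Array.get index cache of
            Just value ->
                -- Cache hit: nothing is evaluated; the list is unchanged.
                ( Just value, list )

            Nothing ->
                forceTo index list

-- Force thunks one element at a time, growing the cache, until the
-- requested index is reached or the list runs out.
forceTo : Int -> LazyList a -> ( Maybe a, LazyList a )
forceTo index (LazyList cache (Tail thunk)) =
    case thunk () of
        Nothing ->
            ( Nothing, LazyList cache (Tail thunk) )

        Just ( value, rest ) ->
            let
                cache2 =
                    Array.push value cache
            in
            if Array.length cache2 > index then
                ( Just value, LazyList cache2 rest )

            else
                forceTo index (LazyList cache2 rest)

Because the cache travels inside the returned LazyList, later lookups through the new value are pure reads: the evolving state is explicit in the API rather than hidden behind mutation.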


#26

@rtfeldman, you are thinking along the right lines, but you are missing something: this isn’t generally applicable to all cases. As I said, one can memoize this way after the LazyList is formulated, but not when the LazyList is formed recursively, as in my earlier “plus1” examples; there it degrades into the same case as the Stream type.

I’ve done some more work on this and proved conclusively that one can’t produce a true memoizing Lazy without mutation, even if that mutation is hidden inside a function as in the former Lazy library. Here is a Test Bed in Ellie that you can use to play with different schemes (other than using Native JavaScript, of course).

To use it, set the clock speed to that of your CPU (the boost rate, since Elm runs single-threaded) and set the limit on the number of loops (don’t go higher than a few thousand for the recursively defined “plus1” versions, or be prepared to wait a long time and/or hit a stack overflow, as the machine recursively evaluates all the way back to the first value for each new value). Then un-comment the version of the calculation() you want to test, commenting out the others.

I reformulated the sequence types to be more as generally expected, with the head of the list wrapped in the deferral/laziness. For example, the Stream:

type Stream a
    = ConsStream (() -> { head : a, tail : Stream a })

but that makes no difference to looping speed; it just moves the laziness to enclose the head rather than the tail.
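For reference, here is a hypothetical reconstruction of the kind of recursive definition being measured (the actual test-bed code lives in the Ellie above), using the Stream type just shown:

-- Without sharing, forcing element n re-forces elements 0..n-1 of
-- numbers each time, which produces the worse-than-linear cost
-- reported below.
numbers : Stream Int
numbers =
    ConsStream (\() -> { head = 0, tail = mapStream ((+) 1) numbers })

mapStream : (a -> b) -> Stream a -> Stream b
mapStream f (ConsStream thunk) =
    ConsStream
        (\() ->
            let
                r =
                    thunk ()
            in
            { head = f r.head, tail = mapStream f r.tail }
        )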

The following examples are supplied:

  1. a simple loop, so one can measure maximum speed: about 6.35 CPU cycles per loop on my machine.
  2. a non-recursive numbers(), running at about 320 cycles per loop for 10 million loops on my machine, with roughly linear performance over the range (slightly variable depending on how much garbage collection is triggered, and when).
  3. a recursive version of numbers using Stream, running in worse-than-linear time: about 250,000 cycles per loop for a range of 1000, and over twice that per loop for a range of 2000 (so about four times the total execution time).
  4. an attempt at a LazyList implementation using the ideas expressed here, but with no benefit when defined recursively, and no point when applied to a non-recursive algorithm, which doesn’t need laziness to run efficiently.

This proves that a non-memoizing Stream alone can’t handle cases that must be defined recursively, and that there is no substitute for a mutating Lazy (although the mutation may be hidden).


#27

@evancz, with no reply I’m worried this thread will close without a proper answer. Quoting your earlier post (post:6, topic:1876):
Having a lazy library allows you to create code like this:

type Stream a = Cons a (Lazy (Stream a))

ones : Stream Int
ones =
  Cons 1 (lazy (\() -> ones))

When the lazy part is forced, the internals switch from an unevaluated function to a pointer back to the initial allocation. This is cyclic data!

If you can create a value like this in a let expression, you can dynamically create cyclic data that needs to get GC’d at some point. This means a naive ARC implementation is not viable anymore.

So given the actual details about cycles (only top-level and statically known) I think the GC considerations are still valid. If I am missing something, I’d like to understand it via a small example.
(end of quote)

With respect, I could be wrong myself, but I believe your reason for rejecting Lazy over GC concerns is mistaken, as follows:

To rewrite your example above without using Lazy:

type Stream a = Cons a (() -> Stream a)

ones : Stream Int
ones =
  Cons 1 (\() -> ones)

  1. This is identical to your example, except that it doesn’t memoize on evaluation.
  2. As of Elm 0.19, neither version can be used in a local let .. in block, so that reason for rejecting Lazy doesn’t apply; and even if they could be used there, the same problem would arise with or without Lazy.
  3. As you said in the quoted text above, cyclic data in the global scope is allowed because it doesn’t cause a problem; that’s true either way.
  4. If your Lazy version of “ones” is used at the global level, it causes neither a memory-management problem nor a memory leak: although the reference occurs in two places (at the global level and buried in the closure), that is as high as the reference count goes, apart from external references to “ones”.
  5. A non-cyclic structure using Lazy with a global (or permanent local) reference can cause all of the memoized memory to be retained, since the head of the chain is referenced globally, but this is true of any memoizing data structure such as Dict or Array; such definitions should really be initialized through a function call so the reference can be forgotten/deallocated when no longer needed. Thus, this isn’t a consideration either.

So given that Lazy doesn’t cause a problem with (reference-counted) GC, what then is your objection? That you or NoRedInk don’t currently have an application for it doesn’t seem a good argument to me, when we are talking about only 10 lines of JavaScript code.