Synopsis: Although I partially agree with Evan that the Lazy library is often misused and unnecessary for many applications, I disagree that it should be removed entirely from availability for those who truly need it. I will discuss why I think the reasons given for this deprecation aren’t valid, and also some applications (at least a few that come quickly to mind) that can’t be implemented efficiently without this library, as follows:
Given Reasons for Deprecation
The reasons that Evan gives for this deprecation boil down to the following two concise statements:
- It is overkill for all the scenarios he has seen in Elm.
- It allows the creation of cyclic data, significantly complicating Garbage Collection (GC).
Consideration of the Reasons for Deprecation
I will address point 2 first. Evan says the following in the library’s ReadMe.md file:
With laziness you can create a list like
ones = 1 :: ones
that refers to itself. Without laziness, there is no way to create cyclic data in Elm. That means we can use a naive reference counting approach to collect garbage if we wanted.
To this I say: how does the removal of the Lazy library prevent that definition of cyclic data? It is true that with the Lazy library we can write things like
ones = lazy <| \() -> 1 :: Lazy.force ones
which will cause a stack overflow if one actually tries to use it; it may also cause a memory leak if reference-counted allocations were used, in that every stack element refers to a “Lazy number” structure on the heap. However, without the Lazy library we can still write the same definition using plain deferred execution as
ones = \() -> 1 :: ones ()
with exactly the same problems; in fact, every invocation of the (single) ones function creates a function invocation object, or a closure if free variables are referenced in the definition. Thus, I don’t see how this can possibly be a valid rationale for removing the Lazy library: exactly the same problem persists whether the Lazy library exists or not; it is, in effect, a version of the “halting problem”.
Evan’s point 2 is thus out of consideration.
Now to address point 1, whether Lazy is needed or not:
To summarize what Evan said in the ReadMe file: the difference between deferred execution, as in
a = \() -> expensiveCalculation x
then using a () when one actually needs the value, and laziness, as in
a = lazy <| \() -> expensiveCalculation x
then using Lazy.force a when one needs the value, is that with the first the calculation has to be run every time the value is needed, whereas with the second the calculation is run only the first time the value is needed and a saved result of the calculation is provided thereafter, which we call “memoization”.
It is true that sometimes a deferred result is needed only once, in which case lazy is not needed.
It is a further fact that one can sometimes find other ways of doing memoization, such as storing single values in a memoization type, or storing many calculated values in an Array or a Dict; however, I will show applications where using Lazy for many values gives O(n) performance while the use of an (immutable) Array or Dict is O(n log n) or slower, with a quite high constant execution overhead, so these alternatives aren’t really acceptable.
It is also a fact that the use of Lazy is slower than simple deferred execution (even more so because most JavaScript engines don’t handle creating a function from within a function very well, which Elm requires so that the necessary internal mutation is not perceptible from the outside). My point, however, is that there are applications that can’t easily be implemented efficiently without laziness, to the point that the constant execution cost of using it is very much less than the algorithmic cost of not using it.
I would not dispute the removal of the Lazy library if its only use were for single calculations at a time, as I agree that alternate means can be found; however, the class of problems that works with a sequence often requires the memoization of the sequence’s elements that laziness provides.
Examples Requiring the Use of Lazy
First, let’s define some list/sequence types to be used in the further examples:
For DeferredList executions:
type alias Deferred a = () -> a
type alias DeferredPairElement a = { head : a, tail : DeferredList a }
type DeferredList a = ConsDeferred (Deferred (DeferredPairElement a)) | EmptyDeferredList

nthDeferredListValue : Int -> DeferredList a -> Maybe a
nthDeferredListValue n stream =
    case stream of
        EmptyDeferredList -> Nothing
        ConsDeferred thunk ->
            let pair = thunk () in
            if n <= 0 then Just pair.head
            else nthDeferredListValue (n - 1) pair.tail
For LazyList executions:
-- defined by Lazy...
lazy : (() -> a) -> Lazy a
force : Lazy a -> a -- where the wrapped "thunk" is only executed once...

type alias LazyPairElement a = { head : a, tail : LazyList a }
type LazyList a = ConsLazy (Lazy (LazyPairElement a)) | EmptyLazyList

nthLazyListValue : Int -> LazyList a -> Maybe a
nthLazyListValue n stream =
    case stream of
        EmptyLazyList -> Nothing
        ConsLazy thunk ->
            let pair = Lazy.force thunk in
            if n <= 0 then Just pair.head
            else nthLazyListValue (n - 1) pair.tail
Note how little difference there is between the two.
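For instance (a small usage sketch of my own, not from the original post), an infinite deferred list of the natural numbers can be built and indexed with the definitions above:

```elm
naturals : DeferredList Int
naturals =
    let
        go n =
            ConsDeferred <| \() -> { head = n, tail = go (n + 1) }
    in
    go 0

-- nthDeferredListValue 5 naturals == Just 5
-- nthDeferredListValue 3 EmptyDeferredList == Nothing
```

Only as much of the list as is demanded ever gets evaluated, since each tail sits behind a thunk.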
Now, I might use the following naive implementation of the Fibonacci sequence as a justification for using Lazy:
fibs () =
    let
        fib n =
            if n < 2 then n
            else fib (n - 1) + fib (n - 2)
        fbs n =
            ConsDeferred <| \() ->
                { head = fib n, tail = fbs (n + 1) }
    in fbs 0
Note that this is a perfectly legitimate use of infinite recursion in that, due to deferral/laziness, the tail calculation is only done when the next value is requested.
The performance of the above absolutely stinks, taking seconds just to calculate the first 35 numbers in the sequence, for the obvious reason that it repeatedly recalculates all of the previous values in the sequence for each new value! Simply substituting ConsLazy <| lazy <| ... for ConsDeferred and using the alternate nthLazyListValue instead of nthDeferredListValue to find the nth value results in negligible time to find the 100th value, at which point the size already exceeds the number range (but the timing is still valid).
However, that is hardly fair, as the function should really be written as follows:
fibs () =
    let
        fbs first second =
            ConsDeferred <| \() ->
                { head = first, tail = fbs second (first + second) }
    in fbs 0 1
In which case it doesn’t need laziness in order to calculate hundreds of Fibonacci numbers in an instant. That was an example of a justification of Lazy by an idiot.
Following is an example that is not by an idiot: the sequence of Hamming, or 5-smooth, numbers.
Translation of the following Haskell code:
hamming = 1 : foldr u [] [2,3,5] where
u n s = -- fix (merge s . map (n*) . (1:))
r where
r = merge s (map (n*) (1:r))
merge [] b = b
merge a@(x:xs) b@(y:ys) | x < y = x : merge xs b
| otherwise = y : merge a ys
as the following Elm code using DeferredList instead of LazyList:
hammings () =
    let
        merge xs ys =
            case xs of
                EmptyDeferredList -> ys
                ConsDeferred thunkxs ->
                    let pairxs = thunkxs () in
                    case ys of
                        EmptyDeferredList -> xs
                        ConsDeferred thunkys ->
                            ConsDeferred <| \() ->
                                let pairys = thunkys () in
                                if pairxs.head < pairys.head then
                                    { head = pairxs.head, tail = merge pairxs.tail ys }
                                else
                                    { head = pairys.head, tail = merge xs pairys.tail }
        -- does the same as map ((*) scale) stream, but faster as it is more specialized
        smult scale stream =
            let
                smlt strm =
                    case strm of
                        EmptyDeferredList -> strm
                        ConsDeferred thunk ->
                            ConsDeferred <| \() ->
                                let pair = thunk () in
                                { head = scale * pair.head, tail = smlt pair.tail }
            in smlt stream
        u n stream =
            let
                dfrstrm () =
                    merge
                        stream
                        (smult n (ConsDeferred <| \() -> { head = 1, tail = dfrstrm () }))
            in dfrstrm ()
    in ConsDeferred <| \() ->
        { head = 1, tail = List.foldl u EmptyDeferredList [ 5, 3, 2 ] }
Now, in this case it takes seconds to calculate just the first 1691 Hamming numbers (so as not to overflow the 32-bit number range) using DeferredList, whereas it takes only an instant using LazyList. Although one can do this in other ways not requiring Lazy if only the single value of the nth Hamming number is required, there is no better O(n)-time algorithm for Elm if one requires the complete sequence, such as one would use when counting the number of Hamming numbers up to a limit.
Another example requiring memoization is using the Sieve of Eratosthenes to determine the sequence of prime numbers: the fastest way to do this is a page-segmented sieve, where one must scan across the base primes for each segment page; if the sequence of base primes were not memoized, it would need to be re-determined for every segment-page pass. The code to do this would make this post very long, and it would just prove more of the same.
I’m sure we can find other sequences that need laziness if we put our minds to it.
Further Considerations
The old ReadMe file further stated that using laziness is dangerous in that one might create chains of laziness (as often happens accidentally in Haskell) such that the actual calculations don’t get unwrapped until the outer lazy is evaluated. My answer is that Haskell is different: there, laziness (or rather non-strictness) is the default for everything unless the programmer manages to hint to the “strictness analyzer” that laziness isn’t necessary, whereas in Elm, OCaml, F#, etc., strictness is the default and one must explicitly add laziness where it is deemed beneficial. Why would a programmer explicitly stack laziness if it were a detriment? Explicitly stacking laziness, as in a LazyList (or a DeferredList for that matter), is a choice made for a perceived benefit; it also isn’t what the ReadMe was referencing, which was stacked laziness for a single outcome result, and that too is impossible without explicitly writing it that way.
Surely the Elm authors don’t want to treat their users as idiots and control what they are and are not able to do using legitimate Elm constructs? Every other functional language has the concept of memoized laziness, used especially in the creation and use of LazyLists; why not Elm, when there ARE legitimate uses?
Summary: For the above reasons, I propose that the Lazy library not be dropped, but rather that the documentation be augmented to better explain, with examples, how not to misuse it and when simple deferred execution would be the better choice.
If it is felt that the Lazy library is apt to be mishandled by less astute programmers, why not include a very good LazyList implementation in the official (not community) libraries, with the “dangerous” Lazy concept wrapped in such a way as to be less subject to abuse, along with a warning that DeferredList (which could also be an official library) is more appropriate in many cases, for given reasons and with good examples?