Everything in Elm is immutable or is it

Am I the only one who has considered the consequences of the Tail Call Optimization (TCO) implemented so well so many years ago (at least two years :grinning: ) on which we are so dependent? As for most languages that transpile to a language that (didn’t/doesn’t) directly support TCO, Elm implements this as an “under-the-covers” conversion to a simple JavaScript loop, and of course only works for tail call recursion within the same function (which covers the majority of the use cases).

However, that implementation means the arguments of the recursive function call are internally mutable, and it turns out that it is quite simple to reveal that mutation externally! Consider the following example run on elm repl (version 0.19):

let loop n list = if n <= 0 then list else loop (n - 1) ((\() -> n) :: list) in List.map (\f -> f()) <| loop 3 []

which produces [0,0,0] : List number indicating that the captured n value has been mutated to the final value of zero for the three closures. If n were not mutated, the code should produce [1,2,3] : List number.

It is easy to obtain the desired output by the addition of a let just before the creation of the closure in the “loop” as follows:

let loop n list = if n <= 0 then list else loop (n - 1) ((let nn = n in \() -> nn) :: list) in List.map (\f -> f()) <| loop 3 []

where the internal let creates a new immutable binding for the n value (copy on create in the case of a primitive number as here, but it will also work for reference heap values as it will capture the reference).

The problem is the capturing of the binding in a closure, as a list made using the value n directly works as expected. The fix is likely to create the required let invisibly when the captured value comes from a TCO. As currently Safari is the only major browser that implements JavaScript TCO, asking JavaScript to handle this is still not an option.

I’m surprised that no one has caught this before now.

The question is where to file the issue?

2 Likes

Looks like a compiler bug to me. The compiler should have auto-created the closed-over nn binding.

I made an Ellie illustrating it: https://ellie-app.com/3txnNTzHFFNa1

Makes sense as a new issue at https://github.com/elm/compiler/issues

Evan likes in-line code, not links to Ellie.

Below is the generated JS for both versions (I named the first one foo and the second bar). Of special interest is that you’ve lost the TCO in the second version.

var author$project$Main$foo = function () {
	var loop = F2(
		function (n, list) {
			loop:
			while (true) {
				if (n <= 0) {
					return list;
				} else {
					var $temp$n = n - 1,
						$temp$list = A2(
						elm$core$List$cons,
						function (_n0) {
							return n;
						},
						list);
					n = $temp$n;
					list = $temp$list;
					continue loop;
				}
			}
		});
	return A2(
		elm$core$List$map,
		function (f) {
			return f(_Utils_Tuple0);
		},
		A2(loop, 3, _List_Nil));
}();

var author$project$Main$bar = function () {
	var loop = F2(
		function (n, list) {
			return (n <= 0) ? list : A2(
				loop,
				n - 1,
				A2(
					elm$core$List$cons,
					function () {
						var nn = n;
						return function (_n0) {
							return nn;
						};
					}(),
					list));
		});
	return A2(
		elm$core$List$map,
		function (f) {
			return f(_Utils_Tuple0);
		},
		A2(loop, 3, _List_Nil));
}();
1 Like

@billstclair, thanks for looking at this and confirming the problem; in fact I think you’ve identified a second bug in “that you’ve lost the TCO in the second version” as it turns out that the only reason the workaround makes “bar” work is by breaking TCO; the alternate of using the let outside the definition of the closure doesn’t work and the problem of the mutability persists.

The fix for the first bug of the mutation of the loop arguments is easy: just change the declaration of temporary variables to let (block scoped) variables instead of using var (function scoped) and redefining what the new let values preserve, as it would be better to preserve the old values treating them as immutable with the let rather than the new values as for the continuing loop as in the following fixed JavaScript for “foo”:

var author$project$Main$foo = function () {
	var loop = F2(
		function (n, list) {
			loop:
			while (true) {
				if (n <= 0) {
					return list;
				} else {
					let	$temp$n = n,
						$temp$list = list;
					n = $temp$n - 1;
					list = A2(
						elm$core$List$cons,
						function (_n0) {
							return $temp$n;
						},
						$temp$list);
					continue loop;
				}
			}
		});
	return A2(
		elm$core$List$map,
		function (f) {
			return f(_Utils_Tuple0);
		},
		A2(loop, 3, _List_Nil));
}();

I’ve tested this and it works as expected.

For the second bug of eliminating the TCO when there is a Elm let just before (staged inside) the production of the closure, it isn’t so clear what should be done without digging into how TCO determines when it can be applied, but in this case there is no reason that TCO can’t be applied, especially with the use of JavaScript let and the defining of the temporary variables to preserve the “immutable” state.

The second is actually the more serious bug as it leads to building of stack (even to overflow) in cases where it isn’t necessary and would be unexpected.

2 Likes

I’ve posted an issue as follows: Compiler Issue #1813.

2 Likes

Interesting stuff.
I’ve noticed Elm’s generated JS generally uses very old-school ES3 syntax. I assume it’s to maximise browser compatibility. There are no other instances of let, only var with function scope. So to fit better with the rest of the codebase it might be an idea to replace the let with an immediately invoked function expression.
Although if I’m right about this then I guess it will come up on the GitHub issue!

It’s true that my suggested use of let to fix the problem would be the only use of let in the generated JavaScript, but as all major browsers (even Internet Explorer) now support this use of let why not? ES3 is getting pretty long in the tooth. Especially as using let is likely to have better performance than an immediately invoked function expression. Other than not building stack, it seems a shame to give up one of the advantages of TCO in better performance.

This topic was automatically closed 10 days after the last reply. New replies are no longer allowed.