WebAssembly compiler update

I’ve posted here a few times before about my project to compile Elm to WebAssembly.

The project consists of two GitHub repos, one for the compiler and one for the core libraries.

None of this is official. I’m not part of the core team and, as far as I know, they have no plans to move to WebAssembly any time soon. This is a hobby project driven by my own curiosity.

Summary of previous posts

In an earlier post, I described the custom garbage collector and the parts of the core libraries I’d ported to Wasm.

In my last post I described the system architecture. The Elm runtime remains in JavaScript because WebAssembly doesn’t have Web APIs yet. The Wasm app talks to the runtime through a JS wrapper. I also showed a demo of a very basic working app. It was “hand compiled” as I didn’t have a working compiler yet.

I’m using C as an intermediate language. Everyone asks “why not Rust?”! Well, Rust is great for preventing errors in handwritten code but I found it an inappropriate target for automatically generated code. More details here

Latest news

My latest demo is actually fully compiled code!

It’s a WebAssembly port of Evan’s TodoMVC example from a few years ago (here’s his original repo)

Compiler changes

The forked compiler accepts --output elm.c as a command-line option as well as --output elm.js and --output elm.html. Once I have the C file, I use Emscripten to further compile it to WebAssembly. There are a few build steps that I coordinate using GNU make.

I ran into a few challenges with type information. The compiler has several stages and, to limit the scope, I only worked on the last one, code generation. But all type information has been dropped from the AST by then, which causes the following problems.

  • Currently it’s unsafe to use a Float parameter in an app-level Msg type. I have no way to tell Int from Float when passing messages from the JS runtime to the Wasm app.

  • The Time module doesn’t work because it uses Int for timestamps. Realistic values require at least 42 bits but I’m using 32 bits. Some low-level details work out nicely that way, because Wasm pointers are 32 bits, and the Json and Bitwise libraries require 32-bit integers as well.

  • I need to be able to distinguish custom types from tuples and lists. I’m using runtime type detection, but I’d prefer not to.
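The first two points can be seen from the JavaScript side with a small sketch. This is not taken from the compiler or the kernel code; the sample date and the helper name `bitsForSignedInt` are made up for illustration.

```javascript
// Int vs Float: JS has a single Number type, so a Float that happens to hold
// a whole number is indistinguishable from an Int once it reaches the runtime.
const fromIntField = 1;     // imagine this came from an Elm Int
const fromFloatField = 1.0; // imagine this came from an Elm Float
console.log(Object.is(fromIntField, fromFloatField)); // true
console.log(Number.isInteger(fromFloatField));        // true

// Timestamps: count the bits a signed integer needs to hold a non-zero value.
// (Hypothetical helper, for illustration only.)
function bitsForSignedInt(n) {
  return Math.floor(Math.log2(Math.abs(n))) + 2; // magnitude bits + sign bit
}

// Milliseconds since the epoch for 1 January 2020 (UTC).
const sampleMillis = Date.UTC(2020, 0, 1);
console.log(bitsForSignedInt(sampleMillis)); // 42

// `| 0` truncates to a signed 32-bit integer, which loses the high bits.
console.log((sampleMillis | 0) === sampleMillis); // false
```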

More detail here: https://github.com/brian-carroll/elm-compiler#architecture-challenges

Development Status

So is that it? Is it all working? Can I use it in production right now? Is it really fast? OMG!

Nope! Sorry!

I’m still working through lots of implementation issues. For example, I have not yet managed to get Richard Feldman’s elm-spa-example working. It’s a great test case because it’s complex enough that if I have any bugs, it’s bound to show them up!

I haven’t done any performance work yet. Before I can focus on that, I need to debug it and sort out some issues with the architecture (see “current focus” below).

Current focus

Since the runtime is in JS and the app is in Wasm, most of my current work is on the interface between the two.

Two of the topics I’m thinking about:

Some of the objects passed from the JS runtime to the app are unserialisable. For example, DOM events are not serialisable because they contain cyclical references. It’s all to do with how the Json library is implemented. I have something that works most of the time! But I’m working on something more reliable.
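As a rough illustration of why cyclical references get in the way of serialisation, here is a minimal sketch. It uses plain objects as a stand-in for a DOM event (the object shape is hypothetical), and `JSON.stringify` as a stand-in for the serialisation step; the project’s actual mechanism may differ.

```javascript
// A plain-object stand-in for a DOM event (hypothetical shape): the event
// points at its target, and the target points back at the event.
const target = { tagName: 'INPUT' };
const event = { type: 'input', target: target };
target.lastEvent = event; // this back-reference creates the cycle

// Attempt to serialise a value, reporting failure instead of throwing.
function trySerialise(value) {
  try {
    return { ok: true, json: JSON.stringify(value) };
  } catch (err) {
    return { ok: false, error: err.name };
  }
}

console.log(trySerialise({ type: 'input' }).ok); // true: no cycle
console.log(trySerialise(event).ok);             // false: cycle detected
```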

Currently the app’s Model is stored in JS but the update function is in Wasm. That means the model has to get passed from JS to Wasm and back again on every update cycle, getting serialised and deserialised along the way. The only reason it works this way is that it was quicker to get up and running, because I didn’t need to change anything in the JS runtime.
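The round trip described above can be sketched roughly as follows. Everything here is an assumption made for illustration: `fakeWasmUpdate` stands in for the compiled `update` function, and JSON over UTF-8 bytes stands in for whatever encoding the project really uses.

```javascript
const encoder = new TextEncoder();
const decoder = new TextDecoder();

// Hypothetical stand-in for the Wasm side: it only sees bytes, so it has to
// decode its inputs and encode its result.
function fakeWasmUpdate(msgBytes, modelBytes) {
  const msg = JSON.parse(decoder.decode(msgBytes));
  const model = JSON.parse(decoder.decode(modelBytes));
  const nextModel = { ...model, count: model.count + msg.by };
  return encoder.encode(JSON.stringify(nextModel));
}

// JS side: the model lives here, so every update pays for a full
// serialise -> call -> deserialise cycle.
function updateCycle(msg, model) {
  const msgBytes = encoder.encode(JSON.stringify(msg));
  const modelBytes = encoder.encode(JSON.stringify(model));
  const resultBytes = fakeWasmUpdate(msgBytes, modelBytes);
  return JSON.parse(decoder.decode(resultBytes));
}

let model = { count: 1 };
model = updateCycle({ by: 2 }, model);
console.log(model.count); // 3
```

The cost of each cycle grows with the size of the model, which is why keeping the model on one side of the boundary is attractive.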

String encoding

The original post suggesting this project specifically mentions string encoding, and UTF-8 in particular. And there was some discussion of this in my last post. I suggested that UTF-16 might have advantages, due to better compatibility with JS and most of the browser APIs.

I did some benchmarking on both encodings, to get an idea of the performance implications.

There’s not much performance difference. Based on the results, I initially wanted to go with UTF-8. But then I realised that every time I pick an app to test the compiler on, I would also have to migrate its Elm code to use a new String library. Otherwise things like URL parsing might break, and who knows what else? It just makes things too complicated. So I’m sticking with UTF-16 for this project. UTF-8 is a separate project.
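For anyone unfamiliar with the two encodings, here is a small sketch of how the same strings differ in size. It is not the benchmark mentioned above; the helper names and sample strings are made up. The point is that JS strings are already UTF-16, so a UTF-8 app has to transcode at the JS boundary.

```javascript
// JS strings are sequences of UTF-16 code units, two bytes each.
function utf16ByteLength(str) {
  return str.length * 2;
}

// TextEncoder always produces UTF-8.
function utf8ByteLength(str) {
  return new TextEncoder().encode(str).length;
}

// ASCII, Latin-1 accent, CJK, and an emoji outside the Basic Multilingual Plane.
const samples = ['hello', 'h\u00e9llo', '\u65e5\u672c\u8a9e', '\u{1F600}'];
for (const sample of samples) {
  console.log(sample, 'utf8:', utf8ByteLength(sample), 'utf16:', utf16ByteLength(sample));
}
```

Neither encoding is smaller across the board: ASCII-heavy text is more compact in UTF-8, while CJK text is more compact in UTF-16.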

Asynchronous initialisation

WebAssembly modules are normally compiled asynchronously once loaded into the browser. We have to wait until the compilation is finished before we can call Elm.Main.init.

I created a new function Elm.onReady to help with this. You just put your app’s normal setup code in a callback, and Elm.onReady will execute it at the right time.

For my WebAssembly version of the TodoMVC example, it looks like this:

<script type="text/javascript">
  Elm.onReady(function () {
    var storedState = localStorage.getItem('elm-todo-save');
    var startingState = storedState ? JSON.parse(storedState) : null;
    var app = Elm.Main.init({ flags: startingState });
    app.ports.setStorage.subscribe(function (state) {
      localStorage.setItem('elm-todo-save', JSON.stringify(state));
    });
  });
</script>
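For the curious, here is one way a function like Elm.onReady could work internally. This is a guess at the general technique (queue callbacks until compilation finishes), not the project’s actual code; `createReadyQueue` and `markReady` are hypothetical names, and an empty Wasm module stands in for the compiled app.

```javascript
// Hypothetical sketch: hold callbacks until the Wasm module is ready,
// then run them in order. Callbacks added later run immediately.
function createReadyQueue() {
  let ready = false;
  const pending = [];
  return {
    onReady(callback) {
      if (ready) callback();
      else pending.push(callback);
    },
    markReady() {
      ready = true;
      while (pending.length > 0) pending.shift()();
    },
  };
}

// The smallest valid Wasm binary: the "\0asm" magic number plus version 1.
const emptyModule = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);

const queue = createReadyQueue();
queue.onReady(function () {
  console.log('module compiled, safe to call init now');
});

// WebAssembly.instantiate compiles asynchronously and returns a Promise.
WebAssembly.instantiate(emptyModule).then(queue.markReady);
```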

Summary

  • We can now compile some Elm apps to WebAssembly, including the TodoMVC demo.

  • There are some architecture issues to work out, there’s no performance work done yet, and there’s lots of kernel code unwritten.

  • Wasm enables UTF-8, but it’s a separate project.

  • There are some changes in the setup API due to async compilation.

Wow, this is pretty cool. :slight_smile: This must have taken a lot of thought and time :+1:

Very cool. Excited to see how this evolves! Thanks for sharing.

WASM is definitely the future for both portability and performance. Great job on paving the way for a bright future for Elm! I can already imagine Elm working outside the browser on desktops, servers, and embedded devices through the WebAssembly System Interface, and talking directly, efficiently and safely with other languages like Rust through interface types. Even though those goals are lofty, I’m excited about the great potential of a WASM compiler for Elm. Great work!

Future for performance? I doubt anyone would notice to be honest, except in some corner cases at best.

It depends on your application. If you do a lot of number crunching and/or have huge data structures, the performance benefit is significant. In my experience, WASM generally speeds things up by at least 20%, if not more. A recent project I worked on involved emulating a 32-bit computer in the browser. Elm’s array and overall architecture were just not fast enough, so I rewrote the core logic in Rust and saw significant visible lag reduced to nearly none. That being said, if you pass a lot of data between Rust and JS, the serialization and deserialization may drag down WASM performance. However, in most cases this issue can be addressed by thoughtful interface design. Another note: JS needs to be parsed and garbage collected, but WASM comes in a binary format that only needs to be decoded (super fast compared to parsing) and needs no garbage collection. You can see more about WASM vs JS performance here.

Great progress on this! If at any point you need a test app to benchmark the performance, I would be really curious to see elm-physics compiled to WebAssembly. It has a lot of number crunching that could benefit from this!

I think the main challenge in making it work would be supporting immutable Arrays.

@Brian_Carroll I noticed you opened and closed a PR here. https://github.com/elm/compiler/pull/2113 What’s the reason you closed it? Seems you even talked to Evan about it.

Oh very good, yes, I’ve seen some of the things you posted about elm/physics before and it’s really cool!
We should talk when I get that far but, as I mentioned, I’m not there yet. I don’t expect to be able to do meaningful performance benchmarks until later this year at best. Just getting stuff to work at all is a big job.
I know Elm has some special syntax for graphics shaders but I never looked at it. Are you using that? That’s actually the one part of the Elm AST that I didn’t write any compiler code for, because I don’t understand it or know how to test it.

Hey, thanks for the nice comments and the enthusiasm!
I’m finding the same thing here that you did - what slows things down is passing data back and forth between Wasm and JS. So I’m working on the interface design.
At some point Wasm will get native browser APIs so we won’t have to call out to JS at all. But right now there’s a lot of that.
About GC, I see it a bit differently. It’s Rust that doesn’t need GC! But an Elm program must have a GC because although the language has syntax to construct new values, it has no syntax to destroy them. (Like Rust’s lifetimes, or C’s free function, etc.)
Here’s some documentation on the GC I wrote, which fits in 6kB of Wasm.

Looks promising, thank you for that!

If you are compiling Elm to C on the way to WASM does that mean that you are effectively making Elm portable to any environment that can run C?

Partly!
If somebody did want to make Elm portable to any environment that can run C, they could reuse a lot of what I’ve done. But they’d also need to build a new runtime that would make sense for the target platform. The Elm core team have said a few times that they think that’s the really hard part - designing core libraries in such a way that Elm would be just as nice for other environments as it is for web apps. If code is the easy part, then I’m doing the easy part!
