I’ve spent the last couple of weeks analyzing how the JavaScript that the Elm compiler outputs affects performance, and how it can be improved.
I’ve summarized all my findings here: https://dev.to/skinney/improving-elm-s-compiler-output-5e1h
In this post, you say whole program analysis is too expensive. What about programs like prepack? Can those be co-opted to do optimizations specific to the app?
You tell me; I know next to nothing about webpack.
Keep in mind that I mean in the context of a JIT. The JIT doesn’t have time to do whole program analysis, but external tools? Sure. Google Closure does a bunch of stuff based on whole program analysis.
Sorry, not webpack. Prepack. https://prepack.io/
I am just wondering about this as a way to make life easier for the JIT!
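For anyone who hasn’t seen it: Prepack does ahead-of-time partial evaluation of plain JS. Roughly this kind of rewrite (simplified from the example on their site, not actual tool output):

```js
// Input: initialization work that runs on every page load.
(function () {
  function hello() { return "hello"; }
  function world() { return "world"; }
  global.s = hello() + " " + world();
})();

// What a Prepack-style pass emits: the computation is folded away at build time.
// (function () {
//   global.s = "hello world";
// })();
```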
Wonderful findings! Those changes should definitely be pushed to the compiler, especially inlining. I keep telling people how awesome functional languages are because inlining is much more predictable.
Another way the output could be improved would be with compile-time evaluation. This has the potential to trim a lot of both computation and payload. I guess this is part of what prepack is trying to do, but the potential for this is way larger in Elm, since Elm knows more about the code.
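To illustrate (a hand-written sketch of what the compiler could emit, not actual Elm output; buildTable is a made-up stand-in for any pure, constant-input function):

```js
// Stand-in for any pure function applied to a constant argument.
function buildTable(n) {
  var t = [];
  for (var i = 0; i < n; i++) { t.push(i * i); }
  return t;
}

// Today: every top-level value is computed at program startup.
var $author$project$Main$table = buildTable(16); // runs on every startup

// With compile-time evaluation, the compiler could run the pure call once
// at build time and emit the result directly:
// var $author$project$Main$table = [0, 1, 4, 9, 16, 25, /* ... */ 225];
```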
Nice work! Some of these optimizations would be great to have, but some of them would also increase the JS output size, which seems to have been the more important concern for Elm so far. Compile-time evaluation would also be very nice given the semantics of Elm (any function applied to a constant input could be evaluated at compile time), but it also risks bloating the output size (if a function generates some large data structure, for example).
If output size is still the most important optimization parameter, we would probably need some way of measuring whether an optimization increases output size on a case-by-case basis, or at least some heuristics (only inline functions with a single call site, for example).
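For a concrete example of the size risk (the cons-cell shape is Elm 0.19’s actual runtime representation of lists; the rest is a sketch):

```js
// Demonstrate the size risk: serialize `List.range 0 999` the way the
// compiler would have to embed it. Elm represents lists as cons cells
// { $: 1, a: head, b: tail }, with { $: 0 } as the empty list.
function cons(head, tail) { return { $: 1, a: head, b: tail }; }

var xs = { $: 0 };
for (var i = 999; i >= 0; i--) { xs = cons(i, xs); }

var call = "A2($elm$core$List$range, 0, 999)"; // what the output contains today
var literal = JSON.stringify(xs);              // what pre-evaluation would embed

console.log(call.length);    // 32
console.log(literal.length); // ~20000 -- a big regression for a single value
```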
I think we could still wring more benefit from dead code elimination though, and some constant propagation at least. For example, if a library defines a custom type with three constructors, but your code only uses one or two of those, the compiler could remove the associated branches and the functions called from those branches. Or if a given function takes e.g. a Bool as an argument, but your code only ever calls it with True, the argument and all branches associated with it can be eliminated, and only the code called from the True branch remains.
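Here’s roughly what that would look like on the JS output (hand-written sketch with made-up names, not actual compiler output):

```js
// Stand-ins so this sketch runs outside the real Elm runtime:
function F2(fun) { return fun; } // the real F2 builds a wrapper object
var $author$project$Main$fancyView = function (m) { return "fancy: " + m; };
var $author$project$Main$plainView = function (m) { return "plain: " + m; };

// Before: `fancy` is True at every call site in the program.
var $author$project$Main$render = F2(function (fancy, model) {
  return fancy
    ? $author$project$Main$fancyView(model)
    : $author$project$Main$plainView(model);
});

// After constant propagation and dead-branch elimination, the argument and
// the False branch are gone, and plainView becomes dead code as well:
// var $author$project$Main$render = function (model) {
//   return $author$project$Main$fancyView(model);
// };
```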
It wouldn’t necessarily increase code size. Keep in mind that currently, every single function definition is wrapped in an F wrapper, and every function call is wrapped in an A call. Direct function calls eliminate the need for those wrappers, so in theory they will reduce the asset size.
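For reference, this is the shape of the output today versus direct calls (simplified; the stand-ins below are much cruder than the real runtime helpers):

```js
// Stand-ins for the real runtime helpers, so the sketch is self-contained:
function F2(fun) { return { f: fun }; }
function A2(wrapped, a, b) { return wrapped.f(a, b); }

// Today: definitions are wrapped in F2/F3/... and every saturated call
// goes through A2/A3/...:
var $author$project$Main$add = F2(function (a, b) { return a + b; });
var five = A2($author$project$Main$add, 2, 3);

// With direct function calls, both wrappers disappear at full-arity call sites:
// var $author$project$Main$add = function (a, b) { return a + b; };
// var five = $author$project$Main$add(2, 3);
```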
Making sure all custom types adhere to the same shape is likely to increase asset size, but it’s doubtful the increase would be meaningful. Combined with direct function calls, we might still see a net decrease for all apps.
Pre-calculation of functions can also reduce asset size; it’s just a matter of only doing pre-calculation when it results in less code.
And as you say, being able to remove unused branches might even reduce asset size to the point where any size increase from the above won’t matter.
Are those wrappers present at each call site? Because then it seems like it could be quite a win!
How would we measure the output size, though? We could measure the final output size and compare it to the non-optimized version, but then we wouldn’t know which optimization to turn on or off. Could we lower only some specific code to JS and measure its size compared to an optimized version of the same code? And is that even good enough if we ignore the effects of minification and compression?
Yes, they’re present at each call site.
This can be measured in the code gen, but it’s likely to increase compile time. As with all things, it’s a matter of trade-offs.
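Conceptually, the check can be as simple as comparing the two emitted strings. The real compiler is written in Haskell, so this JS sketch with made-up names just shows the shape of the trade-off:

```js
// Only pre-calculate a constant expression when the serialized result is
// not bigger than the call it replaces.
function choosePrecalculated(callJs, value) {
  const evaluatedJs = JSON.stringify(value); // the compile-time result, serialized
  return evaluatedJs.length <= callJs.length ? evaluatedJs : callJs;
}

console.log(choosePrecalculated("add(2, 3)", 5));
// "5" -- folding shrinks the output, so emit the value

console.log(choosePrecalculated("range(0, 99)", [...Array(100).keys()]));
// "range(0, 99)" -- folding would bloat the output, so keep the call
```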
I would be happy with that increase in compile time if it produces good results.
These are really great findings! I’ve tried to apply the proposed changes to elm-physics (the 125-box simulation) and got +4 FPS in Chrome and +10 FPS in Firefox!
The total bundle size is only a little bit smaller, because the unnecessary A2…An wrappers are removed. Another positive side effect of this change is improved profiling: I can now see the original function names instead of anonymous functions.
I believe that improving the performance-profiling experience would be as useful as the optimization per se.
To be honest, I’m a little frustrated at having to unfold so many A2 & co. functions to find something interesting; it’s almost like playing Minesweeper.
If someone has found some good tips for improving live performance profiling, I am all ears.
Maybe a --debug-performance option that slows things down but shows function names could be very useful.
The proposed optimization, if implemented, would improve both performance and profiling. Not sure why you’d want to slow things down!
You can manually edit the compiled code to achieve the same result, or use this scrappy script that I wrote to try this on elm-physics.
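(Not the actual script, but the core idea is just a couple of regex rewrites over the compiled output; something like the sketch below, which only gets away with regexes because Elm’s codegen is so regular:)

```js
// unwrap.js -- a sketch of the rewrite, NOT the real script. A regex pass
// like this is fragile and would break on partially applied functions.
const fs = require("fs");

let src = fs.readFileSync("elm.js", "utf8");

// F2(function (a, b) { ... })  =>  (function (a, b) { ... })
src = src.replace(/F2\(\s*function\s*\(/g, "(function (");

// A2(f, x, y)  =>  f(x, y), for simple identifier callees only.
src = src.replace(/A2\(\s*([$\w]+)\s*,/g, "$1(");

fs.writeFileSync("elm.opt.js", src);
```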
Sure, I should have said “even if it slows things down or increases the JavaScript size”. Of course, if there are only benefits, that’s even better.
Thank you very much for the script
Really nice analysis! How difficult would the codegen changes be?
I don’t think it would be too difficult, but it’s not trivial either, so it will take some time to do. And there are so many things that take time!