Repos with slow compile times wanted for research

I’ve been working on profiling builds and I need some more data points. If you have a public repo that takes a while to build, let me know.

I’ve been using a version of elm-make compiled with some RTS flags enabled (-A128M -n4m), and it has cut build times in half. I’ve tested with some open source projects and seen similar results.

There is little effect on compilation times for smaller, less interlinked files, but for files with a lot of imports it has made a large difference.

I will post a Dockerfile and test scripts once the testing is complete.

AntounK previously did some research in this area as well (https://github.com/elm-lang/elm-make/issues/159#issuecomment-318850631). That whole thread has a lot of good information.

I have a somewhat atypical example at robx/elm-unicode. It’s just 256 256-fold case statements, for converting Unicode code points to Char. Takes about half a minute to compile on my aging notebook.

Clearly this is not typical code, but maybe it’s a useful data point anyway? Compile time does seem to scale badly with the size of the case statements; a single 256*256 case took so long that I gave up waiting.

(And if anyone happens to know of an existing library that does this better (natively?), pointers appreciated!)

Thank you Rob! That repo is a great example. I wanted to test large case statements and their effect, and I was worried I would have to make my own :)

I published a repo that has a Dockerfile to test out a few compiler options (https://github.com/antew/elm-make-speed-tests)

These timings are from an underpowered Docker instance on my machine (2 CPUs, 2 GB RAM):

Original elm-make (0.18)
real    1m12.560s
user    1m5.210s
sys     0m27.640s

elm-make with options: -A128m -n4m
real    0m40.068s
user    0m40.220s
sys     0m4.990s

elm-make with options: -qg
real    0m42.252s
user    0m40.510s
sys     0m1.830s
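To put the three runs side by side, here is a minimal sketch that converts the `real` times above to seconds and computes the speedup relative to the original build. The helper name `parse_time` is my own, and only wall-clock time is compared.

```python
import re

def parse_time(text):
    """Convert a shell `time` value such as '1m12.560s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognized time format: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)

# Wall-clock ("real") times copied from the runs above.
real_times = {
    "original": "1m12.560s",
    "-A128m -n4m": "0m40.068s",
    "-qg": "0m42.252s",
}

baseline = parse_time(real_times["original"])

# Speedup factor of each flag set relative to the original elm-make.
speedups = {
    label: baseline / parse_time(value)
    for label, value in real_times.items()
    if label != "original"
}

for label, factor in speedups.items():
    print(f"{label}: {factor:.2f}x faster than the original build")
```

Both flag sets land at roughly 1.7x to 1.8x on this instance, with -A128m -n4m slightly ahead.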

I’m going to be a bit busy today, but I’ll see about getting binaries for linux and mac posted somewhere.

All the Dockerfile really does is recompile elm-make with the -rtsopts flag and then try out different sets of RTS flags for the build.
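For anyone who wants to try flag sets by hand, this is roughly how the command lines are put together. It is only a sketch: the entry point `src/Main.elm` and the output file are placeholders, and the helper `with_rts_flags` is hypothetical, not something taken from the test repo.

```python
import shlex

def with_rts_flags(command, rts_flags):
    """Append GHC runtime options to a command line.

    Arguments between +RTS and -RTS are consumed by the GHC runtime
    rather than by the program itself. The binary has to be linked
    with -rtsopts for options like -A or -qg to be accepted.
    """
    if not rts_flags:
        return list(command)
    return list(command) + ["+RTS", *rts_flags, "-RTS"]

# Placeholder entry point and output file; adjust for the repo under test.
base_command = ["elm-make", "src/Main.elm", "--output", "index.html"]

flag_sets = [
    [],                  # baseline, no runtime options
    ["-A128m", "-n4m"],  # larger allocation area, split into 4m chunks
    ["-qg"],             # parallel GC turned off
]

for flags in flag_sets:
    # Print each command so it can be pasted after `time` in a shell.
    print(shlex.join(with_rts_flags(base_command, flags)))
```

Each printed line can be prefixed with `time` in a shell to reproduce the kind of measurements shown above.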

I may very well be wrong, but it looks like that’s essentially Char.fromCode, which converts UTF-16 code units to Char. For dealing with actual Unicode code points and UTF-32, there is zwilias/elm-utf-tools, amongst others.

Hah, thank you! I wouldn’t have guessed. Judging by the documentation this is not what it should be used for, but it will do for now.

I spent some more time profiling builds today, and found that with the current build of elm-make, roughly 40% of the elapsed time is spent in garbage collection (this varies by repo; for this data I used elm-spa-example).

This first build uses the 0.18 elm-make from https://dl.bintray.com/elmlang/elm-platform/0.18.0/linux-x64.tar.gz, with the flags “+RTS -s -RTS” added to collect these statistics:

Starting run for repo: https://github.com/rtfeldman/elm-spa-example.git
Original elm-make 0.18
Success! Compiled 93 modules.
Successfully generated index.html
   9,922,025,592 bytes allocated in the heap
   2,175,916,768 bytes copied during GC
      10,988,024 bytes maximum residency (180 sample(s))
         939,136 bytes maximum slop
              29 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0     19003 colls, 19003 par   13.640s   7.232s     0.0004s    0.0105s
  Gen  1       180 colls,   179 par    3.215s   1.618s     0.0090s    0.0314s

  Parallel GC work balance: 23.65% (serial 0%, perfect 100%)

  TASKS: 6 (1 bound, 5 peak workers (5 total), using -N2)

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.001s  (  0.001s elapsed)
  MUT     time   14.685s  ( 12.258s elapsed)
  **GC      time   16.855s  (  8.850s elapsed)**
  EXIT    time    0.009s  (  0.011s elapsed)
  Total   time   31.555s  ( 21.120s elapsed)

  Alloc rate    675,641,834 bytes per MUT second

  Productivity  46.6% of total user, 69.6% of total elapsed

gc_alloc_block_sync: 811892
whitehole_spin: 0
gen[0].sync: 95
gen[1].sync: 52976

real     0m21.130s
user     0m21.070s
sys      0m10.480s
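The GC share can be read straight off the summary lines in that output. A small sketch that parses them and divides GC time by total time, for both CPU time and elapsed time (the regular expression is my own and only handles the `NAME time Xs ( Ys elapsed)` lines):

```python
import re

# Summary lines copied from the `+RTS -s` output of the original elm-make run.
summary = """
  INIT    time    0.001s  (  0.001s elapsed)
  MUT     time   14.685s  ( 12.258s elapsed)
  GC      time   16.855s  (  8.850s elapsed)
  EXIT    time    0.009s  (  0.011s elapsed)
  Total   time   31.555s  ( 21.120s elapsed)
"""

line_pattern = re.compile(
    r"^\s*(\w+)\s+time\s+([\d.]+)s\s+\(\s*([\d.]+)s elapsed\)", re.MULTILINE
)

# Map each phase name to (CPU seconds, elapsed seconds).
times = {
    name: (float(cpu), float(elapsed))
    for name, cpu, elapsed in line_pattern.findall(summary)
}

gc_cpu_share = times["GC"][0] / times["Total"][0]
gc_elapsed_share = times["GC"][1] / times["Total"][1]

print(f"GC share of CPU time:     {gc_cpu_share:.1%}")
print(f"GC share of elapsed time: {gc_elapsed_share:.1%}")
```

For this run that works out to about 42% of elapsed time and about 53% of CPU time, so the share depends on which clock you look at.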

This second run uses a larger allocation area for the garbage collector, with that area divided into larger chunks of memory so that the GC runs less often (-A128m -n8m divides the 128m allocation area into 8m chunks). The -n8m option only makes sense if you are running on multiple cores. For me this runs faster than the sysconfcpus -N 1 trick (or using -N1 through rtsopts).

elm-make with options: -A128m -n8m
Success! Compiled 93 modules.
Successfully generated index.html
   9,969,011,928 bytes allocated in the heap
     109,864,712 bytes copied during GC
       6,429,864 bytes maximum residency (16 sample(s))
         121,480 bytes maximum slop
             278 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0        46 colls,    46 par    1.075s   0.537s     0.0117s    0.0368s
  Gen  1        16 colls,    15 par    0.277s   0.139s     0.0087s    0.0184s

  Parallel GC work balance: 24.00% (serial 0%, perfect 100%)

  TASKS: 6 (1 bound, 5 peak workers (5 total), using -N2)

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.004s  (  0.004s elapsed)
  MUT     time    8.288s  (  8.271s elapsed)
  **GC      time    1.352s  (  0.676s elapsed)**
  EXIT    time    0.002s  (  0.002s elapsed)
  Total   time    9.653s  (  8.954s elapsed)

  Alloc rate    1,202,873,985 bytes per MUT second

  Productivity  86.0% of total user, 92.7% of total elapsed

gc_alloc_block_sync: 97391
whitehole_spin: 0
gen[0].sync: 0
gen[1].sync: 3096

real     0m9.097s
user     0m8.930s
sys      0m0.850s
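Putting the two elm-spa-example runs next to each other makes the effect easier to see. This sketch uses only numbers copied from the two outputs above; the dictionary keys are names I made up for the comparison.

```python
# Figures copied from the two `+RTS -s` outputs above.
runs = {
    "original": {
        "bytes_allocated": 9_922_025_592,
        "gen0_collections": 19_003,
        "gc_elapsed_s": 8.850,
        "total_elapsed_s": 21.120,
    },
    "-A128m -n8m": {
        "bytes_allocated": 9_969_011_928,
        "gen0_collections": 46,
        "gc_elapsed_s": 0.676,
        "total_elapsed_s": 8.954,
    },
}

def bytes_per_gen0_collection(run):
    """Average number of bytes allocated between two Gen 0 collections."""
    return run["bytes_allocated"] / run["gen0_collections"]

before, after = runs["original"], runs["-A128m -n8m"]

kib_before = bytes_per_gen0_collection(before) / 1024
mib_after = bytes_per_gen0_collection(after) / (1024 * 1024)
gc_reduction = before["gc_elapsed_s"] / after["gc_elapsed_s"]
overall_speedup = before["total_elapsed_s"] / after["total_elapsed_s"]

print(f"allocated per Gen 0 collection, original:    {kib_before:.0f} KiB")
print(f"allocated per Gen 0 collection, -A128m -n8m: {mib_after:.0f} MiB")
print(f"GC elapsed time reduced by a factor of {gc_reduction:.1f}")
print(f"overall elapsed speedup: {overall_speedup:.2f}x")
```

The total bytes allocated are nearly identical in both runs, so the compiler is doing the same work. What changes is how often the collector interrupts it: 19,003 Gen 0 collections versus 46.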