Incremental compilation, and hello Deno! | Gleam programming language

Gleam is a type safe and scalable language for the Erlang virtual machine and JavaScript runtimes. Today Gleam v0.26.0 has been released. Let's take a look at what's new.

Incremental compilation

A Gleam project is made of packages, typically a top level package and several dependency packages fetched by the package manager, and each package contains a collection of modules of Gleam code.

In the very early days of Gleam, when the compiler was run it would compile every module in every package in the project from scratch. This was highly wasteful, especially for the dependency packages, which would not have changed at all. To tackle this inefficiency, when the Gleam build tool was created it was made to compile dependency packages only once and reuse the compiled code for every following build, resulting in only the top level package being recompiled.

This has worked well for the last couple years, but now as more people are using Gleam it was time for an upgrade. Large projects such as those with a large amount of generated gRPC code were starting to take an irksome amount of time to compile. Gleam is all about fun and productivity, so this just won't do!

There are numerous ways we want to improve the performance of the (already very nimble) Gleam compiler, but the majority of the time is spent in the Erlang compiler, which we use to generate BEAM bytecode, so these improvements will not be very impactful here. Instead we need to improve the build tool such that it only compiles modules when it has to, rather than the entire package.

To benchmark the impact of this change I created a Gleam package with 300,000 lines of code and 370,000 lines of documentation comments across 1400 modules, and tested recompiling the package without any changes. The old version of the compiler recompiles every module, while the new version only reads and verifies the caches.

Erlang

Benchmark 1: v0.25
  Time (mean ± σ):     18.443 s ±  0.949 s    [User: 18.458 s, System: 2.995 s]
  Range (min … max):   17.102 s … 19.968 s    10 runs

Benchmark 2: v0.26
  Time (mean ± σ):     140.8 ms ±   3.9 ms    [User: 92.5 ms, System: 46.4 ms]
  Range (min … max):   138.0 ms … 156.1 ms    20 runs

Summary
  'v0.26' ran
  130.99 ± 7.67 times faster than 'v0.25'

When targeting Erlang, rebuilding is now 130 times faster than before for a project this size!

JavaScript

Benchmark 1: v0.25
  Time (mean ± σ):      1.861 s ±  0.026 s    [User: 1.543 s, System: 0.299 s]
  Range (min … max):    1.833 s …  1.927 s    10 runs

Benchmark 2: v0.26
  Time (mean ± σ):     145.3 ms ±   2.9 ms    [User: 92.9 ms, System: 50.8 ms]
  Range (min … max):   141.4 ms … 154.3 ms    20 runs

Summary
  'v0.26' ran
   12.81 ± 0.31 times faster than 'v0.25'

When targeting JavaScript the change is less impactful, running just under 13 times faster. This is because on this target we don't need to run the Erlang compiler to generate bytecode; the outputted JavaScript code can be loaded directly into a JavaScript runtime.

These benchmarks were performed with the excellent Hyperfine command line benchmarking tool.
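
As a quick sanity check on the summaries above, the reported speedups can be reproduced by dividing the mean times. This is a minimal sketch; the `speedup` helper is a name made up for illustration and is not part of Hyperfine.

```python
def speedup(old_mean_seconds: float, new_mean_seconds: float) -> float:
    """Ratio of mean run times: how many times faster the new version is."""
    return old_mean_seconds / new_mean_seconds


# Mean times copied from the benchmark output above, in seconds.
erlang = speedup(18.443, 0.1408)
javascript = speedup(1.861, 0.1453)

print(f"Erlang: {erlang:.2f}x")          # Erlang: 130.99x
print(f"JavaScript: {javascript:.2f}x")  # JavaScript: 12.81x
```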

How does it work?

When the compiler runs, it emits a set of reusable artefacts for each module:

1. Erlang bytecode in a .beam file.
2. Erlang record definitions in .hrl files for use by any native Erlang modules.
3. Information on the types and values in the module in a .cache file.
4. Information about the compilation of the module in a .cache_meta file.

If the module doesn't need to be compiled again then we can load the .beam bytecode into the virtual machine, load the module information from the .cache file so we can compile other modules that depend on it, and move on to the next module.

How do we tell if a module needs to be recompiled? There are two checks we need to make, both using information stored in the .cache_meta file.

The first is to check the modification time of the source file against the compile time stored in the .cache_meta file. If the source file modification time is newer then it has been changed and we need to recompile it.

The second is to look at the module's dependencies. The .cache_meta file stores a list of the modules that the module imports, and using this we can tell if any of the modules upstream in the dependency tree are going to be recompiled. If so then we need to recompile the module, as a change in a dependency may mean that this module needs to be compiled differently than last time.
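
The two checks can be sketched roughly as follows. This is an illustration in Python, not the Gleam build tool's actual code: the `CacheMeta` class, its field names, and the `stale_modules` function are hypothetical stand-ins for whatever the .cache_meta file really contains. It also assumes the import graph has no cycles, and that any imported module outside the package is unchanged.

```python
from dataclasses import dataclass, field


@dataclass
class CacheMeta:
    # Hypothetical stand-in for what a .cache_meta file records.
    compiled_at: float  # when the module was last compiled
    imports: list[str] = field(default_factory=list)  # modules it imports


def stale_modules(source_mtimes: dict[str, float],
                  metas: dict[str, CacheMeta]) -> set[str]:
    """Return the names of the modules that need to be recompiled."""
    verdicts: dict[str, bool] = {}

    def is_stale(name: str) -> bool:
        if name in verdicts:
            return verdicts[name]
        if name not in source_mtimes:
            # Assumption: not a module of this package, so treat as unchanged.
            verdicts[name] = False
            return False
        meta = metas.get(name)
        if meta is None or source_mtimes[name] > meta.compiled_at:
            # Check 1: no cache yet, or the source is newer than the cache.
            verdicts[name] = True
            return True
        # Check 2: an upstream module being recompiled forces a recompile.
        verdicts[name] = any(is_stale(dep) for dep in meta.imports)
        return verdicts[name]

    return {name for name in source_mtimes if is_stale(name)}


# Usage: `app` imports `router`, which imports `util`; only `util` was edited.
mtimes = {"app": 10.0, "router": 10.0, "util": 50.0, "docs": 10.0}
metas = {
    "app": CacheMeta(20.0, ["router"]),
    "router": CacheMeta(20.0, ["util"]),
    "util": CacheMeta(20.0),
    "docs": CacheMeta(20.0),
}
print(sorted(stale_modules(mtimes, metas)))  # ['app', 'router', 'util']
```

Here `docs` is left alone because neither its source nor anything it imports has changed, while the edit to `util` propagates downstream to `router` and `app`.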

What's next?

These changes have made a huge difference to compilation speed, but there are still plenty of easy wins we can apply in future if the need arises, such as improvements to the efficiency of the compiler's IRs, more precise cache invalidation, and multithreaded compilation.

Developer experience is a top priority for Gleam. You need your feedback as quickly as possible when writing Gleam code, so we're committed to keeping the compiler super speedy.

Running on Deno

Gleam can run on JavaScript as well as the Erlang virtual machine. Until now, when you ran gleam run or gleam test with a Gleam project targeting JavaScript, your code was run using the NodeJS runtime. With v0.26 the Deno runtime can be used instead!

Deno is similar to NodeJS in many ways, but it boasts better compliance with web-standard APIs, much better security, and a very slick developer experience.

To use Deno instead of NodeJS you can either add the --runtime=deno flag to commands like gleam run, or you can add the javascript.runtime property to your gleam.toml file.

name = "my_project"
version = "1.0.0"

[javascript]
runtime = "deno"

Thank you to Brett Kolodny for this feature!

Thanks

Gleam is made possible by the support of all the kind people and companies who have very generously sponsored or contributed to the project. Thank you all!

If you like Gleam consider sponsoring or asking your employer to sponsor Gleam development. I work full time on Gleam and your kind sponsorship is how I pay my bills!

Thanks for reading! Happy hacking! 💜