Super fast npm install on GitHub Actions

Installing npm dependencies with GitHub Actions is a breeze. But it’s a slow breeze.

As always, performance tweaking takes experimentation, but we’ve got your back. We did the hard work, and have the numbers to prove it. With our 4-step approach, you can reduce a 16-second task to only 2 seconds. 2 seconds! If the installation kicked off when you started reading this sentence, it’s done right about… now. As a bonus, you’re doing the world a favour: that’s an 87.5% reduction in run time, and with it a similar cut in energy use. 🌳 ❤️

The baseline

GitHub Actions makes it easy to use official external actions like setup-node in a single line: - uses: actions/setup-node@v2. Following that with npm install, as the setup-node readme suggests, takes care of Node.js and installs all needed dependencies.
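For reference, the baseline described above could look roughly like the workflow below. This is a minimal sketch rather than the exact workflow that was measured: the workflow name, the job id and the Node.js version are assumptions chosen to match the example further down.

```yaml
# Baseline sketch: plain npm install, no caching.
name: Baseline install        # hypothetical workflow name

on: pull_request

jobs:
  install:                    # hypothetical job id
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2

      - uses: actions/setup-node@v2
        with:
          node-version: '14'

      # Nothing is cached, so every package is downloaded again on each run
      - run: npm install
```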

Use a manifestation of the manifest

The biggest win in speed and efficiency comes from installing dependencies straight from the package lock file: package-lock.json. npm generates this file by default, and the command npm ci installs exactly what the lock file specifies. Since the file contains a fully resolved dependency tree, npm can skip a whole lot of steps.
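As a workflow step, the switch is a one-line change. A minimal sketch, assuming package-lock.json is committed in the repository root:

```yaml
- name: Install dependencies
  # npm ci installs what package-lock.json describes and fails
  # if package.json and the lock file are out of sync
  run: npm ci
```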

Caching on GitHub

Secondly, caching dependencies saves download time otherwise needed for each package. All cached dependencies are fetched in one go from GitHub, using a cache action:

- name: Cache dependencies
  uses: actions/cache@v2
  with:
    path: ~/.npm
    key: npm-${{ hashFiles('package-lock.json') }}
    restore-keys: npm-

With this cache in place, npm copies dependencies from ~/.npm instead of downloading them. If package-lock.json changes, the restore-keys option makes sure the now outdated GitHub cache is still used as the base for a new GitHub cache, which is saved under a new key.

No side effects allowed

The final small win is ignoring installation scripts with the --ignore-scripts flag. This could break certain dependencies that rely on installation scripts.[1] Instead of crossing your fingers and giving it a try, you can list the native dependencies that might need these scripts with the native-modules CLI.
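If one of your dependencies does turn out to need its installation script, one possible way out is to run the scripts for just that package afterwards. The sketch below goes beyond what we measured: it assumes npm rebuild runs the install scripts of the named package, and some-native-package is a placeholder, not a real dependency.

```yaml
- name: Install dependencies
  run: npm ci --ignore-scripts

# Placeholder package name: replace it with a dependency that
# really needs its installation script
- name: Run install scripts for selected packages only
  run: npm rebuild some-native-package
```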

The result: faster workflow

Putting these three together in an example workflow gives:

name: Continuous integration

on: pull_request

jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2

      - name: Setup Node.js
        uses: actions/setup-node@v2
        with:
          node-version: '14'

      - name: Cache dependencies
        uses: actions/cache@v2
        with:
          path: ~/.npm
          key: npm-${{ hashFiles('package-lock.json') }}
          restore-keys: npm-

      - name: Install dependencies
        run: npm ci --ignore-scripts

      # run npm test, build, lint, etc.

An even faster alternative

Combining npm ci with caching of ~/.npm is recommended by GitHub and npm. An interesting alternative, however, is caching the node_modules directory. This is not generally suggested because it comes with potential footguns:

First off, combining a node_modules directory with npm ci is slow since the latter will first remove node_modules before installing dependencies. Secondly, when running multiple Node.js versions in your CI and/or when changing the Node version that runs on your CI, old native modules might break.

So, given that no installation scripts are used, you can completely skip the installation step whenever the cache is restored! To prevent restoring an outdated node_modules after package-lock.json has changed, the cache action is given no restore-keys. In other words: the cache is only used if there is an exact key match:

- name: Cache dependencies
  id: cache
  uses: actions/cache@v3
  with:
    path: ./node_modules
    key: modules-${{ hashFiles('package-lock.json') }}

- name: Install dependencies
  if: steps.cache.outputs.cache-hit != 'true'
  run: npm ci --ignore-scripts
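To sidestep the Node.js version footgun mentioned earlier, the cache key could also include the operating system and the Node.js version. The following is a sketch under assumptions: the job id and the versions in the matrix are made up for illustration, and because only the major version ends up in the key, a minor Node.js upgrade would still reuse the old cache.

```yaml
jobs:
  test:                                # hypothetical job id
    runs-on: ubuntu-latest
    strategy:
      matrix:
        node-version: ['14', '16']     # example versions, an assumption
    steps:
      - uses: actions/checkout@v2

      - uses: actions/setup-node@v2
        with:
          node-version: ${{ matrix.node-version }}

      - name: Cache dependencies
        id: cache
        uses: actions/cache@v3
        with:
          path: ./node_modules
          # One cache per OS, Node.js version and lock file content
          key: modules-${{ runner.os }}-node${{ matrix.node-version }}-${{ hashFiles('package-lock.json') }}

      - name: Install dependencies
        if: steps.cache.outputs.cache-hit != 'true'
        run: npm ci --ignore-scripts
```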

Measure all the things

Measuring the installation time step by step, including restoring the cache, on a project with a thousand (indirect) dependencies gives the following:

[Figure: bar chart showing the difference between no cache, exact cache and changed cache]

The cache was changed by modifying package-lock.json. With the alternative method and its exact key, a changed cache shows the same timing as no cache, as expected.

Our verdict

The first approach works better across a variety of cases, a fit-all solution if you will. The alternative is definitely a lot faster if the workflow is often run without package lock changes. Keep in mind that GitHub removes caches that have not been accessed within the last week. So choose wisely, depending on the project, the stage of development and the regularity of workflow runs. Happy GitHub Actioning!

Update 2021-09-06: The setup-node action now includes caching. I personally do not like this, since it goes against doing one thing and doing it well, though one could argue it hides an implementation detail. Caching is not enabled by default, so all of the above still works as described. If enabled, it uses the caching action internally on ~/.npm/code.
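For completeness, opting in to the built-in caching could look like the sketch below. It relies on the cache input of setup-node; the Node.js version is carried over from the example workflow above, and we did not measure this variant.

```yaml
- name: Setup Node.js
  uses: actions/setup-node@v2
  with:
    node-version: '14'
    cache: 'npm'        # opt in to the caching built into setup-node

- name: Install dependencies
  run: npm ci --ignore-scripts
```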

Footnotes

[1] Installation scripts are only necessary for native packages that do not pre-bundle compiled code using the N-API. These scripts are often abused to log information about a package. They are also a convenient place to spread malware.
