renv - reproducible R environments

About renv

Read the introductory `renv` vignette. Explanations there can make a lot more sense than the ones on this page.

A major way to keep the output of your R code reproducible (or even runnable) across collaborators is to use virtual environments (see Virtual environments). They let you recreate the state of any collaborator's R package library without affecting other projects on your computer, and vice versa.

The current RStudio-recommended tool for creating virtual R environments is renv. Once a renv environment is set up, whenever you run R in the project root or open that project in RStudio, that renv environment will be activated and your code will exclusively use a local, project-specific package library (located somewhere inside the renv/ folder in the root of your project). This is useful even if you are not planning to collaborate on a given project, because installing or removing R packages won't affect any other projects.

renv will also track the packages and their versions in a so-called lockfile, usually named renv.lock. It is this file that you share to allow your collaborators to recreate your environment with the same packages and versions.

Basic pipeline

Only once:

Set up the environment

Every time the project is in one collaborator's hands:

Ideally, updating the lockfile happens before each commit, not only before the one that gets pushed to GitHub.

Set up the environment

Short version

In R console:

install.packages('renv')
renv::init()

In terminal/shell/git-bash:

echo "# Tell \`renv\` to only use the project library and not access the system library
RENV_CONFIG_SANDBOX_ENABLED=TRUE
# Tell \`renv\` to install as much as it can during \`renv::init()\`/\`renv::restore()\`
RENV_CONFIG_INSTALL_TRANSACTIONAL=FALSE
" >> .Renviron &&
git add \
    renv.lock \
    .Rprofile \
    renv/ \
    .Renviron &&
git commit -m "set up renv environment"

Install renv if it isn't installed.
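
A minimal sketch of an install-only-if-missing check, using base R only. The helper name `ensure_installed` and its `installer` argument are hypothetical, introduced here so the logic can be tried without touching the network.

```r
# Hypothetical helper: install `pkg` only when it cannot be loaded.
# `installer` defaults to utils::install.packages and can be swapped out.
ensure_installed <- function(pkg, installer = utils::install.packages) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))  # already available, nothing installed
  }
  installer(pkg)
  invisible(TRUE)             # an installation was attempted
}

# Typical use in the R console:
# ensure_installed("renv")
```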

To initialize the project, run the following command in R console in your project root:

renv::init()

If you and your collaborators have a way to switch between R versions (R Switch, rig, conda environments, etc.), consider not locking the full version of R (e.g., 4.3.1) in renv.lock but only its major.minor part (i.e., 4.3) in order to ignore the difference in patch numbers that would otherwise lead to constant uninteresting changes to renv.lock. Here is what to do:

# Drop the patch component, e.g. "4.3.1" becomes "4.3"
major_minor <- sub("\\.\\d+$", "", as.character(getRversion()))
renv::init(settings = list(r.version = major_minor))
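
To see what the regular expression does, here is a small check on hard-coded version strings (chosen for illustration, not taken from any particular setup). It uses only base R.

```r
# Strip the trailing ".patch" component from version strings.
versions <- c("4.3.1", "4.3.0", "4.10.12")
major_minor <- sub("\\.\\d+$", "", versions)
print(major_minor)
# [1] "4.3"  "4.3"  "4.10"

# The same call on the running R version; as.character() makes the
# conversion from the version object explicit.
print(sub("\\.\\d+$", "", as.character(getRversion())))
```

Both 4.3.1 and 4.3.0 collapse to 4.3, which is why patch-level differences between collaborators no longer show up as lockfile changes.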

Manually create a file named .Renviron (note capital R and no .txt, .rtf, etc. at the end of the filename) in the root of the project or open it if it already exists and add the following lines to it:

# Tell `renv` to only use the project library and not access the system library
RENV_CONFIG_SANDBOX_ENABLED=TRUE
# Tell `renv` to install as much as it can during `renv::init()`/`renv::restore()`
RENV_CONFIG_INSTALL_TRANSACTIONAL=FALSE
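
If you would rather add these settings from R than edit the file by hand, something along these lines should work. The helper `add_renviron_lines` is hypothetical (not part of renv). It rewrites the file with any missing lines added at the end, so running it twice does not duplicate entries.

```r
# Hypothetical helper: add settings to an .Renviron file, skipping any
# line that is already present so repeated runs don't duplicate them.
add_renviron_lines <- function(lines, path = ".Renviron") {
  existing <- if (file.exists(path)) readLines(path, warn = FALSE) else character(0)
  new_lines <- setdiff(lines, existing)
  if (length(new_lines) > 0) {
    writeLines(c(existing, new_lines), path)
  }
  invisible(new_lines)
}

# Usage, from the project root:
# add_renviron_lines(c(
#   "RENV_CONFIG_SANDBOX_ENABLED=TRUE",
#   "RENV_CONFIG_INSTALL_TRANSACTIONAL=FALSE"
# ))
```

R reads .Renviron at startup, so restart the R session for the settings to take effect.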

Add, commit, and push the following files to git:

.Renviron
.Rprofile
renv.lock
renv/.gitignore
renv/settings.json
renv/activate.R

Note that git status doesn't show contents of untracked folders by default so you will see renv/ but not renv/.gitignore and other files after running git status. To show the contents of the untracked folders, use git status --untracked-files=all or its abbreviated version git status -uall.
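
Before committing, it can help to confirm that all of these files actually exist. The sketch below only checks for their presence on disk (it does not ask git whether they are tracked). The helper name `missing_renv_files` is hypothetical.

```r
# Files that should be committed after setting up renv.
renv_files <- c(".Renviron", ".Rprofile", "renv.lock",
                "renv/.gitignore", "renv/settings.json", "renv/activate.R")

# Hypothetical helper: return the listed files that are absent under `root`.
missing_renv_files <- function(root = ".") {
  renv_files[!file.exists(file.path(root, renv_files))]
}

# Usage, from the project root; character(0) means nothing is missing:
# missing_renv_files()
```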

Activate the environment

Activating the environment is done automatically when R or RStudio is run in the project root. If all went well, you'll see something like this:

* Project '/path/to/the/folder' loaded. [renv 0.14.0]

Starting not from the project root

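If R was started somewhere outside the project, activate the environment manually by sourcing the activation script:

source('<path_to_your_project>/renv/activate.R')

Or, if you are running from a folder that is still inside your project, just not the root folder, run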

source(here::here('renv/activate.R'))
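
A quick way to confirm that activation worked is to look at the library paths and ask renv which project is active. This is a sketch that assumes renv is installed; outside an renv project the first path will simply be your usual library.

```r
# The first library path should point somewhere inside the project's renv/ folder.
lib_paths <- .libPaths()
print(lib_paths[1])

# renv::project() returns the active project's path, or NULL when no
# project is active.
if (requireNamespace("renv", quietly = TRUE)) {
  print(renv::project())
}
```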

Keep the `renv` environment in sync

We will use three functions that will help us do that:

  • renv::status() that lets you check whether everything is in sync.

  • renv::snapshot() that saves the state of project dependencies (more on that below) to the lockfile renv.lock.

  • renv::restore() that installs the packages recorded in the lockfile into the project library.

Before we look into them, let's see what specifically needs to be in sync. There are three places where a given package can be present or not:

  • lockfile - the renv.lock file that lists all packages and their versions,

  • library - the project-local package library somewhere inside the renv/ folder,

  • dependencies - packages that your code implicitly depends upon via library(bestPackageEver), bestPackageEver::an_ok_function(), etc. and all the packages that those packages in turn depend on.
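
The three places can also be inspected by hand. The sketch below assumes it is run from the project root and that the jsonlite package is available for reading the lockfile (an assumption, since renv itself does not require you to use it). The renv and lockfile parts are skipped when those prerequisites are missing.

```r
# Library: packages installed in the first (project) library path.
installed <- unname(installed.packages(lib.loc = .libPaths()[1])[, "Package"])

# Lockfile and dependencies: only attempted inside an renv project.
if (file.exists("renv.lock") &&
    requireNamespace("renv", quietly = TRUE) &&
    requireNamespace("jsonlite", quietly = TRUE)) {
  # renv.lock is a JSON file with one entry per package under "Packages".
  recorded <- names(jsonlite::fromJSON("renv.lock")$Packages)
  # Packages referenced directly in the project's code. renv::status()
  # additionally considers the packages that those depend on.
  used <- unique(renv::dependencies(progress = FALSE)$Package)
}
```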

renv::status()

Function renv::status() reports whether lockfile, library, and dependencies of the project are in sync. For each out-of-sync package (if any), it will tell you whether it is present in each of these three locations. See ?renv::status() for how you can deal with each possible combination. Or refer to the table below - whichever makes more sense to you.

installed | recorded | used | Solution
yes       | yes      | yes  | You won't see this because this package is in sync.
yes       | yes      | no   | Start using the package if you need it or remove it if you don't.
yes       | no       | yes  | New package that you installed and started using recently? Take a snapshot.
yes       | no       | no   | Either start using the package and take a snapshot or remove it if you don't need it.
no        | yes      | yes  | You probably switched to a different version of the repo. Restore to install the packages not in the library.
no        | yes      | no   | Restore the environment (to install) and start using the package or take a snapshot to remove it from the lockfile.
no        | no       | yes  | Install packages one by one or use renv::hydrate() to install them all at once. Sometimes this situation indicates that the package is only used in the sense of there being a library(<pkg>) call somewhere but it isn't used anywhere else. In that case, remove the library call.
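
To make the three-way comparison concrete, here is a toy sketch in base R. It is not how renv::status() is implemented. It only compares package names (not versions), and the helper `sync_report` and the package names are made up for illustration.

```r
# Hypothetical helper: list packages that are not present in all three places.
sync_report <- function(installed, recorded, used) {
  pkgs <- sort(unique(c(installed, recorded, used)))
  report <- data.frame(
    package   = pkgs,
    installed = pkgs %in% installed,
    recorded  = pkgs %in% recorded,
    used      = pkgs %in% used,
    stringsAsFactors = FALSE
  )
  in_sync <- report$installed & report$recorded & report$used
  report[!in_sync, , drop = FALSE]
}

# Made-up package names, for illustration only.
print(sync_report(
  installed = c("dplyr", "ggplot2", "oldPkg"),
  recorded  = c("dplyr", "ggplot2"),
  used      = c("dplyr", "newPkg")
))
```

Here dplyr is dropped from the report because it is in sync, while ggplot2 (not used), newPkg (used only), and oldPkg (installed only) are listed.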

renv::snapshot()

The task of renv::snapshot() in its default (implicit) mode is to save the implicit dependencies of the project to the lockfile, overwriting whatever its contents were before.

There are several modes in which renv::snapshot() can operate. You can set the mode at the project level, for example, when you initialize the project with renv::init(), or you can choose it for each individual renv::snapshot() call. I'd recommend sticking to the default implicit mode unless you have a good reason not to.
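
As a rough illustration of where the mode can be set, the sketch below uses the `type` argument of renv::snapshot() and the `snapshot.type` project setting. It only runs inside an existing renv project, and the calls that would change anything are left commented out.

```r
# Snapshot types accepted by renv::snapshot(type = ...).
snapshot_types <- c("implicit", "explicit", "all", "custom")

if (file.exists("renv.lock") && requireNamespace("renv", quietly = TRUE)) {
  # Show the snapshot type currently set for the project.
  print(renv::settings$snapshot.type())

  # Per call: choose the mode for this snapshot only.
  # renv::snapshot(type = "implicit")

  # Per project: set it once at initialization.
  # renv::init(settings = list(snapshot.type = "implicit"))
}
```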

renv::restore()

Before restoring, make sure the lockfile doesn't have any uncommitted changes.

Use the following code to ensure that the packages installed in your library are exactly the ones recorded in the lockfile and of exactly those versions:

renv::restore(clean = TRUE)

The task of renv::restore() (without additional arguments) is to make sure that every package in the lockfile is installed in your library and has the correct version. If you add clean = TRUE to the function call, it will also remove any packages that are not in the lockfile.
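
The difference that clean = TRUE makes can be sketched with plain set operations. This is a toy model, not renv's implementation. The helper `restore_plan`, the package names, and the version numbers are made up for illustration.

```r
# Hypothetical helper: which packages would be installed or removed to make
# `installed` match `recorded`. Both are named character vectors of versions.
restore_plan <- function(installed, recorded, clean = FALSE) {
  common  <- intersect(names(installed), names(recorded))
  changed <- common[installed[common] != recorded[common]]
  missing <- setdiff(names(recorded), names(installed))
  extra   <- setdiff(names(installed), names(recorded))
  list(
    install = sort(c(missing, changed)),
    remove  = if (clean) sort(extra) else character(0)
  )
}

installed <- c(dplyr = "1.1.0", ggplot2 = "3.4.4", extraPkg = "0.1.0")
recorded  <- c(dplyr = "1.1.4", ggplot2 = "3.4.4", tidyr = "1.3.0")

plan_default <- restore_plan(installed, recorded)
plan_clean   <- restore_plan(installed, recorded, clean = TRUE)
str(plan_default)
str(plan_clean)
```

In both cases dplyr (wrong version) and tidyr (missing) would be installed. Only with clean = TRUE is extraPkg, which is not in the lockfile, removed.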

Installing a new package

  • Use the package (library(bestPackageEver), bestPackageEver::<function>()) somewhere in your code. This will implicitly add the package to the dependencies of the project.

  • Install the package with install.packages('bestPackageEver'). As long as renv has been activated, this will install the package into the local library.

  • Take a snapshot with renv::snapshot(). This will record the project's dependencies, including the new package, in the lockfile.

Removing a package
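
The install and snapshot steps can be wrapped into one function if you do this often. This is only a sketch: the name `add_package` is hypothetical, it assumes renv is installed and the project is active, and it is defined here without being called.

```r
# Hypothetical helper: install a package into the active library, then
# record the project's dependencies in the lockfile.
add_package <- function(pkg, prompt = FALSE) {
  stopifnot(requireNamespace("renv", quietly = TRUE))
  install.packages(pkg)
  # prompt = FALSE skips the interactive confirmation.
  renv::snapshot(prompt = prompt)
}

# Usage, after the package is referenced somewhere in your code:
# add_package("bestPackageEver")
```

In the default implicit mode the snapshot only records the package if your code actually uses it, so the first step in the list still matters.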

  • Remove from the dependencies: First, make sure the package is not actively used in your code. Remove any references to the package, such as functions or library calls. This ensures that the package is no longer part of your project's dependencies.

  • Uninstall from the library with remove.packages("bestPackageEver")

  • Take a snapshot: renv::snapshot()
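
To check that a package is really no longer referenced before uninstalling it, a plain text search over your scripts is often enough. The helper `find_package_uses` below is hypothetical. It looks for library(), require(), and pkg:: mentions only, so it can miss less common ways of loading a package.

```r
# Hypothetical helper: report lines that mention `pkg` via library(),
# require(), or the pkg:: / pkg::: prefix.
find_package_uses <- function(pkg, files) {
  quoted <- paste0("\\Q", pkg, "\\E")  # treat the name literally (dots etc.)
  pattern <- paste0(
    "(library|require)\\(\\s*['\"]?", quoted, "['\"]?\\s*[,)]",
    "|\\b", quoted, ":::?"
  )
  hits <- lapply(files, function(f) {
    lines <- readLines(f, warn = FALSE)
    idx <- grep(pattern, lines, perl = TRUE)
    if (length(idx) == 0) return(NULL)
    data.frame(file = f, line = idx, text = lines[idx], stringsAsFactors = FALSE)
  })
  do.call(rbind, hits)  # NULL when there are no matches at all
}

# Usage, from the project root:
# find_package_uses("bestPackageEver",
#                   list.files(pattern = "\\.[Rr]$", recursive = TRUE))
```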
