renv - reproducible R environments
renv
A major way to keep the output of your R code reproducible (or even runnable) across collaborators is to use virtual environments (see Virtual environments). They let you recreate the state of any collaborator's R package library without affecting other projects on your computer, and vice versa.
The current RStudio-recommended tool for creating virtual R environments is renv. Once the renv environment is set up, it is activated whenever you run R in the project root or open the project in RStudio, and your code will exclusively use a local, project-specific package library (located somewhere inside the renv/ folder in the root of your project). This is useful even if you are not planning to collaborate on a given project, because installing or removing R packages won't affect any other projects.
renv also tracks the packages and their versions in a so-called lockfile, usually called renv.lock. This is the file you share so that your collaborators can recreate your environment with the same packages and versions.
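Only once: set up renv for the project (install renv, initialize the project, and commit the resulting files, as described below).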
Every time the project is in one collaborator's hands:
pull from GitHub,
restore the renv environment (see renv::restore()),
work on the project: write code and install/remove R packages (see Installing a new package and Removing a Package),
check the R environment and update the lockfile (see renv::status() and renv::snapshot()),
push to GitHub.
Ideally, updating the lockfile happens before each commit, not only before the one that gets pushed to GitHub.
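In R, the renv part of this cycle might look like the sketch below. It assumes renv is already set up for the project and that R was started in the project root; the git steps (pull, commit, push) happen outside of R.

```r
# Run in the R console from the project root, after pulling from GitHub.

# 1. Install the package versions recorded in renv.lock into the project library.
renv::restore()

# 2. ...work on the project: write code, install or remove packages...

# 3. Check whether lockfile, library, and dependencies agree.
renv::status()

# 4. If status() reports differences you want to keep, record them in renv.lock.
renv::snapshot()
```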
Install renv if it isn't installed.
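A minimal sketch of that check, assuming you install renv from your default CRAN mirror:

```r
# Install renv only when it is not already available.
if (!requireNamespace("renv", quietly = TRUE)) {
  install.packages("renv")
}
```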
To initialize the project, run the following command in the R console in your project root:
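In a standard renv setup, the initialization call is renv::init(). The sketch below assumes that this is the command meant here and that your working directory is the project root.

```r
# Creates the renv/ folder with a project-local library, writes renv.lock,
# and adds a line to .Rprofile that activates renv for this project.
renv::init()
```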
Manually create a file named .Renviron (note the capital R and no .txt, .rtf, etc. at the end of the filename) in the root of the project, or open it if it already exists, and add the following lines to it:
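If you prefer, the file can also be created and opened from R rather than by hand. This is a minimal sketch, assuming the working directory is the project root; it only creates and opens the file and does not write anything into it. The variable name renviron_path is made up for this example.

```r
# Assumes the working directory is the project root.
renviron_path <- file.path(getwd(), ".Renviron")

# Create an empty .Renviron only if one does not exist yet,
# so that existing settings are never overwritten.
if (!file.exists(renviron_path)) {
  file.create(renviron_path)
}

# Open the file in the default editor when working interactively.
if (interactive()) {
  file.edit(renviron_path)
}
```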
Add, commit, and push the following files to git:
Activating the environment is done automatically when R or RStudio is run in the project root. If all went well, you'll see something like this:
Or, if you are running R from a folder inside your project that is not the root folder, do
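One possible way to do this is sketched below. It assumes that renv::load() is a suitable call for loading the project into the current session. find_renv_root() is a hypothetical helper written for this example, not part of renv; it looks for the nearest parent folder that contains renv.lock.

```r
# Hypothetical helper: walk up from `start` until a folder containing
# renv.lock is found. Returns NULL if no such folder exists.
find_renv_root <- function(start = getwd()) {
  dir <- normalizePath(start, winslash = "/", mustWork = TRUE)
  repeat {
    if (file.exists(file.path(dir, "renv.lock"))) {
      return(dir)
    }
    parent <- dirname(dir)
    if (identical(parent, dir)) {
      return(NULL)  # reached the filesystem root without finding a lockfile
    }
    dir <- parent
  }
}

# Usage (uncomment to run from inside a project subfolder):
# root <- find_renv_root()
# if (!is.null(root)) renv::load(project = root)
```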
We will use three functions to keep the environment in sync:
renv::status(), which lets you check whether everything is in sync,
renv::snapshot(), which saves the state of the project dependencies (more on that below) to the lockfile renv.lock,
renv::restore(), which installs the packages recorded in the lockfile into the library.
Before we look into them, let's see what specifically needs to be in sync. There are three places where a given package can be present or not:
lockfile - the renv.lock file that lists all packages and their versions,
library - the project-local package library somewhere inside the renv/ folder,
dependencies - packages that your code implicitly depends upon via library(bestPackageEver), bestPackageEver::an_ok_function(), etc., and all the packages that those packages in turn depend on.
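To get a feel for these three places, you can list the packages in each of them yourself. The sketch below assumes a recent renv release (one that exports renv::lockfile_read()), an activated project, and the project root as the working directory. The variable names are made up for this example.

```r
# Packages recorded in the lockfile.
lockfile_pkgs <- names(renv::lockfile_read("renv.lock")$Packages)

# Packages installed in the active library; with renv activated, the first
# library path is assumed to be the project-local one.
library_pkgs <- rownames(installed.packages(lib.loc = .libPaths()[1]))

# Packages referenced directly in the project's code files.
# Note: this finds direct references only, not their recursive dependencies.
dependency_pkgs <- unique(renv::dependencies()$Package)

# Example comparison: packages used in code but missing from the lockfile.
setdiff(dependency_pkgs, lockfile_pkgs)
```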
renv::status()
renv::status() reports whether the lockfile, library, and dependencies of the project are in sync. For each out-of-sync package (if any), it tells you whether the package is present in each of these three places. See ?renv::status for how to deal with each possible combination, or refer to the table below, whichever makes more sense to you.
In lockfile: yes, in library: yes, in dependencies: yes. You won't see this because the package is in sync.
In lockfile: yes, in library: yes, in dependencies: no. Start using the package if you need it or remove it if you don't.
In lockfile: no, in library: yes, in dependencies: yes. New package that you installed and started using recently? Take a snapshot.
In lockfile: no, in library: yes, in dependencies: no. Either start using the package and take a snapshot, or remove it if you don't need it.
In lockfile: yes, in library: no, in dependencies: yes. You probably switched to a different version of the repo. Restore to install the packages that are not in the library.
In lockfile: yes, in library: no, in dependencies: no. Restore the environment (to install the package) and start using it, or take a snapshot to remove it from the lockfile.
In lockfile: no, in library: no, in dependencies: yes. Install the packages one by one or use renv::hydrate() to install them all at once. Sometimes this situation indicates that there is a library(<pkg>) call somewhere but the package isn't actually used anywhere else. In that case, remove the library() call.
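The cases above can also be written down as a small lookup. This is only an illustration: sync_advice() is a hypothetical helper, not part of renv, and its mapping from states to advice is one reading of the cases described above.

```r
# Hypothetical helper (not part of renv): maps where a package is present
# to a short suggested action.
sync_advice <- function(in_lockfile, in_library, in_dependencies) {
  key <- paste0(
    if (in_lockfile) "L" else "-",
    if (in_library) "I" else "-",
    if (in_dependencies) "D" else "-"
  )
  switch(key,
    "LID" = "in sync",
    "LI-" = "use it or remove it",
    "-ID" = "snapshot",
    "-I-" = "use it and snapshot, or remove it",
    "L-D" = "restore",
    "L--" = "restore and use it, or snapshot to drop it from the lockfile",
    "--D" = "install it, or remove the unused library() call",
    "---" = "nothing to do"
  )
}

# A package that is installed and used in code but not yet in the lockfile:
sync_advice(in_lockfile = FALSE, in_library = TRUE, in_dependencies = TRUE)
```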
renv::snapshot()
In its default (implicit) mode, renv::snapshot() saves the implicit dependencies of the project to the lockfile, overwriting whatever the lockfile contained before.
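A minimal sketch of both ways to call it, assuming the project has not changed its snapshot type setting (in which case "implicit" is what the first call uses anyway):

```r
# Uses the snapshot type configured for the project ("implicit" unless changed).
renv::snapshot()

# The same, with the mode spelled out explicitly.
renv::snapshot(type = "implicit")
```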
renv::restore()
Before restoring, make sure the lockfile doesn't have any uncommitted changes.
Use the following code to ensure that the packages installed in your library are exactly the ones recorded in the lockfile and of exactly those versions:
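Going by the description of clean = TRUE in this section, the call is presumably the following. Run it from the project root with renv activated.

```r
# Install every package from renv.lock at the recorded version and remove
# packages from the project library that the lockfile does not list.
renv::restore(clean = TRUE)
```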
Without additional arguments, renv::restore() makes sure that every package in the lockfile is installed in your library at the recorded version. If you add clean = TRUE to the function call, it will also remove any packages that are not in the lockfile.
Installing a new package
Use the package (library(bestPackageEver), bestPackageEver::<function>()) somewhere in your code. This implicitly adds the package to the dependencies of the project.
Install the package with install.packages('bestPackageEver'). As long as renv has been activated, this will install the package into the local library.
Take a snapshot with renv::snapshot(). This records the project's dependencies in the lockfile.
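Put together, the three steps might look like this in the R console. bestPackageEver is the placeholder package name used throughout this page, and analysis.R is a hypothetical script name.

```r
# 1. Reference the package somewhere in the project code,
#    e.g. in a script such as analysis.R:
#      library(bestPackageEver)

# 2. Install it; with renv activated this goes into the project-local library.
install.packages("bestPackageEver")

# 3. Record the new dependency and its version in renv.lock.
renv::snapshot()
```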
Removing a Package
Remove from the dependencies: remove every reference to the package from your code, such as library() calls and calls to its functions. This ensures that the package is no longer one of your project's dependencies.
Uninstall from the library with remove.packages("bestPackageEver").
Take a snapshot with renv::snapshot().
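A short sketch of the last two steps, again with the placeholder name bestPackageEver. It assumes the references to the package have already been deleted from the code.

```r
# Uninstall the package from the project-local library.
remove.packages("bestPackageEver")

# Update renv.lock so that it no longer lists the package.
renv::snapshot()
```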