renv - reproducible R environments
About renv
Read the introductory `renv` vignette. Explanations there can make a lot more sense than the ones on this page.
A major way to keep the output of your R code reproducible (or even runnable) across collaborators is to use virtual environments (see Virtual environments). They let you recreate the state of any collaborator's R package library without affecting other projects on your computer, and vice versa.
The current RStudio-recommended tool for creating virtual R environments is `renv`. Once a `renv` environment is set up, whenever you run R in the project root or open that project in RStudio, that `renv` environment will be activated and your code will exclusively use a local, project-specific package library (located somewhere inside the `renv/` folder in the root of your project). This is useful even if you are not planning to collaborate on a given project, because installing or removing R packages won't affect any other projects.
`renv` will also track the packages and their versions using a so-called lockfile, usually called `renv.lock`. It is this file that you share to allow your collaborators to recreate your environment with the same packages and versions.
Basic pipeline
Only once: set up the environment (see Set up the environment).
Every time the project is in one collaborator's hands:
pull from GitHub,
restore the `renv` environment (see renv::restore()),
work on the project: write code and install/remove R packages (see Installing a new package and Removing a Package),
check the R environment and update the lockfile (see renv::status() and renv::snapshot()),
push to GitHub.
Ideally, updating the lockfile happens before each commit, not only before the one that gets pushed to GitHub.
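The R side of one collaborator's turn can be sketched as follows. This is a minimal sketch, assuming the project has already been set up with `renv`, that R is running in the project root, and that the git steps (pull, commit, push) happen outside R. The helper names `begin_turn` and `end_turn` are hypothetical and only group the calls.

```r
# Hypothetical helpers wrapping the renv calls of one work session.
# Assumes renv is installed and the working directory is the project root.

# After pulling: bring the project library in line with the lockfile.
begin_turn <- function() {
  renv::restore()
}

# Before committing: report what is out of sync, then update the lockfile.
end_turn <- function() {
  renv::status()
  renv::snapshot()
}
```

Calling `begin_turn()` right after a pull and `end_turn()` before each commit mirrors the order of the steps above.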
Set up the environment
Install `renv` if it isn't installed.
To initialize the project, run the following command in the R console in your project root:
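A minimal sketch of these two steps, assuming `renv` is installed from your default package repository and that `renv::init()` is called with its default arguments:

```r
# Install renv into your user library only if it is missing.
if (!requireNamespace("renv", quietly = TRUE)) {
  install.packages("renv")
}

# Initialize renv for the project in the current working directory.
# This creates the renv/ folder and the renv.lock lockfile.
renv::init()
```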
If you and your collaborators have a way to switch between R versions (R Switch, rig, conda environments, etc.), consider not locking the full version of R (e.g., `4.3.1`) in `renv.lock` but only its major.minor part (i.e., `4.3`), in order to ignore the difference in patch numbers that would otherwise lead to constant uninteresting changes to `renv.lock`. Here is what to do:
Manually create a file named `.Renviron` (note the capital `R` and no `.txt`, `.rtf`, etc. at the end of the filename) in the root of the project, or open it if it already exists, and add the following lines to it:
Add, commit, and push the following files to git:
Note that `git status` doesn't show the contents of untracked folders by default, so you will see `renv/` but not `renv/.gitignore` and other files after running `git status`. To show the contents of untracked folders, use `git status --untracked-files=all` or its abbreviated version `git status -uall`.
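To check which major.minor value corresponds to the R version you are currently running (the part of the version discussed earlier in this section), a small base-R sketch can help. The variable names are illustrative only.

```r
# Full version of the running R, e.g. "4.3.1".
full_version <- as.character(getRversion())

# Keep only the leading major.minor part, e.g. "4.3".
major_minor <- sub("^([0-9]+\\.[0-9]+).*$", "\\1", full_version)

print(major_minor)
```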
Activate the environment
Activating the environment is done automatically when R or RStudio is run in the project root. If all went well, you'll see something like this:
Starting not from the project root
If you start R from a folder that is inside your project but is not the root folder, do:
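One possible approach, sketched under the assumption that the project root is the nearest parent folder containing `renv.lock`, is to locate that folder and pass it to `renv::load()`. The helper name `find_project_root` is hypothetical, and the `renv::load()` call is shown only as a comment because it changes the state of the running session.

```r
# Hypothetical helper: walk up from `start` until a folder containing
# renv.lock is found, and return that folder's path.
find_project_root <- function(start = getwd()) {
  dir <- normalizePath(start, winslash = "/", mustWork = TRUE)
  repeat {
    if (file.exists(file.path(dir, "renv.lock"))) {
      return(dir)
    }
    parent <- dirname(dir)
    if (identical(parent, dir)) {
      stop("No renv.lock found in '", start, "' or any parent folder.")
    }
    dir <- parent
  }
}

# Usage (not run here): load the project's renv environment from a subfolder.
# renv::load(project = find_project_root())
```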
Keep the `renv` environment in sync
We will use three functions that will help us do that:
`renv::status()` that lets you check whether everything is in sync.
`renv::snapshot()` that saves the state of project dependencies (more on that below) to the lockfile `renv.lock`.
`renv::restore()` that installs the packages recorded in the lockfile into the library.
Before we look into them, let's see what specifically needs to be in sync. There are three places where a given package can be present or not:
lockfile - the `renv.lock` file that lists all packages and their versions,
library - the project-local package library somewhere inside the `renv/` folder,
dependencies - packages that your code implicitly depends upon via `library(bestPackageEver)`, `bestPackageEver::an_ok_function()`, etc. and all the packages that those packages in turn depend on.
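To make the idea of "three places" concrete, here is a toy model in base R. It is not how `renv` computes its report. The package lists are made up for illustration, and a package counts as in sync only when it appears in all three places.

```r
# Made-up package names, for illustration only.
lockfile     <- c("dplyr", "ggplot2", "here")
library_pkgs <- c("dplyr", "ggplot2", "readr")
dependencies <- c("dplyr", "readr", "tidyr")

all_pkgs <- sort(unique(c(lockfile, library_pkgs, dependencies)))

# One row per package, one logical column per place.
state <- data.frame(
  package         = all_pkgs,
  in_lockfile     = all_pkgs %in% lockfile,
  in_library      = all_pkgs %in% library_pkgs,
  in_dependencies = all_pkgs %in% dependencies,
  stringsAsFactors = FALSE
)

# A package is in sync only when it is present in all three places.
in_sync <- state$in_lockfile & state$in_library & state$in_dependencies
out_of_sync <- state[!in_sync, ]
print(out_of_sync)
```

In this toy data only `dplyr` is in sync. Each of the other packages is missing from at least one place.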
renv::status()
Function `renv::status()` reports whether the lockfile, library, and dependencies of the project are in sync. For each out-of-sync package (if any), it will tell you whether it is present in each of these three locations. See `?renv::status` for how you can deal with each possible combination, or refer to the table below, whichever makes more sense to you.
| In library | In lockfile | In dependencies | What to do |
|---|---|---|---|
| yes | yes | yes | You won't see this because this package is in sync. |
| yes | yes | no | Start using the package if you need it or remove it if you don't. |
| yes | no | yes | New package that you installed and started using recently? Take a snapshot. |
| yes | no | no | Either start using the package and take a snapshot or remove it if you don't need it. |
| no | yes | yes | You probably switched to a different version of the repo. Restore to install the packages not in the library. |
| no | yes | no | Restore the environment (to install) and start using the package or take a snapshot to remove it from the lockfile. |
| no | no | yes | Install packages one by one or use `renv::hydrate()` to install them all at once. Sometimes, this situation can indicate that the package is only used in the sense of there being a `library(<pkg>)` call somewhere but it isn't used anywhere else. In that case, remove the `library` call. |
renv::snapshot()
The task of `renv::snapshot()` in the default (implicit) mode is to save the implicit dependencies of the project to the lockfile, overwriting whatever its contents were before.
There are several modes in which `renv::snapshot()` can operate. You can set the mode at the project level, for example when you initialize the project with `renv::init()`, or you can choose it for each individual `renv::snapshot()` call. I'd recommend sticking to the default implicit mode unless you have a good reason not to.
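A minimal sketch of both ways of choosing the mode, assuming the `type` argument of `renv::snapshot()` and the `snapshot.type` project setting as described in the `renv` documentation. The value `"implicit"` is used here because it is the default mode mentioned above.

```r
# Per call: choose the mode for this snapshot only.
renv::snapshot(type = "implicit")

# Per project: store the mode in the project settings so that later
# renv::snapshot() calls use it by default.
renv::settings$snapshot.type("implicit")
```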
renv::restore()
Before restoring, make sure the lockfile doesn't have any uncommitted changes.
Use the following code to ensure that the packages installed in your library are exactly the ones recorded in the lockfile and of exactly those versions:
The task of `renv::restore()` (without additional arguments) is to make sure that every package in the lockfile is installed to your library and has the correct version. If you add `clean = TRUE` to the function call, it will also remove any packages that are not in the lockfile.
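A minimal sketch of such a call, assuming it is run from the project root with the `renv` environment active:

```r
# Install every package recorded in renv.lock at the recorded version,
# and remove packages from the project library that are not in the lockfile.
renv::restore(clean = TRUE)
```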
Installing a new package
Use the package (`library(bestPackageEver)`, `bestPackageEver::<function>()`) somewhere in your code. This will implicitly add the package to the dependencies of the project.
Install the package with `install.packages('bestPackageEver')`. As long as `renv` has been activated, this will install the package into the local library.
Take a snapshot with `renv::snapshot()`. This will record the dependencies in the lockfile.
Removing a Package
Remove from the dependencies: First, make sure the package is not actively used in your code. Remove any references to the package, such as functions or library calls. This ensures that the package is no longer part of your project's dependencies.
Uninstall from the library with `remove.packages("bestPackageEver")`.
Take a snapshot: `renv::snapshot()`.