# renv - reproducible R environments

## About `renv`

{% hint style="info" %}
Read the [introductory `renv` vignette](https://rstudio.github.io/renv/articles/renv.html). The explanations there may be clearer than the ones on this page.
{% endhint %}

A major way to keep the output of your R code reproducible (or even runnable) across collaborators is to use virtual environments (see [Virtual environments](/programming-info/computing-programming-guides/virtual-environments.md)). They let you recreate the state of any collaborator's R package library without affecting other projects on your computer, and vice versa.

The current RStudio-recommended tool for creating virtual R environments is `renv`. Once a `renv` environment is set up, whenever you run R in the project root or open that project in RStudio, that `renv` environment will be activated and your code will exclusively use a local, project-specific package library (located somewhere inside the `renv/` folder in the root of your project). This is useful even if you are not planning to collaborate on a given project, because installing or removing R packages won't affect any other projects.

`renv` also tracks the packages and their versions in a so-called *lockfile*, usually named `renv.lock`. This is the file you share so that your collaborators can recreate your environment with the same packages and versions.

## Basic pipeline

**Only once:**

* [#set-up-the-environment](#set-up-the-environment "mention")

**Every time the project is in one collaborator's hands:**

* pull from GitHub,
* restore the `renv` environment (see [#renv-restore](#renv-restore "mention")),
* work on the project: write code and install/remove R packages (see [#installing-a-new-package](#installing-a-new-package "mention") and [#removing-a-package](#removing-a-package "mention")),
* check the R environment and update the lockfile (see [#renv-status](#renv-status "mention") and [#renv-snapshot](#renv-snapshot "mention")),
* push to GitHub.

{% hint style="warning" %}
Ideally, updating the lockfile happens before each commit, not only before the one that gets pushed to GitHub.
{% endhint %}

## Set up the environment

<details>

<summary>Short version</summary>

In R console:

```
install.packages('renv')
renv::init()
```

In terminal/shell/git-bash:

<pre class="language-bash"><code class="lang-bash"><strong>echo "# Tell \`renv\` to only use the project library and not access the system library
</strong>RENV_CONFIG_SANDBOX_ENABLED=TRUE
# Tell \`renv\` to install as much as it can during \`renv::init()\`/\`renv::restore()\`
RENV_CONFIG_INSTALL_TRANSACTIONAL=FALSE
" >> .Renviron &#x26;&#x26;
git add \
    renv.lock \
    .Rprofile \
    renv/ \
    .Renviron &#x26;&#x26;
git commit -m "set up renv environment"
</code></pre>

</details>

Install `renv` if it isn't installed.

To initialize the project, run the following command in R console in your project root:

```r
renv::init()
```

{% hint style="info" %}
If you and your collaborators have a way to switch between R versions (R Switch, rig, conda environments, etc.), consider not locking the full version of R (e.g., `4.3.1`) in `renv.lock` but only its major.minor part (i.e., `4.3`). This ignores differences in patch numbers that would otherwise lead to constant, uninteresting changes to `renv.lock`. Here is what to do:

```r
# Drop the patch component, e.g. "4.3.1" becomes "4.3"
major_minor <- sub("\\.\\d+$", "", as.character(getRversion()))
renv::init(settings = list(r.version = major_minor))
```

{% endhint %}
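
To see what the version-trimming step does on its own, here is a minimal sketch in base R. The helper name `drop_patch` is made up for this illustration and is not part of `renv`; it assumes a three-part version such as the one returned by `getRversion()`.

```r
# Strip the trailing patch component from a three-part version.
# `drop_patch` is a hypothetical helper name used only for this illustration.
drop_patch <- function(version) {
  sub("\\.\\d+$", "", as.character(version))
}

drop_patch("4.3.1")        # "4.3"
drop_patch(getRversion())  # major.minor of the R you are running
```

Note that a two-part input such as `"4.3"` would be shortened to `"4"`, so only pass full versions.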

Manually create a file named `.Renviron` (note the capital `R` and no `.txt`, `.rtf`, etc. at the end of the filename) in the root of the project, or open it if it already exists, and add the following lines to it:

<pre class="language-shell"><code class="lang-shell"># Tell `renv` to only use the project library and not access the system library
RENV_CONFIG_SANDBOX_ENABLED=TRUE
<strong># Tell `renv` to install as much as it can during `renv::init()`/`renv::restore()`
</strong>RENV_CONFIG_INSTALL_TRANSACTIONAL=FALSE
</code></pre>
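
If you would rather do this from the R console, the sketch below adds the two settings only when they are not already present. `ensure_renviron_lines` is a hypothetical helper written for this page, not an `renv` function, and it leaves out the explanatory comments.

```r
# Add the two renv settings to a .Renviron file unless they are already there.
# `ensure_renviron_lines` is a hypothetical helper, not an renv function.
ensure_renviron_lines <- function(path = ".Renviron") {
  wanted <- c(
    "RENV_CONFIG_SANDBOX_ENABLED=TRUE",
    "RENV_CONFIG_INSTALL_TRANSACTIONAL=FALSE"
  )
  existing <- if (file.exists(path)) readLines(path, warn = FALSE) else character()
  missing <- setdiff(wanted, trimws(existing))
  if (length(missing) > 0) {
    # Rewrite the whole file so every setting ends up on its own line
    writeLines(c(existing, missing), path)
  }
  invisible(missing)  # the lines that were added, if any
}
```

Calling `ensure_renviron_lines()` in the project root a second time changes nothing, so it is safe to rerun. R reads `.Renviron` at startup, so restart R afterwards.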

Add, commit, and push the following files to git:

```
.Renviron
.Rprofile
renv.lock
renv/.gitignore
renv/settings.json
renv/activate.R
```

{% hint style="info" %}
Note that `git status` doesn't show contents of untracked folders by default so you will see `renv/` but not `renv/.gitignore` and other files after running `git status`. To show the contents of the untracked folders, use `git status --untracked-files=all` or its abbreviated version `git status -uall`.
{% endhint %}
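
Before committing, you can check from R that all of these files exist. This is a minimal sketch; `missing_renv_files` is a hypothetical helper, and it only checks that the files are on disk, not that git tracks them.

```r
# Files the setup step asks you to commit (taken from the list above).
renv_files <- c(
  ".Renviron", ".Rprofile", "renv.lock",
  "renv/.gitignore", "renv/settings.json", "renv/activate.R"
)

# Return the entries of `renv_files` that do not exist under `root`.
# `missing_renv_files` is a hypothetical helper name.
missing_renv_files <- function(root = ".") {
  renv_files[!file.exists(file.path(root, renv_files))]
}
```

Running `missing_renv_files()` in the project root should return an empty character vector once setup is complete.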

## Activate the environment

Activating the environment is done automatically when R or RStudio is run in the project root. If all went well, you'll see something like this:

```r
* Project '/path/to/the/folder' loaded. [renv 0.14.0]
```

#### Starting outside the project root

If R was started from somewhere else, activate the environment manually:

```r
source('<path_to_your_project>/renv/activate.R')
```

Or, if you are running from a folder inside your project, just not the root folder, run

```r
source(here::here('renv/activate.R'))
```
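
If the `here` package is not available, the project root can also be located with base R by walking up the folder tree. This is only a sketch: `find_renv_root` is a hypothetical helper, not part of `renv` or `here`, and it assumes the project root is the nearest parent folder that contains `renv/activate.R`.

```r
# Walk up from `start` until a folder containing renv/activate.R is found.
# `find_renv_root` is a hypothetical helper, not part of renv or here.
find_renv_root <- function(start = getwd()) {
  dir <- normalizePath(start, winslash = "/", mustWork = TRUE)
  repeat {
    if (file.exists(file.path(dir, "renv", "activate.R"))) return(dir)
    parent <- dirname(dir)
    # dirname() of the filesystem root is the root itself: nothing left to try
    if (identical(parent, dir)) stop("No renv/activate.R found above ", start)
    dir <- parent
  }
}

# Usage, once you are somewhere inside the project:
# source(file.path(find_renv_root(), "renv", "activate.R"))
```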

## Keep the `renv` environment in sync

Three functions help with this:

* `renv::status()` that lets you check whether everything is in sync.
* `renv::snapshot()` that saves the state of project *dependencies* (more on that below) to the lockfile `renv.lock`.
* `renv::restore()` that installs the packages recorded in the lockfile into the library.

Before we look into them, let's see what specifically needs to be in sync. There are three places where a given package can be present or not:

* *lockfile* - the `renv.lock` file that lists all packages and their versions,
* *library* - the project-local package library somewhere inside the `renv/` folder,
* *dependencies* - packages that your code implicitly depends upon via `library(bestPackageEver)`, `bestPackageEver::an_ok_function()`, etc. and all the packages that those packages in turn depend on.

## `renv::status()`

Function `renv::status()` reports whether lockfile, library, and dependencies of the project are in sync. For each out-of-sync package (if any), it will tell you whether it is present in each of these three locations. See `?renv::status()` for how you can deal with each possible combination. Or refer to the table below - whichever makes more sense to you.

<table><thead><tr><th width="115" data-type="checkbox">installed</th><th width="103" data-type="checkbox">recorded</th><th width="70" data-type="checkbox">used</th><th>Solution</th></tr></thead><tbody><tr><td>true</td><td>true</td><td>true</td><td>You won't see this because this package is in sync.</td></tr><tr><td>true</td><td>true</td><td>false</td><td>Start using the package if you need it or remove it if you don't.</td></tr><tr><td>true</td><td>false</td><td>true</td><td>New package that you installed and started using recently? Take a snapshot.</td></tr><tr><td>true</td><td>false</td><td>false</td><td>Either start using the package and take a snapshot or remove it if you don't need it.</td></tr><tr><td>false</td><td>true</td><td>true</td><td>You probably switched to a different version of the repo. Restore to install the packages not in the library.</td></tr><tr><td>false</td><td>true</td><td>false</td><td>Restore the environment (to install) and start using the package or take a snapshot to remove it from the lockfile.</td></tr><tr><td>false</td><td>false</td><td>true</td><td>Install packages one by one or use <code>renv::hydrate</code> to install them all at once. Sometimes, this situation can indicate that the package is only used in the sense of there being a <code>library(&#x3C;pkg>)</code> call somewhere but it isn't used anywhere else. In that case, remove the <code>library</code> call.</td></tr></tbody></table>
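
The table can also be read as a simple lookup from the three flags to an action. The sketch below encodes it that way; `suggest_action` is a hypothetical helper written for this page (it is not part of `renv`), and the wording of the actions is abbreviated.

```r
# Map the three flags reported by renv::status() to the suggested action.
# `suggest_action` is a hypothetical helper that mirrors the table above.
suggest_action <- function(installed, recorded, used) {
  key <- paste0(as.integer(c(installed, recorded, used)), collapse = "")
  switch(key,
    "111" = "in sync",
    "110" = "use it or remove it",
    "101" = "snapshot",
    "100" = "use it and snapshot, or remove it",
    "011" = "restore",
    "010" = "restore and use it, or snapshot",
    "001" = "install it, or drop the unused library() call",
    "000" = "not relevant to the project"  # not in the table; added for completeness
  )
}

suggest_action(installed = TRUE, recorded = FALSE, used = TRUE)  # "snapshot"
```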

## `renv::snapshot()`

In its default *implicit* mode, `renv::snapshot()` saves the implicit dependencies of the project to the lockfile, overwriting its previous contents.

{% hint style="info" %}
There are several modes in which `renv::snapshot()` can operate. You can set the mode at the project level, for example, when you initialize the project with `renv::init()`, or you can choose it for each individual `renv::snapshot()` call. I'd recommend sticking to the default *implicit* mode unless you have a good reason not to.
{% endhint %}
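
To preview which packages an implicit snapshot would pick up, you can ask `renv` for the dependencies it detects in your code. The sketch below wraps `renv::dependencies()` in a hypothetical helper, `preview_dependencies`; it assumes `renv` is installed and that `project` points at the project root.

```r
# List the packages renv detects in the project's code.
# `preview_dependencies` is a hypothetical wrapper; it requires renv.
preview_dependencies <- function(project = ".") {
  if (!requireNamespace("renv", quietly = TRUE)) {
    stop("renv is not installed")
  }
  deps <- renv::dependencies(path = project, progress = FALSE)
  sort(unique(deps$Package))
}

# Usage in the project root:
# preview_dependencies()
# renv::snapshot(type = "implicit")  # choose the mode for this call only
```

This lists only the packages referenced directly in your code; the lockfile also records the packages they depend on.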

## `renv::restore()`

{% hint style="danger" %}
Before restoring, make sure the lockfile doesn't have any uncommitted changes.
{% endhint %}

Use the following code to ensure that the packages installed in your library are exactly the ones recorded in the lockfile and of exactly those versions:

```r
renv::restore(clean = TRUE)
```

The task of `renv::restore()` (without additional arguments) is to make sure that every package in the lockfile is installed in your library and has the correct version. If you add `clean = TRUE` to the function call, it will also remove any packages that are not in the lockfile.
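
The warning about uncommitted lockfile changes can be checked from R before restoring. This is a sketch under a few assumptions: `git` is on your `PATH`, R is running inside the repository, and both helper names are made up for this illustration.

```r
# TRUE when `git status --porcelain` output lists at least one changed path.
# Both helpers are hypothetical names used only for this sketch.
has_changes <- function(porcelain_lines) {
  any(nzchar(trimws(porcelain_lines)))
}

# Ask git whether the lockfile differs from the last commit.
lockfile_has_changes <- function(lockfile = "renv.lock") {
  out <- suppressWarnings(
    system2("git", c("status", "--porcelain", "--", lockfile), stdout = TRUE)
  )
  # A non-zero exit code (e.g. not a git repository) sets a "status" attribute
  if (!is.null(attr(out, "status"))) stop("`git status` failed")
  has_changes(out)
}

# Usage:
# if (!lockfile_has_changes()) renv::restore(clean = TRUE)
```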

## Installing a new package

* **Use the package** (`library(bestPackageEver)`, `bestPackageEver::<function>()`) somewhere in your code. This will implicitly add the package to the *dependencies* of the project.
* **Install the package** with `install.packages('bestPackageEver')`. As long as `renv` has been activated, this will install the package into the local *library*.
* **Take a snapshot with** `renv::snapshot()`. This will make a *snapshot* of the *dependencies* in the *lockfile*.
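
Put together, the install and snapshot steps might look like the sketch below. `add_package` is a hypothetical wrapper, not an `renv` function; it assumes the `renv` environment is active and that the package is already referenced somewhere in your code.

```r
# Hypothetical wrapper around the install and snapshot steps.
# Assumes renv is active and `pkg` is already used somewhere in your code.
add_package <- function(pkg) {
  install.packages(pkg)  # goes to the project library while renv is active
  renv::status()         # lets you inspect what is out of sync before saving
  renv::snapshot()       # records the project's dependencies in renv.lock
}

# Usage:
# add_package("bestPackageEver")
```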

## Removing a package

* **Remove from the&#x20;*****dependencies*****:** First, make sure the package is not actively used in your code. Remove any references to the package, such as functions or library calls. This ensures that the package is no longer part of your project's dependencies.
* **Uninstall from the&#x20;*****library*** with `remove.packages("bestPackageEver")`
* **Take a snapshot:** `renv::snapshot()`
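
To find the references that need removing in the first step, a plain-text search over your scripts is often enough. The sketch below is a rough base R approximation: `find_package_references` is a hypothetical helper, it looks only for `library(pkg)`, `require(pkg)`, and `pkg::` in `.R`, `.Rmd`, and `.qmd` files, and it can both miss unusual usages and match commented-out code.

```r
# List lines under `dir` that mention `pkg` via library(), require(), or `::`.
# Hypothetical helper; a text search, not a real dependency analysis.
find_package_references <- function(pkg, dir = ".") {
  files <- list.files(dir, pattern = "\\.(R|Rmd|qmd)$", recursive = TRUE,
                      full.names = TRUE, ignore.case = TRUE)
  files <- files[!grepl("(^|/)renv/", files)]  # skip renv's own folder
  pattern <- paste0("(library|require)\\(\\s*[\"']?", pkg,
                    "[\"']?\\s*\\)|\\b", pkg, "::")
  hits <- lapply(files, function(f) {
    lines <- readLines(f, warn = FALSE)
    idx <- grep(pattern, lines, perl = TRUE)
    if (length(idx) == 0) return(NULL)
    data.frame(file = f, line = idx, text = trimws(lines[idx]))
  })
  do.call(rbind, hits)  # NULL when nothing was found
}

# Usage in the project root:
# find_package_references("bestPackageEver")
```

A `NULL` result suggests the package is no longer referenced directly, so it should be safe to uninstall it and take a snapshot.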

