# Rmd-to-pdf GitHub workflow

## Why?

This is a good way to ensure that your notebook doesn't rely on you being connected to the shared drive, on files that only exist on your computer, etc.

## Prerequisite

You should be using `renv` in your repository. See [renv - reproducible R environments](/programming-info/computing-programming-guides/virtual-environments/renv-reproducible-r-environments.md).
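
A quick way to confirm the prerequisite from a terminal is to check for `renv.lock`, the lockfile that `renv` writes to the project root. This is a minimal sketch that assumes you run it from the root of your repository; `has_renv_lock` is a hypothetical helper name, not part of `renv` or of the lab's tooling.

```bash
# Hypothetical helper: succeeds if the given directory contains renv.lock.
has_renv_lock() {
  repo_dir="${1:-.}"
  [ -f "$repo_dir/renv.lock" ]
}

# Check the current directory (assumed to be the repository root).
if has_renv_lock .; then
  echo "renv.lock found"
else
  echo "renv.lock missing: set up renv in this repository first"
fi
```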

## How

We will use GitHub Actions, a service provided by GitHub that can run a series of commands on GitHub servers for you. That series of commands is referred to as a *workflow* and is defined by a file you put in the `.github/workflows/` folder in the root of your repo.

These workflows can be run automatically when a certain event happens (e.g., a push is made to the repository) using *triggers* you define in the same file. We, however, won't be doing that and will run our workflow manually, either in the browser or from the command line.

The workflow we are going to use knits the R Markdown file you specify to a pdf document and saves it on the GitHub servers. In the context of GitHub Actions, this is referred to as *uploading an artifact*.

The overall process:

1. (done once) Add a workflow file to your repository.
2. Run the workflow.
3. (if it fails) Read the log and debug.
4. Download the output pdf file.

### Workflow File

Here is a template of a workflow file you can use:

```yaml
on:
  workflow_dispatch
  
# This workflow only runs when you manually trigger it. This can be done from
# the Actions tab on GitHub which takes unnecessarily many steps. A quicker way
# is to use GitHub CLI (https://cli.github.com/):
#
# To run the workflow on the current branch (you can still use it if you don't
# use branching):
# gh workflow run rmd-to-pdf --ref `git branch --show-current`
#
# To see the results of the last 5 runs, use
# gh run list -w rmd-to-pdf -L 5
#
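# Less useful, but still nice, is to delete all the listed runs from GitHub and
# start over (the run ID is not the first column of the plain `gh run list`
# output, so ask for it explicitly):
# gh run list -w rmd-to-pdf --json databaseId -q '.[].databaseId' | xargs -n 1 gh run delete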


name: rmd-to-pdf

jobs:
  rmd-to-pdf:
    runs-on: ubuntu-latest
    env:
      GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
    steps:
      - name: Checkout repo
        uses: actions/checkout@v3
        with:
          fetch-depth: 1
          
      # - name: Clone a dataset to ~/BLAB_DATA
      #   uses: bergelsonlab/public-files/clone-blab-data@main
      #   with:
      #     repository: bergelsonlab/<repository-name>
      #     fetch-depth: 1
      
      - uses: bergelsonlab/public-files/knit-rmd-to-pdf@main
        with:
          rmd_path: <path-to/your/R-Markdown-notebook.Rmd>
```

* Copy it and save it as `.github/workflows/rmd-to-pdf.yaml` in your repository.
* Change `<path-to/your/R-Markdown-notebook.Rmd>` to match your repository.
* If you use one of our repo-based datasets, such as `seedlings-nouns` or `vihi_annotations`, uncomment the `Clone a dataset to ~/BLAB_DATA` step and change `<repository-name>` to the name of the repo you need. You can copy this step if you use multiple datasets.
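
Before committing the file, it can help to check that no template placeholder was left behind. The sketch below is one possible check, not part of the lab's tooling: `check_workflow_file` is a hypothetical helper, and it assumes that any `<...>` text on an uncommented line is a leftover placeholder, which holds for the template above but not for every workflow file.

```bash
# Hypothetical helper: succeeds only if the workflow file exists and no
# uncommented line still contains a <placeholder> from the template.
check_workflow_file() {
  workflow_file="${1:-.github/workflows/rmd-to-pdf.yaml}"
  if [ ! -f "$workflow_file" ]; then
    echo "missing: $workflow_file"
    return 1
  fi
  # Drop comment lines, then look for anything in angle brackets.
  if grep -v '^[[:space:]]*#' "$workflow_file" | grep -q '<[^>]*>'; then
    echo "placeholder left in $workflow_file"
    return 1
  fi
  echo "ok: $workflow_file"
}

# Usage, from the repository root:
#   check_workflow_file && git add .github/workflows/rmd-to-pdf.yaml
```

After the check passes, commit and push the file as you would any other change. GitHub only offers a manually triggered workflow in the Actions tab once the workflow file is on the repository's default branch.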

### Run the workflow

There are two ways:

#### In the browser

* Open the repository GitHub page.
* Click "Actions".
* Choose `rmd-to-pdf`.
* Find where it says "This workflow has a workflow\_dispatch event trigger."
* Click on "Run workflow", choose the branch, and confirm with the green "Run workflow" button.

#### On the command line

You will need to have the GitHub CLI (`gh`) installed. If you authenticated on GitHub using our [Set up Git and GitHub](/programming-info/computing-programming-guides/git-and-github/set-up-github.md) instructions, you should already have it installed. If not, go to [Set up Git and GitHub](/programming-info/computing-programming-guides/git-and-github/set-up-github.md) and find the installation instructions there.

The command to run:

```bash
gh workflow run rmd-to-pdf --ref `git branch --show-current`
```
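
If you run the workflow often, the command above can be wrapped together with a status check. This is a sketch under assumptions: `run_rmd_to_pdf` is a hypothetical helper name, and the five-second pause is a guess at how long the new run takes to show up in the list.

```bash
# Hypothetical helper: start the workflow on the current branch, then show
# the most recent run of that workflow.
run_rmd_to_pdf() {
  branch="$(git branch --show-current)"
  gh workflow run rmd-to-pdf --ref "$branch" || return 1
  # Assumption: the new run needs a few seconds to appear in the list.
  sleep 5
  gh run list -w rmd-to-pdf -L 1
}

# Usage, from inside the repository:
#   run_rmd_to_pdf
```

To follow a run until it finishes, `gh run watch` lets you pick a run in progress and prints its status as it changes.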

### Download the pdf

If everything goes well and the workflow runs successfully, the pdf file will be saved on GitHub servers. Here is how to access it:

* Open the repository GitHub page.
* Click "Actions".
* Choose `rmd-to-pdf` in the left pane.
* Click on the top `rmd-to-pdf` link in the table.
* Scroll down to "Artifacts".
* Click on the name of your R Markdown notebook to download an archive with the pdf.
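
The same download can be done from the command line with the GitHub CLI. The sketch below assumes that the most recent `rmd-to-pdf` run is the one you want and that it finished successfully; `download_latest_pdf` and the default folder name `rmd-to-pdf-output` are hypothetical names chosen for illustration.

```bash
# Hypothetical helper: download the artifacts of the most recent
# rmd-to-pdf run into a local folder.
download_latest_pdf() {
  out_dir="${1:-rmd-to-pdf-output}"
  # Ask gh for the ID of the newest run of this workflow.
  run_id="$(gh run list -w rmd-to-pdf -L 1 --json databaseId -q '.[0].databaseId')" || return 1
  if [ -z "$run_id" ]; then
    echo "no rmd-to-pdf runs found"
    return 1
  fi
  gh run download "$run_id" -D "$out_dir"
}

# Usage, from inside the repository:
#   download_latest_pdf            # saves into ./rmd-to-pdf-output
#   download_latest_pdf my-folder  # saves into ./my-folder
```

Unlike the browser download, `gh run download` unpacks the artifact, so the pdf should end up directly in the output folder rather than inside an archive.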

### If the workflow fails

You will get an email telling you that a workflow run failed. Click the link in the email to see the log of the run. Find the part where the failure occurred and use the information there to fix the problem. If something is unclear, send the link to the log to the lab technician.
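
If you prefer the terminal to the email link, the GitHub CLI can print the log of the failed steps. This is a minimal sketch that assumes the failed run is the most recent `rmd-to-pdf` run; `show_failed_log` is a hypothetical helper name.

```bash
# Hypothetical helper: print the log of the failed steps of the most
# recent rmd-to-pdf run.
show_failed_log() {
  run_id="$(gh run list -w rmd-to-pdf -L 1 --json databaseId -q '.[0].databaseId')" || return 1
  if [ -z "$run_id" ]; then
    echo "no rmd-to-pdf runs found"
    return 1
  fi
  gh run view "$run_id" --log-failed
}

# Usage, from inside the repository:
#   show_failed_log | less
```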

