---
title: "Working with a Compendium"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Working with a Compendium}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---



```{r, include = FALSE}
knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
```



<br />



**PLEASE READ THE
[GET STARTED](https://frbcesab.github.io/rcompendium/articles/rcompendium.html) VIGNETTE FIRST**



<br />



## Compendium content

First, create a **new empty RStudio project**. Let's call it `comp`. 
To create a new compendium structure, run `rcompendium::new_compendium()`.

By default, the following content is created:

```
comp/                             # Root of the compendium
│
├── comp.Rproj                    # RStudio project (created by user, optional)
│
├── .git/                         # GIT tracking folder
├── .gitignore                    # List of files/folders to be ignored by GIT
│                                 # (specific to the R language)
│
├── R/                            # R functions location
│   ├── fun-demo.R                # Example of an R function (to remove)
│   └── comp-package.R            # Dummy R file for high-level documentation
│
├── man/                          # R function help files (automatically updated)
│   ├── print_msg.Rd              # Documentation of the demo R function
│   └── comp-package.Rd           # High-level documentation
│
├── DESCRIPTION                   # Project metadata                              [*]
├── LICENSE.md                    # Content of the GPL (>= 2) license (default)
├── NAMESPACE                     # Automatically generated
├── .Rbuildignore                 # List of files/folders to be ignored while 
│                                 # checking/installing the package
│
├── README.md                     # GitHub README (automatically generated)
├── README.Rmd                    # GitHub README                                 [*]
│
├── data/                         # User data (.csv, .gpkg, etc.)
│   ├── raw-data/                 # Read-only files
│   └── derived-data/             # Modified data derived from raw data
│
├── analyses/                     # R scripts (not functions) to run analyses
│
├── outputs/                      # Outputs (R objects, .csv, etc.)
├── figures/                      # Figures (.png, .pdf, etc.)
│
└── make.R                        # Main R script to source all R scripts 
                                  # stored in analyses/

[*] These files are automatically created but the user needs to manually add 
    some information.
```

<br />

If `create_repo = TRUE` (default), a new GitHub repository will be created 
directly from R. It will be available at: `https://github.com/{{account}}/comp/` 
(where `{{account}}` is either your GitHub account or a GitHub organization).
This repository can be private if `private = TRUE`.
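
As a minimal sketch, the call below spells out the two arguments discussed above. It assumes the setup from the Get Started vignette is already done. The `interactive()` guard is an addition of this sketch (the call creates a remote repository, so it should not run unattended).

```r
# Options discussed above; both argument names are those of new_compendium()
opts <- list(
  create_repo = TRUE,  # default: create the GitHub repository from R
  private     = TRUE   # make that repository private
)

# Guard (an assumption of this sketch): run only interactively, from the root
# of the empty `comp` project, and only if rcompendium is installed
if (interactive() && requireNamespace("rcompendium", quietly = TRUE)) {
  do.call(rcompendium::new_compendium, opts)
}
```

Setting `create_repo = FALSE` instead keeps everything local.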



<br />



## Compendium metadata

### DESCRIPTION

The `DESCRIPTION` file contains important compendium metadata. 
By default `rcompendium` creates the following file:


```
Package: comp
Type: Package
Title: The Title of the Project                                              [*]
Version: 0.0.0.9000
Authors@R: c(
    person(given   = "John",
           family  = "Doe",
           role    = c("aut", "cre", "cph"),
           email   = "john.doe@domain.com",
           comment = c(ORCID = "9999-9999-9999-9999")))
Description: A paragraph providing a full description of the project (on     [*] 
    several lines...)
URL: https://github.com/jdoe/comp
BugReports: https://github.com/jdoe/comp/issues
License: GPL (>= 2)
Encoding: UTF-8
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.1.2
Imports: 
    devtools,
    here

[*] Title and Description must be adapted by the user.
```

This `DESCRIPTION` file format is specific to R packages, but it can also be used
to work with research compendia (see below). 
For further information on how to edit this file, please read 
[https://r-pkgs.org/description.html](https://r-pkgs.org/description.html).
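
Since `Title` and `Description` are the two fields that must be edited by hand, a quick check can catch a forgotten placeholder. The helper below is a sketch using only base R (`read.dcf()`). The name `unedited_fields()` is hypothetical and not part of `rcompendium`, and the placeholder strings are copied from the default file shown above.

```r
# Hypothetical helper: report which [*] fields still hold the template text.
# The placeholder strings are copied from the default DESCRIPTION shown above.
unedited_fields <- function(path = "DESCRIPTION") {
  # read.dcf() returns a character matrix with one column per requested field
  desc <- read.dcf(path, fields = c("Title", "Description"))

  placeholders <- c(
    Title       = "The Title of the Project",
    Description = "A paragraph providing a full description of the project"
  )

  # A field counts as unedited if it is missing or still starts with the
  # template text
  is_default <- vapply(names(placeholders), function(field) {
    value <- desc[1, field]
    is.na(value) || startsWith(value, placeholders[[field]])
  }, logical(1))

  names(placeholders)[is_default]
}
```

Called from the root of the compendium, `unedited_fields()` returns the names of the fields that still need editing, and an empty character vector once both have been adapted.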




<br />



### README

The `README.md` is the homepage of your repository on GitHub. Its purpose is to 
help visitors understand your project. Always edit the `README.Rmd` (not the 
`.md` version).

For further information, please read 
[https://r-pkgs.org/release.html?q=README#readme](https://r-pkgs.org/release.html?q=README#readme).



<br />



## Recommendations

- Keep the root of the project as clean as possible
- Store your raw data in `data/raw-data/`
- Document raw data modifications (with R scripts and/or R functions)
- Export modified raw data in `data/derived-data/` (or `outputs/`)
- Store only **R functions** in `R/`
- Store only **R scripts** and/or **Rmd** in `analyses/`
- Control your project with the `make.R` file (add lines to source R scripts)
- Build paths relative to the project root using `here::here()`
- Call external functions as `package::function()`
- Alternatively add `#' @import package` (or `#' @importFrom package function`) 
in `R/comp-package.R` to call external functions as `function()`
- Use `devtools::document()` to update the `NAMESPACE`
- Use `rcompendium::add_dependencies(".")` to update the list of required 
dependencies in `DESCRIPTION`
- Use `remotes::install_deps()` instead of `install.packages()` (this will install
required dependencies listed in `DESCRIPTION`)
- Use `devtools::load_all()` instead of `library()` (this will load
required dependencies listed in `DESCRIPTION` and R functions stored in `R/`)
- Do not `source()` your functions; use `devtools::load_all()` instead
- And **document** everything!
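
To illustrate several of these recommendations at once, here is a sketch of what a file in `R/` could look like: a documented function that calls external functions as `package::function()`. The file name, the function name `summarise_column()` and its arguments are hypothetical.

```r
## R/summarise_column.R (hypothetical file name)

#' Summarise a numeric column of a CSV file
#'
#' @param file path to a CSV file (e.g. built with `here::here()`)
#' @param column name of a numeric column in that file
#'
#' @return A named numeric vector with elements `n`, `mean` and `median`.
#'
#' @export
summarise_column <- function(file, column) {
  # External functions are called as package::function()
  data <- utils::read.csv(file)

  if (!column %in% names(data)) {
    stop("Column '", column, "' not found in ", file, call. = FALSE)
  }

  values <- data[[column]]

  c(n      = length(values),
    mean   = mean(values, na.rm = TRUE),
    median = stats::median(values, na.rm = TRUE))
}
```

After `devtools::load_all()`, a script in `analyses/` could call it as `summarise_column(here::here("data", "raw-data", "trees.csv"), "height")`, where the file `trees.csv` and the column `height` are again made up for the example.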



<br />



## Advanced features

### Using renv

The default structure created by `rcompendium::new_compendium()` is a good 
starting point for making analyses reproducible. You can increase reproducibility
by using the package [`renv`](https://rstudio.github.io/renv/). 

`renv` will freeze the exact package versions you depend on (in `renv.lock`). 
This ensures that each collaborator (or you in the future) will use the exact 
same versions of these packages.
Moreover, `renv` gives each project its own private package library, which 
isolates projects from one another.

To initialize `renv` for your compendium, use `renv = TRUE` in 
`rcompendium::new_compendium()` or call the function `rcompendium::add_renv()` 
after `rcompendium::new_compendium()`.

The `make.R` file will also be updated: `remotes::install_deps()` is replaced by 
`renv::restore()`.

<br />

**Working with renv**

1. Work as usual
2. Update `NAMESPACE` with `devtools::document()`
3. Update required dependencies in `DESCRIPTION` with `rcompendium::add_dependencies(".")`
4. Install required dependencies locally with `renv::install()`
5. Save the local environment with `renv::snapshot()`
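
Steps 2 to 5 can be collected in a small helper so that they always run in the same order. This is only a sketch: `update_environment()` is a hypothetical name, not a function provided by `rcompendium`, and it assumes that `devtools`, `rcompendium` and `renv` are installed and that it is called from the root of the compendium.

```r
# Hypothetical helper wrapping steps 2 to 5 of the list above
update_environment <- function() {
  devtools::document()                # 2. update NAMESPACE (and man/)
  rcompendium::add_dependencies(".")  # 3. update dependencies in DESCRIPTION
  renv::install()                     # 4. install required dependencies locally
  renv::snapshot()                    # 5. record package versions in renv.lock
}
```

Calling `update_environment()` after adding a new dependency to your code then keeps `NAMESPACE`, `DESCRIPTION` and `renv.lock` in sync.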



<br />



### Using docker

[Docker](https://www.docker.com/) is a tool that creates self-contained environments
(**containers**) within which the operating system, system libraries and software
(e.g. R, RStudio Server) are frozen and ready to use. Using a container requires a 
recipe (`Dockerfile`) that describes the steps needed to create the environment.
From this recipe, a Docker image is built: this is the template from which a container
is created.

More information on Docker for R users: 
[https://environments.rstudio.com/docker](https://environments.rstudio.com/docker)


To add a `Dockerfile` to your compendium, use `dockerfile = TRUE` in 
`rcompendium::new_compendium()` or call the function `rcompendium::add_dockerfile()` 
after `rcompendium::new_compendium()`. 

By default `rcompendium` creates the following `Dockerfile`:

```
FROM rocker/rstudio:4.1.3

MAINTAINER John Doe <john.doe@domain.com>


## Install system dependencies (for devtools) ----

RUN sudo apt update -yq \
 && sudo apt install --no-install-recommends libxml2-dev -yq \
 && sudo apt clean all \
 && sudo apt purge \
 && sudo rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*


## Copy local project ----

ENV folder="/home/rstudio/"

COPY . $folder
RUN chown -R rstudio:rstudio $folder


## Set working directory ----

WORKDIR $folder


## Install R packages ----

RUN R -e "install.packages('remotes', repos = c(CRAN = 'https://cloud.r-project.org'))" \
 && R -e "remotes::install_deps(upgrade = 'never')"
```

<br />

If you use `renv` with Docker (recommended) the last step will look like:

```
## Install R packages ----

ENV RENV_VERSION 0.15.4

RUN R -e "install.packages('remotes', repos = c(CRAN = 'https://cloud.r-project.org'))" \
 && R -e "remotes::install_github('rstudio/renv@${RENV_VERSION}')" \
 && sudo -u rstudio R -e "renv::restore()"
```

<br />

By default the Docker image is based on [rocker/rstudio](https://hub.docker.com/r/rocker/rstudio).
You can customize this `Dockerfile` (e.g. by adding system dependencies) and use a
different base Docker image (e.g. `tidyverse`, `verse`, `geospatial`, etc.). 
For more information: 
[https://github.com/rocker-org/rocker-versioned2](https://github.com/rocker-org/rocker-versioned2)

The versions of R and `renv` (if applicable) specified in the 
`Dockerfile` match those installed on the local system.

When the image is built, the whole project is added to the image. When
a container is launched, the default working directory is the root of the
project.

<br />

Once the project is ready to be shared, the collaborator (or you) must follow 
these steps: 

1. Clone the repository
2. Build the Docker image (in a terminal) from the `Dockerfile` by
running: `docker build -t "image_name" .` (this can take time)
3. Start a container (in a terminal) from this new Docker image `image_name` by running: 
`docker run --rm -p 127.0.0.1:8787:8787 -e DISABLE_AUTH=true image_name`
4. Open a Web browser and visit `127.0.0.1:8787`: a new instance of RStudio 
Server is available with everything ready-to-use (data, code, packages, etc.)
5. To reproduce the analysis, run: `source("make.R")`
