---
title: "Basic functionalities"
author: "Godwin Yung"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Basic functionalities}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

This vignette walks through the basic functions in npsurvSS. By the end, users should be able to:

* create an object that captures the distributional assumptions for a treatment arm
* visualize the distributional assumptions
* calculate power and sample size for a two-arm study
* approximate an event-driven study by a time-driven study
* simulate a two-arm study

## Creating objects of class "arm"

The cornerstone of npsurvSS lies in objects of class "arm". These objects are lists that capture for a treatment arm assumptions regarding its sample size, accrual, survival, censoring, and duration of follow-up. Once created, they serve as inputs for other functions, including functions for power/sample size calculation and trial simulation. 

The following code creates two arms, an `active` arm and a `control` arm. Both arms will accrue 120 patients uniformly over 6 months and follow them for an additional 12 months. Patients will be subjected to loss of follow-up at an exponential rate of 0.00578. `active` and `control` patients will experience event at exponential rates of 0.0462 and 0.0578, respectively. The hazard ratio between the two arms is therefore approximately 0.8.
```{r}
library(npsurvSS)
active <- create_arm(size=120, 
                     accr_time=6, 
                     surv_scale=0.0462, 
                     loss_scale=0.00578, 
                     follow_time=12)
control <- create_arm(size=120, 
                     accr_time=6, 
                     surv_scale=0.0578, 
                     loss_scale=0.00578, 
                     follow_time=12)
```
In practice, investigators seldom consider exponential distributions on the hazard rate scale. Instead, they consider the median survival or survival probability at some milestone _t_. We have defined additional functions to facilitate this practice. `per2haz` is a simple code that can convert an exponential survival percentile to the hazard rate and vice versa. 
```{r}
active <- create_arm(size=120, 
                     accr_time=6, 
                     surv_scale=per2haz(15), # corresponds to 15 month median
                     loss_scale=per2haz(120), # corresponds to 120 month median
                     follow_time=12)
control <- create_arm(size=120, 
                     accr_time=6, 
                     surv_scale=per2haz(12), # corresponds to 12 month median
                     loss_scale=per2haz(120), 
                     follow_time=12)
per2haz(15) # convert median survival to hazard rate
per2haz(0.0462) # convert hazard rate to median survival
```
Alternatively, `create_arm_lachin` allows investigators to specify exponential survival and censoring distributions by providing median survivals or milestone survivals.
```{r}
active <- create_arm_lachin(size=120, 
                            accr_time=6, 
                            surv_median=15,
                            loss_milestone=c(120, 0.5), # corresponds to 120 month median
                            follow_time=12)
control <- create_arm_lachin(size=120, 
                             accr_time=6, 
                             surv_milestone=c(12, 0.5), # corresponds to 12 month median
                             loss_median=120, 
                             follow_time=12)
class(active)
```
Note that objects created by `create_arm_lachin` belong to class "lachin" and "arm". "lachin" is a subclass of "arm". It is named after the class of distributions considered by Lachin (1986), which covers uniform/truncated-exponential accrual, exponential survival, and exponential censoring. Check out the R Documentation `?create_arm` for examples of "arm" objects with more sophisticated assumptions, such as piecewise-uniform accrual, piecewise-exponential/Weibull survival, and Weibull censoring. While objects created by `create_arm_lachin` are always of class "lachin" and "arm", objects created by `create_arm` are always of class "arm", but not necessarily of class "lachin".

## Visualizing distributional assumptions
Having created an "arm" object, visualizing its assumptions is easy. For example, the following code plots the accrual cumulative distribution function (CDF):
```{r}
x <- seq(0, 6, 0.1)
plot(x, paccr(q=x, arm=control), 
     xlab="Time from first patient in (month)", 
     ylab="Accrual CDF",
     type="l")
```

Likewise, the survival function:
```{r}
x <- seq(0, 18, 0.1)
plot(x, psurv(q=x, arm=control, lower.tail=F), 
     xlab="Time from study entry (month)", 
     ylab="Survival function",
     type="l")
```

Just as `pbinom` in the R package stats is accompanied by functions for the density, quantile, and random generation, `paccr` is similarly accompanied by `daccr`, `qaccr`, and `raccr`. Distribution functions `psurv` and `ploss` are further accompanied by `hsurv` and `hloss` for the hazard.

## Calculating power and sample size
Given an `active` arm and a `control` arm, calculating power and sample size is also easy. The following code calculates power under the default setting of an unweighted log-rank test with one-sided alpha 0.025:
```{r}
power_two_arm(control, active)
```
To calculate power for other tests:
```{r}
# unweighted log-rank
power_two_arm(control, active, test=list(test="weighted logrank"))
# Gehan-Breslow weighted log-rank
power_two_arm(control, active, test=list(test="weighted logrank", weight="n"))
# difference in 12 month survival
power_two_arm(control, active, test=list(test="survival difference", milestone=12))
# ratio of 12 month RMST
power_two_arm(control, active, test=list(test="rmst ratio", milestone=12))
```
Power for multiple tests can be calculated simulateously:
```{r}
power_two_arm(control, active, test=list(list(test="weighted logrank"),
                                         list(test="weighted logrank", weight="n"),
                                         list(test="survival difference", milestone=12),
                                         list(test="rmst ratio", milestone=12)
))
```
To calculate sample size required to achieve 80% power:
```{r}
size_two_arm(control, active, 
             test=list(list(test="weighted logrank"),
                       list(test="weighted logrank", weight="n"),
                       list(test="survival difference", milestone=12),
                       list(test="rmst ratio", milestone=12)
             ))
```
Note that `size_two_arm` returns the required sample size _n_ and expected number of events _d_ (per arm and total). When calculating the required sample size per arm, it considers as input the specified ratio between the two arms (e.g. 120:120) while ignoring their individual values (e.g. 120 and 120). Thus, the following two "arm" objects result in the same sample size calculation for the unweighted log-rank test:
```{r}
control_new <- control
active_new  <- active
control_new$size  <- 1
active_new$size   <- 1
size_two_arm(control_new, active_new)
```
Sample size for a trial with 2:1 randomization in favor of the active arm can be calculated like so:
```{r}
active_new$size   <- 2
size_two_arm(control_new, active_new)
```

## Event-driven trials
By containing the keys `follow_time` and `total_time`, "arm" objects intrinsically apply to time-driven trials that end when a fixed period of time has elapsed after the last patient in. However, they can also be used to approximate event-driven trials, trials in which the study ends when a desired number of events has been observed. Specifically, a trial requiring _d_ events can be approximated by a trial of length _t_,  where the expected number of events at _t_ is equal to _d_. The functions `exp_events` and `exp_duration` can be useful for this purpose:
```{r}
exp_events(control, active) # expected number of events
tau <- exp_duration(control, active, d=150) # study duration for expected number of events to equal d

tau
```
Therefore, under the given assumptions, a trial requiring 150 events can be approximated by a 23.75-month long trial. When updating the trial duration in an "arm" object, it is important to update both the `follow_time` and `total_time` to ensure their consistency:
```{r}
control_new <- control
control_new$total_time  <- tau
control_new$follow_time <- tau - control_new$accr_time
active_new  <- active
active_new$total_time   <- tau
active_new$follow_time  <- tau - active_new$accr_time

exp_events(control_new, active_new) # check expected number of events
```

## Simulating a trial
Finally, time-driven and event-driven trials can be simulated using the following code:
```{r}
trial1 <- simulate_trial(control, active, duration=18)
head(trial1, 5)
table(trial1$arm, trial1$reason)
max(trial1$time.total)

trial2 <- simulate_trial(control, active, events=150)
head(trial2, 5)
sum(trial2$censor)
```
If both `duration` and `events` are provided, the study will end whenever one of the criteria is met:
```{r}
trial3 <- simulate_trial(control, active, duration=18, events=150)
max(trial3$time.total)
sum(trial3$censor)
```
Should it be desired, investigators may also simulate complete data (accrual, survival, censoring) for each individual treatment arm. Note that no cutoff (by number of events or time) is applied. Hence, no patients are administratively censored:
```{r}
control_sim <- simulate_arm(control)
head(control_sim, 5)
table(control_sim$arm, control_sim$reason)
```
