---
title: "rddtools"
author: "Matthieu Stigler"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{rddtools}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---


```{r, echo = FALSE, message = FALSE}
knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
```

**rddtools** works in an object-oriented way: the user defines the characteristics of the data once, creating an *rdd_data* object, to which the different analysis tools can then be applied.

# Data Preparation and Visualisation
Load the package, and load the built-in dataset from [Lee 2008]:

```{r}
library(rddtools)
data(house)
```

Declare the data as an *rdd_data* object, giving the outcome `y`, the running variable `x`, and the cutpoint:

```{r}
house_rdd <- rdd_data(y=house$y, x=house$x, cutpoint=0)
```


You can now directly summarise and visualise this data:

```{r dataPlot}
summary(house_rdd)
plot(house_rdd)
```


# Parametric Estimation

Estimate the discontinuity parametrically by fitting a 4th-order polynomial:

```{r reg_para}
reg_para <- rdd_reg_lm(rdd_object=house_rdd, order=4)
reg_para

plot(reg_para)
```
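To make the idea behind the parametric estimate concrete, here is a minimal base-R sketch on simulated data. It is not the code used inside `rdd_reg_lm()`: the `sim_*` names and the simulated data are hypothetical, and it assumes separate polynomial coefficients on each side of a cutpoint at 0.

```{r}
# Hypothetical simulated data: running variable in [-1, 1], true jump of 0.5 at 0
set.seed(1)
sim_n <- 2000
sim_x <- runif(sim_n, min = -1, max = 1)
sim_D <- as.numeric(sim_x >= 0)  # treatment indicator: 1 at or above the cutpoint
sim_y <- 0.2 + 0.5 * sim_D + 0.8 * sim_x - 0.3 * sim_x^2 + rnorm(sim_n, sd = 0.1)

# 4th-order raw polynomial in x, interacted with the treatment indicator so that
# each side of the cutpoint gets its own polynomial coefficients
sim_fit <- lm(sim_y ~ sim_D * poly(sim_x, degree = 4, raw = TRUE))

# Because the cutpoint is at 0, the coefficient on sim_D is the estimated jump
coef(sim_fit)["sim_D"]
```

With a cutpoint other than 0, the running variable would first have to be centred on the cutpoint for the coefficient to keep this interpretation.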


# Non-parametric Estimation

Run a simple local regression, using the [Imbens and Kalyanaraman 2012] bandwidth.

```{r RegPlot}
bw_ik <- rdd_bw_ik(house_rdd)
reg_nonpara <- rdd_reg_np(rdd_object=house_rdd, bw=bw_ik)
print(reg_nonpara)
```
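The principle of a local linear regression around the cutpoint can be sketched in base R as a kernel-weighted least-squares fit. This is an illustration only, not the implementation of `rdd_reg_np()`: the `np_*` names and the data are hypothetical, the bandwidth is chosen by hand instead of by the Imbens and Kalyanaraman procedure, and a triangular kernel is assumed.

```{r}
# Hypothetical simulated data: smooth trend plus a true jump of 0.5 at 0
set.seed(2)
np_n <- 2000
np_x <- runif(np_n, min = -1, max = 1)
np_D <- as.numeric(np_x >= 0)
np_y <- 0.2 + 0.5 * np_D + sin(np_x) + rnorm(np_n, sd = 0.1)

# Hand-picked bandwidth (an assumption, not a data-driven choice)
np_bw <- 0.3

# Triangular kernel weights: largest at the cutpoint, zero beyond the bandwidth
np_w <- pmax(0, 1 - abs(np_x) / np_bw)

# Weighted linear fit with separate slopes on each side, using only
# observations that receive a positive weight
np_fit <- lm(np_y ~ np_D * np_x, weights = np_w, subset = np_w > 0)

# Estimated jump at the cutpoint
coef(np_fit)["np_D"]
```

A smaller bandwidth reduces the bias from the curvature of the trend but uses fewer observations, which is the trade-off that the bandwidth sensitivity checks explore.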

# Regression Sensitivity Tests

One can easily check the sensitivity of the estimate to different bandwidths:
```{r SensiPlot}
plotSensi(reg_nonpara, from=0.05, to=1, by=0.1)
```

Or run a placebo test, which estimates the RDD effect at fake cutpoints:
```{r placeboPlot}
plotPlacebo(reg_nonpara)
```

# Design Sensitivity Tests

Design sensitivity tests check whether the discontinuity found can actually be attributed to other causes. Two types of tests are available:

+ Discontinuity comes from manipulation: test whether there is possible manipulation of the running variable around the cutoff, using the McCrary (2008) test: **dens_test()**
+ Discontinuity comes from other variables: test whether a discontinuity also arises in the covariates. Currently, only simple tests of equality of covariates around the threshold are available.

## Discontinuity comes from manipulation: McCrary test

Simply use the function **dens_test()** on either the raw data or the regression output:
```{r DensPlot}
dens_test(reg_nonpara)
```

## Discontinuity comes from covariates: covariates balance tests

Two tests are available:

+ equal means of covariates: **covarTest_mean()**
+ equal distribution of covariates: **covarTest_dis()**


Since the Lee (2008) dataset contains no covariates, we need to simulate some.
Here we simulate three variables, the second of which has a different mean to the left and to the right of the cutpoint.

```{r}
set.seed(123)
n_Lee <- nrow(house)
Z <- data.frame(z1 = rnorm(n_Lee, sd=2), 
                z2 = rnorm(n_Lee, mean = ifelse(house$x < 0, 5, 8)), 
                z3 = sample(letters, size = n_Lee, replace = TRUE))
house_rdd_Z <- rdd_data(y = house$y, x = house$x, covar = Z, cutpoint = 0)
```

The tests correctly reject equality for the second covariate, and correctly do not reject it for the first and third.
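The logic of such covariate balance tests can be sketched in base R: keep the observations within a bandwidth of the cutpoint, then compare each covariate on the two sides. This is a hedged illustration, not the package's implementation; the `bal_*` names, the simulated data, and the bandwidth of 0.3 are all assumptions, and the comparison uses a Welch t-test for means and a Kolmogorov-Smirnov test for distributions.

```{r}
# Hypothetical simulated data: z1 is balanced, z2 shifts its mean at the cutpoint
set.seed(3)
bal_n <- 2000
bal_x <- runif(bal_n, min = -1, max = 1)
bal_z1 <- rnorm(bal_n, sd = 2)
bal_z2 <- rnorm(bal_n, mean = ifelse(bal_x < 0, 5, 8))

# Keep observations within a hand-picked bandwidth of the cutpoint at 0
bal_bw <- 0.3
bal_left <- abs(bal_x) <= bal_bw & bal_x < 0
bal_right <- abs(bal_x) <= bal_bw & bal_x >= 0

# p-values for equality of means (t-test) and of distributions (KS test)
bal_p <- c(
  z1_mean = t.test(bal_z1[bal_left], bal_z1[bal_right])$p.value,
  z2_mean = t.test(bal_z2[bal_left], bal_z2[bal_right])$p.value,
  z1_dist = ks.test(bal_z1[bal_left], bal_z1[bal_right])$p.value,
  z2_dist = ks.test(bal_z2[bal_left], bal_z2[bal_right])$p.value
)
round(bal_p, 4)
```

The p-values for `z2` should be very small, since its mean shifts by three standard deviations at the cutpoint, whereas those for `z1` would typically not indicate any imbalance.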
