---
title: "Using provenance to debug errors and warnings"
date: "June 11, 2020"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Using provenance to debug errors and warnings}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
Errors and warnings are a clear sign that a script is not working
correctly.  While sometimes just seeing the error message is enough 
to know how to fix the problem, that is not always the case.  
`debug.error`, `debug.warning`, and `debug.type.changes` are three
functions to help a user understand an error and what portions of 
the script led to the error.

## debug.error and debug.warning

`debug.error` and `debug.warning` both provide information on the 
backwards lineage of an error or warning message produced by R.
The backward lineage contains the lines of code that led either
directly or indirectly to computing a value that is used on the
line that produced the error.

## Example

Let `myScript.R` be the following:
```
x <- 1
y <- 2
x <- a + x
```

Running this script will result in the following error since no
value has been assigned to a:
```
Error in eval(annot, environ, NULL) : object 'a' not found
```

The result of `debug.error()` is:
```
Your Error: Error in eval(annot, environ, NULL): object 'a' not found

Code that led to error message:

  scriptNum scriptName startLine       code
1         1 myScript.R         1     x <- 1
2         1 myScript.R         3 x <- a + x
```
This shows that lines 1 and 3 may have contributed to the error.
Notice that line 2 is not shown.  It is not part of the lineage of the statement that resulted in the error, since statement 3 neither uses `y`
nor any other variable that itself depends on `y`.

The data frame returned by `debug.error` contains the following columns:

* `scriptNum` The script number the error is associated with.
* `scriptName` The name of the script the error is associated with.
* `startLine` The line number the error is reported on
* `code` The line of code which resulted in the error.

`debug.warning` similarly displays the lineage of a warning message,
although it does not currently support the connection with Stack Overflow, which is discussed below.

## Usage

The function signature for `debug.error` is:
```
debug.error(stack.overflow = FALSE)
```

The parameter for this function is:

* `stack.overflow` If TRUE, the error message will be searched for on Stack Overflow.

This function may be called only after initialising the debugger using either 
`prov.debug`, `prov.debug.run`, or `prov.debug.file`. For example:
```
prov.debug.run("myScript.R")
debug.error()
debug.error(stack.overflow = TRUE)
```

## Stack Overflow

When TRUE is passed in for the stack.overflow parameter, in addition to
returning the backwards lineage of the error, the error will also
be searched on Stack Overflow.  The user will be presented with
the titles of the top 6 matching search results.  The user can then
select one or more of these and a browser window will open displaying
the corresponding Stack Overflow page.

The result of `debug.error(stack.overflow = TRUE)` is:
```
Your Error: Error in eval(annot, environ, NULL): object 'a' not found

Code that led to error message:

  scriptNum scriptName startLine       code
1         1 myScript.R         1     x <- 1
2         1 myScript.R         3 x <- a + x


Results from StackOverflow:
1. "Object not found error with ddply inside a function"                  
2. "ggplot object not found error when adding layer with different data"  
3. "Error in eval(expr, envir, enclos) : object not found"                
4. "data.table throws \"object not found\" error"                         
5. "Object not found error when passing model formula to another function"
6. "Object not found error with ggplot2"                                  

Choose a numeric value that matches your error the best or q to quit: 
```


## debug.type.changes

`debug.type.changes` is also intended to help users resolve errors.  
In this case, it is intended to help with a specific error in which
a variable is bound to a new value and that new value has a different
type.  Since R is not a type-checked language, these errors can 
easily creep into a program and can be hard to debug.  For example,
this can occur if a variable name gets reused for a different purpose,
or if a vectorizing operation changes a variable from a single 
scalar value to a long vector unexpectedly.

Consider this simple example:
```
a <- 1
<code omitted>
a <- a + b
```
If the programmer thought that b was an integer and instead it was a 
longer vector, they might
have expected that they were simply adding two integers together.
However, now the type of `a` has changed from a vector of length 1 to
a vector as long as the vector referenced by b.

Calling `debug.type.changes()` will show all variables whose type has 
changed, what those values were, and where the changes occurred, as 
shown here:

```
$a
                          value container dimension    type       code  scriptNum    scriptName startLine
1                             1    vector         1 numeric     a <- 1  1         1 typechanges.R         1
2  2  3  4  5  6  7  8  9 10 11    vector        10 numeric a <- a + b  2         1 typechanges.R         3
```
