---
title: "Basic Library Operations"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Basic Library Operations}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```
The main motivation for developing the **libr** package is to create and use 
data libraries and data dictionaries.  These concepts are useful when 
dealing with sets of related data files.  The `libname()` function allows
you to define a library for an entire directory of data files.  The library
can then be manipulated as a whole using the `lib_*` functions in the **libr**
package.


### Basic Library Operations
There are four main **libr** functions for creating and using a data library:

* `libname()`
* `lib_load()`
* `lib_unload()`
* `lib_write()`

The `libname()` function creates a data library.  The function has parameters
for the library name and a directory to associate it with.  If the directory
has existing data files, those data files will be automatically loaded
into the library.  Once in the library, the data can be accessed using list
syntax.

You may create a data library for several different types of files: 'rds', 'Rdata',
'rda', 'csv', 'xlsx', 'xls', 'sas7bdat', 'xpt', and 'dbf'.  The type of library is
defined using the `engine` parameter on the `libname()` function.  The default
data engine is 'rds'.  The data engines will attempt to identify the correct
data type for each column of data.  You may also control the data type of 
the columns using the `import_specs` parameter and the `specs()` and 
`import_spec()` functions.

If you prefer to access the data via the workspace, call
the `lib_load()` function on the library.  This function will load the 
library data into the parent frame, where it can be accessed using a two-level
(&lt;library&gt;.&lt;dataset&gt;) name.  

When you are done with the data, call the `lib_unload()` function to remove
the data from the parent frame and put it back in the library list.  To write
any added or modified data to disk, call the `lib_write()` function. The 
`lib_write()` function will only write data that has changed since the last
write.

The following example will illustrate some basic functionality of the 
**libr** package regarding the creation of libnames and use of dictionaries.
The example first places some sample data in a temp directory for
illustration purposes.  Then the example creates a libname from the temp
directory, loads it into memory, adds data to it, and then unloads and 
writes everything to disk:
```{r eval=FALSE, echo=TRUE}
library(libr)

# Create temp directory
tmp <- tempdir()

# Save some data to temp directory
# for illustration purposes
saveRDS(trees, file.path(tmp, "trees.rds"))
saveRDS(rock, file.path(tmp, "rocks.rds"))

# Create library
libname(dat, tmp)
# library 'dat': 2 items
# - attributes: not loaded
# - path: C:\Users\User\AppData\Local\Temp\RtmpCSJ6Gc
# - items:
#    Name Extension Rows Cols   Size        LastModified
# 1 rocks       rds   48    4 3.1 Kb 2020-11-05 23:25:34
# 2 trees       rds   31    3 2.4 Kb 2020-11-05 23:25:34

# Examine data dictionary for library
dictionary(dat)
# A tibble: 7 x 9
#   Name  Column Class   Label Description Format Width  Rows   NAs
#   <chr> <chr>  <chr>   <lgl> <lgl>       <lgl>  <lgl> <int> <int>
# 1 rocks area   integer NA    NA          NA     NA       48     0
# 2 rocks peri   numeric NA    NA          NA     NA       48     0
# 3 rocks shape  numeric NA    NA          NA     NA       48     0
# 4 rocks perm   numeric NA    NA          NA     NA       48     0
# 5 trees Girth  numeric NA    NA          NA     NA       31     0
# 6 trees Height numeric NA    NA          NA     NA       31     0
# 7 trees Volume numeric NA    NA          NA     NA       31     0

# Load library
lib_load(dat)

# Examine workspace
ls()
# [1] "dat" "dat.rocks" "dat.trees" "tmp"

# Use data from the library
summary(dat.rocks)
#      area            peri            shape              perm        
# Min.   : 1016   Min.   : 308.6   Min.   :0.09033   Min.   :   6.30  
# 1st Qu.: 5305   1st Qu.:1414.9   1st Qu.:0.16226   1st Qu.:  76.45  
# Median : 7487   Median :2536.2   Median :0.19886   Median : 130.50  
# Mean   : 7188   Mean   :2682.2   Mean   :0.21811   Mean   : 415.45  
# 3rd Qu.: 8870   3rd Qu.:3989.5   3rd Qu.:0.26267   3rd Qu.: 777.50  
# Max.   :12212   Max.   :4864.2   Max.   :0.46413   Max.   :1300.00 

# Add data to the library
dat.trees_subset <- subset(dat.trees, Girth > 11)

# Add more data to the library
dat.cars <- mtcars

# Unload the library from memory
lib_unload(dat)

# Examine workspace again
ls()
# [1] "dat" "tmp"

# Write the library to disk
lib_write(dat)
# library 'dat': 4 items
# - attributes: not loaded
# - path: C:\Users\User\AppData\Local\Temp\RtmpCSJ6Gc
# - items:
#           Name Extension Rows Cols   Size        LastModified
# 1        rocks       rds   48    4 3.1 Kb 2020-11-05 23:37:45
# 2        trees       rds   31    3 2.4 Kb 2020-11-05 23:37:45
# 3         cars       rds   32   11 7.3 Kb 2020-11-05 23:37:45
# 4 trees_subset       rds   23    3 1.8 Kb 2020-11-05 23:37:45

# Clean up
lib_delete(dat)

# Examine workspace again
ls()
# [1] "tmp"
```


Next: [Library Management](libr-management.html)
