---
title: "Draw your own Rectangular Statistical Cartogram with **recmap**"
author: "Christian Panse"
date: "`r Sys.Date()`"
bibliography:
  - recmap.bib
csl: ieee.csl
output:
  tufte::tufte_html: default
  tufte::tufte_handout:
    toc: true
    citation_package: natbib
    latex_engine: xelatex
vignette: >
  %\VignetteIndexEntry{Draw your own Rectangular Statistical Cartogram  with recmap}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

# Introduction

This package contains a C++ implementation of the RecMap algorithm 
[[@recmap]](http://dx.doi.org/10.1109/INFVIS.2004.57), [@2016arXiv160600464P] to draw maps according to 
given statistical values. These so-called cartograms or value-by-area-maps may 
be used to visualize any geospatial-related data, e.g., political, economic or 
public health data.
The input consists of a map represented by overlapping rectangles.
This map is defined by the following parameters for each map region:

- a tuple of (x, y) values corresponding to the (longitude, latitude) position,
- a tuple of (dx, dy) of expansion along (longitude, latitude), 
- and a statistical value z.

The (x, y) coordinates represent the center of the minimal bounding boxes (MBB),
The coordinates of the MBB are derived by adding or subtracting the (dx, dy) 
tuple accordingly. The tuple (dx, dy) also defines the ratio of the map region.
The statistical values define the desired area of each map region. 

The output is a rectangular cartogram where the map regions are:

- Non-overlapping, 
- connected, 
- ratio and area of each rectangle correspond to the desired areas,
- rectangles are placed parallel to the axes.

The construction heuristic  places the rectangles in a way that important 
spatial constraints, in particular

- the topology of the pseudo dual graph,
- the relative position of MBB centers.

are tried to be preserved.

The ratios are preserved, and the area of each region corresponds to 
the as input given statistical value z.

The graphic below depicts a typical example of a rectangular cartogram drawing.

```{r eval = TRUE, echo = FALSE}
options(prompt = "R> ",
  continue = "+  ",
  width = 70,
  useFancyQuotes = FALSE,
  warn = -1)
```

```{r fig.width=7, fig.height=3.5, fig.retina=2, fig.align='left', fig.cap="Rectangular Cartogram of the U.S. election 2004; The area corresponds to the number of electors (color indicates the party red: democrats / blue: Republican; the color intensity ~ outcome of the vote.). The graphic was computed by using the original implementation of the construction heuristic RecMap MP2 introduced in [@recmap].", echo=FALSE, warning=FALSE, comment="ccc", error=FALSE, message=FALSE}

library(recmap)
op <- par(mar = c(0,0,0,0), bg = NA)
recmap:::.draw_recmap_us_state_ev()
par(op)
# detach("package:recmap", unload=TRUE)
```

# The Usage

attach the package
```{r eval=TRUE, echo=TRUE, message=TRUE}
library(recmap)
```

look into for documentation
```{r eval=FALSE}
help(package="recmap") 
```

## Input - using the U.S. `state` Facts and Figures Dataset

```{r}
usa <- data.frame(x = state.center$x, 
    y = state.center$y, 
    # make the rectangles overlapping by correcting lines of longitude distance
    dx = sqrt(state.area) / 2 / (0.8 * 60 * cos(state.center$y*pi/180)), 
    dy = sqrt(state.area) / 2 / (0.8 * 60) , 
    z = sqrt(state.area),
    name = state.name)
```

## Compute Pseudo Dual Graph (PD)

The rectangles have to  overlap to compute the dual graph. This enables to 
generate valid input having only the (x, y) coordinates of the map region.

```{r fig.width=7, fig.height=3}
library(recmap)
op <- par(mfrow = c(1 ,1), mar = c(0, 0, 0, 0), bg = NA)
plot.recmap(M <- usa[!usa$name %in% c("Hawaii", "Alaska"), ],  
            col.text = 'black', lwd=2)
```

## Apply a Metaheuristic

The index order of the input map
has an impact to the resulting cartogram. This algorithmic property is caused due to 
the computation of the dual graph. In [@recmap] a genetic algorithm was applied 
as metaheuristic. 
Due to the limited computing resources on the CRAN 
check systems, we do not use all the potential of the metaheuristic.


Study the examples of the reference manual `?recmapGA` on 
how the [GA](https://cran.r-project.org/package=GA) package can be used.


## Objective Functions

The **topology error** is an indicator of the deviation of
the neighborhood relationships.
The error is computed by counting the differences 
between dual graphs or adjacency graphs of map and cartogram 

The **relative positions error**
measures the angle difference between all region centers.


## Output

The output is a `data.frame` object.

```{r}
Cartogram <- recmap(Map <- usa[!usa$name %in% c("Hawaii", "Alaska"), ])
head(Cartogram)
```

# Application

## Rectangular Map Approximation

```{r fig.width=8,  fig.height=4.5, fig.align='left', fig.cap="Rectangular Map Approximation - rectangle area correspond  to state area." }

smp <- c(29, 22, 30, 3, 17, 8, 9, 41, 18, 15, 38, 35, 21, 23, 19, 6, 31, 32, 20, 
        28, 48, 4, 13, 14, 42, 37, 5, 16, 36 , 43, 25, 33, 12, 7, 39, 44, 2, 47,
        45, 46, 24, 10, 1,11 ,40 ,26 ,27 ,34)

op <- par(mfrow = c(1 ,1), mar = c(0, 0, 0, 0), bg = NA)
plot(Cartogram.Area <- recmap(M[smp, ]),
            col.text = 'black', lwd = 2)
```

```{r}
summary.recmap(M)
summary(Cartogram.Area)
```

## `state.x77[, 'Population']`

```{r fig.width=8, fig.height=4, fig.align='left', fig.cap="Area ~ population estimate as of July 1, 1975;", warning=FALSE}

op <- par(mfrow = c(1 ,1), mar = c(0, 0, 0, 0), bg = NA)
usa$z <- state.x77[, 'Population']
M <- usa[!usa$name %in% c("Hawaii", "Alaska"), ]
plot(Cartogram.Population <- recmap(M[order(M$x), ]),
            col.text = 'black', lwd = 2)

```

```{r fig.width=8, fig.height=4, fig.align='left', fig.cap="Area ~ population estimate as of July 1, 1975; a better index order has been chosen to minimize the relative position error."}
# index order

smp <- c(20,47,4,40,9,6,32,33,3,10,34,22,2,28,15,12,39,7,42,45,19,13,43,30,24,
         25,11,17,37,41,26,29,21,35,8,36,14,16,31,48,46,38,23,18,1,5,44,27)

op <- par(mfrow = c(1 ,1), mar = c(0, 0, 0, 0), bg = NA)
plot(Cartogram.Population <- recmap(M[smp,]), col.text = 'black', lwd = 2)
```

## `state.x77[, 'Income']`

```{r fig.width=8, fig.height=4, fig.align='left', fig.cap="Area ~ capita income (1974);"}
usa$z <- state.x77[, 'Income']
M <- usa[!usa$name %in% c("Hawaii", "Alaska"), ]
op <- par(mfrow = c(1 ,1), mar = c(0, 0, 0, 0), bg = NA)
plot(Cartogram.Income <- recmap(M[order(M$x),]),
  col.text = 'black', lwd = 2)
```

## `state.x77[, 'Frost']`

```{r recmapGA, fig.width=8, fig.height=4, fig.align='left', warnings = FALSE, fig.cap="Area ~ mean number of days with minimum temperature below freezing (1931–1960) in capital or large city;", error = TRUE }
usa$z <- state.x77[, 'Frost'] 
M <- usa[!usa$name %in% c("Hawaii", "Alaska"), ]
op <- par(mfrow = c(1 ,1), mar = c(0, 0, 0, 0), bg = NA)
gaControl("useRcpp" = FALSE)
Frost <- recmapGA(M, seed = 1)
plot(Frost$Cartogram, 
            col.text = 'black', lwd = 2)
```

```{r Frost, error=TRUE}
summary(Frost)
```

More interactive examples using `state.x77` data are available by
running the code snippet below. 
```{r eval=FALSE}
# Requires to install the suggested packages
# install.packages(c('colorspace', 'maps', 'noncensus', 'shiny'))

library(shiny)

recmap_shiny <- system.file("shiny-examples", package = "recmap")
shiny::runApp(recmap_shiny, display.mode = "normal")
```

## Synthetic input maps - checkerboard

Checkerboards provide examples of sets of map regions which 
do not have ideal cartogram solutions according to Definition 1 [@cartodraw].


```{r fig.width=7, fig.height=2.5, fig.align='center', fig.retina=2, fig.cap="checkerboard fun - input, area of black regions have to be four times as big as white regions (left); solution found by a greedy random algorithm (middle); solution found by genetic algorithm (right)", fig.align='left'}
op <- par(mar = c(0, 0, 0, 0), mfrow = c(1, 3), bg = NA)

plot(checkerboard8x8 <- checkerboard(8),
            col=c('white','white','white','black')[checkerboard8x8$z])

# found by a greedy randomized search
index.greedy <- c(8, 56, 18, 5, 13, 57, 3, 37, 62, 58, 7, 16, 40, 59, 17, 34,
                  29, 41, 46, 27, 54, 43, 2, 21, 38, 52, 31, 20, 28, 48, 1, 22,
                  55, 11, 25, 19, 50, 10, 24, 53, 47, 30, 45, 44, 32, 35, 51,
                  15, 64, 12, 14, 39, 26, 6, 42, 33, 4, 36, 63, 49, 60, 61, 9,
                  23)

plot(Cartogram.checkerboard8x8.greedy <- recmap(checkerboard8x8[index.greedy,]),
            col = c('white','white','white','black')[Cartogram.checkerboard8x8.greedy$z])

# found by a genetic algorithm
index.ga <- c(52, 10, 27, 63, 7, 20, 32, 18, 47, 28, 6, 55, 11, 61, 38, 50, 5,
              21, 36, 34, 2, 22, 3, 1, 29, 57, 43, 4, 51, 58, 31, 49, 44, 25,
              59, 33, 17, 40, 8, 41, 26, 37, 19, 56, 45, 35, 62, 53, 24, 64,
              30, 15, 39, 12, 60, 48, 16, 23, 46, 42, 13, 54, 14, 9)

plot(Cartogram.checkerboard8x8.ga <- recmap(checkerboard8x8[index.ga,]),
            col = c('white','white','white','black')[Cartogram.checkerboard8x8.ga$z])

```

# History 

The work on RecMap was initiated by understanding the limits of contiguous 
cartogram drawing [@cartodraw] and after studying the visualizations
drawn by Erwin Raisz [@ErwinRaisz].
The purpose of the first implementation [@recmap] was a feasibility
check on how computer-generated rectangular cartograms with zero area error
could look like.
The `recmap` R package on CRAN provides a rectangular cartogram algorithm to be used by any
R user. Now, it is easy to generate input (e.g., no complex polygon mesh),
the code is maintainable (less than 500 lines of `C++-11` code), and the algorithm is made robust
to the price of not having all features implemented (simplified local placement; 
no *empty space error*; no MP1 variant).
Recent research publications on rectangular cartogram drawing include
[@Speckmann2004], [@Speckmann2007], [@Speckmann2012], [@Buchin:2016].
However, according to a recent publication [@TheStateoftheArtInCartograms],
`recmap` remains the only rectangular cartogram algorithm that 'maintains zero 
cartographic error'. 
The interested reader can find more details on the package usage and its implementation
in [@2016arXiv160600464P].

# Session Info

```{r}
sessionInfo()
```

# References

