%%\VignetteEngine{knitr::knitr}
%%\VignetteIndexEntry{Using proftools}
\documentclass[nojss]{jss}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% declarations for jss.cls %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

%% almost as usual
\author{Luke Tierney\\University of Iowa \And 
        Riad Jarjour\\University of Iowa}
\title{Examining \proglang{R} Profiling Data: The \pkg{proftools} Package}

%% for pretty printing and a nice hypersummary also set:
\Plainauthor{Luke Tierney, Riad Jarjour} %% comma-separated
\Plaintitle{Examining R Profiling Data:
  The proftools Package} %% without formatting
%$ \Shorttitle{\pkg{foo}: A Capitalized Title} %% a short title (if necessary)

%% an abstract and keywords
\Abstract{
  This note introduces the \pkg{proftools} package for
  examining data collected by \proglang{R}'s sampling profiler.
  \pkg{proftools} includes facilities for summarizing results at the
  function, call, and source line level; for filtering to narrow the
  focus to functions of primary interest; and for visualizing
  profiling data. Use of the package is illustrated with a small
  running example.
}
\Keywords{\pkg{proftools}, profiling, \proglang{R}, \code{Rprof}}
\Plainkeywords{proftools, profiling, R, Rprof} %% without formatting
%% at least one keyword must be supplied

%% publication information
%% NOTE: Typically, this can be left commented and will be filled out by the technical editor
%% \Volume{50}
%% \Issue{9}
%% \Month{June}
%% \Year{2012}
%% \Submitdate{2012-06-04}
%% \Acceptdate{2012-06-04}

%% The address of (at least) one author should be given
%% in the following format:
\Address{
  Luke Tierney\\
  Department of Statistics and Actuarial Science\\
  Faculty of Statistics\\
  University of Iowa\\
  Iowa City, USA\\
  E-mail: \email{luke-tierney@uiowa.edu}\\
  URL: \url{http://homepage.stat.uiowa.edu/~luke/}
  
  Riad Jarjour\\
  Department of Statistics and Actuarial Science\\
  Faculty of Statistics\\
  University of Iowa\\
  Iowa City, USA\\
  E-mail: \email{riad-jarjour@uiowa.edu}
}
%% It is also possible to add a telephone and fax number
%% before the e-mail in the following format:
%% Telephone: +43/512/507-7103
%% Fax: +43/512/507-2851

%% for those who use Sweave please include the following line (with % symbols):
%% need no \usepackage{Sweave.sty}

%% end of declarations %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%


\begin{document}

%% include your article here, just as usual
%% Note that you should use the \pkg{}, \proglang{} and <<>>=} commands.

<<echo=FALSE,results='hide'>>=
library("proftools")
options(keep.source = TRUE)
knitr::opts_chunk$set(size = "small", fig.align = "center")
@ %%

\section{Introduction}
Profiling is a program analysis method for determining where a program
run spends most of its execution time, and can be very helpful in
guiding programmer effort for improving program performance.
\proglang{R} includes a sampling based profiling mechanism that
records information about calls on the stack at specified time
intervals.  If available, information about the specific source code
lines active at the sampling point is recorded as well. Information
about time spent in the garbage collector can also be collected. The
collected profiling data is written to a file, by default the file
\code{Rprof.out} in the current working directory.  The function
\code{summaryRprof}, available in the \pkg{base} package in
\proglang{R}, provides a simple interface for examining this data.
The \pkg{profvis} package \citep{Chang:profvis} provides a
visualization and another representation of the data. The
\pkg{proftools} package provides a much more extensive set of tools
for summarizing, visualizing, and filtering this data.

\section{Collecting Profile Data}
The \pkg{proftools} package includes an example script \code{bootlm.R}
in the \code{samples} directory that runs several examples from the
\pkg{boot} package \citep{CantyRipley:Boot} and then fits a simple
linear model. The file path can be obtained as
<<>>=
srcfile <- system.file("samples", "bootlmEx.R", package = "proftools")
@  %%

The traditional way to collect profiling data in R is to call
\code{Rprof} to start profiling, run the code to be profiled, and then
call \code{Rprof} again to end profiling. For example, to profile the
code in \code{bootlm.R}, and collect both source line and GC
information, you could use
<<eval = FALSE>>=
profout <- tempfile()
Rprof(file = profout, gc.profiling = TRUE, line.profiling = TRUE)
source(srcfile)
Rprof(NULL)
pd <- readProfileData(profout)
unlink(profout)
@ %%
The \pkg{proftools} package provides the simpler alternative
<<cache = TRUE>>=
pd <- profileExpr(source(srcfile))
@ %%
By default \code{profileExpr} enables GC and source information to be
collected. It also trims off stack information leading up to the
\code{profileExpr} call.

\textbf{RStudio} \citep{RStudio} provides a \textbf{Profile} menu for
collecting profile data. The data are stored in a cache directory. The
function \code{readRStudioProfileCacheData} returns the data for the
most recent profile run in the cache, or \code{NULL} if none is
available.


\section{Summary Functions}
The most basic summary function is \code{funSummary} for summarizing
profile results at the function level. It produces information similar
to the result returned by R's \code{summaryRprof} but in a more usable
form:
<<>>=
head(funSummary(pd), 10)
@ %%
The result returned by \code{funSummary}, and most summary functions,
is a data frame, so \code{head} can be used to focus on the top
entries. 

By default, when source information is available results are
summarized at the call level, so multiple calls to \code{boot} from
different source lines are shown separately. This can be suppressed by
supplying the \code{srclines = FALSE} argument:
<<>>=
head(funSummary(pd, srclines = FALSE), 10)
@ %%

Data can also be summarized by call:
<<>>=
head(callSummary(pd), 10)
@ %%

When source information is available in the profile data the
\code{srcSummary} function can be used to summarize at the source line
level; only lines appearing in the sample are included:
<<>>=
srcSummary(pd)
@ %%
The function \code{annotateSource} can show the full files with
profiling annotations.

A useful way to examine profile data is to look for hot execution
paths.  This approach sorts functions called at top level, i.e. at the
bottom of the call stack, by the total amount of time spent in their
top level calls; within each top level call to a function \code{f} the
functions called by \code{f} are sorted by the amount of time spent in
them within the top level call to \code{f}; and the process continues
for higher level calls. The function \code{hotPaths} produces a hot
path summary; the \code{total.pct} argument causes leaf functions in
stack traces to be pruned back until the execution time percentage for
each stack trace is at least \code{total.pct}:
<<>>=
hotPaths(pd, total.pct = 10.0)
@ %%
Examining the result of \code{hotPaths} starting with high values of
\code{total.pct} and then moving to lower values is a useful way to
explore where the computational effort is concentrated.  An alternative
for limiting the depth to which stack traces are followed is provided
by the \code{maxdepth} argument.
%% Other summary: \code{pathSummary}

\section{Filtering Profile Data}
The hot path summary shows information associated with the source
command itself that is not directly relevant to our analysis. The
\code{filterProfileData} function can be used to select or omit
certain functions, drop functions with small self or total times,
narrow to a particular time interval, among others.  For example, by
selecting only stack traces that include calls to \code{withVisible}
and then trimming off the leading four calls we can focus just on the
work done in the sourced file:
<<>>=
filteredPD <- filterProfileData(pd, select = "withVisible", skip = 4)
@ %%
The hot path summary for this reduced profile is
<<>>=
hotPaths(filteredPD, total.pct = 10.0) 
@ %%

We can use the \code{focus} filter to further narrow our examination
to stack frames containing calls to \code{glm} and also remove all
calls preceding the first \code{glm} call from the selected stack
frames. For this reduced data it also makes sense to follow the paths
further by lowering \code{total.pct} to 5\%:
<<>>=
glmPD <- filterProfileData(filteredPD, focus = "glm")
hotPaths(glmPD, total.pct = 5.0)
@ %%


\section{Visualizing Profile Data}
\href{https://en.wikipedia.org/wiki/Call_graph}{Call graphs} annotated
with profile information are a very popular way to view profiling
results.  \code{plotProfileCallGraph} uses the \pkg{graph}
\citep{Gentleman:graph} and \pkg{Rgraphviz} \citep{Gentry:Rgraphviz}
packages from Bioconductor to render an annotated call graph. The
default style for the graph is based on the style used in
\href{https://github.com/gperftools}{Google's profiling library} but
can be customized in a number of ways.  A call graph for the full
profile data is produced by
<<fullCallGraph, fig.cap = "Full call graph of profile data.">>=
plotProfileCallGraph(pd)
@ %%
and is shown in Figure \ref{fig:fullCallGraph}.

By default the call graph size is limited to at most 30 nodes; nodes
with lower total hit counts are dropped. The \code{maxnodes} argument
can be used to adjust this limit. Also by default color is used to
encode the total hit score for the function nodes. This can be
suppressed with the \code{score = "none"} argument.

We can obtain a more readable graph by filtering. For example, to
examine the \code{glm.fit} call and its callees we can use
<<filteredCallGraph, fig.cap = "Call graph for \\code{glm.fit} call.">>=
plotProfileCallGraph(filterProfileData(pd, focus = "glm.fit"))
@ %%
The result is shown in Figure \ref{fig:filteredCallGraph}.

A printed version of the call graph, similar to the call graph
produced by \code{gprof} \citep{Graham:1982:GCG:800230.806987}, can be
obtained with \code{printProfileCallGraph}.  For the subgraph of the
\code{glm.fit} calls, for example,
<<printProfileCallGraph, eval=FALSE>>=
printProfileCallGraph(filterProfileData(pd, focus = "glm.fit"))
@ %%
produces the printed representation shown in Figure \ref{fig:printGraph}.
\begin{figure}
  \tiny
<<echo = FALSE, comment = NA>>=
<<printProfileCallGraph>>
@ %%
\caption{Printed call graph.}
\label{fig:printGraph}
\end{figure}

Another visualization sometimes used is a
\href{http://www.brendangregg.com/flamegraphs.html}{flame graph}. A flame graph for the original data is produced by
<<flameGraph, out.width = "4in", fig.cap = "Flame graph visualizing hot paths for the full profile data.">>=
flameGraph(pd)
@ %%
The result is shown in Figure \ref{fig:flameGraph}.  A flame graph of
the filtered data is produced by
<<filteredFlameGraph, out.width = "4in", fig.cap = "Flame graph of the filtered profile data.">>=
flameGraph(filteredPD)
@ %%
and shown in Figure \ref{fig:filteredFlameGraph}
%% Furthermore, an svg file of the graph can be outputted by using the
%% svg argument:
%% <<eval=FALSE>>=
%% flameGraph(pd, svg = "graph.svg", order = "hot")
%% @ 

The vertical positions of rectangles in these flame graphs represent
call depth on the stack. The widths of the rectangles represent the
amount of time spent in a call at a particular call or set of calls at
a particular depth. The \code{order} argument determines the ordering
of call rectangles at a particular level within a call at the lower
level. The default order is \code{"hot"}; it uses the hot path
ordering with the call with the largest amount of time first.  This
produces a visual representation of the hot path summary.  The
\code{"alpha"} ordering orders the calls alphabetically.
Specifying \code{order = "time"} shows the calls in the order in which
they occurred:
<<timeGraph, out.width = "4in", fig.cap = "Time graph of the full profile data.">>=
flameGraph(pd, order = "time")
@ %%
Figure \ref{fig:timeGraph} shows the \code{boot} calls preceding the
data generation for the \code{lm} call and the \code{lm} call itself.

Default colors for flame graphs are based on the \code{rainbow}
palette. Alternative colors can be specified by providing a
\code{colormap} function. If space permits, the call labels are
printed within the rectangles representing the calls. A method for the
\code{identify} function is available that can be used to identify the
individual calls when they are not visible. An option to this method
is to request that all instances of the identified calls be outlined on
the plot. The \code{identify} method uses the value returned by
\code{flameGraph}; thus it would be used as
<<eval = FALSE>>=
fg <- flameGraph(pd)
identify(fg)
@ 

A third visualization that is sometimes used is a callee tree map.
This is produced  by
<<calleeTreeMap, out.width = "4in", fig.cap = "Call tree map of the full profile data.">>=
calleeTreeMap(pd)
@ %%
and shown in Figure \ref{fig:calleeTreeMap}.  A callee tree map shows
a tree map \citep{ShneidermanTreemaps} of the calls in the call
tree. The tiling algorithm used depends on the \code{squarify}
argument.  If \code{squarify} is \code{TRUE} then the
\emph{squarified} algorithm \citep{BrulsVanWijk:sqarified} is used;
otherwise, the longer side is partitioned. Again a method for
\code{identify} is provided for \code{calleeTreeMap} objects. Clicking
on rectangles in the plot returns a list of the call stacks for the
identified rectangles.


\section{Graphical User Interfaces}
The function \code{writeCallgrindFile} can be used to write the
profile data in \href{valgrind.org}{Valgrind's} \code{callgrind}
format for use with the
\href{http://kcachegrind.sourceforge.net/html/Home.html}{\code{kcachegrind}}
or \code{qcachegrind} graphical user interfaces available on Linux and
Mac OS X. Figure \ref{kcachegrindFig} shows the \code{kcachegrind}
interface for the example profile data written out using
\code{writeCallgrindFile}.
\begin{figure}
  \centering
  \includegraphics[width=\maxwidth]{kcachegrind}
  \caption{\code{kcachegrind} interface for examining profile data for
    the example.}
\label{kcachegrindFig}
\end{figure}
By default \code{writeCallgrindFile} assumes the common case where, as
here, the profiled code is in a file run using \code{source} and
source code information is retained. If this is the case,
\code{writeCallgrindFile} removes calls associated with the
\code{source} call and adds a \code{<TOP>} entry; this persuades
\code{kcachegrind} to show the top level line information.

Graphical user interfaces within R will be made available in the
\href{https://github.com/ltierney/Rpkg-proftools-GUI}{\pkg{proftoolsGUI}}
package. The current development version provides two interfaces, one
based on \pkg{gWidgets2} \citep{Verzani:gWidgets2} and one on
\pkg{shiny} \citep{Chang:shiny}. Figure \ref{shinyGUIFig} shows the shiny
interface at the time of writing.
\begin{figure}
  \centering
  \includegraphics[width=\maxwidth]{shinyGUI}
  \caption{\code{proftoolsGUI} \code{shiny} interface for examining profile
           data for the example.}
\label{shinyGUIFig}
\end{figure}
\bibliography{proftools}
\end{document}
