Special considerations for writing functions in R packages
Brainstorm ideas for your own package
Building a “dummy” package using R Studio and devtools
Set up and start building your own package
What is an R package?
“Packages are the fundamental units of reproducible R code. They include reusable R functions, the documentation that describes how to use them, and sample data.” -Hadley Wickham & Jenny Bryan in R Packages
R packages are…
Portable
Everything in an R package directory (functions, documentation, data, etc.) is “built” into a “tarball” (packageName.tar.gz) that is easy to download, install and load
Open source
If you install a package, you can see all the function code
Inside the geom_point() function in ggplot2
library(ggplot2)
geom_point
function (mapping = NULL, data = NULL, stat = "identity", position = "identity",
    ..., na.rm = FALSE, show.legend = NA, inherit.aes = TRUE)
{
    layer(data = data, mapping = mapping, stat = stat, geom = GeomPoint,
        position = position, show.legend = show.legend, inherit.aes = inherit.aes,
        params = list2(na.rm = na.rm, ...))
}
<bytecode: 0x1176f60f0>
<environment: namespace:ggplot2>
Why should I make an R package?
To share R functions (and/or data) with others (for a general audience!)
via the Comprehensive R Archive Network (CRAN), Bioconductor, or even GitHub
To share R functions (and/or data) with others (in your lab, company, etc.)
via GitHub or another file-sharing tool
To store functions (and/or data) for yourself!
R packages can be huge and complicated, or can have only one or two functions! It’s entirely up to you, and what you think will be useful for your intended audience.
Note: The more public-facing your R package, the more complex the documentation should be, and the more “generalized” the functions should be.
Required Elements of an R Package
Package
R/: contains R code files that contain function(s)
man/: contains documentation files for each function
DESCRIPTION: A file containing key package metadata
NAMESPACE: A file that determines which other packages your package relies on, which functions your package exports, etc.
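As a rough illustration of the four required elements, the sketch below creates them by hand with base R only. The package name dummyPkg, the function sayHello() and all field values are hypothetical, chosen for this example. In practice the man/ files and the NAMESPACE are usually generated by tools such as roxygen2 rather than written manually.

```r
# Hypothetical package name and location (assumptions, not from the workshop)
pkg_dir <- file.path(tempdir(), "dummyPkg")
dir.create(file.path(pkg_dir, "R"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(pkg_dir, "man"), showWarnings = FALSE)

# R/: one code file holding one (hypothetical) function
writeLines(
  c("sayHello <- function(name = \"world\") {",
    "  paste0(\"Hello, \", name, \"!\")",
    "}"),
  file.path(pkg_dir, "R", "sayHello.R")
)

# DESCRIPTION: key package metadata, written in "Field: value" form
write.dcf(
  data.frame(
    Package = "dummyPkg",
    Title = "A Dummy Package for Practice",
    Version = "0.0.1",
    Description = "Holds one example function.",
    License = "GPL-3",
    stringsAsFactors = FALSE
  ),
  file = file.path(pkg_dir, "DESCRIPTION")
)

# NAMESPACE: export sayHello() so it is available after library(dummyPkg)
writeLines("export(sayHello)", file.path(pkg_dir, "NAMESPACE"))

# man/ stays empty here; documentation files would normally be generated
list.files(pkg_dir, recursive = TRUE)
```

Reading the file back with read.dcf(file.path(pkg_dir, "DESCRIPTION")) is a quick way to confirm the metadata was written as intended.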
Optional additional elements
data/: Contains example dataset(s) as .rda files
inst/: Can contain multiple objects, including a CITATION file
tests/: Contains files that perform automated tests of each function
vignettes/: Contains package vignette(s) as .Rmd files
LICENSE: A file that explains the license you want to use for your package (e.g. Creative Commons, MIT, etc.)
README.Rmd: A file that explains your package, displays on GitHub repo homepage and on CRAN
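To make the data/ entry concrete, here is a minimal sketch of storing an example dataset as an .rda file using base R's save(). The folder dummyPkg and the dataset name exampleCars are hypothetical. Helper functions in other packages can automate this step, but the plain version shows what ends up on disk.

```r
# Hypothetical package folder and dataset name (assumptions for illustration)
pkg_dir <- file.path(tempdir(), "dummyPkg")
dir.create(file.path(pkg_dir, "data"), recursive = TRUE, showWarnings = FALSE)

# A small example dataset: the first five rows of the built-in mtcars data
exampleCars <- head(mtcars, 5)

# Save it as data/exampleCars.rda; the object name is stored inside the file
rda_path <- file.path(pkg_dir, "data", "exampleCars.rda")
save(exampleCars, file = rda_path)

# Loading into a fresh environment restores the object under its original name
check_env <- new.env()
load(rda_path, envir = check_env)
ls(check_env)
```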
getRow <- function(dat, rowInd = 1) {
  # check that the 'dat' argument is a data frame
  if (is.data.frame(dat) == FALSE) {
    stop("Wrong input")
  }
  # check that the 'rowInd' argument is numeric
  stopifnot(is.numeric(rowInd))
  newRow <- dat[rowInd, ]
  return(newRow)
}
Try to make the ‘dat’ argument a vector
getRow(dat =c(1,2,3,4))
Error in getRow(dat = c(1, 2, 3, 4)): Wrong input
Try to make the ‘rowInd’ argument a character
getRow(dat = mtcars, rowInd ="3")
Error in getRow(dat = mtcars, rowInd = "3"): is.numeric(rowInd) is not TRUE
getRow <- function(dat, rowInd = 1) {
  # check that the 'dat' argument is a data frame
  if (is.data.frame(dat) == FALSE) {
    stop("The 'dat' argument must be a data frame")
  }
  # check that the 'rowInd' argument is numeric
  stopifnot("The 'rowInd' argument must be numeric" = is.numeric(rowInd))
  newRow <- dat[rowInd, ]
  return(newRow)
}
getRow(dat = c(1, 2, 3, 4))
Error in getRow(dat = c(1, 2, 3, 4)): The 'dat' argument must be a data frame
getRow(dat = mtcars, rowInd ="3")
Error in getRow(dat = mtcars, rowInd = "3"): The 'rowInd' argument must be numeric
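The informative error messages above can also be checked programmatically, which is the kind of thing files in tests/ do. The following is only a sketch using base R's tryCatch(); the helper getErrorMessage() is hypothetical and not part of the workshop code, and getRow() is repeated so the example runs on its own. Naming the condition inside stopifnot() requires R 4.0.0 or later.

```r
# Improved getRow() from above, repeated so this sketch is self-contained
getRow <- function(dat, rowInd = 1) {
  if (!is.data.frame(dat)) {
    stop("The 'dat' argument must be a data frame")
  }
  stopifnot("The 'rowInd' argument must be numeric" = is.numeric(rowInd))
  dat[rowInd, ]
}

# Hypothetical helper: evaluate an expression and return its error message,
# or NA if the expression ran without an error
getErrorMessage <- function(expr) {
  tryCatch({
    expr
    NA_character_
  }, error = function(e) conditionMessage(e))
}

# Bad 'dat', bad 'rowInd', and a valid call
msgDat <- getErrorMessage(getRow(dat = c(1, 2, 3, 4)))
msgInd <- getErrorMessage(getRow(dat = mtcars, rowInd = "3"))
msgOk <- getErrorMessage(getRow(dat = mtcars, rowInd = 3))

# Each check fails loudly if a message ever changes unexpectedly
stopifnot(msgDat == "The 'dat' argument must be a data frame")
stopifnot(msgInd == "The 'rowInd' argument must be numeric")
stopifnot(is.na(msgOk))
```

Checking the message text, not just the fact that an error occurred, keeps the user-facing wording from drifting unnoticed as the function evolves.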