Topics Map > Technical Support > Remote Access > Community Servers

Efficient way to script R package installs

An efficient way to install packages for R.

Taken from

When working with R scripts on Community Servers, a package may not be installed on one or more servers in your local profile.   In order to ensure this, you would typically have install.packages("packagename") in your R script. However, this can be rather inefficient and slow when opens up the script continually on your local machine for editing. An example below of a script installing and loading several packages.

Inefficient way to install and load R packages

# Installation of required packages

# Load packages

More efficient way

# Package names
packages <- c("ggplot2", "readxl", "dplyr", "tidyr", "ggfortify", "DT", "reshape2", "knitr", "lubridate", "pwr", "psy", "car", "doBy", "imputeMissings", "RcmdrMisc", "questionr", "vcd", "multcomp", "KappaGUI", "rcompanion", "FactoMineR", "factoextra", "corrplot", "ltm", "goeveg", "corrplot", "FSA", "MASS", "scales", "nlme", "psych", "ordinal", "lmtest", "ggpubr", "dslabs", "stringr", "assist", "ggstatsplot", "forcats", "styler", "remedy", "snakecaser", "addinslist", "esquisse", "here", "summarytools", "magrittr", "tidyverse", "funModeling", "pander", "cluster", "abind") 

# Install packages not yet installed
installed_packages <- packages %in% rownames(installed.packages())
if (any(installed_packages == FALSE)) {  install.packages(packages[!installed_packages]) }

# Packages loading
invisible(lapply(packages, library, character.only = TRUE))

This code for installing and loading R packages is more efficient in several ways:

  1. The function install.packages() accepts a vector as argument, so one line of code for each package in the past is now one line including all packages
  2. In the second part of the code, it checks whether a package is already installed or not, and then install only the missing ones
  3. Regarding the packages loading (the last part of the code), the lapply() function is used to call the library() function on all packages at once, which makes the code more condense.
  4. The output when loading a package is rarely useful. The invisible() function removes this output.

Most efficient way

{pacman} package

The function p_load() from {pacman} checks to see if a package is installed, if not it attempts to install the package and then loads it. It can also be applied to several packages at once, all this in a very condensed way:

pacman::p_load(ggplot2, tidyr, dplyr)

Find more about this package on CRAN.

{librarian} package

Like {pacman}, the shelf() function from the {librarian} package automatically installs, updates, and loads R packages that are not yet installed in a single function. The function accepts packages from CRAN, GitHub, and Bioconductor (only if Bioconductor’s Biobase package is installed). The function also accepts multiple package entries, provided as a comma-separated list of unquoted names (so no "" around package names).

Last but not least, the {librarian} package allows to load packages automatically at the start of every R session (thanks to the lib_startup() function) and search for new packages on CRAN by keywords or regular expressions (thanks to the browse_cran() function).

Here is an example of how to install missing packages and load them with the shelf() function:

# From CRAN:
librarian::shelf(ggplot2, DesiQuintans / desiderata, pander)

For CRAN packages, provide the package name as normal without "" and for GitHub packages, provide the username and package name separated by / (i.e., UserName/RepoName as shown for the desiderata package).

Keywords:R script efficient package install lincomm   Doc ID:112694
Owner:Eric D.Group:Agricultural & Applied Economics
Created:2021-07-28 07:56 CDTUpdated:2021-07-28 08:16 CDT
Sites:Agricultural & Applied Economics
Feedback:  0   0