R is a language and environment for statistical computing and graphics. It is similar to the S language and environment developed at Bell Laboratories (formerly AT&T, now Lucent Technologies). R provides a wide variety of statistical and graphical techniques and is highly extensible.
# Availability and Restrictions
## Versions
The following versions of R are available on OSC systems:
| Version | Owens | Pitzer |
| --- | --- | --- |
| 3.3.2 | X | |
| 3.4.0 | X | |
| 3.4.2 | X | |
| 3.5.0# | X* | |
| 3.5.1 | X | |
| 3.5.2 | X* | |
| 3.6.0 | X** | X** |
You can use `module spider R` to view available modules for a given machine. Feel free to contact OSC Help if you need other versions for your work.
## Access
R is available to all OSC users. If you have any questions, please contact OSC Help.
## Publisher/Vendor/Repository and License Type
R Foundation, Open source
# Usage
R can be launched in two different ways: through RStudio on OSC OnDemand, or through the terminal. To access RStudio, please follow this tutorial. For terminal access, the details are given below.
## Set-up
To configure your environment for R, run the following command:
module load R
## Using R
Once your environment is configured, R can be started simply by entering the following command:
R
For a listing of command line options, run:
R --help
## Batch Usage
Running R interactively on a login node for extended computations is not recommended and may violate OSC usage policy. To run R in batch mode, reference the example batch script below. This script requests one full node on the Owens cluster for 1 hour of walltime.

```bash
#PBS -N R_ExampleJob
#PBS -l nodes=1:ppn=28
#PBS -l walltime=01:00:00

module load R
cd $PBS_O_WORKDIR
cp in.dat test.R $TMPDIR
cd $TMPDIR

R CMD BATCH test.R test.Rout

cp test.Rout $PBS_O_WORKDIR
```
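The contents of test.R are not shown here; a minimal sketch of what such a script might contain (hypothetical, assuming in.dat is a whitespace-delimited table with a header row):

```r
# test.R -- hypothetical analysis script for the batch example above.
# With R CMD BATCH, everything printed here is captured in test.Rout.
dat <- read.table("in.dat", header = TRUE)  # the input file copied to $TMPDIR
summary(dat)                                # summary statistics per column
```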
# HOWTO: Install Local R Packages
Please see our HOWTO guide on installing Local R packages.
# R Parallel Cluster Job Submission
In this demonstration, we will submit a parallel job to the cluster using R. Most parallelization concepts in R center on loop-level parallelism with independent iterations, where each iteration acts as a separate simulation. There are a variety of strategies based on two concepts (contrasted in the sketch after this list):
- foreach - enumerating through the contents of a collection
- apply - application of a function to a collection
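As a minimal serial sketch of the two idioms (using `%do%` from the foreach package so it runs without any cluster set-up; the parallel variants later swap in `%dopar%` or `mclapply`):

```r
library(foreach)

# foreach: enumerate through the collection, combining results with c()
squares_foreach <- foreach(i = 1:4, .combine = c) %do% i^2

# apply: apply a function to the collection
squares_apply <- sapply(1:4, function(i) i^2)

identical(squares_foreach, squares_apply)  # TRUE -- same work, two idioms
```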
This is a toy example: the function generates values sampled from a normal distribution and sums the resulting vector, so every call to the function is a separate, independent simulation. But sometimes simple is the best way to demonstrate using OSC resources and to compare the variety of approaches afforded by the packages that provide foreach, mclapply, and clusterApply. For sufficient parallelism, each call generates 50M elements (the function's default size) and we run the function 100 times.
For reference, this example was heavily adapted from TACC. We combine several of the strategies in a single R script.
Background: MPI bindings are provided by the Rmpi package, compiled against OpenMPI. On the OSC Owens cluster, this is selected for you automatically if you use the system packages provided with R/3.3.1+.
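As a quick sanity check that the Rmpi bindings load against the system OpenMPI (a minimal sketch; the file name check.R is our own, and it should be launched the same way as the batch examples below, e.g. `mpirun -np 1 R --slave < check.R`):

```r
# check.R -- confirm Rmpi loads and report the MPI universe size,
# i.e. the number of slots available for spawning slaves.
library(Rmpi)
cat("MPI universe size:", mpi.universe.size(), "\n")
mpi.quit()
```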
The code is presented below in parallel_testing.R (roughly 100 lines).
parallel_testing.R
```r
myProc <- function(size=50000000) {
  # Load a large vector
  vec <- rnorm(size)
  # Now sum the vec values
  return(sum(vec))
}

detachDoParallel <- function() {
  detach("package:doParallel")
  detach("package:foreach")
  detach("package:parallel")
  detach("package:iterators")
}

max_loop <- 100

# version 1: use mclapply (multicore) - warning - generates zombie processes
library(parallel)

tick <- proc.time()
result <- mclapply(1:max_loop, function(i) myProc(), mc.cores=detectCores())
tock <- proc.time() - tick

cat("\nmclapply/multicore test times using", detectCores(), "cores: \n")
tock

# version 2: use foreach with explicit MPI cluster on one node
library(doParallel, quiet = TRUE)
library(Rmpi)
slaves <- detectCores() - 1
# number of MPI tasks to use
{ sink("/dev/null"); cl_onenode <- makeCluster(slaves, type="MPI"); sink(); }
registerDoParallel(cl_onenode)

tick <- proc.time()
result <- foreach(i=1:max_loop, .combine=c) %dopar% {
  myProc()
}
tock <- proc.time() - tick

cat("\nforeach w/ Rmpi test times using", slaves, "MPI slaves: \n")
tock

invisible(stopCluster(cl_onenode))
detachDoParallel()

# version 3: use foreach (multicore)
library(doParallel, quiet = TRUE)

cores <- detectCores()
cl <- makeCluster(cores)
registerDoParallel(cl)

tick <- proc.time()
result <- foreach(i=1:max_loop, .combine=c) %dopar% {
  myProc()
}
tock <- proc.time() - tick

cat("\nforeach w/ fork times using", cores, "cores: \n")
tock

invisible(stopCluster(cl))
detachDoParallel()

## version 4: use foreach (doSNOW/Rmpi)
library(doParallel, quiet = TRUE)
library(Rmpi)

slaves <- as.numeric(Sys.getenv(c("PBS_NP"))) - 1
# number of MPI tasks to use
{ sink("/dev/null"); cl <- makeCluster(slaves, type="MPI"); sink(); }
registerDoParallel(cl)

tick <- proc.time()
result <- foreach(i=1:max_loop, .combine=c) %dopar% {
  myProc()
}
tock <- proc.time() - tick

cat("\nforeach w/ Rmpi test times using", slaves, "MPI slaves: \n")
tock

detachDoParallel()
# no need to stop the cluster: we will use it again

## version 5: use snow backed by Rmpi (cluster already created)
library(Rmpi)  # for mpi.*
library(snow)  # for clusterExport, clusterApply

#slaves <- as.numeric(Sys.getenv(c("PBS_NP"))) - 1
clusterExport(cl, list('myProc'))

tick <- proc.time()
result <- clusterApply(cl, 1:max_loop, function(i) myProc())
tock <- proc.time() - tick

cat("\nsnow w/ Rmpi test times using", slaves, "MPI slaves: \n")
tock

print(result)
invisible(stopCluster(cl))
mpi.quit()
```
The job submission script is below:
owens_job.sh
```bash
#!/bin/bash
#PBS -l walltime=10:00
#PBS -l nodes=2:ppn=28
#PBS -j oe

cd $PBS_O_WORKDIR

ml gnu/7.3.0
ml mkl/2018.0.3
ml openmpi/1.10.7
ml R/3.6.0

# parallel R: submit job with one MPI master
mpirun -np 1 R --slave < Rmpi.R
```
The Rmpi.R script is below.
Rmpi.R
```r
myProc <- function(size=500000000) {
  # Load a large vector
  vec <- rnorm(size)
  # Now sum the vec values
  return(sum(vec))
}

max_loop <- 10

## version 5: use snow backed by Rmpi
library(Rmpi)  # for mpi.*
library(snow)  # for clusterExport, clusterApply

slaves <- as.numeric(Sys.getenv(c("PBS_NP"))) - 1
cl <- makeCluster(slaves, type="MPI")  # MPI tasks to use
clusterExport(cl, list('myProc'))

tick <- proc.time()
result <- clusterApply(cl, 1:max_loop, function(i) myProc())
write.table(result, file = "foo.csv", sep = ",")
tock <- proc.time() - tick

cat("\nsnow w/ Rmpi test times using", slaves, "MPI slaves: \n")
tock

stopCluster(cl)
mpi.quit()
```
Then you can submit to the cluster via `qsub owens_job.sh`.
If everything is good, you should see output like the following (with `nodes=2:ppn=28`, `PBS_NP` is 56, so 55 MPI slaves are used):
```
snow w/ Rmpi test times using 55 MPI slaves:
   user  system elapsed
  7.494   8.052  15.581
```
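The per-iteration sums are also written to foo.csv in the submission directory. A quick sketch for inspecting them afterwards (an assumption about the on-disk layout: write.table stores the result list as one wide row, so we flatten it):

```r
# Read back the results written by Rmpi.R and summarize them.
res <- read.csv("foo.csv")   # one column per iteration
summary(unlist(res))         # each value is one simulation's sum
```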
This example shows that the workload can be parallelized effectively with MPI on one and two nodes. In a later installment, we will look at a more realistic example using more than a single data structure.
# Takeaways
When using more than one node, use Rmpi or snow backed by Rmpi (as in version 4 or version 5). Also keep in mind that in independent-simulation scenarios, the loop structure means the work aligns with logical processors in chunks governed by the value of `max_loop`, which is set to 100 here. Because we are fixing the problem size (100 iterations of the 50M-element simulation), the benefits of parallelism are chunked by the granularity of the iterations.

Typically, one sets `max_loop` to the number of MPI slaves, or chooses a slave count that is a factor of `max_loop`, because each MPI slave will then have its own iterations to perform. However, this all depends on whether you have sufficient work in a single iteration. With `max_loop` set to 100, we are interested in strong-scaling effects for a fixed problem size. For this example, the chart below, using 4 nodes of 28 cores (112 cores total available), makes it clear that you should set `max_loop` equal to the number of MPI slaves.
[Chart: parallel_testing_version5_only.R run times versus MPI slave count on 4 nodes of 28 cores]
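As a back-of-the-envelope illustration of why: with a fixed `max_loop`, each slave performs roughly `ceiling(max_loop / slaves)` rounds of work, so any slave count that does not divide `max_loop` leaves slaves idle in the final round (a simple model of the scheduling, not a measurement):

```r
# How iteration count versus worker count chunks the work.
max_loop <- 100
for (slaves in c(25, 50, 55, 100)) {
  rounds    <- ceiling(max_loop / slaves)          # passes over the workers
  busy_last <- max_loop - (rounds - 1) * slaves    # workers active in the last pass
  cat(sprintf("slaves=%3d: %d round(s), %d of %d busy in the final round\n",
              slaves, rounds, busy_last, slaves))
}
```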
# Further Reading
- The R home page
- Data Analysis with R (Udacity course)