Hamilton supports using the R programming language. By default, a fairly old version of R is available via the R command on the login nodes, provided by the underlying operating system. We do not advise its use.
A number of versions of R are made available through the module command. module avail r and module avail rstudio will show what is available. We currently recommend that you load the r/3.6.3 module.
If you need to use RStudio (the standard R interactive development environment) and are not connecting from the University campus, we recommend that you connect to Hamilton using X2GO (see the Login page for details) to improve performance and stability and then load the rstudio/1.1 module (which provides a copy of R 3.6.0).
Each user can install R packages that are accessible only to them. Because people need different packages, and different versions of those packages, this gives you the flexibility to install software without waiting for the Hamilton administrators to install it for you.
For example, to install the Matrix package (some of the output is omitted for brevity; files are saved under a folder called R in your home directory):
[aabb22@hamilton1 ~]$ module load r/3.6.3
[aabb22@hamilton1 ~]$ R

R version 3.6.3 (2020-02-29) -- "Holding the Windsock"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

> install.packages("Matrix")
Warning in install.packages("Matrix") :
  'lib = "/ddn/apps/Cluster-Apps/r/3.6.3/lib64/R/library"' is not writable
Would you like to use a personal library instead? (yes/No/cancel) yes
Would you like to create a personal library
'~/R/x86_64-pc-linux-gnu-library/3.6'
to install packages into? (yes/No/cancel) yes
* DONE (Matrix)

The downloaded source packages are in
        '/tmp/RtmpKVcvHk/downloaded_packages'
>
Once the packages are installed, R can find and load them with the library("<package>") command.
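As a short illustration of loading a package from a personal library, the sketch below loads Matrix (the package installed in the example above) and shows where R looked for it. The small matrix is made up for the example.

```r
# Load the package installed in the example above.
library("Matrix")

# Directories R searches for packages; a personal library, if one exists,
# normally appears first.
print(.libPaths())

# Directory the loaded copy of Matrix was found in.
print(find.package("Matrix"))

# Quick check that the package works: build a small sparse matrix
# (illustrative values only, filled column by column) and sum its rows.
m <- Matrix(c(1, 0, 2, 0, 3, 0, 0, 0, 4), nrow = 3, sparse = TRUE)
row_totals <- rowSums(m)
print(row_totals)
```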
Note that the exact steps required may differ. For example, some versions of R on Hamilton ask you to select a mirror to download the package from. If prompted, UK (Bristol) [https] and UK (London 1) [https] are sensible choices.
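Where the prompts are unwanted, for example inside a script, the mirror and the library location can be given explicitly. The following is a minimal sketch, assuming that the CRAN cloud mirror (https://cloud.r-project.org) is reachable from Hamilton and that the default personal library reported by the R_LIBS_USER environment variable is the one you want; neither assumption is confirmed by this page.

```r
# Personal library location that R itself would offer (set by R at startup).
user_lib <- path.expand(Sys.getenv("R_LIBS_USER"))
stopifnot(nzchar(user_lib))

# Create it if it does not exist yet, and put it first on the search path.
if (!dir.exists(user_lib)) {
  dir.create(user_lib, recursive = TRUE)
}
.libPaths(c(user_lib, .libPaths()))

# Assumed mirror: any CRAN mirror URL can be used here instead.
cran_mirror <- "https://cloud.r-project.org"

# Install only if the package is not already available.
if (!requireNamespace("Matrix", quietly = TRUE)) {
  install.packages("Matrix", lib = user_lib, repos = cran_mirror)
}

library("Matrix")
```

Passing repos and lib directly means install.packages() has no need to ask which mirror or which library to use.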
Example R program, in file my_r_program.R:
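As a stand-in, a minimal sketch of such a program is given below. Its contents (the simulated data, the linear model and the output file name coefficients.csv) are illustrative assumptions rather than anything specific to Hamilton.

```r
# my_r_program.R: small self-contained example for a batch job.

# Fix the random seed so that repeated runs give the same numbers.
set.seed(42)

# Simulate 1000 observations from a known linear relationship plus noise.
n <- 1000
x <- runif(n, min = 0, max = 10)
y <- 2 * x + 1 + rnorm(n, sd = 0.5)

# Fit a linear model and print the summary; when run with R CMD BATCH,
# printed output ends up in the .Rout file.
fit <- lm(y ~ x)
print(summary(fit))

# Save the estimated coefficients for later use.
write.csv(data.frame(term = names(coef(fit)), estimate = unname(coef(fit))),
          file = "coefficients.csv", row.names = FALSE)
```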
Example job script my_r_job.sh to run an R program using a single CPU core:
#!/bin/bash

# Request resources:
#SBATCH -n 1          # 1 CPU core
#SBATCH --mem=1G      # 1 GB RAM
#SBATCH --time=6:0:0  # 6 hours (hours:minutes:seconds)

# Run on the queue for serial ("sequential") work
# (job will share node with other jobs)
#SBATCH -p seq7.q

# Make R available:
module load r/3.6.3

# Commands to be run:
R CMD BATCH my_r_program.R
Submit it to the queue with the command: sbatch my_r_job.sh
Output from the job, including any messages from the batch queue system, will be found in a file called slurm-<jobid>.out, and output from the R program will be found in my_r_program.Rout.
If you need to make your R code run faster, there are a number of things you can try. Roughly, in order of importance (most important first):
- Use a profiler such as the lineprof library, or benchmark using the microbenchmark library to understand: where your program is spending most of its time; where you need to concentrate your effort; and the impact of any changes made to speed up the code.
- Use vectorised functions that act on an object as a whole, such as rowSums() etc., instead of iterating over each element in a list.
- Try writing the most numerically intensive part of your program in another language, such as Fortran or C, and call it from R.
- Try using packages such as future or Rmpi to make use of more than one CPU core.
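To illustrate the first two points, the sketch below times an explicit loop against the vectorised rowSums() on the same matrix. It uses base R's system.time() rather than microbenchmark or lineprof so that it runs without extra packages; the matrix size and the helper name loop_row_sums are arbitrary choices for the example.

```r
# Test data: a 2000 x 500 matrix of random numbers (size chosen arbitrarily).
set.seed(1)
m <- matrix(runif(2000 * 500), nrow = 2000)

# Hypothetical helper: row sums computed with explicit loops.
loop_row_sums <- function(mat) {
  totals <- numeric(nrow(mat))
  for (i in seq_len(nrow(mat))) {
    for (j in seq_len(ncol(mat))) {
      totals[i] <- totals[i] + mat[i, j]
    }
  }
  totals
}

# Time both approaches.
t_loop <- system.time(res_loop <- loop_row_sums(m))
t_vec  <- system.time(res_vec  <- rowSums(m))

# Both give the same answer (up to floating-point rounding).
stopifnot(isTRUE(all.equal(res_loop, res_vec)))

print(t_loop["elapsed"])
print(t_vec["elapsed"])
```

The vectorised call is typically much faster, but it is worth measuring on your own data. For finer-grained measurements, the packages named in the list are better suited than system.time().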