Hamilton supports using the Matlab programming environment. Matlab is made available through the module command. module avail matlab will show which versions are installed.
The full Matlab graphical interface can be used on the login nodes by loading a module, e.g. module load matlab/R2021a and executing the matlab command. We recommend that you do this using X2GO (see the Login page for details) to improve performance and stability.
Non-interactive Matlab scripts can be run on the compute nodes through the batch queue; however, this will use the text-based interface for Matlab so that the output can be captured in a text output file.
If you would like to save a Matlab function in a file, create a matlab folder in your home directory and place .m files in it.
For example, file $HOME/matlab/f.m containing the following will define a function, f, which can be used from the Matlab prompt:
function y = f(x)
y = x + 2
Alternatively, if you have downloaded a community-provided Matlab Toolbox, you can make it available to your Matlab session by adding the location of its directory to the MATLABPATH environment variable.
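As a minimal sketch of the MATLABPATH approach, the lines below prepend a toolbox directory to the variable before Matlab is started. The directory name $HOME/toolboxes/my_toolbox is hypothetical; substitute the location where you unpacked the toolbox.

```shell
# Hypothetical toolbox location; replace with your own directory.
toolbox_dir="$HOME/toolboxes/my_toolbox"

# Prepend it to MATLABPATH (a colon-separated list on Linux),
# keeping any entries that are already set.
export MATLABPATH="$toolbox_dir${MATLABPATH:+:$MATLABPATH}"

echo "MATLABPATH is now: $MATLABPATH"
```

Placing these lines in a job script before the matlab command, or running them in a login session before starting Matlab, should make the toolbox's functions visible to that session only.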
Example Matlab program, in file my_matlab_program.m:
% Perform a simple calculation:
x = 2;
y = 3;
x + y
Example job script my_matlab_job.sh to run a Matlab program using a single CPU core:
#!/bin/bash

# Request resources:
#SBATCH -c 1           # 1 CPU core
#SBATCH --mem=1G       # 1 GB RAM
#SBATCH --time=1:0:0   # 1 hour (days-hours:minutes:seconds)

# Run in the 'shared' queue
# (job will share node with other jobs)
#SBATCH -p shared

# Make Matlab available:
module load matlab/R2021a

# Start Matlab:
# (-nodisplay disables the graphics interface)
# (-singleCompThread stops matlab from using more than one core)
matlab -nodisplay -singleCompThread -r "run my_matlab_program.m"
Submit it to the queue with the command: sbatch my_matlab_job.sh
The above example only uses one CPU core, but Matlab has a number of ways to make use of more cores in order to improve the performance of numerical work:
- Running many jobs
Hamilton can support running many different Matlab jobs at the same time.
- Multithreading
Many Matlab routines, such as fft(), are able to take advantage of more than one CPU core on a single machine.
- Distributed Computing
This is where Matlab creates multiple copies of itself, allowing a program to distribute work across them by using a parfor loop.
- GPUs
Matlab supports the use of GPU cards where available, although the code needs to be specially written to take advantage of such hardware.
You may need to use the Matlab profiler to understand which of these techniques is best suited to your code.
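For the "running many jobs" approach, one common pattern is a Slurm job array, where a single job script is run many times with a different index each time. The sketch below writes such a script to a file. It assumes that job arrays are enabled on Hamilton, which this page does not state. The file name my_matlab_array_job.sh and the range 1-10 are illustrative only.

```shell
# Write a hypothetical job script for a 10-task job array.
cat > my_matlab_array_job.sh <<'EOF'
#!/bin/bash

#SBATCH -c 1              # 1 CPU core per task
#SBATCH --mem=1G          # 1 GB RAM per task
#SBATCH --time=1:0:0      # 1 hour per task
#SBATCH -p shared
#SBATCH --array=1-10      # tasks 1 to 10 (assumed range)

module load matlab/R2021a

# Slurm sets SLURM_ARRAY_TASK_ID to a different value in each task.
# The Matlab program can read it with getenv('SLURM_ARRAY_TASK_ID')
# to decide which input or parameter set to work on.
matlab -nodisplay -singleCompThread -r "run my_matlab_program.m"
EOF

# Submit with: sbatch my_matlab_array_job.sh
```

Each array task is scheduled as a separate single-core job, so every task still starts its own Matlab session.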
If you need to run many hundreds of Matlab jobs at the same time, you may need to compile your program into a standalone executable. The executable does not run any more quickly, but it does minimise the amount of contact with the Matlab license server. To generate an executable using a single CPU core:
mcc -R -singleCompThread -m my_matlab_program.m
An executable called my_matlab_program will be created. You may need to execute the following command before running it:
Many, but not all, Matlab routines automatically take advantage of multiple CPU cores in a computer. As Hamilton's compute nodes have 128 cores, this allows a significant speed up compared to running on many laptops or desktops.
Example job script, allowing the use of an entire compute node:
#!/bin/bash

# Request resources:
#SBATCH -c 128        # number of CPU cores. 128 is a whole compute node.
#SBATCH -t 01:00:00   # 1 hour (hours:minutes:seconds)

# Run in the default queue
#SBATCH -p shared

# Make Matlab available:
module load matlab/R2021a

# Start Matlab:
# (-nodisplay disables the graphics interface)
matlab -nodisplay -r "run my_matlab_program.m"
Experimentation may be required: compare how quickly the single CPU core and multithreaded job scripts run to see whether your Matlab program can take advantage of multiple cores.
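One simple way to make that comparison is to record the wall-clock time of the Matlab command inside each job script. The helper below is a hypothetical sketch, not something Hamilton provides. The name time_cmd is made up for illustration, and a stand-in command is used so that the fragment runs without Matlab.

```shell
# Hypothetical helper: run a command and report elapsed wall-clock seconds.
time_cmd() {
    tc_start=$(date +%s)
    "$@"
    tc_status=$?
    tc_end=$(date +%s)
    echo "Elapsed: $((tc_end - tc_start)) s (exit status $tc_status)"
    return "$tc_status"
}

# In each job script, wrap the Matlab command, for example:
#   time_cmd matlab -nodisplay -singleCompThread -r "run my_matlab_program.m"
#   time_cmd matlab -nodisplay -r "run my_matlab_program.m"
# Stand-in command so this sketch runs without Matlab:
time_cmd sleep 1
```

The reported times include Matlab's start-up, so the comparison is most meaningful for programs that run for at least a few minutes.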
If multithreading alone does not provide the speed up you require, Matlab is also able to create multiple copies of itself, called workers, which run code within a Matlab parfor loop. Note that functions inside the parfor loop may multithread. Workers come in two flavours, which are licensed differently:
- Distributed Computing Toolbox (now called Parallel Computing Toolbox) - the workers are all on the same node as the main Matlab program
- Distributed Compute Engine (now called MATLAB Parallel Server, formerly MATLAB Distributed Computing Server) - the workers are spread across multiple nodes, allowing the parfor to use more CPU cores and memory than are available in a single node
To use Distributed Computing Toolbox, use the same job script as in the Multithreading section, but:
- In the job script, add the following lines after loading the Matlab module:
unset OMP_PLACES
unset OMP_PROC_BIND
- In your Matlab script, execute the following Matlab command to start the workers: parpool('local', str2num(getenv('SLURM_CPUS_PER_TASK')))
- Use the parfor Matlab command to distribute work across the workers.
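The job-script side of these steps can be sketched as the fragment below, intended to sit after the module load line. The default value for SLURM_CPUS_PER_TASK is an added safeguard rather than part of the steps above. Slurm normally sets this variable only when -c (--cpus-per-task) is requested, so without it the getenv call in the Matlab script would return an empty string.

```shell
# Fragment for a job script, placed after "module load matlab/R2021a".

# Clear the OpenMP placement settings, as described in the steps above.
unset OMP_PLACES
unset OMP_PROC_BIND

# Safeguard (an assumption, not a Hamilton requirement): fall back to one
# worker if the job did not request cores with -c.
export SLURM_CPUS_PER_TASK="${SLURM_CPUS_PER_TASK:-1}"

echo "Workers to start: $SLURM_CPUS_PER_TASK"
```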
If your work uses 'parpool', you will also need to make sure that Matlab's temporary files for the pool are saved in a unique location for each job. $TMPDIR is suitable for this. Adapt the following lines to suit your own code:
pooltmpdir = getenv('TMPDIR');
pcluster = parcluster('local');
pcluster.JobStorageLocation = pooltmpdir;
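Because the Matlab lines above depend on $TMPDIR being set, it may be worth checking the variable in the job script before Matlab starts. The fragment below is a sketch under one assumption: that a private directory under /tmp is an acceptable fallback when $TMPDIR is missing or not writable, for example when testing outside a batch job. The directory prefix matlab_pool is hypothetical.

```shell
# Use $TMPDIR if it is set and writable; otherwise create a fresh private
# directory (assumed fallback, e.g. for runs outside a batch job).
if [ -z "${TMPDIR:-}" ] || [ ! -w "$TMPDIR" ]; then
    TMPDIR=$(mktemp -d /tmp/matlab_pool.XXXXXX)
fi
export TMPDIR

echo "Matlab pool files will be stored under: $TMPDIR"
```

Matlab then picks up the same location through getenv('TMPDIR').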
If you wish to try Distributed Compute Engine, which may permit running much larger computations over several of Hamilton's compute nodes, please contact us first.