MATLAB

Software Description:

MATLAB is a multi-paradigm numerical computing environment and proprietary programming language developed by MathWorks. MATLAB allows matrix manipulations, plotting of functions and data, implementation of algorithms, creation of user interfaces, and interfacing with programs written in other languages.

Software Home Page: MATLAB

Software Documentation:

MATLAB is a very popular application in our HPC user community. This page explains the different ways to run and use it on the LCC cluster.

A) Running MATLAB using the Slurm scheduler on the cluster

See the example MATLAB job submission script submit.sh in the folder /share/examples/LCC/MATLAB/containerized.

Run the following commands to load or unload the MATLAB module:

Container Package
# To load a module (choose one version)
module load ccs/container/matlab/R2018b    # 2018 version
module load ccs/container/matlab/R2019a    # 2019 version

# To unload a module (per version)
module unload ccs/container/matlab/R2018b
module unload ccs/container/matlab/R2019a

Software Package
# To load a module (choose one version)
module load ccs/matlab/R2018b    # 2018 version
module load ccs/matlab/R2019a    # 2019 version

# To unload a module (per version)
module unload ccs/matlab/R2018b
module unload ccs/matlab/R2019a

B) Using the MATLAB GUI through OOD (Open OnDemand) on the cluster

1) Launch a VNC session to a compute node by following steps 1 to 7 from /wiki/spaces/PreReleaseUKYHPCDocs/pages/7280090.

2) Once the VNC session is open, click the menu bar at the bottom of the screen and launch MATLAB.

MATLAB then starts and can be used interactively.

C) To submit jobs to the scheduler interactively from the MATLAB GUI on the cluster, do the following:

1) Inside the MATLAB GUI, set up the environment as shown below to use the cluster.

Software Package
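clear; clc
rehash toolbox                                   % refresh the toolbox path cache
configCluster                                    % configure MATLAB to submit to the cluster
c = parcluster;                                  % get a handle to the cluster
c.AdditionalProperties.WallTime = '1:30:0';      % wall time: 1 hour 30 minutes
c.AdditionalProperties.QueueName = 'SAN16M64_L';         % queue (partition) to submit to
c.AdditionalProperties.AccountName = 'col_vgazu2_uksr';  % account to charge
c.saveProfile                                    % keep these settings for later sessions
c.AdditionalProperties                           % display the current settings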

2) Run a sample test to verify that the job is allocated and running.

j=batch(c,@pause,0,{120});

j.State

3) Check the job using Slurm commands (for example, squeue -u $USER in a terminal on the cluster).

4) Delete the job.

j.delete
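
Steps 2 to 4 can also be run as one short sequence from the MATLAB command window. The following is a minimal sketch, assuming the cluster profile from step 1 has already been saved; the 60-second timeout is an arbitrary illustrative value.

```matlab
% Sketch only: assumes configCluster has been run and the profile saved (step 1).
c = parcluster;

% Submit a test job that pauses for 120 seconds on a compute node.
j = batch(c, @pause, 0, {120});

% Block until the job has started running, or until 60 seconds have
% passed (the timeout is an arbitrary choice for this sketch).
started = wait(j, 'running', 60);

if started
    disp(j.State)    % job state as reported by MATLAB
else
    disp('Job has not started yet; it may still be queued.')
end

% Delete the test job once it is no longer needed.
delete(j)
```

If wait returns false, the job is most likely still waiting in the Slurm queue, which can be confirmed with the Slurm commands from step 3.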

D) Using MATLAB parpool interactively with the Slurm scheduler on the cluster

1) Set up the Slurm environment as in section C above and start a parpool. The indicator in the lower-left corner of the MATLAB window shows MATLAB requesting the pool from the Slurm scheduler.

parpool(12);

2) Once the parpool is allocated, the indicator in the lower-left corner shows that the workers are allocated.

3) Run test code to check that it is working.

j=batch(c,@pwd,1,{},'pool',12);

4) Check the job using Slurm commands.

5) After using the parpool, make sure you delete it (lower-left corner) so that the cores are returned to the cluster, and also delete the MATLAB job.
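
The cleanup in step 5 can also be done from the command window instead of through the pool indicator. A minimal sketch, assuming j is the batch job submitted in step 3 and is still in the workspace:

```matlab
% Shut down the current interactive pool, if one exists.
% gcp('nocreate') returns an empty value instead of starting a new pool.
pool = gcp('nocreate');
if ~isempty(pool)
    delete(pool)
end

% Delete the batch job from step 3 (assumes the variable j still exists).
if exist('j', 'var')
    delete(j)
end
```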

    

E) Parallel Computing using MATLAB for remote submission to the cluster from your workstation

Add the support package to your MATLAB path by untarring (matlab-remote-submission-uky-R2018b-R2019a.tar) or unzipping (matlab-remote-submission-uky-R2018b-R2019a.zip) it into $MATLAB/toolbox/local or another available folder on your system.

Start MATLAB.  Configure MATLAB to run parallel jobs on your cluster by calling configCluster.  For each cluster, configCluster only needs to be called once per version of MATLAB.

>> configCluster

Submission to the remote cluster requires SSH credentials.  You will be prompted for your SSH username and either a password or an identity file (private key).  The username and the location of the private key will be stored in MATLAB for future sessions.

Jobs will now default to the cluster rather than submit to the local machine.

NOTE: If you would like to submit to the local machine then run the following command:

>> % Get a handle to the local resources
>> c = parcluster('local');

CONFIGURING JOBS

Before submitting a job, you can specify various parameters to pass to it, such as queue, e-mail, and wall time. [A queue name and a wall time are required.]

>> % Get a handle to the cluster
>> c = parcluster;

[REQUIRED]

>> % Specify a queue to use for MATLAB jobs                

>> c.AdditionalProperties.QueueName = 'queue-name';

>> % Specify the Walltime (e.g. 5 hours)

>> c.AdditionalProperties.WallTime = '05:00:00';


[OPTIONAL]

>> % Specify an account to use for MATLAB jobs

>> c.AdditionalProperties.AccountName = 'account-name';

>> % Specify how many GPUs to use for MATLAB jobs

>> c.AdditionalProperties.GpusPerNode = '2';

>> % Specify memory to use for MATLAB jobs, per core (MB)

>> c.AdditionalProperties.MemUsage = '4000';

>> % Specify e-mail address to receive notifications about your job

>> c.AdditionalProperties.EmailAddress = 'linkblue@uky.edu';


Save changes after modifying AdditionalProperties for the above changes to persist between MATLAB sessions.

>> c.saveProfile


To see the values of the current configuration options, display AdditionalProperties.

>> % To view current properties

>> c.AdditionalProperties


Unset a value when no longer needed.

>> % Turn off email notifications
>> c.AdditionalProperties.EmailAddress = '';

>> c.saveProfile


INDEPENDENT BATCH JOB

Rather than running interactively, use the batch command to submit asynchronous jobs to the cluster.  The batch command will return a job object which is used to access the output of the submitted job.  See the MATLAB documentation for more help on batch.

>> % Get a handle to the cluster
>> c = parcluster;

>> % Submit job to query where MATLAB is running on the cluster

>> j = c.batch(@pwd, 1, {}, 'CurrentFolder', '.', 'AutoAddClientPath',false);

>> % Query job for state

>> j.State

>> % If state is finished, fetch the results

>> j.fetchOutputs{:}

>> % Delete the job after results are no longer needed

>> j.delete


To retrieve a list of currently running or completed jobs, call parcluster to retrieve the cluster object.  The cluster object stores an array of jobs that were run, are running, or are queued to run.  This allows us to fetch the results of completed jobs.  Retrieve and view the list of jobs as shown below.

>> c = parcluster;

>> jobs = c.Jobs;

Once we’ve identified the job we want, we can retrieve the results as we’ve done previously.

fetchOutputs is used to retrieve function output arguments; if calling batch with a script, use load instead.  Data that has been written to files on the cluster needs to be retrieved directly from the file system (e.g. via ftp).
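
The difference between the two retrieval methods can be sketched as follows. The script name my_script and the variable name result are hypothetical placeholders for illustration; they are not files or variables provided with the support package.

```matlab
c = parcluster;

% Case 1: batch with a function handle. Outputs come back via fetchOutputs.
jf = c.batch(@pwd, 1, {}, 'CurrentFolder', '.', 'AutoAddClientPath', false);
wait(jf);                   % block until the job finishes
out = fetchOutputs(jf);     % cell array, one entry per output argument
disp(out{1})

% Case 2: batch with a script. Assumes a hypothetical my_script.m in the
% current folder that assigns a variable named result.
js = c.batch('my_script', 'CurrentFolder', '.', 'AutoAddClientPath', false);
wait(js);
load(js, 'result');         % copy result from the job into this workspace
disp(result)

% Delete both jobs once the results are no longer needed.
delete(jf)
delete(js)
```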

To view results of a previously completed job:

>> % Get a handle to the job with ID 2

>> j2 = c.Jobs(2);


NOTE: You can view a list of your jobs, as well as their IDs, using the above c.Jobs command. 

>> % Fetch results for job with ID 2

>> j2.fetchOutputs{:}


PARALLEL BATCH JOB

NOTE: This is not recommended, since it requires two-way communication between the cluster compute nodes and your workstation, and in most cases the compute nodes cannot connect to your remote workstation.

Users can also submit parallel workflows with the batch command.  Let’s use the following example for a parallel job.   
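
The definition of parallel_example is not reproduced on this page. The following is a minimal sketch of what such a function could look like, assuming it times a parfor loop of 16 independent iterations and returns the elapsed time in seconds. The 2-second pause per iteration is an assumption standing in for real work.

```matlab
function t = parallel_example(iter)
% PARALLEL_EXAMPLE  Time a parfor loop of simulated work (illustrative sketch).
%   t = parallel_example(iter) runs iter iterations and returns the elapsed
%   wall-clock time in seconds. iter defaults to 16.

if nargin == 0
    iter = 16;
end

A = zeros(1, iter);     % preallocate the sliced output variable

t0 = tic;
parfor idx = 1:iter
    A(idx) = idx;       % trivial per-iteration result
    pause(2)            % stand-in for 2 seconds of computation (assumption)
end
t = toc(t0);
end
```

Save the function as parallel_example.m in the folder you submit from so that batch can find it. Under these assumptions, 16 iterations spread over 4 workers take at least 16 / 4 x 2 = 8 seconds, plus scheduling overhead.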

This time when we use the batch command, in order to run a parallel job, we’ll also specify a MATLAB Pool.     

>> % Get a handle to the cluster

>> c = parcluster;

>> % Submit a batch pool job using 4 workers for 16 simulations

>> j = c.batch(@parallel_example, 1, {}, 'Pool',4, 'CurrentFolder', '.', 'AutoAddClientPath',false);

>> % View current job status

>> j.State

>> % Fetch the results after a finished state is retrieved

>> j.fetchOutputs{:}

ans =     8.8872


The job ran in 8.89 seconds using four workers.  Note that these jobs will always request N+1 CPU cores, since one worker is required to manage the batch job and pool of workers.   For example, a job that needs eight workers will consume nine CPU cores.            

We’ll run the same simulation but increase the Pool size.  This time, to retrieve the results later, we’ll keep track of the job ID.

NOTE: For some applications, there will be a diminishing return when allocating too many workers, as the overhead may exceed computation time.   

>> % Get a handle to the cluster

>> c = parcluster;


>> % Submit a batch pool job using 8 workers for 16 simulations

>> j = c.batch(@parallel_example, 1, {}, 'Pool', 8, 'CurrentFolder', '.', 'AutoAddClientPath',false);


>> % Get the job ID

>> id = j.ID

id =

     4

>> % Clear j from workspace (as though we quit MATLAB)

>> clear j

Once we have a handle to the cluster, we’ll call the findJob method to search for the job with the specified job ID.  

>> % Get a handle to the cluster

>> c = parcluster;


>> % Find the old job

>> j = c.findJob('ID', 4);


>> % Retrieve the state of the job

>> j.State

ans =

finished

>> % Fetch the results

>> j.fetchOutputs{:}

ans =

4.7270

The job now runs in 4.73 seconds using eight workers.  Run the code with different numbers of workers to determine the ideal number to use.
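
One way to compare worker counts is to submit the same function with several pool sizes and print the timings side by side. This is a sketch, assuming parallel_example returns the elapsed time in seconds as its single output; the pool sizes are arbitrary illustrative values.

```matlab
c = parcluster;
poolSizes = [2 4 8];                 % arbitrary sizes chosen for illustration
jobs = cell(size(poolSizes));

% Submit one pool job per size. Each job requests poolSizes(k) + 1 cores.
for k = 1:numel(poolSizes)
    jobs{k} = c.batch(@parallel_example, 1, {}, 'Pool', poolSizes(k), ...
        'CurrentFolder', '.', 'AutoAddClientPath', false);
end

% Wait for each job, report its timing, and clean up.
for k = 1:numel(poolSizes)
    wait(jobs{k});
    if strcmp(jobs{k}.State, 'finished')
        out = fetchOutputs(jobs{k});
        fprintf('%d workers: %.2f s\n', poolSizes(k), out{1});
    else
        fprintf('%d workers: job ended in state %s\n', poolSizes(k), jobs{k}.State);
    end
    delete(jobs{k})
end
```

Submitting all jobs before waiting lets the scheduler run them concurrently when enough cores are free.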

Alternatively, to retrieve job results via a graphical user interface, use the Job Monitor (Parallel > Monitor Jobs).


DEBUGGING

If a serial job produces an error, call the getDebugLog method to view the error log file.  When submitting independent jobs with multiple tasks, specify the task number.

>> c.getDebugLog(j.Tasks(3))


For Pool jobs, only specify the job object.

>> c.getDebugLog(j)
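
Before opening the debug log, it can help to check which tasks recorded an error. A minimal sketch, assuming j is a finished or failed job in the workspace; it relies on the ID and ErrorMessage properties of the task objects:

```matlab
% Print the error message of every task that recorded one.
for k = 1:numel(j.Tasks)
    task = j.Tasks(k);
    if ~isempty(task.ErrorMessage)
        fprintf('Task %d failed: %s\n', task.ID, task.ErrorMessage);
    end
end
```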

When troubleshooting a job, the cluster admin may request the scheduler ID of the job.  This can be derived by calling schedID:

>> schedID(j)

ans =

25539


Center for Computational Sciences