Lawrence has three methods of job submission: interactive, batch, and GUI (graphical user interface).
Interactive jobs: An interactive job, as its name suggests, is the most user-involved. Users request a node (please don't perform computations on the login node) and then run computations or analysis by typing commands directly into the command line. Interactive jobs end when the user logs off of Lawrence.
Batch jobs: Batch jobs run one or more programs (python, C, etc.) on one or more files through a pre-written submission script. They need no interaction with the user once they have been submitted in the terminal (the job either starts on a node immediately or waits in Lawrence's queue if the desired node is in use). Batch jobs continue to run if the user logs off of Lawrence.
GUI jobs: It is possible to open some types of software in a window on Lawrence. Software such as Firefox, Gaussian, Lumerical, and RStudio can be opened and used in a manner similar to how they would be used on a desktop.
Science Gateway: Some software applications are now accessible through the Science Gateway. Please see the website to view the available applications.
The Slurm Workload Manager is the job scheduler used by the Lawrence HPC. For a comprehensive overview of Slurm commands, visit the Slurm webpage: https://slurm.schedmd.com/quickstart.html
For the commonly used Slurm commands on the Lawrence HPC, we have provided quick-start documentation with examples within the Wiki.
There are five Slurm partitions on Lawrence: the default partition (nodes), preemptible partition (preemptible), high memory partition (himem), graphics processing partition (gpu), and visualization partition (viz). For an in-depth overview of Slurm preemption, please visit the corresponding Slurm webpage.
The default Slurm partition is called “nodes” and will run a job for up to two days on one or more general compute nodes. When running the sbatch or srun command without a -p argument, your job is scheduled on the “nodes” partition.
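For example, the following session requests an interactive shell on the default partition (the compute node name in the second prompt, shown here as node01, is illustrative and will vary):

[username@login ~]$ srun --pty bash
[username@node01 ~]$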
Press Ctrl+D to exit the node and return to the login node.
To accommodate longer-running jobs, users also have the option of using the preemptible partition (using the "-p preemptible" flag). This partition allows a job to run for up to 90 days on one or more general compute nodes. However, if the node is needed for a new job in the "nodes" partition, the preemptible job will be canceled (preempted) to allow the regular job to run.
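A preemptible session is requested the same way; only the partition flag changes (the node name shown is illustrative):

[username@login ~]$ srun --pty -p preemptible bash
[username@node01 ~]$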
Press Ctrl+D to exit the preemptible partition and return to the login node.
Jobs that require a large amount of memory (RAM) may be run on a high-memory (himem) node using the "-p himem" flag.
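For example (the himem node name shown is illustrative):

[username@login ~]$ srun --pty -p himem bash
[username@himem01 ~]$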
Press Ctrl+D to exit the high memory partition and return to the login node.
To use the graphics processing unit (GPU) partition, use the "-p gpu" flag.
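For example, requesting one volta GPU along with the partition (the "--gres" flag is explained in the GPU section below; the node name shown is illustrative):

[username@login ~]$ srun --pty -p gpu --gres=gpu:volta:1 bash
[username@gpu02 ~]$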
Press Ctrl+D to exit the GPU partition and return to the login node.
For the visualization (viz) partition, use the "-p viz" flag.
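For example (the viz node name shown is illustrative):

[username@login ~]$ srun --pty -p viz bash
[username@viz01 ~]$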
Press Ctrl+D to exit the visualization partition and return to the login node.
Interactive sessions on compute nodes are started with the Slurm command "srun". For the use of one node, this command can generally be used as demonstrated below:
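A minimal sketch of the general form, requesting one node and one task (add flags such as -p, --time, or --ntasks as your job requires; the node name shown is illustrative):

[username@login ~]$ srun --pty -N 1 -n 1 bash
[username@node01 ~]$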
The Lawrence high-memory (himem) partition has two nodes, each with 1.5 TB of RAM. These nodes are especially useful for jobs requiring a large amount of memory and can be accessed either interactively or with a batch script.
For interactive jobs on the Lawrence himem nodes, use the srun command as follows:
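For example, the sketch below requests an interactive shell on a himem node; the --mem value and node name are illustrative, and the memory request should match your job's actual needs:

[username@login ~]$ srun --pty -p himem --mem=500G bash
[username@himem01 ~]$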
When requesting a GPU node, access to a GPU device must be explicitly requested using the "--gres" parameter. The format for requesting a generic resource (gres) is TYPE:LABEL:NUMBER. On Lawrence, TYPE will always be "gpu", and LABEL will be either "pascal" or "volta".
NUMBER is the number of GPUs being requested per node. There are six GPU nodes on Lawrence: GPU01, which has two pascal GPUs, and GPU02 through GPU06, which have one volta GPU each. For GPU01, NUMBER may be "1" or "2" depending on how many GPUs your workflow requires; for GPU02 through GPU06, NUMBER must be "1".
[username@login ~]$ srun --pty -p gpu --gres=gpu:pascal:1 -B 1:12 bash
[username@gpu01 ~]$
[username@login ~]$ srun --pty -p gpu --gres=gpu:volta:1 bash
[username@gpu02 ~]$
Once a GPU has been allotted, list the stats of your allocated GPU(s) with:
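The standard NVIDIA utility nvidia-smi reports the model, memory usage, and utilization of each GPU visible to your job:

[username@gpu01 ~]$ nvidia-smi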
To make submitting a batch job easier, a few templates are available for the general nodes, the high-memory nodes, and the GPU nodes. There is also a template for setting up a parallel job using MPI. To use a template, copy the template directory into one of your directories:
[username@login ~]$ cp -r /opt/examples/ ./
[username@login ~]$ cp -r /opt/examples/ $HOME/your/directoryPath/here
Open the desired template with an editor such as nano, and edit the contents as needed.
Batch jobs can be submitted on the Lawrence cluster using the sbatch command.
[username@login ~]$ sbatch simple-template.sh
A variety of configurations can be used when formulating a batch script. Below is a basic example, called simple-template.sh in the example template directory (/opt/examples/simple-template.sh). This template can be followed when requesting a general compute node on Lawrence:
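A minimal sketch of such a script is shown here; the actual /opt/examples/simple-template.sh may differ in its details. Lines beginning with #SBATCH configure the job, and ordinary shell commands below them run on the allocated node:

#!/bin/bash
# Export all current environment variables to the job
#SBATCH --get-user-env
# Run one task on one node
#SBATCH --ntasks=1
#SBATCH --nodes=1
# Request 10 minutes of runtime
#SBATCH --time=10:00

# Commands to run your program start here
pwd
echo "Hello from Lawrence"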
To use a high memory node within a batch job, add “--partition=himem” to your script.
Below is an example batch script which requests a high-memory node. This template (/opt/examples/himem-template.sh) can be followed when requesting a himem node on Lawrence:
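A minimal sketch of a himem batch script; the actual /opt/examples/himem-template.sh may differ in its details:

#!/bin/bash
#SBATCH --get-user-env
#SBATCH --ntasks=1
#SBATCH --nodes=1
#SBATCH --time=10:00
# Send the job to the high-memory partition
#SBATCH --partition=himem

echo "Running on a himem node"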
Below is an example batch script which requests a GPU node. This template (/opt/examples/gpu-template.sh) can be followed when requesting a GPU node on Lawrence:
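A minimal sketch of a GPU batch script, requesting one volta GPU via --gres; the actual /opt/examples/gpu-template.sh may differ in its details:

#!/bin/bash
#SBATCH --get-user-env
#SBATCH --ntasks=1
#SBATCH --nodes=1
#SBATCH --time=10:00
# Send the job to the GPU partition and request one volta GPU
#SBATCH --partition=gpu
#SBATCH --gres=gpu:volta:1

nvidia-smi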
Python scripts can be used to produce visual products on Lawrence. As an example, we have provided a batch script (/opt/examples/elephant/elephant-template.sh) that calls a python script (elephant.py) which produces a .png file containing a graph with a line shaped like an elephant:
elephant-template.sh

#!/bin/bash
# Example job submission script
# ensure anaconda is installed
# install with /apps/install-anaconda.sh

# This is a comment.
# Lines beginning with the # symbol are comments and are not interpreted by
# the Job Scheduler.
# Lines beginning with #SBATCH are special commands to configure the job.

### Job Configuration Starts Here #############################################

# Export all current environment variables to the job (Don't change this)
#SBATCH --get-user-env

# The default is one task per node
#SBATCH --ntasks=1
#SBATCH --nodes=1

# request 10 minutes of runtime - the job will be killed if it exceeds this
#SBATCH --time=10:00

# Change username@usd.edu to your real email address
#SBATCH --mail-user=username@usd.edu
#SBATCH --mail-type=END

### Commands to run your program start here ###################################

pwd
echo "This is the elephant example"
python elephant.py
elephant.py

"""
Author: Piotr A. Zolnierczuk (zolnierczukp at ornl dot gov)

Based on a paper by:
Drawing an elephant with four complex parameters
Jurgen Mayer, Khaled Khairy, and Jonathon Howard,
Am. J. Phys. 78, 648 (2010), DOI:10.1119/1.3254017
"""
import numpy as np
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as pylab

# elephant parameters
p1, p2, p3, p4 = (50 - 30j, 18 + 8j, 12 - 10j, -14 - 60j)
p5 = 40 + 20j  # eyepiece

def fourier(t, C):
    f = np.zeros(t.shape)
    A, B = C.real, C.imag
    for k in range(len(C)):
        f = f + A[k]*np.cos(k*t) + B[k]*np.sin(k*t)
    return f

def elephant(t, p1, p2, p3, p4, p5):
    npar = 6
    Cx = np.zeros((npar,), dtype='complex')
    Cy = np.zeros((npar,), dtype='complex')
    Cx[1] = p1.real*1j
    Cx[2] = p2.real*1j
    Cx[3] = p3.real
    Cx[5] = p4.real
    Cy[1] = p4.imag + p1.imag*1j
    Cy[2] = p2.imag*1j
    Cy[3] = p3.imag*1j
    x = np.append(fourier(t, Cx), [-p5.imag])
    y = np.append(fourier(t, Cy), [p5.imag])
    return x, y

x, y = elephant(np.linspace(0, 2*np.pi, 1000), p1, p2, p3, p4, p5)
pylab.plot(y, -x, '.')
print("Saving figure")
pylab.savefig('elephant.png')
print("Done")
R is a commonly used language for making visualizations. Provided in the /opt/examples/Rscripts folder is an example R script (exampleScript.R) and a batch script (R-batch-template.sh) for running it in batch. (The file data.csv in the same directory contains the data used.)
Batch script (R-batch-template.sh)
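A minimal sketch of what such a batch script looks like; the module name "R" is an assumption, and the actual /opt/examples/Rscripts/R-batch-template.sh may differ in its details:

#!/bin/bash
#SBATCH --get-user-env
#SBATCH --ntasks=1
#SBATCH --time=10:00

# Load R (module name assumed) and run the example script
module load R
Rscript exampleScript.R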
R script (exampleScript.R)
MPI (Message Passing Interface) is a software environment used to divide work among multiple processors. Below is a template script (/opt/examples/mpi/mpi-template.sh) and an example MPI program written in the C language (mpi_hello_world.c). Both can be found in /opt/examples/mpi/.
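A minimal sketch of an MPI batch script: it loads the openmpi-2.0/gcc module (the module name that appears in the "module list" output later in this page), compiles the C program, and launches it with four tasks. The actual /opt/examples/mpi/mpi-template.sh may differ in its details:

#!/bin/bash
#SBATCH --get-user-env
#SBATCH --ntasks=4
#SBATCH --time=10:00

module load openmpi-2.0/gcc
mpicc mpi_hello_world.c -o mpi_hello_world
mpirun ./mpi_hello_world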
Some researchers prefer the python programming language to C. If this is true of you, a python MPI template script is also available. Before beginning, ensure that you have Anaconda (or Bioconda) installed in your Lawrence account. If you don't, install it as shown below (this takes a few minutes). When the Anaconda installer asks whether to add the Anaconda commands to your PATH, select yes.
[username@login ~]$ /apps/install-anaconda.sh
...
installation finished.
Do you wish the installer to prepend the Anaconda3 install location
to PATH in your /home/usd.local/username/.bashrc ? [yes|no]
[no] >>> yes
Appending source /home/usd.local/username/anaconda3/bin/activate to /home/usd.local/username/.bashrc
A backup will be made to: /home/usd.local/username/.bashrc-anaconda3.bak
For this change to become active, you have to open a new terminal.
Thank you for installing Anaconda3!
===========================================================================
Make sure that no other modules are loaded, and remove them if needed. You can use the "which" command to verify that you are using the python and mpirun commands from Anaconda.
[username@login ~]$ module list
Currently Loaded Modulefiles:
  1) openmpi-2.0/gcc
[username@login ~]$ module purge
[username@login ~]$ module list
No Modulefiles Currently Loaded.
[username@login ~]$ which python
~/anaconda3/bin/python
[username@login ~]$ which mpirun
~/anaconda3/bin/mpirun
Below is a template script (mpi-python-template.sh) and an example MPI program written in python (csvIntoPython.py). The python script reads a csv file and prints the data to a Slurm output file (slurm-00000.out). Both can be found in "/opt/examples/mpi/".
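A minimal sketch of the python variant; it assumes Anaconda's mpirun and python are first in your PATH (see the module purge step above), and the actual /opt/examples/mpi/mpi-python-template.sh may differ in its details:

#!/bin/bash
#SBATCH --get-user-env
#SBATCH --ntasks=2
#SBATCH --time=10:00

mpirun python csvIntoPython.py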
Log into Lawrence (on Mac or Linux, use the -Y flag to enable X11 forwarding).

MobaXterm on Windows:
[User.Name.NI11018] ➤ ssh username@lawrence.usd.edu

Mac or Linux:
ITSCkMac07:~ user.name$ ssh -Y username@lawrence.usd.edu
Request a node:
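For example, the sketch below requests an interactive shell with X11 forwarding so that graphical windows can display; the --x11 flag assumes Slurm's built-in X11 support is enabled on Lawrence, and the node name shown is illustrative:

[username@login ~]$ srun --pty --x11 bash
[username@node51 ~]$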
Run graphical software on the node (such as Lumerical, Gaussian, or Firefox):
If the software is part of a module, it will need to be loaded first:
[username@node51 ~]$ module load module_name
[username@node51 ~]$ module list
Currently Loaded Modulefiles:
  1) module_name
Then run the software:
[username@node51 ~]$ name_of_software
See below for specific examples:
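Firefox: launch it by name (this assumes Firefox is available on the node's default PATH; if it is provided as a module instead, load it first as shown above):

[username@node51 ~]$ firefox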
The GUI will open:
Firefox may print a list of 200 or more copies of this error:
(firefox:189943): dconf-CRITICAL **: unable to create directory '/run/user/1093713210/dconf': Permission denied. dconf will not work properly.
If this error appears, it's nothing to worry about.
Load the Gaussian module:
[username@node51 ~]$ module load gaussian/16
[username@node51 ~]$ module list
Currently Loaded Modulefiles:
  1) gaussian/16
Launch the Gaussian GUI:
[username@node51 ~]$ gview
The GUI will open:
Launch the Lumerical GUI:
[username@node51 ~]$ module load lumerical
[username@node51 ~]$ module list
Currently Loaded Modulefiles:
  1) lumerical
[username@node51 ~]$ srun fdtd-solutions
Instructions for the Science Gateway coming soon.