Interactive Jobs on Fox
Sometimes you want to test or debug a calculation interactively, or run a computation that needs interaction or a graphical interface. Running such work directly on the login node is discouraged, so in these cases you need an interactive job.
Running an Interactive Job
Instead of running on a login node, you can ask the queue system to allocate compute resources for you. Once they are assigned, you can run commands interactively for as long as the requested time allows. The examples below are for devel jobs, but the procedure also holds for the other job types.
salloc --ntasks=1 --mem-per-cpu=4G --time=00:30:00 --qos=devel --account=YourAccount
When you are done, simply exit the shell (exit, logout or ^D) to end the job.
The arguments to salloc could be any arguments you would have given to sbatch when submitting a non-interactive job. However, --qos=devel is probably a good idea to avoid waiting too long in the queue.
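If you start interactive jobs often, a small shell function can save typing. The following is a minimal sketch, not an official Fox tool: the function name fox_interactive and the DRY_RUN variable are hypothetical, and YourAccount is the same placeholder used in the example above.

```shell
# Hypothetical helper: request an interactive devel job with salloc.
# Arguments: account (default: the placeholder YourAccount),
#            time limit (default 00:30:00), memory per CPU (default 4G).
# If DRY_RUN is set to any non-empty value, the command is only printed.
fox_interactive() {
    ${DRY_RUN:+echo} salloc --ntasks=1 \
        --mem-per-cpu="${3:-4G}" \
        --time="${2:-00:30:00}" \
        --qos=devel \
        --account="${1:-YourAccount}"
}

# Print the command without submitting anything:
#   DRY_RUN=1 fox_interactive YourAccount 01:00:00
# Request the job for real:
#   fox_interactive YourAccount 01:00:00
```

Printing the command first makes it easy to check the time limit and account before anything is submitted to the queue.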
Note that interactive jobs stop when you log out from the login machine, so unless you have very long days in the office (or elsewhere, for that matter), specifying more than 6-8 hours of runtime is not very useful. An alternative is to start the job in a tmux session (see below).
Keeping Interactive Jobs Alive
Interactive jobs stop when you disconnect from the login node, either by choice or because of internet connection problems. To keep a job alive you can use a terminal multiplexer like tmux. tmux allows you to run processes as usual in your standard shell, and if you detach or get disconnected from the session, you can reconnect to it later and continue where you left off.
You start tmux on a login node before you request an interactive Slurm job with salloc (or srun), and then do all the work inside it. If you get disconnected, you simply reconnect to the login node and attach to the tmux session again by typing:
tmux attach
Or, if you have multiple sessions running:
tmux list-sessions
tmux attach -t SESSION_NUMBER
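Giving the session a name makes it easier to find again than a session number. The sketch below assumes a hypothetical helper called fox_tmux and a default session name of fox. It uses only the standard tmux subcommands has-session, attach-session and new-session.

```shell
# Hypothetical helper: attach to the named tmux session if it already
# exists on this login node, otherwise create it.
fox_tmux() {
    session=${1:-fox}    # session name; "fox" is an arbitrary default
    if tmux has-session -t "$session" 2>/dev/null; then
        tmux attach-session -t "$session"
    else
        tmux new-session -s "$session"
    fi
}

# Typical use on the login node:
#   fox_tmux work      # first call creates the session "work"
#   salloc ...         # then request the interactive job inside it
#   fox_tmux work      # after a disconnect, the same call reattaches
```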
As long as the tmux session is not closed or terminated (e.g. by a server restart), your session should continue. Note that the tmux session is bound to the particular login server you were connected to. So if you start a tmux session on login-1 and the next time you log in to fox.educloud.no you happen to be connected to login-2, you first have to connect to login-1 again by:
ssh login-1
Or you can log in directly to it from the outside:
ssh login-1.fox.educloud.no
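One way to avoid guessing which login node holds your session is to note the host name when you start tmux. The following is only a sketch: the function name tmux_next_step and the file ~/.fox_tmux_host are hypothetical, and it assumes that hostname -s prints short names such as login-1 on the login nodes.

```shell
# Hypothetical helper: given the login node where tmux was started ($1)
# and the node you are on now ($2), print the next command to type.
tmux_next_step() {
    if [ "$1" = "$2" ]; then
        echo "tmux attach"
    else
        echo "ssh $1"
    fi
}

# Possible usage (the file name is an assumption, pick any you like):
#   hostname -s > ~/.fox_tmux_host      # run once, right before starting tmux
#   tmux_next_step "$(cat ~/.fox_tmux_host)" "$(hostname -s)"
```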
To detach from a tmux session without closing it, press Ctrl-B (that is, the Ctrl key and "b" simultaneously; this is the standard tmux prefix) and then "d" (without the quotation marks). To close a session, just close the bash session with either Ctrl-D or by typing exit. You can get a list of all tmux key bindings with Ctrl-B followed by ? (question mark). See also this page for a short tutorial of tmux. Otherwise, working inside a tmux session is almost the same as working in a normal bash session.
Graphical User Interfaces in Interactive Jobs
It is possible to run X commands, i.e., programs with a graphical user interface (GUI), in interactive jobs. This allows you to get graphical output back from your job running on a compute node.
First, you must make sure that you have turned on X forwarding when logging in to the cluster. With ssh from a Linux or macOS machine, you do this with the -Y flag, e.g.:
ssh -Y fox.educloud.no
Check that the X forwarding works by running a graphical command like emacs &
and verify that it sets up a window. (Note that due to network latency, it
can take a long time to set up a window.)
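A quicker first check is to see whether ssh has set the DISPLAY environment variable, which it does when X forwarding is active. This is a minimal sketch with a hypothetical function name x11_ready. A set DISPLAY is necessary but does not guarantee that windows will actually open, so the emacs test above is still the real proof.

```shell
# Hypothetical helper: report whether DISPLAY is set in this shell.
# Returns 0 if it is set and non-empty, 1 otherwise.
x11_ready() {
    if [ -n "${DISPLAY:-}" ]; then
        echo "X forwarding looks active (DISPLAY=$DISPLAY)"
    else
        echo "DISPLAY is not set; log in again with ssh -Y" >&2
        return 1
    fi
}

# Usage on the login node, before requesting a job with --x11:
#   x11_ready
```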
To be able to run X commands in interactive jobs, add the argument --x11 (note the lowercase x) to salloc, like this:
salloc --ntasks=1 --mem-per-cpu=4G --time=00:30:00 --qos=devel --account=YourAccount --x11
An Alternative to salloc
An alternative to using salloc is to use
srun --ntasks=1 --mem-per-cpu=4G --time=00:30:00 --qos=devel --account=YourAccount --pty bash -i
(I.e., the same arguments, plus --pty bash -i.)
As with salloc, the arguments to srun can be any you would have given to sbatch. srun used to be the preferred way to run interactive jobs, but changes in recent versions of Slurm have led us to recommend salloc. For instance, inside an srun interactive job, steps started with srun must now be given the --overlap argument to be able to start.
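If you do work inside an srun interactive job, a thin wrapper can add the extra argument for you. This is a sketch only: the function name run_step is hypothetical, while --overlap is the standard srun option mentioned above.

```shell
# Hypothetical wrapper: start a job step from inside an srun interactive
# job, always passing --overlap so the step can share the resources
# already held by the interactive shell.
run_step() {
    srun --overlap "$@"
}

# Example, typed inside the interactive job:
#   run_step --ntasks=1 hostname
```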
Use the --x11 argument to run X commands.
CC Attribution: This page is maintained by the University of Oslo IT FFU-BT group.
It has either been modified from, or is a derivative of, "Interactive Jobs"
by NRIS under CC-BY-4.0.
Changes: the section on salloc and srun was rewritten.