Fox Jupyter Felles

This is your quick grab-and-go document for running a JupyterLab server on Fox with the felles drive attached.

Step 0: Get Fox access

Step 1: Create a sbatch file

  • sbatch is the command that submits a new job to the Slurm job manager. You can read more about jobs and Slurm in the Fox documentation.

  • Copy the script below into a new file named sjupyter.sbatch, and replace every placeholder written in < > with your own value.

    #! /bin/bash
    #SBATCH --job-name=<your_job_name>
    #SBATCH --account=ec12 ## This has to be ec12, which is the code for Fox
    #SBATCH --time=6:00:00 ## Change this to the time you want 
    #SBATCH --mem-per-cpu=8G
    #SBATCH --ntasks=1
    #SBATCH --output=<your_output_file_path> # example: /fp/homes01/u01/<your_username>/sjupyter.log
    #SBATCH --partition=accel # read more about different partitions in the Fox Documentation
    #SBATCH --gpus=1
    ## Set up job environment:
    set -o errexit  # Exit the script on any error
    set -o nounset  # Treat any unset variables as an error
    module --quiet purge
    ## load conda
    module load Miniconda3/4.9.2
    ## Set the ${PS1} (needed in the source of the Anaconda environment)
    export PS1=\\$
    ## Source the conda environment setup
    ## The variable ${EBROOTANACONDA3} or ${EBROOTMINICONDA3}
    ## comes with the module load command
    ## So use one of the following lines
    source ${EBROOTMINICONDA3}/etc/profile.d/
    ## Deactivate any spill-over environment from the login node
    conda deactivate &>/dev/null
    ## create conda env in the job's localscratch and install packages
    ## note that the environment will be deleted when the job is done, the /localscratch only exists when the job's running
    conda init bash
    conda clean --all --yes --quiet
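    ## Slurm exports the job ID as SLURM_JOB_ID; a plain JOB_ID is unset and would abort the script under "set -o nounset"
    conda create -q -y -p /localscratch/$SLURM_JOB_ID/conda/env/base python=3.10
    source activate /localscratch/$SLURM_JOB_ID/conda/env/base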
    ## notice the --download-only flag for conda install and --no-cache-dir flag for pip install
    ## they are very important so that your storage space at ~/ won't explode
    conda install -q -y -c conda-forge nb_conda_kernels --download-only
    yes | pip install jupyterlab --no-cache-dir
    yes | pip install torch torchvision torchaudio --index-url <> --no-cache-dir
    ## add any package you want to install here
    yes | pip install matplotlib librosa jupyterlab tqdm lmdb==1.4.0 --no-cache-dir
    # clear XDG_RUNTIME_DIR, otherwise Jupyter fails with a permission error
    export XDG_RUNTIME_DIR=""
    # Start the jupyter lab server
    jupyter lab --ip= --port=8080
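
Before submitting, it can help to confirm that no < > placeholder is left in the file. The following is a minimal sketch, not part of the original script: the function name check_placeholders is hypothetical, and the pattern simply looks for angle brackets around letters or underscores (an empty <> also matches). It reports placeholders inside comments too, such as the example path on the --output line.

```shell
# Hypothetical helper: list any <placeholder> still present in an sbatch file.
check_placeholders() {
    file="$1"
    # grep -n prints each matching line with its line number
    if grep -n '<[A-Za-z_]*>' "$file"; then
        echo "unfilled placeholders remain in $file" >&2
        return 1
    fi
    echo "no placeholders left in $file"
}

# Usage:
#   check_placeholders sjupyter.sbatch
```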

Step 2: Submit the job and wait for it to start

  • Make sure you are on a login node. Your terminal should show something like this:

    [ec-<username>@login-3 ~]$
  • Now we can submit a job using our file:

    $ sbatch sjupyter.sbatch
    # You'll get something like this:
    # Submitted batch job 200658
  • Now that the job has been submitted, we can check the queue and our job status by running squeue:

    $ squeue
    ## Output will look like this:
    #          200453     accel job_fit. ec-xxxxx PD       0:00      1 (Resources)
    #          200610    normal norbert3 ec-yyyyy R        41:01      1 c1-29
    # filter the output by specifying your username:
    $ squeue -u ec-<your_username>
    ## Output will look like this:
    #          JOBID PARTITION     NAME     USER            ST       TIME  NODES NODELIST(REASON)
    #          200658     accel job_fit. ec-<your_username> PD       0:00      1 (Resources)
  • Now we wait in the queue until ST becomes R, which means the job is running.
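
The submit-and-wait steps above can be sketched as a small script instead of re-running squeue by hand. This is only an illustration under assumptions: the function names extract_job_id and wait_for_job are hypothetical, the 30-second default interval is arbitrary, and the sketch relies on squeue's -h (no header), -j (job ID) and -o %t (compact state code) options.

```shell
# Extract the numeric job ID from sbatch's "Submitted batch job <id>" line (read from stdin).
extract_job_id() {
    sed -n 's/^Submitted batch job \([0-9][0-9]*\).*/\1/p'
}

# Poll squeue until the job's state code (the ST column) becomes R.
wait_for_job() {
    job_id="$1"
    interval="${2:-30}"   # seconds between polls (arbitrary default)
    while :; do
        # -h drops the header line, -o %t prints only the compact state code
        state="$(squeue -h -j "$job_id" -o %t 2>/dev/null || true)"
        case "$state" in
            R)  echo "job $job_id is running"; return 0 ;;
            "") echo "job $job_id is no longer in the queue" >&2; return 1 ;;
            *)  sleep "$interval" ;;   # e.g. PD (pending): keep waiting
        esac
    done
}

# Usage:
#   job_id="$(sbatch sjupyter.sbatch | extract_job_id)"
#   wait_for_job "$job_id"
```

A non-zero return from wait_for_job means the job left the queue before it was seen running, which is a good moment to look at the output file for errors.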

Step 3: Connect to the Jupyter server from your local computer

  • Once the job starts to run, its output is written to the file we set in the sbatch file with #SBATCH --output=<your_output_file_path>.

  • Open the file and check for errors. If there are none, the Jupyter server starts once all the packages are installed and prints something like this:

    [I 2023-04-21 05:09:11.659 ServerApp] Serving notebooks from local directory: /fp/homes01/u01/ec-<your_username>
    [I 2023-04-21 05:09:11.659 ServerApp] Jupyter Server 2.5.0 is running at:
    [I 2023-04-21 05:09:11.659 ServerApp] http://gpu-1:8080/lab?token=40f6d00286cff03ace3b500adf24c4501af984bcaeaaa9de
    [I 2023-04-21 05:09:11.659 ServerApp]     <>
    [I 2023-04-21 05:09:11.659 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
    [W 2023-04-21 05:09:11.666 ServerApp] No web browser found: Error('could not locate runnable browser').
    [C 2023-04-21 05:09:11.666 ServerApp] 
        To access the server, open this file in a browser:
        Or copy and paste one of these URLs:
  • Notice the machine gpu-1, the port 8080, and the token.

  • Now open another terminal from your local machine and run this:

    # replace every placeholder in <> with your own value
    $ ssh -t -t ec-<your_username> -L 6060:localhost:6061 ssh <machine_name> -L 6061:localhost:<jupyter_port>
    # here you need your 2FA code and your password again
    # if it succeeds, the terminal will show your Fox username and the machine name:
    # [ec-<your_username>@gpu-1 ~]$ 
  • Now, open a browser and go to localhost:6060, and you’ll see the Jupyter server!

    • The first time, it asks for the token, which is the sequence of characters after token= in the output file.
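
Reading the machine name, port and token out of the log by eye is error-prone, so here is a minimal sketch that pulls them from the server URL line and fills them into the two-hop ssh command from this step. The function names parse_jupyter_url and tunnel_command are hypothetical, the pattern assumes the URL has the form shown in the sample log (http://<machine>:<port>/lab?token=<hex>), and login_target stands for whatever you type as the first ssh destination in the command above.

```shell
# Print "machine port token" for the first Jupyter URL found in log text on stdin.
parse_jupyter_url() {
    sed -n 's|.*http://\([^:/]*\):\([0-9]*\)/lab?token=\([0-9a-f]*\).*|\1 \2 \3|p' | head -n 1
}

# Print the two-hop ssh command with the parsed values filled in.
# Local ports 6060 and 6061 are the ones used in this guide.
tunnel_command() {
    login_target="$1"; node="$2"; port="$3"
    echo "ssh -t -t $login_target -L 6060:localhost:6061 ssh $node -L 6061:localhost:$port"
}

# Usage (sjupyter.log is the file set with #SBATCH --output):
#   set -- $(parse_jupyter_url < sjupyter.log)
#   tunnel_command "<login_target>" "$1" "$2"
#   echo "token to paste in the browser: $3"
```

The command is only printed, not executed, so you can check it before pasting it into a local terminal.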

Step 4: Mount felles drive

    # connect to the felles drive, so the dataset can be copied to the node's /localscratch
    mkdir ~/felles
    sshfs -p 22 <your_uio_username> ~/felles
    # now you need the 2FA and password of your UiO account, NOT THE EDUCLOUD FOX ACCOUNT
  • If you want to unmount felles, you could use fusermount -u:

    ## unmount felles
    fusermount -u ~/felles
    rmdir ~/felles
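
If you are not sure whether felles is still mounted, the unmount can be guarded with a check first. This is a sketch under assumptions: the function name unmount_felles is hypothetical, and it detects the mount by looking for the directory followed by a fuse file system type in /proc/mounts, which is Linux-specific.

```shell
# Hypothetical helper: unmount the directory only if it is a FUSE mount, then remove it.
unmount_felles() {
    dir="${1:-$HOME/felles}"
    # /proc/mounts lists "<source> <mount point> <fs type> ..."; sshfs shows up as fuse.sshfs
    if grep -qsF " $dir fuse" /proc/mounts; then
        fusermount -u "$dir" || return 1
    else
        echo "$dir is not mounted"
    fi
    # rmdir only removes an empty directory, so it cannot delete data by accident
    rmdir "$dir"
}

# Usage:
#   unmount_felles ~/felles
```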
Published Sep. 20, 2023 1:26 PM - Last modified Sep. 20, 2023 1:26 PM