Description
State-of-the-art Machine Learning for PyTorch, TensorFlow and JAX.
Home page
https://github.com/huggingface/transformers
Documentation
https://huggingface.co/docs/transformers/
License
Usage
Transformers uses the PyTorch backend and should ideally be run on the GPU nodes (using the "cuda" module). It is configured for offline use and points to a local Hugging Face cache.
Use
module avail Transformers
to see which versions of Transformers are available. Use
module load Transformers/version
to get access to Transformers.
Huggingface cache
The module sets environment variables (HF_DATASETS_OFFLINE, HF_HUB_OFFLINE, TRANSFORMERS_OFFLINE) for offline usage in TSD.
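As an illustration of what the module configures for you, the sketch below shows the same offline variables being set manually before any Hugging Face library is imported. The values are assumptions about the module's settings; inside TSD the module exports them automatically, so you normally never need to do this yourself.

```python
import os

# Illustrative manual setup: the Transformers module exports these for you.
# Shown here only to clarify which variables control offline behaviour.
offline_vars = {
    "HF_DATASETS_OFFLINE": "1",   # datasets library: no network lookups
    "HF_HUB_OFFLINE": "1",        # huggingface_hub: no network lookups
    "TRANSFORMERS_OFFLINE": "1",  # transformers: resolve from local cache only
}
os.environ.update(offline_vars)

# Any later `from_pretrained(...)` call will now read from the local cache
# instead of trying to reach huggingface.co.
for name in offline_vars:
    print(f"{name}={os.environ[name]}")
```

These variables must be set before the libraries are imported, which is why the module handles them at load time.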
The module sets "HF_HOME=/cluster/software/huggingface" to point to a local repository of models and data. Models and data are distributed under various licenses, so please read the license of a model prior to using it. To have models or data added to the central repository, submit a request to tsd-drift.
You can list the available models using the following command:
huggingface-cli scan-cache
Overwrite the "HF_HOME" environment variable to point to your own imported models/data:
export HF_HOME=/path/to/your/model
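A hedged sketch of this workflow (HF_HOME is the variable name used by current Hugging Face tooling; the path is a placeholder you would replace with your own import location):

```shell
# Point HF_HOME at a directory you control (placeholder path).
export HF_HOME="$HOME/my-hf-cache"
mkdir -p "$HF_HOME"

# Verify the variable is set.
echo "HF_HOME is $HF_HOME"

# List what is cached there, if huggingface-cli is on PATH.
if command -v huggingface-cli >/dev/null 2>&1; then
    huggingface-cli scan-cache
else
    echo "huggingface-cli not available in this environment"
fi
```

After exporting the variable, subsequent `from_pretrained(...)` calls resolve models from that directory instead of the central repository.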
Example
Please note that the example below is model specific; scripts will require modification to fit the needs of your analysis and data. The PyTorch default device is the CPU, so the CUDA device needs to be selected in the code. Please consult the model documentation on Hugging Face for details and examples.
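A minimal sketch of such device selection, assuming PyTorch is available (on a CPU-only node this falls back to "cpu"; on the GPU nodes it picks "cuda"):

```python
# Hypothetical device-selection snippet: pick "cuda" when a GPU is visible,
# otherwise fall back to the CPU so the script still runs on login nodes.
try:
    import torch
    device = "cuda" if torch.cuda.is_available() else "cpu"
except ImportError:  # PyTorch not installed in this environment
    device = "cpu"

print(f"using device: {device}")
# Tensors and model inputs would then be moved explicitly, e.g.
# model_inputs = tokenizer([...], return_tensors="pt").to(device)
```

The example script below instead hard-codes "cuda", which is fine inside a Slurm job that requests a GPU but will fail on nodes without one.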
The following example from Huggingface will help you to get started. Create the following Slurm script "transformers.sm":
#!/bin/bash
#SBATCH --job-name=transformers
#SBATCH --account=pXX_tsd
#SBATCH --partition=accel
#SBATCH --gres=gpu:1
#SBATCH --time=0:05:00
#SBATCH --mem-per-cpu=4G

set -o errexit

module purge --quiet
module load Transformers/4.37.2-foss-2021a-CUDA-11.3.1

python transformers_lmm.py
Then create the "transformers_lmm.py" file:
from transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

# Specify the model
model_id = "mistralai/Mistral-7B-v0.1"

# Set "local_files_only" to skip online lookup attempts
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", load_in_4bit=True, local_files_only=True
)
tokenizer = AutoTokenizer.from_pretrained(model_id, padding_side="left", local_files_only=True)

print('... processing single input data')
model_inputs = tokenizer(["A list of colors: red, blue"], return_tensors="pt").to("cuda")
generated_ids = model.generate(**model_inputs, max_new_tokens=50)
print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0])

print('... processing batch input data')
# Most LLMs don't have a pad token by default
tokenizer.pad_token = tokenizer.eos_token
model_inputs = tokenizer(
    ["A list of colors: red, blue", "Portugal is"], return_tensors="pt", padding=True
).to("cuda")
generated_ids = model.generate(**model_inputs, max_new_tokens=50)
print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True))

print('... done')
Finally submit the job:
sbatch transformers.sm
This returns output along the lines of:
===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

and submit this information together with your error trace to:
https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
bin /cluster/software/EASYBUILD/Transformers/4.37.2-foss-2021a-CUDA-11.3.1/lib/python3.9/site-packages/bitsandbytes/libbitsandbytes_cuda113.so
CUDA SETUP: CUDA runtime path found: /cluster/software/EASYBUILD/CUDA/11.3.1/lib/libcudart.so.11.0
CUDA SETUP: Highest compute capability among GPUs detected: 8.0
CUDA SETUP: Detected CUDA version 113
CUDA SETUP: Loading binary /cluster/software/EASYBUILD/Transformers/4.37.2-foss-2021a-CUDA-11.3.1/lib/python3.9/site-packages/bitsandbytes/libbitsandbytes_cuda113.so...
Loading checkpoint shards: 100%|██████████| 2/2 [00:41<00:00, 20.85s/it]
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
... processing single input data
A list of colors: red, blue, green, yellow, orange, purple, pink, brown, black, white, gray, silver, gold, tan, beige, cream, ivory, tan, and more.
... processing batch input data
['A list of colors: red, blue, green, yellow, orange, purple, pink, brown, black, white, gray, silver, gold, tan, beige, cream, ivory, tan, and more.\n\nA list of colors: red, blue, green', 'Portugal is a country in southwestern Europe, on the Iberian Peninsula. It is the westernmost country of mainland Europe, being bordered by the Atlantic Ocean to the west and south and by Spain to the north and east.\n\nPort']
... done