Description
State-of-the-art Machine Learning for PyTorch, TensorFlow and JAX.
Home page
https://github.com/huggingface/transformers
Documentation
https://huggingface.co/docs/transformers/
License
Usage
Transformers uses the PyTorch backend and should ideally be run on the GPU nodes (using the "cuda" module). It is configured for offline use and points to a local Hugging Face cache.
Use
module avail Transformers
to see which versions of Transformers are available. Use
module load Transformers/version
to get access to Transformers.
Huggingface cache
The module sets environment variables (HF_DATASETS_OFFLINE, HF_HUB_OFFLINE, TRANSFORMERS_OFFLINE) for offline usage in TSD.
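As an illustration of what the module configures for you, the sketch below shows the same offline variables being set manually before any Hugging Face library is imported. The values are assumptions about the module's settings; inside TSD the module exports them automatically, so you normally never need to do this yourself.

```python
import os

# Illustrative manual setup: the Transformers module exports these for you.
# Shown here only to clarify which variables control offline behaviour.
offline_vars = {
    "HF_DATASETS_OFFLINE": "1",   # datasets library: no network lookups
    "HF_HUB_OFFLINE": "1",        # huggingface_hub: no network lookups
    "TRANSFORMERS_OFFLINE": "1",  # transformers: resolve from local cache only
}
os.environ.update(offline_vars)

# Any later `from_pretrained(...)` call will now read from the local cache
# instead of trying to reach huggingface.co.
for name in offline_vars:
    print(f"{name}={os.environ[name]}")
```

These variables must be set before the libraries are imported, which is why the module handles them at load time.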
The module sets "HF_HOME=/cluster/software/huggingface" to point to a local repository of models and data. Models and data are distributed under various licenses, so please read the license of a model prior to using it. To have models or data added to the central repository, submit a request to tsd-drift.
You can list the available models using the following command:
huggingface-cli scan-cache
Overwrite the "HF_HOME" environment variable to point to your own imported models/data:
export HF_HOME=/path/to/your/model
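A hedged sketch of this workflow (HF_HOME is the variable name used by current Hugging Face tooling; the path is a placeholder you would replace with your own import location):

```shell
# Point HF_HOME at a directory you control (placeholder path).
export HF_HOME="$HOME/my-hf-cache"
mkdir -p "$HF_HOME"

# Verify the variable is set.
echo "HF_HOME is $HF_HOME"

# List what is cached there, if huggingface-cli is on PATH.
if command -v huggingface-cli >/dev/null 2>&1; then
    huggingface-cli scan-cache
else
    echo "huggingface-cli not available in this environment"
fi
```

After exporting the variable, subsequent `from_pretrained(...)` calls resolve models from that directory instead of the central repository.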
Example
Please note that the example below is model specific; scripts will require modification to fit the needs of your analysis and data. The PyTorch default device is the CPU, so the CUDA device needs to be selected in the code. Please consult the model documentation on Hugging Face for details and examples.
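A minimal sketch of such device selection, assuming PyTorch is available (on a CPU-only node this falls back to "cpu"; on the GPU nodes it picks "cuda"):

```python
# Hypothetical device-selection snippet: pick "cuda" when a GPU is visible,
# otherwise fall back to the CPU so the script still runs on login nodes.
try:
    import torch
    device = "cuda" if torch.cuda.is_available() else "cpu"
except ImportError:  # PyTorch not installed in this environment
    device = "cpu"

print(f"using device: {device}")
# Tensors and model inputs would then be moved explicitly, e.g.
# model_inputs = tokenizer([...], return_tensors="pt").to(device)
```

The example script below instead hard-codes "cuda", which is fine inside a Slurm job that requests a GPU but will fail on nodes without one.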
The following example from Huggingface will help you to get started. Create the following Slurm script "transformers.sm":
#!/bin/bash
#SBATCH --job-name=transformers
#SBATCH --account=pXX_tsd
#SBATCH --partition=accel
#SBATCH --gres=gpu:1
#SBATCH --time=0:05:00
#SBATCH --mem-per-cpu=4G

set -o errexit

module purge --quiet
module load Transformers/4.37.2-foss-2021a-CUDA-11.3.1

python transformers_lmm.py
Then create the "transformers_lmm.py" file:
from transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

# Specify the model
model_id = "mistralai/Mistral-7B-v0.1"

# Set "local_files_only" to skip online lookup attempts
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", load_in_4bit=True, local_files_only=True
)
tokenizer = AutoTokenizer.from_pretrained(model_id, padding_side="left", local_files_only=True)

print('... processing single input data')
model_inputs = tokenizer(["A list of colors: red, blue"], return_tensors="pt").to("cuda")
generated_ids = model.generate(**model_inputs, max_new_tokens=50)
print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0])

print('... processing batch input data')
# Most LLMs don't have a pad token by default
tokenizer.pad_token = tokenizer.eos_token
model_inputs = tokenizer(
    ["A list of colors: red, blue", "Portugal is"], return_tensors="pt", padding=True
).to("cuda")
generated_ids = model.generate(**model_inputs, max_new_tokens=50)
print(tokenizer.batch_decode(generated_ids, skip_special_tokens=True))

print('... done')
Finally submit the job:
sbatch transformers.sm
This returns output along the lines of:
===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please run

python -m bitsandbytes

and submit this information together with your error trace to:
https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
bin /cluster/software/EASYBUILD/Transformers/4.37.2-foss-2021a-CUDA-11.3.1/lib/python3.9/site-packages/bitsandbytes/libbitsandbytes_cuda113.so
CUDA SETUP: CUDA runtime path found: /cluster/software/EASYBUILD/CUDA/11.3.1/lib/libcudart.so.11.0
CUDA SETUP: Highest compute capability among GPUs detected: 8.0
CUDA SETUP: Detected CUDA version 113
CUDA SETUP: Loading binary /cluster/software/EASYBUILD/Transformers/4.37.2-foss-2021a-CUDA-11.3.1/lib/python3.9/site-packages/bitsandbytes/libbitsandbytes_cuda113.so...
Loading checkpoint shards: 100%|██████████| 2/2 [00:41<00:00, 20.85s/it]
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
Setting `pad_token_id` to `eos_token_id`:2 for open-end generation.
... processing single input data
A list of colors: red, blue, green, yellow, orange, purple, pink, brown, black, white, gray, silver, gold, tan, beige, cream, ivory, tan, and more.
... processing batch input data
['A list of colors: red, blue, green, yellow, orange, purple, pink, brown, black, white, gray, silver, gold, tan, beige, cream, ivory, tan, and more.\n\nA list of colors: red, blue, green', 'Portugal is a country in southwestern Europe, on the Iberian Peninsula. It is the westernmost country of mainland Europe, being bordered by the Atlantic Ocean to the west and south and by Spain to the north and east.\n\nPort']
... done