Python
We will use Python 3, version 3.7 or higher. Use this version whenever a mandatory assignment requires Python code.
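A quick way to confirm that an interpreter meets this requirement is to compare sys.version_info with the minimum version. The following is a minimal sketch; the helper name meets_requirement is made up for illustration.

```python
import sys

MINIMUM = (3, 7)  # minimum version stated for the course


def meets_requirement(version_info=sys.version_info, minimum=MINIMUM):
    """Return True if the (major, minor) part of the version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


# Report the running interpreter's version and whether it is new enough.
print(sys.version.split()[0], "OK" if meets_requirement() else "too old")
```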
IPython or Jupyter notebook
We recommend working interactively with Python. The IPython package adds a lot of functionality, including auto-indentation and command completion with the tab key. The help functions ? and ?? are also useful, and so are the magic commands.
Try, for example:
- import nltk
- nltk.FreqD #and hit tab
- nltk.FreqDist?
- nltk.FreqDist??
- history?
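Outside IPython, roughly the same information as ? and ?? can be obtained with the standard inspect module. The sketch below uses collections.Counter (the class that nltk.FreqDist builds on) so that it runs without nltk installed; the helper names short_help and source_of are invented for this example.

```python
import inspect
from collections import Counter


def short_help(obj):
    """Roughly what `obj?` leads with: the first line of the docstring."""
    doc = inspect.getdoc(obj) or ""
    return doc.splitlines()[0] if doc else ""


def source_of(obj):
    """Roughly what `obj??` adds: the source code, when it is available."""
    try:
        return inspect.getsource(obj)
    except (OSError, TypeError):
        return None  # e.g. built-ins implemented in C have no Python source


print(short_help(Counter))
print(source_of(Counter)[:200])  # first 200 characters of the class source
```

In an interactive session, the ? and ?? forms are more convenient; the functions above are mainly useful in scripts.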
You can work with ipython either in the terminal window, or through a Jupyter notebook.
Toolkits and packages
During the course, we will become familiar with several toolkits and packages.
The family of packages for numeric computing:
- numpy
- scipy
- matplotlib
- pandas
Machine learning packages:
- scikit-learn
- pytorch
NLP packages:
- nltk
- spacy
- gensim
(and maybe more)
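To see which of these packages are available in the current Python environment, one option is to ask the import system without actually importing anything. This is a minimal sketch; note that the import names differ from the package names in two cases (scikit-learn is imported as sklearn, pytorch as torch), and the function name available is made up for this example.

```python
from importlib.util import find_spec

# Import names of the packages listed above.
# scikit-learn is imported as `sklearn` and pytorch as `torch`.
PACKAGES = ["numpy", "scipy", "matplotlib", "pandas",
            "sklearn", "torch", "nltk", "spacy", "gensim"]


def available(names=PACKAGES):
    """Map each import name to True if the module can be found, else False."""
    return {name: find_spec(name) is not None for name in names}


for name, found in available().items():
    print(f"{name:12s} {'found' if found else 'missing'}")
```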
IFI machines
On an IFI terminal, the default Python is now
opt/ifi/anaconda3/bin/python3
This means that when you start a python/ipython/jupyter notebook session, you get access to all these packages.
In the course, we will use several corpora and data provided with nltk. To get easy access to them on the IFI-machines, you can put
- export NLTK_DATA=/projects/nlp/nltk_data
in your .bashrc file. Then you don't have to install the data to your own disk area.
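After opening a new terminal, you can check from Python that the variable is set and that the directory exists. The sketch below assumes that NLTK_DATA may hold several directories separated by the platform's path separator (os.pathsep); the function name nltk_data_dirs is made up for this example.

```python
import os


def nltk_data_dirs(environ=os.environ):
    """Return the directories listed in NLTK_DATA (empty list if unset)."""
    raw = environ.get("NLTK_DATA", "")
    return [path for path in raw.split(os.pathsep) if path]


dirs = nltk_data_dirs()
if not dirs:
    print("NLTK_DATA is not set")
for path in dirs:
    # Report whether each configured directory actually exists on this machine.
    print(path, "exists" if os.path.isdir(path) else "does not exist")
```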
Remote login
You may log in remotely to the IFI machines when working from home. The recommended solution is now to use VDI, see
/english/services/it/computer/vdi/
It will (eventually) give you access to the same environment as when you log in directly to an IFI Linux machine. However, you will not have access to /projects/nlp/, so you will have to download data from within nltk when needed.
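Downloading from within nltk is done with nltk.download. The following is a minimal sketch: the wrapper fetch and its download parameter are invented for illustration (they only make the example loadable without nltk), and "brown" is just one example of a corpus identifier.

```python
def fetch(name, download=None):
    """Download the nltk resource `name`.

    `download` is a hypothetical hook for this sketch; by default
    the real nltk.download function is used.
    """
    if download is None:
        import nltk  # imported lazily so the sketch loads without nltk
        download = nltk.download
    return download(name)


# Example call (needs nltk and a network connection):
# fetch("brown")
```

In an interactive session, calling nltk.download("brown") directly does the same thing.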
Installation on your own machine
You may install Python with all the packages on your own PC. For the first part of the semester, we recommend making a dedicated environment with the required packages. This is easily done with Anaconda or Miniconda, which manage the packages and the dependencies between them and make it easy to stay updated. It should work on Linux, macOS and Windows. See the instructions here. We will make an extended environment with new instructions for the last part of the semester.