Workshop Resources
The machine learning and neural network lessons share the same set of research problems, datasets, and hands-on files. The notebooks linked below are for this neural network lesson, but the ZIP files contain everything needed to learn both lessons.
Obtaining Hands-on Materials
If you are taking this training using ODU’s Wahab cluster, please read through the instructions on launching a Jupyter session via Open OnDemand and copying the hands-on files in order to set up your own copy of the files in your own home directory on the cluster.
The downloadable resources below are made available here for the general public to use on their own computers. These were taken from the online workshop series in the Summer of 2021 (a.k.a. “WS-2020-2021”).
To download the notebooks and the hands-on files, please right-click on the links below and select “Save Link As…” or a similar menu item.
Resources: Jupyter Notebooks
- Session 1: Binary Classification with Keras - (html)
- Session 2: Classifying Smartphone Apps with Keras - (html)
(The HTML files were provided for convenient web viewing.)
Resources: Hands-on Package
- Sherlock hands-on files for ML and NN lessons, except the large files (table of contents) – This also contains the Jupyter notebooks above
- Sherlock large dataset: sherlock_2apps (table of contents)
- Sherlock large dataset: sherlock_18apps (table of contents)
The hands-on files are packed in ZIP format. All three ZIP files above are required. To reconstitute the hands-on directory, unzip all three files into the same destination directory, preserving the paths stored in the archives.
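The unpacking step above can be sketched as a small shell function. This is a minimal sketch, assuming the standard `unzip` tool is installed; the function name `unpack_all` and the archive names in the usage comment are placeholders, not the actual download names.

```shell
# Sketch: unpack several ZIP archives into ONE destination directory,
# keeping the directory paths stored inside each archive.
# Usage: unpack_all DEST ARCHIVE.zip [ARCHIVE.zip ...]
# (unpack_all is a hypothetical helper, not part of the lesson files.)
unpack_all() {
    local dest=$1
    shift
    mkdir -p "$dest" || return 1
    local archive
    for archive in "$@"; do
        if [ ! -f "$archive" ]; then
            echo "missing archive: $archive" >&2
            return 1
        fi
        # -q: quiet; -n: never overwrite a file that already exists in DEST;
        # -d: extract into DEST. Paths are preserved because -j is NOT used.
        unzip -q -n "$archive" -d "$dest" || return 1
    done
}

# Example call (archive names are placeholders):
# unpack_all ~/CItraining/module-ml handson.zip large-2apps.zip large-18apps.zip
```

The `-n` option is a cautious choice: rerunning the function will not clobber notebooks you have already edited.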
Setting Up Hands-On Files
Taking Both ML and NN Lessons?
If you have recently done the hands-on activities from DeapSECURE’s ML lesson, then you already have all the files needed for the hands-on learning of this (NN) lesson; you can skip the setup procedure below.
The DeapSECURE hands-on exercises can be run on many platforms. They were initially created and tested for ODU’s Wahab cluster, but they can be adapted to other HPC clusters. They can also be run on a sufficiently powerful local computer (desktop/laptop) with a standalone Python distribution such as Anaconda. Please find below the instructions for the platform you will be using. Your instructor or mentor should have told you which platform to use.
Preparing Hands-on Files on ODU Wahab Cluster
To prepare for the exercises on Wahab, please run the following commands in the shell. (This can be done using a terminal session under SSH, or a terminal session within Jupyter.)
On Wahab, the hands-on files are located in this directory:
/shared/DeapSECURE/module-ml/
(For Turing, the location is /scratch-lustre/DeapSECURE/module-ml/Exercises.)
Create the directory ~/CItraining/module-ml:
$ mkdir -p ~/CItraining/module-ml
Copy the entire directory tree to your ~/CItraining/module-ml directory:
$ cp -a /shared/DeapSECURE/module-ml/. ~/CItraining/module-ml/
Be careful! Every character matters, including the period at the end of the source path. Do NOT insert whitespace where the command above has none!
Now change directory to ~/CItraining/module-ml,
$ cd ~/CItraining/module-ml
and you are ready to learn! If you are using the Jupyter notebooks (see the resources near the top of this page), navigate your Jupyter’s file browser to this directory and select the appropriate notebook to open.
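To see why the trailing period in the copy command matters, the following sketch reproduces the copy with throwaway directories. It is only an illustration: the directory and file names are made up, and it assumes a `cp` that supports `-a` (GNU and BSD versions do).

```shell
# Sketch: what "cp -a SRC/. DST/" does, using temporary demo directories.
demo=$(mktemp -d)
mkdir -p "$demo/src/sub" "$demo/dst"
echo "visible" > "$demo/src/sub/data.txt"
echo "hidden"  > "$demo/src/.hidden"

# "SRC/." means "the contents of SRC", hidden files included.
# The files land directly in DST; no extra DST/src level is created.
cp -a "$demo/src/." "$demo/dst/"

# List everything that arrived, including dot-files.
ls -A "$demo/dst"

# diff -r exits with status 0 only when both trees have identical content.
if diff -r "$demo/src" "$demo/dst" >/dev/null; then
    echo "copy matches the source"
fi
```

The same `diff -r` check can be run against the real source and destination directories after the copy if you want to confirm that nothing was missed.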
Obtaining Compute Resource (Non-Jupyter)
DeapSECURE lessons can also be carried out without the Jupyter platform. While it is possible to use the plain python interpreter for learning, we recommend that learners use ipython at a minimum, because it offers autocompletion, command history, and shell-like facilities.
In this workshop, we will begin by training neural networks interactively, which is a computationally intensive process. For this reason, we must do our hands-on activities on a compute node. (In the commands below, -c 2 requests two CPU cores and -t 1-0 limits the session to one day. We intentionally cap the session at one day; you can increase or decrease the session time as needed.)
For the Wahab cluster:
$ salloc -c 2 -t 1-0
For the Turing cluster:
$ salloc -c 2 -C AVX2 -t 1-0
(We request a compute node that supports AVX2. AVX2 is a set of vector instructions that can significantly speed up machine learning computations.)
For all other clusters with a SLURM job scheduler, the following command may work (check with your instructor or local cluster documentation):
$ srun --pty --preserve-env -c 2 -t 1-0 /bin/bash
Notice that the host name printed on the shell prompt changes; this indicates that we are now logged on to a compute node.
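Besides watching the prompt, the shell can be asked directly whether it is running inside a job. The sketch below relies on the `SLURM_JOB_ID` environment variable, which Slurm sets inside a job allocation; the function name `in_slurm_job` is a hypothetical helper. Note that the variable only shows that an allocation exists, so the host name is printed as well to show where the shell is actually running.

```shell
# Sketch: report whether this shell is running inside a Slurm allocation.
# in_slurm_job is a hypothetical helper name.
in_slurm_job() {
    # SLURM_JOB_ID is set by Slurm for shells started under salloc/srun/sbatch.
    [ -n "${SLURM_JOB_ID:-}" ]
}

if in_slurm_job; then
    echo "Inside Slurm job $SLURM_JOB_ID on host $(uname -n)"
else
    echo "No Slurm job detected on host $(uname -n); request a compute node first"
fi
```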
Setting Up Software Environment
Keras and TensorFlow have many dependencies.
In addition, the DeapSECURE lesson requires other Python libraries such as pandas and scikit-learn.
After you obtain the compute resource, you will need to load a number of modules.
Wahab Cluster
On Wahab, we have prepared a custom environment module called “DeapSECURE”. Once this is loaded, you only need to load the TensorFlow module, then you are all set:
module load DeapSECURE
module load py-tensorflow
Turing Cluster
The following is an example for the Turing cluster:
enable_lmod
module load python/3.6
module load numpy scipy
module load pandas
module load scikit-learn
module load ipython
module load matplotlib
module load cuda/9.1
module load tensorflow/1.10
module load keras/2.2
We created a shell include file named keras-env-py3 in your hands-on directory to make it easier to reload these modules later on.
To take advantage of this, please issue this command in the same directory as before:
$ source keras-env-py3
Do this only once per shell session, right after you obtain the compute resource.
Other Clusters
The following software and libraries are the prerequisites for the hands-on activities of the ML lesson:
- Python 3 (3.6 or later)
- Jupyter (or JupyterLab)
- ipython (included in Jupyter)
- pandas (1.0 or later)
- scikit-learn
- matplotlib
- seaborn
- tensorflow (1.13 or later; version 2.x is highly recommended)
- keras (included in tensorflow)
We recommend that you get the newest version of each package. Should you encounter an issue with the software, please file an issue on GitLab.
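Once the packages are installed (or the modules are loaded), a quick import check can confirm that the prerequisites listed above are visible to Python. This is a minimal sketch: the `PYTHON` variable and the `check_py_module` helper are assumptions made for illustration, and the interpreter name may differ on your system.

```shell
# Sketch: check that the Python packages needed by the lesson can be imported.
# PYTHON is an assumed variable; point it at the interpreter you actually use.
PYTHON=${PYTHON:-python3}

# check_py_module (hypothetical helper) succeeds if the named module imports;
# it prints the module's version when the module defines __version__.
check_py_module() {
    "$PYTHON" -c '
import importlib, sys
mod = importlib.import_module(sys.argv[1])
print(sys.argv[1], getattr(mod, "__version__", "(no __version__)"))
' "$1" 2>/dev/null
}

# Note: scikit-learn is imported under the name "sklearn".
for name in pandas sklearn matplotlib seaborn tensorflow; do
    check_py_module "$name" || echo "$name: NOT importable"
done
```

Any package reported as not importable needs to be installed, or its module loaded, before starting the hands-on activities.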