This lesson is in the early stages of development (Alpha version)

DeapSECURE module 1: Introduction to HPC

This is the first module of ODU’s DeapSECURE cyberinfrastructure training. This module introduces High-Performance Computing (HPC) clusters and is complemented by many hands-on exercises. Our final goal in this module is to perform basic analysis on a massive set of spam emails.

Contents & Roadmap

  1. Introduction to HPC: the background
  2. Accessing an HPC system: hands-on with the Turing cluster
  3. UNIX shell interaction
  4. Text processing with UNIX shell
  5. Speeding up massive data processing: an example of spam email analysis
  6. Going forward

Where are we going?

First, we introduce what an HPC system is and how to access one (the ODU Turing cluster). Next, we present a crash course, or refresher, on the UNIX shell. We then use that UNIX shell knowledge to write a simple pipeline that processes a very large number of spam emails and extracts some statistics about them.
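
To give a flavor of the kind of pipeline meant here, the following is a minimal sketch and not the pipeline built in the lesson itself. It assumes each email is a plain text file with a `From:` header line. The sample files, the temporary directory, and the function name `count_sender_domains` are all hypothetical and exist only for illustration.

```shell
# Hypothetical sample data: three tiny "emails" in a temporary directory.
workdir=$(mktemp -d)
printf 'From: alice@example.com\nSubject: Win a prize\n\nbody\n' > "$workdir/mail1.txt"
printf 'From: bob@example.org\nSubject: Cheap loans\n\nbody\n' > "$workdir/mail2.txt"
printf 'From: carol@example.com\nSubject: Hello\n\nbody\n' > "$workdir/mail3.txt"

# Illustrative pipeline (an assumption, not the lesson's own code):
#   grep -h  : print the From: lines without file names
#   cut      : keep the part after "@", i.e. the sender's domain
#   sort     : group identical domains together
#   uniq -c  : count how many times each domain occurs
#   sort -rn : list the most frequent domain first
count_sender_domains() {
    grep -h '^From:' "$1"/*.txt | cut -d '@' -f 2 | sort | uniq -c | sort -rn
}

count_sender_domains "$workdir"
```

Each tool does one small job, and the `|` symbol passes the output of one tool to the next. With the sample files above, `example.com` is listed first because it appears in two of the three messages.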

Prerequisites

  • Basic computer interaction. Students should know how to interact with a computer using a keyboard.
  • Basic concepts such as directories, files, and paths.
  • Basic text editing skills. Students should know how to input text, issue commands…

Schedule

Setup  Download files required for the lesson
00:00  1. Introduction to High-Performance Computing
         • What is a High-Performance Computing (HPC) system?
         • Who uses HPC systems?
         • Why HPC?
00:10  2. Spam: Everyone's Cybersecurity Issue
         • What is spam?
         • What are the different types of spam?
         • What are the problems posed by spam?
         • How does spam indicate cybersecurity problems?
         • Why do we need a powerful supercomputer to analyze a massive collection of spam emails?
00:20  3. Accessing HPC
         • How do we access a modern HPC system?
         • How do we interact with a basic HPC interface?
00:30  4. Basic Shell Interaction
         • How do we interact with a UNIX shell, the basic HPC interface?
         • What is a file system?
         • How do we navigate around files and directories from a UNIX shell?
         • How do we manage files and directories from a UNIX shell?
         • How do we work with text files from a UNIX shell?
         • How do we get help on UNIX shell commands?
01:20  5. Text Processing Tools & Pipeline
         • How do we process text-based information using UNIX tools?
         • How do we build a processing pipeline by combining UNIX tools?
01:50  6. Task Automation with Scripts
         • How can we repeat the same or similar set of commands over and over?
02:15  7. My First HPC Computation: Spam Mail Analysis
         • How do we run a serial computation on a modern HPC system?
02:30  8. Parallel Processing 1: UNIX Background Process
         • How do we launch simultaneous computations on a UNIX system?
03:05  9. Parallel Processing 2: Using GNU `parallel`
         • How do we launch many computations on a modern HPC system?
03:40  Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.