UNIX Shell: First Impression

For this session, we assume that you have just logged in to Turing. As mentioned in the previous episode, you will be greeted with the UNIX shell prompt, which looks like this:

[YOUR_MIDAS_ID@turing1 ~]$

As a user, you interact with the UNIX shell by typing commands at this prompt. After typing a command, press Enter to execute it.

Because the UNIX shell is such an important part of working with HPC systems, this episode serves as a crash course on the shell and its tools. The material is kept short: just enough for newcomers to “survive” the (apparently) clunky interface used on many HPC systems. Take heart: despite the archaic look, these tools are very powerful and will make you very productive.

First Exploration with UNIX Shell

IMPORTANT: Please follow along all the commands shown below, as they are also meant to prepare your files for the hands-on exercises later on.

Just as on your laptop, data on an HPC system is organized into files and (sub)directories. (Some operating systems use the name “folder” for a directory; the two terms are interchangeable.)

There are three life-saver commands in the UNIX shell that you must always remember:

    pwd (short for “print working directory”) shows the current working directory of your shell.

    cd (short for “change directory”) changes the current working directory of your shell.

    ls (short for “list”) lists the contents of the current working directory.

These three commands correspond to the most frequent actions in a graphical file explorer (such as Windows Explorer or Finder): knowing where we are in the folder structure, changing folders, and listing a folder’s contents.

pwd – Current working directory

When you first log in to a computer, you are placed in a directory designated as your home directory, denoted by the ~ character in the prompt. This ~ is a common shortcut in UNIX and can be used in many places when working in a UNIX shell. But what is the actual location of your home directory? To find out, we invoke the pwd command:

[YOUR_MIDAS_ID@turing1 ~]$ pwd
/home/YOUR_MIDAS_ID

So, /home/YOUR_MIDAS_ID is your home directory.
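
The relationship between ~, your home directory, and pwd can be checked with a few harmless commands. This is a minimal sketch, assuming a bash-like shell in which the HOME environment variable is set (the usual case after logging in); HOME is not mentioned above and is introduced here only for illustration.

```shell
pwd              # print wherever the shell currently is
echo ~           # show the path the shell substitutes for ~
echo "$HOME"     # the HOME variable normally holds the same path
cd ~ && pwd      # after moving to ~, pwd reports your home directory
```

None of these commands change any files, so they are safe to try at any time.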

cd – Change directory

Now let us use the cd command to move to another directory.

[YOUR_MIDAS_ID@turing1 ~]$ cd /scratch-lustre/DeapSECURE/module01
[YOUR_MIDAS_ID@turing1 module01]$

EXERCISE: Verify that your current working directory is indeed what you expected.

NOTE: Please remember this directory. It is the location where we store shared files for our CI training.

    File and directory names are case sensitive!

    Please be aware that in Linux and UNIX, file names are case sensitive. For example, you have to use capital “D” and “SECURE” in the example above.

Here is a nifty trick: the cd command with no argument returns you to your home directory. This is handy in many situations.
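
The behavior of cd described above can be tried safely in a throwaway directory. This is a sketch under stated assumptions: the directory is created with mktemp (not part of the lesson), the name DeapSECURE is reused only as an example, and the failing cd assumes a case-sensitive file system, which is the norm on Linux.

```shell
# Create a scratch area so nothing in your account is touched.
workdir=$(mktemp -d)
mkdir "$workdir/DeapSECURE"

# Exact capitalization: the change of directory succeeds.
cd "$workdir/DeapSECURE"
here=$(basename "$PWD")
echo "now in: $here"

# Wrong capitalization: on a case-sensitive file system this fails,
# because "deapsecure" and "DeapSECURE" are different names.
cd "$workdir/deapsecure" || echo "cd failed: the capitalization does not match"

# No argument: back to the home directory.
cd && pwd
```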

EXERCISE: Please go back to your home directory now.

ls – List the contents of a directory

Use the ls command to list the contents of the current working directory. Go back to the /scratch-lustre/DeapSECURE/module01 directory, then invoke the ls command and see what happens:

[YOUR_MIDAS_ID@turing1 module01]$ ls
bandwidt_test.tar.gz  Exercises  geoip  spams

In this example, there are four objects residing in the module01 directory: one regular file (bandwidt_test.tar.gz) and three subdirectories (Exercises, geoip, and spams). In modern Linux distributions, the names are often color-coded (for example, blue may indicate a directory, while light green is often used for executable files).

You can supply file or directory names as arguments to ls:

    Given a directory, ls will print the contents of the specified directory:

    [YOUR_MIDAS_ID@turing1 module01]$ ls Exercises
    Bandwidth_test  Spam_analyser  Unix

    Given a file, ls will print back the file name, if it exists:

    [YOUR_MIDAS_ID@turing1 module01]$ ls bandwidt_test.tar.gz
    bandwidt_test.tar.gz
    [YOUR_MIDAS_ID@turing1 module01]$ ls doesthisexist
    ls: cannot access doesthisexist: No such file or directory

    You can specify more than one object to show:

    [YOUR_MIDAS_ID@turing1 module01]$ ls bandwidt_test.tar.gz  Exercises
    bandwidt_test.tar.gz

    Exercises:
    Bandwidth_test  Spam_analyser  Unix

    You can use the * wildcard to specify only part of a file name. This is very useful when looking for files with a certain extension, or files that have a certain substring in their names:

    [YOUR_MIDAS_ID@turing1 module01]$ ls *.gz
    bandwidt_test.tar.gz
    [YOUR_MIDAS_ID@turing1 module01]$ ls band*
    bandwidt_test.tar.gz
    [YOUR_MIDAS_ID@turing1 module01]$ ls *and*
    bandwidt_test.tar.gz
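
To experiment with these forms of ls without touching the shared directory, the following sketch builds a small throwaway directory first. All file and directory names here are made up for illustration, and mktemp and touch (which creates empty files) are not part of the lesson above.

```shell
# Build a small playground in a temporary directory.
workdir=$(mktemp -d)
cd "$workdir"
touch report.txt notes.txt data.csv archive.tar.gz
mkdir results
touch results/run1.log

ls                  # everything in the current directory
ls results          # the contents of one directory
ls data.csv         # a single file: ls echoes the name back
ls *.txt            # names ending in .txt
ls *a*              # names containing the letter "a"
```

The shell expands the wildcard before ls runs, so ls receives the matching names as ordinary arguments. If a wildcard matches a directory, ls lists that directory's contents rather than its name.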

UNIX commands come with a rich set of options (sometimes called flags) that extend what the commands can do. For the ls command, the -l flag prints much more information about each file or directory.

  [YOUR_MIDAS_ID@turing1 module01]$ ls -l
  total 16
  -rwx------ 1 ingat001 DeapSECURE 2333 Oct 15 16:41 bandwidt_test.tar.gz
  drwxrwxr-x 5 wpurwant DeapSECURE 4096 Oct 18 13:19 Exercises
  drwxrwxr-x 3 wpurwant DeapSECURE 4096 Oct  5 13:53 geoip
  drwxrwxr-x 3 wpurwant DeapSECURE 4096 Oct 17 13:39 spams

Each file is printed with the following attributes (let’s take file bandwidt_test.tar.gz as an example):

    File attribute and permission bits (-rwx------)
    Link count (don’t worry about this for now)
    File owner (ingat001)
    File group (DeapSECURE)
    File size (2333 bytes)
    File modification time (October 15, at 16:41)
    File name (bandwidt_test.tar.gz)

As you can see, directories have d as the first character of the attribute field.

The -l flag can be combined with file names. (By UNIX convention, flags come before the file and directory arguments.)

  [YOUR_MIDAS_ID@turing1 module01]$ ls -l *.gz
  -rwx------ 1 ingat001 DeapSECURE 2333 Oct 15 16:41 bandwidt_test.tar.gz
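
The first character of the attribute field can be inspected directly. The sketch below uses made-up names in a temporary directory, and relies on two tools not introduced above: cut -c1, which keeps only the first character of each line, and the -d flag of ls, which describes a directory itself instead of listing its contents.

```shell
workdir=$(mktemp -d)
cd "$workdir"
mkdir Exercises
echo "hello" > notes.txt

ls -l                             # long listing of both objects

# Keep only the first character of the long listing:
ls -ld Exercises | cut -c1        # "d" marks a directory
ls -l notes.txt | cut -c1         # "-" marks a regular file
```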

    Exercises

        What are the contents of your home directory?

        Enter the Exercises subdirectory and list the contents of each subdirectory it contains.

Too Much Typing? Tab to the Rescue

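Typing long paths is tedious and error-prone. Press the Tab key after typing the first few characters of a command, file, or directory name, and the shell will complete the name for you if there is only one match. If several names match, pressing Tab twice lists the candidates so you can type a few more characters and try again. For example, typing cd /scratch-lustre/Deap and pressing Tab completes the path to /scratch-lustre/DeapSECURE/, provided no other name in /scratch-lustre starts with Deap.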

mkdir – Make directory

Let’s create a directory called CItraining in your home directory, then a subdirectory called module1 inside it. mkdir is the command to do that:

  [YOUR_MIDAS_ID@turing1 ~]$ mkdir ~/CItraining
  [YOUR_MIDAS_ID@turing1 ~]$ cd ~/CItraining
  [YOUR_MIDAS_ID@turing1 CItraining]$ mkdir module1

Note that the ~/ prefix stands for your home directory.

    You can substitute /home/YOUR_MIDAS_ID/ in place of ~/. This is necessary outside the shell, as many programs do not know how to interpret the ~ at the beginning of a path name.

    The ~/ prefix is not necessary if you are already in the home directory.
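
The point about ~ being a shell feature can be seen with echo, which simply prints its arguments. This is a hedged sketch: it assumes a bash-like shell with HOME set, and the second half repeats the two-step mkdir in a temporary directory (created with mktemp, an addition for illustration) so that your real home directory is left alone.

```shell
echo ~/CItraining          # unquoted: the shell expands ~ before echo runs
echo "~/CItraining"        # quoted: the ~ is passed through literally
echo "$HOME/CItraining"    # a spelling that also works inside quotes

# The same two-step directory creation, in a throwaway location.
workdir=$(mktemp -d)
mkdir "$workdir/CItraining"
cd "$workdir/CItraining"
mkdir module1
ls -l
```

A program that receives the literal string ~/CItraining, for example from a configuration file, sees the second form and usually has no idea what ~ means.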

cp – Copy files and directories

The cp command is used to copy one or more files and/or directories. The following activities show some capabilities of cp.

IMPORTANT: We are going to use the files copied here in the subsequent exercises. Please follow along and run these commands in your HPC account.

EXERCISE: We assume that you are in the ~/CItraining/module1 directory. If you don’t have that directory yet, please create it and change into it.

    Copying a single file: let’s copy a file named wrong_file.txt from the subdirectory /scratch-lustre/DeapSECURE/module01/Exercises/Unix:

    [YOUR_MIDAS_ID@turing1 module1]$ cp /scratch-lustre/DeapSECURE/module01/Exercises/Unix/wrong_file.txt .

    The period at the end is important: it stands for the current directory.

    Copying a single file with a new name:

    [YOUR_MIDAS_ID@turing1 module1]$ cp wrong_file.txt wrong2.txt

    Copying multiple files:

    [YOUR_MIDAS_ID@turing1 module1]$ cp /scratch-lustre/DeapSECURE/module01/Exercises/Unix/input*.txt .

    Copying an entire subdirectory: we’ll also want to copy the directory Spam_analyser from the DeapSECURE shared location into your own directory:

    [YOUR_MIDAS_ID@turing1 module1]$ cp -r /scratch-lustre/DeapSECURE/module01/Exercises/Spam_analyser .

    The -r flag means “recursive copy”: it copies everything in that directory, including any subdirectories found in it.

    EXERCISE: Please also copy the Slurm and Unix directories from the shared location into your own directory.

EXERCISE: Check the final contents of the module1 directory after all the commands above have completed.
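
The four forms of cp shown above can be rehearsed without access to the shared location. In this sketch the two temporary directories are stand-ins (assumptions for illustration): shared plays the role of the DeapSECURE shared directory and mine plays the role of ~/CItraining/module1. The file contents are made up.

```shell
shared=$(mktemp -d)    # stand-in for the shared location
mine=$(mktemp -d)      # stand-in for ~/CItraining/module1

# Populate the stand-in shared directory.
mkdir "$shared/Unix"
echo "first"  > "$shared/Unix/input1.txt"
echo "second" > "$shared/Unix/input2.txt"
echo "oops"   > "$shared/Unix/wrong_file.txt"

cd "$mine"
cp "$shared/Unix/wrong_file.txt" .     # one file into the current directory
cp wrong_file.txt wrong2.txt           # one file under a new name
cp "$shared"/Unix/input*.txt .         # several files matched by a wildcard
cp -r "$shared/Unix" .                 # a whole directory, recursively

ls
ls Unix
```

Note that the wildcard sits outside the quotes; a quoted * would be taken literally instead of being expanded by the shell.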

mv – Moving and renaming

Moving and renaming files and directories can be done using the mv command.

    Renaming a single file:

    [YOUR_MIDAS_ID@turing1 module1]$ mv wrong_file.txt correct_file.txt

    Renaming a single directory:

    [YOUR_MIDAS_ID@turing1 module1]$ mv Unix Unix-tests

    Moving one or more files to another directory (Slurm):

    [YOUR_MIDAS_ID@turing1 module1]$ mv input*.txt Slurm

    or

    [YOUR_MIDAS_ID@turing1 module1]$ mv input1.txt input2.txt input3.txt Slurm

    Either command moves all the input files to the Slurm directory. mv can also move a directory, or a mix of files and directories, as long as the destination directory (to which the other files and directories are moved) is given last.
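
The renaming and moving steps can be tried on empty placeholder files. This is a sketch in a temporary directory; the files are created with touch purely for illustration and have no content.

```shell
workdir=$(mktemp -d)
cd "$workdir"
touch wrong_file.txt input1.txt input2.txt input3.txt
mkdir Unix Slurm

mv wrong_file.txt correct_file.txt    # rename a file
mv Unix Unix-tests                    # rename a directory
mv input*.txt Slurm                   # move several files into a directory

ls          # correct_file.txt, Slurm and Unix-tests remain here
ls Slurm    # the three input files now live here
```

The last mv works as a move only because Slurm already exists as a directory; mv decides between renaming and moving based on whether the final argument is an existing directory.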

rm and rmdir – Deleting files and directories

The rm command deletes a file. For example, we can delete the extra wrong2.txt file created above.

    Deleting a single file:

    [YOUR_MIDAS_ID@turing1 module1]$ rm wrong2.txt

    File Deletion is Permanent!

    In the UNIX world there is no concept of “Recycle Bin” or “Trash Bin”. Once a file is deleted, it is permanently inaccessible. Therefore, always use rm with extra care!

The rmdir command can be used to delete an empty directory. It refuses to delete a directory that is not empty. Example activity:

  [YOUR_MIDAS_ID@turing1 module1]$ cp -r Unix-tests testdir2
  [YOUR_MIDAS_ID@turing1 module1]$ ls testdir2
  [YOUR_MIDAS_ID@turing1 module1]$ rmdir testdir2
  rmdir: failed to remove `testdir2': Directory not empty
  [YOUR_MIDAS_ID@turing1 module1]$ rm testdir2/*
  [YOUR_MIDAS_ID@turing1 module1]$ rmdir testdir2

In the example above we create a copy of the Unix-tests directory named testdir2. To delete the copied directory, the files in testdir2 must be removed before rmdir can work.
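
Because deletion is permanent, one cautious habit is to preview what a wildcard matches with ls before handing the same pattern to rm. The sketch below shows this in a temporary directory with made-up file names; the echo messages are only there to make the outcome of each step visible.

```shell
workdir=$(mktemp -d)
cd "$workdir"
mkdir testdir2
touch testdir2/a.txt testdir2/b.txt

# rmdir refuses to remove a directory that still has contents.
rmdir testdir2 || echo "rmdir refused: testdir2 is not empty"

ls testdir2/*     # preview exactly which files the pattern matches
rm testdir2/*     # then delete them

# Now that the directory is empty, rmdir succeeds.
rmdir testdir2 && echo "testdir2 removed"
```

Note that * does not match names beginning with a period, so a directory holding such hidden files would still count as non-empty after rm testdir2/*.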
