This lesson is in the early stages of development (Alpha version)

Spam: Everyone's Cybersecurity Issue

Overview

Teaching: 10 min
Exercises: 0 min
Questions
  • What is spam?

  • What are the different types of spam?

  • What are the problems posed by spam?

  • How does spam indicate cybersecurity problems?

  • Why do we need a powerful supercomputer to analyze a massive collection of spam emails?

Objectives
  • Explain what spam is.

  • Explain the different types of spam emails and the motivation behind them.

  • Explain the cybersecurity issues posed by spam.

  • Explain the necessity of a powerful supercomputer to handle massive amounts of data.

Spam refers to unsolicited emails that typically contain unwanted advertisements, requests, or promotions. The primary goal of spam is to take advantage of recipients by enticing them to click on links, provide personal information, or part with their money. The main motivation of spam emails is often to obtain sensitive personal information, steal identities, or lure individuals into financial scams.

Here are some examples of spam email messages.

Image of a spam email

#TODO Include the picture of “Are you still alone?”

Types of Spam

Spam emails can be classified into four main types:

  1. Unsolicited Advertising: Involves sending emails containing advertisements, whether legitimate or junk, without the recipient’s consent.

  2. Scams: Use deceptive tactics to lure individuals into financial traps, often with messages like “I need money now” or by impersonating official entities such as the “ministry of petroleum of country X”.

  3. Phishing: Aims to trick recipients into sharing sensitive information by posing as a trustworthy source, such as fake messages claiming “your bank account has been suspended”.

  4. Malicious Attachments and Links: Includes emails with attachments or links to malicious websites designed to compromise the recipient’s computer security and gain unauthorized access.

What Are the Threats Posed by Spam Emails?

Spam is a significant problem for today’s society for several reasons. In its mildest form, spam is simply an annoyance: it inundates inboxes, causing frustration and wasting recipients’ time. Spam often leads to more serious harm, however. Recipients may be deceived into sending money or giving out financial information, personal identification numbers, and other sensitive data, which inflicts losses on both individuals and businesses. Finally, spam poses cybersecurity threats: malicious attachments or links can spread viruses or steal personal information.

How is spamming made possible?

Several factors make spamming possible. First, email systems have weak identity verification mechanisms, which makes it easy for senders to fake their identities and send spam to real users. Second, spammers collect email addresses through methods such as harvesting them from the internet and newsgroups, or simply guessing addresses at different domains, which lets them target large numbers of recipients. Lastly, spammers use compromised computers or accounts to send large volumes of spam that cannot easily be traced back to its true origin, which contributes to the widespread distribution of spam.

Why do we study spam in this class?

Every spam email is an indicator of a cybersecurity problem! As noted above, much spam originates from hacked machines or compromised email accounts.

By the end of this class, our goal is to study the statistics of a spam collection, the SPAM Archive dataset, and answer the following questions: From which countries did these spam emails originate, and how many spam emails came from each country?
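To make the target concrete, here is a minimal sketch of what the final tally might look like. It assumes each email has already been assigned an origin country; the labels "A", "B", and "C" and the function name `count_by_country` are hypothetical toy placeholders, not results from the actual dataset.

```python
from collections import Counter

# Toy placeholder labels, one per spam email (NOT real results).
# In the real analysis these would come from the emails themselves.
origin_countries = ["A", "B", "A", "C", "B", "A"]


def count_by_country(countries):
    """Count how many emails are attributed to each country label."""
    return Counter(countries)


counts = count_by_country(origin_countries)

# Print the labels from most to least frequent.
for country, n in counts.most_common():
    print(country, n)
```

The hard part is not the counting itself but producing one reliable country label per email, and doing so for millions of messages.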

Think about it

Now, it’s time for action! What are the steps to get the answer to the questions above? Discuss this with your classmates.

Overview of the steps

Step 1: Spam Collection

We start with a large collection of spam emails. Specifically, we use the spam collection curated by Bruce Guenter, which contains over 10 million emails and is substantial in size.

Step 2: Email Headers and Origin Tracking

Each email comes with a header that carries tracking information, which can be used to trace the origin of the email. The most crucial piece of information in this context is the sender’s IP address, which can be mapped to an approximate geographic location, such as the country from which the email was sent.
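As an illustration, the sketch below uses Python's standard `email` module to pull IPv4 addresses out of the `Received` headers of one message. The sample message, the helper names `received_ips` and `likely_origin_ip`, and the simple regular expression are all assumptions made for this example; the addresses come from ranges reserved for documentation. Each mail server normally adds its `Received` line at the top, so the bottom-most line is usually the earliest hop, but spammers can forge these lines, so the result is only a guess.

```python
import re
from email import message_from_string

# Hypothetical raw message; IP addresses are from documentation-only ranges.
RAW_EMAIL = """\
Received: from mx.example.net (mx.example.net [198.51.100.7])
\tby mail.example.org with ESMTP; Mon, 1 Jan 2024 10:00:05 +0000
Received: from unknown (HELO sender.example.com) (203.0.113.45)
\tby mx.example.net with SMTP; Mon, 1 Jan 2024 10:00:00 +0000
From: "Lottery Office" <winner@example.com>
To: victim@example.org
Subject: You have won!

Claim your prize now.
"""

# Simplified IPv4 matcher: four dot-separated groups of 1-3 digits.
# It does not validate that each group is in the range 0-255.
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def received_ips(raw_email):
    """Return IPv4 addresses found in Received headers, top-most header first."""
    msg = message_from_string(raw_email)
    ips = []
    for header in msg.get_all("Received", []):
        ips.extend(IPV4_PATTERN.findall(header))
    return ips


def likely_origin_ip(raw_email):
    """Guess the origin as the last IP found, i.e. the earliest recorded hop."""
    ips = received_ips(raw_email)
    return ips[-1] if ips else None


print(received_ips(RAW_EMAIL))
print(likely_origin_ip(RAW_EMAIL))
```

Turning the extracted address into a country requires a separate IP geolocation lookup, which is not shown here.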

Step 3: ?

If we have 10 million spam emails, what kinds of tools should we use? If it takes 1 second to process one email, then processing 10 million emails one at a time would take more than 115 days of continuous operation (10,000,000 seconds ÷ 86,400 seconds per day ≈ 115.7 days). Clearly, we need to process the data more efficiently. We will learn how to use more powerful tools to process millions of emails.
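The estimate above can be reproduced with a few lines of Python. This is a back-of-the-envelope sketch: the function name `processing_days` and the `workers` parameter are hypothetical, and it assumes the ideal case where the work divides evenly across workers with no overhead, which real parallel jobs only approximate.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in one day


def processing_days(num_emails, seconds_per_email=1.0, workers=1):
    """Estimate wall-clock days, assuming work splits evenly across workers."""
    total_seconds = num_emails * seconds_per_email
    return total_seconds / workers / SECONDS_PER_DAY


# Compare a single worker with increasingly parallel setups.
for workers in (1, 10, 100, 1000):
    days = processing_days(10_000_000, workers=workers)
    print(f"{workers:>5} worker(s): {days:8.2f} days")
```

With one worker the estimate is about 115.7 days; under the same idealized assumption, 1,000 workers would bring it down to roughly 2.8 hours.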

Key Points

  • Spam is unsolicited email that contains unwanted advertisements, requests, or enticements.

  • Different types of spam emails include unsolicited advertisements, scams, phishing, and emails with malicious payloads.

  • Spam poses cybersecurity risks through theft of personal information, malicious software, and system break-ins.

  • Powerful supercomputers can tremendously reduce the time to process massive amounts of data through parallel processing.