Spam: Everyone's Cybersecurity Issue
Overview
Teaching: 10 min
Exercises: 0 min
Questions
What is spam?
What are the different types of spam?
What are the problems posed by spam?
How does spam indicate cybersecurity problems?
Why do we need a powerful supercomputer to analyze a massive collection of spam emails?
Objectives
Explain what spam is.
Explain the different types of spam emails and the motivation behind them.
Explain the cybersecurity issues posed by spam.
Explain the necessity of a powerful supercomputer to handle massive amounts of data.
What Is Spam?
In the modern world, where cyber technologies have become indispensable in virtually all aspects of society, these technologies also bring many problems. Anyone who has used a messaging system (email or instant messaging) will, at one point or another, run into the unpleasant problems caused by email spam (shortened to “spam” in this lesson). Spam refers to unsolicited email messages that typically contain unwanted advertisements, requests, or promotions. Spammers, i.e. the actors behind the sending of spam emails, are typically motivated by the desire for illicit personal gain. They send email messages that entice recipients to click web links, download and open malicious attachments, provide sensitive personal information (such as name, age, license numbers, social security numbers, banking or other account information, usernames, and passwords), or send money to unknown entities. What do spammers want to gain from their potential victims?
Here are some examples of spam email messages.
The main content of the email is a message that asks, “Are you still alone?” in large letters, with a subtext that reads, “WOW! These women are your perfect matches!” There is an option to unsubscribe from the mailing list at the bottom, and the unsubscribe text is followed by a physical mailing address, which is another common element in such emails, though it doesn’t necessarily confirm legitimacy. The email client has marked it as spam, likely because it is similar to other messages previously identified as spam.
Types of Spam
Spam emails can be classified into four main types:
- Unsolicited Advertising: Emails containing advertisements, whether legitimate or junk, sent without the recipient’s consent.
- Scams: Deceptive tactics that lure individuals into financial traps, often with messages like “I need money now” or by impersonating official entities such as the “ministry of petroleum of country X”.
- Phishing: Attempts to trick recipients into sharing sensitive information by posing as a trustworthy source, such as fake messages claiming “your bank account has been suspended”.
- Malicious Attachments and Links: Emails with attachments or links to malicious websites designed to compromise the recipient’s computer security and gain unauthorized access.
What Are the Threats Posed by Spam Emails?
Spam is a significant problem in today’s society for several reasons. In its mildest form, spam is simply an annoyance: spam emails inundate inboxes, causing frustration and wasting recipients’ time. Spam often leads to more serious problems. For example, a recipient may be deceived, leading to financial loss or theft of personal information. Spam often includes scams that trick individuals into sending money or giving out financial information and sensitive personal identification numbers, inflicting suffering and losses on individuals and businesses. Finally, spam may pose cybersecurity threats: spam emails can contain malicious attachments or links that spread viruses or steal personal information.
How Is Spamming Made Possible?
Why is the spam problem so prevalent? Sending spam emails is so easy that virtually every email account will eventually be spammed. The spam problem exists for multiple reasons:
- The weaknesses of the original email transmission protocol (known as Simple Mail Transfer Protocol [SMTP]). In its most rudimentary form, the protocol does not require any authentication, which makes abuse easy when hackers find an open mail server. Today, this problem should be significantly ameliorated, because most mail servers require clients to supply appropriate credentials (e.g. username and password) before accepting email for transmission. This does not fully stop hackers from abusing mail servers, because they may still obtain genuine usernames and passwords by means of phishing.
- Related to the first problem, SMTP has weak identity verification mechanisms: it allows the email sender to specify its identity arbitrarily. This means that it is trivial to spoof (fake) the identity of the sender.
- Spammers collect email addresses through various methods, such as harvesting (“scraping”) them from the internet or newsgroups, or simply guessing email addresses by combining reasonable words with domain names. As a result, spammers have a large number of potential recipients to send spam emails to.
- Lastly, spammers use compromised computers or accounts to send out large volumes of spam emails without being easily traced back to their true origin (i.e. the actual hackers).
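To make the weak identity verification concrete, here is a minimal sketch using Python’s standard `email` package. It only builds a message in memory and sends nothing. All addresses are hypothetical and use reserved example domains. The point is that the “From” field is ordinary text chosen by whoever writes the message; whether a receiving server performs extra checks is outside this sketch.

```python
from email.message import EmailMessage

# Hypothetical message built in memory only; nothing is sent.
# All addresses use reserved ".example" domains.
msg = EmailMessage()
msg["From"] = "Your Bank <support@bank.example>"  # claimed identity, not verified here
msg["To"] = "recipient@mail.example"
msg["Subject"] = "Your bank account has been suspended"
msg.set_content("Click the link below to restore access.")

# The header is stored and shown exactly as the writer typed it.
print(msg["From"])
```

This is why the displayed sender name alone is not reliable evidence of where a message came from.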
By considering all these factors together, we can see that spam is an indicator of cybersecurity problems in at least two ways: (1) compromised user credentials; (2) compromised machines. These compromised assets (whether credentials or machines), plus the deployment of “spambots” (i.e. software that performs automated spamming), are what enable spam emails to be sent out in a widespread manner.
Why Do We Need to Study Spam in This Class?
In the context of cybersecurity, spam emails are not just a nuisance, but they can also be indicative of larger security issues. Spam can originate from botnets, which are networks of infected computers that are controlled remotely and used to send out massive quantities of spam. This can happen through malware infections, phishing attacks, or through the exploitation of network vulnerabilities. By studying the origins and patterns of spam, we can gain insights into the methods used by cybercriminals, as well as identify possible compromised machines or networks.
As we delve into the world of cybersecurity and its implications, our class project takes on a practical challenge: analyzing the statistics of a spam collection from the SPAM Archive. The crux of our investigation is to uncover the geographic origins of these unsolicited emails and to quantify them by country.
By the conclusion of our explorations, we aim to have answered these pivotal questions: Which countries are the most common sources of spam in this dataset? And how does the volume of spam emails compare between these countries?
Think about it
Now, it’s time for action! What are the steps to get the answer to the questions above? Discuss this with your classmates.
Overview of the steps
Imagine that we’re a team of cybersecurity detectives. Our mission is to sift through a mountain of spam and trace the breadcrumbs back to their origins. We’ve got our work cut out for us, with over 10 million emails lying in wait. Bruce Guenter has collected these.
Step 1: Spam Collection
We begin our journey in the vast digital ocean of spam. It’s a bit like fishing; we’re casting our nets into the depths of Guenter’s extensive spam archive. We’re not just catching a few stray fish—we’re hauling in a gargantuan catch.
Step 2: Email Headers and Origin Tracking
Each email is like a puzzle piece, complete with its own set of clues in the header. These headers are our map, leading us to the treasure—or in our case, the origin of the spam. We’re particularly interested in the sender’s IP address. It’s the “X marks the spot” on our map, guiding us to the location from which each spam email embarked on its journey.
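As a rough illustration of this step, the sketch below pulls a candidate origin IP address out of the `Received` headers of one message using Python’s standard library. The sample message, the function name `origin_ip`, and the rule “take the earliest `Received` header that names a non-internal IPv4 address” are assumptions made for this example, not the method used later in the lesson. Real headers vary widely, and spammers can forge the earliest ones.

```python
import ipaddress
import re
from email import message_from_string

# Hypothetical sample message; IPs come from reserved documentation ranges.
RAW = """\
Received: from mx.recipient.example (localhost [127.0.0.1])
    by mail.recipient.example with ESMTP id abc123
Received: from unknown.host.example (unknown [203.0.113.45])
    by mx.recipient.example with SMTP id xyz789
From: "Perfect Match" <promo@dating.example>
Subject: Are you still alone?

WOW! These women are your perfect matches!
"""

IPV4_IN_BRACKETS = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")
# Loopback and RFC 1918 ranges: addresses that say nothing about the sender.
INTERNAL_NETS = [ipaddress.ip_network(n) for n in
                 ("127.0.0.0/8", "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]

def origin_ip(raw_email):
    """Return the first non-internal IPv4 found, scanning Received headers
    from the earliest (bottom) to the latest (top), or None if none is found."""
    msg = message_from_string(raw_email)
    # Each server prepends its own Received header, so the last one is the oldest.
    for header in reversed(msg.get_all("Received", [])):
        for candidate in IPV4_IN_BRACKETS.findall(header):
            try:
                ip = ipaddress.ip_address(candidate)
            except ValueError:
                continue  # e.g. "[999.1.1.1]" is not a valid address
            if not any(ip in net for net in INTERNAL_NETS):
                return str(ip)
    return None

print(origin_ip(RAW))
```

Mapping the resulting IP address to a country needs a separate geolocation database, which this sketch does not cover.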
Step 3: The Need for Speed
Now, we hit a snag. Processing 10 million emails one by one would be like counting every star in the sky—tedious and practically impossible within a reasonable time. We need a warp drive to make this trip feasible. That’s where powerful computational tools come into play. They’re our spaceship, capable of hyper-processing and crunching through data at lightning speed.
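The idea of splitting the work can be sketched on a single machine before moving to an HPC system. The example below divides a list of origin IP addresses into chunks, lets several worker processes count each chunk independently, and then merges the partial counts. The function names and the chunking scheme are illustrative assumptions. On a real cluster, the same split, count, and merge pattern would be spread over many nodes.

```python
from collections import Counter
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

def count_chunk(ips):
    """Tally one chunk of origin IPs (the unit of work handed to a worker)."""
    return Counter(ips)

def split_into_chunks(items, n_chunks):
    """Split a list into at most n_chunks pieces of nearly equal size."""
    size = max(1, -(-len(items) // n_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]

def count_parallel(ips, workers=4, executor_cls=ProcessPoolExecutor):
    """Count IPs in parallel and merge the per-chunk results.

    executor_cls defaults to worker processes; ThreadPoolExecutor can be
    passed instead for quick experiments or tests.
    """
    chunks = split_into_chunks(ips, workers)
    total = Counter()
    with executor_cls(max_workers=workers) as pool:
        # Each chunk is counted independently; only small Counters come back.
        for partial in pool.map(count_chunk, chunks):
            total.update(partial)
    return total
```

When run as a script with worker processes, the call to `count_parallel` should sit under an `if __name__ == "__main__":` guard. The result is an ordinary `Counter`, so `count_parallel(ips).most_common(10)` would list the ten most frequent origins.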
With the steps outlined, we now need to buckle up and prepare for this epic adventure. It’s time to learn about high-performance computing (HPC) and Python, the tools that will transform our mission from a centuries-long odyssey into a swift expedition. By learning to use these tools effectively, we can automate the processing of our massive email collection.
Key Points
Spam is unsolicited email that contains unwanted advertisements, requests, or enticements.
The different types of spam emails include unsolicited advertisements, scams, phishing, and emails with malicious payloads.
Spam poses cybersecurity risks through theft of personal information, malicious software, and system break-ins.
Powerful supercomputers can tremendously reduce the time to process massive amounts of data through parallel processing.