DeapSECURE module 1: Introduction to HPC: Discussion

The Whole Truth

Several facts were simplified in the lesson episodes so as not to inundate learners with an excessive level of detail. This part is dedicated to spelling out the blunt facts, which can be painful for newcomers to realize.

Determining the Origin of Spam Emails

This is actually a very complicated question to answer. The short answer is: there is a lot of uncertainty in determining exactly where a spam email originated. Crooks have many ways to hide their true identity and location. The reason is that there is no way to fully ascertain the authenticity and veracity of every field in an email’s header. For example, it is a well-known fact that the From: line in an email can be easily forged. You can make your email appear to come from anyone you want to impersonate.
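To make the point about the From: line concrete, here is a minimal sketch using Python's standard email library. The addresses and the subject are made up for illustration (they use reserved example domains), and the message is only built and printed, not sent.

```python
from email.message import EmailMessage

# Hypothetical addresses, used only for illustration.
msg = EmailMessage()
msg["From"] = "Bank Support <support@bank.example>"
msg["To"] = "victim@example.org"
msg["Subject"] = "Please verify your account"
msg.set_content("This message was not written by the bank.")

# The From: header is ordinary text typed in by whoever composes the
# message; nothing in the message format itself verifies it.
forged_text = msg.as_string()
print(forged_text)
```

The printed message carries the From: line exactly as it was typed in, which is why that line alone says little about who really sent the email.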

In a similar vein, the Received: lines can also be forged. For example, a hacker may add a few carefully crafted Received: lines at the top of the spam before sending it from a machine under their control. Then who knows which country of origin the hacker would mislead you to believe?

Although the Internet RFC documents define standards for email headers (including a standard for the format of the From: header line), not all mail programs and servers follow them strictly. This makes it difficult to extract the IP addresses reliably.
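The extraction problem can be sketched as follows. This is only an illustration under several assumptions: the sample message, host names, and addresses are invented (they use reserved documentation ranges), the helper name received_ips is hypothetical, and only IPv4 addresses are searched for. Because the three sample Received: lines each write the address in a different style, the sketch looks for anything shaped like an IPv4 address instead of relying on one fixed layout.

```python
import ipaddress
import re
from email import message_from_string

# Invented sample message; note the three different ways the IP is written.
RAW = """\
Received: from mx.example.net (mx.example.net [203.0.113.7])
\tby mail.example.org with ESMTP; Tue, 2 Jan 2018 10:00:05 -0500
Received: from relay.example.com ([198.51.100.23])
\tby mx.example.net with SMTP; Tue, 2 Jan 2018 10:00:03 -0500
Received: from unknown (HELO pc) (192.0.2.45)
\tby relay.example.com; Tue, 2 Jan 2018 10:00:01 -0500
From: someone@example.com
Subject: Hello

Body text.
"""

# Loose pattern: four groups of 1-3 digits separated by dots.
IPV4_LIKE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def is_valid_ipv4(text):
    """Return True if text is a well-formed IPv4 address."""
    try:
        ipaddress.IPv4Address(text)
    except ValueError:
        return False
    return True


def received_ips(raw_message):
    """Return one list of IPv4 addresses per Received: line, newest first."""
    msg = message_from_string(raw_message)
    hops = []
    for line in msg.get_all("Received", []):
        candidates = IPV4_LIKE.findall(line)
        hops.append([ip for ip in candidates if is_valid_ipv4(ip)])
    return hops


hops = received_ips(RAW)
oldest_hop = hops[-1]   # the last Received: line is the oldest one
print(hops)
print(oldest_hop)
```

Even when the extraction succeeds, the addresses are only as trustworthy as the Received: lines they came from, which, as noted above, can themselves be forged.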

The use of web-based mail services such as Yahoo! Mail and Hotmail presents additional complexity. With those emails, the first registered host in the oldest Received: line is most likely the mail server’s location (for example, the United States in the case of Yahoo! Mail). The real sender could be spamming from a compromised web mail account while residing in (let’s say) Libya, sending that email to a victim in the United States. As a result, the spam would appear to have originated from the U.S. Web mail providers have some custom (non-standard) headers such as X-Originating-IP, which could be used to track down the IP address of the web browser that connected to the web mail’s server.
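As a rough sketch of how such a custom header might be read, the snippet below pulls the X-Originating-IP value out of a message with Python's standard email library. The sample message is invented, the helper name originating_ip is hypothetical, and the bracketed format of the value is an assumption; since the header is non-standard, a given provider may format it differently or omit it altogether.

```python
from email import message_from_string

# Invented sample; addresses come from reserved documentation ranges.
RAW_WEBMAIL = """\
Received: from webmail.example.com (webmail.example.com [198.51.100.10])
\tby mx.example.org; Tue, 2 Jan 2018 10:00:05 -0500
X-Originating-IP: [203.0.113.99]
From: someone@example.com
Subject: Hello

Body text.
"""


def originating_ip(raw_message):
    """Return the X-Originating-IP value without brackets, or None if absent."""
    msg = message_from_string(raw_message)
    value = msg.get("X-Originating-IP")
    if value is None:
        return None
    # Assumed format: the address wrapped in square brackets.
    return value.strip().strip("[]")


browser_ip = originating_ip(RAW_WEBMAIL)
print(browser_ip)
```

In this sample the Received: line points at the web mail server, while the custom header points at a different address, the one the browser connected from.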

Tracking down the real hackers behind a spam operation can be extremely difficult. Hackers can also shield themselves behind one or more layers by using proxy servers and by using compromised desktops or servers (“internet bots”) to run the web browser or mail program that sends the spam. Forensic analysis of these internet bots would then be required, with varying degrees of luck. This is why it is not easy to track down cybercriminals, and even harder to arrest and prosecute them.

Character Collating Order

In the discussion of sort, we simply mentioned that a character’s numerical representation is what is used to determine the sort order of strings. Today, the use of local language support (“locale”) means that the numerical value of that representation does not necessarily translate directly to the ordering of the associated characters. For example, in a Spanish locale, the ñ character appears right after n and before o. What does not change is this: there is a deterministic ordering of characters based on the collating order of the current locale. Please see the Wikipedia article on collation for more information.
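The difference between the two orderings can be illustrated with a small Python sketch. It does not use a real locale: the SPANISH_ALPHABET string and the spanish_key helper are hand-made simplifications introduced here only to mimic the rule that ñ sorts between n and o. A real program would more likely rely on the system's locale support (for example locale.setlocale together with locale.strxfrm in Python), which depends on the Spanish locale being installed on the machine.

```python
N_TILDE = "\u00f1"  # the single character ñ

words = ["oso", N_TILDE + "u", "nube", "nido"]

# Plain sorting compares the numerical values (code points) of characters.
# ñ has value 241, larger than every unaccented letter, so it lands last.
by_code_point = sorted(words)

# Hand-made stand-in for a Spanish collating order (assumption, lowercase only).
SPANISH_ALPHABET = "abcdefghijklmn" + N_TILDE + "opqrstuvwxyz"
RANK = {ch: i for i, ch in enumerate(SPANISH_ALPHABET)}


def spanish_key(word):
    """Map each character to its position in SPANISH_ALPHABET."""
    # Characters outside the alphabet are placed after all known letters.
    return [RANK.get(ch, len(RANK)) for ch in word.lower()]


by_spanish_rule = sorted(words, key=spanish_key)

print(by_code_point)    # ['nido', 'nube', 'oso', 'ñu']
print(by_spanish_rule)  # ['nido', 'nube', 'ñu', 'oso']
```

Both orderings are deterministic; they simply follow different collating rules.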