The Whole Truth
Several facts were simplified in the lesson episodes so as not to overwhelm learners with excessive detail. This part is dedicated to spelling out the blunt facts, which can be painful for newcomers to realize.
Determining the Origin of Spam Emails
This is actually a very complicated question to answer.
The short answer is: There is a lot of uncertainty in determining
exactly where a spam email originated.
Crooks have many ways to hide their true identity and location.
The reason is that there is no way to fully ascertain
the authenticity and veracity of every field in an email’s header.
For example, it is a well-known fact that the From: line
in an email can be easily forged.
An email can be made to appear to come from anyone the sender wants to impersonate.
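To make this concrete, here is a minimal sketch using Python's standard email package. It only builds a message in memory and sends nothing. The names and addresses are made-up placeholders on reserved example domains. The point is that the From: line is ordinary text chosen by whoever composes the message, and nothing in the message format itself verifies it.

```python
from email.message import EmailMessage
from email.utils import parseaddr

# Build a message in memory only; nothing is sent anywhere.
# All names and addresses below are hypothetical placeholders.
msg = EmailMessage()
msg["From"] = "Trusted Bank <support@bank.example>"
msg["To"] = "victim@example.org"
msg["Subject"] = "Urgent notice"
msg.set_content("Hello.")

# The header is plain text: a display name plus an address,
# both supplied by the author of the message.
display_name, address = parseaddr(str(msg["From"]))
print(display_name)
print(address)
```

A mail reader typically shows the display name most prominently, which is why a claimed sender should never be taken at face value.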
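In a similar vein, the Received: lines can also be forged.
For example, a hacker may add a few carefully crafted Received: lines
at the top of the spam sent from a machine under his control.
Each mail server that later relays the message prepends its own Received: line,
so the forged lines end up looking like the oldest hops.
Who knows, then, which country of origin
the hacker would mislead you into believing?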
Although the Internet RFC documents define standards for email headers
(including a standard for the format of the From: header line),
not all mail programs and servers follow them strictly.
This makes it difficult to extract the IP addresses reliably.
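As an illustration of the extraction step, the sketch below reads the Received: lines of a message with Python's standard email package and pulls out IPv4 addresses written in square brackets. The sample message, its host names, and the function name received_ips are invented for this example. The bracketed-address pattern is an assumption about one common layout, so a real header may not match it, and any address found this way may itself be forged.

```python
import re
from email import message_from_string

# Hypothetical sample message; hosts and addresses are placeholders.
RAW_MESSAGE = """\
Received: from mx.example.net (mx.example.net [203.0.113.7])
\tby mail.example.org with ESMTP; Tue, 02 Jan 2024 10:00:05 +0000
Received: from sender-pc (unknown [198.51.100.23])
\tby mx.example.net with SMTP; Tue, 02 Jan 2024 10:00:01 +0000
From: "Trusted Bank" <support@bank.example>
To: victim@example.org
Subject: Urgent notice

Hello.
"""

# Assumption: the relaying host's address appears as [a.b.c.d].
IPV4_IN_BRACKETS = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")

def received_ips(raw_message):
    """Return bracketed IPv4 addresses from Received: lines, newest hop first."""
    msg = message_from_string(raw_message)
    ips = []
    for hop in msg.get_all("Received", []):
        ips.extend(IPV4_IN_BRACKETS.findall(hop))
    return ips

ips = received_ips(RAW_MESSAGE)
print(ips)      # newest hop first
print(ips[-1])  # address claimed by the oldest Received: line
```

The last entry corresponds to the oldest Received: line, which is the one a sender can most easily fabricate.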
The use of web-based mail services such as Yahoo! Mail and Hotmail adds
further complexity.
With those emails, the first host recorded in the oldest Received:
line most likely reflects the mail server’s location (such as the United States
in the case of Yahoo! Mail).
The real sender could be spamming from a compromised web mail
account while residing in (let’s say) Libya, and sending that email to
a victim in the United States.
As a result, the spam would appear to have originated from the U.S.
Web mail providers have some custom (non-standard) headers such as
X-Originating-IP, which could be used to track down the IP address
of the web browser that connects to the web mail’s server.
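A minimal sketch of reading such a header with Python's standard library follows. The function name originating_ip and the sample messages are hypothetical. It also assumes the value is either a bare address or one wrapped in square brackets, which will not hold for every provider. The result is only as trustworthy as the server that wrote the header.

```python
import ipaddress
from email import message_from_string

def originating_ip(raw_message):
    """Return the X-Originating-IP value as a normalized string, or None.

    Assumption: the value is a bare IP address or one wrapped in [ ].
    """
    msg = message_from_string(raw_message)
    value = msg.get("X-Originating-IP")  # header lookup is case-insensitive
    if value is None:
        return None
    candidate = value.strip().strip("[]")
    try:
        return str(ipaddress.ip_address(candidate))
    except ValueError:
        # Present but not a parseable address: treat as unusable.
        return None

sample = "X-Originating-IP: [198.51.100.23]\nSubject: hi\n\nbody"
print(originating_ip(sample))
```

Validating the value with ipaddress rather than trusting the raw text guards against malformed or deliberately misleading contents.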
Tracking down the real hackers behind a spam operation can be extremely difficult. Hackers can also put one or more layers of shielding around themselves by using proxy servers, or by using compromised desktops or servers (“internet bots”) to run the web browser or mail program that sends the spam. Forensic analysis of these internet bots would then be required, with varying degrees of luck. This is why it is not easy to track down cybercriminals, and even harder to arrest and prosecute them.
Character Collating Order
In the discussion of sort, we simply mentioned that
the character’s numerical representation is what is used to determine the sort
order of strings.
Today, the use of local language support (“locale”) can mean that
the numerical value of a character does not translate directly
to the ordering of the associated characters.
For example, in a Spanish locale, the ñ character would appear right after n and before o.
What does not change is this: There is a deterministic ordering of characters
based on the collating order of the current locale.
Please see this Wikipedia article
for more information about collation order.
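The difference between numeric order and collating order can be sketched in Python. The alphabet string and the spanish_key function below are a hand-built simplification made up for this example, not a real locale definition. Actual programs would normally rely on the system locale (for instance through Python's locale.strxfrm), whose behavior depends on which locales are installed.

```python
# Simplified, hypothetical collating table: ñ sits between n and o.
SPANISH_ALPHABET = "abcdefghijklmnñopqrstuvwxyz"
RANK = {ch: i for i, ch in enumerate(SPANISH_ALPHABET)}

def spanish_key(word):
    """Map each character to its rank in the simplified alphabet.

    Characters outside the table (such as accented vowels) are placed
    after all listed letters, ordered by their numeric code.
    """
    return [RANK.get(ch, len(RANK) + ord(ch)) for ch in word.lower()]

words = ["oso", "ñandú", "nube"]

# Plain sorting compares numeric character codes, and ñ has a
# larger code than o, so it lands at the end.
by_code_point = sorted(words)

# Sorting with the collation key puts ñ right after n.
by_collation = sorted(words, key=spanish_key)

print(by_code_point)
print(by_collation)
```

Both orderings are deterministic. They differ only in which rule decides how two characters compare.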