Google has updated Gmail's spam filters to block emails coming from addresses that use suspicious combinations of Latin and non-Latin characters, the company said on Tuesday.
Google announced last week that it has adopted an email standard that supports addresses with non-Latin and accented Latin characters. For example, a woman from China named 王芳 (Wang Fang) might want an email address in which her name is spelled with Chinese characters, and a man from Spain named José Ramón might want one that keeps the accented Latin characters in his name.
This is possible thanks to an email standard created in 2012 by a group called the Internet Engineering Task Force (IETF). By adopting the standard for Gmail, and soon for Calendar, Google allows its customers to exchange emails with people who have non-Latin and accented Latin characters in their addresses.
For the time being, the company doesn’t allow customers to create addresses with such characters, but this first step toward more global email opens up new opportunities for scammers and spammers.
They can exploit the fact that many non-Latin characters look nearly identical to Latin characters. For instance, the Gujarati digit zero (૦) and the Greek small letter omicron (ο) are very similar to the letter o, allowing scammers to “hoodwink” unsuspecting users by mixing and matching characters, Mark Risher of the Gmail spam and abuse team explained in a blog post on Tuesday. One of the examples provided by Risher is “MyBank” vs. “MyBɑnk.”
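To illustrate why such look-alikes are hard to catch by eye, here is a minimal Python sketch that lists the code point and Unicode name of each character in a string. The helper name `describe` is hypothetical, and the spoofed string assumes the look-alike in Risher's example is U+0251 (Latin small letter alpha).

```python
import unicodedata

def describe(text):
    """Return (character, code point, Unicode name) for each character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
    ]

genuine = "MyBank"
# ASSUMPTION: the fourth character is U+0251, a look-alike for the letter "a".
spoofed = "MyB\u0251nk"

# The two strings render almost identically but do not compare equal.
print(genuine == spoofed)

# Show only the positions where the two strings differ.
for real, fake in zip(describe(genuine), describe(spoofed)):
    if real != fake:
        print(real, "vs", fake)

# The look-alikes for the letter "o" mentioned above.
for ch in ("o", "\u03bf", "\u0ae6"):
    print(describe(ch)[0])
```

The strings differ in a single position, which a person reading the sender line is unlikely to notice.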
In order to address this issue, Google has updated Gmail spam filters to reject emails coming from addresses that use combinations identified by the Unicode community as being suspicious and potentially misleading.
“We’re using an open standard—the Unicode Consortium’s ‘Highly Restricted’ specification—which we believe strikes a healthy balance between legitimate uses of these new domains and those likely to be abused,” explained Risher.
In the “Highly Restrictive” specification, all characters must be from a single script, or from one of a small set of combinations of Latin with Chinese, Japanese, or Korean scripts. The combinations are:
-Latin + Han + Hiragana + Katakana;
-Latin + Han + Bopomofo; or
-Latin + Han + Hangul.
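The rule listed above can be sketched in Python as follows. This is a rough illustration, not Google's filter and not a full implementation of the Unicode specification. Python's standard library does not expose the Unicode script property, so the sketch guesses a character's script from the first word of its Unicode name, which is an assumption that works only for simple cases. It also assumes that a subset of a listed combination (for example Latin plus Han) is acceptable. The function names are hypothetical.

```python
import unicodedata

# The three multi-script combinations listed above.
ALLOWED_COMBINATIONS = [
    {"Latin", "Han", "Hiragana", "Katakana"},
    {"Latin", "Han", "Bopomofo"},
    {"Latin", "Han", "Hangul"},
]

# ASSUMPTION: Han characters have Unicode names that start with "CJK".
NAME_PREFIX_TO_SCRIPT = {"CJK": "Han"}

def script_of(ch):
    """Guess a character's script from its Unicode name (crude heuristic)."""
    if ch.isascii() and not ch.isalpha():
        return None  # ASCII digits and punctuation are shared by all scripts
    name = unicodedata.name(ch, "")
    if not name:
        return "Unknown"
    prefix = name.split()[0]
    return NAME_PREFIX_TO_SCRIPT.get(prefix, prefix.capitalize())

def scripts_in(text):
    """Return the set of guessed scripts used in the text."""
    return {s for s in map(script_of, text) if s is not None}

def passes_script_rule(text):
    """True if the text is single-script or fits an allowed combination."""
    scripts = scripts_in(text)
    if len(scripts) <= 1:
        return True
    return any(scripts <= combo for combo in ALLOWED_COMBINATIONS)

print(passes_script_rule("mybank.example"))      # Latin only
print(passes_script_rule("myb\u03bfnk.example")) # Latin mixed with Greek omicron
print(passes_script_rule("wang\u738b\u82b3"))    # Latin mixed with Han
```

A production filter would rely on real Unicode script data rather than character names, and would apply the remaining requirements of the specification that this sketch leaves out.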
Google has updated its guidelines for bulk senders to clarify that the authenticating domain, envelope “From” domain, payload “From” domain, reply-to domain, and sender domain should not violate these rules.
In April, Kaspersky spotted a series of spam emails in which certain letters in the subject line and the body were replaced with Cyrillic characters, Greek characters, and International Phonetic Alphabet (IPA) symbols in an effort to evade spam filters.