Arthur T Knackerbracket has found the following story:
Researchers at Soluble today said they worked with Verisign to thwart the registration of domain names that use homoglyphs – non-Latin characters that look just like letters of the Latin alphabet – to masquerade as legit domains.
[...] There have been a number of efforts over the years, most recently in 2017, we reckon, to rid the internet of homograph abuse once and for all.
In the most recent case, it was found that the Unicode Latin IPA Extension characters could be, and were being, exploited to set up lookalike domains.
"Between 2017 and today, more than a dozen homograph domains have had active HTTPS certificates," noted Soluble researcher Matt Hamilton. "This included prominent financial, internet shopping, technology, and other Fortune 100 sites. There is no legitimate or non-fraudulent justification for this activity."
Normally, it would not be possible to register domains with mixed scripts, as Verisign put protections in place years ago. However, the researchers found that those protections did not extend to Unicode Latin IPA, meaning that prior to Verisign updating its filters after being tipped off by Soluble, the characters could be used to set up lookalike URLs.
[...] "While it is unlikely that you, the reader, were attacked with this technique," Hamilton notes, "it is likely that this technique was used in highly targeted social-engineering campaigns."
-- submitted from IRC
The most notable of these confusables are:
| Latin: | a | g | l |
| IPA: | ɑ | ɡ | ɩ |
It is much easier to tell them apart when the confusables are shown adjacent to each other. In the spoiler below, only one of the entries is correct... how long does it take you to figure out which one it is?
- google.ɑpis
- ɡoogle.ɑpis
- ɡoogle.apis
- gooɡle.apis
- google.apis
- ɡooɡle.ɑpis
- ɡooɡle.apis
- gooɡle.ɑpis
Are you sure? This is the number of the correct entry:
Are you really sure? Did you pick number 6? That was wrong. It was number 5.
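For readers who would rather not trust their eyes, the quiz can also be settled mechanically. The following is a minimal sketch, assuming the eight entries are transcribed correctly from the list above; the look-alikes are written as explicit escapes, and the names `CANDIDATES` and `ascii_only_entries` are made up for this illustration.

```python
# The eight quiz entries, with the look-alike characters written as escapes
# so they cannot be confused in this listing.
# \u0261 is the IPA "script g" and \u0251 is the IPA "alpha".
CANDIDATES = [
    "google.\u0251pis",
    "\u0261oogle.\u0251pis",
    "\u0261oogle.apis",
    "goo\u0261le.apis",
    "google.apis",
    "\u0261oo\u0261le.\u0251pis",
    "\u0261oo\u0261le.apis",
    "goo\u0261le.\u0251pis",
]


def ascii_only_entries(names):
    """Return the 1-based positions of names made purely of ASCII characters."""
    return [i for i, name in enumerate(names, start=1) if name.isascii()]


print(ascii_only_entries(CANDIDATES))
```

An ASCII-only test is enough here because every confusable in the quiz lies outside ASCII; it would not catch look-alikes within ASCII itself, such as a zero standing in for the letter O.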
(Score: 2, Interesting) by fustakrakich on Thursday March 05 2020, @01:15AM (12 children)
This problem would go away?
La politica e i criminali sono la stessa cosa..
(Score: 4, Funny) by NPC-131072 on Thursday March 05 2020, @01:19AM
homophobe!
(Score: 2) by takyon on Thursday March 05 2020, @01:30AM (6 children)
We need to put Klingon in Unicode.
[SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
(Score: 3, Touché) by coolgopher on Thursday March 05 2020, @01:39AM (4 children)
Are you sure it's not already there, just stealthed?
(Score: 1) by fustakrakich on Thursday March 05 2020, @01:43AM (1 child)
Not a very good disguise...
La politica e i criminali sono la stessa cosa..
(Score: 3, Funny) by Freeman on Thursday March 05 2020, @05:07PM
"We don't talk about it." --Worf
Joshua 1:9 "Be strong and of a good courage; be not afraid, neither be thou dismayed: for the Lord thy God is with thee"
(Score: 3, Funny) by sjames on Thursday March 05 2020, @01:46AM
We could check for an ion trail, but that might be actual rocket surgery.
(Score: 2) by takyon on Thursday March 05 2020, @01:49AM
https://en.wikipedia.org/wiki/Klingon_scripts#ConScript_Unicode_Registry [wikipedia.org]
[SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
(Score: 3, Informative) by Pino P on Thursday March 05 2020, @05:04AM
Unicode does not encode characters encumbered by copyright or trademark. Adding pIqaD to Unicode won't happen until ViacomCBS stops asserting copyright in the Klingon language [soylentnews.org].
(Score: 1) by Ethanol-fueled on Thursday March 05 2020, @01:58AM
We will go away, you fucksuckers. Let's GO!
(Score: 2, Touché) by Anonymous Coward on Thursday March 05 2020, @05:10AM
Only in the same sense as "killing all humans will prevent killing of humans". You don't have to kill Unicode to not allow anything outside of latin characters, numbers and a dash in URLs.
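The restriction described in this comment can be sketched as a simple allowlist check. This is an illustration only, not how any registry or browser actually does it; `is_plain_ascii_hostname` is a hypothetical helper, and rejecting the `xn--` prefix is an added assumption, since punycode-encoded labels are themselves made of letters, digits and dashes.

```python
import re

# One DNS label: ASCII letters, digits and hyphens, 1 to 63 characters,
# neither starting nor ending with a hyphen.
LDH_LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_plain_ascii_hostname(hostname):
    """True if every label is letters/digits/hyphens and not punycode-encoded."""
    for label in hostname.split("."):
        if LDH_LABEL.fullmatch(label) is None:
            return False
        # Assumption: an "xn--" label is an encoded non-ASCII name, so refuse it too.
        if label.lower().startswith("xn--"):
            return False
    return True


print(is_plain_ascii_hostname("google.apis"))
print(is_plain_ascii_hostname("\u0261oogle.apis"))
```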
(Score: 4, Interesting) by maxwell demon on Thursday March 05 2020, @07:06AM (1 child)
It would be sufficient if punycode domains were shown in their ASCII form.
Indeed, it probably would be sufficient if punycode domains were shown in a different colour. Or even better, show only those letters in a different colour that are punycode-generated.
For example, in punycode, "ɡoogle.apis" is encoded as "xn--oogle-qmc.apis". Thus the domain would show (using bold instead of colour for obvious reasons) as "ɡoogle.apis". The difference would be easy to spot this way.
The Tao of math: The numbers you can count are not the real numbers.
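The encoding quoted in the comment above can be reproduced with Python's built-in `punycode` codec. This is a rough sketch: the helper names `to_ascii_label` and `to_ascii_domain` are invented here, and the code deliberately skips the normalisation and validity checks that full IDNA processing performs, so it should not be read as a complete IDNA implementation.

```python
def to_ascii_label(label):
    """Encode one domain label into its ASCII ("xn--") form if it needs it."""
    if label.isascii():
        return label
    # The punycode codec yields the basic characters, a dash, then the encoded tail.
    return "xn--" + label.encode("punycode").decode("ascii")


def to_ascii_domain(domain):
    """Encode each dot-separated label independently."""
    return ".".join(to_ascii_label(part) for part in domain.split("."))


# \u0261 is the IPA "script g" used in the look-alike domain.
print(to_ascii_domain("\u0261oogle.apis"))  # xn--oogle-qmc.apis
```

Showing this form in the address bar is what makes the substitution visible, since the ASCII label no longer resembles the name being imitated.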
(Score: 2, Informative) by Anonymous Coward on Thursday March 05 2020, @07:59AM
(Score: 2, Funny) by Anonymous Coward on Thursday March 05 2020, @02:21AM (3 children)
Send your donations to S0ylentnews.org... thanks for your generous support.
(Score: 3, Funny) by takyon on Thursday March 05 2020, @02:30AM
hɑx
[SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
(Score: 2, Funny) by Anonymous Coward on Thursday March 05 2020, @03:09AM
(Score: 3, Funny) by maxwell demon on Thursday March 05 2020, @07:14AM
Use ꜱοylentNews.οrɡ instead.
The Tao of math: The numbers you can count are not the real numbers.
(Score: 3, Interesting) by stormwyrm on Thursday March 05 2020, @02:44AM
Firefox has an algorithm [mozilla.org] for detecting and mitigating these kinds of IDN homograph attacks. Chrome seems to use a similar algorithm [chromium.org] as well. Both algorithms are based on the recommendations of UTS #39 [unicode.org]. These characters from the IPA Extension Block appear to be part of the classes of characters that UTS #39 defines as Restricted, namely:
(emphasis added). So domains using them shouldn't display normally, coming out as punycode garbage. Indeed, Firefox 73.0.1 displays all of the domain names in the test, except for the proper one, using punycode, so #1 comes out as google.xn--pis-fsb, and #6 appears as xn--oole-z7bc.xn--pis-fsb. Chromium 79.0.3945.130 does the same thing. The attack doesn't seem to work on these modern browsers at least.
Numquam ponenda est pluralitas sine necessitate.
(Score: 0) by Anonymous Coward on Thursday March 05 2020, @03:23AM (1 child)
They gotta Яuin everything.
(Score: 2, Funny) by redneckmother on Thursday March 05 2020, @05:12AM
At least they didn't rune it.
Mas cerveza por favor.
(Score: 3, Funny) by Rosco P. Coltrane on Thursday March 05 2020, @03:27AM
No gay mathematicians, no homographs. Simple!
(Score: 0) by Anonymous Coward on Thursday March 05 2020, @06:09AM (5 children)
rest all look the same over here on ubuntu 16.04, chromium.
so 3, 4, 5 and 7 all look the same.
(Score: 2) by maxwell demon on Thursday March 05 2020, @07:21AM (4 children)
On Mint/Waterfox (but with several additional fonts installed, I can't tell if those make a difference), the only letters among those that look the same are g and ɡ.
The Tao of math: The numbers you can count are not the real numbers.
(Score: 2) by zocalo on Thursday March 05 2020, @08:55AM (3 children)
UNIX? They're not even circumcised! Savages!
(Score: 3, Informative) by maxwell demon on Thursday March 05 2020, @09:40AM (2 children)
On my system, l and ɩ look very different, indeed more different than l and i. In particular, ɩ on my system has the height of a common lowercase letter, while l has the height of an uppercase letter. Moreover, ɩ has an arc at the bottom, while l doesn't.
So you'd have more chances to fool me with “googie” than to fool me with “googɩe”.
Actually they look more similar in <tt> (lɩ) but still, the different height very clearly distinguishes them.
The Tao of math: The numbers you can count are not the real numbers.
(Score: 0) by Anonymous Coward on Friday March 06 2020, @05:55AM (1 child)
in the summary and in your reply I see different ell-s as well. but in the variants, I see only the regular ell. Maybe it's the spoiler tag, no idea.
(Score: 2) by maxwell demon on Friday March 06 2020, @09:31AM
Maybe it's because in the spoiler tags there are only normal "l" letters. I just checked by copy-pasting it into a hex converter, and indeed every "l" is represented by the byte 6c, that is, the character “U+006C LATIN SMALL LETTER L”.
The Tao of math: The numbers you can count are not the real numbers.
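The hex-converter check described above can also be done in a few lines with Python's standard `unicodedata` module. A small sketch, with `describe` being a name chosen for this example:

```python
import unicodedata


def describe(text):
    """Return (character, 'U+XXXX', Unicode name) for each character in text."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# The three Latin letters next to their IPA look-alikes, written as escapes.
for ch, codepoint, name in describe("l\u0269g\u0261a\u0251"):
    print(codepoint, name)
```

Pasting a suspicious string into `describe` shows at a glance whether an "l" is really U+006C or something else.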
(Score: 3, Informative) by Anonymous Coward on Thursday March 05 2020, @06:31AM (1 child)
Reject emoti-utf.
They had their chance, they proved they didn't deserve it.
(Score: 2, Funny) by Anonymous Coward on Thursday March 05 2020, @06:41AM
😂😂😂
(Score: 2) by J_Darnley on Thursday March 05 2020, @04:18PM
None are valid because ".apis" is not a valid TLD.
Also Unicode only needs 1 way to express a Latin lower case A. If it worked for CJK then it'll work for Latin.
(Score: 1) by Snort on Thursday March 05 2020, @06:53PM
on a Plam Pilot on eBay back in the day.
People will always take advantage of human failings.
(Score: 0) by Anonymous Coward on Thursday March 05 2020, @09:39PM
We do have a known defense. Unfortunately every time FF updates, it seems to revert to shipped default :(
You would think that by now someone at FF would call a meeting and choose to ship it with the more secure setting.
Windows solution, but applies to all FF on all platforms:
https://www.tenforums.com/tutorials/104760-enable-disable-idn-punycode-firefox-address-bar-windows.html [tenforums.com]