
Gmail Recognizes Addresses Containing Non-Latin Characters

timothy posted about 3 months ago | from the germanic-frankish-anything dept.

Communications 149

An anonymous reader writes In response to the creation in 2012 by the Internet Engineering Task Force (IETF) of "a new email standard that supports addresses incorporating non-Latin and accented Latin characters", Google has now made it possible for its Gmail users to "send emails to, and receive emails from, people who have these characters in their email addresses." Their goal is to eventually allow its users to create Gmail addresses utilizing these characters.

Next wave of phishing? (4, Funny)

CRC'99 (96526) | about 3 months ago | (#47612207)

So the next lot of phishing will come from: róót@gmail.com / Àdministrator@gmail.com or BìllGàtes@gmail.com etc?

Great.

Metal umlaut! (5, Funny)

Anonymous Coward | about 3 months ago | (#47612223)

Finally I can get motörhead@gmail.com!

Re:Metal umlaut! (2)

jones_supa (887896) | about 3 months ago | (#47612243)

I will represent myself as a shady unofficial sales representative for an Australian microphone brand.

Anti-Semitism (0)

Anonymous Coward | about 3 months ago | (#47612575)

Don't be afraid of it.

Anti-Semitism (-1)

Anonymous Coward | about 3 months ago | (#47612583)

When in history have the Jews lived uneventfully with anyone else?
Never.
They are a species of parasites. Look at how they have subverted the government response to their massacre in Gaza. It's time for a more honest review of Hitler's motives for wanting them gone.

Re:Anti-Semitism (1)

INT_QRK (1043164) | about 3 months ago | (#47613377)

You are beneath contempt, and it would be otherwise intuitive that you should be ignored as an aberration. However, it is extremely important that decent people of good will realize that their opposites, people like you, are not an aberration, that you exist in the environment as a pervasive and pernicious evil, and therefore appropriate countermeasures must be put in place and vigilance maintained.

Re:Anti-Semitism (1)

ArmoredDragon (3450605) | about 3 months ago | (#47613803)

Posts like these are just random pot shots looking for a response. Chances are he doesn't even believe what he says; rather, he just wants to cause somebody to come out and speak in a righteous manner. Mission accomplished, I think?

Re:Anti-Semitism (1)

ArcadeMan (2766669) | about 3 months ago | (#47613627)

I don't want to sound racist, but I've never heard of Jewish suicide bombers, Jewish plane hijackings, etc.

Re:Metal umlaut! (0)

Anonymous Coward | about 3 months ago | (#47612797)

Lemmy, you magnificent bastard, I went deaf at your concert!

Re:Next wave of phishing? (2)

rvw (755107) | about 3 months ago | (#47612225)

So the next lot of phishing will come from: róót@gmail.com / Àdministrator@gmail.com or BìllGàtes@gmail.com etc?

It's not about bìllgàtes@outlook.com, but billgates@óutlook.com. It's the domain that is going to cause problems, not the user!

Re:Next wave of phishing? (2)

ArsenneLupin (766289) | about 3 months ago | (#47612491)

... and they'll use a Greek lowercase omicron (U+03BF), rather than an accented o. It looks exactly the same as an o (except on Slashdot, of course. Slashdot hates Unicode...)

Re:Next wave of phishing? (4, Insightful)

Captain_Chaos (103843) | about 3 months ago | (#47612269)

Worse; they will come from root@gmail.com, administrator@gmail.com or BillGates@gmail.com, only those o's and a's will be Cyrillic or something like that (can't do it here; Slashdot doesn't display them).

Re:Next wave of phishing? (2)

rvw (755107) | about 3 months ago | (#47612317)

Worse; they will come from root@gmail.com, administrator@gmail.com or BillGates@gmail.com, only those o's and a's will be Cyrillic or something like that (can't do it here; Slashdot doesn't display them).

When you mix Latin htmail with a Cyrillic o to get hotmail, Google and all email programs should refuse that address immediately, mark it as spam, make the address red with a warning sign etc. Mixing character sets should not be allowed in a domain or in a username. So the username may be all Cyrillic or Greek, the domain name may be all Chinese or Latin, and these may mix, but no mixes in the domain name or username itself.
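The rule proposed above (one script per username, one script per domain, but the two parts may differ) can be sketched roughly as follows. This is only an illustration under stated assumptions: Python's standard library does not expose the Unicode script property, so the hypothetical helper `script_of` approximates the script by the first word of the character's Unicode name. That heuristic would wrongly flag legitimate combinations such as Japanese kana mixed with kanji.

```python
import unicodedata

def script_of(ch):
    """Approximate script label: first word of the Unicode character name.

    Assumption: this stands in for the real Unicode script property,
    which the standard library does not provide.
    """
    if not ch.isalpha():
        return None  # digits, dots, hyphens and marks are treated as neutral
    return unicodedata.name(ch, "UNKNOWN").split()[0]

def scripts_in(text):
    """Set of approximate script labels used by the letters in text."""
    return {s for s in map(script_of, text) if s is not None}

def has_mixed_part(address):
    """True if the username or the domain mixes scripts internally."""
    local, _, domain = address.rpartition("@")
    return len(scripts_in(local)) > 1 or len(scripts_in(domain)) > 1

# Latin "h" + Cyrillic "о" (U+043E) + Latin "tmail" in the domain
print(has_mixed_part("bob@h\u043etmail.com"))
# All-Cyrillic username with an all-Latin domain is allowed by the rule
print(has_mixed_part("\u0431\u043e\u0431@hotmail.com"))
```

A check like this only catches mixing; an address written entirely in one look-alike script would still pass.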

Re:Next wave of phishing? (2)

Chrisq (894406) | about 3 months ago | (#47612329)

I think that's the way to go - only allow characters from a single unicode script [unicode.org] in the username and in the domain name. The domain name part is currently handled by registrars so that may not need any additional rules.

However this really should be part of the RFC, or else anyone banning mixed names would be "non compliant". If the RFC does not specify this then the best that gmail (or any other system) could do would be to prevent people registering mixed names themselves and to give a warning (and maybe colour characters) if email is received from an address with mixed scripts.

Re:Next wave of phishing? (1)

rvw (755107) | about 3 months ago | (#47612463)

However this really should be part of the RFC, or else anyone banning mixed names would be "non compliant". If the RFC does not specify this then the best that gmail (or any other system) could do would be to prevent people registering mixed names themselves and to give a warning (and maybe colour characters) if email is received from an address with mixed scripts.

Gmail, Microsoft and Yahoo and others like gmx, universities, big companies should simply refuse these mails. Microsoft should make Exchange so that this is the default way for handling these mails. The same goes for qmail, postfix etc. But that won't be enough.

As another commenter said, you can make up Latin-looking names using Cyrillic characters, and we won't notice. How do you catch that? I guess this will be the time that PGP will prove its value.

Re:Next wave of phishing? (1)

MrNaz (730548) | about 3 months ago | (#47612605)

I agree. The real solution is hardened authentication getting baked right into email. I'm all for UTF8 domain names and email user names, however if the email protocol suite is going to be expanded to allow for more features, then I think security should be top of that list.

Sure, for a while, domains that span multiple character sets such as hotmail.com with a Cyrillic o could be spam flagged, however what happens when (not, if, but when) legitimate domains with multiple character sets start appearing? What about domains that use characters restricted to the intersection of two character sets such that they appear to be from one but are in fact from another?

The ONLY answer to this is an email client that can associate a certificate with a domain and checks it against received email as a matter of course. This solution not only has the property of preventing domain spoofing, but also comprehensively solves the spam problem. (It didn't get done earlier because it fell foul of the "requires everyone to agree at the same time" point on that pro forma "Why your proposal won't work" sheet.)

Re:Next wave of phishing? (1)

CrankyFool (680025) | about 3 months ago | (#47613483)

How does confirming the domain's identity automatically solve this problem?

If someone from the gxail.com domain sends me email (let's assume here the 'x' is some weird Cyrillic character that looks just like an 'm'), any automated confirmation of the domain's validity would not do some sort of eyeball check "Oh, that looks like gmail.com, let's confirm if it is, oops it isn't..." but rather an automated "did that email come from gxail.com? Yup, sure did."

Even if you popped up a notice that said "hey, I don't know about that domain," the typical user -- heck, I'd argue even the typical Slashdot user -- would go "weird, looks like it lost creds for it" and click whatever the equivalent of "Oh well" button the notice had.

Re: Next wave of phishing? (1)

Samantha Wright (1324923) | about 3 months ago | (#47612359)

Belarusian uses a Latin-style "i" in place of the typical Cyrillic short i... So you can still phish admirably with "paypaI" and never leave Cyrillic. e, x, c, y, i, o, p, a: how many words can you make?

Re: Next wave of phishing? (0)

Anonymous Coward | about 3 months ago | (#47612925)

You forgot j and s.

Re:Next wave of phishing? (1, Insightful)

pla (258480) | about 3 months ago | (#47612623)

Worse; they will come from root@gmail.com, administrator@gmail.com or BillGates@gmail.com, only those o's and a's will be Cyrillic or something like that (can't do it here; Slashdot doesn't display them).

"Rich company problems". XD

Bluntly, this won't affect most Americans for the same reason spam from .il, .ru, or .cn doesn't matter - Because we simply don't get any legitimate email from those domains. It doesn't take your spam filter long to figure out "if the address contains character-X, 100% chance of spam"... And that assumes your mail server doesn't outright block those as a hardcoded rule (in a former life I had to babysit the Exchange server for a small business; if you came from anywhere not in one of the big-six TLDs, auto-junk).

So by all means, spammers, please start using Cyrillic or vowels with diacritics in addresses - It will make you that much easier to filter.

Re:Next wave of phishing? (1)

drinkypoo (153816) | about 3 months ago | (#47612799)

It doesn't take your spam filter long to figure out "if the address contains character-X, 100% chance of spam"...

Yes, yes it does. It took Google years to stop sending me spam in foreign languages that I couldn't read anyway.

Re:Next wave of phishing? (1)

Kjella (173770) | about 3 months ago | (#47613045)

Bluntly, this won't affect most Americans for the same reason spam from .il, .ru, or .cn doesn't matter - Because we simply don't get any legitimate email from those domains. It doesn't take your spam filter long to figure out "if the address contains character-X, 100% chance of spam"... And that assumes your mail server doesn't outright block those as a hardcoded rule (in a former life I had to babysit the Exchange server for a small business; if you came from anywhere not in one of the big-six TLDs, auto-junk).

It must be wonderful to run a mom and pop operation where none of your customers, suppliers or anyone else has an international mail address. And it certainly won't work for any other country but the US, a canadian business that doesn't accept .ca mail? Don't think so. And if you're operating an ISP, university or whatever some of your users will be foreigners in real contact with the rest of the world. Neat that you can wave the WORKS4ME flag, it's still a problem for a lot of other people.

Re:Next wave of phishing? (1)

pla (258480) | about 3 months ago | (#47613325)

It must be wonderful to run a mom and pop operation where none of your customers, suppliers or anyone else has an international mail address. And it certainly won't work for any other country but the US, a canadian business that doesn't accept .ca mail? Don't think so.

Yellow flag: Failure to extrapolate.

Canadians can't block .ca, of course. They probably feel pretty much the same as I do about .il, .ru, and .cn, however. Canadians can't block the few diacriticals used in French (although in my experience, most Canadians would probably consider blocking Quebecois a feature, not a bug); I doubt they have a lot of use for Cyrillic, however. Similarly, a Russian mom-n'-pop probably doesn't get all that much email in Mandarin, even if they need to allow Cyrillic through.

Yes, as I originally said, the largest scale entities don't have the luxury of blocking based on the local norm. For the rest of us, yes, "WORKS4ME". You can find your own solution.

Re:Next wave of phishing? (1)

Behrooz (302401) | about 3 months ago | (#47613883)

Is it deeply wrong that I originally read "WORKS4ME" as 'worksame' due to excessive IRC leetspeak exposure in my youth... and yet it still kind of made sense?

Re:Next wave of phishing? (1)

DarkOx (621550) | about 3 months ago | (#47613547)

No, worse: they will come from

updates@?tfosorcim?.com which will be displayed like:
updates@microsoft.com

Just imagine the ? marks being the left-right reverse character.
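A receiving client could scan an address for directional control characters before displaying it. The sketch below is a hedged illustration, not any mail client's actual behaviour. It assumes the first "?" above stands for U+202E (RIGHT-TO-LEFT OVERRIDE) and the second for U+202C (POP DIRECTIONAL FORMATTING); the helper name `bidi_controls` is made up for this example.

```python
import unicodedata

# Bidirectional classes of the explicit embedding, override and isolate controls
BIDI_CONTROLS = {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}

def bidi_controls(text):
    """Return (index, character name) for each directional control in text."""
    return [
        (i, unicodedata.name(ch, "UNNAMED"))
        for i, ch in enumerate(text)
        if unicodedata.bidirectional(ch) in BIDI_CONTROLS
    ]

# Stored order is "tfosorcim"; a bidi-aware display would show it reversed.
spoofed = "updates@\u202etfosorcim\u202c.com"
for index, name in bidi_controls(spoofed):
    print(index, name)
```

An address that yields any hits here is probably best rejected or shown in an escaped form, since such controls have no obvious legitimate use inside a mailbox name.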

Re:Next wave of phishing? (2)

dejanc (1528235) | about 3 months ago | (#47612289)

That kind of phishing already exists, even more sophisticated: a bug that a lot of software contains is not distinguishing between same looking characters in different alphabets. E.g. you can sign up on many forum/bbs platforms as Administrator if your leading A is cyrillic [fileformat.info] A instead of latin [fileformat.info] A. Both look the same but have different html entity codes and are different unicode characters, which is true for most vowels and many consonants (e.g. cyrillic B and latin B, C and C, E and E...). Or, for more fun, look at this (single) character [fileformat.info] which looks exactly as "lj".

Those of us with customers who use two alphabets constantly have known about this problem for a long time and we've seen phishing on all different kinds of platforms using this strategy.

IDN (internationalized domain names) solves this problem in domain names with policy: you can't register a domain which looks exactly like some other domain except for that change in character. Still though, you can register both casino.it and casinò.it and that's where the real phishing potential is. I think, at least most native English speakers, would probably be fooled easier by a domain such as paypal-customer-division.com than paypàl.com.
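To make the same-looking-characters point concrete, here is a small Python sketch. The helper names are invented for illustration. It shows that the Latin and Cyrillic capital A are distinct code points, that Unicode NFKC normalization plus case folding does not merge them, and that the single "lj" character (assumed here to be U+01C9) does fold to two Latin letters under NFKC.

```python
import unicodedata

def describe(ch):
    """Code point and official Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

def nfkc_fold(name):
    """Compatibility-normalize and case-fold a name before comparing."""
    return unicodedata.normalize("NFKC", name).casefold()

real = "Administrator"        # Latin A, U+0041
fake = "\u0410dministrator"   # Cyrillic A, U+0410

print(describe(real[0]))
print(describe(fake[0]))
print(real == fake)                        # a naive uniqueness check sees two names
print(nfkc_fold(real) == nfkc_fold(fake))  # normalization does not help here
print(nfkc_fold("\u01c9") == "lj")         # but the single "lj" character does fold
```

So normalization catches compatibility characters like the "lj" ligature but not cross-script look-alikes; those need a separate confusables table or a script check.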

Re:Next wave of phishing? (0)

Anonymous Coward | about 3 months ago | (#47612775)

Except paypàl.com should generally be rejected as the à is not a common letter in English, so it should not be allowed per the IDN policy if I remember correctly.

Re:Next wave of phishing? (3, Funny)

Megane (129182) | about 3 months ago | (#47613469)

I think ròót@gmail.com is a better choice because it looks angry.

Dammit this is a terrible idea (0)

Anonymous Coward | about 3 months ago | (#47612213)

Now we'll get spam from addresses that use code pages that just look like valid latin characters.

Re:Dammit this is a terrible idea (5, Insightful)

Chrisq (894406) | about 3 months ago | (#47612239)

This is a real concern, and probably why gmail is not yet allowing internationalised gmail addresses. Most email names could be spoofed using Cyrillic characters which look exactly the same as latin ones [wikipedia.org]. How could you tell if the "c" in chrisq@gmail.com really was a latin 'c' or a cyrillic Es [wikipedia.org]?

Re:Dammit this is a terrible idea (1)

Anonymous Coward | about 3 months ago | (#47612287)

Also, thin space, zero width space, zero width non joiner, combiners that combine in such a way that they essentially do nothing. There are a lot of possibilities and if any of them are missed it will be a disaster.
I foresee a lot of pain coming from this.
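One rough way to look for the characters listed above is by Unicode general category. The sketch below is an assumption-laden illustration (the category set and the name `suspicious_chars` are choices made for this example): it composes the text with NFC first so that ordinary accents merge into their base letters, then reports any remaining format characters, space separators and combining marks.

```python
import unicodedata

# Cf = format (zero-width space, joiners), Zs = space separators (thin space),
# Mn/Me = combining marks left over after composition
SUSPECT_CATEGORIES = {"Cf", "Zs", "Mn", "Me"}

def suspicious_chars(text):
    """List code points that are invisible or otherwise easy to overlook."""
    composed = unicodedata.normalize("NFC", text)
    return [
        f"U+{ord(ch):04X}"
        for ch in composed
        if unicodedata.category(ch) in SUSPECT_CATEGORIES
    ]

print(suspicious_chars("pay\u200bpal"))   # zero width space
print(suspicious_chars("pay\u200cpal"))   # zero width non-joiner
print(suspicious_chars("pay\u2009pal"))   # thin space
print(suspicious_chars("cafe\u0301"))     # e + combining acute composes to é
```

This is a blacklist, so it inherits the weakness described above: anything not covered by the chosen categories slips through.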

Re:Dammit this is a terrible idea (1)

petermgreen (876956) | about 3 months ago | (#47613529)

Usually with such things it's better to whitelist than blacklist. As you add characters to the whitelist you determine what character they should be equivalent to for conflict-management purposes.

Out of interest, does anyone know if people actually use internationalised domain names as their main domains, or if they stick to conventional names that work with all software and which everyone can type?

Re:Dammit this is a terrible idea (1)

stephanruby (542433) | about 3 months ago | (#47612311)

Most email names could be spoofed using Cyrillic characters which look exactly the same as latin ones [wikipedia.org]. How could you tell if the "c" in chrisq@gmail.com really was a latin 'c' or a cyrillic Es [wikipedia.org]?

gmail.ru (or its equivalent) will find a way to support cyrillic
gmail.qc.ca and gmail.fr will find ways to support French accents (otherwise, Google will get sued or blocked by Quebec or France)
These details will get worked out at the local level. It will take time, but they'll get there eventually.

Re:Dammit this is a terrible idea (1)

Chrisq (894406) | about 3 months ago | (#47612507)

Most email names could be spoofed using Cyrillic characters which look exactly the same as latin ones [wikipedia.org]. How could you tell if the "c" in chrisq@gmail.com really was a latin 'c' or a cyrillic Es [wikipedia.org]?

gmail.ru (or its equivalent) will find a way to support cyrillic gmail.qc.ca and gmail.fr will find ways to support French accents (otherwise, Google will get sued or blocked by Quebec or France) These details will get worked out at the local level. It will take time, but they'll get there eventually.

I don't think that would work in protecting users against attacks unless you said that only users of gmail.ru could receive emails from users with Cyrillic characters in the name, etc.

Re:Dammit this is a terrible idea (1)

ArcadeMan (2766669) | about 3 months ago | (#47613663)

I wouldn't be surprised to see l'Office québécois de la langue française do something like that. I speak french and I still think they're assholes who are over-reaching their boundaries.

Re:Dammit this is a terrible idea (2)

mwvdlee (775178) | about 3 months ago | (#47612337)

They might mark conspicuous characters, like when multiple character sets are combined in a single domain name.

Re:Dammit this is a terrible idea (0)

Anonymous Coward | about 3 months ago | (#47612367)

Sure, there are all sorts of things they 'might' do; the problem is that this adds a lot of complexity, and it is almost inevitable that things will be missed.
The Unicode standard is *huge* and complex - there are going to be problems, and legitimate people will get defrauded because of this. It is inevitable; fancy solutions aside, it is going to be an arms race just like all other phishing etc.
Is it worth opening Pandora's box? I'm not so sure, but then I'm an English speaker...

Re:Dammit this is a terrible idea (1)

idji (984038) | about 3 months ago | (#47612455)

It would be easy to WARN a USER if the name contains mixed alphabets or diacritics that differed from the user's browser's preferred language. Each Unicode Character has a name eg "Greek Upsilon With Hook Symbol", or "Latin Capital Letter R", or "Cyrillic Capital Letter Es With Descender", "Arabic Letter Qaf", or "CJK Ideograph" for Chinese/Korean/Japanese.
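The character names mentioned above are available programmatically, so a warning could spell out exactly which characters are unusual. A minimal Python sketch, assuming a reader who expects plain ASCII addresses (a real client would compare against the user's preferred language instead), with `describe_address` as a made-up helper name:

```python
import unicodedata

def describe_address(address):
    """Return (character, Unicode name) for every non-ASCII character."""
    return [
        (ch, unicodedata.name(ch, "<unnamed>"))
        for ch in address
        if ord(ch) > 127
    ]

# First letter is Cyrillic "с" (U+0441), which renders like a Latin "c"
for ch, name in describe_address("\u0441hrisq@gmail.com"):
    print(f"Warning: address contains {name}")
```

Whether users would read such a warning is a separate question from whether it is easy to generate.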

Re:Dammit this is a terrible idea (1)

Anonymous Coward | about 3 months ago | (#47612495)

Yes, warning users works really well. Especially after decades of windows training users to click accept on alerts without reading them.

Re:Dammit this is a terrible idea (0)

Anonymous Coward | about 3 months ago | (#47612597)

Because no language ever makes use of characters from other languages, I mean surely Latin capital letter R is only used by latin speakers. Seriously you should get a better understanding of what you are saying before you make bold claims about how 'easy' something is going to be, could it be done, maybe, will there be oversights, bugs and glitches for people to exploit, almost definitely.

Re:Dammit this is a terrible idea (1)

Chrisq (894406) | about 3 months ago | (#47613053)

Because no language ever makes use of characters from other languages, I mean surely Latin capital letter R is only used by latin speakers. Seriously you should get a better understanding of what you are saying before you make bold claims about how 'easy' something is going to be, could it be done, maybe, will there be oversights, bugs and glitches for people to exploit, almost definitely.

Actually Unicode does make a good effort of classifying characters into scripts [wikipedia.org], with some "common" characters that can appear in any script and some "inherited" characters (like diacritics) that belong to the character that they are applied to. Thus the Cyrillic "Es [wikipedia.org]" looks like a Latin "C" but is a different Unicode character, one belonging to the Cyrillic script and the other to the Latin script. The different languages using the same script is a red herring: it doesn't matter that both French and English use the capital "R"; what does matter is that you can't put a Cyrillic character into the middle of a Latin-script string to make something that looks like a certain name but isn't. Checking whether a name contains characters from more than one script is easy. There are methods in some languages [oracle.com] that trivialise this.

Re:Dammit this is a terrible idea (0)

Anonymous Coward | about 3 months ago | (#47613175)

Until you take a closer look at the scripts, and notice that they encompass so many characters that people won't need to 'mix scripts' for trickery.
Take "mathematical letter kappa" and "latin k" for example, do you think your mom will be able to tell the difference?

Sure some of this stuff can be handled, with enough work, but you can rest assured there will be mistakes, anyone who thinks this will be 'easy' and go down without a hitch is either stupid, naive or in denial.

Re:Dammit this is a terrible idea (1)

Chrisq (894406) | about 3 months ago | (#47613845)

Take "mathematical letter kappa" and "latin k" for example, do you think your mom will be able to tell the difference?

To be fair they do have different script values so would be identified by the proposal

Re:Dammit this is a terrible idea (0)

Anonymous Coward | about 3 months ago | (#47613341)

Of course that is exactly why unicode is a security risk. The same visual representation has different encodings, therefore the "wrong" address appears "correct". bye-bye bank login details. Adopting unicode in DNS was a severely bad idea; adopting it in email is also a bad idea but granted email was already easy to make appear "correct" while being "wrong".

Well, I'm impressed. (3, Insightful)

Anonymous Coward | about 3 months ago | (#47612217)

Google updated their regular expression. Good for them.

Re:Well, I'm impressed. (5, Informative)

Chrisq (894406) | about 3 months ago | (#47612229)

I would imagine that they implemented RFC6532 [ietf.org], which involves a lot more than changing a regular expression.

Re:Well, I'm impressed. (0)

Anonymous Coward | about 3 months ago | (#47612327)

And again, an army of PhDs engages in some useless administrative activity, while not saving the world.
Thanks, Google!

Re:Well, I'm impressed. (0)

Anonymous Coward | about 3 months ago | (#47612745)

Half the web sites I try to register with spit chips if you try to use an ASCII plus symbol (+) in your email address. It's a perfectly valid character and legal in the RFCs, but dumbass web developers can't even get that right. What hope will there be for people with internationalized usernames and domain names?

Sigh (5, Insightful)

ledow (319597) | about 3 months ago | (#47612231)

From what I can tell, a mail server has two options when receiving this mail:

Accept it.
Reject it.

The default, with software that doesn't understand this RFC yet (which seems to be... just about everything), is to reject. So trying to use this as an email is not only going to mess up every form you try to fill in online (because they won't see it as an email address either), but quite likely just gets you bouncebacks from everyone you email.

What was needed was surely a system similar to the IDN system for internationalisation, which would allow those with ASCII-only DNS servers etc. to STILL WORK, by converting the Unicode characters to ASCII subsets and then sending the email as normal, through the entire PLANET-worth of working email servers out there that could accept it.

Having a content negotiation option at the SMTP level, that mail servers have to implement and handle specifically, is just ridiculous, and even with GMail's kickstart it could be decades before you can guarantee that your UTF-8 email address will work across the Internet and even then there'll be some old legacy server that will just bounce all your email BECAUSE of that character set in your address. And it will be perfectly legitimate to do so.

However, as others have pointed out, if this goes through, it will be nigh-on impossible to spot phished/faked email addresses, just like it is with IDN links unless you know how to find the original ASCII-encoding of them.
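For reference, the IDN conversion mentioned above can be reproduced with Python's built-in `idna` codec, which implements the older IDNA 2003 rules. This sketch covers domain names only; the codec does nothing for the local part of an address, which is the gap described above. The function names and the example domain are illustrative.

```python
def to_ascii(domain):
    """Convert an internationalized domain to its ASCII ("xn--") form."""
    return domain.encode("idna").decode("ascii")

def to_unicode(domain):
    """Convert an ASCII-encoded domain back to its Unicode form."""
    return domain.encode("ascii").decode("idna")

print(to_ascii("b\u00fccher.example"))        # xn--bcher-kva.example
print(to_unicode("xn--bcher-kva.example"))    # bücher.example
```

Viewing the ASCII form is also one way to recover the original encoding of a suspicious IDN link.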

Re:Sigh (0)

Anonymous Coward | about 3 months ago | (#47612281)

From what I can tell, a mail server has two options when receiving this mail:

Accept it.
Reject it.

Temporary failure, try again later.

There! Are! THREE! Options!

Re:Sigh (3, Funny)

hawkinspeter (831501) | about 3 months ago | (#47612549)

Nope, there's four options:

Accept it.
Reject it.
Temporary failure, try again later.
User not local, will forward to <somewhere>.
Syntax error, command unrecognised.

Wait, I'll come in again...

Re:Sigh (1)

scdeimos (632778) | about 3 months ago | (#47612749)

Picard? Is that you?

Re:Sigh (1)

gurps_npc (621217) | about 3 months ago | (#47613453)

Many email accounts have the option of setting up a temporary clone email with different letters. That is, you could be something_in_Mandarin@gmail.com and also AlexanderTheGreat@gmail.com All in one single account. So you use the Mandarin email address for your mandarin business cards, and the English one for all web sites and even on your English business cards.

Good luck (4, Interesting)

Pascal Sartoretti (454385) | about 3 months ago | (#47612255)

My e-mail address ends with the suffix ".name". It is perfectly correct (even if not common), but I still sometimes have issues today because some stupid website has an outdated regular expression which says that ".name" is not correct.

Now imagine this with non-latin characters (or just non-ASCII characters)... If you only write to people also using GMail, it might work.

+ in an e-mail address (0)

Anonymous Coward | about 3 months ago | (#47612701)

My e-mail address ends with the suffix ".name". It is perfectly correct (even if not common), but I still sometimes have issues today because some stupid website has an outdated regular expression which says that ".name" is not correct.

I'm still waiting for web developers to realize that a "+" character is valid in the local-part of an e-mail address. It's been around since RFC 822, and yet the web folks still can't get their shit together.

Re:+ in an e-mail address (1)

hobarrera (2008506) | about 3 months ago | (#47612727)

"+" or plenty of other special characters. Stuff like quotes can even be valid if used properly, while we still have some website that won't even accept a dash/underscore.
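As a rough illustration of a check that does not reject these characters, here is a Python sketch built on the unquoted "atext" character set from RFC 5322. It is deliberately permissive and makes simplifying assumptions: quoted local parts are not handled, and the domain is only checked for containing a dot. `plausible_address` is a name invented for this example.

```python
import re

# Characters RFC 5322 allows in an unquoted local part ("atext")
ATEXT = r"A-Za-z0-9!#$%&'*+/=?^_`{|}~-"
# One or more atext runs separated by single dots ("dot-atom")
DOT_ATOM = re.compile(rf"[{ATEXT}]+(?:\.[{ATEXT}]+)*")

def plausible_address(address):
    """Loose syntax check; the only real test is sending a message."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        return False
    return DOT_ATOM.fullmatch(local) is not None and "." in domain

for candidate in ["user+tag@example.com", "o'brien@example.com",
                  "first.last@example.name", "a..b@example.com"]:
    print(candidate, plausible_address(candidate))
```

Note that nothing here restricts the TLD, so newer suffixes such as ".name" are not rejected.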

Re:+ in an e-mail address (1)

Pascal Sartoretti (454385) | about 3 months ago | (#47612863)

"+" or plenty of other special characters. Stuff like quotes can even be valid if used properly, while we still have some website that won't even accept a dash/underscore.

I had to wait nearly 10 years for my ".name" domain to be accepted by most websites (say, 99.5%).

For "+" or other funny characters, my estimate is that you will need at least 10 years starting from now.

I would not hold my breath.

Re:Good luck (0)

Anonymous Coward | about 3 months ago | (#47612757)

Try including a plus symbol in the localpart and see how far you get.

Re:Good luck (1)

RevWaldo (1186281) | about 3 months ago | (#47613145)

I was once given a corporate e-mail address with an apostrophe for my last name. Perfectly legal, but many web sites choked on it. And they left the apostrophe off my first batch of business cards.

(Fortunately I also had an alias address which didn't have the apostrophe and was about two dozen characters shorter.)

.

Re:Good luck (0, Interesting)

Anonymous Coward | about 3 months ago | (#47613397)

My e-mail address ends with the suffix ".name". It is perfectly correct (even if not common), but I still sometimes have issues today because some stupid website has an outdated regular expression which says that ".name" is not correct.

Now imagine this with non-latin characters (or just non-ASCII characters)... If you only write to people also using GMail, it might work.

Well well well.

Imagine that.

Google is walling off their garden...

Could that be considered, oh, maybe, EVIL?

Terrible idea (0)

Anonymous Coward | about 3 months ago | (#47612275)

The Internet is about interoperability. Intentionally breaking that goes against everything that has made the Internet worthwhile.

Re:Terrible idea (0)

Anonymous Coward | about 3 months ago | (#47612283)

Interoperability as long as it's by american english rules?

Re:Terrible idea (1)

Anonymous Coward | about 3 months ago | (#47612313)

The Latin alphabet is not American.

Re:Terrible idea (0)

Anonymous Coward | about 3 months ago | (#47612483)

Right, but ASCII is still used too damn much. Even the US gov doesn't understand ASCII is not enough for the world, so interoperatively speaking, shouldn't we all just have to live with ASCII? Fuck that. So even with limiting to latin, latin alphabet is missing half the world, that's just another half assed way to go.

Re:Terrible idea (0)

Anonymous Coward | about 3 months ago | (#47612625)

I can only think of the Babel tower finally coming down again... It was a sweet dream...

And don't forget that maybe some Chinese dude has problem with typing English (although I think most keyboards all around the world do keep ASCII letters and base ASCII punctuation at least, so there's that at least today...), but I'm pretty sure he will fare much worse if he ever has to type Arabic or Burmese... You are just as much considering only the viewpoint of the English/Latin world vs. the rest of the world lumped together...

There sure are some mitigations possible, notably for text-only identifiers, which you can more or less easily copy-paste (hopefully everything you use now supports full Unicode and all fonts everywhere), but text in images, on business cards, on magazines, etc., will be significantly more difficult to handle... Good luck OCR-ing badly-handwritten Japanese unagi pr0n URL on some low-definition GIF picture to find out more... (well, with some luck, I suppose you could try to send the picture to Google Images or some other image source identification website, in this case...).

Well, that is if you can even recognize it is an URL... 'cause of course those ASCII punctuation symbols should be localized too, going this way... Their shape may mean something completely different in another language, or not exist at all...

Anyway, let's take another example outside of the Internet... What about localizing again all traffic and caution/information signs all around the world? Even colors and basic shapes can have very different original meanings depending on the culture...

What about localizing math/physics/chemistry units and formulas again?

There's a balance in everything... Too much localization just breaks the web.

Re:Terrible idea (1)

RevWaldo (1186281) | about 3 months ago | (#47613235)

And don't forget that maybe some Chinese dude has problem with typing English (although I think most keyboards all around the world do keep ASCII letters and base ASCII punctuation at least, so there's that at least today...)

Phonetic entry using pinyin is still the most common method, which has been greatly sped up with predictive text like on cell phones, so the most common characters can be entered with a few keystrokes. Google Pinyin [wikipedia.org] in this regard is, as the kids say, the shiznit.

.

Re:Terrible idea (1)

petermgreen (876956) | about 3 months ago | (#47613785)

If a Chinese person and a Russian swap business cards and both have used their own scripts for email addresses, are their thoughts going to be "great" or "how the fuck do I type this?"

My guess is nationalists who don't care about the world beyond their country's borders may adopt this; those who care about being part of the global community (or simply about interoperating with older software) will avoid it like the plague it is.

Re: Terrible idea (1)

jrumney (197329) | about 3 months ago | (#47613095)

Interoperability is why we still write to anonymous@slashdot.org!mail.comcast.net!mail.myisp.com!gateway.local instead of just having globally resolvable addresses. Upgrading the infrastructure is just too hard and will never happen.

Anti-phishing measures? (1)

Captain_Chaos (103843) | about 3 months ago | (#47612277)

I hope they implement the same kind of anti-phishing measures that browsers are taking for displaying domain names with non-Latin scripts. http://en.wikipedia.org/wiki/I... [wikipedia.org]
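
For what it's worth, one check in that spirit is flagging a local part that mixes scripts. A rough Python sketch, assuming the first word of a character's Unicode name is a good enough stand-in for its script (it is only a crude proxy, and the function names here are made up for illustration; real browser rules are more nuanced and are not reproduced here):

```python
import unicodedata

def scripts_used(text):
    """Return a crude set of script labels for the letters in text."""
    scripts = set()
    for ch in text:
        if ch.isalpha():
            # First word of the Unicode name, e.g. "LATIN" or "CYRILLIC".
            # This is an approximation, not a real script-property lookup.
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def looks_mixed_script(local_part):
    """True if the letters appear to come from more than one script."""
    return len(scripts_used(local_part)) > 1

# Latin "p" followed by a Cyrillic "a" (U+0430), then Latin again
print(looks_mixed_script("p\u0430ypal"))       # mixed
print(looks_mixed_script("mot\u00f6rhead"))    # accented Latin only
```

A check this naive would also flag legitimate Japanese text that mixes kana and kanji, so it would need an allow-list of script combinations before being usable.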

tin tuc phat giao (-1, Flamebait)

oho39 (3775277) | about 3 months ago | (#47612299)

[url=http://tintucphatphap.blogspot.com/2014/08/clb-thanh-nien-phat-tu-chua-tu-tan-cung.html/]tin tuc phat giao[/url]

Great idea (0)

Anonymous Coward | about 3 months ago | (#47612301)

How on earth am I supposed to email someone when I don't even have a key that corresponds to a letter in their email address? And no, I'm not keeping a huge chart of Alt+number combinations handy.

Re:Great idea (1)

Chrisq (894406) | about 3 months ago | (#47612333)

How on earth am I supposed to email someone when I don't even have a key that corresponds to a letter in their email address? And no, I'm not keeping a huge chart of Alt+number combinations handy.

Of course there is probably someone in China or Korea thinking "why do I have to use this special keyboard mode with characters I don't understand to write emails".

Re:Great idea (1)

Richy_T (111409) | about 3 months ago | (#47613641)

Cause while his countrymen were running around killing sparrows with sticks at the behest of an insane, leftist ruler, the capitalist west had already been working on the transistor for 10 years, and was continuing working on improving the integrated circuit it had become part of and thus had a huge head start on defining the standards that would be used in a global communications network of billions of computers.

Re:Great idea (0)

Anonymous Coward | about 3 months ago | (#47612371)

Use the Character Map program.

Re:Great idea (0)

Anonymous Coward | about 3 months ago | (#47612401)

More specifically, use the Character Map program for Windows. The Character Map in Ubuntu seems to show a lot of the code pages with just squares containing numbers in them. More open source garbage. :(

Try installing international fonts (1)

Anonymous Coward | about 3 months ago | (#47612459)

By default Ubuntu doesn't install them unless your locale requires it. Most of the 'complete' Unicode fonts aren't included by default.

Re:Try installing international fonts (0)

Anonymous Coward | about 3 months ago | (#47612517)

One font including the full Unicode map would be sufficient.

Re:Try installing international fonts (0)

Anonymous Coward | about 3 months ago | (#47612615)

There is no such font.

Re:Try installing international fonts (1)

Richy_T (111409) | about 3 months ago | (#47613669)

Shouldn't there be? Serious question.

Re:Great idea (2)

OolimPhon (1120895) | about 3 months ago | (#47612385)

Hit the 'Reply-To' button, naturally.

- After adding the user to your Address book.

ó doesn't work, this sucks (0)

Anonymous Coward | about 3 months ago | (#47612335)

You cannot use solely international characters; the first one needs to be plain ASCII.

Re: ó doesn't work, this sucks (1)

Chrisq (894406) | about 3 months ago | (#47612351)

You cannot use solely international characters; the first one needs to be plain ASCII.

What? Where do you get that from? TFA gives examples where the whole email address is in international characters (katakana).

Re: ó doesn't work, this sucks (0)

Anonymous Coward | about 3 months ago | (#47612801)

They're making it up, of course, unless they're including the '<' character in the addr-spec, http://tools.ietf.org/html/rfc... [ietf.org]

The ABNF in that section is so broken, though, I have no idea how anyone could hope to provide a compliant implementation.

Ah, great! (1)

Black Parrot (19622) | about 3 months ago | (#47612419)

Maybe now my e-mails to Tutankhamun [google.com] will quit bouncing.

A new email standard? (0)

Anonymous Coward | about 3 months ago | (#47612533)

The IETF devoted time to a _new_ email standard and still hasn't fscking solved the spam problem?

WTF? Send them to bed without dinner.

Re:A new email standard? (0)

Anonymous Coward | about 3 months ago | (#47612539)

It's not a spam problem, it's a people problem. There are too many people, sending spam.

Re:A new email standard? (1)

MrNaz (730548) | about 3 months ago | (#47612643)

Implementing proper domain and user authentication by baking PGP or some other PKI right into the email protocols will both solve the spam problem comprehensively AND allow UTF-8 domains with minimum risk of phishing/spoofing.

Cyrillic substitutes for 'aoe': long-forgotten.. (0)

Anonymous Coward | about 3 months ago | (#47612555)

So there was an IRC network running in CP1251.
So there were attempts to substitute letters and impersonate other users.. so an ircd patch was written to treat all similar-looking characters as Latin.

Next wave of Unicode dirty-hacks and workarounds surely will be even more strange than that. // irccity, nya.
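
A rough idea of what such a patch does, sketched in Python rather than ircd C. The table below is hand-picked for illustration and is an assumption, not the actual patch and not the full Unicode confusables list:

```python
# Hand-picked Cyrillic letters that look like Latin ones (illustrative only).
LOOKALIKES = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0445": "x",  # CYRILLIC SMALL LETTER HA
}

def skeleton(nick):
    """Lowercase the nick and fold known lookalikes to their Latin twins."""
    return "".join(LOOKALIKES.get(ch, ch) for ch in nick.lower())

def collides(new_nick, existing_nicks):
    """True if new_nick is visually equivalent to a nick already in use."""
    target = skeleton(new_nick)
    return any(skeleton(n) == target for n in existing_nicks)

# "root" spelled with two Cyrillic o's collides with the real one
print(collides("r\u043e\u043et", ["root", "alice"]))
```

The same comparison would apply to mailbox names at registration time: compare skeletons, store the original.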

Internet Engineering Task Force Realised... (1)

Dan Askme (2895283) | about 3 months ago | (#47612751)

That "signed char" was a bad coding choice back in the day.

Pile of poo at gmail dot com (0)

Anonymous Coward | about 3 months ago | (#47612969)

5 times pile of poo at gmail dot com!

They better filter out non-printing characters (1)

Megane (129182) | about 3 months ago | (#47613405)

They better be filtering out the non-printing characters that do fun stuff like reverse the text direction, overstrike, etc. How long until people start registering gmail addresses with Zalgo [google.com] text?

And how long until someone registers pile of poo [google.com] @gmail.com?
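
A minimal sketch of that kind of filter in Python, assuming "non-printing" means Unicode control and format characters (categories Cc and Cf) and treating long runs of combining marks as Zalgo. The threshold and the function name are my own assumptions, not anything Google has documented:

```python
import unicodedata

MAX_COMBINING_RUN = 2  # assumed threshold; tune to taste

def is_suspicious(local_part):
    """Flag control/format characters and Zalgo-style stacks of combining marks."""
    run = 0
    for ch in local_part:
        cat = unicodedata.category(ch)
        if cat in ("Cc", "Cf"):
            # Cf covers direction overrides such as U+202E
            return True
        if cat.startswith("M"):
            # Count consecutive combining marks
            run += 1
            if run > MAX_COMBINING_RUN:
                return True
        else:
            run = 0
    return False

print(is_suspicious("abc\u202etxt"))       # right-to-left override
print(is_suspicious("z" + "\u0301" * 5))   # five stacked acute accents
print(is_suspicious("mot\u00f6rhead"))     # ordinary accented letter
```

Two caveats: Cf also contains the zero-width joiners that some scripts legitimately need, so a blanket rejection is too blunt for real use, and the pile of poo (U+1F4A9) is an ordinary symbol as far as Unicode is concerned, so it sails straight through this check.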

What should the rest of us do? (2)

Zaiff Urgulbunger (591514) | about 3 months ago | (#47613473)

As a webdev who gets irritated at websites that fail badly with their email validation (e.g. not allowing + in the local part, or only allowing 2 or 3 char TLDs), I do try very hard to get this right. So I've got a solid(ish) email validation function. But, I'm a bit sketchy on what to do with UTF-8.

For the domain, I'd hope that the MTA (Postfix in my case) would allow UTF-8 and convert to punycode as required, but I'm not sure it does. So currently I don't allow for that. I _could_ convert to punycode myself, but I don't.

And as for the local-part, I'm fairly certain Postfix doesn't allow for UTF-8 at present.... at least, not the Postfix version supported on Debian 7.

So I'm just wondering what everyone else is doing? Should I improve my support, or should I just wait for support to be added to my MTA before I bother?
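
For the domain half, converting to punycode yourself is short. A sketch in Python (not necessarily the language of the validation function above), using the standard library's `idna` codec, which implements the older IDNA 2003 rules; the function name is made up for illustration:

```python
def ascii_domain(address):
    """Return the address with its domain converted to ASCII (punycode) form.

    The local part is passed through untouched, so a non-ASCII local part
    still needs an MTA that can handle it.
    """
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("expected local@domain")
    # The "idna" codec encodes each label, adding the xn-- prefix where needed.
    return local + "@" + domain.encode("idna").decode("ascii")

print(ascii_domain("user@b\u00fccher.example"))
```

Plain ASCII domains come back unchanged, so this can run on every address before it is handed to the MTA.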

Re:What should the rest of us do? (2)

Richy_T (111409) | about 3 months ago | (#47613687)

I don't see much point getting anal about email validation, especially since it's a fairly hard problem. It's been a while since I've written one, but something along the lines of something@something.something is usually enough; let the mail servers sort out the rest.
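
As a sketch, that loose check is one regex. Python here; the pattern is my own reading of "something@something.something", not a standard:

```python
import re

# One or more non-space, non-@ characters, an @, then a domain with at least one dot.
LOOSE_EMAIL = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")

def plausible_email(addr):
    """True if addr has the rough shape something@something.something."""
    return LOOSE_EMAIL.fullmatch(addr) is not None

print(plausible_email("me+tag@example.com"))
print(plausible_email("mot\u00f6rhead@gmail.com"))
print(plausible_email("not-an-address"))
```

Because it never lists allowed characters, it accepts + and non-ASCII local parts without extra work. It does reject dotless domains such as a@com.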

They should all be dual email addresses (1)

gurps_npc (621217) | about 3 months ago | (#47613475)

As in, one email account connected to two email addresses, one in, say, Russian, and the other using the Latin alphabet.

Probably set up so that if the Russian gets bounced, it tries again with the Latin alphabet.

Also, the signature of all emails sent from this should have a copy of the Latin email address, so that people that don't have the Russian capability can reply.

Slashdot's time? (1)

J'raxis (248192) | about 3 months ago | (#47613573)

Now that Google has implemented 2012 i18n technology, maybe vaunted technology site Slashdot can catch up to 1998 [ietf.org] and implement UTF-8 properly?

Nah.

address standards are a nightmare (2)

netsavior (627338) | about 3 months ago | (#47613645)

First off, I went down the slippery, death-defying slope of email address validation recently... Our software had simple regex rules... so I thought I would just implement the RFC rules, or find a library that did... wow. The RFC is a mess... APIs are worse.
This is a valid email address:
dude"".dude@[192.168.1.1]
so is this:
a@com
also valid:
test+test=gmail.com@test.com
None of those will work in MS Outlook or Exchange, none of them will work with the jQuery validation plug-in, and some close to that will work with the JavaMail API. Most funky but standards-compliant email addresses will pass Apache Commons validation.

In the end, I went with a two-part validation: 1) Apache Commons Validation (mostly RFC correct), then 2) a second pass through javax.mail, because if I can't send email to it, then what is the point of having it? We still get addresses that pass both validations and bounce at some SMTP relay due to "invalid address format."

I am sure internationalization will make all this better.

Re:address standards are a nightmare (1)

netsavior (627338) | about 3 months ago | (#47613695)

take the name part of your gmail address... add this string to it:
+a."b(c)d,e:f;gh>i[j\k]l".a@gmail.com

not only is yourname+a."b(c)d,e:f;gh>i[j\k]l".a@gmail.com a valid email address... but you will actually get email addressed that way. It will fail most email address validation that I have found.

Re:address standards are a nightmare (2)

allo (1728082) | about 3 months ago | (#47613793)

Just send a mail. If it fails, discard the pending registration or whatever, possibly via a "not confirmed" timeout some days later.
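
A minimal in-memory sketch of that flow in Python. The class name, the three-day default, and the token scheme are assumptions for illustration; actually sending the mail is left to the caller:

```python
import secrets
import time

class PendingRegistrations:
    """Track unconfirmed addresses and drop them after a timeout."""

    def __init__(self, ttl_seconds=3 * 24 * 3600, clock=time.time):
        self.ttl = ttl_seconds
        self.clock = clock       # injectable so the timeout can be tested
        self.pending = {}        # token -> (address, expiry timestamp)

    def start(self, address):
        """Record the address and return the token to mail to it."""
        token = secrets.token_urlsafe(16)
        self.pending[token] = (address, self.clock() + self.ttl)
        return token

    def confirm(self, token):
        """Return the confirmed address, or None if unknown or expired."""
        entry = self.pending.pop(token, None)
        if entry is None:
            return None
        address, expires = entry
        return address if self.clock() <= expires else None

    def purge_expired(self):
        """Discard registrations that were never confirmed in time."""
        now = self.clock()
        for token in [t for t, (_, exp) in self.pending.items() if exp < now]:
            del self.pending[token]
```

No address syntax is checked at all here: an address counts as valid once someone clicks the link that was mailed to it.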

WTF? (1)

allo (1728082) | about 3 months ago | (#47613719)

Isn't this something that was introduced years ago?
