The Internet Technology

First Non-Latin TLDs Go Online Today 302

eldavojohn writes "ICANN today switched on the country code top level domains for Egypt, Saudi Arabia, and the United Arab Emirates, which are the first non-Latin TLDs available and are also fully readable right to left. Slashdot does not support them but you can find the TLDs in the BBC article. ICANN said it had 21 more requests for TLDs in 11 different languages. A quick note — if you do not have the language packs installed, you may experience unpredictable browser behavior in the URL bar. Right now countries like China and Thailand have implemented workarounds to achieve the same effect."
This discussion has been archived. No new comments can be posted.

  • Really? (Score:4, Funny)

    by Tekfactory ( 937086 ) on Thursday May 06, 2010 @11:03AM (#32111482) Homepage

    China and Thailand have implemented workarounds to achieve unpredictable browser behavior in the URL bar?

    • Re: (Score:2, Funny)

      by Anonymous Coward

      Yeah, like www.bankofamerica.com.secure.cn

      • Re:Really? (Score:5, Funny)

        by AnonymousClown ( 1788472 ) on Thursday May 06, 2010 @11:14AM (#32111590)

        Yeah, like www.bankofamerica.com.secure.cn

        I have a better one: www.bankofamerica.com.secure.ru.

        It has an algorithm that predicts what expenses you will have in the near future and withdraws your money and puts it in a safe account that's unknown to you so that you don't spend it. They did that to me - took all my money out - and then all I have to do is send them an email and they'll pay my bills - all for a $19.95 monthly service fee on my CC. I can't lose!

        This is my first month on this program, so I'll let you guys know how it works.

        • Re:Really? (Score:5, Interesting)

          by CastrTroy ( 595695 ) on Thursday May 06, 2010 @11:41AM (#32111912)
          I got a better one. www.bankofamerica.com. See, I used Unicode character 212e instead of the e. Looks the same to most people, and would probably fool quite a few people. I wonder how they hope to stop situations like this. (I actually used an e, because slashdot wouldn't let me put in the HTML entity, but this is good enough to demonstrate the problem)
          • Re:Really? (Score:5, Funny)

            by PopeRatzo ( 965947 ) * on Thursday May 06, 2010 @11:48AM (#32111986) Journal

            I actually used an e, because slashdot wouldn't let me put in the HTML entity, but this is good enough to demonstrate the problem

            So, if you only could have done it, you might have done it.

            Now I'm really scared.

          • Re:Really? (Score:4, Insightful)

            by phantomcircuit ( 938963 ) on Thursday May 06, 2010 @11:57AM (#32112092) Homepage

            Just for anybody who is interested and lazy... javascript:alert(unescape("http://www.bankofam%u212ercia.com"))

            It doesn't look exactly like 'e', but it's certainly close enough to fool some people.
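For anyone who wants to see what is actually hiding in a lookalike hostname like the one above, here is a minimal Python sketch. It only assumes the standard `unicodedata` module; the function name `non_ascii_report` and the sample hostname are made up for illustration, not taken from any browser.

```python
import unicodedata

def non_ascii_report(hostname):
    """Return (index, code point, Unicode name) for every non-ASCII character."""
    report = []
    for index, char in enumerate(hostname):
        if ord(char) > 127:
            # unicodedata.name() falls back to "UNKNOWN" for unnamed code points
            name = unicodedata.name(char, "UNKNOWN")
            report.append((index, f"U+{ord(char):04X}", name))
    return report

# Hypothetical spoof: U+212E stands in for the first Latin "e"
spoofed = "www.bankofam\u212erica.com"
for entry in non_ascii_report(spoofed):
    print(entry)
```

A plain ASCII hostname produces an empty report, so anything this prints is a character worth a second look.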

            • Re: (Score:3, Insightful)

              by Lumpy ( 12016 )

              Do that to ebay.com paypal.com, etc.... It opens up a world of unholy hell for all the scammers on this planet to make it even harder to determine if a site is real or fake....

              Thanks ICANN!

              • by DrYak ( 748999 ) on Thursday May 06, 2010 @01:33PM (#32113630) Homepage

                This [mashable.com] has been done for a long time (spelling PayPal with similar-looking Cyrillic characters, i.e. "raura" but in Cyrillic, or "eVau" for "eBay").
                Most browsers [mozillazine.org] circumvent it by either displaying the escaped characters (a.k.a. Punycode) [wikipedia.org] or by using a different colour to tag non-Latin characters (don't know which browser uses this technique).

                The difference now is that the top-level domain, too, can be written in non-Latin characters.

                i.e.: up until now, the hacks only spelled "PayPal" with similar-looking Cyrillic characters. Starting today, a new TLD could be created which looks like "com" but is instead Cyrillic ("som" in this instance).

                Browsers will simply react by showing the escaped form or flagging the letters with a different colour.
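To make "showing the escaped form" concrete, here is a rough Python sketch using the built-in `punycode` codec. The helper `escaped_form` is a made-up name, the Cyrillic "сом" TLD is the hypothetical one from the comment above (not a real delegation), and the sketch skips the Nameprep normalization that a real IDNA implementation performs before encoding.

```python
def escaped_form(hostname):
    """Encode each non-ASCII label as 'xn--' + Punycode; leave ASCII labels alone."""
    labels = []
    for label in hostname.split("."):
        if label.isascii():
            labels.append(label)
        else:
            # Simplification: no Nameprep / case folding is applied here
            labels.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(labels)

latin = "paypal.com"
lookalike = "paypal.\u0441\u043e\u043c"  # Cyrillic es, o, em: renders like "com"

print(escaped_form(latin))      # unchanged, it is already ASCII
print(escaped_form(lookalike))  # the TLD becomes an xn-- label
```

The two names may look alike on screen, but their escaped forms differ, and that difference is what the browser surfaces to the user.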

            • Re:Really? (Score:4, Funny)

              by vux984 ( 928602 ) on Thursday May 06, 2010 @12:16PM (#32112326)

              Ironically, "http://www.bankofamercia.com" is probably close enough to fool some people too, and doesn't require any fancy javascript. ;)

          • Re: (Score:3, Informative)

            by OnlyJedi ( 709288 )

            The solution is already, for the most part, in place, and occurs at the browser level. Most of the browser vendors have known about this for years [wikipedia.org] (since 2005 I believe) and implement a combination of whitelists, phishing filters, and Punycode [wikipedia.org] to avert the problem.

            Other possibilities they could add are highlighting the background of any URL not in the user's native character set, or that uses characters of different sets, writing those suspect characters in bold, or popping up a security dialog. The problems

        • Isn't that Bank of Nicolai [youtube.com]

          I have pen!

        • Re:Really? (Score:5, Funny)

          by boxwood ( 1742976 ) on Thursday May 06, 2010 @12:25PM (#32112436)

          I have an even better one: www.bankofamerica.com

          It has an algorithm that increases my taxes and deposits the money into their executive bonus pool.

    • Re: (Score:3, Funny)

      Yeah, they are a little behind. They just implemented AOL keywords.

  • by sunderland56 ( 621843 ) on Thursday May 06, 2010 @11:05AM (#32111492)

    Slashdot does not support them

    It is now possible to get a domain that cannot be slashdotted!

  • Thats all good (Score:3, Insightful)

    by Johnny Fusion ( 658094 ) <(zenmondo) (at) (gmail.com)> on Thursday May 06, 2010 @11:05AM (#32111494) Homepage Journal
    But they will still need Latin characters to type "http://"
  • Non-latin TLDs? (Score:4, Insightful)

    by kvezach ( 1199717 ) on Thursday May 06, 2010 @11:05AM (#32111496)
    Well, hooray for a more fragmented Internet. While every keyboard can type A-Za-z, that's not true of Chinese or Arabic, so sites using those TLDs will be effectively off-limits to those that aren't "native". Sure, the sites can also register an ordinary domain name, but then why not just use that domain name to begin with?
    • Re:Non-latin TLDs? (Score:5, Interesting)

      by Just Some Guy ( 3352 ) <kirk+slashdot@strauser.com> on Thursday May 06, 2010 @11:11AM (#32111560) Homepage Journal

      While every keyboard can type A-Za-z, that's not true of Chinese or Arabic, so sites using those TLDs will be effectively off-limits to those that aren't "native".

      For now, I hope so. Imagine a RTL domain name, coupled with a phishing email telling recipients to visit moc.tfosorcim.[NEWGTLD] that renders as [NEWGTLD].microsoft.com. Won't that be fun?

      • Re:Non-latin TLDs? (Score:5, Insightful)

        by C0vardeAn0nim0 ( 232451 ) on Thursday May 06, 2010 @11:52AM (#32112036) Journal

        this is a boon for russian scammers.

        some letters in russian cyrillic look like latin characters but have different uses. example, the cyrillic character that looks like a "C" is actually equivalent to "S", their "H" is actually our "N". so a TLD ".som" in cyrillic would be seen on the screen (and understood by westerners) as ".com".

        so here's my suggestion to firefox developers: put some easy to see visual clue on the address bar to tell exactly in which language or character set the URL is written in.
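A rough idea of what such an address-bar hint could be computed from, sketched in Python. This is a heuristic and not how any browser does it: it guesses the script from the first word of each character's Unicode name (via the standard `unicodedata` module), and the names `scripts_in` and `script_hint` are invented for this example.

```python
import unicodedata

def scripts_in(label):
    """Guess the scripts used in one label from each character's Unicode name."""
    scripts = set()
    for char in label:
        if char.isascii():
            if char.isalpha():
                scripts.add("LATIN")
            continue  # ASCII digits and hyphens carry no script information
        name = unicodedata.name(char, "")
        # Heuristic: names look like "CYRILLIC SMALL LETTER ES"
        scripts.add(name.split(" ")[0] if name else "UNKNOWN")
    return scripts

def script_hint(hostname):
    """Map every label of the hostname to a sorted list of guessed scripts."""
    return {label: sorted(scripts_in(label)) for label in hostname.split(".")}

print(script_hint("paypal.com"))
print(script_hint("paypal.\u0441\u043e\u043c"))  # Cyrillic lookalike of "com"
print(script_hint("p\u0430ypal.com"))            # one Cyrillic "a" mixed in
```

A label that reports more than one script, or a script the user does not read, is the case the visual clue would need to flag.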

      • Heck, I'm just waiting for people to put up those domains and then push out trojans under the guise of "font packs".

      • It would be strange for (in that case) some Arabic TLD to 1) render correctly in a browser of a non-Arabic speaker, and 2) to have Latin characters in the domain name itself and 3) still manage to look correct in RTL scripts.
    • Re: (Score:3, Insightful)

      by QuantumRiff ( 120817 )

      Everyone with a Western keyboard can type A-Z and a-z. Not so with other countries' keyboards. (btw, you can still type the unicode characters in windows, it's just much more difficult). But really, if they have a Chinese language URL, and a site that is entirely written in Chinese, are they worried about not having you as a potential customer, when you can't figure out how to connect using their language?

      There are more people online in China than live in the US. This is going to be awesome for their local

    • Re: (Score:3, Interesting)

      by sznupi ( 719324 )

      When was the last time you had to type in a relatively unknown URL? (not things like google, gmail, your bank, etc.)

      For that matter, when was the last time you had to type a URL of a site in a language which is off-limits to you anyway?...

      This might help greatly in popularization of the internet in large part of so called "developing countries", especially since the biggest changes can be expected when the common folks get hang of it; they are much more likely to be fluent only in their native language an

      • This might help greatly in popularization of the internet in large part of so called "developing countries", especially since the biggest changes can be expected when the common folks get hang of it; they are much more likely to be fluent only in their native language and script.

        And seeing that most minority languages adopt the Latin alphabet these days, how is the support of Arabic script going to help them?

    • If only someone could implement a system that points from one page to another, a "link" between them if you will. Maybe some text or image where the user could click and be redirected to some other page. Would that be even possible?

      Now more seriously, how often do you type the URL for a site in a language you can't speak? Even if you do that sometimes, I'd say you would be able to get there by a Google search written in your native charset. I don't see this as a big issue.

    • Re: (Score:2, Interesting)

      by wvmarle ( 1070040 )

      This way lots of non-English speakers, or even users of non-Latin alphabets, can use the Internet much better than they could before. Only half of the world uses Latin. So the other half is more or less excluded because domain names have those limitations, so just to be able to use the Internet they first have to learn a foreign script (a phonetic script - Chinese for example is not phonetic, so that in itself is a huge challenge for a Chinese to learn).

      But you probably never set foot outside of your own c

      • Re: (Score:3, Informative)

        by raju1kabir ( 251972 )

        So the other half is more or less excluded because domain names have those limitations, so just to be able to use the Internet they first have to learn a foreign script (a phonetic script - Chinese for example is not phonetic, so that in itself is a huge challenge for a Chinese to learn).

        Actually, the way that most Chinese people type on the computer is using a Latin keyboard to type pinyin phonetics. So they've already learned it whether or not they are using the internet. This is not going to change with

  • Seriously? (Score:2, Insightful)

    by elewton ( 1743958 )
    Is it chauvinistic that I find this insane?

    I wouldn't mind if they used an escape character sequence and then mapped other alphabets to strings of Latin characters, but actually breaking backwards compatibility...

    • Re:Seriously? (Score:5, Informative)

      by Anonymous Coward on Thursday May 06, 2010 @11:10AM (#32111548)

      they didn't break backwards compatibility,
      here's the brilliant standard http://en.wikipedia.org/wiki/Punycode
      it's just awesome.

    • Re: (Score:3, Funny)

      by Hognoxious ( 631665 )

      I think unicode is a bag of shit. If ISO 8859-1 was good enough for Homer, Jesus, & Shakespeare it's good enough for everyone.

    • Re:Seriously? (Score:5, Informative)

      by tlhIngan ( 30335 ) <slashdot.worf@net> on Thursday May 06, 2010 @11:39AM (#32111888)

      Is it chauvinistic that I find this insane?
      I wouldn't mind if they used an escape character sequence and then mapped other alphabets to strings of Latin characters, but actually breaking backwards compatibility...

      Except there *IS* an escape sequence. And the actual representation is in standard latin alphabets.

      The reason is that browsers can detect the escape sequence and interpret the rest of the URL as a unicode string.

      The escape is "xn--" - domains using it have xn--domain, TLDs as xn--TLD. Use both and they both have to be escaped - xn--blah.xn--blahtld.

      The trick for the Rest of Us is to be able to set that as "off" by default to keep these xn-- sequences from looking like normal latin characters. The good news is the encoding is such that Paypal and the like don't get rendered as xn--paypal.com and such, but xn--junk_that_renders_as_paypal.com.

      Internationalized domain names have been around a few years. This is just an internationalized TLD using the same DNS-friendly encoding scheme.
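A small Python sketch of going the other way, from the DNS-friendly "xn--" form back to Unicode, using the built-in `punycode` codec. The helper names are made up for illustration, and the sketch leaves out the validation steps of the full ToUnicode procedure; `xn--wgbh1c` is the Egyptian TLD label that appears elsewhere in this discussion.

```python
ACE_PREFIX = "xn--"

def decode_label(label):
    """Decode one 'xn--' label to Unicode; return other labels unchanged."""
    lowered = label.lower()
    if lowered.startswith(ACE_PREFIX):
        return lowered[len(ACE_PREFIX):].encode("ascii").decode("punycode")
    return label

def decode_hostname(hostname):
    """Apply decode_label to every dot-separated label."""
    return ".".join(decode_label(label) for label in hostname.split("."))

# "example" is just a placeholder second-level label
print(decode_hostname("example.xn--wgbh1c"))
print(decode_hostname("xn--bcher-kva.de"))
```

Because every label is handled on its own, a hostname can mix escaped and plain labels, which matches the "use both and they both have to be escaped" point above.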

  • Why not post example (Score:4, Interesting)

    by grahamm ( 8844 ) <gmurray@webwayone.co.uk> on Thursday May 06, 2010 @11:10AM (#32111552) Homepage

    Why did the BBC article not include a link to a valid non-latin URL so we could see how our browsers cope? Even if the page is not understandable, it would be nice to know that the pages load.

  • by Unka Willbur ( 1771596 ) on Thursday May 06, 2010 @11:12AM (#32111566)
    Ridiculous tribalism, that's all it is. Fragmentation of the Internet to appease some regressive, regional e-peenery is the stupidest idea to date. I speak 8 languages and love some, like Russian immensely, but the internet is a nation with its own language, and that language is Standard English [wikipedia.org]. I call shenanigans on anything else being shoehorned into its basic infrastructure!
    • currently people are not getting on the internet because it's all in english: it serves as a barrier and they see no reason to even try

      but when the internet supports their native language, they get on the internet, get a taste of it, like it, want to use more of it, and inevitably this drives them to the english web, since there's more of whatever they're looking for over there

      in other words, the long term effect of supporting other languages on the web is paradoxically further and faster consolidation to english

      • by wvmarle ( 1070040 ) on Thursday May 06, 2010 @12:27PM (#32112452)

        Not necessarily.

        Sometime back in the '90s the web site www.ilse.nl was founded: a search engine that would index Dutch language sites only. Very useful for those that speak Dutch. Luckily for us Dutch we don't need a different character set, we can get along with the limits of Latin just fine. When it came to searching for Dutch language sites it was the number one choice. And there are a few more: www.startpagina.nl is another very popular one.

        Wikipedia comes in lots of languages - but I have never heard anyone here shout "fragmentation! Less freedom!" about that. Even though most of those other languages are inaccessible to them. But then English is inaccessible for a large part of the world, and the vast majority of people still prefers to use their native language. And that preference continues online. Even though they may be proficient in English.

        I can actually imagine that the English language in the long term becomes a minority language. After all there are more native Chinese speakers than native English speakers in this world. There are probably even more non-English speakers than there are English speakers, and that's counting more than just native English speakers.

    • Re: (Score:2, Insightful)

      by Anonymous Coward

      English? WTF? LOL!

    • by Anonymous Coward on Thursday May 06, 2010 @11:27AM (#32111758)

      When the 7 billion people around the globe are all speaking Standard English, then you may have a point. Until then I think it is everybody's right to use his/her native/preferred language on the Internet, including in TLDs. I speak 5 languages and Arabic is my native language and I think that today is a great day for the Internet.

    • by melikamp ( 631205 ) on Thursday May 06, 2010 @11:30AM (#32111792) Homepage Journal

      but the internet is a nation with its own language

      Yeah, but it's not English, it's TCP/IP. And DNS is not even an integral part of the Internet, but rather a layer on top, used mostly for the WWW part. Many peer-to-peer applications would work just fine even if DNS was never created.

      • DNS is not even an integral part of the Internet, but rather a layer on top, used mostly for the WWW part. Many peer-to-peer applications would work just fine even if DNS was never created.

        Except a lot of peer-to-peer applications depend on WWW for discovery. You still need DNS to download the client or a recent list of well-known hosts.

      • Its purpose is to advertise your service. Even 4 billion is a large search space for humans.

        It all started with hosts files. Updating and distributing a flat file listing everyone who wanted to run services, so that other people could use them easily (name vs address), did not scale, so DNS was invented to allow people to advertise their serv(ers|ices) all by themselves.

        If people want to create their own little unusable fiefdoms. Go right ahead.
         

      • Actually the vast majority of P2P applications use the DNS system to bootstrap into the network.

        DNS is absolutely vital to the functioning of the Internet as it exists today.

    • I see some problems cropping up in the future.

      Imagine a domain like BankOfAmerica.com - only one of the letters is non-Latin, yet similar looking. Links look OK, address bar looks OK.

      Just say'n - there's going to be bad guys exploiting this.

      • Re: (Score:3, Insightful)

        by TheRaven64 ( 641858 )
        This is a UI problem, and it's pretty easy to fix. Simply display punycode URLs in a different colour, such as red. Some browsers do this already. Punycode isn't new; it's been supported for second-level domains for a long time. The only new thing here is that some ccTLDs are now using Punycode for the top-level part as well as the subdomains.
    • by eldavojohn ( 898314 ) * <eldavojohn@noSpAM.gmail.com> on Thursday May 06, 2010 @11:36AM (#32111844) Journal

      Ridiculous tribalism, that's all it is.

      Well, then as the submitter, I regret tagging it with "culture."

      I speak 8 languages and love some, like Russian immensely, but the internet is a nation with its own language, and that language is Standard English [wikipedia.org]. I call shenanigans on anything else being shoehorned into its basic infrastructure!

      Huh, as a developer I had always assumed that we wrote software to help people. Not that people changed their behaviors and customs to be able to use our software. I guess I was wrong. I find it disturbing that a polyglot like yourself can so easily dismiss an engineering challenge as "ridiculous" and "shenanigans" because all it takes to get around it is for everyone in the world to learn my language of takes.

      I find it humorous that we sit here and rail for interoperability and satisfying the consumer and no DRM and open standards ... only to turn around and call something that opens up the internet to the rest of the world "ridiculous."

      If this is the consensus among geeks, what a shame it is to be a geek.

      Where do you stand on the effort that went into the Linux language packs? Were those ridiculous tribalism as well when someone took the time to make them?

    • by Cyberax ( 705495 )

      No, support for local languages is good.

      However, the way it's being added is technically bad.

    • I'd rather see a character translation method. I should be able to type CNN in my native language and, once put into a URL bar, it will translate it to cnn.com and move forward. There are plenty of URLs in different languages, but as far as I know they're all the latin characters (abcdef, etc).

    • stultorum calami carbones moenia chartae
    • the internet is a nation with its own language, and that language is Standard English

      So by your logic, no-one should be allowed to write emails, or post to forums, or compose web pages in anything other than English?

    • Re: (Score:3, Insightful)

      by jmv ( 93421 )

      Ridiculous tribalism, that's all it is. Fragmentation of the Internet to appease some regressive, regional e-peenery is the stupidest idea to date.

      Maybe if DNS addresses were based on Chinese, Hindi or Arabic then you'd have a different opinion.

  • So now we have mention of a new website. Slashdot cites its shortcomings as unable to display a link to the site and the article has no link.

    You can find a link by following here:
    http://www.google.com/search?hl=en&source=hp&q=http://xn--4gbrim.xn----rmckbbajlc6dj7bxne2c.xn--wgbh1c/ar/default.aspx [google.com]

  • by drumcat ( 1659893 ) on Thursday May 06, 2010 @11:19AM (#32111644)
    Guess what -- this will all get blocked. More fragmentation = less free internet. Here comes Sharia law that says all internet usage must be in Farsi, and all websites with latin endings will be blocked. Weak.
    • Re: (Score:3, Insightful)

      Guess what -- this will all get blocked. More fragmentation = less free internet. Here comes Sharia law that says all internet usage must be in Farsi, and all websites with latin endings will be blocked. Weak.

      No, the sharia will be that all internet usage must be in Arabic since that is the only language the Koran can be in (if it isn't in Arabic, it isn't the Koran according to Muslims).

    • I'm pretty sure that if Sharia specified a language to be used, I'm pretty sure it would be Arabic, smart guy.

  • by MoellerPlesset2 ( 1419023 ) on Thursday May 06, 2010 @11:20AM (#32111650)
    For the inhabitants of Mönsterås [monsteras.se], Sweden.
    The town name means 'patterned ridge', but to date they've had to put up with the domain "Monsteras" - which means "monster-carcass".
    (å, ä, ö/ø in the Scandinavian languages are considered to be their own unique characters, not accented 'a's and 'o's.)
    • Re: (Score:2, Informative)

      by Fenris Ulf ( 208159 )

      That's a hostname (which is already supported via IDN, such as http://xn--malmpeeps-37a.se/ [xn--malmpeeps-37a.se] ), this story is talking about TLDs.

      There's no technical reason Mönsterås can't have mönsterås.se

    • Å, ä and ö have been usable in .se domain names for quite some time actually, it's just that support for it in various development tools is absolutely pathetic (unless you want to change your language and library choice solely based on which one has support for that specific feature).

    • by sootman ( 158191 ) on Thursday May 06, 2010 @12:25PM (#32112442) Homepage Journal

      Sweet! Now I can finally get my site, "anosfeliz.es" off blacklists! ;-)

      (For the non-Spanish speakers out there: "feliz" means happy (and adjectives come after nouns, like Rio Grande [big river] or El Camino Real [the royal road]), "años" means "years" and "anos" means "anus." So instead of "happy years" it would be... something else entirely.)

  • My browser has had support for Mojibake [wikipedia.org] encoding for years.
  • Now to axe the latin protocol prefix, colon, slashes and dots. Also, what about those with disabilities - it is visual after all. We need "thought domains" - but wait, what about those with impaired mental capacities? Domains by intuition would work. But what about parallel universes! Argh.
  • Preventing similar characters from being used to make one domain look like another.

     

    • by Qzukk ( 229616 )

      Preventing similar characters from being used to make one domain look like another.

      There wasn't one before (remember paypai.com?) there isn't one now.

  • Isn't that a possible hacking vector?

  • So when is /. going to allow non-Latin characters? I cannot even say ! What gives?
  • by ral ( 93840 )
    The site in the ICANN blog [icann.org] worked for me in both Safari and Firefox, in the Windows XP and OSX versions of both. Both Safari and Firefox showed Arabic in the text on the tab, but only Safari showed Arabic in the address bar.
  • How it works (Score:5, Interesting)

    by jroysdon ( 201893 ) on Thursday May 06, 2010 @12:31PM (#32112480)

    As I maintain my own DNS servers and such, I was curious how this worked. Here's what I learned with 15 minutes of research:

    My first stop was to see the root.zone [internic.net] and I looked for these new TLDs, curious to see how they would show up in a Latin-based zone file. Ah, I spotted these odd XN-- zones and then knew what to dig into more.

    Take for instance (I pasted a Unicode domain, but Slashcode won't show it) which is handled by ns[1-3].dotmasr.eg.:

    $ dig ns (Unicode domain)

    ; <<>> DiG 9.6.2-P1-RedHat-9.6.2-3.P1.fc12 <<>> ns (Unicode domain)
    ;; QUESTION SECTION:
    ;(Unicode domain). IN NS
    ;; ANSWER SECTION:
    (Unicode domain). 3600 IN NS ns1.dotmasr.eg.
    (Unicode domain). 3600 IN NS ns2.dotmasr.eg.
    (Unicode domain). 3600 IN NS ns3.dotmasr.eg.

    If you look in the root.zone file, you will see that the ASCII/Latin version of this zone is really XN--WGBH1C.:
    XN--WGBH1C. NS NS1.DOTMASR.EG.
    XN--WGBH1C. NS NS2.DOTMASR.EG.
    XN--WGBH1C. NS NS3.DOTMASR.EG.

    TLD Reserved Domains [wikipedia.org] has a list of the current mappings. ToASCII and ToUnicode [wikipedia.org] are the methods to convert back and forth which links to RFC 3490 [ietf.org] which has the nitty gritty details.
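As a sketch of how the root.zone lines and the Unicode form relate, the following Python groups the NS records quoted above by TLD and decodes the owner name with the built-in `punycode` codec. It assumes the simplified three-field layout shown in this comment (the real root zone file also carries TTL and class fields), and the function names are invented for the example.

```python
from collections import defaultdict

# The three records quoted above, in the simplified "owner NS target" layout
ZONE_LINES = """\
XN--WGBH1C. NS NS1.DOTMASR.EG.
XN--WGBH1C. NS NS2.DOTMASR.EG.
XN--WGBH1C. NS NS3.DOTMASR.EG.
"""

def nameservers_by_tld(zone_text):
    """Collect NS targets per owner name; DNS names compare case-insensitively."""
    servers = defaultdict(list)
    for line in zone_text.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[1].upper() == "NS":
            owner = fields[0].rstrip(".").lower()
            servers[owner].append(fields[2].rstrip(".").lower())
    return dict(servers)

def to_unicode(label):
    """Decode an 'xn--' label; a rough stand-in for the ToUnicode operation."""
    if label.startswith("xn--"):
        return label[4:].encode("ascii").decode("punycode")
    return label

for tld, hosts in nameservers_by_tld(ZONE_LINES).items():
    print(tld, to_unicode(tld), hosts)
```

The zone file itself never needs anything beyond ASCII; the Unicode form only exists at the presentation layer.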

    (meh, Slashcode doesn't support Unicode encoding, but I can see the Unicode domain name I am pasting in before I hit Preview in Firefox)

    Also, the whole switching from left to right in Latin characters to right to left in some Unicode is odd when trying to edit!
