
Getting Unicode Character Codes in JavaScript?

Cliff posted more than 12 years ago | from the tricky-problems dept.

Java 26

jargonCCNA asks: "I've searched high and low across the web, but I can't seem to be able to find any code snippets or even anything that'll help me out here. I'm trying to get a Unicode character code from a data stream in JavaScript and there doesn't seem to be anything out there to help me; JavaScript itself only has onboard support for ISO-Latin_1, or something. I tried hacking my own converter code, but it's rife with errors. Anybody know of some code that I can include in a GPL project?"

"Here's the buggy code, if you're interested:

function unicode2hex( unicode )
{
    var hexString = "";

    for( var i = 0x0000; i <= 0xFFFF; i++ )
    {
        test = eval( "\\u" + i );

        if ( unicode == test )
        {
            hexString += i / 4096;
            hexString += i / 256;
            hexString += i / 16;
            hexString += i % 16;
            hexString += "";

            return hexString;
        }
    }

    return false;
}
"Mozilla's JavaScript console lets me know that '\u0' is an illegal character. I think this would work if I could make it use the string "0000" instead of the number 0 for i.

Just for reference -- I've seen a lot of people get nailed on Ask /. because they didn't do the proper research before asking their question. Google has failed me; I've been trying to figure this out on my own for about a month. I hope someone can shed some light on my situation."
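For reference, the console error arises because "\\u" + i produces the text \u0, which is not a complete escape sequence, and lines such as hexString += i / 4096 append decimal fractions rather than hex digits. The following is only a sketch of the same brute-force idea, assuming String.fromCharCode is used to build each candidate character instead of eval; the name unicode2hexBruteForce is made up for illustration.

```javascript
// Hypothetical rewrite of the brute-force search from the question.
// String.fromCharCode(i) builds the character for code unit i directly,
// so no eval() or "\u" escape text is needed.
function unicode2hexBruteForce(ch) {
  var digits = "0123456789ABCDEF";
  for (var i = 0x0000; i <= 0xFFFF; i++) {
    if (ch === String.fromCharCode(i)) {
      // Math.floor keeps each quotient an integer; % 16 picks one hex digit.
      return digits.charAt(Math.floor(i / 4096) % 16) +
             digits.charAt(Math.floor(i / 256) % 16) +
             digits.charAt(Math.floor(i / 16) % 16) +
             digits.charAt(i % 16);
    }
  }
  return false; // input was not exactly one UTF-16 code unit
}
```

The loop is still wasteful, since it tests up to 65,536 candidates per call; it is shown only to make the original approach run.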


26 comments


One question (5, Funny)

Henry V .009 (518000) | more than 12 years ago | (#3995289)

How did this story get past the lameness filter?

Re:One question (2)

Real World Stuff (561780) | more than 12 years ago | (#3995322)

Desperation for Quality content.

Story submissions (1)

yerricde (125198) | more than 12 years ago | (#4005723)

How did this story get past the lameness filter?

Stories are probably not subject to the lameness filter (or at least they have looser filters) because an editor must approve each story by hand.

That said, I have a possible (untested) solution: Try changing each += in the inner loop to a +=""+ to force the strings to be concatenated rather than treated as numbers.

Ask the Experts (1)

Tux2000 (523259) | more than 12 years ago | (#3995327)

Ask the Experts at http://selfforum.teamone.de [teamone.de] . It's a German forum, but most people there can read and write English as well. The SelfForum is related to SelfHTML, which is famous (at least here in Germany). Just copy and paste your question there.

Isn't this a question for developer.net? (2)

displague (4438) | more than 12 years ago | (#3995390)

What's the deal? Cliff must have hit the "Accept" instead of the "Reject" button by accident.

Try asking your question in IRC before hitting up "Ask Slashdot."

A search on Google for unicode and javascript brings back a lot of positive-looking results without actually delving into them. It seems like JS1.5 has support for this (from the Google summaries).

Re:Isn't this a question for developer.net? (1)

jargonCCNA (531779) | more than 12 years ago | (#3995617)

A search on Google for unicode and javascript brings back a lot of positive-looking results without actually delving into them.

Yeah, positive looking. That's the thing. Looks are exceedingly deceiving on a search engine. Try actually delving in; I can almost guarantee that it won't convert Unicode characters to their character codes.

Now for the answer.. (1)

displague (4438) | more than 12 years ago | (#3995406)

Ok, I got my "Second Post" in.. Now here's the good answer.

document.write("\u00A9 Netscape Communications" );

I just did that in Galeon and it works fine...

See - http://developer.netscape.com/docs/manuals/js/core/jsguide15/ident.html#1009690

Re:Now for the answer.. (1)

Lazarus Short (248042) | more than 12 years ago | (#3995498)

That's great, except that it does the opposite of what he wants. He seems to want a function that'll turn the copyright sign to "00A9".

Re:Now for the answer.. (1)

displague (4438) | more than 12 years ago | (#3995818)

Ahhh.. You're right, I'm wrong... But I'll repeat the truly correct answer as I have already lured someone down the wrong path:

document.write("\u00A9".charCodeAt(0));

That provides the decimal, then you just have to convert to hex.

var hexChars = "0123456789ABCDEF";

// Handles 0-255 only (two hex digits).
function Dec2Hex (Dec) {
    var a = Dec % 16;
    var b = (Dec - a) / 16;
    var hex = "" + hexChars.charAt(b) + hexChars.charAt(a);
    return hex;
}

Blatantly ripped off from here [internet.com]
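As posted, Dec2Hex emits two hex digits, so it covers only 0 to 255, while charCodeAt can return values up to 65535. One possible generalisation, sketched here with the hypothetical name dec2hex4 and assuming a non-negative integer input, repeats the divide-by-16 step until nothing is left and then pads to four digits:

```javascript
// Hypothetical generalisation of Dec2Hex: repeated division by 16,
// left-padded with zeros to at least four digits.
function dec2hex4(dec) {
  var hexChars = "0123456789ABCDEF";
  var hex = "";
  do {
    hex = hexChars.charAt(dec % 16) + hex; // lowest digit first, prepended
    dec = Math.floor(dec / 16);
  } while (dec > 0);
  while (hex.length < 4) {
    hex = "0" + hex;
  }
  return hex;
}
```

With this, dec2hex4("\u00A9".charCodeAt(0)) gives "00A9".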

cliff cliff cliff.... (0)

Anonymous Coward | more than 12 years ago | (#3995546)

Why the hell did you let someone place this story under the topic JAVA?? JAVA != JAVASCRIPT. They're two completely different things. This story is a flat-out troll.

Wrong topic Cliff, you cockfoster (-1)

BlackTriangle (581416) | more than 12 years ago | (#3995581)

Javascript is not Java. Lick a dick.

Re:Wrong topic Cliff, you cockfoster (1)

jargonCCNA (531779) | more than 12 years ago | (#3995654)

Too bad there's no JavaScript topic, eh there chico?

Lick your own.

Straight to the source! (1)

yancey (136972) | more than 12 years ago | (#3995611)


Why don't you ask the Mozilla developers that are working on JavaScript 2.0?

Did you try looking at the docs? (5, Informative)

Lazarus Short (248042) | more than 12 years ago | (#3995620)

No offense, but I haven't used JS in years, and I found this in a matter of minutes.

document.write("\u00A9 is ");
document.write("\u00A9".charCodeAt(0));

That will give you the answer in decimal. I trust you can convert to hex yourself.

(Note: Requires Javascript 1.3; previous versions used ISO-Latin-1 rather than unicode, and I don't know what they'd do with a character higher than 255.)
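For the hex step, a short sketch, assuming the built-in Number.prototype.toString(16) is acceptable for the conversion; the helper name charToHex and the choice to return an empty string for an out-of-range index are illustrative only:

```javascript
// Hypothetical helper: four-digit hex code of the UTF-16 code unit
// at position `index` (default 0) in `str`.
function charToHex(str, index) {
  var code = str.charCodeAt(index || 0);
  if (isNaN(code)) {
    return ""; // charCodeAt returns NaN when the index is out of range
  }
  var hex = code.toString(16).toUpperCase();
  while (hex.length < 4) {
    hex = "0" + hex; // pad to the familiar "00A9" form
  }
  return hex;
}
```

For example, charToHex("\u00A9") returns "00A9".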

Re:Did you try looking at the docs? (1)

jargonCCNA (531779) | more than 12 years ago | (#3995638)

All right, you're officially The Most Helpful Person On Slashdot now.

I looked through all the documentation I could find; the only thing I found about charCodeAt() was that it uses ISO-Latin. But I think they also said it was JavaScript 1.2-specific.

Any idea what version of JavaScript IE6 emulates, and Mozilla actually uses?

Re:Did you try looking at the docs? (1)

Lazarus Short (248042) | more than 12 years ago | (#3995765)

Well, the example I used works as expected in IE 5.0, NS 4.7, and Moz 1.1a.

(Similar code with characters outside the range of Latin-1 also works on both, though the browsers sometimes display the "no glyph for that" glyph: open box for IE, "?" for NS/Moz.)

Couldn't tell you what JS versions each browser actually uses, though.

Re:Did you try looking at the docs? (1)

Karma Farmer (595141) | more than 12 years ago | (#3996021)

I have no idea who decides what is officially JavaScript. I'm imagining an oracle sitting on a subway platform somewhere, eating a corndog and spouting off ziggyisms to anyone who will listen.

But, I'm assuming that IE will just use whatever version of JScript you happen to have installed on your machine. And, as far as I know, JScript really does follow the ECMAScript specification, which is a real spec, with standards bodies and the whole works, unlike "JavaScript", whatever that is, exactly.

Anyhow, take a look here [microsoft.com] to get a look at some of the features of the JScript interpreter hosted in some of your favorite applications.

Re:Did you try looking at the docs? (1)

Karma Farmer (595141) | more than 12 years ago | (#3996153)

Any idea what version of JavaScript IE6 emulates, and Mozilla actually uses?

IE6 doesn't emulate JavaScript. It uses JScript, which is Microsoft's implementation of the ECMA-262 Edition 3 language standard (ECMAScript). Similarly, JavaScript is Netscape's implementation of the same standard. Neither is "emulating" anything.

You can find the ECMAScript standard here: ECMA-262v3 [www.ecma.ch] . You can discover what your favorite vendor has actually implemented by consulting the mozilla [mozilla.com] or microsoft [microsoft.com] documentation for each vendor's implementation.

Re:Did you try looking at the docs? (1)

jargonCCNA (531779) | more than 12 years ago | (#3997048)

Oh, okay... The way I've understood it for years was that JScript was a sorta-cheap knockoff of JavaScript.. D'oops!

Re:Did you try looking at the docs? (0)

Anonymous Coward | more than 12 years ago | (#3996507)

If I were you I would feel incredibly stupid. You research something for a month and in under an hour get back someone who did nothing more than browse the documentation for "a few minutes", and who, let me add, hasn't used the technology in years. admit it, you haven't felt this dumb in ages....

Re:Did you try looking at the docs? (1)

jargonCCNA (531779) | more than 12 years ago | (#3996568)

If I were you, I'd not only use better grammar, but I'd identify myself. So somebody the results. Good for him. A lucky search.

Re:Did you try looking at the docs? (0)

Anonymous Coward | more than 12 years ago | (#3997478)

So somebody the results.
This sentance no verb. Anyway, he right, you just too indignant admit it. All these sentances no verb, for those too shortsited notice.

Re:Did you try looking at the docs? (1)

jargonCCNA (531779) | more than 12 years ago | (#4002945)

Hilarious. It's called a typographic error. Your satire would have been perfect had you spelled "sentence" and "shortsighted" correctly.

Re:Did you try looking at the docs? (0)

Anonymous Coward | more than 12 years ago | (#4010202)

No, his satire would have been perfect if he'd quoted a bit more creatively:
I were you, I'd [...]
use better grammar. So somebody the results. Good for him.

Re:Did you try looking at the docs? (2)

josepha48 (13953) | more than 12 years ago | (#3996551)

Here is something that will convert:
function tounicode(instr) {
    len = instr.length;
    switch (len) {
        case 1:
            return instr.charCodeAt(0);
        case 2:
            return new String(instr.charCodeAt(1)) + new String(instr.charCodeAt(0));
        case 3:
            return instr.charCodeAt(2) + instr.charCodeAt(1) + instr.charCodeAt(0);
        case 4:
            return instr.charCodeAt(3) + instr.charCodeAt(2) + instr.charCodeAt(1) + instr.charCodeAt(0);
    }
    return "";
}

document.write(tounicode("\u002d") + " " + tounicode("-") + "<br>");

With this you can take a string like "fooo" and get its Unicode equivalent.
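For strings of any length, a loop over charCodeAt may be simpler than a switch on the length. This is a sketch only, not the function above: the name stringToHexCodes and the space-separated, four-digit output format are assumptions made for illustration.

```javascript
// Hypothetical helper: hex codes of every UTF-16 code unit in `str`,
// in order, separated by spaces.
function stringToHexCodes(str) {
  var codes = [];
  for (var i = 0; i < str.length; i++) {
    var hex = str.charCodeAt(i).toString(16).toUpperCase();
    while (hex.length < 4) {
      hex = "0" + hex; // pad each code to four digits
    }
    codes.push(hex);
  }
  return codes.join(" ");
}
```

Calling stringToHexCodes("fooo") returns "0066 006F 006F 006F". Because charCodeAt works on UTF-16 code units, a character outside the 16-bit range would show up as two codes (a surrogate pair) rather than one.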

Hey, Cliff... (0)

Anonymous Coward | more than 12 years ago | (#3995978)

Slashdot is a nerd website. We know better than to think JavaScript is at all Java. Change that Coffee Cup Graphic, bud.