Machine Learning Used For JavaScript Code De-obfuscation

Soulskill posted about 4 months ago | from the cleaning-up-the-digital-streets dept.

Programming 31

New submitter velco writes: "ETH Zurich Software Reliability Lab announced JSNice, a statistical de-obfuscation and de-minification tool for JavaScript. The interesting thing about JSNice is that it combines program analysis with machine learning techniques to build a database of name and type regularities from large amounts of available open source code on GitHub. Then, given new JavaScript code, JSNice tries to infer the most likely names and types for that code by basing its decision on the learned regularities in the training phase."
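The idea in the summary, learning name regularities from existing code and then picking the most likely name for new code, can be illustrated with a deliberately tiny sketch. This is not JSNice's actual model or API: the context labels, the training pairs, and the `train` and `predictName` helpers are all hypothetical, and simple frequency counting stands in for the real statistical inference.

```javascript
// Toy illustration only; JSNice's real model is far richer than this.
// Hypothetical "training data": (usage context, original name) pairs.
const trainingPairs = [
  ["for-loop-counter", "i"],
  ["for-loop-counter", "i"],
  ["for-loop-counter", "index"],
  ["new-Date", "date"],
  ["new-Date", "date"],
  ["new-Date", "now"],
];

// Count how often each name was seen in each context.
function train(pairs) {
  const counts = new Map();
  for (const [context, name] of pairs) {
    if (!counts.has(context)) counts.set(context, new Map());
    const byName = counts.get(context);
    byName.set(name, (byName.get(name) || 0) + 1);
  }
  return counts;
}

// Return the most frequent name for a context, or the fallback
// (the minified name) when the context was never seen in training.
function predictName(counts, context, fallback) {
  const byName = counts.get(context);
  if (!byName) return fallback;
  let best = fallback;
  let bestCount = 0;
  for (const [name, count] of byName) {
    if (count > bestCount) {
      best = name;
      bestCount = count;
    }
  }
  return best;
}

const model = train(trainingPairs);
console.log(predictName(model, "new-Date", "a"));
console.log(predictName(model, "unknown-context", "b"));
```

A variable in a context the toy model has never seen keeps its minified name, which is one plausible reason heavily obfuscated input might see little improvement.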


Biggest beneficiary: Minecraft mods (-1)

Anonymous Coward | about 4 months ago | (#47160379)

We've been waiting on a mod API for going on four years now.

Re:Biggest beneficiary: Minecraft mods (2)

Anaerin (905998) | about 4 months ago | (#47160467)

  1. Minecraft is written in Java, not JavaScript.
  2. The MCP (Minecraft Coder Pack) already has a deobfuscator built in (kinda sorta)

Re:Biggest beneficiary: Minecraft mods (-1)

Anonymous Coward | about 4 months ago | (#47160737)

I heard Java is slower than JavaScript because Java has five fewer letters in the name.

Re:Biggest beneficiary: Minecraft mods (2)

wonkey_monkey (2592601) | about 3 months ago | (#47162479)

I hear there's a bug in the string length() method that miscounts by 1.

Re:Biggest beneficiary: Minecraft mods (0)

Anonymous Coward | about 3 months ago | (#47162583)

Yeah, but how seriously can you take "i". It's just a line with a dot. It's like some spastic tried to draw a straight line and then his pen jumped from the paper. It's not really a letter; it doesn't deserve to be counted.

On the other hand, the S... look at that S. That's some big fat S, guys! It's got two curves flowing into capital perfection. It's an S so nice, it should count twice!

So I guess you're right.

Re:Biggest beneficiary: Minecraft mods (0)

Anonymous Coward | about 3 months ago | (#47162183)

Wow there are still idiots who fall for that ancient and obvious trollbait? Hopefully your excuse is that you're autistic.

Re:Biggest beneficiary: Minecraft mods (0)

Anonymous Coward | about 4 months ago | (#47160553)

I hope to christ you are trolling.

Not de-obfuscation (0)

Anonymous Coward | about 4 months ago | (#47160405)

Minification is obfuscation. If you try running some _really_ obfuscated code through it, nothing really improves.

Re:Not de-obfuscation (0)

Anonymous Coward | about 3 months ago | (#47161349)

I am genuinely confused. If minification is obfuscation, and this thing de-minifies, how is that not de-obfuscation?

Re:Not de-obfuscation (0)

Anonymous Coward | about 3 months ago | (#47162219)

It may be able to de-minify by adding back some white spaces and indenting, which is the easy part because that's just based on the language syntax. I think the hard part is restoring the obfuscated vars, funcs, and objs to have meaningful names again.

Re:Not de-obfuscation (2)

hotcut (1289754) | about 3 months ago | (#47162419)

Yes, the hard part is getting meaningful names back - which is exactly what the article is about; they claim to have found a way to do it. Granted, I doubt how good it could possibly be - but on the other hand, it is an interesting project that may come to use.

Muhhahahah! (0)

Anonymous Coward | about 4 months ago | (#47160425)

Mine! All Mine! The javascript! It's all Mine!

So, what are you good at, JSNice? (2)

albacrankie (1017430) | about 4 months ago | (#47160449)

this and that

Needs Better Name (0)

Anonymous Coward | about 4 months ago | (#47160473)

Since this is pretty similar to duck typing, I nominate "UnDuck" since it can also be used as a verb. As in, "I was finally able to tell what that obfuscated JS library does after I unducked it."

Hahahaha! (4, Funny)

pigiron (104729) | about 4 months ago | (#47160483)

The development of tools like these started out of necessity for figuring out old COBOL code.

Re:Hahahaha! (2)

K. S. Kyosuke (729550) | about 4 months ago | (#47160861)

If DIVIDE X BY Y GIVING Z REMAINDER W is the minified version, I'm not sure I want to see the un-minified one!

Re:Hahahaha! (2, Funny)

Anonymous Coward | about 3 months ago | (#47161391)

That would be
    "DIVIDE REC-WORKER-TOTAL-ANNUAL-SALARY BY WS-HOURS-IN-FISCAL-YEAR
    GIVING WS-HOURLY-RATE REMAINDER WS-ANNUAL-BONUS."
or something similar.

Re: Hahahaha! (0)

Anonymous Coward | about 3 months ago | (#47167285)

Some of the early development tools were created out of necessity for programming the original COBOL code. Everything progresses based on earlier work...

Finally consistent naming (5, Funny)

orionpi (318587) | about 4 months ago | (#47160569)

Now we just run every JavaScript program through an obfuscator then JSNice and we have consistent naming.

Re:Finally consistent naming (1)

Anonymous Coward | about 4 months ago | (#47160675)

Now we just run every JavaScript program through an obfuscator then JSNice and we have consistent naming.

You laugh, but I have tried it.

The naming isn't as good as you would like, but for some projects, it may be an improvement. o.O

Didn't work for me (0)

Anonymous Coward | about 4 months ago | (#47160603)

I tried it on my obfuscated code and it made no improvement.

Fail (0)

viperidaenz (2515578) | about 4 months ago | (#47160669)

I tried it on a minified jquery 1.7.2 and got:

Error compiling input:

Line 3: Parse error. missing ) after condition
Line 3: Parse error. unterminated string literal
Line 4: Parse error. missing ; before statement
Line 4: Parse error. syntax error
Line 4: Parse error. missing ) in parenthetical
Line 4: Parse error. missing } after property list
Line 4: Parse error. illegal character
Line 4: Parse error. syntax error
Line 4: Parse error. illegal character
Line 4: Parse error. illegal character

Re:Fail (5, Informative)

Martin Vechev (3680079) | about 4 months ago | (#47160891)

Hi, Thanks for trying the tool out. I tried http://code.jquery.com/jquery-... [jquery.com] (from here: http://blog.jquery.com/2012/03... [jquery.com] ) and it worked fine. best, Martin

Re:Fail (1)

Menkhaf (627996) | about 3 months ago | (#47162393)

Now that you're here...

I tried it on this hunk of JavaScript: http://pastebin.com/miGDVkdf [pastebin.com] , but all I got was a parse error:

"// Error contacting the server...
parsererror
SyntaxError: Unexpected token :"

potential tool for JS code refactoring (0)

Anonymous Coward | about 4 months ago | (#47160683)

This could be a good candidate for JavaScript code refactoring when people are building large-scale JS-based applications.

RMS (0)

Anonymous Coward | about 4 months ago | (#47160733)

I wonder what R. M. Stallman would say about this. Maybe he will feel 60% freer to browse the web now?

Re:RMS (1)

tepples (727027) | about 3 months ago | (#47161133)

I don't think he would. Code distributed under terms that prohibit modification is still distributed under terms that prohibit modification, whether or not it's possible to convert it into a form suitable for making modifications.

jsunpack? (0)

Anonymous Coward | about 3 months ago | (#47161067)

How is this different than jsunpack-n?

As an exploit kit researcher.... (3, Interesting)

guardiangod (880192) | about 3 months ago | (#47161377)

This tool looks very intriguing, so I took it for a spin with some malicious code (all of it from malicious drive-by sites seen in the last 24 hours).

/** @type {function (string): *} */
e = eval;
/** @type {string} */
v = "0" + "x";
/** @type {number} */
a = 0;
try {
  a *= 2;
} catch (q) {
/** @type {number} */
  a = 1;
}
if (!a) {
  try {
    document["bod" + "y"]++;
  } catch (q$$1) {
/** @type {string} */
    a2 = "_";
  }
  z = "2f_6d_*snip*"["split"](a2);
/** @type {string} */
  za = "";
/** @type {number} */
  i = 0;
  for (;i < z.length;i++) {
    za += String["fromCharCode"](e(v + z[i]) - sa);
  }
  zaz = za;
  e(zaz);
}
/**
  * @param {string} n
  * @param {string} k
  * @param {number} v
  * @param {string} reason
  * @return {undefined}
  */
function SetCookie(n, k, v, reason) {
/** @type {Date} */
  var defaultCenturyStart = new Date;
/** @type {Date} */
  var expiryDate = new Date;

Sort of useful, I guess. But ultimately not an essential feature for malicious JavaScript analysis. I think the tool would be more useful for legitimate JS reverse-engineering tasks, as the obfuscated JS involved is much, much bigger.
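For samples like the one above, the decoding loop itself can usually be reproduced safely without running `eval`. The following is a minimal sketch, assuming the payload is a separator-delimited list of hex values with a constant offset subtracted from each character code. The `decodeHexPayload` helper and the `offset` parameter are hypothetical; `offset` stands in for the sample's `sa`, whose value is not shown, and the input string is made up rather than taken from the snipped payload.

```javascript
// Decode a separator-delimited hex payload without calling eval().
// `offset` is an assumption standing in for the sample's undefined `sa`.
function decodeHexPayload(payload, separator, offset) {
  return payload
    .split(separator)
    // parseInt(hex, 16) replaces the sample's eval("0x" + hex).
    .map((hex) => String.fromCharCode(parseInt(hex, 16) - offset))
    .join("");
}

// Made-up input, not the snipped payload from the sample.
const decoded = decodeHexPayload("68_65_6c_6c_6f", "_", 0);
console.log(decoded);
```

Printing the decoded string instead of passing it to `eval` lets the second stage be read without executing it.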

Try some unobfuscated JS for fun! (0)

Anonymous Coward | about 3 months ago | (#47161969)

function loadVideo(css,src,version,flashvars,width,height){
$(css).flash({src:src,
width:width,height:height,
menu:'false',
allowfullscreen:'true',
allowscriptaccess:'always',
flashvars:flashvars
},{version:version});}

becomes:

/**
  * @param {?} data
  * @param {string} src
  * @param {string} browserVersion
  * @param {?} dataAndEvents
  * @param {number} w
  * @param {number} rowHeight
  * @return {undefined}
  */
function loadVideo(data, src, browserVersion, dataAndEvents, w, rowHeight) {
    $(data).flash({
        src : src,
        width : w,
        height : rowHeight,
        menu : "false",
        allowfullscreen : "true",
        allowscriptaccess : "always",
        flashvars : dataAndEvents
    }, {
        version : browserVersion
    });
};

As an exploit kit researcher.... (0)

Anonymous Coward | about 3 months ago | (#47162303)

Yeah, I don't think it added much value to this code: https://gist.github.com/nickwb/3d95944feaa2d9409a57
