Google Talks About the Dangers of User Content

Become a fan of Slashdot on Facebook

Google Talks About the Dangers of User Content 172

Posted by samzenpus on Thursday August 30, 2012 @02:59AM from the watch-what-you-say dept.

An anonymous reader writes "Here's an interesting article on the Google security blog about the dangers faced by modern web applications when hosting any user supplied data. The surprising conclusion is that it's apparently almost impossible to host images or text files safely unless you use a completely separate domain. Is it really that bad? "

This discussion has been archived. No new comments can be posted.

Google Talks About the Dangers of User Content

Load All Comments

Search 172 Comments Log In/Create an Account

Comments Filter:

I don't know if the question should be... (Score:3)

by Tastecicles ( 1153671 ) writes: on Thursday August 30, 2012 @03:17AM (#41175865)

...is it a server problem, with the way it interprets record data, or the browser (any browser) (maybe as instructions rather than markup)? I'm guessing server in this case, since if the stream is intercepted and there's a referrer URL that directly references an image or other blob on the same or another server on a subdomain, that could be used to pwn the account/whatever... I'm not up on that sort of hack (you can probably tell). I don't quite get how hosting blobs on an entirely different domain would mitigate against that hack, since you would require some sort of URI that the other domain would recognise to be able to serve up the correct file - which would be in the URL request! Someone want to try and make sense of what I'm trying to say here?

Share
twitter facebook
- Re:I don't know if the question should be... (Score:5, Informative)
  
  by Sarusa ( 104047 ) writes: on Thursday August 30, 2012 @03:49AM (#41176017)
  
  It's fundamentally a problem with the browsers. Without getting too technical...
  Problem 1: Browsers try real hard to be clever and interpret maltagged/malformed content so people with defective markup or bad mime content headers won't say 'My page doesn't work in Browser X, Browser X is defective!'. Or if the site is just serving up user text in html, stick some javascript tags in the text. Whichever way, you end up so someone malicious can upload some 'text' to a clipboard or document site which the browser then executes when the malicious person shares the URL.
  Problem 2: There are a lot of checks in most browsers against 'cross site scripting', which is a page on site foobar.com (for instance) making data load requests to derp.com, or looking at derp.com's cookies, or even leaving a foobar.com cookie when derp.com is the main page. But if your script is running 'from' derp.com (as above) then permissions for derp.com are almost wide open, because it would just be too annoying for most users to manage permissions on the same site. Now they can grab all your docs, submit requests to email info, whatever is allowed. This is why just changing to another domain name helps.
  There's more nitpicky stuff in the second half of TFA, but I think that's the gist of it.
  
  Parent Share
  twitter facebook
  - Re:I don't know if the question should be... (Score:5, Insightful)
    
    by TubeSteak ( 669689 ) writes: on Thursday August 30, 2012 @04:00AM (#41176049) Journal
    
    It's fundamentally a problem with not validating inputs. Without getting too technical...
    Problem 1: Browsers try real hard to be clever and interpret maltagged/malformed content instead of validating inputs.
    Problem 2: There are a lot of checks in most browsers against 'cross site scripting', which is fundamentally a problem of not validating inputs.
    /don't forget to validate your outputs either.
    
    Parent Share
    twitter facebook
    - Re:I don't know if the question should be... (Score:4, Insightful)
      
      by Sarusa ( 104047 ) writes: on Thursday August 30, 2012 @04:53AM (#41176231)
      
      This is true! You could even say it's a sooper-dooper-fundamental problem of HTTP/HTML not sufficiently separating the control channel from the data channel and/or not sufficiently encapsulating things (active code anywhere? noooo.)
      But since browsers have actively chosen to validate invalid inputs and nobody's going to bother securing HTTP/HTML against this kind of thing any time soon, or fix the problems with cookies, or, etc etc etc, I figured that was a good enough high level summary of where we're at realistically. Nobody's willing to fix the foundations or 'break' when looking at malformed pages.
      
      Parent Share
      twitter facebook
      - Re: (Score:2)
        
        by steelfood ( 895457 ) writes:
        
        Throw up a warning screen whenever there's malformed input. Kinda like the warning screen with self-signed certs, without the stupid part of having the add the site to a permanent exception list.
        And if people want the convenience of whitelisting or just turning the message off entirely, put those in the options, just like the way browsers handle cookies.
        This warning page will show up a lot at first. But it would also ultimately shame people into fixing their outputs.
      - Re: (Score:2)
        
        by Khyber ( 864651 ) writes:
        
        And all of that is on the same data channel. Again, lack of proper separation.
        
        Re: (Score:2)
        
        by DamnStupidElf ( 649844 ) writes:
        
        That's not really a valid complaint. Even if HTTP was like FTP and opened a second TCP/IP connection to transfer data the exact same problems would arise if browsers ignored mime-types and tried to interpret the data contents instead of trusting the control channel.
    - Re:I don't know if the question should be... (Score:5, Interesting)
      
      by 19thNervousBreakdown ( 768619 ) writes: <davec-slashdotNO@SPAMlepertheory.net> on Thursday August 30, 2012 @04:56AM (#41176253) Homepage
      
      I'm actually not a big fan of validating inputs. I find proper escaping is a much more effective tool, and validation typically leads to both arbitrary restrictions of what your fields can hold and a false sense of security. It's why you can't put a + sign in e-mail fields, or have an apostrophe in your description field.
      In short, if a data type can hold something, it should be able to read every possible value of that data type, and output every possible value of that data type. That means that if you have a Unicode string field, you should accept all valid Unicode characters, and be able to output the same. If you want to restrict it, don't use a string. Create a new data type. This makes escaping easy as well. You don't have a method that can output strings, at all. You have a method that can output HTMLString, and it escapes everything it outputs. If you want to output raw HTML, you have RawHTMLString. Makes it much harder to make a mistake when you're doing Response.Write(new RawHTMLString(userField)).
      A multi-pronged approach is best, and input validation certainly has its place (ensuring that the user-supplied data conforms to the data type's domain, not trying to protect your output), but the first and primary line of defense should be making it harder to do it wrong than it is to do it right.
      
      Parent Share
      twitter facebook
      - Re:I don't know if the question should be... (Score:4, Interesting)
        
        by dzfoo ( 772245 ) writes: on Thursday August 30, 2012 @05:30AM (#41176361)
        
        I'm actually not a big fan of validating inputs. I find proper escaping is a much more effective tool, and validation typically leads to both arbitrary restrictions of what your fields can hold and a false sense of security.
        
        OK, fair point. How about if we expand the concept of "validating input" to include canonicalization and sanitation as well? Oh, it already does. Go figure.
        Reducing it to a mere reg-exp is missing the point. Proper canonicalization (and proper understanding of the underlying standards and protocols, but that's another argument) would allow you to use a plus-sign in an e-mail address field.
        But this won't happen as long as every kid fresh out of college wants to roll their own because they known The One True Way to fix it, this time For Real. As long as they keep ignoring everything learned before because, you know, it's old stuff and this is the new technology of The Web, where everything old does not count at all; nothing will change.
        A multi-pronged approach is best, and input validation certainly has its place (ensuring that the user-supplied data conforms to the data type's domain, not trying to protect your output), but the first and primary line of defense should be making it harder to do it wrong than it is to do it right.
        "MOAR TECH!!!1" and over-wrought protocols are no silver-bullet against ignorance, naivety, and hubris.
        -dZ.
        
        Parent Share
        twitter facebook
        
        Re: (Score:3)
        
        by 19thNervousBreakdown ( 768619 ) writes:
        
        Your solution appears to be, "Do exactly what we've been doing, just more." My rebuttal to that is the entire history of computer security. While it's true that proper understanding of underlying standards and protocols would go a long way toward mitigating the problems, a more complete solution is to make such detail-oriented understanding unnecessary. Compartmentalization of knowledge is, in my opinion anyway, the primary benefit of computers, and the rejection of providing that benefit to other programme
        
        Re: (Score:3)
        
        by dzfoo ( 772245 ) writes:
        
        You misunderstood my point, and then went on to suggest that the "old way" won't work; inadvertently falling into the trap I was pointing out.
        My "solution" (which really, it wasn't a solution per se) is not "more of the same." It is the realization that previous knowledge or practices may not be obsolete, and that we shouldn't try to find new ways to do things for the mere sake of being new.
        A lot, though not all, of the security problems encountered in modern applications have been known and addressed in t
        
        Re: (Score:2)
        
        by postbigbang ( 761081 ) writes:
        
        It's easier for a lot of coders to just bypass the step of input parsing and validation. That ease, which IMHO amounts to sloppy coding, is a major crux of things like injection problems, downstream coding errors (and far beyond things like simple type mismatch), and eventual corruption.
        For every programmer shrugging it off, there's another wondering if someone did the work, and probing everything from packets to simple scriptycrap to break it open for giggles, grins, and profit. They write long tomes of ga
        
        Re: (Score:2)
        
        by dzfoo ( 772245 ) writes:
        
        Agreed.
        
        Re: (Score:2)
        
        by Zero__Kelvin ( 151819 ) writes:
        
        "Your solution appears to be, "Do exactly what we've been doing, just more."
        No. His solution is that people need to start doing it. You're solution is to ignore solid secure programming practices [ibm.com]. In other words, your solution is to keep failing to practice secure programming.
        "Some new approaches work better, some work worse, but we already know exactly what the old approach accomplishes."
        Right. And we have also seen what doesn't work. Another way to say it is: "What we've got here is failure to commun
        
        Re: (Score:2)
        
        by Zero__Kelvin ( 151819 ) writes:
        
        "That link is suggesting that a regex is the proper way to validate an e-mail address." and "Hilariously, it even suggests restricting user input to not allow it to contain SQL control characters. Man."
        It suggests no such thing. It explicitly describes a default deny policy rather than a default allow policy. I now see why you don't think secure programming can work though, since if it is you doing the programming, it definitely won't.
        
        Re: (Score:2)
        
        by Zero__Kelvin ( 151819 ) writes:
        
        You don't seem to understand what was written or the fact that nobody is talking about Javascript programming. If you are using Javascript, you're already by definition incapable of practicing secure programming.
        
        Wheeler is talking about the API implementer, not the client programmer.
        
        Re: (Score:2)
        
        by fatphil ( 181876 ) writes:
        
        > ... that explicitly allowing them actually results in more secure code ...
        
        Amen! Everything's a bomb!
        
        (and no, not one of those airport security 125ml toothpaste tube bombs which they dispose of by chucking it in the waste paper bin below the counter.)
        
        Re: (Score:2)
        
        by cdrguru ( 88047 ) writes:
        
        I'm not sure you have a firm grasp of the problem.
        The problem, from my reading, can be explained as analogous to having a numeric data item that sometimes gets letters put in it. Rather than rejecting this as invalid browsers are making stuff up as they go along so A=1, B=2, and so on and so forth. This has the obvious benefit to users of not exposing them to the improper construction of web pages, but it does create sort of a sub-standard whereby other authors recognize this work-around and decide to mak
        
        Re: (Score:2)
        
        by DamnStupidElf ( 649844 ) writes:
        
        No, the problem is users uploading content containing javascript. Then referencing this javascript from an external site. When a user who happens to also be logged into the popular site hosting my content views my page my javascript executes in the context of the other sites domain with access to the users credentials.
        I'm surprised no one else has pointed this out before. The domain security model is broken. Period. It's time to move beyond that and to an explicit capability system that is granular enough to reference individual URIs, and even *that* is probably insufficient. It is entirely reasonable that a trusted script at http://trusted.example.com/trusted_path/trusted_handler.js?script=1 [example.com] should be able to *not trust* http://trusted.example.com/trusted_path/trusted_handler.js?script=2 [example.com], and further that even http: [example.com]
      - Constructor overhead (Score:2)
        
        by tepples ( 727027 ) writes:
        
        You don't have a method that can output strings, at all. You have a method that can output HTMLString, and it escapes everything it outputs. If you want to output raw HTML, you have RawHTMLString. Makes it much harder to make a mistake when you're doing Response.Write(new RawHTMLString(userField)).
        Interesting technique. But how much runtime overhead do all those constructors impose for Java, C#/VB.NET, PHP, and Python?
        
        Re: (Score:2)
        
        by Cajun Hell ( 725246 ) writes:
        
        But how much runtime overhead do all those constructors impose for Java, C#/VB.NET, PHP, and Python?
        Either nothing, or nothing significant, or something-but-it-fixed-a-bug-which-was-definitely-there.
        
        Re: (Score:2)
        
        by 19thNervousBreakdown ( 768619 ) writes:
        
        Seriously?
        Compared to the overhead of reading from the database, building the rest of the page's HTML, and then sending over the network, practically nothing. This is not hyperbole.
        Even if it wasn't nothing, it would have to be very significant, and performance would have to be a primary factor in the software's spec before I'd consider scrapping an extremely easy to use security practice in order for a faster runtime.
        
        Re: (Score:2)
        
        by tepples ( 727027 ) writes:
        
        Interesting. Do you know of a PHP framework that uses this idea? (I have to use PHP because a lot of hosting plans lack ASP.) If I knew the canonical name of this technique, I'd search for it myself, but when I tried Google "new RawHTMLString", the only thing it found was your comment.
        
        Re: (Score:2)
        
        by 19thNervousBreakdown ( 768619 ) writes:
        
        I don't. I'm not sure it's even common enough to be considered a pattern, let alone have good libraries, those names are just things I came up with in the moment.
        What it essentially boils down to though is you create classes and conversions between those classes that always maintain the correct escaping. If you find yourself writing the same escaping method more than once, refactor. There should be One Version of the Truth, that is a pattern. You then write output routines that refuse to render unrecognized
        
        Re: (Score:2)
        
        by Richy_T ( 111409 ) writes:
        
        You escape user input for SQL (if you're not using parameterized queries) or whatever database you're using. You escape the output for HTML or whatever you are outputting.
        If you've ever run across an application where someone has HTML escaped user input before insertion into the database and you now want to output it in a format that isn't HTML, you'll know what I'm talking about. User data should usually be *stored* as accurately to the original as possible.
        
        Re: (Score:2)
        
        by tepples ( 727027 ) writes:
        
        If you've ever run across an application where someone has HTML escaped user input before insertion into the database and you now want to output it in a format that isn't HTML
        For example, Slashdot comment fields contain a subset of HTML, and I imagine that they're inserted into the database as HTML. But for full-text searching, one wants to search the text, not the tags.
        
        Re: (Score:2)
        
        by Richy_T ( 111409 ) writes:
        
        Search is a whole 'nother beast, probably second only to dates in its "gotcha" potential.
      - Escaping is hard (Score:2)
        
        by TheLink ( 130905 ) writes:
        
        The problem is you currently can't escape everything reliably.
        Why? Because the mainstream browser security concept is making sure that all the thousands of "Go" buttons are not pressed aka "escaped". But people are always introducing new "Go" buttons. If your library is not aware of the latest stuff it will not escape the latest crazy "Go" button the www/html/browser bunch have come up with.
        So in theory a perfectly safe site could suddenly become unsafe, just because someone made a new "Go" button for the l
        
        Re: (Score:2)
        
        by dgatwood ( 11270 ) writes:
        
        This is why the correct solution is always whitelisting, not blacklisting. Whitelist the allowed tags, attributes, CSS subsets, etc. that you consider safe. This way, anything added to the specification is likely to get stripped out by your filtering code.
        For example, I'm working on a website in which users provide content in a subset of HTML/XML. The only tags I'm allowing are p, span, div, select, and a couple of custom tags. The only attributes I'm allowing are the chosen value for the select elemen
        
        Re: (Score:2)
        
        by fatphil ( 181876 ) writes:
        
        > The problem is you currently can't escape everything reliably.
        
        You can - by escaping everything.
        
        > If your library is not aware of the latest stuff it will not escape the latest crazy "Go" button
        
        It will. It escapes everything. What bit of "everything" did you not understand.
        
        Sure, it won't let people have crazy "go" buttons, whatever they are, but nothing of value was lost.
      - Re: (Score:2)
        
        by fatphil ( 181876 ) writes:
        
        I'm with you, but I think we're in the minority. However, I cling to my view because I have a strange obsession with a kind of purity that's probably because of my pure mathematics background.
        
        If an envelope weighs less than 60g, then the postal service should deliver it. It should presume that it contains bombs, nerve poison, and corrosives, etc. but it should deliver it in tact, and then it's the recipient's problem. It should let the recipient know that it's not to be trusted, of course.
        
        If a text entry fi
      - Re: (Score:2)
        
        by jgrahn ( 181062 ) writes:
        
        I'm actually not a big fan of validating inputs. I find proper escaping is a much more effective tool, and validation typically leads to both arbitrary restrictions of what your fields can hold and a false sense of security. It's why you can't put a + sign in e-mail fields, or [...]
        That's not validation! That is trying (and failing, because you are too ignorant to read an RFC) to guess what some other software wants, even if it's none of your business. A well-formed mail address is, for most purposes, one which /lib/sendmail will not complain about.
        That does of course not mean you shouldn't validate data meant to be interpreted by *you*. It's simple: if you need to interpret it, you need to validate it. Hell, you *are* validating it by intrepreting it, even if you do a lousy job.
      - Re: (Score:3)
        
        by Cajun Hell ( 725246 ) writes:
        
        You're assuming input comes from a browser using a page you made yourself. .. . If you aren't validating input in your server code, what is?
        No he's not. If you do things right, then hostile input, honestly mistaken input, and perfectly valid input all get handled the same way. Instead of getting "validated," they get escaped for whatever context they're used within, as they get written to that context.
        If you're building a string for use in a SQL statement, then the string gets escaped for SQL, regardl
        
        Re: (Score:2)
        
        by dgatwood ( 11270 ) writes:
        
        Validation is required, too, in many cases. XSS is a good example. Quoting prevents somebody who only has the ability to write plain text from being able to insert arbitrary HTML. Quoting does nothing if the users are able to actually provide HTML (e.g. any website built around contentEditable).
        In the latter case, if your server does nothing but quote user-generated HTML properly, you're wide open to XSS attacks, because they do not require malformed content. Merely the fact that your server is serving
        
        Re: (Score:2)
        
        by dgatwood ( 11270 ) writes:
        
        Yes, I'm quite aware of what the GP is saying. What I'm saying is that more often than not these days, plain text input isn't sufficient. So you have two choices: HTML input or some alternative scheme like BBCode. And as soon as you find yourself using either one, mere quoting isn't sufficient. You have to do validation.
        
        Re: (Score:2)
        
        by TuringCheck ( 1989202 ) writes:
        
        Blah blah.
        All nice, elegant and safe until management decides this field NEEDS to be red, and underlined, and this one too, and this one - so let's store it in database as HTML... oh, wait, what field was it?
        
        Re: (Score:2)
        
        by 19thNervousBreakdown ( 768619 ) writes:
        
        If your output is properly escaped and you're correctly using parameterized queries and not doing stupid dynamic SQL tricks that are generally necessitated by having a terrible DB layout, it doesn't matter. Go ahead. Put a billion apostrophes, and Unicode apostrophes (that MS SQL [and maybe others] will horrifically collapse down to regular if your connection is ASCII), semicolons, whatever you want. It'll sit there in the field looking pretty.
        Uh, just don't make it 2.2 billion apostrophes. Bad things.
    - Re:I don't know if the question should be... (Score:5, Informative)
      
      by ais523 ( 1172701 ) writes: <ais523(524\)(525)x)@bham.ac.uk> on Thursday August 30, 2012 @06:05AM (#41176457)
      
      After seeing a demonstration of a successful XSS attack on a plaintext file (IE7 was the offending browser, incidentally), I find it hard to see what sort of validation could possibly help. After all, the offending code was a perfectly valid ASCII plain text file that didn't even look particularly like HTML, but happened to contain a few HTML tags. (Incidentally, for this reason, Wikipedia refuses to serve user-entered content as text/plain; it uses text/css instead, because it happens to render the same on all major browsers and doesn't have bizarre security issues with IE.)
      
      Parent Share
      twitter facebook
      - Re: (Score:2, Informative)
        
        by Anonymous Coward writes:
        
        It doesn't "refuse" to serve text/plain, it just makes you ask for it specifically. (Use ?action=raw via index.php and/or format=txt via api.php)
        
        Re: (Score:3)
        
        by ais523 ( 1172701 ) writes:
        
        http://en.wikipedia.org/w/index.php?title=Main_Page&action=raw&ctype=text/plain [wikipedia.org]
        "You have chosen to open index.php which is a: text/x-wiki from: http://en.wikipedia.org/ [wikipedia.org]"
        http://en.wikipedia.org/w/api.php?format=txt [wikipedia.org]
        "You have chosen to open api.php which is a: text/text from: http://en.wikipedia.org/ [wikipedia.org]"
        It refuses to serve text/plain, even if you ask for it specifically. (Compare http://en.wikipedia.org/w/index.php?title=Main_Page&action=raw&ctype=text/css [wikipedia.org], which it'll serve quite happily.)
    - Re: (Score:2)
      
      by fatphil ( 181876 ) writes:
      
      > Problem 1: Browsers try real hard to be clever and interpret maltagged/malformed content instead of validating inputs.
      
      But XHTML saved us from that over a decade ago!
      
      Channelling Eric Naggum: Clearly we're not using enough XML!
    - - Scripts in the body (Score:2)
        
        by tepples ( 727027 ) writes:
        
        Where originally scripts could only be defined in the HTML header, some not-to-be-named company in Redmond
        It wasn't Nintendo of America, was it? :-p
        decided it was a good idea to permit defining them in the document body as well.
        Anywhere you have HTML element attributes beginning with on, you have scripts in the body. It's been so long ago, I can't remember: did Netscape's original version of JavaScript have onclick or onmouseover?
        
        Re: (Score:2)
        
        by Richy_T ( 111409 ) writes:
        
        I think the <a href="javascript: predates even that.
It's called reprocessing (Score:2, Interesting)

by KreAture ( 105311 ) writes:

Convert the file to the site supported format and quality level in sandbox.
Tadaaaa,,,
- Re:It's called reprocessing (Score:5, Informative)
  
  by Anonymous Coward writes: on Thursday August 30, 2012 @03:57AM (#41176037)
  
  As TFA points out, it is possible to create a Flash applet using nothing but alphanumeric characters. Good luck catching that in your reprocessing.
  
  Parent Share
  twitter facebook
  - Re: (Score:2)
    
    by 19thNervousBreakdown ( 768619 ) writes:
    
    Without an example it's tough to say for sure, but I suspect that it only works when the output isn't properly escaped.
  - Re: (Score:2)
    
    by KreAture ( 105311 ) writes:
    
    How is that a picture, and how would your reprocessing code not reject it for not having header matching file name?
    I think it was obvious my post refered to pictures.
    - Vector pictures (Score:2)
      
      by tepples ( 727027 ) writes:
      
      Before SVG, and even now with Internet Explorer on Windows XP, SWF was the most widely compatible format for displaying vector pictures on a PC.
  - Re: (Score:2)
    
    by fulldecent ( 598482 ) writes:
    
    I'd like to see that regex
- Re: (Score:2)
  
  by Jonner ( 189691 ) writes:
  
  Convert the file to the site supported format and quality level in sandbox.
  Tadaaaa,,,
  If you'd read TFA, you'd know it covers that and explains why it's insufficient.
- Re: (Score:2)
  
  by Carnildo ( 712617 ) writes:
  
  Convert the file to the site supported format and quality level in sandbox.
  You're applying a known transform to the image. By reversing the transform, the attacker can craft an image such that the original upload is innocent, while the reprocessed image is malicious. I've seen it done where the upload is clean, but the generated thumbnail is goatse; it shouldn't be too hard to create a clean upload that the converter turns into something IE will interpret as Javascript.
Google security breaches (Score:2)

by romit_icarus ( 613431 ) writes:

For all its transparency, I've yet to see a working list of security breach attempts made on Google servers. I bet there are many, and it would be useful to know just the source and method if nothing more.
- Re: (Score:2)
  
  by cbiltcliffe ( 186293 ) writes:
  
  Security breach *attempts*?
  I'm guessing a simple csv of that would be several TB in size. That's probably why you can't get a working list.
  - Filter the multi-TB IDS log plz (Score:2)
    
    by tepples ( 727027 ) writes:
    
    You cite a multi-TB IDS log. May I have it filtered to the cases that came closest to a substantial intrusion?
    - Re: (Score:2)
      
      by Gavagai80 ( 1275204 ) writes:
      
      The either break in or they don't. There is no closest.
Yes, it really is that bad. (Score:5, Interesting)

by VortexCortex ( 1117377 ) writes: <VortexCortex AT ... trograde DOT com> on Thursday August 30, 2012 @03:59AM (#41176043)

This is what happens when you try to be lenient with markup instead of strict (note: compliant does not preclude extensible), and then proceed to use a horribly inefficient and inconsistent (by design) scripting language and a dysfunctional family of almost sane document display engines combined with a stateless protocol to produce a stateful application development platform by way of increasingly ridiculous hacks.
When I first heard of "HTML5" I thought: Thank Fuck Almighty! They're finally going to start over and do shit right, but no, they're not. HTML5 is just taking the exact same cluster of fucks to even more dizzying degrees. HOW MANY YEARS have we been waiting for v5? I've HONESTLY lost count and any capacity to give a damn when we reached a decade -- Just looked it up, 12 years. For about one third the age of the Internet we've been stuck on v4.01... ugh. I don't, even -- no, bad. Wrong Universe! Get me out!
In 20XX when HTML6 may be available I may reconsider "web development". As it stands web development is chin-deep in its own filth which it sprays with each mention, onto passers by and they receive the horrid spittle joyously not because its good or even not-putrid, but because we've actually had worse! I can crank out a cross platform pixel perfect native application for Android, iOS, Linux, OSX, XP, Vista, Win7, and mother fucking BSD in one third the time it takes to make a web app work on the various flavours of IE, Firefox, Safari, Chrom(e|ium). The time goes from 1/3rd down to 1/6th when I cut out testing for BSD, Vista, W7 (runs on XP, likely runs on Vista & Win7. Runs on X11 + OpenGL + Linux, likely builds/runs on BSD & Mac).
Long live the Internet and actual cross platform development toolchains, but fuck the web.

Share
twitter facebook
- Re:Yes, it really is that bad. (Score:5, Funny)
  
  by sgrover ( 1167171 ) writes: on Thursday August 30, 2012 @04:14AM (#41176099) Homepage
  
  +1, but tell us how you really feel
  
  Parent Share
  twitter facebook
- Re:Yes, it really is that bad. (Score:5, Insightful)
  
  by SuricouRaven ( 1897204 ) writes: on Thursday August 30, 2012 @04:29AM (#41176143)
  
  Of course it's a mess. The combination of HTTP and HTML was designed for simple, static documents displaying predominatly text, a little formatting and a few images. By this point we're using extensions to extensions to extensions. It's a miracle it works at all.
  
  Parent Share
  twitter facebook
  - Re:Yes, it really is that bad. (Score:5, Funny)
    
    by adolf ( 21054 ) writes: <flodadolf@gmail.com> on Thursday August 30, 2012 @06:07AM (#41176471) Journal
    
    It's a miracle it works at all.
    It works?
    
    Parent Share
    twitter facebook
    - - Re: (Score:2)
        
        by SuricouRaven ( 1897204 ) writes:
        
        Aside from the script-driven ability to expand comments and use a slider-bar to set a filter. Features which depend underneath on the ability to fetch new data via HTTP and seemlessly incorporate it into an already-open page without a full refresh.
- Re: (Score:2)
  
  by svick ( 1158077 ) writes:
  
  HOW MANY YEARS have we been waiting for v5? I've HONESTLY lost count and any capacity to give a damn when we reached a decade -- Just looked it up, 12 years.
  But HTML 5 is already here! It's just that it's not like the standards of old, it's a living standard. And if you don't like that, you're not agile enough.
  - Re: (Score:2)
    
    by arose ( 644256 ) writes:
    
    Remind me, is i possible to serve XHMTL 1.0 accross the board yet? I think it just about it, and we are to the point of "why the fuck bother anymore", if you can do better at getting shit implemented go right ahead, but so far HsTML5 has made more tangible progress than just about any other single initiative of W3C.
    - April 2014 (Score:2)
      
      by tepples ( 727027 ) writes:
      
      It will be in April 2014 when Windows XP, the operating system for which the latest version of the bundled browser is IE 8, leaves extended support.
    - Re: (Score:2)
      
      by Jonner ( 189691 ) writes:
      
      Remind me, is i possible to serve XHMTL 1.0 accross the board yet? I think it just about it, and we are to the point of "why the fuck bother anymore", if you can do better at getting shit implemented go right ahead, but so far HsTML5 has made more tangible progress than just about any other single initiative of W3C.
      I think IE 9 finally handles XHTML properly. Of course it's far too late, since XHTML is completely dead.
  - Re: (Score:2)
    
    by Jonner ( 189691 ) writes:
    
    HOW MANY YEARS have we been waiting for v5? I've HONESTLY lost count and any capacity to give a damn when we reached a decade -- Just looked it up, 12 years.
    But HTML 5 is already here! It's just that it's not like the standards of old, it's a living standard. And if you don't like that, you're not agile enough.
    I'm not sure if they're on the right track in general, but at least the WHATWG is honestly recognizing that web developers have never waited for an official standard to use new browser features. It's a chicken and egg problem: if nobody used a new feature until it were described in an official standard, browsers wouldn't have much motivation to implement and test the feature.
- Re: (Score:2)
  
  by Skapare ( 16644 ) writes:
  
  It's posts like this that make me wish Slashdot could do moderations above level 5.
- Re: (Score:3)
  
  by TheDarkMaster ( 1292526 ) writes:
  
  I think the same thing. I currently work doing "web systems". And do they work? Work, I managed to make a web application that can use a card printer. But at what price? I spent twice the time that I would spend if I did compiled desktop applications, and lost count of the many horrible hacks I had to do to similar desktop functionality using HTML
- Re: (Score:2)
  
  by FireFury03 ( 653718 ) writes:
  
  When I first heard of "HTML5" I thought: Thank Fuck Almighty! They're finally going to start over and do shit right, but no, they're not. HTML5 is just taking the exact same cluster of fucks to even more dizzying degrees.
  XHTML was a pretty good step in the right direction. Enforced well-formed ness is a good thing (although IMHO browsers should've had a built in "please try to fix this page" function that the user could manually run over a broken page), genericsising tags is sensible (if you're going to embed a rectangular object then it makes sense to have a single <object> tag to do it for all content, for example - no need to produce a whole new revision of the language just because someone has invented a new type
- Re: (Score:2)
  
  by Jonner ( 189691 ) writes:
  
  Do whatever kind of development floats your boat and pays the bills. As much as some aspects of web development suck, it is getting gradually better and it can't be ignored. The answer to web development problems certainly isn't to return to platform-specific binaries.
HTML needs a sandbox tag (Score:3)

by Hentes ( 2461350 ) writes: on Thursday August 30, 2012 @05:02AM (#41176273)

The easiest way to secure embedded content would be a sandbox tag that allows to limit what kind of content can be inside of it.

Share
twitter facebook
- Please no... (Score:2)
  
  by betterunixthanunix ( 980855 ) writes:
  
  Stop extending HTML! HTML does not need more tags. HTML was not designed to be a presentation language for applications and certainly not to be an environment for running applications; it was designed to be a hypertext document language (yes, "hypertext" is a word with meaning beyond HTML). The worst thing we did was to allow HTML documents with embedded programs -- applets, Javascript, etc.
  
  The real answer is a new standard that is designed for application presentation and deliver, that does not have
  - Good luck getting Apple to adopt it (Score:2)
    
    by tepples ( 727027 ) writes:
    
    The real answer is a new standard that is designed for application presentation and deliver
    That's been tried, in the form of Flex and Silverlight. Good luck getting Apple to adopt your proposed new standard.
    - Re: (Score:2)
      
      by Richy_T ( 111409 ) writes:
      
      Flex? Silverlight was just another Microsoft attempt to abuse the market and that's a play everyone has gotten wise to by now.
      - Re: (Score:2)
        
        by tepples ( 727027 ) writes:
        
        Flex?
        Flex was Adobe's attempt to reposition Flash Player as a rich Internet application platform.
  - Re: (Score:2)
    
    by Jonner ( 189691 ) writes:
    
    The real answer is a new standard that is designed for application presentation and deliver, that does not have so much in-band signaling. We need to get it right the first time by building security into the system, not extend an already bloated monstrosity to make up for the inevitable security problems that result from turning a language for describing documents into a platform for running distributed software with malicious users.
    Let us know how that works out.
- Re: (Score:2)
  
  by TheLink ( 130905 ) writes:
  
  I suggested something like that 10 years ago: http://lists.w3.org/Archives/Public/www-html/2002May/0021.html [w3.org]
  http://www.mail-archive.com/mozilla-security@mozilla.org/msg01448.html [mail-archive.com]
  But hardly anyone was interested. If implemented it could have prevented the Hotmail, MySpace, yahoo and many other XSS worms.
  There's Content Security Policy now:
  https://developer.mozilla.org/en-US/docs/Security/CSP/Introducing_Content_Security_Policy [mozilla.org]
  As far as I see security is not a priority for the browser and W3C bunch.
- - Re: (Score:2)
    
    by dgatwood ( 11270 ) writes:
    
    The iframe tag is less than ideal because the content must be provided out-of-band in a separate request. I mean sure, you can usually jam most things in so that they'll fit, but it isn't really ideal for content inserted dynamically.
    Also, if I read the spec correctly, this has almost all of the same flaws as Mozilla's content security policies (minus the requirement that you must use a separate server to provide the content). Specifically, it's all-or-nothing. Either you allow scripts or you don't. A
Problem can be solved, but users are the problem (Score:3)

by gweihir ( 88907 ) writes: on Thursday August 30, 2012 @07:06AM (#41176659)

Images and text can be sanitized reliably. The problem is that this strips out all of the non-essential features. Users have a hard time understanding that, because users do not understand the trade-offs involved.
But the process is easy: Map all images to meta-data and compression free formats (pnm, e.g.) then recompress with a trusted compressor. For text, accept plain ASCII, RTF and HTML 2.0. Everything else, convert either to images or to cleaned PDF/Postscript by "printing" and OCR'ing.

Share
twitter facebook
- Re: (Score:2)
  
  by Jonner ( 189691 ) writes:
  
  Images and text can be sanitized reliably. The problem is that this strips out all of the non-essential features. Users have a hard time understanding that, because users do not understand the trade-offs involved.
  But the process is easy: Map all images to meta-data and compression free formats (pnm, e.g.) then recompress with a trusted compressor. For text, accept plain ASCII, RTF and HTML 2.0. Everything else, convert either to images or to cleaned PDF/Postscript by "printing" and OCR'ing.
  If you'd read TFA, you'd know that it explains why this is insufficient:
  For a while, we focused on content sanitization as a possible workaround - but in many cases, we found it to be insufficient. For example, Aleksandr Dobkin managed to construct a purely alphanumeric Flash applet, and in our internal work the Google security team created images that can be forced to include a particular plaintext string in their body, after being scrubbed and recoded in a deterministic way.
- Re: (Score:2)
  
  by Carnildo ( 712617 ) writes:
  
  Images and text can be sanitized reliably.
  The point of the article is that they can't. Internet Explorer can be coerced into interpreting JPEG images as HTML, interpreting ASCII text as Flash, and interpreting text/plain documents as text/html, among other things. You can also play games with the encoding-recognition code by tweaking the first few bytes of the file, such that a document uploaded as ISO-8859-1 is interpreted by IE as UTF-7, or whatever other encoding suits your purposes. Note that in all
Novel Solution (Score:3, Interesting)

by Sentrion ( 964745 ) writes: on Thursday August 30, 2012 @10:00AM (#41177825)

This was a real problem back in the 1980s. Everytime I would connect to a BBS my computer would execute any code it came across, which made it very easy for viruses to infect my PC. But lucky for me, in the early 90's the world wide web came into being and I didn't have to run executable code just to view content that someone else posted. The PC was insulated from outside threats by viewing the web "pages" only through a "web browser" that only let you view the content, which could be innocuous text, graphics, images, sound, and even animation that was uploaded to the net by way of a non-executable markup language known as HTML. It was at this time that the whole world began to use their home computers to view content online because it was now safe for amateurs and noobs to connect their PCs to the internet without any worries of being inundated with viruses and other malware.
Today I only surf the web with browsers like Erwise, Viola, Mosaic, and Cello. People today are accessing the internet with applications that run executable code, such as Internet Explorer and Firefox. Very dangerous for amateurs and noobs.

Share
twitter facebook
- Re: (Score:2)
  
  by BronsCon ( 927697 ) writes:
  
  Today I only surf the web with browsers like Erwise, Viola, Mosaic, and Cello. People today are accessing the internet with applications that run executable code, such as Internet Explorer and Firefox. Very dangerous for amateurs and noobs.
  So, which are you, an amateur or a noob?
My explanation of article (Score:5, Informative)

by kent.dickey ( 685796 ) writes: on Thursday August 30, 2012 @11:35AM (#41178939)

The blog post was a bit terse, but I gather one of the main problems is the following:
Google lets users upload profile photos. So when anyone views that user's page, they will see that photo. But, malicious users were making their photos files contain Javascript/Java/Flash/HTML code. Browsers (I think it's always IE) are very lax and will try to interpret files how they please, regardless of what the web page says. So, webpage says it's pointing to a IMG, but some browsers will interpret it as Javascript/Java/Flash/HTML anyway once they look at the file. So now a malicious user can serve up scripts that seem to be coming from Google.com, and so they are given a lot of access at Google.com and break their security (e.g., let you look at other people's private files).
Their solution: user images are hosted at googleusercontent.com. Now, if a malicious user tries to put a script in there, it will only have the privileges of a script run from that domain--which is no privileges at all. Note this just protects Google's security...you're still running some other user's malicious script. Not google's problem.
The article then discusses how trying to sanitize images can never work, since valid images can appear to have HTML/whatever in them, and their own internal team worked out how to get HTML to appear in images even after image manipulation was done.
Shorter summary: Browsers suck.

Share
twitter facebook
- Mod parent up (Score:2)
  
  by psydeshow ( 154300 ) writes:
  
  I read the TFA, that's a great summary.
  It's like waking up in a crappy mirror universe where all the work that we have done on security in the past 10 years is out the window, because unbeknownst to anyone but the browser vendors, our web browsers will go ahead and execute code embedded in non-executable mimetypes.
  Would it have been so hard to limit JavaScript execution to the handful of content types where it is supposed to be found? Apparently. So now images are Turing-complete, and all your cookies can b
  - Re: (Score:2)
    
    by Carnildo ( 712617 ) writes:
    
    Apparently. So now images are Turing-complete, and all your cookies can be lifted by someone who puts <script src="http://private.com/users/you/profile.jpg"></script> in a page you visit.
    It's worse than that. If you're using Internet Explorer, your cookies can be lifted by someone who puts <img src="http://private.com/users/you/profile.jpg"> in a page you visit, or your flash storage tampered by <a href="http://private.com/uploads/schedule.txt">.
- Re: (Score:2, Interesting)
  
  by Anonymous Coward writes:
  
  Umm, what does your comment have to do with the subject in TFA? They used to host content on google.com, then they moved it to googleusercontent.com for security reasons. If anything they have made it clear that the user owns it, but not for that reason.
- Re: (Score:2)
  
  by 19thNervousBreakdown ( 768619 ) writes:
  
  Because writing a script to forge the Referer (sic) header is trivial.
  - Show how it can still be done in 2012 (Score:2)
    
    by tepples ( 727027 ) writes:
    
    Another objection has to do with [...] machinary that would allow referer checks to be forged/circumvented. If you are asserting this you need to show how it can still be done in 2012.
    Because writing a script to forge the Referer (sic) header is trivial.
    Go ahead and show us how please.
    - Re: (Score:2)
      
      by 19thNervousBreakdown ( 768619 ) writes:
      
      Dammit, I was going to whip up a quick example of this, and a google showed me that browsers have added protection to the XMLHttpRequest. I was basing my claim on years-old information. My bad.
      I mean, I still think it's not a good idea, and that same google shows that there are many ways around it, and other headers don't share the same protection so trusting them is a bad habit to get into, but it looks like it would take more than the two minutes I was willing to spend. Google around. It may not be trivia
      - Re: (Score:2)
        
        by WaffleMonster ( 969671 ) writes:
        
        mean, I still think it's not a good idea, and that same google shows that there are many ways around it, and other headers don't share the same protection so trusting them is a bad habit to get into, but it looks like it would take more than the two minutes I was willing to spend. Google around. It may not be trivial anymore, but it's still very possible.
        I would REALLY like to know how it is still possible to forge a referer field from a browser request. There are lots of talking heads on the Internet who more or less assume what many here have in this regard.
        I know how to clear the referer field but having previously spent dozens of hours researching this with no solution that does not involve ancient bugs or signaling outside the browsers session context.
        Any specific pointers or implementations rather than unspecific references to talking heads assuming
      - No relevant results for "around". (Score:2)
        
        by tepples ( 727027 ) writes:
        
        Google around.
        around didn't provide relevant results.
        But with the literal-minded housekeeper [tvtropes.org] costume off, forge referer and spoof referer still don't. This page [cgisecurity.com] is from 2006, and this page [pseudo-flaw.net] likewise explains a flaw that has since been fixed. This page [blackhatworld.com] claims that it's possible to forge a referer in the visitor's browser using redirection, but only from a domain that the attacker controls. This result [stackoverflow.com] claims that the only way is to get the user to install a plug-in: "If you want to redirect a visitor to another website
        
        Re: (Score:2)
        
        by 19thNervousBreakdown ( 768619 ) writes:
        
        There are legitimate reasons to want to do this, and people are much more likely to be helpful when you're not asking how to exploit. With that in mind, the search set referer xmlhttprequest [google.com] gives decent results. If you look through them, you'll see that it used to work in basically every browser by simply setting the header, but there are now various levels of protection depending on the browser, the calling code's domain, and where the request is going.
        All in all, basing security on a header that was neve
        
        Re: (Score:2)
        
        by WaffleMonster ( 969671 ) writes:
        
        All in all, basing security on a header that was never secure is a dumb idea. Instead of redefining an old header, make a new one. This is security we're talking about, not opening a Word 97 document on Word 2008. If it's not secure, it should break, it shouldn't make a best effort.
        Please see the specification which explicitly exempts Referer and a host of other fields in request from being user changable.
        http://www.w3.org/TR/XMLHttpRequest [w3.org]
        All in all, basing security on a header that was never secure is a dumb idea.
        The browser is expected to moderate certain activity protecting the end user from an "anything goes" scripting environment.
        There were security bugs that got fixed but I don't understand the assertion being made about this field never being secure. The specifications for xmlhttprequest seems pretty clear to me on the intent to protect this field.
        If it's not secure, it should break, it shouldn't make a best effort.
        Yo
        
        Re: (Score:2)
        
        by 19thNervousBreakdown ( 768619 ) writes:
        
        I've already conceded that it is no longer supposed to be possible.
        The point is that it's not a strong security technique. If you look at the first release of the spec you linked, you'll find that there is no mention of the Referer header [w3.org], and according to the spec, the opposite behavior is what was specified, and that specification was in fact followed [schinckel.net]. Do some googling, you'll find many references to this working. And XMLHttpRequest came out in 1999, so it's spent most of its life in its insecure incarnat
    - - Re: (Score:2)
        
        by tepples ( 727027 ) writes:
        
        it only gets easier if you don't assume said script has to run inside the browser.
        If a script doesn't run inside the browser, then how is the victim's computer induced to execute it?
  - - Re: (Score:2)
      
      by 19thNervousBreakdown ( 768619 ) writes:
      
      I'll do that as soon as you demonstrate how to bite your own ear off.
      What's that? You never claimed it was possible? Huh.
      An implicit assumption when talking about checking an HTTP header is that it's done somewhere that actually needs to check it. A browser is the source of the header. Looking at it is silly and irrelev[b]a[/b]nt. It's you that doesn't understand the problem.
- Strip Referer (Score:2)
  
  by tepples ( 727027 ) writes:
  
  Why not check HTTP_REFERER variable and not serve up content if missing
  Because a lot of proxies and web browser extensions strip Referer for privacy reasons.
  - Re: (Score:2)
    
    by WaffleMonster ( 969671 ) writes:
    
    Because a lot of proxies and web browser extensions strip Referer for privacy reasons.
    Privacy plugins only strip foreign referers not same domain which is all that is needed in this case.
- Re: (Score:2)
  
  by Richy_T ( 111409 ) writes:
  
  Any decent web filtering software allows blocking based on URL components, not just the domain. Google would have to work pretty hard to circumvent that and what would be the motivation?
  - Re: (Score:2)
    
    by fatphil ( 181876 ) writes:
    
    Web filtering won't work in general. See the cookie "path" issue.
    Summary: http://www.foo.com/you/trust/this/bit/../.%2e/../../oh/shit

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

I don't know if the question should be... (Score:3)

Re:I don't know if the question should be... (Score:5, Informative)

Re:I don't know if the question should be... (Score:5, Insightful)

Re:I don't know if the question should be... (Score:4, Insightful)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re:I don't know if the question should be... (Score:5, Interesting)

Re:I don't know if the question should be... (Score:4, Interesting)

Re: (Score:3)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Constructor overhead (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Escaping is hard (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re:I don't know if the question should be... (Score:5, Informative)

Re: (Score:2, Informative)

Re: (Score:3)

Re: (Score:2)

Scripts in the body (Score:2)

Re: (Score:2)

It's called reprocessing (Score:2, Interesting)

Re:It's called reprocessing (Score:5, Informative)

Re: (Score:2)

Re: (Score:2)

Vector pictures (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Google security breaches (Score:2)

Re: (Score:2)

Filter the multi-TB IDS log plz (Score:2)

Re: (Score:2)

Yes, it really is that bad. (Score:5, Interesting)

Re:Yes, it really is that bad. (Score:5, Funny)

Re:Yes, it really is that bad. (Score:5, Insightful)

Re:Yes, it really is that bad. (Score:5, Funny)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

April 2014 (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

HTML needs a sandbox tag (Score:3)

Please no... (Score:2)

Good luck getting Apple to adopt it (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Problem can be solved, but users are the problem (Score:3)

Re: (Score:2)

Re: (Score:2)