Beta
×

Welcome to the Slashdot Beta site -- learn more here. Use the link in the footer or click here to return to the Classic version of Slashdot.

Thank you!

Before you choose to head back to the Classic look of the site, we'd appreciate it if you share your thoughts on the Beta; your feedback is what drives our ongoing development.

Beta is different and we value you taking the time to try it out. Please take a look at the changes we've made in Beta and  learn more about it. Thanks for reading, and for making the site better!

Google Talks About the Dangers of User Content

samzenpus posted more than 2 years ago | from the watch-what-you-say dept.

Google 172

An anonymous reader writes "Here's an interesting article on the Google security blog about the dangers faced by modern web applications when hosting any user supplied data. The surprising conclusion is that it's apparently almost impossible to host images or text files safely unless you use a completely separate domain. Is it really that bad? "

cancel ×

172 comments

Sorry! There are no comments related to the filter you selected.

First Comment!! (-1)

Anonymous Coward | more than 2 years ago | (#41175827)

Woohoo!

Can't trust a First Post (-1, Redundant)

billstewart (78916) | more than 2 years ago | (#41175829)

full of dangerous viruses.

Re:Can't trust a First Post (-1, Offtopic)

Lord Lode (1290856) | more than 2 years ago | (#41176131)

Good thing yours is the second!

Reply (-1, Offtopic)

Anonymous Coward | more than 2 years ago | (#41175861)

Shanghai Shunky Machinery Co.,ltd is a famous manufacturer of crushing and screening equipments in China. We provide our customers complete crushing plant, including cone crusher, jaw crusher, impact crusher, VSI sand making machine, mobile crusher and vibrating screen. What we provide is not just the high value-added products, but also the first class service team and problems solution suggestions. Our crushers are widely used in the fundamental construction projects. The complete crushing plants are exported to Russia, Mongolia, middle Asia, Africa and other regions around the world.
http://www.sandmaker.biz
http://www.shunkycrusher.com
http://www.jaw-breaker.org
http://www.jawcrusher.hk
http://www.c-crusher.net
http://www.sandmakingplant.net
http://www.vibrating-screen.biz
http://www.mcrushingstation.com
http://www.cnstonecrusher.com
http://www.cnimpactcrusher.com
http://www.Vibrating-screen.cn
http://www.stoneproductionline.com
http://www.hydraulicconecrusher.net

Re:Reply (-1)

Anonymous Coward | more than 2 years ago | (#41176691)

Crushing innovation [youtube.com] .

Re:Reply (-1, Offtopic)

nozzo (851371) | more than 2 years ago | (#41176715)

Beverley Crusher

I don't know if the question should be... (2)

Tastecicles (1153671) | more than 2 years ago | (#41175865)

...is it a server problem, with the way it interprets record data, or the browser (any browser) (maybe as instructions rather than markup)? I'm guessing server in this case, since if the stream is intercepted and there's a referrer URL that directly references an image or other blob on the same or another server on a subdomain, that could be used to pwn the account/whatever... I'm not up on that sort of hack (you can probably tell). I don't quite get how hosting blobs on an entirely different domain would mitigate against that hack, since you would require some sort of URI that the other domain would recognise to be able to serve up the correct file - which would be in the URL request! Someone want to try and make sense of what I'm trying to say here?

Re:I don't know if the question should be... (1)

Anonymous Coward | more than 2 years ago | (#41175923)

It is a security thing. Scripts from one domain may not modify pages on another.
So if you mix content from foo.google.com and bar.google.com on the same page then a js from foo can't do anything to the content from bar.

Re:I don't know if the question should be... (5, Informative)

Sarusa (104047) | more than 2 years ago | (#41176017)

It's fundamentally a problem with the browsers. Without getting too technical...

Problem 1: Browsers try real hard to be clever and interpret maltagged/malformed content so people with defective markup or bad mime content headers won't say 'My page doesn't work in Browser X, Browser X is defective!'. Or if the site is just serving up user text in html, stick some javascript tags in the text. Whichever way, you end up so someone malicious can upload some 'text' to a clipboard or document site which the browser then executes when the malicious person shares the URL.

Problem 2: There are a lot of checks in most browsers against 'cross site scripting', which is a page on site foobar.com (for instance) making data load requests to derp.com, or looking at derp.com's cookies, or even leaving a foobar.com cookie when derp.com is the main page. But if your script is running 'from' derp.com (as above) then permissions for derp.com are almost wide open, because it would just be too annoying for most users to manage permissions on the same site. Now they can grab all your docs, submit requests to email info, whatever is allowed. This is why just changing to another domain name helps.

There's more nitpicky stuff in the second half of TFA, but I think that's the gist of it.

Re:I don't know if the question should be... (5, Insightful)

TubeSteak (669689) | more than 2 years ago | (#41176049)

It's fundamentally a problem with not validating inputs. Without getting too technical...

Problem 1: Browsers try real hard to be clever and interpret maltagged/malformed content instead of validating inputs.

Problem 2: There are a lot of checks in most browsers against 'cross site scripting', which is fundamentally a problem of not validating inputs.

/don't forget to validate your outputs either.

Re:I don't know if the question should be... (3, Insightful)

Sarusa (104047) | more than 2 years ago | (#41176231)

This is true! You could even say it's a sooper-dooper-fundamental problem of HTTP/HTML not sufficiently separating the control channel from the data channel and/or not sufficiently encapsulating things (active code anywhere? noooo.)

But since browsers have actively chosen to validate invalid inputs and nobody's going to bother securing HTTP/HTML against this kind of thing any time soon, or fix the problems with cookies, or, etc etc etc, I figured that was a good enough high level summary of where we're at realistically. Nobody's willing to fix the foundations or 'break' when looking at malformed pages.

Re:I don't know if the question should be... (1)

ultrasawblade (2105922) | more than 2 years ago | (#41176801)

There is, though. "Control" stuff is supposed to go in the HTTP header and "data" stuff is supposed to go in the HTTP body.

Re:I don't know if the question should be... (1)

Khyber (864651) | more than 2 years ago | (#41177341)

And all of that is on the same data channel. Again, lack of proper separation.

Re:I don't know if the question should be... (5, Interesting)

19thNervousBreakdown (768619) | more than 2 years ago | (#41176253)

I'm actually not a big fan of validating inputs. I find proper escaping is a much more effective tool, and validation typically leads to both arbitrary restrictions of what your fields can hold and a false sense of security. It's why you can't put a + sign in e-mail fields, or have an apostrophe in your description field.

In short, if a data type can hold something, it should be able to read every possible value of that data type, and output every possible value of that data type. That means that if you have a Unicode string field, you should accept all valid Unicode characters, and be able to output the same. If you want to restrict it, don't use a string. Create a new data type. This makes escaping easy as well. You don't have a method that can output strings, at all. You have a method that can output HTMLString, and it escapes everything it outputs. If you want to output raw HTML, you have RawHTMLString. Makes it much harder to make a mistake when you're doing Response.Write(new RawHTMLString(userField)).

A multi-pronged approach is best, and input validation certainly has its place (ensuring that the user-supplied data conforms to the data type's domain, not trying to protect your output), but the first and primary line of defense should be making it harder to do it wrong than it is to do it right.

Re:I don't know if the question should be... (3, Interesting)

dzfoo (772245) | more than 2 years ago | (#41176361)

I'm actually not a big fan of validating inputs. I find proper escaping is a much more effective tool, and validation typically leads to both arbitrary restrictions of what your fields can hold and a false sense of security.

OK, fair point. How about if we expand the concept of "validating input" to include canonicalization and sanitation as well? Oh, it already does. Go figure.

Reducing it to a mere reg-exp is missing the point. Proper canonicalization (and proper understanding of the underlying standards and protocols, but that's another argument) would allow you to use a plus-sign in an e-mail address field.

But this won't happen as long as every kid fresh out of college wants to roll their own because they known The One True Way to fix it, this time For Real. As long as they keep ignoring everything learned before because, you know, it's old stuff and this is the new technology of The Web, where everything old does not count at all; nothing will change.

A multi-pronged approach is best, and input validation certainly has its place (ensuring that the user-supplied data conforms to the data type's domain, not trying to protect your output), but the first and primary line of defense should be making it harder to do it wrong than it is to do it right.

"MOAR TECH!!!1" and over-wrought protocols are no silver-bullet against ignorance, naivety, and hubris.

            -dZ.

Re:I don't know if the question should be... (2)

19thNervousBreakdown (768619) | more than 2 years ago | (#41176441)

Your solution appears to be, "Do exactly what we've been doing, just more." My rebuttal to that is the entire history of computer security. While it's true that proper understanding of underlying standards and protocols would go a long way toward mitigating the problems, a more complete solution is to make such detail-oriented understanding unnecessary. Compartmentalization of knowledge is, in my opinion anyway, the primary benefit of computers, and the rejection of providing that benefit to other programmers or utilizing it yourself while writing software smacks of programmers who don't want others invading their turf.

I'll grant you, new does not necessarily mean better. Some new approaches work better, some work worse, but we already know exactly what the old approach accomplishes.

Re:I don't know if the question should be... (2)

dzfoo (772245) | more than 2 years ago | (#41176683)

You misunderstood my point, and then went on to suggest that the "old way" won't work; inadvertently falling into the trap I was pointing out.

My "solution" (which really, it wasn't a solution per se) is not "more of the same." It is the realization that previous knowledge or practices may not be obsolete, and that we shouldn't try to find new ways to do things for the mere sake of being new.

A lot, though not all, of the security problems encountered in modern applications have been known and addressed in the past, to various degrees of success. We should embrace this experience and apply it, not shunt it as antiquated.

Whether you want to admit it or not, lack of input validation and understanding of data encoding at the various transport layers, is the source of most security issues. We should acknowledge this and address it directly.

You are right, a lot can be done to build solutions into our tools to ease their implementation. However, technology itself won't solve the problem of developers not understanding the risks or why they happen.

What does not help at all is to hand-wave or diminish this particular problem and blame the tools for not doing our due diligence. Or worse, ignore experience and history and mark it as a new problem, only solvable by more technology.

        dZ.

Re:I don't know if the question should be... (1)

postbigbang (761081) | more than 2 years ago | (#41176893)

It's easier for a lot of coders to just bypass the step of input parsing and validation. That ease, which IMHO amounts to sloppy coding, is a major crux of things like injection problems, downstream coding errors (and far beyond things like simple type mismatch), and eventual corruption.

For every programmer shrugging it off, there's another wondering if someone did the work, and probing everything from packets to simple scriptycrap to break it open for giggles, grins, and profit. They write long tomes of garbage to assault in various automated ways, big rocks to crash through the straw built from sleazy code. To those that believe in quality, carry on.

Re:I don't know if the question should be... (1)

dzfoo (772245) | more than 2 years ago | (#41177297)

Agreed.

Re:I don't know if the question should be... (1)

Zero__Kelvin (151819) | more than 2 years ago | (#41176695)

"Your solution appears to be, "Do exactly what we've been doing, just more."

No. His solution is that people need to start doing it. You're solution is to ignore solid secure programming practices [ibm.com] . In other words, your solution is to keep failing to practice secure programming.

"Some new approaches work better, some work worse, but we already know exactly what the old approach accomplishes."

Right. And we have also seen what doesn't work. Another way to say it is: "What we've got here is failure to communicate. Some men you just can't reach. So you get what we have here now, which is the way Microsoft wants it... well, Bill Gates gets it. I don't like it any more than you men."

Re:I don't know if the question should be... (1)

cdrguru (88047) | more than 2 years ago | (#41178365)

I'm not sure you have a firm grasp of the problem.

The problem, from my reading, can be explained as analogous to having a numeric data item that sometimes gets letters put in it. Rather than rejecting this as invalid browsers are making stuff up as they go along so A=1, B=2, and so on and so forth. This has the obvious benefit to users of not exposing them to the improper construction of web pages, but it does create sort of a sub-standard whereby other authors recognize this work-around and decide to make use of it in a widespread manner. Suddenly, we have a data item that is supposed to accept only numeric values but now it also accepts other things as well and interprets them.

Maladjusted Teenager then discovers that not only do we have A=1 but on some browser !@#!=exception and makes use of this.

The problem is the original non-validation and non-rejection of illegal input. Sure it makes this more "user friendly" but it opens huge gaping holes in any sort of standard. It then also encourages folks to intentionally code ABC when they want 123 because the browser accepts it. With widespread enough usage this interpretation is forced on all browsers because otherwise they are left flagging huge swaths of the web as "invalid". Keep doing this sort of stuff and you have the mess that we have today.

I assure you the solution isn't to not validate and accept anything unless you are prepared to throw out the idea of any sort of restricted content data item and everything becomes a string containing any possible character. And even that doesn't really work because of context - there are contexts where a numeric value is needed and having a non-numeric string is really incorrect. Where we have gone is pushing browsers to "interpret" this as something legal even when they should not. You can see where that has gotten us.

I'd say the correct behavior in all cases is to not interpret and not accept improper input but to throw it back. Perhaps go to the drastic step of saying because one part of this document is malformed, the document cannot be properly formatted. This would make it a lot more obvious to web designers, developers and authors they have done something wrong the first time they look at what they have done. Instead, we have interpretation trying to cover up mistakes and the result is they are hidden from both the author and the end user.

Re:I don't know if the question should be... (0)

Anonymous Coward | more than 2 years ago | (#41178563)

I'm not sure you have a firm grasp of the problem.

It seems neither do a whole host of people commenting on this issue.

The problem, from my reading, can be explained as analogous to having a numeric data item that sometimes gets letters put in it. Rather than rejecting this as invalid browsers are making stuff up as they go along so A=1, B=2, and so on and so forth.

No, the problem is users uploading content containing javascript. Then referencing this javascript from an external site. When a user who happens to also be logged into the popular site hosting my content views my page my javascript executes in the context of the other sites domain with access to the users credentials.

I'd say the correct behavior in all cases is to not interpret and not accept improper input but to throw it back. Perhaps go to the drastic step of saying because one part of this document is malformed, the document cannot be properly formatted. This would make it a lot more obvious to web designers, developers and authors they have done something wrong the first time they look at what they have done. Instead, we have interpretation trying to cover up mistakes and the result is they are hidden from both the author and the end user.

Irrelevant to the topic at hand.

Constructor overhead (1)

tepples (727027) | more than 2 years ago | (#41177045)

You don't have a method that can output strings, at all. You have a method that can output HTMLString, and it escapes everything it outputs. If you want to output raw HTML, you have RawHTMLString. Makes it much harder to make a mistake when you're doing Response.Write(new RawHTMLString(userField)).

Interesting technique. But how much runtime overhead do all those constructors impose for Java, C#/VB.NET, PHP, and Python?

Re:Constructor overhead (0)

Anonymous Coward | more than 2 years ago | (#41177513)

None, really, since you should be escaping all user input anyway.

Most MVCish frameworks have something like this, so you can test it yourself. (e.g. <%: in asp.net)

Re:Constructor overhead (1)

Richy_T (111409) | more than 2 years ago | (#41178175)

You escape user input for SQL (if you're not using parameterized queries) or whatever database you're using. You escape the output for HTML or whatever you are outputting.

If you've ever run across an application where someone has HTML escaped user input before insertion into the database and you now want to output it in a format that isn't HTML, you'll know what I'm talking about. User data should usually be *stored* as accurately to the original as possible.

Re:Constructor overhead (1)

Cajun Hell (725246) | more than 2 years ago | (#41178519)

But how much runtime overhead do all those constructors impose for Java, C#/VB.NET, PHP, and Python?

Either nothing, or nothing significant, or something-but-it-fixed-a-bug-which-was-definitely-there.

Re:I don't know if the question should be... (0)

Anonymous Coward | more than 2 years ago | (#41177253)

You're assuming input comes from a browser using a page you made yourself. That's not how things work. That input can come from code with deliberately malformed data to exploit your system. Which is what's been happening for well over a decade. If you aren't validating input in your server code, what is?

Escaping is hard (1)

TheLink (130905) | more than 2 years ago | (#41178161)

The problem is you currently can't escape everything reliably.

Why? Because the mainstream browser security concept is making sure that all the thousands of "Go" buttons are not pressed aka "escaped". But people are always introducing new "Go" buttons. If your library is not aware of the latest stuff it will not escape the latest crazy "Go" button the www/html/browser bunch have come up with.

So in theory a perfectly safe site could suddenly become unsafe, just because someone made a new "Go" button for the latest browser. Your library could also parse things differently from the victim browser.

Many years ago I proposed a tag to disable any active stuff. A "Stop" button if you like in a world full of "Go" buttons. But most of the browser and W3C people weren't interested. If they had done it, a lot of those worms (MySpace etc) wouldn't have worked at all.

Only recently they have finally come up with something called Content Security Policy: https://developer.mozilla.org/en-US/docs/Security/CSP/Introducing_Content_Security_Policy [mozilla.org]

"Stop" buttons aren't 100% but it's way easier to specify a "Stop" than it is to make sure that all the hundreds of current AND future "Go" buttons are properly escaped.

Car Analogy: before CSP, browsers were like cars with hundreds of accelerator pedals. To stop you had to make sure ALL the pedals were not pressed!

Anyone who thinks escaping is easy to do 100% should go look at the various security researcher/hackers guides on exploiting stuff. Especially if you are trying to still allow HTML content (say from advertisers or HTML email for your users). It's easy if you are only going to allow ASCII text. But once you throw in HTML and unicode, it all starts to get complicated.

Re:I don't know if the question should be... (1)

Anonymous Coward | more than 2 years ago | (#41176431)

Problem 3: Where originally scripts could only be defined in the HTML header, some not-to-be-named company in Redmond decided it was a good idea to permit defining them in the document body as well.

Scripts in the body (1)

tepples (727027) | more than 2 years ago | (#41177061)

Where originally scripts could only be defined in the HTML header, some not-to-be-named company in Redmond

It wasn't Nintendo of America, was it? :-p

decided it was a good idea to permit defining them in the document body as well.

Anywhere you have HTML element attributes beginning with on, you have scripts in the body. It's been so long ago, I can't remember: did Netscape's original version of JavaScript have onclick or onmouseover?

Re:Scripts in the body (1)

Richy_T (111409) | more than 2 years ago | (#41178203)

I think the <a href="javascript: predates even that.

Re:I don't know if the question should be... (5, Informative)

ais523 (1172701) | more than 2 years ago | (#41176457)

After seeing a demonstration of a successful XSS attack on a plaintext file (IE7 was the offending browser, incidentally), I find it hard to see what sort of validation could possibly help. After all, the offending code was a perfectly valid ASCII plain text file that didn't even look particularly like HTML, but happened to contain a few HTML tags. (Incidentally, for this reason, Wikipedia refuses to serve user-entered content as text/plain; it uses text/css instead, because it happens to render the same on all major browsers and doesn't have bizarre security issues with IE.)

Re:I don't know if the question should be... (2, Informative)

Anonymous Coward | more than 2 years ago | (#41177405)

It doesn't "refuse" to serve text/plain, it just makes you ask for it specifically. (Use ?action=raw via index.php and/or format=txt via api.php)

Re:I don't know if the question should be... (0)

Tastecicles (1153671) | more than 2 years ago | (#41176059)

answer to problem 1: should browsers, whose primary purpose is to interpret markup language, be specified to interpret markup language and display server-provided content according to that markup, and NOTHING MORE? As in, malformed/maltagged content should be IGNORED (ie dropped, not processed further)?

Oh, got it. Microsoft helped specify the capabilities of mainstream browsers, didn't they - though not until after they looked at what Nutscrape were trying to do, arseraped them and implemented most of the Nutscrape shit into IE... think back to the .wmf mess, back in the days when .wmf was the preferred file format for print clipart(! Yeah, I know - I still have dozens of CDs full of wmf cliparts)...

Re:I don't know if the question should be... (-1)

Anonymous Coward | more than 2 years ago | (#41178147)

Not the browsers -- the scripting languages themselves.
TFA doesn't even mention the biggest security hole out there -- javascript. That piece of crap is more touble than it is worth if you care anything about security.

It's called reprocessing (1, Interesting)

KreAture (105311) | more than 2 years ago | (#41175867)

Convert the file to the site supported format and quality level in sandbox.
Tadaaaa,,,

Re:It's called reprocessing (5, Informative)

Anonymous Coward | more than 2 years ago | (#41176037)

As TFA points out, it is possible to create a Flash applet using nothing but alphanumeric characters. Good luck catching that in your reprocessing.

Re:It's called reprocessing (1)

19thNervousBreakdown (768619) | more than 2 years ago | (#41176255)

Without an example it's tough to say for sure, but I suspect that it only works when the output isn't properly escaped.

Re:It's called reprocessing (1)

KreAture (105311) | more than 2 years ago | (#41176261)

How is that a picture, and how would your reprocessing code not reject it for not having header matching file name?
I think it was obvious my post refered to pictures.

Re:It's called reprocessing (0)

Anonymous Coward | more than 2 years ago | (#41177013)

It really wasn't. "Convert the file to the site supported format" could easily mean "reparse everything from unicode to ASCII"

Vector pictures (1)

tepples (727027) | more than 2 years ago | (#41177091)

Before SVG, and even now with Internet Explorer on Windows XP, SWF was the most widely compatible format for displaying vector pictures on a PC.

Re:It's called reprocessing (1)

fulldecent (598482) | more than 2 years ago | (#41177191)

I'd like to see that regex

JavaScript whacky encoding also (0)

Anonymous Coward | more than 2 years ago | (#41178257)

A long the same line, someone has also described [patriciopalladino.com] and published tools to create JavaScript using only the following characters: ()[]{}!+

"user content" (1, Insightful)

Hazel Bergeron (2015538) | more than 2 years ago | (#41175937)

Google's solution is effectively to make all content belong to Google.

Gooooo cloud!

Re:"user content" (2, Interesting)

Anonymous Coward | more than 2 years ago | (#41176093)

Umm, what does your comment have to do with the subject in TFA? They used to host content on google.com, then they moved it to googleusercontent.com for security reasons. If anything they have made it clear that the user owns it, but not for that reason.

Re:"user content" (0)

Hazel Bergeron (2015538) | more than 2 years ago | (#41176117)

I say we rename "Google" to "The Democratic People's Republic of Google". This would make it more clear that the product^Wcustomer^Wnetizen is empowered.

Re:"user content" (1)

Anonymous Coward | more than 2 years ago | (#41176269)

Are you connected to reality in any meaningful way?

Re:"user content" (0, Offtopic)

Hazel Bergeron (2015538) | more than 2 years ago | (#41176411)

The universe is not connected to anyone in any meaningful way. I wouldn't want reality to feel in my debt.

So "No" is the answer. (0)

Anonymous Coward | more than 2 years ago | (#41176615)

Put down the bong and step AWAY from the computer.

Before 1939, "propaganda" just meant "PR" (-1, Offtopic)

Hazel Bergeron (2015538) | more than 2 years ago | (#41176653)

Insight always seems like madness to the dull.

Anyway, are you honestly saying that it is better to post on /. while not high?

Re:Before 1939, "propaganda" just meant "PR" (0)

Anonymous Coward | more than 2 years ago | (#41176721)

Anyway, are you honestly saying that it is better to post on /. while not high?

For you? Definitely.

Re:Before 1939, "propaganda" just meant "PR" (0)

Hazel Bergeron (2015538) | more than 2 years ago | (#41176735)

Your past two retorts have been boring. You have one more try if you can think up something more witty, otherwise I'm closing this tab.

Re:Before 1939, "propaganda" just meant "PR" (0)

Hazel Bergeron (2015538) | more than 2 years ago | (#41176741)

I do appreciate the sentiment, though.

Assuming that today I am high, my posts are even better when I'm not high.

Doesn't follow. (0)

Anonymous Coward | more than 2 years ago | (#41177427)

They could be even worse when you're not high.

However, this doesn't excuse you being high and posting dreck like you do.

Re:Before 1939, "propaganda" just meant "PR" (0)

Anonymous Coward | more than 2 years ago | (#41177289)

And madness seems like madness to those who are insightful.

A madman is more likely to seriously claim to be insightful than an insightful man is to claim he is mad.

Google security breaches (1)

romit_icarus (613431) | more than 2 years ago | (#41175985)

For all its transparency, I've yet to see a working list of security breach attempts made on Google servers. I bet there are many, and it would be useful to know just the source and method if nothing more.

Re:Google security breaches (1)

cbiltcliffe (186293) | more than 2 years ago | (#41176015)

Security breach *attempts*?
I'm guessing a simple csv of that would be several TB in size. That's probably why you can't get a working list.

Filter the multi-TB IDS log plz (1)

tepples (727027) | more than 2 years ago | (#41177115)

You cite a multi-TB IDS log. May I have it filtered to the cases that came closest to a substantial intrusion?

Referererer (0)

Anonymous Coward | more than 2 years ago | (#41175987)

Why not check HTTP_REFERER variable and not serve up content if missing or not from the sites domain?

The ususal objections about not trusting browsers seem to be misplaced... You can trust the browser when you want the browser to protect the end user.

Another objection has to do with hacks in ancient versions of flash and other machinary that would allow referer checks to be forged/circumvented. If you are asserting this you need to show how it can still be done in 2012. These vulnerabilities were closed up years ago.

The only downside I can see you couldn't make content externally linkable from other sites other than your own which is the behavior most sites seem to prefer anyway.

Re:Referererer (1)

19thNervousBreakdown (768619) | more than 2 years ago | (#41176265)

Because writing a script to forge the Referer (sic) header is trivial.

Show how it can still be done in 2012 (1)

tepples (727027) | more than 2 years ago | (#41177199)

Another objection has to do with [...] machinary that would allow referer checks to be forged/circumvented. If you are asserting this you need to show how it can still be done in 2012.

Because writing a script to forge the Referer (sic) header is trivial.

Go ahead and show us how please.

Re:Show how it can still be done in 2012 (0)

Anonymous Coward | more than 2 years ago | (#41177817)

Seriously? Google it. It's dead simple to put *whatever* you want in a purpose-built HTTP request. (In fact, it's *supposed* to be dead simple, so normal HTTP requests can be built easily.)

How do you think your browser fills the Referrer field in the header when it sends a request? A script can do it exactly the same way, and it only gets easier if you don't assume said script has to run inside the browser.

Re:Referererer (0)

Anonymous Coward | more than 2 years ago | (#41178379)

Because writing a script to forge the Referer (sic) header is trivial.

You don't understand the problem. Sending a request with a forged referer header is trivial and also irrelevent.

The problem is how you can forge a request within the *browser* context of the user so that you can manipulate the user's relationship with the domain. If that is so trivial, why not just tell us how it can be done by forging the referer?

Strip Referer (1)

tepples (727027) | more than 2 years ago | (#41177227)

Why not check HTTP_REFERER variable and not serve up content if missing

Because a lot of proxies and web browser extensions strip Referer for privacy reasons.

Yes, it really is that bad. (5, Interesting)

VortexCortex (1117377) | more than 2 years ago | (#41176043)

This is what happens when you try to be lenient with markup instead of strict (note: compliant does not preclude extensible), and then proceed to use a horribly inefficient and inconsistent (by design) scripting language and a dysfunctional family of almost sane document display engines combined with a stateless protocol to produce a stateful application development platform by way of increasingly ridiculous hacks.

When I first heard of "HTML5" I thought: Thank Fuck Almighty! They're finally going to start over and do shit right, but no, they're not. HTML5 is just taking the exact same cluster of fucks to even more dizzying degrees. HOW MANY YEARS have we been waiting for v5? I've HONESTLY lost count and any capacity to give a damn when we reached a decade -- Just looked it up, 12 years. For about one third the age of the Internet we've been stuck on v4.01... ugh. I don't, even -- no, bad. Wrong Universe! Get me out!

In 20XX when HTML6 may be available I may reconsider "web development". As it stands, web development is chin-deep in its own filth, which it sprays with each mention onto passers-by, and they receive the horrid spittle joyously not because it's good or even not-putrid, but because we've actually had worse! I can crank out a cross-platform pixel-perfect native application for Android, iOS, Linux, OSX, XP, Vista, Win7, and mother fucking BSD in one third the time it takes to make a web app work on the various flavours of IE, Firefox, Safari, Chrom(e|ium). The time goes from 1/3rd down to 1/6th when I cut out testing for BSD, Vista, Win7 (runs on XP, likely runs on Vista & Win7. Runs on X11 + OpenGL + Linux, likely builds/runs on BSD & Mac).

Long live the Internet and actual cross platform development toolchains, but fuck the web.

Re:Yes, it really is that bad. (5, Funny)

sgrover (1167171) | more than 2 years ago | (#41176099)

+1, but tell us how you really feel

Re:Yes, it really is that bad. (0)

Anonymous Coward | more than 2 years ago | (#41176113)

Indeed. It's kind of ironic, Google's security team are grumbling about problems that are perpetuated by HTML5, a standard being driven by an employee of ... Google! Thanks Hixie!

Re:Yes, it really is that bad. (5, Insightful)

SuricouRaven (1897204) | more than 2 years ago | (#41176143)

Of course it's a mess. The combination of HTTP and HTML was designed for simple, static documents displaying predominantly text, a little formatting and a few images. By this point we're using extensions to extensions to extensions. It's a miracle it works at all.

Re:Yes, it really is that bad. (4, Funny)

adolf (21054) | more than 2 years ago | (#41176471)

It's a miracle it works at all.

It works?

Re:Yes, it really is that bad. (0)

Anonymous Coward | more than 2 years ago | (#41177145)

The GP has probably never visited this website [slashdot.org]. At least not from his mobile device, you know: simple text and some pictures.

Re:Yes, it really is that bad. (0)

Anonymous Coward | more than 2 years ago | (#41176209)

In 20XX when HTML6 may be available I may reconsider "web development".

I think you're missing an X there

Re:Yes, it really is that bad. (1)

svick (1158077) | more than 2 years ago | (#41176639)

HOW MANY YEARS have we been waiting for v5? I've HONESTLY lost count and any capacity to give a damn when we reached a decade -- Just looked it up, 12 years.

But HTML 5 is already here! It's just that it's not like the standards of old, it's a living standard. And if you don't like that, you're not agile enough.

Re:Yes, it really is that bad. (1)

arose (644256) | more than 2 years ago | (#41176705)

Remind me, is it possible to serve XHTML 1.0 across the board yet? I think it just about is, and we are at the point of "why the fuck bother anymore". If you can do better at getting shit implemented, go right ahead, but so far HTML5 has made more tangible progress than just about any other single initiative of the W3C.

Re:Yes, it really is that bad. (0)

Anonymous Coward | more than 2 years ago | (#41176939)

That's because it was the WHATWG, not the W3C. The W3C can't get shit done.

The W3C only jumped on board and gave up on XHTML 2.0 when they realized the WHATWG was winning with HTML5.

April 2014 (1)

tepples (727027) | more than 2 years ago | (#41177371)

That will be in April 2014, when Windows XP, the operating system whose latest bundled browser is IE 8, leaves extended support.

Re:Yes, it really is that bad. (0)

Anonymous Coward | more than 2 years ago | (#41176689)

Pixel-perfect web is a myth, as the web was designed specifically to be adaptive on the client side. And that's a good thing. I'm damn tired of "font-size: 10pt" (and similar styles) everywhere.

Re:Yes, it really is that bad. (1)

Skapare (16644) | more than 2 years ago | (#41176767)

It's posts like this that make me wish Slashdot could do moderations above level 5.

Re:Yes, it really is that bad. (2)

TheDarkMaster (1292526) | more than 2 years ago | (#41176915)

I think the same thing. I currently work doing "web systems". And do they work? They work; I even managed to make a web application that can use a card printer. But at what price? I spent twice the time I would have spent on compiled desktop applications, and I lost count of the many horrible hacks I had to do to get desktop-like functionality using HTML.

Re:Yes, it really is that bad. (1)

wolverine1999 (126497) | more than 2 years ago | (#41177101)

There won't be an HTML6. It's all HTML now.

Re:Yes, it really is that bad. (1)

FireFury03 (653718) | more than 2 years ago | (#41177379)

When I first heard of "HTML5" I thought: Thank Fuck Almighty! They're finally going to start over and do shit right, but no, they're not. HTML5 is just taking the exact same cluster of fucks to even more dizzying degrees.

XHTML was a pretty good step in the right direction. Enforced well-formedness is a good thing (although IMHO browsers should've had a built-in "please try to fix this page" function that the user could manually run over a broken page), and genericising tags is sensible (if you're going to embed a rectangular object then it makes sense to have a single <object> tag to do it for all content, for example; no need to produce a whole new revision of the language just because someone has invented a new type of embeddable content).

Unfortunately, the "industry" (Nokia, Microsoft, etc.) were not interested in a major overhaul, and essentially wanted a quick bodge, so they came up with HTML 5 and more or less forced the W3C to adopt it. All the good stuff that HTML 5 brings could have easily been added to XHTML in a more generic way, but the industry weren't interested, so we're left with the almighty clusterfuck known as HTML 5.

Re:Yes, it really is that bad. (0)

Anonymous Coward | more than 2 years ago | (#41177655)

Call me a dinosaur, but I'm still using HTML 1. I never had any problems with it and never felt I needed anything more or better. But this thread makes me curious. Can anyone give me a good reason to move to HTML 2? :)

Re:Yes, it really is that bad. (0)

Anonymous Coward | more than 2 years ago | (#41178241)

Pixel perfect won't be so perfect if Apple succeeds in getting the ball rolling for 2880 x 1800 and higher displays.

HTML needs a sandbox tag (2)

Hentes (2461350) | more than 2 years ago | (#41176273)

The easiest way to secure embedded content would be a sandbox tag that lets the page limit what kind of content can appear inside it.

Re:HTML needs a sandbox tag (1)

Anonymous Coward | more than 2 years ago | (#41176483)

You mean like the iframe sandbox attribute? http://www.w3schools.com/html5/att_iframe_sandbox.asp (Firefox now supports it as well, in Nightly.)
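For reference, a sandboxed iframe looks something like this (the URL is a placeholder); an empty sandbox value applies all restrictions, and tokens such as allow-scripts re-enable features selectively:

```html
<!-- Untrusted user content isolated in a sandboxed iframe: no scripts,
     no form submission, no plugins, and a unique origin by default. -->
<iframe sandbox src="https://usercontent.example/upload123.html"></iframe>
```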

Re:HTML needs a sandbox tag (0)

Anonymous Coward | more than 2 years ago | (#41176701)

That was already discussed along with including a silver bullet tag.

Please no... (1)

betterunixthanunix (980855) | more than 2 years ago | (#41176863)

Stop extending HTML! HTML does not need more tags. HTML was not designed to be a presentation language for applications and certainly not to be an environment for running applications; it was designed to be a hypertext document language (yes, "hypertext" is a word with meaning beyond HTML). The worst thing we did was to allow HTML documents with embedded programs -- applets, Javascript, etc.

The real answer is a new standard that is designed for application presentation and delivery, and that does not have so much in-band signaling. We need to get it right the first time by building security into the system, not extend an already bloated monstrosity to make up for the inevitable security problems that result from turning a language for describing documents into a platform for running distributed software with malicious users.

Good luck getting Apple to adopt it (1)

tepples (727027) | more than 2 years ago | (#41177387)

The real answer is a new standard that is designed for application presentation and delivery

That's been tried, in the form of Flex and Silverlight. Good luck getting Apple to adopt your proposed new standard.

Re:Good luck getting Apple to adopt it (1)

Richy_T (111409) | more than 2 years ago | (#41178279)

Flex? Silverlight was just another Microsoft attempt to abuse the market and that's a play everyone has gotten wise to by now.

Re:Please no... (1)

firewrought (36952) | more than 2 years ago | (#41178277)

The real answer is a new standard that is designed for application presentation and delivery, that does not have so much in-band signaling. We need to get it right the first time by building security into the system.

And to help folks bridge the gap, we could deliver this app over HTTP to a browser plugin. Great idea!! Now we just need a fancy name that will make it resonate with programmers like, um.... "Java" (cause it's a type of coffee, get it?) or "Silverlight" (cause we code while the moon's up!).

Re:HTML needs a sandbox tag (1)

TheLink (130905) | more than 2 years ago | (#41178363)

I suggested something like that 10 years ago: http://lists.w3.org/Archives/Public/www-html/2002May/0021.html [w3.org]
http://www.mail-archive.com/mozilla-security@mozilla.org/msg01448.html [mail-archive.com]
But hardly anyone was interested. If implemented, it could have prevented the Hotmail, MySpace, Yahoo and many other XSS worms.

There's Content Security Policy now:
https://developer.mozilla.org/en-US/docs/Security/CSP/Introducing_Content_Security_Policy [mozilla.org]

As far as I can see, security is not a priority for the browser and W3C bunch.
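Content Security Policy is delivered as an ordinary response header, so adopting it needs no markup changes at all. A minimal sketch as a Python WSGI app (the policy string is illustrative, not a recommendation):

```python
def app(environ, start_response):
    """Tiny WSGI app that attaches a CSP header restricting all
    resource loads (scripts, images, frames, ...) to the page's
    own origin."""
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Security-Policy", "default-src 'self'"),
    ]
    start_response("200 OK", headers)
    return [b"<p>hello</p>"]
```

Inline scripts injected via XSS are blocked under such a policy, which is exactly the class of worm mentioned above.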

Problem can be solved, but users are the problem (2)

gweihir (88907) | more than 2 years ago | (#41176659)

Images and text can be sanitized reliably. The problem is that this strips out all of the non-essential features. Users have a hard time accepting that, because they do not understand the trade-offs involved.

But the process is easy: map all images to metadata-free, compression-free formats (e.g., PNM), then recompress with a trusted compressor. For text, accept plain ASCII, RTF and HTML 2.0. Everything else, convert either to images or to cleaned PDF/PostScript by "printing" and OCR'ing.
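The "plain ASCII" branch of that pipeline is easy to sketch (the strictness here is my own choice; it deliberately ignores the RTF and HTML 2.0 cases):

```python
def is_plain_ascii(data: bytes) -> bool:
    """Accept only printable ASCII plus tab, LF and CR; reject
    everything else outright rather than trying to repair it."""
    allowed = set(range(0x20, 0x7F)) | {0x09, 0x0A, 0x0D}
    return all(b in allowed for b in data)
```

Note that, as pointed out elsewhere in this thread, some browsers will content-sniff even a valid text/plain response as HTML, so this check alone does not make hosting the file safe.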

Re:Problem can be solved, but users are the proble (0)

Anonymous Coward | more than 2 years ago | (#41177157)

If you actually read the article, you'd know that there are stupid browsers out there that will happily interpret a perfectly valid ASCII text file served as text/plain as HTML, making your "sanitizing" of it by requiring it to be plain ASCII text ineffective. :(

Firewalls/Webfiltering (1)

Anonymous Coward | more than 2 years ago | (#41176869)

What about the fact that some companies don't want their staff pulling images from sites like Google, and would block the images domain but allow the search domain?

If it were all one domain and not separated, then companies of this mindset would have to block all of Google. Many of Google's ads are text-based, and Google would lose money if it didn't offer an alternative that allows companies to block just the images.

Re:Firewalls/Webfiltering (1)

Richy_T (111409) | more than 2 years ago | (#41178297)

Any decent web filtering software allows blocking based on URL components, not just the domain. Google would have to work pretty hard to circumvent that and what would be the motivation?

Another reason why a separate domain is useful (0)

Anonymous Coward | more than 2 years ago | (#41177023)

Some regimes require families to have a content filter either on their computer or on their ISP's router that is configured to block all domains with non-premoderated user-generated content if they have children below certain age. So, if a site contains a mixture of known-safe content and user-generated content on the same domain, it will be blocked completely. That's definitely suboptimal.

Fuckle can screw themselves (-1)

Anonymous Coward | more than 2 years ago | (#41177057)

Considering their pieces of shit products such as Fuckle Assdroid (A horrible ripoff of iOS) and Fuckle Chrap (A horrible ripoff of Safari) is it any wonder anyone takes them seriously. Oh, that's right it's open source so they get free passes. Forget about the fact Fuckle is using code from the open source community much like crappy TiVo. Fuckle needs to merge with M$ so they will be on the Anti-Trust radar again then kill 2 birds with one stone. Apple FTW!!!!!!!!!!!!

What Google wants (0)

Anonymous Coward | more than 2 years ago | (#41177645)

Here's the giveaway: "In the days of static HTML and simple web applications, giving the owner of the domain authoritative control over how the content is displayed wasn’t of any importance."

"giving the owner of the domain authoritative control over how content is displayed"

The article says no more about this, but instead proceeds to (correctly) detail a number of flaws with common web app protocols and procedures and how Google deals with them.

I agree with Google - web apps suck eggs. The world could really use something better. But be very careful what you wish for, because for all its warts, the web app remains one of the only viable ways to produce widely available applications using open standards. Take that away, and we're back to the 1980s, when the only way to do anything was to serve at the caprice of proprietary vendors.

Novel Solution (2, Interesting)

Sentrion (964745) | more than 2 years ago | (#41177825)

This was a real problem back in the 1980s. Every time I would connect to a BBS, my computer would execute any code it came across, which made it very easy for viruses to infect my PC. But lucky for me, in the early '90s the world wide web came into being and I didn't have to run executable code just to view content that someone else posted. The PC was insulated from outside threats by viewing the web "pages" only through a "web browser" that only let you view the content, which could be innocuous text, graphics, images, sound, and even animation that was uploaded to the net by way of a non-executable markup language known as HTML. It was at this time that the whole world began to use their home computers to view content online, because it was now safe for amateurs and noobs to connect their PCs to the internet without any worries of being inundated with viruses and other malware.

Today I only surf the web with browsers like Erwise, Viola, Mosaic, and Cello. People today are accessing the internet with applications that run executable code, such as Internet Explorer and Firefox. Very dangerous for amateurs and noobs.

It is True (0)

Anonymous Coward | more than 2 years ago | (#41177975)

I googled and found that it is TRUE...
