
Optimizing Page Load Times

kdawson posted more than 7 years ago


John Callender writes, "Google engineer Aaron Hopkins has written an interesting analysis of optimizing page load time. Hopkins simulated connections to a web page consisting of many small objects (HTML file, images, external JavaScript and CSS files, etc.), and looked at how things like browser settings and request size affect perceived performance. Among his findings: for web pages consisting of many small objects, performance often bottlenecks on upload speed rather than download speed. Also, by spreading static content across four different hostnames, site operators can achieve dramatic improvements in perceived performance."


186 comments


Erm.. huh? (1, Insightful)

Turn-X Alphonse (789240) | more than 7 years ago | (#16639995)

I'm not quite sure how this really has any "real world" effect... Browsing speed is already insanely fast, to the point where if you blink you miss the loading on most connections these days. How does speeding up that second or so really help anything?

I can see its use on large sites, but this seems aimed at smaller sites.

Then again HTML isn't my thing so it goes over my head I guess.

Re:Erm.. huh? (3, Informative)

rf0 (159958) | more than 7 years ago | (#16640019)

If you are on a fast broadband pipe you are correct, but there are still a lot of people on small connections with low upload limits (64-256 kbit), and I can see why this could be a bottleneck: the browser can't get the requests out fast enough. That said, there are things users can do to help themselves.

Firstly, if the ISP has a proxy server then using it will reduce the trip time for some stored content, meaning requests only have to go over a few hops rather than perhaps all the way across the world. You can also look at something like Onspeed [onspeed.com], which is a paid-for product but compresses images (though it makes them look worse) and content, and can give a decent boost on very slow (GPRS/3G) connections while also getting more out of your transfer quota.

Re:Erm.. huh? (1)

Jussi K. Kojootti (646145) | more than 7 years ago | (#16640033)

Browsing speed is already insanely fast to the point where if you blink you miss the loading on most connections these days.
Unfortunately that is not true. Many "broadband" connections are definitely not insanely fast, and at least here in Finland the upload speeds of most connections are so pathetic that the problems mentioned in the article are very easily observed.

Re:Erm.. huh? (3, Interesting)

mabinogi (74033) | more than 7 years ago | (#16640071)

1.5Mbps ADSL.
5 seconds to refresh the page on Slashdot. That's just getting the page to actually blank and refresh; there's still the time it takes to load all the comments.
Sometimes it's near instant, but most of the time it's around about that.
Most of the time is spent "Waiting for slashdot.org" or "Connecting to images.slashdot.org".
It used to be a hell of a lot worse, but I installed Adblock to eliminate all the extra unnecessary connections (Google Analytics and the various ad servers). I didn't care about the ads or the tracking; it just bugged me that those things made my browsing experience slower.
I find it funny that this guy is suggesting spreading across multiple hosts; it's my completely unscientific and entirely anecdotal experience that the more hostnames the browser has to resolve to load the page, the longer it takes before you get to see anything.

I'm in Australia, so there's a minimum 200 ms latency on round trips - five round trips and you've added one second to the rendering time. Approaches that add extra DNS lookups really aren't going to help. (Though the DNS lookups themselves aren't necessarily going to take 200 ms - they could be much faster if they're in my ISP's DNS cache, or longer if it has to query further upstream.)

Re:Erm.. huh? (1)

mabinogi (74033) | more than 7 years ago | (#16640157)

A little clarification to that - I have pipelining on, which may be why multiple hosts is a net loss for me, instead of a gain.

Re:Erm.. huh? (3, Informative)

x2A (858210) | more than 7 years ago | (#16640321)

There are other factors.

1 - with keep-alive/pipelined connections, only one DNS lookup is performed per host, and since lookups are often cached on your local machine, this delay is minimal.

2 - the DNS lookup for the second host can happen while connections to the first host are still downloading, rather than stopping everything while the second host is looked up. This hides the latency of the second lookup.

3 - most browsers limit the number of connections to each server to 2. If you're loading loads of images, this means you can only be loading two at once (or one while the rest of the page is still downloading). If you put images on a different host, you can get extra connections to it. Also, cookies will usually stop an object from taking advantage of proxies/caches. Putting images on a different host is an easy way to make sure they're not cookied.
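To illustrate point 3: if the session cookie is scoped to the main hostname only, requests for images on a separate hostname carry no Cookie header at all, so proxies and caches are free to serve them. A minimal sketch, with hypothetical hostnames:

    Set-Cookie: session=abc123; Domain=www.example.com; Path=/

    GET /img/header.png HTTP/1.1
    Host: images.example.com

The second request goes out without any Cookie header, because the cookie's Domain doesn't match images.example.com.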

How nice of you... (1)

shaneh0 (624603) | more than 7 years ago | (#16641079)

It's obvious how much you identify as part of the Slashdot community. I mean, only dedicated Slashdotters would go to the trouble of adblocking the few banner adverts on this page just so they can get to the comments faster. That's a clear sign that you care more about the good of the community than yourself. I just wanted to say: thank you for making such a commitment to this website. If everyone took such extraordinary steps to block completely unobtrusive advertising, then this website would be a much better place.

We all know that slashdot doesn't NEED advertising revenue. It's a little-known fact that Slashdot servers don't need bandwidth or electricity: You just pour a four-pack of redbull directly into the gigabit ethernet port and it serves requests like a daemon for 2-3 days without needing to sleep().

Re:Erm.. huh? (1)

jakoz (696484) | more than 7 years ago | (#16640091)

It still has very big implications. For you it obviously has no effect, but let me give you an example.

We are in the middle of planning a software release that rolls out to thousands of users. So that they can access it remotely, we are toying with the idea of supporting 3G PCMCIA cards.

In the area we're benchmarking, latency and a painfully slow slow-start windowing algorithm are the limiting factors. Keep in mind that this software is crucial to the company, which is a fairly large one. Adoption rates of it drive the company.

We are limited by the existing platform to using IE. A simple registry hack such as the one discussed, to increase the maximum number of connections, can (and will) make the difference to our customers. It was my number one recommendation when they ran into performance problems, and I'd bet it will improve benchmark times more than their next two solutions combined.
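For reference, the kind of registry tweak being discussed is usually the pair of IE settings below, shown as a .reg sketch (the value of 8 is just an example; these keys apply to IE6/IE7-era browsers):

    Windows Registry Editor Version 5.00

    [HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Internet Settings]
    "MaxConnectionsPerServer"=dword:00000008
    "MaxConnectionsPer1_0Server"=dword:00000008

MaxConnectionsPerServer affects connections to HTTP/1.1 servers; MaxConnectionsPer1_0Server affects HTTP/1.0.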

Re:Erm.. huh? (1)

LiquidCoooled (634315) | more than 7 years ago | (#16640153)

Clear your cache and then try loading something like Digg.
The front page is so chock-full of JavaScript files that it takes an age to come up.
As an example, I can click Digg into the first tab, then click Slashdot, Boing Boing and Gmail into others.
I can glance at the front pages of all of them, and most of the time clear out my spam, before Digg even appears - it sticks on the JS files.

Those tenths of seconds add up (4, Informative)

giafly (926567) | more than 7 years ago | (#16640187)

If a big part of your job involves using a Web-based application, reducing page-load times really helps. My real job is writing one of these applications and getting the caching right is much more important than sexier topics like AJAX. There's some good advice in TFA.

Re:Erm.. huh? (3, Interesting)

orasio (188021) | more than 7 years ago | (#16640599)

User perception of responsiveness in interfaces has a lower bound of about 200 ms, sometimes even lower.

Just because 1 second seems fast, it doesn't mean it's fast enough to stop improving.
Once you get down to that 200 ms barrier, the interface feels perfectly responsive; anything longer can still be improved.


HTTP Pipelining (5, Informative)

onion2k (203094) | more than 7 years ago | (#16640007)

If the user were to enable pipelining in his browser (such as setting Firefox's network.http.pipelining in about:config), the number of hostnames we use wouldn't matter, and he'd make even more effective use of his available bandwidth. But we can't control that server-side.

For those that don't know what that means: http://www.mozilla.org/projects/netlib/http/pipelining-faq.html [mozilla.org]

I've had it switched on for ages. I sometimes wonder why it's off by default.
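For anyone wanting to try it, the relevant Firefox preferences can be flipped in about:config or in user.js; a minimal sketch (the maxrequests value is just an example):

    user_pref("network.http.pipelining", true);
    user_pref("network.http.proxy.pipelining", true);
    user_pref("network.http.pipelining.maxrequests", 8);

Pipelining only applies to keep-alive connections, so network.http.keep-alive (on by default) needs to stay enabled.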

Re:HTTP Pipelining (1)

SanityInAnarchy (655584) | more than 7 years ago | (#16640069)

I always wonder why it's off by default. IE I can understand -- they still don't support XHTML -- but Firefox?

Re:HTTP Pipelining (4, Interesting)

baadger (764884) | more than 7 years ago | (#16640171)

This is not just Opera fanboyism, but Opera *does* do pipelining by default (with a safe fallback):

"Opera pipelines by default - and uses heuristics to control the level of pipelining employed depending on the server Opera is connected to."
Reference [operawiki.info]

Re:HTTP Pipelining (-1, Redundant)

baadger (764884) | more than 7 years ago | (#16640181)

Just found this...

Mozilla's HTTP/1.1 Pipelining FAQ [mozilla.org]

Re:HTTP Pipelining (0)

Anonymous Coward | more than 7 years ago | (#16640283)

Here's a comment [slashdot.org] linking to it.

Re:HTTP Pipelining (1)

Urza9814 (883915) | more than 7 years ago | (#16640399)

Oh wow. My load times suck because I'm running freenet, but that pipelining thing just cut them in half.

Some reasons (2, Informative)

harmonica (29841) | more than 7 years ago | (#16640435)

I've had it switched on for ages. I sometimes wonder why it's off by default.

Some reasons against pipelining [mozillazine.org] .

Re:Some reasons (1)

Chris Pimlott (16212) | more than 7 years ago | (#16640981)

Er, not a very informative page; the only caveat listed is a vague notion that it's "unsupported" and "can prevent Web pages from displaying correctly" because it's "incompatible with some Web servers and proxy servers". There may well be good reasons, but that page doesn't really explain why.

HTTP/1.1 Design (5, Insightful)

keithmo (453716) | more than 7 years ago | (#16640015)

From TFA:

By default, IE allows only two outstanding connections per hostname when talking to HTTP/1.1 servers or eight-ish outstanding connections total. Firefox has similar limits.

And:

If your users regularly load a dozen or more uncached or uncachable objects per page load, consider evenly spreading those objects over four hostnames. Due to browser oddness, this usually means your users can have 4x as many outstanding connections to you.

From RFC 2616, section 8.1.4:

Clients that use persistent connections SHOULD limit the number of simultaneous connections that they maintain to a given server. A single-user client SHOULD NOT maintain more than 2 connections with any server or proxy.

It's not a browser quirk, it's specified behavior.

Re:HTTP/1.1 Design (1)

jakoz (696484) | more than 7 years ago | (#16640123)

It might be specified behavior, but it's outdated and seriously needs to get with the times.

It has been that way since I was on dialup many years ago. It might have been prudent at the time, but not any more.

Things have changed. The popularity of FasterFox, which happily breaks all the specifications, is a reflection of that.

I feel that 10-20 is a much more realistic figure now. I haven't seen many webmasters complaining about FasterFox.

Re:HTTP/1.1 Design (1)

ben there... (946946) | more than 7 years ago | (#16640219)

I feel that 10-20 is a much more realistic figure now. I haven't seen many webmasters complaining about FasterFox.

I've seen webmasters complain right on FasterFox's download page on Mozilla Update.

Re:HTTP/1.1 Design (2, Informative)

jakoz (696484) | more than 7 years ago | (#16640327)

Then perhaps they need to invest in some modern systems. The following definitions are interesting:

3. SHOULD   This word, or the adjective "RECOMMENDED", mean that there may exist valid reasons in particular circumstances to ignore a particular item, but the full implications must be understood and carefully weighed before choosing a different course.

4. SHOULD NOT   This phrase, or the phrase "NOT RECOMMENDED" mean that there may exist valid reasons in particular circumstances when the particular behavior is acceptable or even useful, but the full implications should be understood and the case carefully weighed before implementing any behavior described with this label.

They don't say DO NOT or MUST NOT. Like they say, the behavior can be useful... and they could see this would be the case IN 1997!

It is time we updated things. It's particularly funny that Microsoft found this RFC, of all things, to obey.

Re:HTTP/1.1 Design (4, Interesting)

x2A (858210) | more than 7 years ago | (#16640347)

The limit isn't to do with your connection speed as such - it's to do with being polite and not putting too much drain on the server you're downloading from.

Re:HTTP/1.1 Design (1)

jakoz (696484) | more than 7 years ago | (#16640389)

I realize that, but the limits need to be revised. 2 might have been courteous a decade ago, but now it isn't realistic.

Re:HTTP/1.1 Design (3, Insightful)

x2A (858210) | more than 7 years ago | (#16640405)

Depends on server load; how many of the objects are static vs dynamic etc. 5-10 connections for images might be okay, but for dynamic objects it might not be. Perhaps it should be specifiable within the html page?

Re:HTTP/1.1 Design (1)

jakoz (696484) | more than 7 years ago | (#16640445)

You know... that's a damn good idea and should be modded up. It's a very good solution that should be in the specs already. Granted, some browsers could ignore it, but they could anyway.

Re:HTTP/1.1 Design (2, Insightful)

hany (3601) | more than 7 years ago | (#16640693)

In the end you have just one pipe to push that data through, even if you have, say, 100 connections.

With one pipe of a given capacity (i.e. bandwidth), increasing the number of connections means wasting part of that bandwidth on the overhead of maintaining multiple connections.

You're also wasting the server's resources, for the same reason.

In the end, you're slowing yourself down.

Yes, there are scenarios where using, for example, 4 connections instead of 1 yields better download performance, but AFAIK almost all such scenarios are very specific to a given web server implementation, a given network, a given browser, ...

So to sum myself up: I think the 1-2 active connections per client mentioned in RFC 2616 was generally valid in 1997, is generally valid now, and will also be generally valid in the future.

By contrast, "the hack" of using multiple connections to speed up downloads is sometimes valid - past, present, and future - but generally degrades performance.

The pity is that Aaron Hopkins mentions the true solution (HTTP pipelining) only as "(Optional)", and at the very end of the article. But he does correctly describe his earlier propositions as "tricks". :)

Re:HTTP/1.1 Design (0)

Anonymous Coward | more than 7 years ago | (#16640709)

Please, people, mod parent up.

Re:HTTP/1.1 Design (1)

Sulka (4250) | more than 7 years ago | (#16640545)

Uhh... I bet you haven't ever administered a large website.

When you have a lot of concurrent users, the number of TCP sockets you can have open on a given server while still maintaining good throughput is limited. If every user out there had 20 sockets open to each server, making very large sites scale would be seriously hard.

I do agree the two-socket limit is a bit low, but 20 would be total overkill.

Re:HTTP/1.1 Design (1)

statusbar (314703) | more than 7 years ago | (#16641289)

The wonderful thing about the RFC language "SHOULD" and "SHOULD NOT" is that they really are only suggestions that do not need to be followed. That makes it "wonderful" to test all possible combinations of "should" and "should not" options in the protocol, with both clients and servers - probably the biggest source of bugs and problems.

rfc2119 [faqs.org] defines the terms:

3. SHOULD This word, or the adjective "RECOMMENDED", mean that there
may exist valid reasons in particular circumstances to ignore a
particular item, but the full implications must be understood and
carefully weighed before choosing a different course.

--jeffk++

Simulation software available? (3, Informative)

leuk_he (194174) | more than 7 years ago | (#16640029)

"Regularly use your site from a realistic net connection. Convincing the web developers on my project to use a "slow proxy" that simulates bad DSL in New Zealand (768Kbit down, 128Kbit up, 250ms RTT, 1% packet loss) rather than the gig ethernet a few milliseconds from the servers in the U.S. was a huge win. We found and fixed a number of usability and functional problems very quickly."

What (free) simulation tools are available for this? I only know dummynet, which requires a Linux server and some advanced routing. But surely there is more. Is there?
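One option on a Linux gateway is tc with the netem qdisc, which can add delay and loss, with a token bucket filter chained under it to cap the rate. A minimal sketch (eth0 and the numbers are illustrative; netem shapes outgoing traffic only, so run it on the interface facing the test client or on both ends):

    # add 250ms delay and 1% packet loss
    tc qdisc add dev eth0 root handle 1:0 netem delay 250ms loss 1%
    # cap the rate at roughly 768kbit underneath the netem qdisc
    tc qdisc add dev eth0 parent 1:1 handle 10: tbf rate 768kbit buffer 1600 limit 3000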

Re:Simulation software available? (1)

OffTheLip (636691) | more than 7 years ago | (#16640085)

NIST Net (http://www-x.antd.nist.gov/nistnet) can be used for latency and bandwidth simulation. It is no longer supported, though. Not sure what other free alternatives are out there.

Re:Simulation software available? (2, Interesting)

Jussi K. Kojootti (646145) | more than 7 years ago | (#16640105)

Try trickle. It won't do fancy stuff like simulating packet loss, but a
trickle -d 100 -u 20 -L 50 firefox
should limit download and upload rates and add latency.

Re:Simulation software available? (1)

rHBa (976986) | more than 7 years ago | (#16640129)

Try going back to 56k (or less) dial-up.

Re:Simulation software available? (1)

leuk_he (194174) | more than 7 years ago | (#16640245)

Actually, dial-up costs me money, while I already have always-on ADSL available.

And it is impossible to simulate lines faster than my current ADSL that way.

Re:Simulation software available? (4, Interesting)

ggvaidya (747058) | more than 7 years ago | (#16640279)

You could try using Sloppy [dallaway.com]. I've only ever heard about it because its author has a very nice page on getting a free Thawte FreeMail certificate to work with Java Web Start [dallaway.com], so this isn't a recommendation or anything. Looks pretty decent, though.

Re:Simulation software available? (1)

Slashdiddly (917720) | more than 7 years ago | (#16640495)

From the home page:

Sloppy deliberately slows the transfer of data between client and server.

So basically a straight proxy then - only in Java!

sloppy. (1)

leuk_he (194174) | more than 7 years ago | (#16640567)

Read the FAQ. It is a very simple webserver in Java, not a proxy.

"Sloppy doesn't work as a real HTTP proxy so don't configure your browser to use it."

Re:Simulation software available? (1)

hacker (14635) | more than 7 years ago | (#16641137)

Too bad it doesn't work on Firefox 1.5.x or 2.x on Linux with the latest Java plugin from Sun.

Re:Simulation software available? (1)

badfish99 (826052) | more than 7 years ago | (#16640343)

If you're running Linux you can do all of this with iptables (even simulating random packet loss). The command syntax is a bit complicated, but once you've got the hang of it, it is extremely powerful.
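For the packet-loss part specifically, the statistic match can randomly drop packets; latency and bandwidth shaping are usually done with tc instead. A minimal sketch (1% loss on incoming traffic, purely illustrative):

    iptables -A INPUT -m statistic --mode random --probability 0.01 -j DROP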

Re:Simulation software available? (0)

Anonymous Coward | more than 7 years ago | (#16640505)

You're looking for "honeyd"; it's what you need.

Public Proxies (0)

Anonymous Coward | more than 7 years ago | (#16640639)

Use public proxies -- they are usually very slow.

Re:Simulation software available? (1)

wralias (862131) | more than 7 years ago | (#16640937)

Fiddler for Windows - http://www.fiddlertool.com/ [fiddlertool.com] - and Charles for Mac OSX - http://www.xk72.com/charles/ [xk72.com] - are debugging proxies. Very easy to use, and both allow you to throttle bandwidth to something lower than what you are using. Charles has a native way to do it, but I think you either have to program or download a new "rule" for Fiddler in order to throttle. Anyway, both are extremely valuable for solving a multitude of problems, not just throttling bandwidth.

concurrent web server connections (1)

dysfunct (940221) | more than 7 years ago | (#16640081)

Also, by spreading static content across four different hostnames, site operators can achieve dramatic improvements in perceived performance.

I've worked with heavily loaded servers that serve many pictures per page and can confirm that this does decrease perceived load time, but it has its drawbacks. Since you're pushing the number of concurrent browser requests up to num_hostnames * browser_default against the same physical host, you'll have to increase the maximum number of concurrent requests your web server accepts, which can badly increase system load and lead to easy slashdotting situations. Only do this if you can modify those settings, know what your server is capable of, and are not limited by bandwidth, as this can also quickly fill your tubes. And as the article states: only do this with small objects, or you might be under heavy load in no time.
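As a concrete sketch of the hostname-spreading trick (hostnames are hypothetical): assign each object to one of the static hosts deterministically, e.g. by hashing its path, so the same object always comes from the same hostname and the browser cache isn't defeated.

    <!-- each static host can resolve to the same server(s); the split just raises
         the browser's per-hostname connection limit -->
    <link rel="stylesheet" href="http://static1.example.com/css/site.css">
    <script src="http://static2.example.com/js/site.js"></script>
    <img src="http://static3.example.com/img/header.png" alt="">
    <img src="http://static4.example.com/img/icon-rss.png" alt="">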

Css and Scripts (5, Informative)

Gopal.V (532678) | more than 7 years ago | (#16640111)

I've done some benchmarks and measurements in the past which will never be made public (I work for Yahoo!), and the most important bits in those have been CSS and scripts. A lot of performance has been squeezed out of the HTTP layers (Akamai, Expires headers), but not enough attention has been paid to the rendering part of the experience. You could probably reproduce the benchmarks with a PHP script that sleep()s for a few seconds to introduce delays at various points, and with a weekend to waste [dotgnu.info].

The page does not start rendering till the last CSS stream is completed, which means that if your CSS has @import url() entries, the delay before rendering increases (until that file is pulled and parsed too). It really pays to make the CSS the quickest thing you load, above anything else - because without it, all you'll get is a blank page for a while.
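A small illustration of the @import point (filenames are hypothetical): the imported sheet is only discovered after the parent sheet has downloaded and been parsed, so the fetches are serialized, whereas two <link> tags in the head are discovered immediately and can be fetched in parallel.

    /* main.css - extra.css is only requested after main.css has arrived */
    @import url("extra.css");

    <!-- usually better: both sheets are visible to the browser up front -->
    <link rel="stylesheet" href="/css/main.css">
    <link rel="stylesheet" href="/css/extra.css">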

Scripts marked defer do not always defer, and so much inline code in <script> tags depends on such scripts that a lot of browsers just pull the scripts as and when they find them. There seem to be just two threads downloading data in parallel (from one hostname), which means a couple of large (but rarely used) scripts in the code will block the rest of the CSS/image fetches. See Flickr's Organizr [flickr.com] for an example of that in action.

You should understand that these resources have different priorities in render land, and you should really only venture here after you've optimized the other bits (server [yahoo.com] and application [php.net]).

All said and done, it's a good tutorial by Aaron Hopkins - a lot of us have had to rediscover all that (and more) by ourselves.

Re:Css and Scripts (2, Informative)

Evets (629327) | more than 7 years ago | (#16640203)

I've found that once a page has layout it will begin rendering, and not before. CSS imported in the body does not prevent rendering; CSS imported in the HEAD will. In fact, the CSS and JavaScript in the head section seem to need to be downloaded before rendering starts.

I have also found that cached CSS and JavaScript can play with you a little bit. When developing a site you tend to see an expected set of behaviors based on your own experience with it, but you can find later that having the external files either cached or not cached can have an effect on things (e.g. a cached JavaScript file with a load event may be triggered before the DOM is ready if you aren't checking for the readiness of the DOM itself).

ETag headers are very important as well. Running "tail -f access.log" while you browse your own site will show a lot of redundant calls to JavaScript, CSS, and image files that should be cached but aren't. IE has a setting along the lines of "check for newer versions of stored pages" that really fouls up CSS background images without proper expiration headers (lots of flickering).
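The expiration headers being described are typically set with mod_expires in Apache; a minimal sketch, assuming the module is loaded and with lifetimes that are only examples:

    ExpiresActive On
    ExpiresByType image/png "access plus 1 month"
    ExpiresByType image/gif "access plus 1 month"
    ExpiresByType text/css  "access plus 1 week"
    ExpiresByType application/x-javascript "access plus 1 week"

With explicit Expires/Cache-Control on the background images, IE's "check for newer versions of stored pages" setting stops re-requesting them on every page and the flickering goes away.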

There is still a significant portion of the web community that uses dial-up connections. These users are seemingly ignored by many popular sites. I try to get pages to load in under 8 seconds for dial-up users, but with any significant JavaScript or CSS that is sometimes a difficult task. It's much easier on subsequent page loads if you force caching, but that doesn't matter one bit if the user goes elsewhere because the initial page load was too slow.

There are certainly a plethora of optimization techniques not even touched on in this article. I know that Google and Yahoo are very keen on these subjects, and it's worth taking a look at the source of some of their pages for ideas. Last I checked, they couldn't care less about validation, though. But with the bandwidth they must use, saving a few bytes here and there can mean a significant dollar difference at the end of the month, and what truly matters is whether or not the browser renders the page correctly.

Re:Css and Scripts (1)

RAMMS+EIN (578166) | more than 7 years ago | (#16640491)

``The page does not start rendering till the last CSS stream is completed''

On the other hand, stylesheets are often static and used by many pages, so they could be cached (Opera does this). The same is true of scripts.

Re:Css and Scripts (1, Interesting)

Anonymous Coward | more than 7 years ago | (#16640585)

The remarkable thing here is that Google is one of the major causes of slow-loading web pages, due to the way their AdSense system works. The webmaster is not allowed to modify the code that loads the script that creates the ads. Thus the script always loads inline, and since ads are usually placed at the top of a page, delays in delivering the AdSense script, which have become more frequent and severe lately, cause the rest of the page to stall.

Possibly a little contradictory (0)

Anonymous Coward | more than 7 years ago | (#16640115)

Glad to see somebody has taken the time to write about these things, even if they are mostly self-evident. One thing that I'm not sure about...

"Load fewer external objects. Figure out how to globally inline the same one or two javascript files"


You see this in forum HTML where they have inlined hundreds of lines of JS and CSS; this increases the response size of every request. The page will display faster the first time, but the correct thing to do is to set Expires headers and have the browser cache external CSS and JavaScript files. It's a tradeoff: larger CSS segments and JavaScript used site-wide should always be external, while 200 bytes in a head element sent over the wire using gzip (because deflate confuses some browser vendors) is probably not worth bothering with.

Spreading content across hostnames... (0)

bigmouth_strikes (224629) | more than 7 years ago | (#16640143)

Just what we need at a time when phishing attacks are a constant worry: having to figure out whether not just one, but four different hostnames belong to the site operator in question.

Re:Spreading content across hostnames... (1, Informative)

Anonymous Coward | more than 7 years ago | (#16640183)

hostnames != domainname

Why would a sub-domain confuse anyone?

rss.slashdot.org
apple.slashdot.org
ask.slashdot.org
backslash.slashdot.org

Caching of dynamic content (4, Insightful)

baadger (764884) | more than 7 years ago | (#16640147)

This [web-caching.com] is a good place to start testing the 'cacheability' of your dynamic web pages. Quite frankly, it's appalling that even the big, common web apps used today, like most forum or blog scripts, don't generate sensible Last-Modified, Vary, Expires, or Cache-Control headers. With most of the metadata you need to generate this stuff already stored in the existing database schema, there's really no excuse for it.

Abolishing nasty long query strings in favour of nicer, more memorable URIs is also something we should be seeing more of in "Web 2.0". Use mod_rewrite [google.com]; you'll feel better for it.
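An illustrative mod_rewrite rule of the kind being suggested (paths and parameter names are hypothetical), mapping a memorable URI onto the real query-string URL:

    RewriteEngine On
    # /article/123 is served by /index.php?view=article&id=123
    RewriteRule ^article/([0-9]+)$ /index.php?view=article&id=$1 [L,QSA]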

Re:Caching of dynamic content (1)

RAMMS+EIN (578166) | more than 7 years ago | (#16640507)

``Quite frankly, it's appalling that even the big, common web apps used today, like most forum or blog scripts, don't generate sensible Last-Modified, Vary, Expires, or Cache-Control headers.''

The problem is that things don't usually break if you don't use these headers effectively. In other words, you don't notice that something could be improved.

Re:Caching of dynamic content (1)

hacker (14635) | more than 7 years ago | (#16641013)

Woops!

Not Found
The requested URL /cgi-web-caching/cacheability.py was not found on this server.

Apache/1.3.31 Server at www.web-caching.com Port 80

Re:Caching of dynamic content (0)

Anonymous Coward | more than 7 years ago | (#16641061)

Why must you maul the English language so?

Pipelining (1)

RAMMS+EIN (578166) | more than 7 years ago | (#16640223)

FTFA:

``Neither IE nor Firefox ship with HTTP pipelining enabled by default.''

Huh? So all these web servers implement keep-alive connections and browsers don't use it?

Not the same thing (0)

Anonymous Coward | more than 7 years ago | (#16640265)

Pipelining depends on keep-alive to concatenate multiple HTTP responses into a single TCP packet.

Re:Pipelining (2, Informative)

smurfsurf (892933) | more than 7 years ago | (#16640317)

Pipelining is not the same as keep-alive, although pipelining needs a keep-alive connection.
Pipelining means "multiple requests can be sent before any responses are received."

Re:Pipelining (4, Informative)

TheThiefMaster (992038) | more than 7 years ago | (#16640333)

Pipelining is not keep-alive. Keep-alive means sending multiple requests down one connection, waiting for the response to each request before sending the next. Pipelining sends all the requests at once without waiting.

Keep-alive no:
Open connection
-Request
-Response
Close Connection
Open connection
-Request
-Response
Close Connection
-Repeat-

Keep-alive yes:
Open connection
-Request
-Response
-Request
-Response
-Repeat-
Close Connection

Pipe-lining yes:
Open connection
-Request
-Request
-Repeat-
-Response
-Response
-Repeat-
Close Connection

Re:Pipelining (2, Informative)

x2A (858210) | more than 7 years ago | (#16640397)

Keep-alive sends the next request after the first has completed, but on the same connection (this requires the server to send a Content-Length: header, so the browser knows after how many bytes the object has finished loading. Without it, the server must close the connection so the browser knows it's done).

Pipelining sends requests out without having to wait for the previous one to complete (this also requires a Content-Length: header. That's fine for static files, such as images, but many scripts whose output is sent straight to the browser as it's being generated will break this, as the server won't know the content length until generation has completed).

Re:Pipelining (1)

guy-in-corner (614138) | more than 7 years ago | (#16640809)

...it won't know the content length until generated has completed.

Which is what Chunked Transfer Coding is for. See Section 3.6 of RFC 2616.
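For illustration, this is roughly what a chunked response looks like on the wire: each chunk is prefixed with its size in hex, and a zero-length chunk terminates the body, so the server never needs to know the total length up front.

    HTTP/1.1 200 OK
    Content-Type: text/plain
    Transfer-Encoding: chunked

    5
    Hello
    8
    , world!
    0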

DNS (0)

Anonymous Coward | more than 7 years ago | (#16640229)

Also, by spreading static content across four different hostnames, site operators can achieve dramatic improvements in perceived performance.
Isn't it common knowledge that DNS is a significant bottleneck?

Re:DNS (0)

Anonymous Coward | more than 7 years ago | (#16640299)

How so? There are trade offs between fast site response and initial load time, usually the expense of a few DNS lookups is easily outweighed by having servers dedicated to serving static content. Unless you routinely set insanely low TTLs?

Re:DNS (0)

Anonymous Coward | more than 7 years ago | (#16640611)

The reason for delivering static content from different domains is that HTTP/1.1 recommends a maximum of 2 concurrent connections per server (hostname). On a page with many small files, you have to wait the round-trip time per file, and the browser can't do anything else in the meantime since it isn't allowed to open more connections.

HTTP pipelining would fix that, but it is not enabled by default because there are still some servers out there which get confused by it. Another way to improve load time is to ignore the HTTP spec and increase the number of concurrent connections, but just like pipelining, this is beyond the control of the server admin. Consequently: distribute static objects over subdomains to circumvent the connection limit. DNS lookups of subdomains are also fast, because the resolver only needs to ask the same nameserver again rather than walk the whole chain of servers.

@Slashdot: decrease the stupid delay for anonymous posting. This is a worthwhile comment and I'm not going to wait half an hour to post it.

Re:DNS (0)

Anonymous Coward | more than 7 years ago | (#16640701)


@Slashdot: decrease the stupid delay for anonymous posting. This is a worthwhile comment and I'm not going to wait half an hour to post it.

Slashdot: encouraging the anonymous to stay anonymous.

It's talking about 'perceived performance' (-1, Troll)

magnumquest (894849) | more than 7 years ago | (#16640231)

We don't really need to make web browsing faster than it already is. Studies like this are just a mockery of human patience. I understand the need for speed when I'm downloading pirated videos or 'cracked' software, but I do not understand spending resources trying to make web pages load even faster. I do have that much patience. If this goes on any further, I can imagine a future where people work hard to get pages delivered directly to their brains so they don't even have to spend time 'looking' at them.

Re:It's talking about 'percieved performance' (1)

mdarksbane (587589) | more than 7 years ago | (#16640387)

Since every fricking developer seems to think that the web is the be-all and end-all future solution for everything, then yes, it does matter.

When I click on an element in a web page to manage my email or use a word processor, the response time is going to be around my ping (30-90 ms depending on where in the country the server is) plus the time to load. That is long enough that I am clicking, and waiting. If I were working in a local native app, the response time would be under 30 ms and I would probably not even notice it.

For a quick email check or reading web pages, it doesn't really matter too much. But if you're trying to use that for constant daily productivity sorts of things (or even have a lot of email to go through), it wastes a ton of your time. There are some real advantages to moving applications online and into a web browser (I've even heard people suggest we should move to a web browser for the full interface of our windowing system), but speed is currently NOT one of them. Since it seems like it's going to be more or less forced on me, anything that can make it faster and more tolerable is quite appreciated.

Connection Limits (2, Interesting)

RAMMS+EIN (578166) | more than 7 years ago | (#16640253)

``By default, IE allows only two outstanding connections per hostname when talking to HTTP/1.1 servers or eight-ish outstanding connections total. Firefox has similar limits.''

Anybody know why? This seems pretty dumb to me. Request a page with several linked objects (images, stylesheets, scripts, ...) in it (i.e., most web pages), and lots of these objects are going to be requested sequentially, costing you lots of round trip times.

Re:Connection Limits (0)

Anonymous Coward | more than 7 years ago | (#16640315)

On a loaded server where each client has n+ connections, you can run up against concurrency limits quickly. Some servers handle concurrency better than others. [hyber.org]

Re:Connection Limits (2, Informative)

MathFox (686808) | more than 7 years ago | (#16640355)

The "max two connections per webserver" limit is to keep resource usage in the webserver down; a single apache thread can use 16 or 32 Mbyte of RAM for dynamicly generated webpages. If you get 5 page requests a second and it takes (on average) 10 seconds to handle the request and send back the results you need 1 Gb RAM in the webserver, if you can ignore Slashdot. (2-4 Gb to handle peaks)

If you have a second webserver for all static data, that can be a simpeler http deamon with 1 Mb/connection or less. You can handle more parallel connactions (and Akamai the setup if needed!)

Yes, it's best to avoid inline images, Google text ad objects, etc. But allowing parallel loading of the objects (and that's the trick with using several separate hosts for images) you can take 8 or 16 roundtrips at the same time; here is your perceived speedup.

Re:Connection Limits (1)

RAMMS+EIN (578166) | more than 7 years ago | (#16640541)

``The "max two connections per webserver" limit is to keep resource usage in the webserver down''

I understand that, but why write it into the standard? Couldn't servers be made to handle this? If you don't have the resources right now, just hold off on retrieving/handling the request for a while. If you can handle the load, you will be able to service clients quicker. Now, even if the server can handle the load, the clients will slow themselves down.

``If you get 5 page requests a second and it takes (on average) 10 seconds to handle the request and send back the results''

10 seconds to process a request is a very long time. If it takes that long, a few extra round trip times don't matter much.

Re:Connection Limits (1)

MathFox (686808) | more than 7 years ago | (#16640745)

If you don't have the resources right now, just hold off on retrieving/handling the request for a while.
And make yourself extra vulnerable to DoS attacks... I know that it is hard to find the right balance of priorities when your site is slashdotted; been there :-(.
10 seconds to process a request is a very long time. If it takes that long, a few extra round trip times don't matter much.
Generating a megabyte of HTML is easily done within a second; few users have a fast enough connection to receive it within a second. Luckily most browsers start rendering before they have received the whole page and all the graphics, which means that users see sub-second response. But you are still holding on to those server-side resources... and a swapping webserver is no fun!

Re:Connection Limits (1)

Raphael (18701) | more than 7 years ago | (#16641093)

This is not only about keeping resource usage on the server down, but also about improving the overall performance of the whole network by avoiding congestion and packet loss. Note that the "whole network" includes not only the last mile (the cable or DSL link between your home or office and your ISP), but also all the routers at your ISP, in the backbone, etc.

Here is the general idea: if all clients use only one or two TCP connections and they use HTTP pipelining, then the traffic on these connections will be less bursty and TCP will be able to find the optimal transfer rate easily. As a result, the risk of congestion in the routers is decreased and the overall performance of the network improves. The assumption of the HTTP/1.1 working group was that most of the traffic on the Internet would be TCP-friendly (using similar congestion-avoidance algorithms). Although this is not really true for many peer-to-peer applications, many ISPs use traffic shaping and lower the priority of aggressive P2P protocols compared to more "network-friendly" TCP-based applications.

Re:Connection Limits (1)

Malc (1751) | more than 7 years ago | (#16641029)

It's per the HTTP spec.

Requests Too Large (2, Interesting)

RAMMS+EIN (578166) | more than 7 years ago | (#16640273)

FTFA:

``Most DSL or cable Internet connections have asymmetric bandwidth, at rates like 1.5Mbit down/128Kbit up, 6Mbit down/512Kbit up, etc. Ratios of download to upload bandwidth are commonly in the 5:1 to 20:1 range. This means that for your users, a request takes the same amount of time to send as it takes to receive an object of 5 to 20 times the request size. Requests are commonly around 500 bytes, so this should significantly impact objects that are smaller than maybe 2.5k to 10k. This means that serving small objects might mean the page load is bottlenecked on the users' upload bandwidth, as strange as that may sound.''

I've said for years that HTTP requests are larger than they should be. It's good to hear it confirmed by someone who's taken seriously. This is even more of an issue when doing things like AJAX, where you send HTTP requests and receive HTTP responses + XML verbosity for what should be small and quick user interface actions.
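To make the numbers concrete, here is a sketch of a typical request for a single small image (the headers and cookie values are invented, but the overall size is representative): the request line plus headers come to roughly 400-500 bytes, and every additional object on the page repeats most of them. On a 128 kbit/s uplink (about 16 KB/s), each such request costs on the order of 30 ms of pure upload time before the server has even seen it.

    GET /img/icon-rss.png HTTP/1.1
    Host: www.example.com
    User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1) Gecko/20061010 Firefox/2.0
    Accept: image/png,*/*;q=0.5
    Accept-Language: en-us,en;q=0.5
    Accept-Encoding: gzip,deflate
    Connection: keep-alive
    Referer: http://www.example.com/index.html
    Cookie: session=abc123; prefs=compact; lastvisit=1162200000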

Re:Requests Too Large (0)

Anonymous Coward | more than 7 years ago | (#16640509)

Browser vendors are aware of the problem; they don't add HTTP headers without a good reason. Common sense may be in low supply, but there seems to be just enough of it to stop web devs from setting 20 4 KB cookies. The issue with AJAX (which rarely uses XML, BTW [json.org]) is that it's a dirty hack on top of HTTP. When all you have is a hammer...

ooh sub domain spam (1)

hauntingthunder (985246) | more than 7 years ago | (#16640277)

Isn't using 4 subdomains going to confuse the Google algorithm (abuse of subdomains has been addressed in a few updates), let alone create duplicate content - or is it just for images and objects?

Latency (1)

RAMMS+EIN (578166) | more than 7 years ago | (#16640325)

Latency matters a lot. My connection does up to 512 KB/s downstream, meaning a 10 KB object would take about 0.02 seconds to receive. However, before I start receiving the bytes, my request has to travel to the server, and the response has to travel back to me. When the site is in the US (I'm in Europe) the round-trip time to the server can easily be 100 to 200 ms. If the TCP connection is already open, this time gets added once. However, if the connection still has to be established, another round-trip time is added on top. So, because of latency, that 0.02 seconds becomes something like 0.12 to 0.42 seconds instead.

If you read the article, you will see that the default behavior for Firefox and MSIE is to use only up to two connections per hostname (resulting in many objects being received sequentially - add one round trip time for each), and that they don't use HTTP pipelining, meaning a new connection is set up for each object (add one round trip time for each).

In other words: it's the latency, stupid.

Re:Latency (0)

Anonymous Coward | more than 7 years ago | (#16640363)

That may be true, however the limit is in the HTTP/1.1 spec for a reason. Think about what happens when you connect to a typical forking httpd. Let's suppose a server is being slashdotted and instead of 2 connections, each client makes 8 concurrent connections. How long is that server going to last?

Re:Latency (1)

RAMMS+EIN (578166) | more than 7 years ago | (#16640475)

``That may be true, however the limit is in the HTTP/1.1 spec for a reason.''

Yeah, namely that there are so many sucky web servers and operating systems out there. So what do we do? We protect these poor little things by writing their limitations into the standard. I mean, a web server worth its salt is able to cope with many concurrent connections, be they from one client or from many clients.

Re:Latency (0)

Anonymous Coward | more than 7 years ago | (#16640671)

You're forgetting that the problems we face today aren't with static content. So for performance you would want to disable keep-alive on Apache for the dynamic content and handle static content using lighttpd, mathopd etc...

You need multiple dedicated web servers for this and most businesses can't justify the expense. Don't forget that a single shared host can handle a couple of hundred low traffic dynamic sites. Do the math.

Re:Latency (1)

RAMMS+EIN (578166) | more than 7 years ago | (#16640447)

``If you read the article, you will see that the default behavior for Firefox and MSIE is to use only up to two connections per hostname (resulting in many objects being received sequentially - add one round trip time for each), and that they don't use HTTP pipelining, meaning a new connection is set up for each object (add one round trip time for each).''

Whoops. I somehow got confused into thinking that pipelining == keep-alive (despite clicking on the provided link). HTTP pipelining means that multiple requests can be sent over the same (keep-alive) connection without waiting for a response first. This eliminates both sources of latency: no round trip time between requests on the same connection, and no need to establish new connections for new requests.

Re:Latency (1)

fruey (563914) | more than 7 years ago | (#16640993)

_every_ packet you receive has to be ACKed, and so latency can affect your download speed no matter how long your connection stays open.

Re:Latency (1)

RAMMS+EIN (578166) | more than 7 years ago | (#16641121)

``_every_ packet you receive has to be ACKed, and so latency can affect your download speed no matter how long your connection stays open.''

Larger sliding windows for TCP can significantly reduce that problem.
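Back of the envelope, using the figures from the earlier comment in this thread (512 KB/s and a 100-200 ms RTT): to keep the link busy the window has to cover the bandwidth-delay product,

    window needed ≈ bandwidth × RTT = 512 KB/s × 0.15 s ≈ 77 KB

which is already beyond the classic 64 KB TCP window unless window scaling is negotiated.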

Re:Latency (1)

fruey (563914) | more than 7 years ago | (#16641199)

OK, but the article is mostly about what you can do as a content provider, rather than tweaks based on changing clients...

Gmail (2, Insightful)

protomala (551662) | more than 7 years ago | (#16640377)

I hope they apply this study to Gmail. Using it on a non-broadband connection (a plain 56k modem) is a pain unless you use the pure HTML view, which is crap compared to other HTML webmails.
The funny thing is that newer AJAX products from Google (like goffice) don't suffer from this behavior; they have much cleaner code (just pick "view source" in your favorite browser and see). Probably Gmail's HTML/JavaScript is already showing its age, and paying the price for being the first of Google's AJAX apps.

Re:Gmail (1)

Peeteriz (821290) | more than 7 years ago | (#16640465)

Yes, I have this problem too, and it's very painful: my upload speed at home is severely limited, and Gmail occasionally gets timeout messages. If something like BitTorrent is running, it's impossible to send anything with attachments - even a small 50 KB document.

Re:Gmail (1)

badfish99 (826052) | more than 7 years ago | (#16640821)

YMMV, but I find that throttling Bittorrent to 90% of its maximum upload speed makes the difference between "internet connection almost unusable" and "internet connection working almost normally".

Is this really news? (1)

iritant (156271) | more than 7 years ago | (#16640473)

There is a paper about this in SIGCOMM 1997 (!) by Nielsen, Gettys, et al. that goes into far more detail on the whys and wherefores. I'm not sure this shows ANYTHING new. In fact, what this gentleman demonstrates is the way TCP windows work: by spreading requests over four hosts you are in effect getting four times the window size, arguably more than your fair share. Without looking at the aggregate impact, one cannot really judge what's going on.

Also, the reason pipelining is turned off by default in many browsers is that there are a lot of middleboxes that can't handle it.

Page load time is still important (1)

Dekortage (697532) | more than 7 years ago | (#16640479)

There are a lot of posts here asking "why is this important" and saying that pages already load fast enough on their broadband Internet connections. That may be true for you, but I'm frequently in the position of designing a site that needs to load over a slow satellite connection in rural Africa, say, or in a remote village in Nepal. They have a fairly recent computer, OS and browser on the receiving end, but their Internet connection is dog slow; anything I can do to speed it up will be greatly appreciated. It's back to 1980s dial-up speeds.

This isn't everyone's problem, I admit, but it's an issue for a lot of people in the world.

"Also, by spreading static content..." (1)

mumblestheclown (569987) | more than 7 years ago | (#16640513)

"Also, by spreading static content across four different hostnames, site operators can achieve dramatic improvements in perceived performance."

How ironic that a Google engineer would say this, since doing it will also pretty well kill your Google PageRank. Google is great, yes, but among its many, many problems are the ridiculous ways it forces people to do web design if they want a decent PageRank. Another is how it "helpfully" directs you to "geographically relevant" searches - meaning that, for example, if you want a hotel room in Egypt and browse from the UK, you get all the links from (much more expensive) UK-based hotel and travel shops rather than, say, ones in Egypt or elsewhere that, while also in English, are much cheaper.


4 hostnames and security (1)

suv4x4 (956391) | more than 7 years ago | (#16640613)

Nice trick with the 4 hostnames, but it means 4 security contexts for your content, which may make a lot of development hard (especially client-side development with JavaScript).
Not to mention the management issue of having to link to content on 4 different domains in an efficient enough manner.

This leaves us with pipelining on the client, which could result in much worse load peaks on the servers, though.

In the end: let the page load a little slower; the workarounds are not worth it.

render time (1)

Spliffster (755587) | more than 7 years ago | (#16640695)

This is an interesting summary of how the network affects display time. I would like to add that nowadays, with more complex pages, the rendering time also has to be taken into account. For example, with the use of bloated/redundant/unneeded CSS declarations, Java and Flash applets, the page load (until it is actually displayed to the user) can be slowed down pretty badly. If you don't believe me, go and check out myspace.com :P

Cheers,
-S

Optimization of slashdot? (1)

karolgajewski (515082) | more than 7 years ago | (#16641005)

OK, now how will this help the loading of Slashdot? Slashdot has this terrible habit of occasionally slowing the browser to a halt when opening discussions with a lot of posts to parse (I'm using the new discussion system).

The problem, as I see it, is that issues like page load times are partly caused by browser issues (HTTP pipelining, cache, etc.) and partly caused by server issues (yes, yes, I know it's obvious). However, consider the idea of specialized configurations - essentially a per-site set of conditions. For slashdot.org, allow multiple HTTP connections (you have to load that style file) and just load the images from the old cache (after all, the Microsoft Borg icon hasn't changed, has it?).

To a certain extent, this could be handled almost in a cookie-like fashion, except it's read before the initial HTTP request is made. You'd know that you're only requesting parts of the page, and could do a background query for elements which have been updated (i.e. a new category image, etc.).

Then again, I also hate it when loading a PDF causes a loss of focus and slows down the browser. Not the same thing, but in the same category of annoyance.