When RSS Traffic Looks Like a DDoS

CmdrTaco posted more than 9 years ago | from the which-is-most-of-the-time dept.

The Internet 443

An anonymous reader writes "Infoworld's CTO Chad Dickerson says he has a love/hate relationship with RSS. He loves the changes to his information production and consumption, but he hates the behavior of some RSS feed readers. Every hour, Infoworld "sees a massive surge of RSS newsreader activity" that "has all the characteristics of a distributed DoS attack." So many requests in such a short period of time are creating scaling issues." We've seen similar problems over the years. RSS (or as it should be called, "Speedfeed") is such a useful thing, it's unfortunate that it's ultimately just very stupid.

443 comments

RSS maybe (3, Funny)

Anonymous Coward | more than 9 years ago | (#9752005)

RSS may be ultimately stupid but you didn't get first post, did you! Rookie!

Re:RSS maybe (1, Insightful)

Anonymous Coward | more than 9 years ago | (#9752120)

First post for once finally used in the correct context of a story and it's modded offtopic, damn. Thought I had a winner.

fp? (-1, Offtopic)

Anonymous Coward | more than 9 years ago | (#9752006)

fp?

Yesterday (3, Interesting)

ravan_a (222804) | more than 9 years ago | (#9752009)

Does this have anything to do with /. problems yesterday

Re:Yesterday (0)

Anonymous Coward | more than 9 years ago | (#9752125)

Yesterday? What about today too :)

Re:Yesterday (0)

Anonymous Coward | more than 9 years ago | (#9752330)

I have been having all kinds of problems for the last maybe 4 days.

Re:Yesterday (2, Interesting)

afidel (530433) | more than 9 years ago | (#9752300)

Oh how prophetic, I went to check the first reply to your post and slashdot again did the white page thing (top and left borders with a white page and no right border). Earlier today (around noon EST) I was getting nothing but 503's. This new code has not been good to Slashdot.

Are they sure (0, Troll)

foidulus (743482) | more than 9 years ago | (#9752011)

they aren't just being /.ed?

Re:Are they sure (1)

Saeed al-Sahaf (665390) | more than 9 years ago | (#9752316)

Are they sure they aren't just being /.ed?

Yes, of course! That's it! They should have known better than to run Infoworld off a 286 and a DSL conx in some guy's basement.

w00 (-1, Offtopic)

Anonymous Coward | more than 9 years ago | (#9752013)

yay

hey dudez (-1, Offtopic)

Anonymous Coward | more than 9 years ago | (#9752015)

whatz up?

Can't this be throttled? (2, Interesting)

xplosiv (129880) | more than 9 years ago | (#9752019)

Can't one just write a small php script or something which returns an error (i.e. 500), less data to send back, and hopefully the reader would just try again later.

Re:Can't this be throttled? (3, Insightful)

jcain (765708) | more than 9 years ago | (#9752089)

That kind of eliminates the point of having the RSS at all, as the user no longer gets up-to-the-minute information.

Also, I doubt that the major problem here is bandwidth, more the number of requests the server has to deal with. RSS feeds are quite small (just text most of the time). The server would still have to run that PHP script you suggest.

Re:Can't this be throttled? (4, Insightful)

mgoodman (250332) | more than 9 years ago | (#9752103)

Then their RSS client would barf on the input and the user wouldn't see any of the previously downloaded news feeds, in some cases.

Or rather, anyone that programs an RSS reader so horribly as to make it so that every client downloads information every hour on the hour would probably also barf on the input of a 500 or 404 error.

Most RSS feeders *should* just download every hour from the time they start, making the download intervals between users more or less random and well-dispersed. And if you want it more than every hour, well then edit the source and compile it yourself :P

Re:Can't this be throttled? (4, Insightful)

ameoba (173803) | more than 9 years ago | (#9752369)

It seems kinda stupid to have the clients basing their updates on clock time. Doing an update on client startup and then every 60min after that would be just as easy as doing it on the clock time & would basically eliminate the whole DDOSesque thing.

Simple HTTP Solution (3, Informative)

inertia@yahoo.com (156602) | more than 9 years ago | (#9752020)

The readers should HEAD to see if the last modified changed... And the feed rendering engines should make sure their last modified is accurate.

Re:Simple HTTP Solution (5, Insightful)

skraps (650379) | more than 9 years ago | (#9752181)

This "optimization" will not have any long-lasting benefits. There are at least three variables in this equation:

  1. Number of users
  2. Number of RSS feeds
  3. Size of each request

This optimization only addresses #3, which is the least likely to grow as time goes on.

Re:Simple HTTP Solution (3, Informative)

ry4an (1568) | more than 9 years ago | (#9752231)

Better than that, they should use the HTTP/1.1 (RFC 2616) If-Modified-Since: header in their GETs, as specified in section 14.25. That way, if it has changed, they don't have to do a subsequent GET.

Someone did a nice write-up about doing so [pastiche.org] back in 2002.
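A minimal sketch of the conditional GET described above, assuming a reader written against Python's standard urllib. The function names and the idea of storing the previous Last-Modified value are illustrative assumptions, not taken from any particular RSS reader.

```python
import urllib.error
import urllib.request


def build_conditional_request(url, last_modified=None):
    """Build a GET that lets the server answer 304 if nothing changed."""
    req = urllib.request.Request(url)
    if last_modified:
        # Value copied verbatim from the Last-Modified header of the
        # previous successful response.
        req.add_header("If-Modified-Since", last_modified)
    return req


def fetch_if_modified(url, last_modified=None):
    """Return (body, last_modified); body is None when the server says 304."""
    req = build_conditional_request(url, last_modified)
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            return resp.read(), resp.headers.get("Last-Modified")
    except urllib.error.HTTPError as err:
        if err.code == 304:  # Not Modified: keep using the cached copy
            return None, last_modified
        raise
```

A reader would keep the returned Last-Modified string next to its cached copy of the feed and pass it back on the next poll. This only helps if the server sends Last-Modified in the first place.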

Re:Simple HTTP Solution (3, Informative)

johnbeat (685167) | more than 9 years ago | (#9752320)

So, he's writing from infoworld and complaining that RSS feed readers grab feeds whether the data has changed or not. So, I went to look for infoworld's RSS feeds. Found them at:

http://www.infoworld.com/rss/rss_info.html

Trying the top news feed, got back:

date -u ; curl --head http://www.infoworld.com/rss/news.xml
Tue Jul 20 19:51:44 GMT 2004
HTTP/1.1 200 OK
Date: Tue, 20 Jul 2004 19:48:30 GMT
Server: Apache
Accept-Ranges: bytes
Content-Length: 7520
Content-Type: text/html; charset=UTF-8

How do I write an RSS reader that only downloads this feed if the data has changed?

Jerry
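The response above carries no Last-Modified header, so a client has nothing to send back in If-Modified-Since. A rough sketch of what the server side would need to do, assuming it knows when the feed last changed; the function name and the (status, headers) return shape are made up for illustration and say nothing about how infoworld.com is actually built.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond_to_feed_request(feed_mtime, if_modified_since=None):
    """Return (status, headers) for a feed last changed at feed_mtime.

    feed_mtime must be a timezone-aware datetime in UTC.
    """
    # HTTP dates have one-second resolution, so drop microseconds first.
    feed_mtime = feed_mtime.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(feed_mtime, usegmt=True)}
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable header: just send the full feed
        if since is not None and since.tzinfo is not None and feed_mtime <= since:
            return 304, headers  # client copy is current, send no body
    return 200, headers
```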

Still haven't tried these newfangled RSS readers.. (2, Interesting)

Rezonant (775417) | more than 9 years ago | (#9752026)

...so could someone recommend a couple of really good ones for Windows and *nix?

Re:Still haven't tried these newfangled RSS reader (0)

Anonymous Coward | more than 9 years ago | (#9752144)

Off topic indeed, this clearly should have been an Ask Slashdot.

Re:Still haven't tried these newfangled RSS reader (4, Informative)

Dr. Sp0ng (24354) | more than 9 years ago | (#9752242)

On Windows I use RSS Bandit [rssbandit.org]. Haven't found a non-sucky one for *nix, although I haven't looked all that hard. On OS X I use NetNewsWire [ranchero.com], which while not great, does the job.

Re:Still haven't tried these newfangled RSS reader (2, Informative)

Eslyjah (245320) | more than 9 years ago | (#9752307)

If you're using NetNewsWire on OS X, try the Atom Beta [ranchero.com], which, I'm sure it will come as no shock to you, adds support for Atom feeds.

Re:Still haven't tried these newfangled RSS reader (0)

Neil Blender (555885) | more than 9 years ago | (#9752248)

Still haven't tried these newfangled RSS readers.. (Score:3, Informative)
by Rezonant (775417) on 2004-07-20 12:35 (#9752026) ...so could someone recommend a couple of really good ones for Windows and *nix?


Ok. How is a question informative? Or is the fact that Rezonant has never used an RSS reader informative? Here's a +5 Informative for you: I haven't used an RSS reader either.

Re:Still haven't tried these newfangled RSS reader (0, Funny)

Anonymous Coward | more than 9 years ago | (#9752335)

It's "informative by association." His post will attract answers. (Well, answers and people who bitch about moderation.)

Re:Still haven't tried these newfangled RSS reader (1)

coolguy81 (322371) | more than 9 years ago | (#9752292)

RSS Bandit [rssbandit.org] (Windows)
Syndigator [sourceforge.net] (X)

There is also an RSS Thunderbird extension, Formzilla [mozillazine.org], but you have to be using a version of Thunderbird built with the xmlextras extension... it is all described in the post.

Re:Still haven't tried these newfangled RSS reader (1)

harley_frog (650488) | more than 9 years ago | (#9752363)

I use wTicker [wticker.org] for my Windows computer and KNewsTicker [uni-oldenburg.de] for my Linux boxes. The latest version of wTicker won't run on my XP computer, but an older version does. It's still in beta and a little clunky, but the crawler takes up far less screen space than any other RSS reader I've tried.

way to propose a solution (-1, Troll)

Anonymous Coward | more than 9 years ago | (#9752029)

or anything really, aside from whining about things

kinda like the slashdot effect? (-1, Redundant)

Anonymous Coward | more than 9 years ago | (#9752030)

hrm...

Oh really? (0)

Anonymous Coward | more than 9 years ago | (#9752046)

Every hour, random sites "see a massive surge of /.'s news reader activity" that "has all the characteristics of a distributed DoS attack."

Slashdot (or as it should be called, "Sitefsck") is such a useful thing, it's unfortunate that it's ultimately just very stupid.

Over the years? How about over the weekend? (5, Informative)

Marxist Hacker 42 (638312) | more than 9 years ago | (#9752048)

We've seen similar problems over the years. RSS (or as it should be called, "Speedfeed") is such a useful thing, it's unfortunate that it's ultimately just very stupid.

And it seems to have gotten worse since the new code was installed- I get 503 errors at the top of every hour now on slashdot.

What about a scheduler? (4, Interesting)

el-spectre (668104) | more than 9 years ago | (#9752051)

Since many clients request the new data every 30 minutes or so... how about a simple system that spreads out the load? A page that, based on some criteria (domain name, IP, random seed, round robin) gives each client a time it should check for updates (i.e. 17 past the hour).

Of course, this depends on the client to respect the request, but we already have systems that do (robots.txt), and they seem to work fairly well, most of the time.

Re:What about a scheduler? (1)

el-spectre (668104) | more than 9 years ago | (#9752112)

Just thinking about it... all you really need is a script that has a cycling counter from 0-59, and responds to a GET. Take about 2 minutes to write in the language of your choice.
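The cycling counter could look something like this. It is a sketch only: the class name is hypothetical, and wiring it to an HTTP GET handler is left out.

```python
import itertools
import threading


class MinuteSlotAssigner:
    """Hand out minute-of-the-hour slots 0..59 in round-robin order."""

    def __init__(self, slots=60):
        self._cycle = itertools.cycle(range(slots))
        self._lock = threading.Lock()  # requests may arrive on several threads

    def next_slot(self):
        """Return the minute past the hour the next client should poll at."""
        with self._lock:
            return next(self._cycle)
```

Each client asks once, gets a minute such as 17, and polls at 17 past the hour from then on. As the post says, this depends on clients respecting the answer.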

Re:What about a scheduler? (5, Funny)

cmdr_beeftaco (562067) | more than 9 years ago | (#9752260)

Bad idea. Everyone knows that most headlines are made at the top of the hour. Thus, A.M. radio always give news headlines "at-the-top-of-hour." RSS reader should be given the same timely updates.
Related to this is the fact that most traffic accidents happen "on the twenties." Human nature is a curious and seemingly very predictable thing.

Re:What about a scheduler? (1)

Retric (704075) | more than 9 years ago | (#9752353)

Just check every hour after you log on / start it up. Sure, there would be a minor bias toward people logging on on the hour, as many people get to work at 7:50 +/- 5 min or so, but it's not all that bad, and if you want to cover this one just add +/- 2 min to the interval, i.e. make it 58 - 62 min, and people will spread out quickly.

This really has a lot more to do with 100,000 people checking the site within 30 seconds of each other than anything else.
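The 58 to 62 minute idea fits in a few lines. A sketch, with a hypothetical function name and the numbers from the post as defaults:

```python
import random


def next_poll_delay(base_minutes=60.0, jitter_minutes=2.0, rng=random):
    """Seconds to wait before the next poll: base interval +/- up to jitter."""
    offset = rng.uniform(-jitter_minutes, jitter_minutes)
    return (base_minutes + offset) * 60.0
```

Because a fresh offset is drawn for every poll, clients that happened to start together drift apart over successive polls instead of staying in lockstep.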

They get hit every hour? (1, Interesting)

Anonymous Coward | more than 9 years ago | (#9752054)

Why not make it standard that the starting time is chosen randomly or assigned by the remote site? "Forty-three minutes after the hour is pretty empty, from now on you can check the news at that time" or something similar.

RSS as DDoS (-1, Offtopic)

CheeseburgerBlue (553720) | more than 9 years ago | (#9752057)

That's exactly what I told the FBI, but they ate my baby anyhow.

FBI: "You're a Terrorist."

Me: "No, I just have a smutty blog."

FBI: "Is that Arabic?"

Me: "I'm sorry?"

FBI: "Take him away. We're blowing up Alderan, baby."

RSS needs better TCP stacks (3, Interesting)

Russ Nelson (33911) | more than 9 years ago | (#9752058)

RSS just needs better TCP stacks. Here's how it would work: when your RSS client connects to an RSS server, it would simply leave the connection open until the next time the RSS data got updated. Then you would receive a copy of the RSS content. You simply *couldn't* fetch data that hadn't been updated.

The reason this needs better TCP stacks is because every open connection is stored in kernel memory. That's not necessary. Once you have the connecting ip, port, and sequence number, those should go into a database, to be pulled out later when the content has been updated.
-russ

Re:RSS needs better TCP stacks (1)

Russ Nelson (33911) | more than 9 years ago | (#9752127)

Several people have pointed out that you want a schedule. No, you don't. That just foments the stampeding herd of clients. You really want to allow people to connect whenever they want, and then receive data only when you're ready and able to send it back.

Basically, you use the TCP connection as a subscription. Call it "repeated confirmation of opt-in" if you want. Every time the user re-connects to get the next update (which they will probably do immediately; may as well) that's an indication that they want another copy. Everybody gets updates as soon as possible, just like email, only it's not possible to force data on everybody, as we've seen happen with email.

Re:RSS needs better TCP stacks (3, Insightful)

EnderWiggnz (39214) | more than 9 years ago | (#9752328)

not needing user intervention is the effing POINT of rss.

its like saying - "java is great, except lets make it compiled, and platform specific"

Re:RSS needs better TCP stacks (1)

ganhawk (703420) | more than 9 years ago | (#9752148)

I assume It will be DDOS'ed (intentionally or even otherwise due to internet hotspots).

Especially since scaling is a problem. Imagine millions of active connections with a database storing the state of each connection. You need far more resources for that than the current system.

Re:RSS needs better TCP stacks (3, Funny)

genixia (220387) | more than 9 years ago | (#9752182)

Yeah, because there's nothing like using a sledgehammer to crack a hazelnut.

For starters, how about the readers play nice and spread their updates around a bit instead of all clamoring at the same time.

Re:RSS needs better TCP stacks (1)

mgoodman (250332) | more than 9 years ago | (#9752225)

I'm not sure the server could handle having that many open connections...hence its current process of providing an extremely small text file, creating a connection, transferring the file, and destroying the connection.

Correct me if I'm wrong.

Re:RSS needs better TCP stacks (5, Insightful)

Salamander (33735) | more than 9 years ago | (#9752372)

Leaving thousands upon thousands of connections open on the server is a terrible idea no matter how well-implemented the TCP stack is. The real solution is to use some sort of distributed mirroring facility so everyone could connect to a nearby copy of the feed and spread the load. The even better solution would be to distribute asynchronous update notifications as well as data, because polling always sucks. Each client would then get a message saying "xxx has updated, please fetch a copy from your nearest mirror" only when the content changes, providing darn near optimal network efficiency.

Wrong target, but good solution (1)

oGMo (379) | more than 9 years ago | (#9752381)

Reimplementing TCP using a database is excessive. Making a light connectionless protocol that does similar to what you described would be a lot simpler and not require reimplementing everyone's TCP stack.

Also, as much as I hate the fad of labelling everything P2P, having a P2P-ish network for this would help, too. The original server can just hand out MD5's, and clients propagate the actual text throughout the network.
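The "server hands out MD5s" part might be checked on the client like this. A sketch under the assumption that the digest is published as a hex string; the function name is made up.

```python
import hashlib


def matches_published_digest(feed_bytes, published_md5_hex):
    """True if a copy fetched from a peer matches the digest the origin published."""
    return hashlib.md5(feed_bytes).hexdigest() == published_md5_hex.strip().lower()
```

MD5 is used here only because the post names it. It is no longer considered collision-resistant, so a present-day version would likely pick a stronger hash.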

Of course (and this relates to the P2P stuff), every newfangled toy these days is just a pathetic reimplementation of some original Internet protocol. Like, say, NNTP. Which does all of this already, and has for years. Ah well.

Scheduling (1)

mbbac (568880) | more than 9 years ago | (#9752059)

RSS readers and aggregators shouldn't gather new feeds every hour on the hour. They should gather them when the application is first run and then every hour after that (probably not on the hour). I'd hope most GUI applications already run this way. I guess most of this traffic just comes from daemon processes -- and that should be changed.

Re:Scheduling (1)

stratjakt (596332) | more than 9 years ago | (#9752195)

No doubt its cron jobs and the like.

Does any flavor of cron have a "randomizing" function? Like, for instance, tell it "every hour on the hour, give or take 30 minutes"?

So it might look at 1:11, 2:25, 2:51, etc...

Revision of the Standard (1)

novalogic (697144) | more than 9 years ago | (#9752064)

RSS is in fact living up to what it was made for; however, it's getting used like a Chevy S-10 pulling a semi trailer.

PHP just had a major overhaul, no reason why RSS2 shouldn't be on the drawing board. This time, though, more thought should be given to the scale of its use.

Re:Revision of the Standard (2, Interesting)

cmdr_beeftaco (562067) | more than 9 years ago | (#9752190)

And there is a one word solution, peer to peer. The whole torrent concept is what is needed.

deluge of traffic... (0)

Anonymous Coward | more than 9 years ago | (#9752065)

if the deluge of traffic that RSS causes makes RSS "stupid," posting an article about the deluge of traffic RSS is causing on Slashdot is, at the very least, "ironic."

Idea (4, Interesting)

iamdrscience (541136) | more than 9 years ago | (#9752073)

Well maybe somebody should set something up to syndicate RSS feeds via a peer to peer service. BitTorrent would work, but it could be improved upon (people would still be grabbing a torrent every hour, so it wouldn't completely solve the problem).

Re:Idea (5, Interesting)

ganhawk (703420) | more than 9 years ago | (#9752296)

You could have a system based on JXTA. Instead of the BitTorrent model, it would be something like P2P radio. When the user asks for the feed, a neighbour who just received it can give it to the user (overlay network, JXTA based), or the server can point to one of the users who just received it. (This is similar to BitTorrent, but the user gets the whole file from a peer instead of parts. The user also does not come back to the server at all if the transfer is successful. The problem is that this user need not serve others and can just leech.)

I feel an overlay network scheme would work better than a BitTorrent/tracker based system. In the overlay network scheme each group of the network will have its own ultra peer (JXTA rendezvous) which acts as tracker for all files in that network. I wanted to do this for the slashdot effect (p2pbridge.sf.net) but somehow the project has been delayed for long.

Google News (1)

Dominatus (796241) | more than 9 years ago | (#9752081)

I used to have an RSS feed for google news that I loved and used all the time, but it was taken down due to this effect. It's a shame that these things can't be handled better. (the RSS feed may be back up, I haven't checked in months)

One hour interval? (2, Insightful)

anynameleft (787817) | more than 9 years ago | (#9752082)

Why have developers made their RSS readers so that they query the master site at each hour sharp? Why haven't they done it like Opera or Konqueror, e.g. query the server every sixty minutes after the application has been started?

Or did the RSS reader authors hope that their applications wouldn't be used by anybody except for a few geeks?

Won't help (1, Interesting)

Animats (122034) | more than 9 years ago | (#9752202)

Doesn't matter. If lots of people poll every hour, eventually the polls will synch up. There's a neat phase-locking effect. Van Jacobson analyzed this over twenty years ago.

We have way too much traffic from dumb P2P schemes today, considering the relatively small volume of new content being distributed.

Re:One hour interval? (1)

r00zky (622648) | more than 9 years ago | (#9752232)

At first I didn't understand the article...
You mean RSS readers are programmed to fetch the feed at hour xx:00??

That's fantastic

Some programmers should be shot...

Re:One hour interval? (0)

Anonymous Coward | more than 9 years ago | (#9752272)

Because, unfortunately, on the Macintosh, on the hour is the only option available (from Safari!)

Re:One hour interval? (1)

AndroidCat (229562) | more than 9 years ago | (#9752366)

I believe that's what SharpReader does. One thing I personally do is adjust the refresh rate for each feed from the one hour default. There's no point in banging on a feed every hour when it changes a few times a week.

One good idea would be for the protocols to allow each feed to suggest a default refresh rate. That way slow changing or overloaded sites could ask readers to slow down a little. A minimum refresh warning rate would be good too. (i.e. Refreshing faster than that rate might get you nuked.) I know that some things are already in the protocols, but a better set of Netiquette for Blogreaders would be a good idea.
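One of the things already in the format is the RSS 2.0 channel-level <ttl> element, a number of minutes the feed may be cached before refreshing. A sketch of a reader honouring it; the function name, the 60-minute default, and the 15-minute floor are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET


def suggested_refresh_minutes(rss_xml, default=60, minimum=15):
    """Return the feed's <ttl> in minutes, or default when absent or invalid.

    The result is never below `minimum`, so a feed cannot ask for
    faster polling than the reader is willing to do.
    """
    root = ET.fromstring(rss_xml)
    ttl_text = root.findtext("channel/ttl")
    try:
        ttl = int(ttl_text.strip())
    except (AttributeError, ValueError):  # element missing or not a number
        ttl = default
    return max(ttl, minimum)
```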

"it's the connection overhead, stupid" (4, Informative)

SuperBanana (662181) | more than 9 years ago | (#9752090)

...is what one would say to the designers of RSS.

Mainly, IF your client is smart enough to communicate that it only needs part of the page, guess what? The pages are tiny anyway, especially after gzip compression (which, including with mod_gzip, can be done ahead of time)... the real overhead is all the nonsense, both on a protocol level and for the server in terms of CPU time, of opening+closing a TCP connection.

It's also the fault of the designers for not including strict rules as part of the standard for how frequently the client is allowed to check back, and, duh, the client shouldn't be user-configured to check at common times, like on the hour.

Bram figured this out with BitTorrent- the server can instruct the client on when it should next check back.

Re:"it's the connection overhead, stupid" (1)

Russ Nelson (33911) | more than 9 years ago | (#9752157)

No, it's not necessary to add scheduling. All that's needed is better TCP stacks which can handle millions of concurrent open connections. Presumably this would happen in a database in userland, and not in the kernel.
-russ

Why are we writing polling software in 2004? (1, Flamebait)

skraps (650379) | more than 9 years ago | (#9752097)

The guy who came up with the idea for RSS should be sent back to comp. sci. 101. It should have been readily apparent from day 1 that this would be a problem.

Some sort of peer-to-peer event-driven model would be a better match for this problem.

Re:Why are we writing polling software in 2004? (1)

Gooner (28391) | more than 9 years ago | (#9752194)

Y'know the parent comment is one time I'm more than willing to let Dave Winner grab all the "glory" within the creation of RSS. Oh and if this turns into a flamewar over who invented what then let me get my CDF shout out, erm, out there.

it's the PULL,stupid (3, Interesting)

kisrael (134664) | more than 9 years ago | (#9752099)

"Despite 'only' being XML, RSS is the driving force fulfilling the Web's original promise: making the Web useful in an exciting, real-time way."

Err, did I miss the meeting where that was declared as the Web's original promise?

Anyway, the trouble is pretty obvious: RSS is just a polling mechanism to do fakey Push. (Wired had an interesting retrospective on their infamous "PUSH IS THE FUTURE" hand cover about PointCast.) And that's expensive, the cyber equivalent of a horde of screaming children asking "Are we there yet? Are we there yet? How about now? Are we there yet now? Are we there yet?" It would be good if we had an equally widely used "true Push" standard, where remote clients would register as listeners, and then the server can actually publish new content to the remote sites. However, in today's heavily firewall'd internet, I dunno if that would work so well, especially for home users.

I dunno. I kind of admit to not really grokking RSS, for me, the presentation is too much of the total package. (Or maybe I'm bitter because the weird intraday format that emerged for my own site [kisrael.com] doesn't really lend itself to RSS-ification...)

Proposed Solution (2, Interesting)

Dominatus (796241) | more than 9 years ago | (#9752119)

Here's a solution: Have the RSS readers grab data every hour or half hour starting from when they are started up, not on the hour. This would of course distribute the "attacks" on the server.

As a self-appointed representative of RSS, ... (1)

burgburgburg (574866) | more than 9 years ago | (#9752121)

I'd like to dispute the characterization of my client as stupid.

I'd really, really like to.

Obviously, I can't, but boy would I like to.

Stupid RSS.

Poisson distribution (1, Insightful)

Anonymous Coward | more than 9 years ago | (#9752133)

We use poisson distribution to even out the load our scripts generate.
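The post doesn't say how their scripts do it. One common reading is to draw the gaps between requests from an exponential distribution, which makes request times a Poisson process. A sketch with a hypothetical function name:

```python
import random


def poisson_poll_delays(mean_interval_s, count, rng=random):
    """Gaps between polls for a Poisson process, in seconds.

    Inter-arrival times of a Poisson process are exponentially
    distributed with mean `mean_interval_s`.
    """
    rate = 1.0 / mean_interval_s
    return [rng.expovariate(rate) for _ in range(count)]
```

The average gap still comes out at the chosen interval, but individual gaps vary widely, so a crowd of clients never lines up on the same second.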

Server side and client side fixes. (1)

Inoshiro (71693) | more than 9 years ago | (#9752135)

In any commons, co-operation is key. I doubt most people will update their clients to work with HEAD or some sort of checksumming without reason, so the first obvious step is to block clients for a period. If a client retrieves information from a host, place a ban on all requests from said client until either the information changes, or there is a timeout value.

On the client side, the software needs to be written to check for updates to the data before pulling the data. This will lessen the burden.

The other side of the problem is the fact that the clients default to asking for data at the top of the hour. As this scales up, even with checks to see if data has changed, you'll be seeing a synchronized rise in traffic which leads to a DDoS effect on systems. To fix this is the same way we fixed message ids: the interval that clients check the data on should be seeded semi-random intervals such that no more than subset n of the total i clients are checking for new data or transferring new data at any given time. This is something else that can be mitigated by having smarter server-side data blocks until users update to smarter clients. Otherwise the servers risk being DDoSed by these legions of stupid clients ;)
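The server-side ban from the first paragraph might be sketched as below. The class name, the use of a version string to detect "the information changes", and the in-memory dict are all assumptions made for illustration.

```python
import time


class FeedGate:
    """Refuse repeat fetches from a client until the feed changes or a timeout passes."""

    def __init__(self, timeout_s=3600.0, clock=time.monotonic):
        self._timeout_s = timeout_s
        self._clock = clock
        self._seen = {}  # client id -> (feed version served, time served)

    def should_serve(self, client_id, feed_version):
        """Return True if this request should get the feed, False if it is banned."""
        now = self._clock()
        last = self._seen.get(client_id)
        if last is not None:
            version, served_at = last
            if version == feed_version and now - served_at < self._timeout_s:
                return False  # nothing new and the ban has not expired
        self._seen[client_id] = (feed_version, now)
        return True
```

A real deployment would also need to expire old entries and to decide what counts as a client; several users behind one NAT share an IP address.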

Random request time (1)

sanpitch (9206) | more than 9 years ago | (#9752136)

Newsreaders and users could both use random request times, rather than defaulting to the top of the hour.

random check intervals? (2, Insightful)

Hunterdvs (461524) | more than 9 years ago | (#9752172)

Why not have RSS readers check on startup, then check again at user-specified intervals, after a random amount of time has passed?
user starts program at 3.15 and it checks rss feed.
user sets check interval to 1 hour.
rand()%60 minutes later (let's say 37) it checks feed
every hour after that it checks the feed.

simplistic sure, but isn't rss in general?
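The steps above, as a sketch. The function name is hypothetical, and unlike a bare rand()%60 the random gap here is never zero, so the second poll cannot land on top of the startup poll.

```python
import random


def poll_offsets_minutes(interval_min=60, polls=4, rng=random):
    """Minutes after startup at which to poll.

    Poll at startup (0), again after a random gap shorter than the
    interval, then once per interval after that.
    """
    first_gap = rng.randrange(1, interval_min)  # 1 .. interval_min - 1
    offsets = [0]
    for i in range(polls - 1):
        offsets.append(first_gap + i * interval_min)
    return offsets
```

With a random gap of 37 and a start at 3:15, the polls would fall at 3:15, 3:52, 4:52 and 5:52.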

on an aside, any of you (few) non-programmers interested in creating rss feeds, i put out some software that facilitates it.
hunterdavis.com/ssrss.html

I thought it should be called (0)

Anonymous Coward | more than 9 years ago | (#9752173)

Arse-feed

Push, not pull! (4, Interesting)

mcrbids (148650) | more than 9 years ago | (#9752192)

The basic problem with RSS is that it's a "pull" method - RSS clients have to make periodic requests "just to see". Also, there's no effective way to mirror content.

That's just plain retarded.

What they *should* do...

1) Content should be pushed from the source, so only *necessary* traffic is generated. It should be encrypted with a certificate so that clients can be sure they're getting content from the "right" server.

2) Any RSS client should also be able to act as a server, NTP style. Because of the certificate used in #1, this could be done easily while still ensuring that the content came from the "real" source.

3) Subscription to the RSS feed could be done on a "hand-off" basis. In other words, a client makes a request to be added to the update pool on the root RSS server. It either accepts the request, or redirects the client to one of its already set up clients. Whereupon the process starts all over again. The client requests subscription to the service, and the request is either accepted or deferred. Wash, rinse, repeat until the subscription is accepted.

The result of this would be a system that could scale to just about any size, easily.
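Step 3 can be modelled in memory to see how the tree fills up. This is a toy sketch, not a network protocol: the class and function names are invented, capacity is a made-up per-node limit, and the signing from step 1 is left out entirely.

```python
class FeedNode:
    """A node that accepts up to `capacity` subscribers, then redirects to one of them."""

    def __init__(self, name, capacity=2):
        self.name = name
        self.capacity = capacity
        self.children = []
        self._next_redirect = 0  # rotate redirects across children

    def request_subscription(self, client):
        """Return None if accepted here, else the child the client should ask next."""
        if len(self.children) < self.capacity:
            self.children.append(client)
            return None
        target = self.children[self._next_redirect % len(self.children)]
        self._next_redirect += 1
        return target


def subscribe(root, client):
    """Follow redirects from the root until some node accepts; return that node."""
    node = root
    while True:
        redirect = node.request_subscription(client)
        if redirect is None:
            return node
        node = redirect
```

Each redirect moves one level further from the root, so the origin server only ever pushes to `capacity` clients no matter how many subscribe.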

Anybody want to write it? (Unfortunately, my time is TAPPED!)

Re:Push, not pull! (3, Interesting)

stratjakt (596332) | more than 9 years ago | (#9752365)

Too many firewalls in today's world for "push" anything to work.

Too many upstream bandwidth restrictions, especially on home connections. Last thing people want is getting AUPped because they're mirroring slashdot headlines.

My solution? Multicast IPs. Multicast IPs solve every problem that's ever been encountered by mankind. Join Multicast, listen till you've heard all the headlines (which repeat ad nauseum), move on with life. Heck, keep listening if ya want. All we have to do is make it work.

Frankly, who said you have to let everyone in the world on your RSS feed? If your server can't handle X concurrent RSS requests, it's hardly the protocol's "fault", IMO.

Re:Push, not pull! (1)

ikegami (793066) | more than 9 years ago | (#9752382)

Pushed contents sounds like the obvious solution, but it's hard to push content to a client behind NAT (home "routers") or a firewall.

I seem to remember... (4, Interesting)

Misch (158807) | more than 9 years ago | (#9752193)

I seem to remember Windows scheduler being able to randomize scheduled event times within a 1 hour period. I think our RSS feeders need similar functions.

Simple Solution (0, Redundant)

prichardson (603676) | more than 9 years ago | (#9752205)

Ask RSS reader writers to program into their programs a suggestion that the refresh not be on the hour. It would distribute the load more evenly. Getting people to actually do this is another problem. Sounds to me like a little bit of lazy coding (not checking modified times in header), and a little bit of ignorance (RSS isn't big enough to cause a problem.... so doing this on the hour is OK right?) have just snowballed.

XML Bloat... (0)

Anonymous Coward | more than 9 years ago | (#9752262)

Of course, XML bloat has nothing to do with this.

Cache it (1)

Sloppy (14984) | more than 9 years ago | (#9752265)

IMHO, most ISPs (and their ISPs) should run caching proxies for http. (And of course, servers need to be less stupid about advising against caching content.) I just don't understand why they don't. Most of them already run a nameserver, mail server, maybe a usenet spool, etc. What's one more service?

It might not seem like it's worth much effort if a bunch of your customers are all downloading a few hundred bytes of headlines every hour, but it probably matters when they're all downloading movie trailers or OS updates. The caching of small stuff, to keep from contributing to someone else's slashdotting, is just a bonus.

Oh, and if an RSS is ten minutes old instead of "real time": The 1% of the population that actually cares, can just elect to not use the proxy.

It can even be a really cheap box, too, since it doesn't need to be reliable. Use cheap consumer-grade crap. If once per year a drive fails and you lose all your cache, so what?

It's totally stupid. (1)

torpor (458) | more than 9 years ago | (#9752297)


RSS is like a hi-jack of majordomo, by marketing dweebs.

E-mail - yes folks, good old fashioned SMTP, can be used for these things that RSS is supposedly 'good for'.

We do not need yet another protocol for transferring messages to each other. A properly defined X-Protocol addition, which allows for embedded XML in the Body text, would solve this distribution problem entirely.

Mail scales well. Like it or not, it does. It's a perfect model for RSS ...

Oh, come on (5, Interesting)

aiken_d (127097) | more than 9 years ago | (#9752302)

My guess is that InfoWorld is dynamically generating the RSS for each request. A simple host-side cache of the generated XML, so hits just talk to the HTTP server and not the app server, would probably make this a non-issue.

Or are they *really* getting more RSS hits than image requests? If -- somehow -- that's the case, spend $500/mo on Akamai or Speedera and point RSS stuff there, and give the CDN a reasonable timeout (30 minutes or something). That guarantees you no more than about 500 hits per timeout period, or roughly one every 3 to 4 seconds. Surely the app server can handle that.
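The host-side cache described above can be very small. The sketch below is an assumption-laden illustration, not InfoWorld's setup: `FeedCache`, the `generate` callback (standing in for the app server call), and the 30-minute default are all hypothetical.

```python
import time
from typing import Callable, Optional


class FeedCache:
    """Serve one generated copy of the feed until a time-to-live expires."""

    def __init__(self, generate: Callable[[], str], ttl_seconds: float = 1800.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._generate = generate      # expensive call into the app server
        self._ttl = ttl_seconds
        self._clock = clock            # injectable so the cache can be tested
        self._body: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._body is None or now >= self._expires_at:
            # Regenerate at most once per TTL, however many readers ask.
            self._body = self._generate()
            self._expires_at = now + self._ttl
        return self._body
```

With this in front of the generator, the number of expensive regenerations depends only on the TTL, not on how many newsreaders hit the URL at the top of the hour.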

Then again, what do I know? I only worked there for five years, including two on infoworld.com. It's been a few years, but unless things have changed dramatically, that is one messed up IT organization.

Cheers
-b

Let's have RSS torrents -- (0)

Anonymous Coward | more than 9 years ago | (#9752304)

That way I can snag the story from someone who just downloaded it. :)

It just ain't broadcast.. (4, Interesting)

wfberg (24378) | more than 9 years ago | (#9752310)

Complaining about people connecting to your RSS feeds "impolitely" is missing the mark a bit, I think. Even RSS readers that *do* check when the file was last changed, still download the entire feed when so much as a single character has changed.

There used to be a system where you could pull a list of recently posted articles off of a server that your ISP had installed locally, and only get the newest headers, and then decide which article bodies to retrieve.. The articles could even contain rich content, like HTML and binary files. And to top it off, articles posted by someone across the globe were transmitted from ISP to ISP, spreading over the world like an expanding mesh.

They called this.. USENET..

I realize that RSS is "teh hotness" and Usenet is "old and busted", and that "push is dead" etc. But for Pete's sake, don't send a unicast protocol to do a multicast (even if it is at the application layer) protocol's job!

It would of course be great if there was a "cache" hierarchy on usenet. Newsgroups could be styled after content providers' URLs (e.g. cache.com.cnn, cache.com.livejournal.somegoth) and you could just subscribe to crap that way. There's nothing magical about what RSS readers do that the underlying stuff has to be all RSS-y and HTTP-y..

For real push you could even send the RSS via SMTP, and you could use your ISP's outgoing mail server to multiply your bandwidth (i.e. BCC).

rss+bittorrent? (1, Redundant)

DougMelvin (551314) | more than 9 years ago | (#9752317)

How about combining RSS with BitTorrent? The RSS feeder would act more as a BT tracker.. Simply point the client to the nearest dood with a copy of the feed.....

3 Good Ideas (1)

cameronc (712082) | more than 9 years ago | (#9752318)

  1. Readers should obey HTTP Cache-Control and Expires directives
  2. Readers should use HEAD requests to determine change
  3. Future RSS formats should specify update frequency
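As a rough illustration of points 1 and 3, a reader can derive its refresh interval from what the server already says. The sketch assumes the server sends a `Cache-Control: max-age=...` header and that the feed may carry the optional RSS 2.0 `<ttl>` channel element (a value in minutes); the function names and the one-hour fallback are made up for the example.

```python
import re
import xml.etree.ElementTree as ET
from typing import Optional


def max_age_seconds(cache_control: str) -> Optional[int]:
    """Extract max-age (seconds) from a Cache-Control header value, if present."""
    match = re.search(r"(?:^|[,\s])max-age\s*=\s*(\d+)", cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def rss_ttl_seconds(feed_xml: str) -> Optional[int]:
    """Read the channel-level <ttl> (minutes) and return it in seconds."""
    channel = ET.fromstring(feed_xml).find("channel")
    if channel is None:
        return None
    ttl = (channel.findtext("ttl") or "").strip()
    return int(ttl) * 60 if ttl.isdigit() else None


def refresh_interval(cache_control: str, feed_xml: str, default: int = 3600) -> int:
    """Wait at least as long as the most conservative hint the server gave."""
    hints = [h for h in (max_age_seconds(cache_control), rss_ttl_seconds(feed_xml))
             if h is not None]
    return max(hints) if hints else default
```

Taking the larger of the two hints is a design choice that favors the publisher; a reader that prefers freshness could take the smaller one instead.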

Site proxy server (0)

Anonymous Coward | more than 9 years ago | (#9752348)

Have RSS readers use a local proxy that is willing to cache RSS data.
Treat this like an HTML problem.

bah. Multicast! (0)

Anonymous Coward | more than 9 years ago | (#9752357)

Demand your ISP support native multicast. As most ISPs are now owned by cable companies, and native multicast would enable efficient, scalable video distribution, don't bet on them being too receptive, but here again is another application for which native multicast would excel.

Too many problems. (1)

blanks (108019) | more than 9 years ago | (#9752364)

The main problem with most RSS feeds is that they update all information. Most of these run off a simple JavaScript that will run on a timer to get all the data again and again.

A better solution would be to implement an XML RSS (or any language really) that uses a simple ID system for news items. When it's time to update the news feed, look for any new IDs; don't retrieve existing data, only new data. This would cut down a large chunk of bandwidth.

A better idea still would be to implement some type of component that accesses their database (or web server through ISAPI, etc.) and updates the content on the external server that requires the RSS. This would again cut out a large amount of data transfer, and requests that would normally slow the server down. Yes, this would need to be installed on the external web server, but if people need the news feeds, then you can force them to do it your way.

This is similar to what I do with our system: we have 12,000 machines accessing a database every few minutes to send and receive new information, and we have very few problems with it.
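The ID scheme can be sketched in a few lines. This assumes every news item carries an integer `id` that only ever increases, and that the client remembers the highest one it has seen; neither is part of RSS itself, and the names below are hypothetical.

```python
from typing import Dict, List, Tuple


def items_since(items: List[Dict], last_seen_id: int) -> List[Dict]:
    """Server side: return only the items newer than the client's last ID."""
    return [item for item in items if item["id"] > last_seen_id]


def merge_update(known: List[Dict], last_seen_id: int,
                 delta: List[Dict]) -> Tuple[List[Dict], int]:
    """Client side: append the new items and advance the remembered ID."""
    merged = known + delta
    newest = max([last_seen_id] + [item["id"] for item in delta])
    return merged, newest
```

A client that is already up to date gets an empty list back, so the steady-state cost of a poll is close to zero payload.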