
Simple HA/HP clustering Using Only DNS

Hemos posted more than 9 years ago | from the interesting-concepts dept.

Technology 26

holviala writes "I cooked up a way to achieve high-availability and high-performance clustering using nothing but a few strangely configured DNS zones. In case someone else is interested in an extremely easy clustering solution, I wrote a document about it. It's a bit technical, but the included examples should make it clear for anyone who's used to configuring DNS. And yes, the linked site is clustered too, so... ummm... no need to be gentle :-)."
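As a rough illustration of the kind of setup described (hypothetical sketch, not the author's actual config: `example.com`, the node names, and the 192.0.2.x addresses are placeholders): each node runs its own authoritative nameserver for the zone, each node's copy of the zone answers with only its own address, and the TTL is kept very short. When a node's services die, its nameserver is shut down, so resolvers move on to a surviving NS and get a live address.

```zone
; zone file as served by node1 (node2's copy is identical,
; except that www points at node2's own address)
$TTL 60
@       IN SOA  ns1.example.com. hostmaster.example.com. (
                2005012501      ; serial - kept identical on all nodes
                3600            ; refresh
                600             ; retry
                604800          ; expire
                60 )            ; negative-caching TTL
        IN NS   ns1.example.com.
        IN NS   ns2.example.com.
ns1     IN A    192.0.2.1
ns2     IN A    192.0.2.2
www     IN A    192.0.2.1       ; node1 answers only with its own address
```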

26 comments

well, not really HA, is it? (4, Interesting)

passthecrackpipe (598773) | more than 9 years ago | (#11466661)

This only guarantees DNS HA, since it will not test for Apache being alive, or any other service being alive. It's more of a round-robin setup with automatic dropping of dead addresses. Although it is a nice DNS experiment, I would never use this for HA, as there are better and, critically, more reliable ways of doing HA, and some of those are pretty affordable.

Face it, you do HA if your business depends on it, and would you really want to rely on a DNS hack in that case?

Having said that - Cool Hack Dude!

Re:well, not really HA, is it? (3, Interesting)

holviala (124278) | more than 9 years ago | (#11467104)

This only guarantees DNS HA, since it will not test for Apache being alive, or any other service being alive.

True, which is why I called it "simple". But with this setup you only need to monitor local processes and services, and if those die, just shut down the nameserver. No need for complicated setups where you have to decide whether it was the application or the network that died.
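A minimal sketch of that local monitoring, assuming a plain HTTP health check and BIND's `rndc stop` to silence the nameserver (the URL and the command are placeholders for whatever the node actually runs):

```python
import urllib.request

def service_alive(url: str, timeout: float = 5.0) -> bool:
    """Return True if the local service answers an HTTP request."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except OSError:  # connection refused, timeout, DNS failure, ...
        return False

def nameserver_action(alive: bool) -> list:
    """Decide what to do with the local nameserver given service health.

    Dead service -> stop named, so this node stops answering DNS
    and resolvers fail over to another node.
    """
    return [] if alive else ["rndc", "stop"]
```

A cron entry could run this once a minute and execute the returned command, e.g. with `subprocess.run`.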

Face it, you do HA if your business depends on it, and would you really want to rely on a DNS hack in that case?

My business, yes, I'd rely on this. I do "official" HA for a living, for customers who don't like hacks like this. But that's something I'd personally never use, not even if I owned a million billion zillion dollar company.

Then again, I suffer from Not Invented Here syndrome. Guess I'd make a bad leader: "You'll use my DNS hack or you're fired!" :-)

Re:well, not really HA, is it? (1)

mdielmann (514750) | more than 9 years ago | (#11471910)

Then again, I suffer from Not Invented Here syndrome. Guess I'd make a bad leader: "You'll use my DNS hack or you're fired!" :-)

There's a nice executive position opening up in less than 4 years. You seem to share a couple ideas with the current executive, maybe you should apply.

P.S. If you're anti-Bush, please take no offense, I'm just joking. If you're pro-Bush, well, let's not go there.

You don't change the serial numbers.... (1)

Nathaniel (2984) | more than 9 years ago | (#11477293)

I must be missing something. Your page says:

"The serials should always be the same on all nodes." ... "But the most serious limitation is the buggy DNS servers around the world. This setup assumes that a DNS server or resolver obeys the expire time of a zone record (the 60 seconds used above). Unfortunately, there are a lot of servers out there which don't do that."

Aren't other DNS servers allowed to look at your SOA serial number, notice it hasn't changed, and not bother doing any other work? Isn't that the point of having serial numbers?

It sounds like you are blaming all those other DNS servers for following the RFC.

But I'm sure you've given this more thought than I have, so tell me what I'm missing.

Re:You don't change the serial numbers.... (1)

holviala (124278) | more than 9 years ago | (#11477850)

Aren't other DNS servers allowed to look at your SOA serial number, notice it hasn't changed, and not bother doing any other work? Isn't that the point of having serial numbers?

I'm glad you told me that - now I can go and take down the setup that has proven to work well....

Yeah, they could check the SOA, but they don't. The reason I want all the serials to be the same is that, no matter what, the serial never decreases. Basically this setup is the same as traditional round-robin DNS, but with dead-node detection.
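Concretely, the invariant being described might look like this (hypothetical zone fragments; names and addresses are placeholders): every node publishes the same SOA serial, and only the answer differs per node.

```zone
; head of the zone as served by node1
@   IN SOA ns1.example.com. hostmaster.example.com. ( 2005012501 3600 600 604800 60 )
www IN A   192.0.2.1

; head of the zone as served by node2 - same serial, different answer
@   IN SOA ns2.example.com. hostmaster.example.com. ( 2005012501 3600 600 604800 60 )
www IN A   192.0.2.2
```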

It sounds like you are blaming all those other DNS servers for following the RFC.

I'm blaming DNS servers which are broken. The month-and-a-half incident I was complaining about involved a regular A record with a refresh time of 8 hours and an expire time of one week - we changed the IP and increased the serial. After around 6 weeks I finally called my ISP and asked them what the f*ck was wrong with their DNS servers (after which they cleared their cache right away and I finally got the new IP). That was maybe five years ago, so I have no idea whether they've fixed it since.

Re:You don't change the serial numbers.... (1)

mikefe (98074) | more than 9 years ago | (#11498560)

First of all, let me say I love the idea and will be using it myself. It's perfect for a company that uses multiple cheap connections (read: DSL) and needs to deal with the possibility of one going down. I only wonder why I didn't think of it myself; it makes every service work just like SMTP with MX records...

Second, if you are still a customer of that ISP, you could easily test whether they still cache records beyond their expiration. (Maybe even if you're not a customer, depending on whether their DNS server answers requests from outside their network.)

Re:well, not really HA, is it? (1)

paRcat (50146) | more than 9 years ago | (#11467325)

would you really want to rely on a DNS hack in that case?

A hack doesn't have to be unreliable. The Debian stable tree has programs with hacks in their configs, but they've been deemed stable and are trusted. Really, the only thing separating a hack from an accepted practice is how widespread its use is.

Sounds like you just want someone to blame... or flame?

Re:well, not really HA, is it? (1)

duffahtolla (535056) | more than 9 years ago | (#11467489)

For web server availability, this works fine.

I work at a broker dealer where we have a set of machines that sit on two different ISPs, and this is the technique we use in case one line goes down.

He taunts us!!! (0)

Anonymous Coward | more than 9 years ago | (#11466694)

And yes, the linked site is clustered too, so... ummm... no need to be gentle :-)

We'll see about that. Our cluster of desktops will wipe the floor with your cluster.

Sorry, you lose (0)

Anonymous Coward | more than 9 years ago | (#11466838)

HA is more than making more than one host available under the same name. If you are hosting some lame blog, it may well work, but if you are anything that might be interesting to the rest of the world, i.e. something that involves data, you'll have more problems to solve. Tell me, how does this ensure that all the hosts are serving identical data?

Re:Sorry, you lose (1)

aderusha (32235) | more than 9 years ago | (#11467056)

Easy - use shared storage. Hell, use an NFS mount if you like. Use a common DB back-end, and have the server software fail over to another DB if the first fails. This is clustering for front-end software, not back-end - the back-end is easy. Getting a web browser to play along with a clustered front-end is a little trickier.
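The database failover suggested here can be sketched generically (hypothetical helper: `connect` stands in for your DB driver's connect call, and the exact exception type will differ per driver):

```python
def connect_with_failover(connect, hosts):
    """Try each database host in order; return the first live connection.

    `connect` is any callable that opens a connection to a host and
    raises ConnectionError (or a subclass) when the host is down.
    """
    last_error = None
    for host in hosts:
        try:
            return connect(host)
        except ConnectionError as exc:
            last_error = exc  # remember why this host failed, try the next
    raise ConnectionError(f"all database hosts down: {last_error}")
```

For example, with a hypothetical driver: `connect_with_failover(lambda h: db.connect(host=h), ["db1", "db2"])`.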

Re:Sorry, you lose (0)

Anonymous Coward | more than 9 years ago | (#11477168)

With shared storage such as NFS you still have a single point of failure. You need something more advanced such as AFS or GFS.

Re:Sorry, you lose (1)

mikefe (98074) | more than 9 years ago | (#11499460)

You don't want to use a database on AFS since it caches all writes on the local machine until the file is closed...

One comment (3, Informative)

tdemark (512406) | more than 9 years ago | (#11466923)

Don't use "Domain.dom". There are well-known domains that are reserved explicitly for this purpose [rfc-editor.org].

Re:One comment (1)

holviala (124278) | more than 9 years ago | (#11467225)

Don't use "Domain.dom". There are well-known domains that are reserved explicitly for this purpose.

Good point. It's all fixed now...

Re:One comment (0)

Anonymous Coward | more than 9 years ago | (#11467502)

Wow. Quick turnaround.

Though, I was thinking more along the lines of "example.com" (Section 3 of the RFC), since it is a little easier to grok than "domain.example"...

Good work, btw.

DNS caching? (2, Interesting)

Anonymous Coward | more than 9 years ago | (#11467123)

What about client programs that cache DNS lookups (I think some web browsers do this)? I'd hardly call something HA if I have to do something clientside to flush any cached lookups.

Re:DNS caching? (1)

davegaramond (632107) | more than 9 years ago | (#11467208)

That's why the TTL is set to a low value (60 seconds), so the caching period stays short. See yahoo.com, google.com, gmail.com, etc. They all set low TTL values for their A records, too.

Re:DNS caching? (1)

jmcleod (233418) | more than 9 years ago | (#11470808)

Browsers ignore the TTL on records. If you have a DNS-based balancing solution, like this or GSLB, it's going to bite you in the ass every time. You have to restart the browser (possibly even reboot the computer) in order to clear the cache.

Re:DNS caching? (1)

Coffee (95940) | more than 9 years ago | (#11473379)

What about client programs that cache DNS lookups (I think some web browsers do this)?

Many web browsers do, nscd does, DNS caches do...

Speaking of DNS caches, think about the case when an ISP is providing DNS for their customers - even cycling once per minute isn't good for load-balancing the hits routed via a large DNS cache. Further, when I used to run DNS for a large ISP, I set a minimum timeout for data, because I explicitly did NOT want my caches pulling zone data once per minute. (I set it to five minutes, I think, but I'm not sure any more, and that was long ago, in a land far away, and besides, the wench is dead.)

This is a clever idea, and it neatly works around the real-world problem of multiple A records not being tried, but it does have problems caused by non-compliant DNS caching.

Quite clever (2, Insightful)

realnowhereman (263389) | more than 9 years ago | (#11467305)

Regardless of what the nit-pickers say, I think this is quite a clever idea. The author isn't suggesting this is the best HA solution in the world; but it's certainly simple and effective.

What's complex about Heartbeat/Ldirectord (1)

Bothari (34939) | more than 9 years ago | (#11467383)

Seriously, what is so complex about using something like Heartbeat/Ldirectord and setting up an HA/LB cluster?
It took me about 3 hours to read through the docs, google for examples, and set up a 2-load-balancer/3-node cluster using packages downloaded from ultramonkey.org.

With a 30 sec deadtime, full takeover takes about 1-2 minutes.

Re:What's complex about Heartbeat/Ldirectord (2, Informative)

phaze3000 (204500) | more than 9 years ago | (#11467532)

As someone who maintains two clusters that run LVS, I'd agree that there's nothing magical about setting it up. However, for a simple two-node cluster LVS is massive overkill - you've got to have as many director boxes as you have nodes!

I'm not sure I'd use this guy's method, but it's interesting nonetheless.

This isn't clustering.... (1)

photon317 (208409) | more than 9 years ago | (#11469547)


This is just HA load balancing of your inbound web traffic. Clustering is what happens on the back end between the servers, which the article doesn't cover at all, presumably because in the example case the servers are just serving static content over HTTP, and all that's needed to "cluster" it is to copy your changes to both machines when the static data changes.

The hard part of clustering is getting real HA and/or load balancing for non-trivial content. Imagine if the webserver behind Kimmy's DNS setup was hosting an SSL-based service, where users logged into sessions and posted on a message board to each other.

Now you have to get SSL working right for multiple machines, you have to get sessions to work right across multiple machines (a user's session would jump between servers with Kimmy's DNS setup), and the database the message board stores in has to be consistent across both servers.

The easy solution to most of this is, of course, to put the session state in the same database as the message board, and put that database on a separate back-end machine that both webservers hit, thereby removing all state from the web cluster. But now you've introduced a new single point of failure at the database level - if the DB goes poof, your webservers don't matter much.

So there's no easy way out of this, you can't play shell games and try to make the single points of failure magically disappear. May as well leave the database on the web server nodes (assuming load/space aren't problems), and use some kind of clustered database solution...

One way or another, you'll have to do real clustering at some layer - which will inevitably involve heartbeats and quorums and all that jazz, the complicated stuff referred to in the first paragraph of Kimmy's page.

This technique is not new (1)

linuxwrangler (582055) | more than 9 years ago | (#11469722)

Per the article: "If the above was common knowledge, I'd be grateful to get links to other docs about it."

OK, how about this article [rpanetwork.co.uk] from December 2002 (see diagram and description on page 4).
