DNS load balancing

I noticed that Google Talk does an interesting type of DNS load balancing. It goes like this:

This means that the talk.google.com name takes a very long time to expire from caches, whereas talk.l.google.com does not: it expires after 5 minutes, in fact.

Now, let's say we have five Erlang nodes, named A, B, C, D, and E, each with its own IP address. I want all users to connect to talk.foo.com and get some kind of decent distribution among the five nodes. If I use the Google Talk method, users that connect will get passed to a different server at five minute intervals. So: userA -> talk.foo.com -> a.foo.com (ip.address.first). Five minutes later: userB -> talk.foo.com -> a.foo.com (ip.address.second). Brilliant!

This will require a script that controls a DNS server which I own, since I'll be updating a.foo.com's IP address every 5 minutes, though a more intelligent algorithm can be used to trigger an IP update. There's no need to load balance if the load isn't there to balance. In the case of Jabber, load can be easy to detect, since each new user on the system has a persistent connection. They quite literally are "logged in".
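Here is a rough sketch of what that distribution script might look like. It is only an illustration under assumptions: the node addresses are made up (taken from the 192.0.2.0/24 documentation range), the helper names are hypothetical, and it assumes the DNS server accepts dynamic updates in the text format read by BIND's nsupdate tool. It only builds the update text; actually sending it to the server is left out.

```python
from itertools import cycle

TTL_SECONDS = 300  # the 5 minute TTL described above

# Hypothetical addresses for nodes A through E (documentation range).
NODES = {
    "A": "192.0.2.1",
    "B": "192.0.2.2",
    "C": "192.0.2.3",
    "D": "192.0.2.4",
    "E": "192.0.2.5",
}


def rotation(nodes):
    """Yield node IPs in round-robin order, forever."""
    return cycle(nodes.values())


def pick_least_loaded(nodes, connections):
    """Return the IP of the node with the fewest open connections.

    `connections` maps node name -> number of logged-in users.
    Nodes missing from the mapping count as having zero users.
    """
    name = min(nodes, key=lambda n: connections.get(n, 0))
    return nodes[name]


def nsupdate_script(record, ip, ttl=TTL_SECONDS):
    """Build nsupdate input text that repoints `record` at `ip`."""
    return "\n".join([
        f"update delete {record} A",
        f"update add {record} {ttl} A {ip}",
        "send",
    ])


if __name__ == "__main__":
    ips = rotation(NODES)
    # One pass over the five nodes; a real script would wait
    # TTL_SECONDS between updates and pipe each text to nsupdate.
    for _ in range(len(NODES)):
        print(nsupdate_script("a.foo.com.", next(ips)))
        print()
```

The round-robin version is the plain "every 5 minutes" idea. pick_least_loaded is one guess at the "more intelligent algorithm": since every Jabber user holds a persistent connection, counting connections per node gives a usable load figure.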

So what if node B goes down? Well, all the users connected to that node will get disconnected, no doubt about it. But depending on how far into the 300 second (5 minute) countdown we are, they can reconnect and pick right back up where they left off. Then I can remove a.foo.com (ip.address.second) from my distribution script until I fix that node. This pattern is not high availability, but it would be possible to put an HA system on all of these IPs to make sure no node outage would ever do anything more than interrupt a persistent connection, at worst.
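Dropping a dead node from the rotation could be sketched like this. Again, this is an assumption-laden illustration rather than a finished tool: the addresses and function names are hypothetical, and the health check is just a TCP connect to port 5222, the standard XMPP client port, which only shows that something is listening there.

```python
import socket

XMPP_CLIENT_PORT = 5222  # standard Jabber/XMPP client-to-server port

# Hypothetical addresses for nodes A through E (documentation range).
NODES = {
    "A": "192.0.2.1",
    "B": "192.0.2.2",
    "C": "192.0.2.3",
    "D": "192.0.2.4",
    "E": "192.0.2.5",
}


def is_up(ip, port=XMPP_CLIENT_PORT, timeout=2.0):
    """Return True if a TCP connection to ip:port succeeds in time."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def healthy_nodes(nodes, probe=is_up):
    """Return only the nodes whose IP passes the health probe.

    `probe` is injectable so the filter can be exercised without
    touching the network.
    """
    return {name: ip for name, ip in nodes.items() if probe(ip)}


if __name__ == "__main__":
    # Pretend node B is down by using a fake probe.
    alive = healthy_nodes(NODES, probe=lambda ip: ip != "192.0.2.2")
    print(sorted(alive))
```

Running the filter before each DNS update means a failed node stops receiving new users by the next update. Clients that still have the old address cached have to wait out the rest of the TTL.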


Written on 2009-01-23 19:55:41 UTC


I am a hacker and systems architect specializing in data analytics and human computer interfaces.


