Webfarms (Part 2): Balancing The Load
Okay, so you understand webfarms now. What's the magic that actually
distributes the load, and how does it decide where each request goes?
At ORCS Web we use the Foundry Server Iron products to perform our webfarm
load-balancing. We run them in a redundant pair: if one of them fails, the
other instantly takes over (in our testing, fail-over was sub-second!).
So what is this "Server Iron" thing? In simplest terms, it's a layer 4-7
switch. It has multiple network ports and can be used just like other types
of switches, but it can also perform load balancing and traffic
distribution. A VIP (virtual IP) is assigned to the SI (Server Iron), which
then handles all traffic sent to that address. Further configuration tells
the SI what to actually do with the traffic sent to the VIP address.
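To make the VIP idea concrete, here's a minimal software-only sketch in Python of what a traffic distributor does at its core: listen on one virtual address and relay each incoming connection to one of the real server nodes behind it. The addresses, port numbers, and simple rotation are made up for illustration - this is not the Server Iron's configuration syntax or behavior, which is handled in hardware.

```python
import socket
import threading

VIP = ("0.0.0.0", 8080)               # the one address clients connect to
REAL_SERVERS = [("10.0.0.11", 80),    # the server nodes sitting behind the VIP
                ("10.0.0.12", 80),
                ("10.0.0.13", 80)]

def pipe(src, dst):
    """Copy bytes one direction until either side closes."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        src.close()
        dst.close()

def forward(client, backend_addr):
    """Relay one client connection to one real server, in both directions."""
    backend = socket.create_connection(backend_addr)
    threading.Thread(target=pipe, args=(client, backend), daemon=True).start()
    threading.Thread(target=pipe, args=(backend, client), daemon=True).start()

def serve():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(VIP)
    listener.listen()
    turn = 0
    while True:
        client, _ = listener.accept()
        forward(client, REAL_SERVERS[turn % len(REAL_SERVERS)])
        turn += 1
```

Calling serve() would start the loop; the point is simply that clients only ever see the VIP, while the distributor decides which real server handles each connection.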
The traffic that hits the VIP on the Server Iron is, of course, redistributed
to a number of server nodes so the client request can be satisfied - that's
the whole point of a webfarm. If one or more server nodes stop responding,
the switches detect this and send all new requests to the servers that are
still online, making the failure of a server node almost transparent to the
client.
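The health-check logic itself lives inside the switch, but the idea can be sketched roughly like this: periodically probe each node, mark nodes that don't answer as down, and only hand new requests to nodes marked up. The addresses and the simple TCP-connect probe below are assumptions for illustration; the Server Iron has its own set of health checks.

```python
import socket
import time

# Hypothetical pool of server nodes; True means the node is currently considered healthy.
POOL = {("10.0.0.11", 80): True,
        ("10.0.0.12", 80): True,
        ("10.0.0.13", 80): True}

def is_alive(addr, timeout=1.0):
    """Treat a node as up if it accepts a TCP connection on its service port."""
    try:
        socket.create_connection(addr, timeout=timeout).close()
        return True
    except OSError:
        return False

def healthy_nodes():
    """Only these nodes receive new client requests."""
    return [node for node, up in POOL.items() if up]

def health_check_loop(interval=5):
    """Re-probe every node periodically; dead nodes drop out, recovered nodes rejoin."""
    while True:
        for node in POOL:
            POOL[node] = is_alive(node)
        time.sleep(interval)
```

Because the loop keeps probing, a node that stops answering falls out of rotation within one check interval, and a node that recovers is picked up again automatically.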
The traffic can be distributed based on a few different algorithms. The most
common are listed below, with a rough code sketch of each following the list:
- Round Robin: The switches send requests to each server in rotation,
regardless of how many connections each server has or how fast it replies.
- Fastest response: The switches select the server node with the fastest
response time and send new connection requests to that server.
- Least connections: The switches send traffic to whichever server node
shows the fewest active connections.
- Active-passive: This is called Local/Remote on a Foundry switch, but it is
still basically active/passive. One or more servers are designated as
"local," which marks them as primary for all traffic. This is combined with
one of the methods above to determine the order in which requests are sent
to the "local" server nodes. If all of the "local" (active) server nodes
were to go down, traffic would be sent to the "remote" server nodes. Note
that "remote" in this case doesn't really have to mean remote - the "remote"
server could be sitting right next to the "local" servers but is marked as
remote in the configuration so it operates as a hot-standby server. This
setting can also be used in a true remote situation, where servers sit in a
different physical data center - perhaps for extreme disaster recovery
situations.
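Here is a rough sketch of how those four selection strategies might look in code, assuming each node tracks an active connection count and a recent response time. The Node class and its fields are illustrative assumptions, not anything taken from the Foundry configuration.

```python
import itertools

class Node:
    def __init__(self, addr, local=True):
        self.addr = addr
        self.local = local              # "local" = primary, "remote" = hot standby
        self.active_connections = 0     # updated as connections open and close
        self.last_response_ms = 0.0     # updated from recent probes

_rotation = itertools.count()

def round_robin(nodes):
    """Each node takes its turn, regardless of load or speed."""
    return nodes[next(_rotation) % len(nodes)]

def fastest_response(nodes):
    """Pick the node that has been answering the quickest."""
    return min(nodes, key=lambda n: n.last_response_ms)

def least_connections(nodes):
    """Pick the node currently holding the fewest active connections."""
    return min(nodes, key=lambda n: n.active_connections)

def local_remote(nodes, pick=least_connections):
    """Prefer 'local' (active) nodes; use 'remote' (standby) nodes only if none are up."""
    local = [n for n in nodes if n.local]
    return pick(local) if local else pick([n for n in nodes if not n.local])
```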
What method is best? It really depends on your application and a few other
surrounding factors. Each method is solid, though, and will probably satisfy
your requirements regardless of the configuration, especially if you closely
monitor each server node with an external tool (rather than relying only on
the load-balancing switch itself). External monitoring lets you confirm that
every server node is operating without error and within the response-time
thresholds you have set.
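An external monitor can be as simple as a script that requests a page from each node directly (bypassing the VIP) and alerts when a node errors out or answers too slowly. The URLs, health page, and 500 ms threshold below are made-up examples.

```python
import time
import urllib.error
import urllib.request

# Hypothetical per-node URLs (each node hit directly, not through the VIP) and threshold.
NODES = ["http://10.0.0.11/health.aspx",
         "http://10.0.0.12/health.aspx",
         "http://10.0.0.13/health.aspx"]
MAX_RESPONSE_MS = 500

def check(url):
    """Return (healthy, elapsed_ms): healthy means HTTP 200 within the time threshold."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            ok = resp.status == 200
    except (urllib.error.URLError, OSError):
        ok = False
    elapsed_ms = (time.monotonic() - start) * 1000
    return ok and elapsed_ms <= MAX_RESPONSE_MS, elapsed_ms

for url in NODES:
    healthy, ms = check(url)
    print(f"{url}: {'OK' if healthy else 'ALERT'} ({ms:.0f} ms)")
```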
Also remember that, regardless of which traffic algorithm is chosen, if a
node goes down, traffic is sent to the other nodes. And when a node comes
back online, it can automatically be placed back into the webfarm and start
receiving client requests again.
Clustered hosting does require some consideration of how state is managed
within applications, which will be covered in a future article.
Happy hosting!