question about having a back up server using Sockets

**newbc** · 11-22-2013

OK, I already have a simple TCP connection between one client and a server. Now I need to keep the same TCP client program, but on the server side I need to implement code so that if server one is down, the client connects to server two, and server two does what server one was supposed to do. I have looked for examples on Google but with no luck. Any ideas on what functions to use, or some pseudocode?

**anduril462** · 11-22-2013

Note, this should be in the Networking or Tech Boards since it's not necessarily specific to C programming.

I think what you're looking for is called a failover. Do a bit of reading, then maybe tell us a little more about your setup, needs, etc:

• OS, network/system architecture/layout, what the server and client are actually doing (file server, computation engine).
• Nature of failures you expect: slow or sudden, partial (daemon starts misbehaving from time to time) or complete (power outage, network failure), etc?
• What constitutes a failure: low resources, single HDD in RAID array failing, daemon crashing, etc?
• How you plan on communicating between servers 1 and 2 for monitoring/heartbeat?
• Consider whether a failover would be useful in some cases. E.g. if the two servers will sit on the same power circuit or in the same building, what good is a failover in a power/network outage?
• Is there any state/context that needs to be transferred between server 1 and 2 (if possible) so server 2 can pick up in the middle of a task (e.g. large computation/file transfer)?
• Do you need to program this yourself, or can you use a 3rd party app or maybe a hardware solution?

That should give you something to think about for a while.

**newbc** · 11-22-2013

Well, I am using sockets in C on Unix.

• Name Server (NS) to keep track of availability of all the Math Servers
• One TCP connection between each client and the Math Server (MS). Assume there are several Math Servers, e.g., MS1, MS2, MS3
• Heartbeat between MS and Client
• If MS1 fails, MS2 (backup server) should handle all the requests
• Math Server failures are transparent to the end user (who is trying to get the answers to the Math problems) and should get the continuous service
• When MS1 comes back, then MS2 relinquishes the service, and MS1 will be the active server.

**anduril462** · 11-22-2013

No offense, but I get the impression that you don't really have a strong need for such high availability. Just how much traffic are you expecting? How many servers will you actually have? What are the capabilities of the math servers? What rate of failures are you expecting? Is the software or hardware really so buggy that it fails a lot? It may be more cost effective to buy quality hardware and rewrite the math server to be more robust. How many levels of backup do you really need, and should you only have one server actually serving at a time? Sounds incredibly wasteful to have two servers sitting there not serving anything.

You haven't said whether you must code this yourself. If you simply must make these math servers highly available, look at existing tools like Linux-HA. There are other free tools out there too, I'm sure. Don't suffer from NIH syndrome. There is almost never a good business case to write your own version of something that already exists, especially if there are free alternatives. If there are no free alternatives, still consider how much it costs to pay you to develop the failover system, and to support/maintain/upgrade it for the rest of its life; couple that with all the other work you won't get done while you're writing or fixing the failover system. Often this number is an order of magnitude more than buying something "off the shelf".

You also haven't addressed issues like: do all the servers live in the same building and use the same connection to the outside world? If so, no failover can protect against common issues like power or network failures.

If you do code this yourself:

• It's more common to make the servers responsible for heartbeat/monitoring. The client should know as little as possible about that.
• Use UDP for the heartbeats. You don't want to have to deal with re-establishing a TCP connection every time.
• Beware, the client may cache the IP from the name server, so when MS1 goes down, an attempt to reconnect to mathserver.example.com might try to connect to MS1 instead of MS2, MS3, etc. If you avoid this by hard-coding IPs in the client, the whole idea of a name server becomes pointless.
• This may actually be a case for a lightweight front-end server that does minimal load balancing and failover stuff. That server forwards traffic to the first available math server.
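To make the UDP heartbeat suggestion concrete, here is a rough sketch of the sending and receiving halves. The function names, the "HB" payload, and the use of `select` for the timeout are illustrative assumptions, not anything specified in this thread; treat it as a starting point to study rather than a finished design.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Open a UDP socket bound to the given local port (0 lets the OS pick one).
 * Returns the socket descriptor, or -1 on error. */
int open_heartbeat_socket(unsigned short port)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Send one heartbeat datagram ("HB", an assumed payload) to ip:port.
 * Returns 0 on success, -1 on error. */
int send_heartbeat(int fd, const char *ip, unsigned short port)
{
    static const char msg[] = "HB";
    struct sockaddr_in to;
    memset(&to, 0, sizeof to);
    to.sin_family = AF_INET;
    to.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &to.sin_addr) != 1)
        return -1;

    ssize_t n = sendto(fd, msg, sizeof msg, 0, (struct sockaddr *)&to, sizeof to);
    return n == (ssize_t)sizeof msg ? 0 : -1;
}

/* Wait up to timeout_sec seconds for a datagram on fd.
 * Returns 1 if a heartbeat arrived, 0 on timeout or if the datagram was
 * not a heartbeat, -1 on error. */
int wait_heartbeat(int fd, int timeout_sec)
{
    fd_set readable;
    FD_ZERO(&readable);
    FD_SET(fd, &readable);
    struct timeval tv = { timeout_sec, 0 };

    int ready = select(fd + 1, &readable, NULL, NULL, &tv);
    if (ready <= 0)
        return ready;               /* 0 = timeout, -1 = error */

    char buf[16];
    ssize_t n = recvfrom(fd, buf, sizeof buf, 0, NULL, NULL);
    if (n < 0)
        return -1;
    return (n >= 2 && memcmp(buf, "HB", 2) == 0) ? 1 : 0;
}
```

Because UDP is connectionless, the sender needs no setup beyond the socket itself, which is the point made above about not re-establishing a TCP connection. The backup would call `wait_heartbeat` in a loop and count consecutive timeouts.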

So, assuming you just have a chain of backups, no front end to handle the failover, and don't want any load sharing/balancing, have your heartbeat signals go like this:

MS1 --> MS2 --> MS3 --> MS4

Each machine should send a heartbeat to its backup every n seconds. You should only take over if you missed several heartbeats in a row, so e.g. 3n or 5n seconds without a heartbeat. So long as you receive regular heartbeats, don't accept connections from clients. The client will somehow need to know all possible IP addresses, and when a connection drops, simply try connecting to each one in turn.
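The "try connecting to each one in turn" step on the client side could look roughly like this. The `server_addr` struct and `connect_first_available` are hypothetical names made up for illustration, and the sketch assumes IPv4 addresses given as dotted strings.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

struct server_addr {            /* hypothetical: one entry per math server */
    const char *ip;
    unsigned short port;
};

/* Try each server in list order. Returns a connected TCP socket, or -1 if
 * no server accepted. On success, *used (if non-NULL) is set to the index
 * of the server that answered. */
int connect_first_available(const struct server_addr *list, size_t count,
                            size_t *used)
{
    for (size_t i = 0; i < count; i++) {
        struct sockaddr_in addr;
        memset(&addr, 0, sizeof addr);
        addr.sin_family = AF_INET;
        addr.sin_port = htons(list[i].port);
        if (inet_pton(AF_INET, list[i].ip, &addr.sin_addr) != 1)
            continue;                   /* skip a malformed address */

        int fd = socket(AF_INET, SOCK_STREAM, 0);
        if (fd < 0)
            return -1;

        if (connect(fd, (struct sockaddr *)&addr, sizeof addr) == 0) {
            if (used)
                *used = i;
            return fd;
        }
        close(fd);                      /* refused or unreachable: try next */
    }
    return -1;
}
```

A backup that is still receiving heartbeats is not listening, so the connection is refused quickly and the loop moves on. A machine that is powered off is different: a blocking `connect` can hang for a long time before giving up, so a real client would probably want a non-blocking connect with its own timeout.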

You'll probably want multiple threads to handle different parts (sending heartbeats, receiving heartbeats, serving answers to client), or non-blocking IO, or select/poll loops. Use some state variables to communicate between parts, and make sure to use mutexes if you're using threads. Actually, if you really expect a lot of simultaneous connections, your math server should be threaded anyway.
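As a sketch of the "state variables plus mutexes" idea, the heartbeat-receiving thread and the serving thread could share something like the structure below. All names here (`failover_state`, `failover_record_heartbeat`, `failover_should_serve`) are invented for illustration, and the takeover rule is the "several missed heartbeats in a row" rule from above, expressed as a silence threshold of `missed_limit * interval_sec` seconds.

```c
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Shared between the heartbeat-receiving thread and the serving thread. */
struct failover_state {
    pthread_mutex_t lock;
    time_t last_heartbeat;   /* when the primary was last heard from */
    int interval_sec;        /* n: seconds between heartbeats */
    int missed_limit;        /* take over after this many missed beats */
};

void failover_init(struct failover_state *s, int interval_sec,
                   int missed_limit, time_t now)
{
    pthread_mutex_init(&s->lock, NULL);
    s->last_heartbeat = now;
    s->interval_sec = interval_sec;
    s->missed_limit = missed_limit;
}

/* Called by the heartbeat thread each time a heartbeat arrives. */
void failover_record_heartbeat(struct failover_state *s, time_t now)
{
    pthread_mutex_lock(&s->lock);
    s->last_heartbeat = now;
    pthread_mutex_unlock(&s->lock);
}

/* Called by the serving thread. True once the primary has been silent for
 * at least missed_limit * interval_sec seconds. */
bool failover_should_serve(struct failover_state *s, time_t now)
{
    pthread_mutex_lock(&s->lock);
    bool silent = (now - s->last_heartbeat) >=
                  (time_t)s->interval_sec * s->missed_limit;
    pthread_mutex_unlock(&s->lock);
    return silent;
}
```

Passing `now` in as a parameter (normally `time(NULL)`) keeps the logic easy to test. Because a fresh heartbeat resets `last_heartbeat`, the backup stops serving again as soon as the primary comes back.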

EDIT
Mods: please move to appropriate forum

**newbc** · 11-22-2013

[Attached image: untitled-drawing.jpg]

this is how I see it. So I will need 3 servers?

**anduril462** · 11-23-2013

Not to be rude, but you aren't very forthcoming with information, and you don't seem to have a good grasp of what you really want, or why you think you need it. I understand that is part of the reason you are coming here for help, but I think you are putting the cart before the horse. Make sure you really need all this before you bother implementing it. You should have good reasons to justify buying extra hardware and you spending all this time making the system highly available. You refuse to provide any sense of scale though, which makes this really difficult. However, it seems pretty clear you're not designing this system for something the size of Google or Facebook, but much, much smaller. Have you considered cloud computing, that provides resources on demand and probably has built-in failover?

The picture suggests the name server is not just a name server, but also a router/gateway type thing. I'll call it the front-end (FE) server, and servers 1 and 2 the back-end (BE) servers. If you really want all client traffic to go through the FE server, then the client can be totally ignorant of how the heartbeat/monitoring system works or whether there's a heartbeat system at all. The FE server could have a light-weight daemon that simply forwards requests from the client to the appropriate BE server. IMO, that makes a nice, clean design since the BE servers just do the hard number crunching, and provide a heartbeat signal. The FE server is responsible for delegating the work to server 1 or 2, etc. You can change the routing daemon on the FE server if you need to change the behavior of any load balancing or failover. Note, you could also have the FE server start up BE 2 if BE 1 gets really busy, even if it hasn't failed.
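The delegation decision on the FE server can be very small. The sketch below assumes the FE keeps a table of BE servers in priority order along with the time each last sent a heartbeat; the `backend` struct and `pick_backend` are hypothetical names, not part of any existing tool.

```c
#include <stddef.h>
#include <time.h>

struct backend {             /* hypothetical per-server record kept by the FE */
    const char *name;
    time_t last_heartbeat;   /* when this BE server was last heard from */
};

/* Return the index of the first backend (in priority order) whose last
 * heartbeat is no older than timeout_sec, or -1 if none is alive. */
int pick_backend(const struct backend *list, size_t count,
                 time_t now, int timeout_sec)
{
    for (size_t i = 0; i < count; i++) {
        if (now - list[i].last_heartbeat <= timeout_sec)
            return (int)i;
    }
    return -1;
}
```

Since the list is scanned in priority order on every request, server 1 automatically becomes the active server again once its heartbeats resume.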

Originally Posted by newbc

this is how I see it. So I will need 3 servers?

I can't answer that. That's as much a business decision as it is technical. You're vague about some of the more critical pieces of information, thus I'm still not convinced you even need failover or high availability. What is the real cost? How much money/time/whatever will you lose if the server fails (think dollars per hour of down time)? What are the real projections for demand of your service (number of concurrent users, hardware resources)? Are your BE servers CPU intensive, disk intensive, database intensive, network intensive, all four of those, none, some mix (give a rough breakdown), or something else altogether? What are your best estimates or (better yet) empirical numbers of how often a server will fail and how long it takes to recover? You also need to consider all the other questions I asked in my first two posts. You then need to weigh the options and do a little cost-benefit analysis.

Something to think about: There is this idea of the "n + 1" model for backups. An "n + 1" model means you have one server in reserve in case another server fails. You could also run "n + 2" or "n + m". m is basically how many concurrent failures you need to be able to tolerate. In your case, if you want to run one server at a time, and have one in reserve, that's an n + 1, where n is 1.

**newbc** · 11-23-2013

Well, this is just a HW assignment, nothing big. It was on the C board because I will write it in C and execute it in a Unix shell. From my understanding I just need two servers, 1 and 2. I just need to show that if server 1 is down it will switch to server 2. That is the part I am having trouble finding code for. The client just needs to connect to the server and does not need to know anything about the heartbeat or the backup server.

**anduril462** · 11-23-2013

You should have mentioned it was homework up front. Lots of wasted time and effort on stuff that probably really doesn't apply since this is not an actual system that needs high availability. This is why I asked in every post in this thread for more complete information.

If it's homework, you shouldn't "find code" for it. That would be plagiarism, and counts as cheating at almost any school. You should write code for it. That's how you learn, which is the point of school after all.

**newbc** · 11-23-2013

Of course not, I just have not found any examples that might help me out, so I can understand what I am doing better. I just don't know what functions to use in the name server that will direct the client to either server one or server two.

**tabstop** · 11-23-2013

Originally Posted by newbc

Of course not, I just have not found any examples that might help me out, so I can understand what I am doing better. I just don't know what functions to use in the name server that will direct the client to either server one or server two.

"If" isn't a function, but....
