Thread: Reliable UDP communication for merging events in a distributed system

  1. #1
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057

    Reliable UDP communication for merging events in a distributed system

    I've just started working on a distributed multiplayer game. Having had very little experience in this area, I'm just going with designs that seem reasonable to me, which doesn't necessarily mean that they are.

    At the moment I have a history-merging system: every event that occurs which affects the state of the system is logged and added to the end of a list. When a event comes in from the network which has happened, say at some point in the past (since the network has some delay), the log is used to roll back the system, apply this new event, and then the events that were "unapplied" are reapplied. I'm not sure how efficient this will end up being, but the idea seems to work.

    I do realize that some events won't be mergeable: for example, if a later event arrives through the network, it may seem nonsensical until an earlier event arrives. My plan is to allow unmergeable events for a while, but if they stick around for too long, to abort with an error -- or eventually request a full system update from some other peer.

    Anyway, having implemented at least the framework for this, it seemed like a bit of a waste to use TCP: as you can see, the design above by its nature handles out-of-order packets just fine. So I thought I would give UDP a try.

    The trouble is that while the system can handle out-of-order packets, it can't deal with dropped ones. I need to have reliable UDP communication, so to speak. I've come up with an idea that I think will work, which seems to share some characteristics with the vector clock algorithm (Vector clock - Wikipedia, the free encyclopedia). Thoughts about its viability would be appreciated.

    Remember that I have a distributed network here, so there might be a lot of peers connected to other peers: -- but in general, each peer is only connected to a small number of peers relative to the total number. Otherwise the number of connections would quickly grow out of hand. Anyway, the idea is this:
    • Each peer (interconnected game client) will generate events, and send out these events with an increasing ID associated with each, to the peers it has a direct connection to.
    • When a peer receives a packet from a neighbour, it records that this neighbour knows this packet for sure (for this must surely be the case, if it sent it). Then any neighbours which may not know about the packet for sure are forwarded this packet as well.
    • Every once in a while (the delay is somewhat randomized to prevent network congestion), each peer sends to its neighbours the "solid history count" that it knows about for every peer in the network. This is the highest ID, for each peer, for which every packet numbered this ID or lower has been received. [I'm under the understanding that TCP uses something similar.]
    • When these solid history updates are received, if some time has passed and a peer doesn't appear to have received a packet, the packet will be resent.


    I think this will deal with the reliability issue. New packets are immediately sent on to all neighbours (this could lead to a fair amount of redundancy, but there's not much to be done about that); if an update from a neighbour reveals it hasn't received the packet after a while, it will be resent.

    Hope that made sense. Just wondering if there are any thoughts about this design. (I've started implementing it, but made the mistake the first time around of unconsciously assuming a global packet numbering. Silly me.)
    dwk

    Seek and ye shall find. quaere et invenies.

    "Simplicity does not precede complexity, but follows it." -- Alan Perlis
    "Testing can only prove the presence of bugs, not their absence." -- Edsger Dijkstra
    "The only real mistake is the one from which we learn nothing." -- John Powell


    Other boards: DaniWeb, TPS
    Unofficial Wiki FAQ: cpwiki.sf.net

    My website: http://dwks.theprogrammingsite.com/
    Projects: codeform, xuni, atlantis, nort, etc.

  2. #2
    Frequently Quite Prolix dwks's Avatar
    Join Date
    Apr 2005
    Location
    Canada
    Posts
    8,057
    Also: I do realize that I need to have a synchronized clock, or something similar, for timestamps on events from different computers to make sense. I thought I'd send a packet with a timestamp to each system, which would receive it and note how much time had passed, then send it back. By looking at the time elapsed one way and the time elapsed the other way, I should be able to figure out how much the clocks are off by (assuming approximately equal delay times both ways). Then I'd just establish a "game clock" and record how much each system's local clock is off from the game clock.

    I think this is similar to an algorithm on Wikipedia as well, but I can't remember what it's called. This one, I suppose: Berkeley algorithm - Wikipedia, the free encyclopedia
    The only difference is that I'd only need to perform this calculation once, when a peer first connects to the network (assuming that the clocks are reasonably accurate from then on), so the newly connected peer would be the "master" in this case.

    Thoughts on this idea?
    dwk

    Seek and ye shall find. quaere et invenies.

    "Simplicity does not precede complexity, but follows it." -- Alan Perlis
    "Testing can only prove the presence of bugs, not their absence." -- Edsger Dijkstra
    "The only real mistake is the one from which we learn nothing." -- John Powell


    Other boards: DaniWeb, TPS
    Unofficial Wiki FAQ: cpwiki.sf.net

    My website: http://dwks.theprogrammingsite.com/
    Projects: codeform, xuni, atlantis, nort, etc.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Replies: 2
    Last Post: 03-05-2009, 10:25 AM
  2. Replies: 4
    Last Post: 06-13-2005, 09:03 AM
  3. Replies: 3
    Last Post: 06-13-2005, 07:28 AM
  4. A Distributed System
    By Troll_King in forum A Brief History of Cprogramming.com
    Replies: 2
    Last Post: 09-13-2002, 01:01 PM