Thread: Message Queue Implementation Question

  1. #1
    Registered User
    Join Date
    Oct 2011
    Posts
    2

    Message Queue Implementation Question

    Hello,

    I have to build a Message Queue System for the University, like Amazon SQS. Whe have n queues, p producers and q consumers. The idea: for example, we have 5 producers and 4 consumers for the queue 1. Messages are delivered from the queue to any consumer when they ask for them. On that moment, the message is setted as "unvisible" in order not to be transferred to other(s) consumer(s). Each consumer has x seconds to process the message. If that time is exceeded, the message is marked "visible" and re-enters to the head of the queue. If the message is processed before x seconds, it is deleted (the queue receives an ACK signal). Doing so, messages are never lost (system requirement).

    Messages are in plain text and others like pdf, xml, etc. But they are small (no longer than 100KB). MIME is suggested, and the use of meta-information. The programming language should be C, and should use UNIX domain sockets (Linux operating system). Also, there will be an administrator who will manage the queues (creation, destruction, etc), using other IPC protocols like pipes and/or signals

    My question is, what kind of implementation would you suggest for and why (linked list, circular buffer, etc)?

    Thanks,
    Gabriel


  2. #2
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    Quote Originally Posted by bolso103 View Post
    My question is, what kind of implementation would you suggest for and why (linked list, circular buffer, etc)?
    I'm betting the professor didn't specify what data structures he wanted you to use. I'm also betting he wants you to think about it and come up with a design on your own. So why don't you tell us what data structure/implementation you think would be good, and why, and we'll let you know what we think of your design. It might help if you make a list of use cases and/or what operations you need to do to the data structure, then see which data structure (or some hybrid) fits your needs. Is there a priority aspect to the messages where new ones may be inserted in the middle of your task list, or will they only be added at the end? Do you remove tasks when you set them to invisible so they actually have to "re-enter" the queue?

  3. #3
    Registered User
    Join Date
    Oct 2011
    Posts
    2
    anduril462, thank you for your prompt reply.

    I am an electronic engineering student. I have acknowledgement on PLC, PIC and assembly basically, so, our career focuses on embedded real time systems, mostly on hardware level programming.

    Now, we have a new subject, and it is about Unix systems, using IPC. The idea of the professor is simple: to beat our heads against the wall. He gives a project to the students (the message queue system in this case), and they have to first build a project design (using methods like scrum, much investigation, decisions, mechanisms, but with NOTHING of programming, just chart flows at high level). This first part should be done in between two weeks. Then, we will have a month and a half to develop the system and program in C.

    We do not know how to program in C, we do not know what is it a pipe, a socket, a circular buffer or a linked list. We have to study that. That is why I made the question, after studying, I realized that a linked list is easier to develop, but the circular buffer has a main advantage: it makes a good implementation strategy for a queue that has fixed maximum size.

    Our project will work as various processes running which send messages between them. For example:

    -producer queue1 60 hello.txt
    -consumer queue1

    The first line is a producer which sends "hello" to queue1 each 60 seconds.
    The second line is a consumer that reads the content on the tail of queue1. Each time consumer ejecutes, he tries to read the message.

    The system should allow:

    -Stop the central server
    -Add/remove queues (without stopping the service)
    -See the amount of messages on each queue
    -List of queues
    -Send messages to the queue
    -Receive messages from the queue
    -Delete messages of the queue

    A circular buffer I think will be suitable, because an acceptable approach is to fix a maximum size. But simplicity is also important, another decision would be a doubly linked list, for example.

    Another important thing the professor said, is that it would be great to keep each queue at a fixed lenght, to prevent overflow or empty queues.

    There is not any priority aspect to the messages. It is FIFO, the first message that enters the queue is the first one to be consumed (but it depends on the process time, maybe a later message could be confirmed before).

    While messages are being processed by the producer, they are "flagged" as invisibles, I think they should be copied everywhere, and once confirmed, this copy is deleted.

    What do you think?

    Regards,
    Gabriel

  4. #4
    Registered User
    Join Date
    Nov 2010
    Location
    Long Beach, CA
    Posts
    5,909
    Quote Originally Posted by bolso103 View Post
    A circular buffer I think will be suitable, because an acceptable approach is to fix a maximum size. But simplicity is also important, another decision would be a doubly linked list, for example.
    Why is it acceptable to fix a maximum size? How do you determine that size? Certainly, my smart phone and Google's servers should have different limits as to how many queues they can support, and how many messages in each queue. I don't mean that rhetorically. If you have good reasons to limit, then do, but if not, don't, or make the limits configurable.

    Circular buffers (basically arrays) are okay, but IMO, less ideal than linked lists since it requires shifting all the array contents around when you have to delete something from the middle of the list. You never process the queue backwards, you always start at the beginning and look for the first visible message. That means the only reason for a doubly linked list instead of a singly linked list is to make deleting an element from the middle easier. You can also restrict the size of a linked list by simply storing a count of messages along with the head of the list, and refusing to insert if that count exceeds the limit -- it's a fairly small amount of code. Those limits ideally would be configurable. If you do go with the circular buffer, look in to variable length arrays (VLAs), or allocate them dynamically with malloc.

    Another important thing the professor said, is that it would be great to keep each queue at a fixed length, to prevent overflow or empty queues.
    Overflow and empty queues may happen regardless of underlying data structure. If you only have one producer, producing a message every 60 seconds, and a consumer that consumes that message in 1 second (or quicker for processing a simple file), then you will constantly be emptying your queue. Conversely, if you produce messages every 1 second, but it takes a consumer 3 seconds to process the message (reasonable for large files), then you will overflow. This is definitely true for circular buffers. Theoretically linked lists can grow indefinitely, but realistically a process is limited to how many resources it can use.

    There is not any priority aspect to the messages. It is FIFO, the first message that enters the queue is the first one to be consumed (but it depends on the process time, maybe a later message could be confirmed before).
    If you go with a linked list, always keep a pointer to the last node, so you can insert at the end quickly and easily. Also, if a later message is confirmed before an earlier message, you will be deleting nodes from the middle of the queue, so this again is support for a doubly linked list. If you already have a pointer to the node (which you do, to process the message), then deleting the node involves a few simple pointer reassignments and a call or two to free().

    While messages are being processed by the producer, they are "flagged" as invisibles, I think they should be copied everywhere, and once confirmed, this copy is deleted.
    Do you mean processed by the consumer? What do you mean by copied everywhere? You only need one copy in the appropriate queue. When you are looking for the next message to produce, you just start at the beginning, and skip all the invisible ones, grabbing the first visible one. Once you get a successful reply from the consumer, you delete.

    What do you think?
    I think you're off to a decent start. It's important to develop a good, loosely-coupled API that doesn't tie you down to a certain implementation (circular buffer, singly/doubly linked list, etc). That way, you can develop your first iteration using the easy to implement doubly linked list and later change to the circular buffer if you need to, without touching the rest of the code.

  5. #5
    Registered User Codeplug's Avatar
    Join Date
    Mar 2003
    Posts
    4,981
    Does your professor expect you to implement all the low-level IPC code yourself? If not, then you may find the ØMQ library useful.
    http://zguide.zeromq.org/

    gg

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Linux Message Queue Question!!!
    By tyrelltle in forum Linux Programming
    Replies: 1
    Last Post: 07-28-2010, 10:23 PM
  2. Queue implementation in C
    By cerr in forum C Programming
    Replies: 13
    Last Post: 01-11-2010, 12:30 PM
  3. Queue implementation
    By kolliash in forum C Programming
    Replies: 1
    Last Post: 06-01-2008, 08:25 AM
  4. Queue implementation
    By Hunter2 in forum C++ Programming
    Replies: 3
    Last Post: 02-19-2005, 01:50 PM
  5. queue implementation not working
    By sballew in forum C Programming
    Replies: 2
    Last Post: 12-05-2001, 11:52 AM

Tags for this Thread