Thread: BSD mmap for shared memory - am I right?

  1. #1
    Registered User
    Join Date
    Sep 2001
    Posts
    4,912

    BSD mmap for shared memory - am I right?

    I've had to piece this together from documentation, so if you could just confirm my conclusions and correct me (if necessary), that'd be awesome. The only good tutorial I could find on the topic is at fscked.org, and it's full of "TO DO" pages.

    If you'll recall from a previous thread, I have a program in which the "server" process spawns a number of "client" threads (this is for a shell - so I'm using the term server in the same sense as X-Server). The server maintains a doubly-linked list to represent all the clients, with each client having a pointer to the node that represents it. Clients can be removed, inserted, and moved around by the server (hence the double-linked list structure).

    This wasn't working, and that's how my attention got drawn to the fact that malloc'ed memory isn't shared between processes, even if you fork after malloc'ing it (I was very surprised - is this correct?). So I looked at shared memory, and it appears I have two options: System V's IPC, and BSD's mmap. It's my understanding that mmap is the more popular of the two choices, so I've chosen it (thoughts on this?)

    Now the actual calls to mmap and munmap are what's confusing me.

    Code:
    myPointer = (myType *)mmap(0, sizeof(myType), PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS, NULL, 0);
    Where myPointer is just like the pointer returned by malloc. I'd like this to act just like malloc'ed memory, except for the fact that it is shared (meaning, I don't want to actually have files written to the disk - I just want it in RAM). Other than that, I really don't care where it is, hence the NULL and 0's. Does this call look right? Have I provided the right flags, etc...?

    Then when I call munmap, I would do this:

    Code:
    myPointer = (myPointer, sizeof(myType));
    Is THAT correct? All the man pages I found listed the first argument for both methods as being "void *start", but all the examples I found specified 0 as the first parameter for mmap. If that's not right, my question is - how does munmap know what memory to free, since I've created an arbitrary number of clients with mmap, and I just close them one by one as the user dictates.

    Again - just a confirmation / correction is all I'm looking for, but if you have any thoughts on how I could go about this a better way - I'm all ears.

  2. #2
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Is this of any help:
    http://fscked.org/writings/SHM/shm-2.html

    I'm guessing you missed the unmap part of the second code-snippet...

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  3. #3
    spurious conceit MK27
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by sean
    If you'll recall from a previous thread, I have a program in which...

    This wasn't working, and that's how my attention got drawn to the fact that malloc'ed memory isn't shared between processes, even if you fork after malloc'ing it
    HA HA! I remember that thread and actually you wrote it as if it already worked:
    Quote Originally Posted by sean, elsewhere
    They also share a linked list of structs (where each node represents a child, and contains the file pointer for the FIFO to the client, the pid, a unique name, etc....). The first node is malloc'd before the first process fork's, and it's expanded as new children are created by other children processes.
    To which I remember thinking, wow, I'll have to go back and try something like that again, because I had thought it wouldn't work last time...

    (I was very surprised - is this correct?)
    Yes, that's right, it won't work.
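
    A small sketch of why it won't work, assuming nothing beyond standard fork() semantics: after fork() the child gets its own copy of the heap, so a write in the child never reaches the parent. The function name is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the value the parent sees after the child overwrote its copy,
   or -1 if malloc or fork failed. */
static int parent_view_after_child_write(void)
{
    int *value = malloc(sizeof *value);
    if (value == NULL)
        return -1;
    *value = 1;

    pid_t pid = fork();
    if (pid < 0) {
        free(value);
        return -1;
    }
    if (pid == 0) {
        *value = 2;          /* changes only the child's private copy */
        _exit(0);
    }
    waitpid(pid, NULL, 0);   /* the child has finished writing by now */

    int seen = *value;       /* the parent's copy is untouched */
    free(value);
    return seen;
}
```

    Called from the parent, this returns 1: the same virtual address in the two processes refers to different physical pages once either side writes to it.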

    You don't appear to have gotten very far with mmap, so you might as well look at this, which is as far as I went but it is a working example. I used an on-disk file and it was accessible to another process. I haven't made the next step (IPC using mmap at both ends), but I seem to have believed then that it would work, so I would be interested to find out if you get it to do so. Really really.

    ps. why not use pthreads? That will probably be easier and more useful than mmap...
    Last edited by MK27; 03-06-2009 at 07:56 AM.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  4. #4
    Registered User
    Join Date
    Sep 2001
    Posts
    4,912
    I've considered POSIX, but it seemed so "not my style" - but it probably would be better - I'll look into it.

    I've been using ncurses (including panels) for all my i/o - is there a good POSIX equivalent that I'm completely missing, or is ncurses still the best choice?

  5. #5
    Registered User
    Join Date
    Sep 2001
    Posts
    4,912
    Come to think of it I had a better reason for using fork - I wanted every client to be in its own process so that if it locked up, the server could kill it without affecting any other client (the Google Chrome strategy...) Of course, being brand new to posix, maybe there's an equivalent, but that's why I went the fork() route.

  6. #6
    spurious conceit MK27
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by sean
    Come to think of it I had a better reason for using fork - I wanted every client to be in its own process so that if it locked up, the server could kill it without affecting any other client (the Google Chrome strategy...) Of course, being brand new to posix, maybe there's an equivalent, but that's why I went the fork() route.
    fork() is definitely more straightforward, just now you have that IPC issue. That's why I'm so curious if you can get this mmap deal to work, it would be super.

    For some reason I worry that it may not unless you actually use a disk file, because each process will probably have to have a separate map of the file (you won't be able to pass the address, same as a pointer). Which might then mean the kernel or something would restrict the file such that you can't mmap simultaneously from two separate processes. In my experiment "the other process" was just viewing the file -- so it has to work, because you could change the file as easily as you could view it from outside.

    But on a big linked list using a disk file is out of the question.
    Last edited by MK27; 03-06-2009 at 02:17 PM.

  7. #7
    Registered User
    Join Date
    Apr 2008
    Posts
    396
    I've considered POSIX, but it seemed so "not my style"
    but you're already using it: fork(), mmap(), and most of the syscalls you're probably using are part of this standard, as long as you take care with the options you use (which are sometimes specific to the o/s). Concerning the usage of mmap(), remember that you must have an existing object to map (referenced by the fd parameter). The old way involves the shmget/at/etc. routines; the portable posix ones are shm_open/etc. Once the segment is created, you can map it and use it as if it was the result of a malloc() call.

  8. #8
    Registered User
    Join Date
    Sep 2001
    Posts
    4,912
    fork(), mmap(), and most of the syscalls you're probably using are part of this standard
    Fair enough - but were they not originally part of UNIX-like systems before POSIX adopted them as their own standard?

    As for threads - I would definitely rather have each client run in its own process, so shared memory is a must.

    you must have an existing object to map (referenced by the fd parameter)
    My understanding was that you could use the MAP_ANONYMOUS option and map part of the RAM - which would be ideal, and then the fd parameter is apparently ignored. My question then is how do you reference the stored memory? And how do you free it?

    At this point, I'm not even sure if I can use mmap because I would call mmap to allocate space for the struct, fork, and then both processes would have a pointer to the memory. When I need a new client, again, I allocate new space, assign the "next" pointer in the existing struct to the new memory, and fork again, now the second client has a pointer to the second struct. The way munmap gets used in the examples, I don't know if you can call munmap on individual allocations (i.e. structs) or if munmap unmaps everything your program has mapped - because I have no pointer to a specific part of allocated memory.

    If there was a "shared" version of malloc - it'd be perfect. I'm looking at the POSIX shm methods now...


    edit: I'm looking at the "System V" methods according to the HOW-TO linked below (which I have found to be an excellent source for learning about pipes, etc...) The functions include shmget, shmat, shmdt, etc... They appear to be exactly what I'm looking for. My only concern is portability - it would've been nice to have this work on BSD. This looks strikingly similar to what I've seen presented elsewhere as "THE posix way", so.... I don't know. thoughts?

    http://tldp.org/LDP/lpg/node7.html
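
    As a rough sketch of the System V calls named above (shmget, shmat, shmdt, plus shmctl for cleanup), assuming the segment is created before fork() so the child inherits the attachment. The function name sysv_round_trip is hypothetical, and the segment size and permissions are arbitrary choices for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the value a child wrote through a System V segment, or -1. */
static int sysv_round_trip(void)
{
    /* IPC_PRIVATE: a fresh segment, reachable only through the returned id. */
    int id = shmget(IPC_PRIVATE, sizeof(int), IPC_CREAT | 0600);
    if (id == -1)
        return -1;

    int *shared = shmat(id, NULL, 0);    /* NULL: let the kernel pick the address */
    if (shared == (void *)-1) {
        shmctl(id, IPC_RMID, NULL);
        return -1;
    }
    *shared = 0;

    pid_t pid = fork();
    if (pid == 0) {
        *shared = 99;                    /* the attachment is inherited across fork() */
        _exit(0);
    }
    if (pid > 0)
        waitpid(pid, NULL, 0);

    int seen = (pid > 0) ? *shared : -1;
    shmdt(shared);                       /* detach from this process */
    shmctl(id, IPC_RMID, NULL);          /* mark the segment for removal */
    return seen;
}
```

    Unlike an anonymous mapping, a System V segment outlives the processes that use it unless it is explicitly removed with IPC_RMID, which is why the cleanup call matters here.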

  9. #9
    Registered User
    Join Date
    Apr 2008
    Posts
    396
    Fair enough - but were they not originally part of UNIX-like systems before POSIX adopted them as their own standard?
    that's the point: posix is the unix standard, it defines the api and features (some required, some optional, like threads, but there is a lot more) you can expect to find on a standard unix system (which means a lot of them now), this ranges from basic system functions to advanced realtime and distributed computing. This standardization work started at the time a bunch of unices were created from the original version, but with incompatible interfaces (google unix wars).

    My understanding was that you could use the MAP_ANONYMOUS option and map part of the RAM - which would be ideal
    I never used this hack; it is an extension offered by Linux and the BSDs rather than one of the posix shm routines. But you can achieve the same thing with a few supplementary calls (shm*) and have portable code. If that is not your concern, go ahead with MAP_ANONYMOUS.

    My question then is how do you reference the stored memory? And how do you free it?
    mmap() returns the address of the 1st mapped page and you undo that with munmap() (which deallocates all pages of that mapping). You can let the kernel decide which location is best or force one (deprecated obviously).

    At this point, I'm not even sure if I can use mmap because I would call mmap to allocate space for the struct, fork, [...] allocated memory.
    no, mmap() is not an equivalent of malloc(), it allocates pages of memory, not blocks of arbitrary sizes like malloc; those functions are not designed to do the same job. Once you have called mmap() all the pages are allocated and ready, you don't need to malloc() anything anymore, just access the whole range with pointers and store what you want in it.

    so to create shared memory, you have at least 3 options: mmap(MAP_ANONYMOUS) [an extension on Linux and the BSDs], shm_open()+mmap()/munmap() [posix], or shmget/at/dt/ctl() [old ipc]
    Last edited by root4; 03-07-2009 at 01:51 AM.

  10. #10
    Registered User
    Join Date
    Sep 2001
    Posts
    4,912
    Ah - PAGED memory - that's the concept I was missing completely. Thanks for explaining that more clearly. I was looking around and it appears that some people have problems with the ipc method because the system places a limit on how much shm you can get. So either way - I'll have a problem if my linked list gets too large. If POSIX offers better portability - I'm jumping on that train.

    What I think I'll do is track the number of structs, and allocate another page if it becomes necessary (unless this is a huge impossibility, but if I start from scratch and redesign the program, I think I can do it)

  11. #11
    Officially An Architect brewbuck
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Shared mappings are still shared after a call to fork(). Here is the proof:

    Code:
    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>
    
    #include <unistd.h>
    #include <sys/mman.h>
    #include <sys/wait.h>
    
    int main(void)
    {
        /* Anonymous mappings have no backing file, so fd is -1 */
        char *shmem = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                           MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        if(shmem == MAP_FAILED)
        {
            perror("mmap");
            return 1;
        }
        if(fork() == 0)
        {
            /* For demonstration purposes only -- this is not real synchronization */
            sleep(1);
            printf("%s\n", shmem);
            exit(0);
        }
        else
        {
            strcpy(shmem, "Hello, world!");
            wait(NULL);
        }
        return 0;
    }
    Yay! Shared memory between parent and child without silly shm regions or using threads.

    EDIT: root4 calls this a "hack," and I suppose it is if you consider all non-portable coding to be a hack. But you can achieve it while still remaining POSIX by using an actual file to back the mapping. Small price to pay, IMHO.
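
    A sketch of the file-backed variant mentioned in the EDIT, using only open-file calls plus mmap with MAP_SHARED. The function name and the /tmp template path are assumptions made for the example; the temporary file is unlinked straight away, so nothing is left on disk once the mapping goes away.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Copies into out (n bytes) the string a child wrote through a
   file-backed shared mapping. Returns 0 on success, -1 on failure. */
static int file_backed_round_trip(char *out, size_t n)
{
    char path[] = "/tmp/shm-demo-XXXXXX";    /* hypothetical location */
    int fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);              /* the file lives on until fd and mapping are gone */

    size_t len = 4096;
    if (ftruncate(fd, (off_t)len) == -1) {   /* give the file a size to map */
        close(fd);
        return -1;
    }

    char *mem = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);                 /* the mapping stays valid after close */
    if (mem == MAP_FAILED)
        return -1;

    pid_t pid = fork();
    if (pid == 0) {
        strcpy(mem, "Hello, world!");
        _exit(0);
    }
    if (pid > 0)
        waitpid(pid, NULL, 0); /* wait instead of sleeping */

    int ok = (pid > 0) ? 0 : -1;
    if (ok == 0) {
        strncpy(out, mem, n - 1);
        out[n - 1] = '\0';
    }
    munmap(mem, len);
    return ok;
}
```

    Here waitpid stands in for real synchronization: the parent only reads after the child has exited, so the write is guaranteed to be visible.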
    Last edited by brewbuck; 03-07-2009 at 02:40 PM.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  12. #12
    Registered User
    Join Date
    Sep 2001
    Posts
    4,912
    The problem I see myself having, however, is storing an expandable linked list (containing mainly file descriptors, pointers, and a pid) in that kind of structure...

    edit: appears I have to use offsets instead of the "next" and "prev" pointers... niice...

  13. #13
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Yes, pointers in shared memory are not a good idea - the memory is not always at the same virtual address in all users of the shared memory, so you need some other way to determine the address of some object - either an offset, or index, or some other "relative" measure.


  14. #14
    Registered User
    Join Date
    Sep 2001
    Posts
    4,912
    Well that should be a lot of fun! Thanks for all the responses - I guess I have my work cut out for me.

  15. #15
    Officially An Architect brewbuck
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by sean
    Well that should be a lot of fun! Thanks for all the responses - I guess I have my work cut out for me.
    You can usually control where a memory segment is mapped in your virtual address space. You can do this with mmap() or shmat(). The system should honor your request unless there is a specific reason it can't be mapped there.

    The example I gave of mapping an anonymous region then forking will guarantee that child and parent both have the same mapping at the same address.

    But matsp's point is valid -- pointers in shared memory regions are tricky at best.
