Thread: Creating a memory mapped file

  1. #1
    Registered User
    Join Date
    Nov 2008
    Posts
    1

    Creating a memory mapped file

    Hi everyone!

    First of all, I am quite new to C++, especially under Linux.

    I am currently working on a small project under Linux and have to deal with a large amount of data (more than 1 GB). Numerous operations have to be performed on this data, for example sorting, searching, comparing, etc. To perform all these operations it is important that the data is in memory.

    To deal with such a large amount of data I am thinking of using a memory mapped file.
    Can anyone help me or guide me through the steps of creating a memory mapped file?

    Please help.
    Thanks in advance

  2. #2
    Officially An Architect brewbuck
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Having the file memory mapped may make it easier to access the data efficiently, but memory mapping doesn't necessarily mean that the whole file will be in memory -- it will just ACT like it is. Depending on how much physical RAM is available you may have paging.

    Anyway, the general method is to open() the file and then call mmap() on the resulting file descriptor. The manual pages for these two functions should be enough to get started.
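
    The open()/mmap() sequence described above can be sketched roughly as follows. This is only a minimal read-only example, not a definitive implementation: MappedFile, map_file_readonly and unmap_file are made-up names for illustration, and error handling is reduced to returning an empty result.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>      // open
#include <sys/mman.h>   // mmap, munmap
#include <sys/stat.h>   // fstat
#include <unistd.h>     // close

// Hypothetical helper type: a read-only view of a whole file.
struct MappedFile {
    const char* data = nullptr;
    std::size_t size = 0;
};

// Maps the whole file read-only. Returns {nullptr, 0} on failure, or if the
// file is empty (mmap rejects a zero length).
MappedFile map_file_readonly(const char* path) {
    MappedFile result;
    int fd = open(path, O_RDONLY);
    if (fd == -1) return result;

    struct stat st;
    if (fstat(fd, &st) == 0 && st.st_size > 0) {
        const std::size_t bytes = static_cast<std::size_t>(st.st_size);
        void* p = mmap(nullptr, bytes, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p != MAP_FAILED) {
            result.data = static_cast<const char*>(p);
            result.size = bytes;
        }
    }
    close(fd);  // the mapping stays valid after the descriptor is closed
    return result;
}

// Releases the mapping and resets the handle.
void unmap_file(MappedFile& f) {
    if (f.data) munmap(const_cast<char*>(f.data), f.size);
    f = MappedFile{};
}
```

    After a successful call, data[0] through data[size - 1] can be read like an ordinary array; the kernel pages the file in as those addresses are touched.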
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  3. #3
    Cat without Hat CornedBee
    Join Date
    Apr 2003
    Posts
    8,895
    Or you can use Boost.Interprocess, which makes this very easy.
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  4. #4
    Kung Fu Kitty Angus
    Join Date
    Oct 2008
    Location
    Montreal, Canada
    Posts
    115
    I don't think this is the purpose of memory-mapping. I believe that for user purposes, memory-mapping usually only becomes efficient if you have many different processes accessing the same regions of a file, and updates to that file need to be transmitted to all other processes immediately. You might be able to accomplish what you want merely by allocating memory on the heap, working with that, and writing it to disk when you are done.
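
    For comparison, the heap-based alternative suggested here might look something like the sketch below. It assumes the file is a flat array of native-endian 32-bit integers, which the original question does not state; load_records, save_records and sort_file_in_memory are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Assumption: the file is a flat array of native-endian int32 records.
// Reads the whole file into `records`; returns false on any I/O error.
bool load_records(const std::string& path, std::vector<std::int32_t>& records) {
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) return false;
    const std::streamoff bytes = in.tellg();  // opened at end, so this is the size
    if (bytes < 0) return false;
    records.resize(static_cast<std::size_t>(bytes) / sizeof(std::int32_t));
    in.seekg(0, std::ios::beg);
    in.read(reinterpret_cast<char*>(records.data()),
            static_cast<std::streamsize>(records.size() * sizeof(std::int32_t)));
    return static_cast<bool>(in);
}

// Overwrites the file with the contents of `records`.
bool save_records(const std::string& path, const std::vector<std::int32_t>& records) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out.write(reinterpret_cast<const char*>(records.data()),
              static_cast<std::streamsize>(records.size() * sizeof(std::int32_t)));
    return static_cast<bool>(out);
}

// Load everything onto the heap, work on it, write it back when done.
bool sort_file_in_memory(const std::string& path) {
    std::vector<std::int32_t> records;
    if (!load_records(path, records)) return false;  // never truncate on a failed load
    std::sort(records.begin(), records.end());
    return save_records(path, records);
}
```

    Note that this needs the whole data set to fit in the process's heap at once, so for data well over 1 GB it runs into the same address-space limits as a full mapping.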

    In any case, memory-mapping is pretty unC++. In OOP we deal less with pointers to memory, and more with objects. That doesn't mean you are onto the wrong solution, but you would have to break with the OO paradigm.

  5. #5
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    I'd also like to point out that mapping a 1GB file into memory will use up 1GB of virtual address space. This is not a problem, but if your file(s) grow to more than 3GB [1] on a 32-bit system, you will not be able to map the entire file into the address space at once. If you have a 64-bit version of Linux, this is not a problem: the limit is around 47 bits of memory address, which is 32768 times 4GB, so you would need a few hundred 500GB hard disks to hold such a file (32768 / 125, which is about 262).
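
    The arithmetic above can be checked at compile time. The sketch below assumes binary units (1GB = 2^30 bytes) and takes the 47-bit figure from the post as given; the constant names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: "GB" means 2^30 bytes, and user space is 47 bits wide as stated.
constexpr std::uint64_t GiB = 1ULL << 30;
constexpr std::uint64_t user_space_64 = 1ULL << 47;

// How many 4GB address spaces fit into 47 bits.
constexpr std::uint64_t four_gib_units = user_space_64 / (4 * GiB);

// How many whole 500GB disks a file of that size would fill (32768 / 125, rounded down).
constexpr std::uint64_t disks_500g = user_space_64 / (500 * GiB);

static_assert(four_gib_units == 32768, "2^47 / 2^32 == 2^15");
static_assert(disks_500g == 262, "131072 / 500, rounded down");
```

    With decimal disk sizes (500 * 10^9 bytes) the count comes out slightly higher, but still "a few hundred".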

    Using such a memory mapped file is faster if there are many reads and/or writes to the same area of the file, and particularly so if there are multiple processes that are using the same file (but this also requires careful locking of the data accesses).

    In previous posts, it's been shown that a straightforward copy of a mmap'd file is slower than reading and writing the file using fread/fwrite. This is of course not the same as a sorting operation, where a particular element of the file may be read multiple times. Searching unsorted data should not read any element more than once (and searching sorted data of fixed record size can be done without reading much of the data at all), and "writing it to disk when you are done" should not change that. So it seems like sorting is the only case where any of the file content is being read more than once.
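
    As an illustration of the sorting case, here is a rough sketch that sorts a file in place through a writable shared mapping. It assumes the file is a flat array of native-endian 32-bit integers (not stated in the thread), and sort_file_mapped is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fcntl.h>      // open
#include <sys/mman.h>   // mmap, msync, munmap
#include <sys/stat.h>   // fstat
#include <unistd.h>     // close

// Assumption: the file is a flat array of native-endian int32 records.
// Sorts the records in place; returns false on any failure or an empty file.
bool sort_file_mapped(const char* path) {
    int fd = open(path, O_RDWR);
    if (fd == -1) return false;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size <= 0) {
        close(fd);
        return false;
    }
    const std::size_t bytes = static_cast<std::size_t>(st.st_size);

    // MAP_SHARED so that modifications are carried through to the file.
    void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    if (p == MAP_FAILED) return false;

    std::int32_t* first = static_cast<std::int32_t*>(p);
    std::sort(first, first + bytes / sizeof(std::int32_t));

    const bool ok = msync(p, bytes, MS_SYNC) == 0;  // flush dirty pages to disk
    munmap(p, bytes);
    return ok;
}
```

    std::sort touches elements many times and in a scattered pattern, so if the file is much larger than physical RAM this will page heavily.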

    It may be better for general use to load the file content (on demand - that is, when needed) into RAM using the regular file read and write operations in C or C++.
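
    One way to read on demand with regular file operations is to fetch only the records you need with pread(). The sketch below is an assumption-laden example rather than a recommendation: it presumes fixed-size native-endian 32-bit records, and read_records is a made-up name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fcntl.h>    // open
#include <unistd.h>   // pread, close
#include <vector>

// Assumption: the file is a flat array of native-endian int32 records.
// Reads `count` records starting at record index `first` into `out`.
// Returns false on an I/O error or if the file ends before `count` records.
bool read_records(int fd, std::size_t first, std::size_t count,
                  std::vector<std::int32_t>& out) {
    out.resize(count);
    char* dst = reinterpret_cast<char*>(out.data());
    const std::size_t want = count * sizeof(std::int32_t);
    const off_t start = static_cast<off_t>(first * sizeof(std::int32_t));

    std::size_t done = 0;
    while (done < want) {
        // pread takes an explicit offset, so no separate seek is needed.
        const ssize_t n = pread(fd, dst + done, want - done,
                                start + static_cast<off_t>(done));
        if (n <= 0) return false;  // error or unexpected end of file
        done += static_cast<std::size_t>(n);
    }
    return true;
}
```

    Only the requested range ends up in the process's memory, so this works on a 32-bit system even when the file is larger than the address space (provided large file support is enabled).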

    [1] Actually, 3GB minus a couple of hundred megabytes or so, as Linux reserves the first 128MB to catch null pointers, and the very top of the memory is used as stack-space, and some other areas of the user memory may also be taken by various other components. So the actual size that you could possibly map is somewhat less than 3GB, but higher than 2GB. Exactly where in between those two numbers you'd end up depends on many factors, and it would be difficult to enumerate ALL of those, and under which conditions they are relevant.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  6. #6
    Cat without Hat CornedBee
    Join Date
    Apr 2003
    Posts
    8,895
    Quote:
    I'd like to also point out that mapping a 1GB file into memory will use up 1GB of virtual address space.
    Note that you can partially map files if your use cases let you get away with it.
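
    A partial mapping could be sketched like this. The file offset passed to mmap() must be a multiple of the page size, so the helper rounds the requested offset down and remembers the difference. Window, map_window and unmap_window are hypothetical names, and the caller is assumed to keep offset + length within the file and length greater than zero.

```cpp
#include <cassert>
#include <cstddef>
#include <fcntl.h>      // open
#include <sys/mman.h>   // mmap, munmap
#include <unistd.h>     // sysconf, close

// Hypothetical handle for a mapped slice of a file.
struct Window {
    void* base = nullptr;        // page-aligned start of the mapping (for munmap)
    std::size_t mapped = 0;      // number of bytes actually mapped
    const char* data = nullptr;  // points at the requested offset
};

// Maps `length` bytes of the file starting at byte `offset`, read-only.
// Returns an empty Window on failure.
Window map_window(int fd, off_t offset, std::size_t length) {
    Window w;
    const off_t page = static_cast<off_t>(sysconf(_SC_PAGESIZE));
    const off_t aligned = offset - (offset % page);  // round down to a page boundary
    const std::size_t slack = static_cast<std::size_t>(offset - aligned);

    void* p = mmap(nullptr, length + slack, PROT_READ, MAP_PRIVATE, fd, aligned);
    if (p == MAP_FAILED) return w;

    w.base = p;
    w.mapped = length + slack;
    w.data = static_cast<const char*>(p) + slack;
    return w;
}

// Releases the mapping and resets the handle.
void unmap_window(Window& w) {
    if (w.base) munmap(w.base, w.mapped);
    w = Window{};
}
```

    Only length plus at most one page of slack is taken from the virtual address space, so a window can be moved across a file that is far too large to map whole.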

  7. #7
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by CornedBee
    Note that you can partially map files if your use cases let you get away with it.
    Yes, of course. But in this case, it appears that the user actually WANTS to use ALL of the file at once, which means that it will use the size of the file's worth of virtual space.

