Thread: Allocating 4 MB of Memory causes Segmentation Fault

  1. #1
    Registered User TangoOversway's Avatar
    Join Date
    Mar 2013
    Posts
    20

    Allocating 4 MB of Memory causes Segmentation Fault

    I'm working with a simple program to test Crypto++ and so I can get used to using the library. It's not Crypto++ that is causing the errors, though.

    I'll include the code at the end, but the simple run-down is this:

    I load in data from a file, 1,669,823 bytes of data. I compress it using the Gzip class in Crypto++. I check the compressed length, which comes out to be 1,663,424 bytes. (I didn't expect a lot of compression in this file. It's just one of several files I'm testing.)

    After compression, I query the object I've made from the Gzip class for the length of the compressed data. Then I use this line:

    Code:
    char* zipData[zipLen];
    This works with smaller files, but not with the bigger file I'm using.

    I figure, at this point, the total memory I'm using could be approximately 3 MB. (Rounding off, 1 MB for the original data, 1 MB for the compressed data in my Gzip object, and 1 MB that I'm allocating when I get "Segmentation Fault 11" as an error.

    I'm new to C++ and haven't had to worry about memory issues since the 1980s, when I was using 6502 Assembler on an Apple //e (with a 64k address space). I've been using Perl and Java in the past 10 years, but now that I have to think about memory, even on today's systems, 3 MB sounds like a lot to me - but I don't see how this can be so much that it should crash - unless there's some rules I don't know.

    This works with a file 521k in length. I may not be freeing up memory (feel free to point that out), but this happens if the 1MB file is the first file I test, even before I'd be destroying or deleting anything.

    (There is some extra code in the main() routine, but that's just adding arguments to a linked list for some other testing I will be doing.)

    Here's the code:

    Code:
    #include <iostream>
    #include <list>
    #include <sstream>
    #include <utility>
    #include <string>
    #include "poppler/cpp/poppler-document.h"
    #include "../Crypto++/blowfish.h"
    #include "../Crypto++/cryptlib.h"
    #include "../Crypto++/files.h"
    #include "../Crypto++/hex.h"
    #include "../Crypto++/modes.h"
    #include "../Crypto++/gzip.h"
    using namespace std;
    
    
    void testZip(string fileName) {
        using namespace CryptoPP;
        cout << "Experimenting file: " << fileName << endl;
        unsigned long sourceLen;
    
    
    //Load the file into a char*
        ifstream inStream((char*) fileName.c_str(), ios::in | ios::binary);
        inStream.seekg(0, inStream.end);
        sourceLen = inStream.tellg();
        inStream.seekg(0, inStream.beg);
        char* inData = new char[sourceLen];
        inStream.read(inData, sourceLen);
        inStream.close();
    
    
        cout << "    Original length: " << sourceLen << endl;
    
    
    //Zip file
        Gzip inZip;
        inZip.SetDeflateLevel(Gzip::MAX_DEFLATE_LEVEL);
        inZip.Put((const byte*) inData, sourceLen);
        inZip.MessageEnd();
    
    
        unsigned long zipLen = inZip.MaxRetrievable();
        cout << "  Compressed length: " << zipLen << endl;
    //This next line is the one that crashes the program with a 1 MB file
        char* zipData[zipLen];
        inZip.Get((byte*) zipData, zipLen);
    
    
    //Unzip file
        Gunzip outZip;
        outZip.Put((byte*) zipData, zipLen);
        outZip.MessageEnd();
    
    
        unsigned long outLen = outZip.MaxRetrievable();
        cout << "      Output length: " << outLen << endl;
        char* outData[outLen];
        outZip.Get((byte*) outData, outLen);
        if (sourceLen == outLen) {
            cout << "   Input and output DO match in length." << endl;
            int iTest = memcmp((const byte*) inData, (const byte*) outData, sourceLen);
            cout << "   Memory comparison result: " << iTest << endl;
        } else {
            cout << "   Input and output do NOT match in length." << endl;
        }
    
    
    
    
        return;
    }
    
    
    int main(int argc, char* argv[]) {
        list<string> inFiles;
        int i;
        cout << "Command given: " << argv[0] << endl << endl;
        for (i = 1; i < argc; i++) {
            cout << "Adding argument " << i << ": " << argv[i] << endl;
            inFiles.push_back(argv[i]);
        }
        cout << "-----------------------------" << endl << endl;
        testZip(argv[1]);
    
    
        cout << "-----------------------------" << endl << endl;
    
    
        cout << endl;
    
    
        return 0;
    }

  2. #2
    Internet Superhero
    Join Date
    Sep 2006
    Location
    Denmark
    Posts
    964
    You are allocating a very large array on the stack, so it is understandable that you are running out of memory. You should allocate it on the heap using the new operator, probably along with something like a shared_ptr or scoped_ptr.

    The stack size varies from different OSes and compilers, but it's usually no more than a megabyte or two. The heap on the other hand has plenty of space (well, it might not, if you are on a 1980s machine :-) )

    Edit:

    Hmm, you are actually using new for one of the other arrays (yet you fail to delete it after use, always delete the memory that you new, or be smart and use smart pointers)

    Why use new for one array but not the other?
    Last edited by Neo1; 03-13-2013 at 07:40 PM.
    How I need a drink, alcoholic in nature, after the heavy lectures involving quantum mechanics.

  3. #3
    Master Apprentice phantomotap's Avatar
    Join Date
    Jan 2008
    Posts
    5,108
    *
    O_o

    Well, there's your problem.

    Soma

  4. #4
    Registered User TangoOversway's Avatar
    Join Date
    Mar 2013
    Posts
    20
    Neo1: Thanks - I think I remember reading that it makes a difference if I use new, but I had forgotten it (if I did read it). I guess that's just part of trying to learn a new language and absorb a lot of information.

    phantomotap: I'm not clear what you're saying. Does the asterisk mean you're quoting the last post, or is there a specific asterisk in my code that's the issue?

  5. #5
    Registered User
    Join Date
    Oct 2006
    Posts
    3,445
    Quote Originally Posted by TangoOversway View Post
    Code:
    char* zipData[zipLen];
    this is not valid C++. you cannot create dynamically sized arrays this way. you must use new.

  6. #6
    Registered User
    Join Date
    May 2010
    Posts
    4,633
    You should also think about using a size_t for your array sizes. This type is guaranteed to be able to index the largest sized object. On many systems an unsigned long may be larger than the size_t and if your "size" grows larger than what a size_t can hold you will invoke undefined behavior. You should also not try to initialize an array with a streampos type. This type can also be larger than a size_t.

    Jim
    Last edited by jimblumberg; 03-13-2013 at 11:26 PM.

  7. #7
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    > phantomotap: I'm not clear what you're saying. Does the asterisk mean you're quoting the last post, or is there a specific asterisk in my code that's the issue?
    Well why do you need as many pointers as you have chars?

    If you just wanted to make a copy of the data, you would need something like
    char zipData[zipLen];

    or more usefully
    char *zipData = new char[zipLen];
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  8. #8
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    >>ifstream inStream((char*) fileName.c_str(), ios::in | ios::binary);
    Don't cast the return. Why are you doing it anyway?

    >>char* inData = new char[sourceLen];
    Don't use new. Use std::vector. Ie: std::vector<char> InData(SourceLen);
    I fail to see why you are allocating an array of pointers, too.

    >>inStream.close();
    No need to explicitly close streams. They close themselves are they go out of scope.

    >>inZip.Put((const byte*) inData, sourceLen);
    Change to inZip.Put(InData.begin(), InData.size()). Pretty sure you don't have to cast, either. This takes into account that you made InData a vector before.

    >>char* zipData[zipLen];
    >>char* outData[outLen];
    Again, use vector of char, not char*.

    >>int iTest = memcmp((const byte*) inData, (const byte*) outData, sourceLen);
    bool Equal;
    if (InData.size() != OutData.size())
    Equal = false;
    else
    Equal = std::equal(InData.begin(), InData.end(), OutData.begin());

    >>list<string> inFiles;
    I don't think you need a linked list here, so use vector.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  9. #9
    Registered User TangoOversway's Avatar
    Join Date
    Mar 2013
    Posts
    20
    Quote Originally Posted by Elysia View Post
    >>ifstream inStream((char*) fileName.c_str(), ios::in | ios::binary);
    Don't cast the return. Why are you doing it anyway?
    I think I was experimenting - I had looked at several examples and I remember seeing one doing it. Took it out on your advice.

    >>char* inData = new char[sourceLen];
    Don't use new. Use std::vector. Ie: std::vector<char> InData(SourceLen);
    I fail to see why you are allocating an array of pointers, too.
    Can I cast a vector to const byte* ? The routine I'm calling requires a const byte* .

    I was definitely doing that wrong - which is what led to the initial question about why the program was crashing.

    >>inStream.close();
    No need to explicitly close streams. They close themselves are they go out of scope.
    Thanks - just trying to get used to closing or deleting everything.

    >>inZip.Put((const byte*) inData, sourceLen);
    Change to inZip.Put(InData.begin(), InData.size()). Pretty sure you don't have to cast, either. This takes into account that you made InData a vector before.
    >>list<string> inFiles;
    I don't think you need a linked list here, so use vector.
    Actually, I was using a linked list intentionally - so I could be sure I could use one. That was experimental and practice.


    Thanks for all the points on there. The one thing that confuses me is the use of a vector, since the routine I'm calling requires a const byte* as a parameter. Wouldn't I still have to do a cast to get that from a vector<char>? And can I cast that?

    I know I can try that, but that just gives me a quick, "It works," or, "It doesn't," and I'd like to know the background of why or why not.

  10. #10
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by TangoOversway View Post
    Can I cast a vector to const byte* ? The routine I'm calling requires a const byte* .

    I was definitely doing that wrong - which is what led to the initial question about why the program was crashing.

    ...

    Thanks for all the points on there. The one thing that confuses me is the use of a vector, since the routine I'm calling requires a const byte* as a parameter. Wouldn't I still have to do a cast to get that from a vector<char>? And can I cast that?

    I know I can try that, but that just gives me a quick, "It works," or, "It doesn't," and I'd like to know the background of why or why not.
    You can cast it (I didn't do this right earlier, btw). Taking your inZip line:

    nZip.Put(reinterpret_cast<byte*>(&*InData.begin()) , InData.size()).

    You could also use a vector of byte directly. Then all you'd have to do would be:

    nZip.Put(&*InData.begin(), InData.size()).

    You may be wondering WTH the &* is for. The begin() functions return an iterator. An iterator is not a pointer, but behaves as one. To get the actual element it points to, you can use *, just like a pointer. But we want a pointer to the element, so we take the address of the dereferences element, hence &*.

    Using vector means you don't have to worry about the memory. No new. No delete. Works with exceptions too, with no additional modification.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  11. #11
    Registered User
    Join Date
    Oct 2006
    Posts
    3,445
    Quote Originally Posted by Elysia View Post
    [code]inZip.Put(reinterpret_cast<byte*>(&*InData.begin() ), InData.size()).[code]
    I would do it more like this:
    Code:
    inZip.Put(reinterpret_cast<byte*>(InData.data()), InData.size())
    this avoids the dereference/address of, and makes it a bit more clear, I think.

    but it's more of a style thing. your way would work just as well.

  12. #12
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    When casting a pointer to the beginning of a vector, I usually write it like this:

    Code:
    std::vector<byte> myData;
    byte* ptr = &myData[0];
    So, there's yet another option. I prefer it, because it continues to be correct even if the vector is changed to something else, like std::array<>, or even a raw array.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  13. #13
    Registered User TangoOversway's Avatar
    Join Date
    Mar 2013
    Posts
    20
    Thank you, everyone, for the help. This is a great thread for me because I not only got the problem solved quickly, but got a lot of other good advice and learned some useful things.

  14. #14
    Registered User antred's Avatar
    Join Date
    Apr 2012
    Location
    Germany
    Posts
    257
    Quote Originally Posted by brewbuck View Post
    When casting a pointer to the beginning of a vector, I usually write it like this:

    Code:
    std::vector<byte> myData;
    byte* ptr = &myData[0];
    So, there's yet another option. I prefer it, because it continues to be correct even if the vector is changed to something else, like std::array<>, or even a raw array.
    I don't see any casts in that snippet.

  15. #15
    Tweaking master Aslaville's Avatar
    Join Date
    Sep 2012
    Location
    Rogueport
    Posts
    528
    Quote Originally Posted by antred View Post
    I don't see any casts in that snippet.
    Well,...I also don't

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Regarding allocating memory
    By monkey_c_monkey in forum C++ Programming
    Replies: 2
    Last Post: 08-29-2012, 10:06 PM
  2. Replies: 7
    Last Post: 11-05-2011, 05:11 PM
  3. Replies: 3
    Last Post: 03-12-2011, 08:28 PM
  4. simple shared memory throwing segmentation fault
    By kapil1089thekin in forum C Programming
    Replies: 3
    Last Post: 10-03-2010, 06:53 AM
  5. segmentation fault and memory fault
    By Unregistered in forum C Programming
    Replies: 12
    Last Post: 04-02-2002, 11:09 PM