C++ executable sizes


  1. #1
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300

    C++ executable sizes

    Is it normal for an executable compiled from C++ sources, including the STL, to be substantially larger (250-1000%) than a C equivalent? I'm on Linux, of course, using gcc.

    Since I started using C++ this has been consistent, along with prolonged compile times. I dunno if it matters that much, or how it scales -- like I'm not trying to argue C++ is infeasible or anything ridiculous, just looking for an explanation.
    Last edited by MK27; 04-01-2010 at 07:12 PM.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  2. #2
    Registered User jeffcobb's Avatar
    Join Date
    Dec 2009
    Location
    Henderson, NV
    Posts
    875
    Quote Originally Posted by MK27 View Post
    Is it normal for an executable compiled from C++ sources, including the STL, to be substantially larger (250-1000%) than a C equivalent? I'm on Linux, of course, using gcc.

    Since I started using C++ this has been consistent, along with prolonged compile times. I dunno if it matters that much, or how it scales -- like I'm not trying to argue C++ is infeasible or anything ridiculous, just looking for an explanation.
    MK;

    Any template use will bloat your code. I imagine that even without templates C++ will give you a longer compile and a larger binary, but templates multiply both by a great deal if not used with caution...not so much in how any one of them is used but in how they are used across a project. Here is a Dr. Dobb's article that gives some nice ins and outs:
    http://www.drdobbs.com/cpp/184403053...PCKH4ATMY32JVN
    C/C++ Environment: GNU CC/Emacs
    Make system: CMake
    Debuggers: Valgrind/GDB

  3. #3
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by jeffcobb View Post
    Here is a Dr. Dobb's article that gives some nice ins and outs:
    http://www.drdobbs.com/cpp/184403053...PCKH4ATMY32JVN
    Well that's something to chew on. Dunno if I use templates that much, but I probably use some libraries that do.

  4. #4
    Captain Crash brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,246
    There's really no rule of thumb to gauge whether a given code size is, or is not, normal. It really comes down to examining which symbols are being produced by the compiler. If you use templates with only a small number of different instantiations, you certainly should not see code growth of 1000% or more.

    One common culprit is the exception dispatch tables which are generated for exception handling. If you declare lots of scattered objects with destructors, these exception dispatch tables can become VERY large. I've witnessed over 10x reduction in code size simply by turning exception handling off, in certain circumstances.
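
    For anyone who wants to measure this on their own toolchain, here is a minimal sketch. Tracker and scattered_objects are made-up names, not anything from the original poster's code. The idea is to compile the file to an object twice, for example with g++ -Os -c and then again with -fno-exceptions added, and compare the section sizes that objdump -h or size report. How large the difference is depends on the compiler and on the code, so treat this as an experiment rather than a benchmark.

```cpp
#include <cassert>
#include <vector>

// Hypothetical type with a non-trivial destructor. Every live Tracker must be
// destroyed if an exception unwinds through the enclosing scope, so the
// compiler emits cleanup code and unwind table entries for it.
struct Tracker {
    Tracker(std::vector<int>& log, int id) : log_(log), id_(id) {}
    ~Tracker() { log_.push_back(id_); }  // record the destruction order

    std::vector<int>& log_;
    int id_;
};

// Several objects with destructors in one scope, plus a call that may throw
// (push_back can throw std::bad_alloc when exceptions are enabled).
std::vector<int> scattered_objects() {
    std::vector<int> order;
    {
        Tracker a(order, 1);
        Tracker b(order, 2);
        Tracker c(order, 3);
        order.push_back(0);
    }  // c, b and a are destroyed here, in that order
    assert(order.size() == 4);
    return order;
}
```

    The more scopes like this a program has, the more cleanup paths the compiler has to describe, which is where the table growth mentioned above would come from.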

    Before blaming this thing or that thing, let's hear more about the code you're using.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  5. #5
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by brewbuck View Post
    Before blaming this thing or that thing, let's hear more about the code you're using.
    Well, here's a starting place...

    Code:
    #include <stdio.h>
    
    int main() {
    	char x[] = "hello world";
    	printf("%s\n",x);
    	return 0;
    }
    Code:
    #include <iostream>
    #include <string>
    
    using namespace std;
    
    int main() {
    	string x = "hello world";
    	cout << x << endl;
    	return 0;
    }
    Those are 6688 and 10221 bytes, meaning the C++ one is 153% the size of the C one. I guess I am wondering how much that can compound.

    What I have been doing the past week is gtkmm, the C++ gtk libraries, and while I'm not duplicating anything, it just seems I am getting 1/2 MB executables that are not any more complex than apps I've written in C that are <100 kB.
    Last edited by MK27; 04-01-2010 at 08:51 PM.

  6. #6
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    21,794
    Stroustrup has an answer to the FAQ Why is the code generated for the "Hello world" program ten times larger for C++ than for C? That said, as stated in this MinGW FAQ on Large executables, such "overhead is generally constant so it will not be significant in more realistic applications".
    C + C++ Compiler: MinGW port of GCC
    Version Control System: Bazaar

    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  7. #7
    Registered User
    Join Date
    Jun 2005
    Posts
    6,344
    Quote Originally Posted by jeffcobb View Post
    Any template use will bloat your code.
    Sorry, but that statement is at least ten years out of date. The effect of template usage on executable size is actually a concern related to quality of implementation of the compiler.

    For sake of discussion, let's say that a C++ source file has a .cpp extension, and the corresponding compiled object has a .obj extension.

    When templates were first specified (and this was in the late 1980s) most compilers included a copy of instantiated templates in each compiled object file. For example, if a.cpp and b.cpp each made use of a template function foo<int>(some_arguments) a copy of that instantiated function would be placed in both a.obj and b.obj. The linker would therefore leave two copies of the function in the executable. This phenomenon was the underpinning of what was called by names such as "template bloat".

    The thing is, quality of C++ compilers has increased substantially since the 1980s. Most compilers dating since the mid 1990s support some method of template instantiation that differs from that described above - most with a specific goal of eliminating or reducing "template bloat". It is actually hard to find a C++ compiler dating after 2005 that does not support such an option.

    The problem is that the methods to do this depend on the compiler. If the compiler does not support such things by default, it is necessary to dig into the compiler documentation to work out how to do it.

    The export keyword, in the C++ standard, was intended to support this type of thing. (More accurately, it was intended to allow template definitions to be separated from their declarations but, if you think about it, that would also help reduce the problems of "template bloat"). Unfortunately, compiler vendors have found it hideously difficult to support the export keyword. My understanding is that, to date, only compilers based on the EDG front end (eg Comeau C++) support it properly and all other C++ compilers (gnu, Visual C++, etc) do not.
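
    As a concrete illustration of controlling where a template gets instantiated, here is a sketch using explicit instantiation, which is a different mechanism from export. It assumes a C++11 or later compiler for the extern template line, and twice and use_twice are names invented for the example. In a real project the extern template declaration would sit in the header and the explicit instantiation definition in exactly one .cpp file; they share one file here only so that the example compiles on its own.

```cpp
#include <cassert>

// Normally in a header: the template definition.
template <typename T>
T twice(T value) {
    return value + value;
}

// Normally in the same header: an explicit instantiation declaration.
// It suppresses implicit instantiation of twice<int> in the including file.
extern template int twice<int>(int);

// Normally in exactly one .cpp file: the explicit instantiation definition.
// This is the single place where the code for twice<int> is generated.
template int twice<int>(int);

// Any caller then refers to that one instantiation.
int use_twice() {
    int result = twice(21);
    assert(result == 42);
    return result;
}
```

    Whether this makes a measurable difference to the final executable depends on how the compiler and linker already merge duplicate instantiations, so it is worth checking with the toolchain in question.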
    Right 98% of the time, and don't care about the other 3%.

  8. #8
    Registered User
    Join Date
    Jun 2008
    Posts
    62
    The difference is in the iostream library and every library that it links to. With the stdio library you are linked to a limited set of functions; with the iostream library you are linked to several other libraries (the string library, for example). All these extra libraries add to the code, and believe me, with the structure of the current std library, it adds up fast.

    That being said, past the initial bloat from iostream, files really don't get too much bigger even if you were to use every STL library. I don't think I've seen a file much bigger than about 500kb on Windows as a result of using C++ templates. You can fit a lot of code in 1kb.
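
    One way to check how much of the hello-world difference comes from iostream rather than from C++ itself is to write C++ that only touches stdio. The sketch below is an assumption-laden example, not a recommendation: Greeter is a made-up class, and std::snprintf needs a C++11 or later library. Compiling it next to an iostream version and comparing sizes should show what the library alone contributes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: an ordinary C++ class that formats its output through
// stdio only, so neither iostream nor std::string is pulled in.
class Greeter {
public:
    explicit Greeter(const char* who) : who_(who) { assert(who != 0); }

    // Writes "hello <who>" into out (at most size bytes, null-terminated)
    // and returns the length that snprintf reports.
    int format(char* out, std::size_t size) const {
        return std::snprintf(out, size, "hello %s", who_);
    }

    // Prints the greeting followed by a newline.
    void print() const { std::printf("hello %s\n", who_); }

private:
    const char* who_;
};
```

    Calling Greeter("world").print() from main gives the same output as the two programs posted earlier, which makes the three executables directly comparable.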

  9. #9
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by laserlight View Post
    That said, as stated in this MinGW FAQ on Large executables,
    That might be the ticket:
    Templates and the C++ Standard Library

    When you use template classes such as the Standard Template Library, the compiler generates code separately for each instantiation (e.g. vector<int> and vector<string>) so the total code size can increase significantly.

    [...]

    When you compile C++ code gcc's default is to include RTTI. If you don't use RTTI then you can vastly reduce the size of the created code by compiling with "-fno-rtti".
    I use vector a lot, almost every function has a few vectors, all my classes have vector members.* And maps. Lotta maps. WRT RTTI, gtkmm uses dynamic casting and so will not compile with -fno-rtti.
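
    For reference, this is the kind of code that needs RTTI. It is a minimal sketch with made-up classes (Widget, Button and Label are not gtkmm types): a dynamic_cast from a polymorphic base pointer down to a derived type looks up type information at run time, and g++ is expected to reject such a cast when -fno-rtti is given.

```cpp
#include <cassert>

// Hypothetical class hierarchy; the virtual destructor makes it polymorphic.
struct Widget {
    virtual ~Widget() {}
};

struct Button : Widget {
    Button() : clicks(0) {}
    int clicks;
};

struct Label : Widget {};

// Returns the Button behind w, or a null pointer if w is some other widget.
// The check relies on the run-time type information stored for Widget.
Button* as_button(Widget* w) {
    return dynamic_cast<Button*>(w);
}

// Counts a click only if the widget really is a Button; returns -1 otherwise.
int click(Widget* w) {
    assert(w != 0);
    Button* button = as_button(w);
    if (button == 0) {
        return -1;
    }
    return ++button->clicks;
}
```

    A library that does this internally cannot simply be built or used without RTTI, which would explain the compile failure.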

    Which raises the question: maybe I should use the base C library instead of the C++ wrappers, since it would seem they will just inherently bloat... However, thinking harder about it, the exe size is not that bad if I compare it to other apps not written by me. E.g., a 200 or 300k GUI is nothing. It's just that the "pure C" ones I was doing before were always crazily small, which I took as a sign of my streamlined approach (who needs error handling? etc.), an approach I have somewhat given up in favour of greater meticulousness. And to be honest, just creating as small an executable as possible is not a specific goal.

    So I guess those explanations are good enough.

    *see, I have gotten spoiled now. You know what they say: "If it seems too good to be true..."
    Last edited by MK27; 04-02-2010 at 07:04 AM.

  10. #10
    C++まいる!Cをこわせ! Elysia's Avatar
    Join Date
    Oct 2007
    Posts
    22,668
    You should be avoiding dynamic_cast in the first place, too, since it comes with overhead.
    But seriously, is a few 100K such a big issue? In the end, it probably won't matter.
    And if it does, you can, as previously stated, use a compressor.
    Quote Originally Posted by Adak View Post
    io.h certainly IS included in some modern compilers. It is no longer part of the standard for C, but it is nevertheless, included in the very latest Pelles C versions.
    Quote Originally Posted by Salem View Post
    You mean it's included as a crutch to help ancient programmers limp along without them having to relearn too much.

    Outside of your DOS world, your header file is meaningless.

  11. #11
    and the hat of sweating
    Join Date
    Aug 2007
    Location
    Toronto, ON
    Posts
    3,545
    Unless you're running on a machine with very limited amounts of memory, I don't think you should worry too much about the executable size. I'd care more about whether it works correctly and how fast it runs.
    "I am probably the laziest programmer on the planet, a fact with which anyone who has ever seen my code will agree." - esbo, 11/15/2008

    "the internet is a scary place to be thats why i dont use it much." - billet, 03/17/2010

  12. #12
    Registered User UltraKing227's Avatar
    Join Date
    Jan 2010
    Location
    USA, New york
    Posts
    123
    The size of a file is important. But really, a computer from 1990 can hold hundreds of 100ks! Now they can hold 100,000,000,000,000,000,000 100ks! Anyway, if the size of the executable is above three hundred MB, consider reducing the size.

  13. #13
    Captain Crash brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,246
    Quote Originally Posted by MK27 View Post
    Those are 6688 and 10221 bytes, meaning the C++ one is 153% the size of the C one. I guess I am wondering how much that can compound.
    You are seeing a fixed overhead due to linking the iostreams library. There are multiple alternatives, such as STLPort, which might be smaller. At any rate, this particular overhead would not grow with program size -- you pay for it once when you call anything inside iostreams. And it's certainly not directly related to C++ as a language -- the implementors of the library will determine how big (or small) it is.

    I assume you're on Linux. You can start digging in deeper by using objdump.

    Code:
    # Dump all section headers -- see which sections are contributing most to size
    objdump -h MyProg
    
    # Dump all symbols -- see which functions are contributing most to size
    objdump -t MyProg
    Also, for comparison, try linking the C version statically. Last time I tried this with a simple hello world program, it came out at over half a megabyte. The C runtime can suck just as much as the C++ runtime.

  14. #14
    Registered User
    Join Date
    Jun 2005
    Posts
    6,344
    Quote Originally Posted by MK27 View Post
    I use vector a lot, almost every function has a few vectors, all my classes have vector members.* And maps. Lotta maps. WRT RTTI, gtkmm uses dynamic casting and so will not compile with -fno-rtti.

    Which raises the question: maybe I should use the base C library instead of the C++ wrappers, since it would seem they will just inherently bloat... However, thinking harder about it, the exe size is not that bad if I compare it to other apps not written by me. E.g., a 200 or 300k GUI is nothing. It's just that the "pure C" ones I was doing before were always crazily small, which I took as a sign of my streamlined approach (who needs error handling? etc.), an approach I have somewhat given up in favour of greater meticulousness. And to be honest, just creating as small an executable as possible is not a specific goal.
    In the "pure C" approach you would be copying code, and having a vector_int, a vector_string, a map_string_int, a map_string_double and all sorts of related functions to work with them.

    You would still have code bloat. The only difference? In C++, the code may be written once by the programmer and the compiler implements the multiple copies. In C, the programmer replicates the code and makes type-specific changes, and the compiler does not implement the bloat. The net effect is the same, except that a compiler (usually) does things more predictably than a programmer.
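
    A small sketch of that equivalence, using made-up functions over plain arrays instead of real containers to keep it short. The two hand-written functions are what the "pure C" approach produces; the template is written once and the compiler generates one copy per element type it is actually called with.

```cpp
#include <cassert>
#include <cstddef>

// "Pure C" style: one hand-written copy per element type.
int sum_int(const int* values, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}

double sum_double(const double* values, std::size_t count) {
    double total = 0.0;
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}

// C++ style: written once by the programmer. Calling it with int and with
// double makes the compiler generate sum<int> and sum<double>, i.e. the same
// two copies as above.
template <typename T>
T sum(const T* values, std::size_t count) {
    assert(values != 0 || count == 0);
    T total = T();
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}
```

    Dumping the symbols of the compiled object (for example with objdump -t) should list a separate function for each instantiation next to the hand-written ones.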

    Back on the topic, though: there is a fixed overhead of using a given set of features in the C++ standard library, and a fixed overhead of using a given set of features in the C standard library. As brewbuck noted, the reliable size comparison is of statically linked executables (in which all functions are placed in the executable, rather than in shared libraries). The C++ standard library (eg iostream) is usually, in effect, statically linked in.

    If you care about executable size, most compilers support an "optimize for size" option. It is also a good idea to remove debugging and symbolic information (eg strip <executable> under most flavours of unix). In practice, with non-trivial programs (i.e. more substantial than "Hello World") debugging and symbolic information is one of the biggest consumers of disk space, and may be safely removed if not needed. Except, maybe, in high criticality code.
    Last edited by grumpy; 04-02-2010 at 05:05 PM.

  15. #15
    Registered User
    Join Date
    Apr 2010
    Posts
    4
    I have a skeleton GUI app written in C++ that weighs in at a mere 5K!
