When to inline your *tors

**Angus** · 10-27-2008

I'm interested in discussion on when *tors should be inline and when they should be defined in their cpp. I've been trying to conceive of when it would be beneficial to relieve the instance owner of an unnecessary call to a *tor and I'm about ready to hear some feedback from some other programmers.
The bottom-line is basically the same for any consideration of inlining or not: will the overhead of a call to a non-inline method be optimal or not. But *tors have a lot of hidden code, which makes a special discussion worthwhile.
Before I make my arguments I should point out that my choice of words will be overly general, and there are some situations in which the cases I present don't apply:

Sometimes *tors have much more explicit roles, such as allocating/deallocating memory, opening/closing files, and whatever. In those situations the question of inlining becomes more obvious, but for the sake of argument, I'm considering only those situations where the executable code is more hidden within the source code, so assume there's nothing between the braces in the *tors.
If the class members, or the parent classes of the classes in question, do not follow these rules I have set, then that will ruin the benefit of inlining. So in the following cases, let's assume that all of the classes whose definitions are hidden also followed these principles of inlining.
Virtual methods make for a very confusing cocktail of hidden code. First of all, ctors initialize virtual pointers to refer to the virtual tables, and I'm not ready to tackle those. What's more, if a parent's method is virtual (such as a dtor) then the overriding child methods are automatically virtual, whether or not they use the virtual keyword. Finally, virtual methods generally can't be inlined, at least not when they are called through a pointer or reference. You can declare them inline, but such calls won't be inlined, and you'll just waste compiler cycles. So let's assume for simplicity that none of these classes have any virtual methods.
If the members need to be initialized to default values, it might be worthwhile to define the ctor in a cpp, relieving the burden on the instance owners to do so.

Case 1: a class with only POD members

Code:

class pod {
    int i;
    float f;
    short s;
    char c[11];
};

It might be considered good coding practise to define your *tors in the cpp here, but it would otherwise be a complete waste of cycles and space. The space for all these members will be (de)allocated by the owner on the stack, if the instance is allocated on the stack, or contiguously on the heap. In either case, the instance owner will have to (de)allocate the memory anyway, and a call to a non-inline *tor would be a wasted call. Even if the members are to be initialized with values from the instance owners like so:

Code:

pod _pod(5, 1.5, 14, "fred");

the instance owner would still have to marshal the data onto the stack (or registers, depending on the calling convention) and then the ctor would have to go about extracting the data and assigning the members itself. It would still be faster (and require no more code) to have the instance owners assign the members directly.
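
To make the comparison in Case 1 concrete, here is a minimal sketch of pod with a ctor defined inside the class body, which makes it implicitly inline. The ctor, its parameter names, and the getters are assumptions added for illustration; the class as posted declares no ctor at all.

```cpp
#include <cstring>

// Hypothetical variant of the `pod` class from Case 1. The ctor is defined
// in the class body, so it is implicitly inline and visible to every owner.
class pod {
public:
    pod(int i_, float f_, short s_, const char* c_) : i(i_), f(f_), s(s_) {
        // Copy at most 10 characters and always null-terminate the buffer.
        std::strncpy(c, c_, sizeof(c) - 1);
        c[sizeof(c) - 1] = '\0';
    }

    // Accessors added only so the stored values can be inspected.
    int get_i() const { return i; }
    float get_f() const { return f; }
    short get_s() const { return s; }
    const char* get_c() const { return c; }

private:
    int i;
    float f;
    short s;
    char c[11];
};
```

With this definition, a line such as pod _pod(5, 1.5f, 14, "fred"); gives the compiler everything it needs to store the values straight into the instance, though whether it actually does so is still up to the compiler.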

Case 2: The class has exactly 1 non-POD member

Code:

class more_complex {
    int i;
    char c[11];
    a_class_wout_inline_star_tors m;
};

In this case it would still be worthwhile to leave your *tors inline. True, not only will the instance owners have to (de)allocate for i, c, and whatever m needs, but they will have to call the *tors for m. So what good would it do if instead of calling m's *tors directly, the instance owner calls *tors for more_complex, which then makes the calls to m's *tors?

Case 3: the class has more than 1 non-POD member

Code:

class still_more_complex {
    int i;
    char c[11];
    a_class_wout_inline_star_tors m;
    another_class_wout_inline_star_tors n;
};

Here it might not be worthwhile to have inline *tors. If you try to inline them, the instance owners will just have to call m's *tors and then n's *tors.
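
A rough sketch of what Case 3 amounts to: the two member classes get out-of-class *tors (as they would in their own cpp files), and still_more_complex declares none, so the compiler generates implicit inline ones that call the members' *tors in turn. The construction_log helper and the "A"/"B" labels are hypothetical, added only to make the call order observable.

```cpp
#include <string>

// Hypothetical helper: records the order in which *tors run.
inline std::string& construction_log() {
    static std::string log;
    return log;
}

class a_class_wout_inline_star_tors {
public:
    a_class_wout_inline_star_tors();   // declared here, defined out of class
    ~a_class_wout_inline_star_tors();
};

class another_class_wout_inline_star_tors {
public:
    another_class_wout_inline_star_tors();
    ~another_class_wout_inline_star_tors();
};

// These definitions stand in for what would live in the classes' .cpp files.
a_class_wout_inline_star_tors::a_class_wout_inline_star_tors() { construction_log() += "A+"; }
a_class_wout_inline_star_tors::~a_class_wout_inline_star_tors() { construction_log() += "A-"; }
another_class_wout_inline_star_tors::another_class_wout_inline_star_tors() { construction_log() += "B+"; }
another_class_wout_inline_star_tors::~another_class_wout_inline_star_tors() { construction_log() += "B-"; }

// No *tors declared: the implicit ones construct m then n, and destroy
// them in reverse order.
class still_more_complex {
    int i;
    char c[11];
    a_class_wout_inline_star_tors m;
    another_class_wout_inline_star_tors n;
};
```

Creating and destroying one still_more_complex therefore costs the owner two ctor calls and two dtor calls if the outer *tors are inlined, versus one call each (which then makes the same inner calls) if they are not.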

Case 4: the class is a child class

The way I see it, you can treat each parent like a class member. So if you have only 1 parent, it's as if you have only 1 non-POD class member, and if you have multiple inheritance, it's as if you have multiple non-POD members.

So am I right or am I wrong?

**anon** · 10-27-2008

I don't get a lot of your argumentation. Are you suggesting it is OK to leave some members uninitialized and what does it have to do with inlining?

If I'm not mistaken, some compilers might be able to inline code in cpp files, whereas they are free not to inline code in the header files even if you use the inline keyword if they so please.

**Angus** · 10-27-2008

Originally Posted by anon

I don't get a lot of your argumentation. Are you suggesting it is OK to leave some members uninitialized and what does it have to do with inlining?

If I'm not mistaken, some compilers might be able to inline code in cpp files, whereas they are free not to inline code in the header files even if you use the inline keyword if they so please.

This is not what I'm saying at all. You must always initialize members when necessary. However, when that initialization is necessary, it is usually worthwhile to define your ctor in your cpp, if you are initializing with default values. If you are initializing w/values passed from the instance owner, you might be best off inlining.

Elysia: I don't think the optimizer can make a decision about inlining, since the optimality of such a move is not apparent until link time. The decision to inline must always be done at programming time.

**Elysia** · 10-27-2008

Originally Posted by Angus

Elysia: I don't think the optimizer can make a decision about inlining, since the optimality of such a move is not apparent until link time. The decision to inline must always be done at programming time.

No. Certain compilers have link time optimizations. Visual C++ comes to mind.
But besides that, inlining is only a suggestion to a compiler. It does not guarantee it will inline the function.
So what does that mean?
Even if you mark it as inline, the compiler might still not do it because it is not until link time, as you say, that it can be determined whether or not it might make sense to inline it.

So let the compiler do the work.

**Angus** · 10-27-2008

Originally Posted by Elysia

No. Certain compilers have link time optimizations. Visual C++ comes to mind.
But besides that, inlining is only a suggestion to a compiler. It does not guarantee it will inline the function.
So what does that mean?
Even if you mark it as inline, the compiler might still not do it because it is not until link time, as you say, that it can be determined whether or not it might make sense to inline it.

So let the compiler do the work.

Quite. Visual C++ does do some oddball optimizations. I'd rather that more compilers would do them, but until they do I only take into consideration general optimization principles. I meant to put in my OP that one can presume too much about an optimizer. When I consider inlining, I assume that everything I specify as inline will be inline (unless virtual) and everything not inline won't be inline.

**Elysia** · 10-27-2008

Originally Posted by Angus

When I consider inlining, I assume that everything I specify as inline will be inline (unless virtual) and everything not inline won't be inline.

But that is just the thing - you cannot control what to inline and not, because a compiler is not required to listen to your inline. It is more of a suggestion, and the compiler probably does it better anyway.

But regardless of that - I do not find it a good thing to just inline something because you think it might be a good idea. Instead, you should find out bottlenecks with a profiler and inline those functions instead.
And a good compiler will go a long way towards optimizing that.
There is, after all, such a thing called premature optimization.
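
On the point that inline is only a suggestion: most compilers do offer non-standard ways to push harder in either direction. The sketch below assumes GCC/Clang's always_inline and noinline attributes and MSVC's __forceinline and __declspec(noinline); the FORCE_INLINE and NEVER_INLINE macro names and the counter class are hypothetical, and even these extensions are strong requests rather than absolute guarantees.

```cpp
// Compiler-specific spellings; these are extensions, not standard C++.
#if defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#  define NEVER_INLINE __declspec(noinline)
#elif defined(__GNUC__) || defined(__clang__)
#  define FORCE_INLINE inline __attribute__((always_inline))
#  define NEVER_INLINE __attribute__((noinline))
#else
#  define FORCE_INLINE inline   // fall back to the plain hint
#  define NEVER_INLINE          // no portable way to forbid inlining
#endif

// Hypothetical class used only to show where the macros go.
class counter {
public:
    FORCE_INLINE counter() : value(0) {}   // ask for the ctor to be expanded in place
    NEVER_INLINE void bump();              // ask for a real call every time
    int get() const { return value; }

private:
    int value;
};

void counter::bump() { ++value; }
```

The behaviour of the program is the same either way; only the generated code differs, which is why a profiler remains the way to tell whether any of this matters.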

**matsp** · 10-27-2008

Originally Posted by Angus

Elysia: I don't think the optimizer can make a decision about inlining, since the optimality of such a move is not apparent until link time. The decision to inline must always be done at programming time.

I'm not Elysia, but I think I'm qualified to answer the question. In most cases, the compiler proper (that is, the COMPILER, not the LINKER) makes the decision about inlining. The decision to inline a function is based on:
1. Size of the inlined code vs. the size of a call to the relevant function. If it's smaller to inline the function than the call, there's an obvious case of benefit.
2. In case where the function inlined gives more code than the call, the number of times it needs to be inlined is taken into account (if a function is only ever called ONCE, there's no drawback to inline it).
3. Visibility of the function. If the compiler doesn't know the source of the function, it can not inline it.

There are compilers that, at least under some circumstances, make the decision whether to inline or not at a "linker" stage. It is usually not the final linking stage which produces the executable file, but a "half-compiled" stage, where the source code has been translated into some sort of pseudo-machine-code, and the final machine-code phase is done AFTER this step.

Whether you should or shouldn't inline constructors or destructors is one of those questions to which it is very hard to give one "always right" answer. If we let the compiler do the job [and want more than a few compilers to be able to inline it appropriately], there is only one drawback, and that is that the constructor must be in a header file (or in the same .cpp file where it is being used).

The other solution would be to benchmark the code, and then use the results of, for example, a profiler to determine whether the calls to particular constructors are actually affecting the overall performance of the software. Beware that inlined code is hard to identify when profiling, since it tends to be spread over many different locations. If a piece of code is inlined and causes a slow-down because of code bloat, you will not detect it, because the many places where the code is bloated will not show up as ONE nice peak.
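
As a very crude stand-in for the benchmarking step, a timing loop along these lines could be built once with the constructor in the header and once with it in a .cpp, and the two numbers compared. The point struct and time_constructions function are hypothetical names, the sketch assumes a C++11 compiler for <chrono>, and a real profiler will tell you far more than this does.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical class whose constructor cost we want to measure.
struct point {
    int x, y, z;
    point(int x_, int y_, int z_) : x(x_), y(y_), z(z_) {}
};

// Constructs `count` objects and returns the elapsed time in nanoseconds.
// `checksum` receives a value derived from every object, which discourages
// the optimizer from discarding the loop (it may still simplify it heavily).
long long time_constructions(std::size_t count, long long& checksum) {
    using steady = std::chrono::steady_clock;
    const steady::time_point start = steady::now();
    long long sum = 0;
    for (std::size_t n = 0; n < count; ++n) {
        point p(static_cast<int>(n), 1, 2);
        sum += p.x + p.y + p.z;
    }
    const steady::time_point stop = steady::now();
    checksum = sum;
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Run it with a large count and several repetitions; a single short run mostly measures noise.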

--
Mats

**Elysia** · 10-27-2008

There would be a simple answer to this: let the compiler handle the inlining. The compiler generally knows best what to inline and not.
Compile with a good compiler with aggressive inlining settings and it will automatically inline stuff it thinks should be inlined.
Otherwise, the point of inlining is to save overhead for calling it. So that makes it a good idea to inline everything that is small and called enough times or big but called only once or twice. What the function does does not matter.

**anon** · 10-27-2008

However, when that initialization is necessary, it is usually worthwhile to define your ctor in your cpp, if you are initializing with default values. If you are initializing w/values passed from the instance owner, you might be best off inlining.

I don't see why you think so: the code is the same either way.

Code:

class X
{
    int a, b, c;
public:
    X(): a(0), b(0), c(0) {}
    X(int x, int y, int z): a(x), b(y), c(z) {}
};
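
For comparison, a sketch of the two placements being argued about, side by side. X_inline, X_outofline and sum() are hypothetical names; the out-of-class definitions stand in for what would go in the cpp, and the source text of the ctors is identical in both versions.

```cpp
// Version 1: ctors defined in the class body (implicitly inline).
class X_inline {
    int a, b, c;
public:
    X_inline() : a(0), b(0), c(0) {}
    X_inline(int x, int y, int z) : a(x), b(y), c(z) {}
    int sum() const { return a + b + c; }
};

// Version 2: the header would contain only these declarations...
class X_outofline {
    int a, b, c;
public:
    X_outofline();
    X_outofline(int x, int y, int z);
    int sum() const;
};

// ...and the cpp would contain these definitions. Owners in other
// translation units see only the declarations, so without link-time
// optimization they have to emit a call.
X_outofline::X_outofline() : a(0), b(0), c(0) {}
X_outofline::X_outofline(int x, int y, int z) : a(x), b(y), c(z) {}
int X_outofline::sum() const { return a + b + c; }
```

Both versions behave identically; the only difference is whether the owner's translation unit can see the ctor bodies.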

**Angus** · 10-27-2008

Originally Posted by anon

I don't see why you think so: the code is the same either way.

Code:

class X
{
    int a, b, c;
public:
    X(): a(0), b(0), c(0) {}
    X(int x, int y, int z): a(x), b(y), c(z) {}
};

The way I see it, in the case of X(), each instance owner would have to dedicate 3 instructions to assigning a, b and c. If this ctor was not inlined, the instance owner would just have to make 1 call to the ctor, while only having to pass it the this pointer.
In the case of X(int x, int y, int z), if it was not inlined, the instance owner would have to push all 3 values onto the stack (or into registers) and then the ctor would pop them from the stack (or take them from registers) and assign them. If X(int, int, int) was inline, then rather than have the instance owners pass all the variables, they could just assign the members a, b, and c themselves.

**whiteflags** · 10-27-2008

In your original post you don't seem to mention a case where inlining ctors and dtors is detrimental. Inlining does have drawbacks, but you can get most of the benefits by doing nothing.

I rarely if ever bother to inline constructors or destructors simply because if I wrote them, they have work to do, and the overhead from a function call is probably the last thing I worry about when I go to optimize.

**Angus** · 10-27-2008

Originally Posted by citizen

In your original post you don't seem to mention a case where inlining ctors and dtors is detrimental. Inlining does have drawbacks, but you can get most of the benefits by doing nothing.

I rarely if ever bother to inline constructors or destructors simply because if I wrote them, they have work to do, and the overhead from a function call is probably the last thing I worry about when I go to optimize.

That's because the reasons why you shouldn't use inlining for *tors are classic problems, and beyond the scope of this thread. I wanted to deal with situation where inlining is definitely worthwhile (with certain specified exceptions).

**CornedBee** · 10-27-2008

Eh? I thought LTO was still an experimental branch of the current GCC development.

**@nthony** · 10-27-2008

Isn't function inlining only enabled during optimizations (i.e. by default optimization is off)? In either case you may also be able to control the behaviour given some compile-time options:

Originally Posted by gcc

-finline-limit=n
By default, GCC limits the size of functions that can be inlined. This flag allows coarse control of this limit. n is the size of functions that can be inlined in number of pseudo instructions.

Inlining is actually controlled by a number of parameters, which may be specified individually by using --param name=value. The -finline-limit=n option sets some of these parameters as follows:

max-inline-insns-single
is set to n/2.
max-inline-insns-auto
is set to n/2.

See below for a documentation of the individual parameters controlling inlining and for the defaults of these parameters.

Note: there may be no value to -finline-limit that results in default behavior.

Note: pseudo instruction represents, in this particular context, an abstract measurement of function's size. In no way does it represent a count of assembly instructions and as such its exact meaning might change from one release to an another.

**Elysia** · 10-28-2008

Of course it is enabled with optimizations only. If you do not enable optimizations, the compiler will leave the code alone - not make any changes.
It does not make sense to have it on when optimizations are not enabled.
