# Monte Carlo Simulation Optimization

• 08-01-2010
Phys10
Greetings,

I am a physics graduate student working on a Monte Carlo simulation I wrote in C++. I've reached the point where my major concern is how fast the program runs, since I need to do lots of data averaging to get clean results. I'm trying to optimize for speed and have a couple of specific questions. When I began writing the simulation I had more experience with Fortran than C++, so my code is likely somewhat oddly structured for an object-oriented language.

My major question is whether it makes a significant performance difference to make a function a member of a class rather than pass the class object to the function by pointer. My program is currently structured such that I have three classes that divide up the different types of data, and I pass one or more of these class objects to global functions by pointer. These functions operate on the data, perform the major work of the program, and are called millions of times.

I've been considering reworking the whole thing, perhaps lumping all of the data structures into one class, and making all the functions member functions. This would be worth the effort if I gained a 10% performance increase, probably not worth it for a 1% increase.
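To make the comparison concrete, here is a stripped-down sketch of the two styles I'm weighing (the names and data are placeholders, not my actual code):

```cpp
#include <vector>

// Style 1: data in a class, work done by a free function taking a pointer.
struct ChainData {
    std::vector<double> positions;
};

double sum_free(const ChainData* d) {
    double s = 0.0;
    for (double x : d->positions) s += x;
    return s;
}

// Style 2: the same work done by a member function.
struct ChainData2 {
    std::vector<double> positions;
    double sum_member() const {
        double s = 0.0;
        for (double x : positions) s += x;
        return s;
    }
};
```

My understanding is that a non-virtual member function just receives the object as a hidden pointer argument, so the two versions may well compile to the same thing, but I'd appreciate confirmation.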

Other than that, I have heard that references can be optimized better by the compiler than pointers, and that arrays whose dimensions are a power of 2 can be manipulated more quickly. Any advice on either point would be appreciated.

I know which functions are slow due to their calculation burden, and it is possible that no structural change can significantly speed things up, but any advice is welcome.

Thanks.
• 08-01-2010
jacobh
Be nice to see some code.

As for arrays with power-of-2 sizes, it depends on what you mean by "manipulated more quickly". From an algorithmic perspective, a power-of-two length splits evenly at every halving step, which suits divide-and-conquer algorithms such as binary search and merge sort, with their log(n) and n*log(n) asymptotic runtimes.
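For instance, a minimal binary search sketch (made-up data, not tied to your simulation) shows where the halving happens:

```cpp
#include <cstddef>

// Binary search over a sorted array; returns the index of `key`,
// or -1 if it is absent. When n is a power of two, each halving
// step divides the remaining range exactly, which is why such
// sizes fit divide-and-conquer algorithms neatly.
int binary_search(const int* a, std::size_t n, int key) {
    std::size_t lo = 0, hi = n;
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;  // halve the range
        if (a[mid] < key)
            lo = mid + 1;
        else if (a[mid] > key)
            hi = mid;
        else
            return static_cast<int>(mid);
    }
    return -1;
}
```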

It also depends on the data structures you are using to store the data and how you are using it. If you search more often than you insert, tree structures such as binary search, 2-3, or red-black trees are better. If you insert more often, linked lists and arrays are usually best.
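As a sketch of that trade-off (assuming plain ints; your actual data will differ):

```cpp
#include <algorithm>
#include <set>
#include <vector>

// Search-heavy workload: std::set (typically a red-black tree)
// gives O(log n) lookups.
bool in_tree(const std::set<int>& s, int key) {
    return s.find(key) != s.end();
}

// Insert-heavy workload: appending to a vector is amortized O(1),
// at the cost of an O(n) linear scan when you do need to search.
bool in_vector(const std::vector<int>& v, int key) {
    return std::find(v.begin(), v.end(), key) != v.end();
}
```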

As for compiler optimization, I'm not quite sure.
• 08-01-2010
kmdv
I'm not an assembly guru, but I'll try to help (this question relates more to assembly than to C++). If you want to make it fast, maybe consider using physics libraries that have hardware support.
Right now you are optimizing things that will speed up your program only a bit, while not paying attention to the things that slow it down the most.
Also, consider that a language with a virtual machine and just-in-time compilation, like Java, can sometimes produce faster code.

Quote:

My major question is if it makes a significant performance difference to make a function a member of a class rather than pass the class object to the function by pointer.
In practice, there is absolutely no difference: a non-virtual member function is just an ordinary function that receives `this` as a hidden pointer argument. First, try changing the calling convention if possible (compiler options).

Quote:

I've been considering reworking the whole thing, perhaps lumping all of the data structures into one class, and making all the functions member functions. This would be worth the effort if I gained a 10% performance increase, probably not worth it for a 1% increase.
Do not rework then.

Quote:

Other than that I have heard that references can be better optimized by the compiler than pointers
Theoretically yes, but in most cases they compile to the same code as pointers; whether a value ends up in a temporary register depends on the compiler.
• 08-01-2010
Phys10
Thanks for your responses. kmdv, thanks for the advice about reworking it; I won't go to the trouble. I'm not aware of any libraries applicable to my situation, though perhaps there is one. We're simulating chain molecules using a coarse-grained method that is somewhat new, and the effectiveness of the method is part of the research interest. I do have linear algebra libraries for those sorts of operations. I think I'm probably too far along to change languages.

jacobh, thanks for the powers-of-2 advice. I mostly insert and remove data from an array as the chain molecule grows and shrinks, so powers of 2 might help. A fast function for advancing all data in an array by one index place could also be useful; I'll try to look one up. I'm somewhat reluctant to be too specific or post code, since the research isn't finished and also belongs to my research group.
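In case it's clearer, a sketch of the shift I mean (assuming the chain lives in a std::vector; the names are made up):

```cpp
#include <algorithm>
#include <vector>

// Shift every element one place toward the back, opening a slot at
// the front (e.g. when the chain grows by one monomer at the head).
// std::rotate moves the spare back slot to the front in one pass.
void shift_up(std::vector<double>& v, double new_front) {
    v.push_back(0.0);                              // make room at the back
    std::rotate(v.begin(), v.end() - 1, v.end());  // spare slot to the front
    v.front() = new_front;
}
```

From what I can tell, `v.insert(v.begin(), new_front)` would do the same thing in a single call; I'd welcome corrections.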

One other thing I've been considering: in some places doubles are cast to ints when I take space locations from the continuum and round to the nearest locations on a discrete grid. I have heard that this conversion can be slow, so any advice about it would be appreciated. In this particular method I must go back and forth between the continuum and the grid.
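Concretely, the conversion I mean looks something like this (the grid spacing is made up); I've read that `std::lrint` can be cheaper than casting, since a plain `static_cast<int>` truncates toward zero and needs an extra adjustment to round to nearest, but I'd welcome corrections:

```cpp
#include <cmath>

// Map a continuum coordinate x onto the nearest index of a uniform
// grid with spacing h. std::lrint rounds to nearest (in the default
// rounding mode) and often compiles to a single instruction, unlike
// the truncating static_cast<int>(x / h + 0.5) trick, which is also
// wrong for negative coordinates.
long grid_index(double x, double h) {
    return std::lrint(x / h);
}
```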

Thanks much.
• 08-01-2010
Salem
Assuming you're using GCC, the first thing you need to do is do some profiling.

GNU gprof

This will tell you where the program is spending most of the time, and thus tell you where you should start looking for improvements.

It's an iterative process:
- profile, then analyse the results
- make a code change, then retest the functionality of the program
- if the code no longer works, back out the change (or fix it)
- if the performance is worse, back out the change (or fix it)
• 08-02-2010
iMalc
Given your description of your skill level, you're at best going to make a few percent difference with a lot of effort, whilst many of us here could probably spot a few one-line changes that would double the speed. With profiling we could probably quadruple it.

I've never seen a thread on here about optimisation that was solved without the OP either becoming rather experienced with a profiler and/or posting the code here. Every thread that just talks about the theory of optimising tiny code snippets is frankly a waste of time.