Hey all,
I am working with a numerical solver for differential equations. Basically I have a lump of matter, which I chop up into billions of tiny cubes, and the algorithm acts on each one: the values in each cell are updated at every timestep from the values in the surrounding cells. It is parallelized so that I can use any number of processors I like.
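To give a feel for the update pattern, here is a toy Jacobi-style neighbor average in NumPy. This is not my actual scheme, just the simplest thing with the same "each cell reads its six face neighbors" dependence:

```python
import numpy as np

def step(u):
    """One toy timestep: replace every interior cell by the average of
    its six face neighbors.  Boundary cells are left untouched."""
    v = u.copy()
    v[1:-1, 1:-1, 1:-1] = (
        u[:-2, 1:-1, 1:-1] + u[2:, 1:-1, 1:-1] +
        u[1:-1, :-2, 1:-1] + u[1:-1, 2:, 1:-1] +
        u[1:-1, 1:-1, :-2] + u[1:-1, 1:-1, 2:]
    ) / 6.0
    return v

u = np.random.rand(8, 8, 8)  # tiny grid just for illustration
u = step(u)
```

The point is that each cell only ever needs its immediate neighbors, which is why each processor only has to exchange the faces of its block once per timestep.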
The problem is that to get the resolution I need, the program requires well over 1 TB of RAM, which is ridiculous. I would like to cut that in half, if possible.
Now, because most of the interesting stuff happens only in select areas of the material I am working with, the idea is to use a grid that is very fine where I need resolution and coarse where I don't. This is pretty well-established practice.
The problem arises with the parallelization. Right now, each processor holds the same volume of space, which amounts to about 20 million grid cells per processor. But if the resolution changes from place to place, those 20 million cells will cover different volumes of space depending on where they sit in the simulation. That means every processor is limited to the same volume as the one containing the finest resolution, which in turn means that processors in the coarse-resolution areas will hold maybe a million grid cells, and will not be working to capacity.
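To put numbers on that last point: with equal volumes, the cell count per processor scales like 1/h³ in the local grid spacing h. The spacing ratio below is made up; I just picked one that matches the 20 million vs. ~1 million figures:

```python
# Back-of-the-envelope for the load imbalance under equal-volume
# decomposition.  Cell count per processor goes as 1/h^3.
fine_cells = 20_000_000   # cells per processor in the fine region
ratio = 2.7               # coarse spacing / fine spacing (assumed, for illustration)
coarse_cells = fine_cells / ratio**3
print(round(coarse_cells))  # roughly one million
```

So a coarse-region processor ends up with ~5% of the work of a fine-region one, even though both own the same volume.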
So I would reduce my memory requirements, but the number of processors I need to do it would skyrocket, which is just as bad.
Now, I had an idea for getting around this, but I have never done parallel programming myself (the parallelization was already in place when I got this code), so I want to run it by you guys before I try it, so that I know I am not wasting my time.
I want to make the processor layout adaptive in the same way the grid is: more processors where the grid cells are finer, and fewer where they are coarser. This means, however, that a processor will have a variable number of neighbors to communicate with, depending on where it sits in the space (processors only need to pass info to their immediate neighbors once per timestep).
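Roughly what I have in mind, as a 1D toy: instead of giving every rank the same volume, cut the domain so each rank gets about the same number of cells. The slab counts and the greedy cut here are just for illustration; I know the real 3D decomposition is harder:

```python
def partition_by_cells(cells_per_slab, nprocs):
    """Cut a row of slabs into nprocs contiguous chunks with roughly
    equal total cell counts (simple greedy prefix cut)."""
    total = sum(cells_per_slab)
    target = total / nprocs
    cuts, acc = [], 0
    for i, c in enumerate(cells_per_slab):
        acc += c
        # place a cut each time we pass the next multiple of the target,
        # allowing at most nprocs - 1 cuts in total
        if len(cuts) < nprocs - 1 and acc >= (len(cuts) + 1) * target:
            cuts.append(i + 1)
    bounds = [0] + cuts + [len(cells_per_slab)]
    return [list(range(bounds[k], bounds[k + 1])) for k in range(nprocs)]

# Fine region (big counts) in the middle, coarse at the edges:
slabs = [1, 1, 1, 8, 8, 8, 8, 1, 1, 1]
print(partition_by_cells(slabs, 4))
```

With the fine region in the middle, the ranks there end up owning far fewer slabs (less volume) than the edge ranks, which is exactly the "more processors where the resolution is fine" behavior I want. It also shows the catch: chunk boundaries no longer line up, so neighbor counts vary from rank to rank.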
So: is this a common practice? Is it feasible? Anyone have simpler ideas?
I am having a hard time finding anything useful on Google because my "vocabulary" in this field is pretty limited as of yet.