I wanna share my updated project!!!
Alright, so I posted this last time, but basically I got burned badly by multithreading and by writing code that isn't cache-friendly.
So I'm trying a new strategy: I'll generate an independent tree per processor, and then I'll have to figure out how to link the independent meshes.
But this time, my code is about 4 times as fast as it used to be. For one, I finally figured out how to untangle all my print statements from my code by using a "debug" option in the .hpp file. I also switched to templates, so float, double, long double or __float128 can be selected. Any type that can't handle non-integer division (int, for example) will break the code.
There are also optional tree and mesh printing routines this time, towards the end of main.cpp.
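For anyone wondering what a debug switch plus a templated number type can look like, here's a minimal sketch. It is not the actual header: `REGULUS_DEBUG`, `DEBUG_PRINT`, `Point` and `midpoint` are names made up for illustration, and the `static_assert` is just one assumed way to catch integer types at compile time.

```cpp
#include <cassert>
#include <cstdio>
#include <type_traits>

// Hypothetical switch: compile with -DREGULUS_DEBUG=1 to turn the prints on.
#ifndef REGULUS_DEBUG
#define REGULUS_DEBUG 0
#endif

// With the switch off, every DEBUG_PRINT expands to a no-op.
#if REGULUS_DEBUG
#define DEBUG_PRINT(...) std::fprintf(stderr, __VA_ARGS__)
#else
#define DEBUG_PRINT(...) ((void)0)
#endif

// Real is meant to be float, double, long double or __float128.
template <typename Real>
struct Point {
    // Reject integer types, whose division would truncate.
    static_assert(!std::is_integral<Real>::value,
                  "Real must support non-integer division");
    Real x, y, z;
};

// Example of a routine that relies on non-integer division.
template <typename Real>
Point<Real> midpoint(const Point<Real>& a, const Point<Real>& b) {
    DEBUG_PRINT("midpoint called\n");
    const Real two = Real(2);
    return Point<Real>{(a.x + b.x) / two, (a.y + b.y) / two, (a.z + b.z) / two};
}
```

With something like this, `Point<int>` fails to compile with a readable message instead of quietly producing truncated coordinates.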
And for those who have read this far and still don't know what I'm talking about, my code is a tetrahedral mesher, meaning that it builds tetrahedra from a set of 3D points. In my case, the points form a Cartesian grid whose size is set by the user.
The code operates by inserting points into a global, all-encompassing tetrahedron. The first point fractures that tetrahedron, and each later point fractures whichever of the resulting pieces contain it. Point location is done through the use of a graph, similar in shape to a quadtree, which stores the entire insertion history of the triangulation.
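To make the idea concrete, here's a stripped-down sketch of history-based point location. It's an assumption-heavy illustration, not the actual code: it uses plain doubles, a tree instead of a full graph, and only handles a point strictly inside one tetrahedron (the 1-to-4 split). Points landing on a shared face or edge, which need several tetrahedra fractured at once, are not handled. `Vec3`, `Node`, `History`, `orient3d` and `contains` are all made-up names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

struct Vec3 { double x, y, z; };

// Determinant of (a-d, b-d, c-d): six times the signed volume of tetrahedron abcd.
double orient3d(const Vec3& a, const Vec3& b, const Vec3& c, const Vec3& d) {
    const double ax = a.x - d.x, ay = a.y - d.y, az = a.z - d.z;
    const double bx = b.x - d.x, by = b.y - d.y, bz = b.z - d.z;
    const double cx = c.x - d.x, cy = c.y - d.y, cz = c.z - d.z;
    return ax * (by * cz - bz * cy) - ay * (bx * cz - bz * cx) + az * (bx * cy - by * cx);
}

// p is inside t (or on its boundary) if swapping p in for each vertex
// never flips the sign of the signed volume.
bool contains(const std::array<Vec3, 4>& t, const Vec3& p) {
    const double total = orient3d(t[0], t[1], t[2], t[3]);
    for (std::size_t i = 0; i < 4; ++i) {
        std::array<Vec3, 4> sub = t;
        sub[i] = p;
        if (orient3d(sub[0], sub[1], sub[2], sub[3]) * total < 0.0) return false;
    }
    return true;
}

struct Node {
    std::array<Vec3, 4> v;              // vertices of this tetrahedron
    std::vector<std::size_t> children;  // empty means a live (leaf) tetrahedron
};

struct History {
    std::vector<Node> nodes;  // nodes[0] is the all-encompassing tetrahedron

    explicit History(const std::array<Vec3, 4>& root) { nodes.push_back(Node{root, {}}); }

    // Walk down the insertion history to a leaf that contains p.
    std::size_t locate(const Vec3& p) const {
        std::size_t cur = 0;
        while (!nodes[cur].children.empty()) {
            bool found = false;
            for (std::size_t child : nodes[cur].children) {
                if (contains(nodes[child].v, p)) { cur = child; found = true; break; }
            }
            if (!found) { assert(false && "point lies outside the root tetrahedron"); break; }
        }
        return cur;
    }

    // 1-to-4 split: replace each vertex of the containing leaf with p in turn.
    void insert(const Vec3& p) {
        const std::size_t leaf = locate(p);
        const std::array<Vec3, 4> v = nodes[leaf].v;  // copy: push_back may reallocate
        for (std::size_t i = 0; i < 4; ++i) {
            std::array<Vec3, 4> child = v;
            child[i] = p;
            nodes.push_back(Node{child, {}});
            nodes[leaf].children.push_back(nodes.size() - 1);
        }
    }
};
```

Old tetrahedra are never deleted here. They stay in the structure as interior nodes, which is what lets `locate` replay the insertion history from the root down.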
I haven't added Delaunay refinement yet, but as it stands now I can triangulate about 26,000 points a second, compared to something like 6,200 points a second with my old code. So that's a speed-up of roughly 4.2x.
To make it run, type something like this: ./regulus -np 1 -bl 4
-np 1 is the number of processors, set to 1 here. Using any other number will fail the assertion I put in the main loop, but I do plan on actually launching some threads pretty soon.
-bl 4 is the length of the Cartesian box, so 4 means a 4x4x4 cube of points with integer positions ranging from (0, 0, 0) to (3, 3, 3), for a total of 64 points.
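In case the box length is unclear, here's roughly what generating that point set looks like. This is just a sketch under my own naming (`GridPoint` and `make_cartesian_points` are hypothetical), not the routine in the posted code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct GridPoint { double x, y, z; };

// Builds box_length^3 points with integer coordinates in [0, box_length - 1].
std::vector<GridPoint> make_cartesian_points(int box_length) {
    assert(box_length > 0);
    const std::size_t n = static_cast<std::size_t>(box_length);
    std::vector<GridPoint> points;
    points.reserve(n * n * n);
    for (int z = 0; z < box_length; ++z) {
        for (int y = 0; y < box_length; ++y) {
            for (int x = 0; x < box_length; ++x) {
                points.push_back(GridPoint{static_cast<double>(x),
                                           static_cast<double>(y),
                                           static_cast<double>(z)});
            }
        }
    }
    return points;
}
```

So `make_cartesian_points(4)` gives the 64 points from (0, 0, 0) to (3, 3, 3).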
I'll be posting all the code below. Please give it some feedback if you care to take a look and compile/run it.