# Thread: I wanna share my updated project!!!

1. ## I wanna share my updated project!!!

Alright, so I did this last time but basically, I got owned in the face by multithreading and writing code that isn't cache friendly.

So I attempted a new strategy in which I'll just generate an independent tree per processor and then I guess I'll have to figure out how to link the independent meshes.

But this time, my code is like 3 - 4 times as fast as it used to be. For one, I finally figured out how to untangle all my print statements out of my code by using a "debug" option in the .hpp file. And I also switched to using templates so this way the option of float, double, long double and __float128 can be selected. Anything that can't handle non-integer division will break the code.

There are also optional tree and mesh printing routines this time as well towards the end of main.cpp.

And for those that have read this far and still don't know what I'm talking about, my code is a tetrahedral mesh code meaning that it draws tetrahedra from a set of 3D points. In my case, a Cartesian distribution of points, set by the user.

The code operates by inserting points into a global all-encompassing tetrahedron which fracture it and then any other subsequent fractures that contain the new point to be inserted. Point location is done through the use of a graph, similar in shape to a quadtree, which stores the entire insertion history of the triangulation.

I haven't added Delaunay refinement yet but legit though, as it stands now I can triangulate like 26,000 points a second compared to my old code which is something like 6200 points a second. So that's technically something like a speed-up of 4.2x.

To make it run, type something like this : ./regulus -np 1 -bl 4

-np 1 is the number of processors, set to be 1. Using any other number will fail the assertion I put in the main loop but I do plan on actually launching some threads pretty soon now.

-bl 4 is the length of the Cartesian box so 4 means a 4x4x4 cube of points with integer positions ranging from (0, 0, 0) to (3, 3, 3) for a total of 64 points.

I'll be posting all the code below. Please give it some feedback if you care to take a look and compile/run it.

2. main.cpp :

Code:
```#include "tree.cpp"

/* Few global variable definitions */

unsigned int box_length = 0;

int main(int argc, char** argv) {

/* Simple user interface approach */

if (io(argc, argv)) {

std::cout << "Incorrect format.\n";
std::cout << "Use the form :: ./regulus -np X -bl Y\n";
std::cout << "X is the number of processors (larger than 0)\n";
std::cout << "Y is the length of the box (larger than 0)\n";

return 1;
}

/* Allocate a set number of points for a processor, using a Cartesian distribution */

unsigned int particle_count = pow(box_length, 3);
unsigned int m = particle_count % num_threads;
unsigned int ppp = (particle_count - m) / num_threads;

std::vector<buffer<U>> data_buffer;
for (unsigned int i = 0; i < num_threads; ++i)
data_buffer.emplace_back((ppp));

{

unsigned int tet_len = 3*(box_length - 1);

U tl = static_cast<U>(tet_len);
U zero = static_cast<U>(0);

alloc_point(data_buffer[0], zero, zero, zero);
alloc_point(data_buffer[0], tl, zero, zero);
alloc_point(data_buffer[0], zero, tl, zero);
alloc_point(data_buffer[0], zero, zero, tl);
}

for (unsigned int i=0; i < particle_count; ++i) {

U z = i % box_length;
U y = (i/box_length) % box_length;
U x = i/(box_length*box_length);

alloc_point(data_buffer[0], x , y, z);
}
/*
point<U>* p = reinterpret_cast<point<U>*>(data_buffer[0].point_buffer.data());
print_point(p);
for (unsigned int i = 0; i < (3 + particle_count); ++i) {
p += 1;
print_point(p);
}
*/
/* Allocate a root tetrahedron and begin insertion procedure */

{

point<U>* r0 = reinterpret_cast<point<U>*>(data_buffer[0].point_buffer.data());
point<U>* r1 = r0 + 1;
point<U>* r2 = r1 + 1;
point<U>* r3 = r2 + 1;

alloc_tetra(data_buffer[0], r0, r1, r2, r3);

}

tetra<U>* root_tetra = reinterpret_cast<tetra<U>*>(data_buffer[0].tetra_buffer.data());
print_tetra(root_tetra, "This is the root :");

{

unsigned int inside = 0, surface = 0, outside = 0;
point<U>* test = reinterpret_cast<point<U>*>(data_buffer[0].point_buffer.data()) + 4;

for (unsigned int i = 0; i < ppp; ++i, ++test) {

//std::cout << i << " : "; print_point(test);

int result = inside_tetra(root_tetra, test);

if (result == 0)
++inside;
else if (result > 0)
++surface;
else
++outside;
}

std::cout << "\nThere are " << inside << " points inside the root tetrahedron" << std::endl;
std::cout << "There are " << surface << " points on the surface" << std::endl;
std::cout << "Total points contained : " << inside + surface << "\n" << std::endl;

assert(inside + surface == ppp);

std::cout << "There are " << outside << " points outside the tetrahedron\n" << std::endl;

}

alloc_tree(data_buffer[0], root_tetra);

tree<U, tetra<U>*>* root = reinterpret_cast<tree<U, tetra<U>*>*>(data_buffer[0].tree_buffer.data());

root->parents.fill(nullptr);

std::vector<tree<U, tetra<U>*>*> leaves;

std::cout << "Triangulating points...\n";

for (unsigned int i = 5; i < (ppp + 4); ++i) {

if (debug) std::cout << "\nIterative index : " << i-4 << std::endl;

point<U>* p = reinterpret_cast<point<U>*>(data_buffer[0].point_buffer.data());

search_tree(root, p+i, leaves);

fracture(data_buffer[0], leaves, p+i);

leaves.clear();
}

/* Routines to print tree/mesh data */

//print_tree(data_buffer[0]);
//print_mesh(data_buffer[0]);

/* Print various data structure information */

std::cout << "\nSize of struct point : " << sizeof(point<U>) << " bytes" << std::endl;
std::cout << "Size of struct tetra : " << sizeof(tetra<U>) << " bytes" << std::endl;
std::cout << "Size of struct tree : " << sizeof(tree<U, tetra<U>*>) << " bytes" << std::endl;
std::cout << "Size of struct buffer : " << sizeof(buffer<U>) << " bytes" << std::endl;

return 0;
}```
io.cpp :
Code:
```#include "structures.hpp"

int io(int argc, char** argv) {

if (argc == 5) {

if (strcmp(argv[1], "-np") == 0) {

int x = sscanf(argv[2], "%u", &num_threads);

if (!x)
return 1;

if (strcmp(argv[3], "-bl") == 0) {

int y = sscanf(argv[4], "%u", &box_length);

if (!y)
return 1;

} else return 1;

} else return 1;

} else return 1;

return 0;
}```

3. point.cpp :
Code:
```#include "structures.hpp"

template<typename T>
void print_point(const point<T>* p) {

std::cout << "( " << p->x << ", " << p->y << ", " << p->z << ")" << std::endl;

return;
}

template<typename T>
void alloc_point(buffer<T> & buff, T x, T y, T z) {

point<T> p(x, y, z);

memcpy(buff.point_position, &p, sizeof(point<T>));
buff.point_position += sizeof(point<T>);

return;
}```
tetra.cpp :
Code:
```#include "point.cpp"

template<typename T>
void print_tetra(const tetra<T> *t, std::string msg) {

std::cout << "\n" + msg << std::endl;

for (int i=0; i<4; i++)
std::cout << "(" << t->p[i]->x << ", " << t->p[i]->y << ", " << t->p[i]->z << ")" << std::endl;

return;
}

/* Routine used to determine a point's location relative to
a tetrahedron */

template<typename T>
int inside_tetra(tetra<T>* t, point<T>* p) {

T d[4] = {

t->a[0]*p->x + t->b[0]*p->y + t->c[0]*p->z + t->d[0],
t->a[1]*p->x + t->b[1]*p->y + t->c[1]*p->z + t->d[1],
t->a[2]*p->x + t->b[2]*p->y + t->c[2]*p->z + t->d[2],
t->a[3]*p->x + t->b[3]*p->y + t->c[3]*p->z + t->d[3],
};

if (d[0] > 0 && d[1] > 0 && d[2] > 0 && d[3] > 0)
return 0; /* Inside */

if (d[0] < 0 || d[1] < 0 || d[2] < 0 || d[3] < 0)
return -1; /* Outside */

else {

int sum = 0;

for (int i=0; i<4; i++)
if (d[i]==0) ++sum;

return sum; /* Vertex, surface or edge (3, 1, 2) */
}
}

/* Allocate a tetrahedron in the buffer */

template<typename T>
int alloc_tetra(buffer<T> & buff, point<T>* p0, point<T>* p1, point<T>* p2, point<T>* p3) {

try {

tetra<T> t(p0, p1, p2, p3);

if (debug) {

std::cout << "\n";
print_tetra(&t, "This is the proposed allocated tetrahedron");
}

memcpy(buff.tetra_position, &t, sizeof(tetra<T>));

if (debug) {

std::cout << "\n";
print_tetra(reinterpret_cast<tetra<T>*>(buff.tetra_position), "This is what was copied to the buffer :");
}

buff.tetra_position += sizeof(tetra<T>);

return 0;

} catch (std::exception & z) {

//std::cout << z.what() << std::endl;
return 1;
}

return 0;
}

template<typename T>
tetra<T>::tetra(point<T>* p0, point<T>* p1, point<T>* p2, point<T>* p3) {

{ /* Plane 012 */

T x10 = p1->x - p0->x;
T y10 = p1->y - p0->y;
T z10 = p1->z - p0->z;

T x20 = p2->x - p0->x;
T y20 = p2->y - p0->y;
T z20 = p2->z - p0->z;

T a = y10*z20 - y20*z10;
T b = -(x10*z20 - x20*z10);
T c = x10*y20 - x20*y10;

if (a==0 && b==0 && c==0)
throw co_linear();

T d = a*p0->x + b*p0->y + c*p0->z;

//printf("%.00Lf*x + %.00Lf*y + %.00Lf*z = %.00Lf\n", a, b, c, d);

d = -d;

T p = a*p3->x + b*p3->y + c*p3->z + d;

if (p > 0) {

tetra::a[0] = a;
tetra::b[0] = b;
tetra::c[0] = c;
tetra::d[0] = d;

tetra::p[0] = p0;
tetra::p[1] = p1;
tetra::p[2] = p2;
tetra::p[3] = p3;

} else if (p < 0) {

tetra::a[0] = -a;
tetra::b[0] = -b;
tetra::c[0] = -c;
tetra::d[0] = -d;

tetra::p[0] = p0;
tetra::p[1] = p2;
tetra::p[2] = p1;
tetra::p[3] = p3;

} else if (p == 0)
throw co_planar();
}

{ /* Plane 023 */

T x20 = p2->x - p0->x;
T y20 = p2->y - p0->y;
T z20 = p2->z - p0->z;

T x30 = p3->x - p0->x;
T y30 = p3->y - p0->y;
T z30 = p3->z - p0->z;

T a = y20*z30 - y30*z20;
T b = -(x20*z30 - x30*z20);
T c = x20*y30 - x30*y20;

T d = a*p0->x + b*p0->y + c*p0->z;

//printf("%.00Lf*x + %.00Lf*y + %.00Lf*z = %.00Lf\n", a, b, c, d);

d = -d;

tetra::a[1] = a;
tetra::b[1] = b;
tetra::c[1] = c;
tetra::d[1] = d;

}

{ /* Plane 031 */

T x30 = p3->x - p0->x;
T y30 = p3->y - p0->y;
T z30 = p3->z - p0->z;

T x10 = p1->x - p0->x;
T y10 = p1->y - p0->y;
T z10 = p1->z - p0->z;

T a = y30*z10 - y10*z30;
T b = -(x30*z10 - x10*z30);
T c = x30*y10 - x10*y30;

T d = a*p0->x + b*p0->y + c*p0->z;

//printf("%.00Lf*x + %.00Lf*y + %.00Lf*z = %.00Lf\n", a, b, c, d);

d = -d;

tetra::a[2] = a;
tetra::b[2] = b;
tetra::c[2] = c;
tetra::d[2] = d;

}

{ /* Plane 321 */

T x23 = p2->x - p3->x;
T y23 = p2->y - p3->y;
T z23 = p2->z - p3->z;

T x13 = p1->x - p3->x;
T y13 = p1->y - p3->y;
T z13 = p1->z - p3->z;

T a = y23*z13 - y13*z23;
T b = -(x23*z13 - x13*z23);
T c = x23*y13 - x13*y23;

T d = a*p3->x + b*p3->y + c*p3->z;

//printf("%.00Lf*x + %.00Lf*y + %.00Lf*z = %.00Lf\n", a, b, c, d);

d = -d;

tetra::a[3] = a;
tetra::b[3] = b;
tetra::c[3] = c;
tetra::d[3] = d;

}

/* Don't forget to initialize the face values */

tetra::ngb[0] = nullptr;
tetra::ngb[1] = nullptr;
tetra::ngb[2] = nullptr;
tetra::ngb[3] = nullptr;

tetra::h = nullptr;
}```

4. tree.cpp :
Code:
```#include "tetra.cpp"

template<typename T>
void print_mesh(buffer<T> & buff) {

tree<T, tetra<T>*>* printer = reinterpret_cast<tree<T, tetra<T>*>*>(buff.tree_buffer.data());

size_t distance = (buff.tree_position - buff.tree_buffer.data()) / sizeof(tree<T, tetra<T>*>);

unsigned int leaves = 0;

for (decltype(distance) i = 0; i < distance; ++i) {

if ((printer+i)->children[0] == nullptr && (printer+i)->children[1] == nullptr && (printer+i)->children[2] == nullptr && (printer+i)->children[3] == nullptr) {

++leaves;
print_tetra((printer + i)->t, "Mesh tetrahedron :");
}
}

std::cout << "\nTotal number of tetrahedra in the mesh : " << leaves << std::endl;

return;
}

template<typename T>
void print_tree(buffer<T> & buff) {

tree<T, tetra<T>*>* printer = reinterpret_cast<tree<T, tetra<T>*>*>(buff.tree_buffer.data());

size_t distance = (buff.tree_position - buff.tree_buffer.data()) / sizeof(tree<T, tetra<T>*>);

unsigned int leaves = 0;

for (decltype(distance) i = 0; i < distance; ++i) {

print_tetra((printer + i)->t, "Tree tetrahedron");

if ((printer+i)->children[0] == nullptr && (printer+i)->children[1] == nullptr && (printer+i)->children[2] == nullptr && (printer+i)->children[3] == nullptr) {

++leaves;
}
}

std::cout << "\nTotal number of leaves in the tree : " << distance << std::endl;
std::cout << "Total number of tetrahedra in the mesh : " << leaves << std::endl;

return;
}

template<typename T>
void fracture(buffer<T> & buff, std::vector<tree<T, tetra<T>*>*> & leaves, point<T>* p) {

if (debug) std::cout << "\nSize of leaf pile : " << leaves.size() << std::endl;

for (auto it = leaves.begin(); it < leaves.end(); ++it) {

tree<T, tetra<T>*>* parent = *it;

if (debug) {

print_tetra(parent->t, "Fracturing...");
}

std::array<point<T>*, 4> sample = (*it)->t->p;

point<T>* a = sample[0];
point<T>* b = sample[1];
point<T>* c = sample[2];
point<T>* d = sample[3];

std::array<std::array<point<T>*, 4> , 4> vertices = { { { a, b, c, p }, { a, c, d, p }, { a, d, b, p }, { d, c, b, p } } };

if (debug) {

std::cout << "\nThis is all the point data for a facture :\n";

for (unsigned int i = 0; i < 4; ++i) {

std::cout << "Point set index : " << i << std::endl;
print_point(vertices[i][0]);
print_point(vertices[i][1]);
print_point(vertices[i][2]);
print_point(vertices[i][3]);
std::cout << "\n";
}
}

for (unsigned int i = 0; i < 4; ++i) {

if (debug) std::cout << "Fracture index : " << i << std::endl;

if (alloc_tetra(buff, vertices[i][0], vertices[i][1], vertices[i][2], vertices[i][3]))  {
continue;
}

alloc_tree(buff, reinterpret_cast<tetra<T>*>(buff.tetra_position - sizeof(tetra<T>)));

tree<T, tetra<T>*>* frac = reinterpret_cast<tree<T, tetra<T>*>*>(buff.tree_position - sizeof(tree<T, tetra<T>*>));

frac->parents[0] = parent;
if (parent)
parent->children[i] = frac;

if (debug) {

print_tetra(frac->t, "Fractured child");
std::cout << "\n";
}
}
}

return;
}

template<typename T>
void search_tree(tree<T, tetra<T>*>* root, point<T>* p, std::vector<tree<T, tetra<T>*>*> & leaves) {

std::set<tree<T, tetra<T>*>*> visited;

std::stack<tree<T, tetra<T>*>*> traversal;
traversal.push(root);

while (!traversal.empty()) {

tree<T, tetra<T>*>* curr = traversal.top();

if (!visited.insert(curr).second) {

traversal.pop();
continue;
}

if (debug) {

print_tetra( curr->t, "Visiting...");
}

if (curr->del == 0) {

if (inside_tetra(curr->t, p) == -1) {

traversal.pop();
continue;
}

unsigned int sum = 0;

for (auto i = 0; i < curr->children.size(); ++i)
if (curr->children[i])
++sum;

if (sum == 0) {

if (debug) {

print_tetra(curr->t, "Adding this to the leaf pile...");
}

leaves.push_back(curr);
traversal.pop();
continue;
} else {

//for (auto it = curr->children.end(); it < curr->children.end(); ++it)
//traversal.push(*it);

for (auto it = curr->children.crbegin(); it != curr->children.crend(); ++it)
if (*it)
traversal.push(*it);

continue;
}
} else if (curr->del == 1) {

//for (auto it = curr->children.begin(); it < curr->children.end(); ++it)
//traversal.push(*it);

for (auto it = curr->children.crbegin(); it != curr->children.crend(); ++it)
if (*it)
traversal.push(*it);

continue;
}
}

return;
}

template<typename T>
void alloc_tree(buffer<T> & buff, tetra<T>* t) {

tree<T, tetra<T>*> leaf;

for (unsigned int i = 0; i < leaf.children.size(); ++i)
leaf.children[i] = nullptr;

leaf.t = t;

memcpy(buff.tree_position, &leaf, sizeof(tree<T, tetra<T>*>));

buff.tree_position += sizeof(tree<T, tetra<T>*>);

return;
}```
structures.hpp :
Code:
```#include <iostream>
#include <array>
#include <vector>
#include <cstring>
#include <cmath>
#include <cassert>
#include <algorithm>
#include <exception>
#include <string>
#include <set>
#include <stack>

/* Base structure for the graph */

template<typename T, class X>
struct tree {

std::array<struct tree<T, X>*, 4> parents;
std::array<struct tree<T, X>*, 4> children;

X t; // tetrahedron contained by node

unsigned int del;

tree(void) { del = 0; }
};

/* Point structure */

template<typename T>
struct point {

T x, y, z;

point(T a, T b, T c) : x(a), y(b), z(c) { };
};

/* Tetrahedral structure */

template<typename T>
struct tetra {

std::array<struct tetra<T>*, 4> ngb; // 4 neigbhours
std::array<struct point<T>*, 4> p; // 4 vertices

struct tree<T, struct tetra<T>*>* h; // host leaf

std::array<T, 4> a, b, c, d; // half-plane data

tetra(point<T>* a, point<T>* b, point<T>* c, point<T>* d);
};

/* Create a structure to write data into */

template<typename T>
struct buffer {

std::vector<char> point_buffer;
std::vector<char> tetra_buffer;
std::vector<char> tree_buffer;

char* point_position;
char* tetra_position;
char* tree_position;

buffer(unsigned int ppp) {

point_buffer.resize((5 + ppp) * sizeof(point<T>));
tetra_buffer.resize((4+3*10*(ppp-1))*sizeof(tetra<T>));
tree_buffer.resize((5+4*10*(ppp-1))*sizeof(tree<T, struct tetra<T>*>));

point_position = this->point_buffer.data();
tetra_position = this->tetra_buffer.data();
tree_position = this->tree_buffer.data();
}
};

/* Exception handling classes */

class co_linear : public std::exception {

public :

co_linear(std::string m = "Three points are co-linear") : msg(m) { }
~co_linear(void) throw() { }

const char* what() const throw() { return msg.c_str();}

private :

std::string msg;
};

class co_planar : public std::exception {

public :

co_planar(std::string m = "Fourth point is co-planar") : msg(m) { }
~co_planar(void) throw() { }

const char* what() const throw() { return msg.c_str();}

private :

std::string msg;
};

/* Global variables and functions */

#define U double
#define debug false

extern unsigned int box_length;

int io(int argc, char** argv);```

5. Makefile :
Code:
```MKFILE    = Makefile

GCC       = g++ -O3 -Wall -Wextra -std=c++11 -pedantic

CSOURCE   = main.cpp io.cpp point.cpp tetra.cpp tree.cpp
OBJECTS   = \${CSOURCE:.cpp=.o}
EXECBIN   = regulus

all : \${EXECBIN}

\${EXECBIN} : \${OBJECTS}
\${GCC} -o \$@ \${OBJECTS} -lm

%.o : %.cpp

\${GCC} -c \$<

clean :
- rm \${OBJECTS} \${EXECBIN}

again :
\${MAKE} clean all```

6. Forums make for poor pasties, get a GitHub account.

7. Lol I dom't think my code is github public ready yet