@john.c
Thanks for the explanation. I agree it's idiotic - I clearly won't butcher my program for something so trivial, but at least now I can explain it.
Yes, I'm trying to put together example code - I don't want you to have to trudge through my messy code. I'm having trouble, though; more below.
The .c file was not pushed, my mistake. As for that missing function, I have no idea, but you can comment it out for our purposes. Or I'll just try to make it simple for you with example code.
@Salem
My school has this special thing where all students are technically tutors since we grade each other. We often have to argue about differences and this is one of them.
data->map is malloced but not zero filled. It's used over and over by threads and freed at the end of the program.
------------------------
ThreadSanitizer (-fsanitize=thread) message:
Code:
==================
WARNING: ThreadSanitizer: data race (pid=809255)
Write of size 4 at 0x7ffdf4b1423c by thread T2 (mutexes: write M7):
#0 calculate_map /mnt/nfs/homes/rpohlen/Documents/......../fractol_draw.c:92:18 (fractol+0x4b8ef6)
Previous read of size 8 at 0x7ffdf4b14238 by thread T1:
#0 memcpy <null> (fractol+0x42ff4e)
#1 calculate_map /mnt/nfs/homes/rpohlen/Documents/......../fractol_draw.c:102:48 (fractol+0x4b8fe8)
Location is stack of main thread.
Location is global '??' at 0x7ffdf4af6000 ([stack]+0x00000001e238)
Mutex M7 (0x7ffdf4b14248) created at:
#0 pthread_mutex_init <null> (fractol+0x4276bd)
#1 thread_task /mnt/nfs/homes/rpohlen/Documents/......../fractol_draw3.c:142:6 (fractol+0x4bae80)
#2 render_fractal /mnt/nfs/homes/rpohlen/Documents/......../fractol_draw4.c:42:3 (fractol+0x4bb275)
#3 main /mnt/nfs/homes/rpohlen/Documents/......../fractol.c:22:2 (fractol+0x4b7236)
Thread T2 (tid=809258, running) created by main thread at:
#0 pthread_create <null> (fractol+0x425e8b)
#1 thread_task /mnt/nfs/homes/rpohlen/Documents/......../fractol_draw3.c:149:10 (fractol+0x4baf02)
#2 render_fractal /mnt/nfs/homes/rpohlen/Documents/......../fractol_draw4.c:42:3 (fractol+0x4bb275)
#3 main /mnt/nfs/homes/rpohlen/Documents/......../fractol.c:22:2 (fractol+0x4b7236)
Thread T1 (tid=809257, finished) created by main thread at:
#0 pthread_create <null> (fractol+0x425e8b)
#1 thread_task /mnt/nfs/homes/rpohlen/Documents/......../fractol_draw3.c:149:10 (fractol+0x4baf02)
#2 render_fractal /mnt/nfs/homes/rpohlen/Documents/......../fractol_draw4.c:42:3 (fractol+0x4bb275)
#3 main /mnt/nfs/homes/rpohlen/Documents/......../fractol.c:22:2 (fractol+0x4b7236)
SUMMARY: ThreadSanitizer: data race /mnt/nfs/homes/rpohlen/Documents/......../fractol_draw.c:92:18 in calculate_map
==================
helgrind message:
Code:
==811402== ---Thread-Announcement------------------------------------------
==811402==
==811402== Thread #3 was created
==811402== at 0x4C6C122: clone (clone.S:71)
==811402== by 0x4B2F2EB: create_thread (createthread.c:101)
==811402== by 0x4B30E0F: pthread_create@@GLIBC_2.2.5 (pthread_create.c:817)
==811402== by 0x4842917: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
==811402== by 0x4048C2: thread_task (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
==811402== by 0x404BF8: render_fractal (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
==811402== by 0x4025F3: main (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
==811402==
==811402== ---Thread-Announcement------------------------------------------
==811402==
==811402== Thread #2 was created
==811402== at 0x4C6C122: clone (clone.S:71)
==811402== by 0x4B2F2EB: create_thread (createthread.c:101)
==811402== by 0x4B30E0F: pthread_create@@GLIBC_2.2.5 (pthread_create.c:817)
==811402== by 0x4842917: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
==811402== by 0x4048C2: thread_task (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
==811402== by 0x404BF8: render_fractal (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
==811402== by 0x4025F3: main (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
==811402==
==811402== ----------------------------------------------------------------
==811402==
==811402== Lock at 0x1FFEFFFC78 was first observed
==811402== at 0x4843D9D: pthread_mutex_init (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
==811402== by 0x404853: thread_task (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
==811402== by 0x404BF8: render_fractal (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
==811402== by 0x4025F3: main (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
==811402== Address 0x1ffefffc78 is on thread #1's stack
==811402==
==811402== Possible data race during write of size 4 at 0x1FFEFFFC6C by thread #3
==811402== Locks held: 1, at address 0x1FFEFFFC78
==811402== at 0x403A4D: calculate_map (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
==811402== by 0x4842B1A: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
==811402== by 0x4B30608: start_thread (pthread_create.c:477)
==811402== by 0x4C6C132: clone (clone.S:95)
==811402==
==811402== This conflicts with a previous read of size 4 by thread #2
==811402== Locks held: none
==811402== at 0x4845AD7: memcpy@GLIBC_2.2.5 (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
==811402== by 0x403B02: calculate_map (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
==811402== by 0x4842B1A: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
==811402== by 0x4B30608: start_thread (pthread_create.c:477)
==811402== by 0x4C6C132: clone (clone.S:95)
==811402== Address 0x1ffefffc6c is on thread #1's stack
==811402==
==811402== ---Thread-Announcement------------------------------------------
==811402==
==811402== Thread #4 was created
==811402== at 0x4C6C122: clone (clone.S:71)
==811402== by 0x4B2F2EB: create_thread (createthread.c:101)
==811402== by 0x4B30E0F: pthread_create@@GLIBC_2.2.5 (pthread_create.c:817)
==811402== by 0x4842917: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
==811402== by 0x4048C2: thread_task (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
==811402== by 0x404BF8: render_fractal (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
==811402== by 0x4025F3: main (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
==811402==
==811402== ----------------------------------------------------------------
==811402==
==811402== Possible data race during read of size 4 at 0x1FFEFFFC80 by thread #4
==811402== Locks held: none
==811402== at 0x4845ACF: memcpy@GLIBC_2.2.5 (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
==811402== by 0x403B02: calculate_map (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
==811402== by 0x4842B1A: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
==811402== by 0x4B30608: start_thread (pthread_create.c:477)
==811402== by 0x4C6C132: clone (clone.S:95)
==811402==
==811402== This conflicts with a previous write of size 4 by thread #3
==811402== Locks held: none
==811402== at 0x4B347D1: __pthread_mutex_unlock_usercnt (pthread_mutex_unlock.c:52)
==811402== by 0x4B347D1: pthread_mutex_unlock (pthread_mutex_unlock.c:357)
==811402== by 0x4840458: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
==811402== by 0x403A5B: calculate_map (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
==811402== by 0x4842B1A: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
==811402== by 0x4B30608: start_thread (pthread_create.c:477)
==811402== by 0x4C6C132: clone (clone.S:95)
==811402== Address 0x1ffefffc80 is on thread #1's stack
==811402== in frame #4, created by main (???:)
==811402==
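To make the addresses in the two reports easier to compare, here is a small sketch of the arithmetic. The helpers ranges_overlap and gap_between are hypothetical names used only for illustration; the addresses are copied from the reports above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the byte ranges [a, a + a_size) and
 * [b, b + b_size) share at least one byte, 0 otherwise. */
int ranges_overlap(uint64_t a, size_t a_size, uint64_t b, size_t b_size)
{
    assert(a_size > 0 && b_size > 0);
    return a < b + b_size && b < a + a_size;
}

/* Hypothetical helper: distance in bytes from `lower` up to `higher`. */
uint64_t gap_between(uint64_t higher, uint64_t lower)
{
    assert(higher >= lower);
    return higher - lower;
}
```

Called with the ThreadSanitizer values, ranges_overlap(0x7ffdf4b1423c, 4, 0x7ffdf4b14238, 8) returns 1: the 8-byte memcpy read covers the 4-byte word written while the mutex was held. gap_between gives 12 for both tools (0x7ffdf4b14248 - 0x7ffdf4b1423c and 0x1FFEFFFC78 - 0x1FFEFFFC6C), so both reports seem to describe the same word, sitting 12 bytes below the mutex on the main thread's stack. The second Helgrind report is at 0x1FFEFFFC80, which is 8 bytes above the lock address.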
I tried reproducing the behavior with this example code, but no go - ThreadSanitizer at least doesn't catch anything:
Code:
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>

#define NUMTHREADS 48 // > 1 to cause data race
#define WINX 400 // even 0 causes data race
#define WINY 200

typedef struct s_data
{
    int **map;
    int thread_id;
    int something;
    pthread_mutex_t mutex;
} t_data;

int return_something(t_data data, int i, int j)
{
    (void)data;
    (void)i;
    (void)j;
    return (0);
}

void *thread(void *arg)
{
    int i;
    int j;
    t_data *data;

    data = (t_data *)arg;
    pthread_mutex_lock(&(data->mutex));
    i = data->thread_id++;
    pthread_mutex_unlock(&(data->mutex));
    while (i < WINX)
    {
        j = -1;
        while (++j < WINY)
            data->map[i][j] = return_something(*data, data->something + i, data->something + j);
        i += NUMTHREADS;
    }
    return (NULL);
}

int main(void)
{
    int i;
    t_data data;
    pthread_t threads[NUMTHREADS];

    // malloc map
    data.map = malloc(WINX * sizeof(*data.map));
    i = -1;
    while (++i < WINX)
        data.map[i] = malloc(WINY * sizeof(**data.map));
    // create threads
    pthread_mutex_init(&(data.mutex), NULL);
    data.thread_id = 0;
    data.something = 0;
    i = -1;
    while (++i < NUMTHREADS)
        pthread_create(threads + i, NULL, &thread, &data);
    // join threads
    i = -1;
    while (++i < NUMTHREADS)
        pthread_join(threads[i], NULL);
    // free map
    i = -1;
    while (++i < WINX)
        free(data.map[i]);
    free(data.map);
    return (0);
}
My reasoning was that since the data race happens around the
Code:
data->map[y][x] = calculate_map_pixel(*data,
data->pos.x + data->step * x, data->pos.y - data->step * y);
line in my code, somehow reproducing this would create the same issue.
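One possible reading of the reports, offered as a guess rather than a diagnosis: the conflicting read comes from memcpy inside calculate_map, and passing *data by value copies the whole struct, including whichever field another thread is writing under the mutex. A minimal sketch of that idea follows. The names t_shared, t_view, make_view, pixel_by_value and pixel_by_view are hypothetical, the struct layout is an assumption, and the arithmetic is only a placeholder for the real pixel function.

```c
#include <assert.h>
#include <stddef.h>
#include <pthread.h>

/* Hypothetical stand-in for the real t_data; the fields are assumptions. */
typedef struct s_shared
{
    double          pos_x;
    double          pos_y;
    double          step;
    int             thread_id;  // written by each thread under the mutex
    pthread_mutex_t mutex;
} t_shared;

/* Only the fields the pixel math needs; not written once threads run. */
typedef struct s_view
{
    double pos_x;
    double pos_y;
    double step;
} t_view;

/* Take the read-only snapshot once, before any thread is created. */
t_view make_view(const t_shared *data)
{
    t_view view;

    assert(data != NULL);
    view.pos_x = data->pos_x;
    view.pos_y = data->pos_y;
    view.step = data->step;
    return view;
}

/* By-value parameter: the caller copies the WHOLE struct on every call,
 * thread_id and mutex included, without holding the mutex. */
double pixel_by_value(t_shared data, int x, int y)
{
    return (data.pos_x + data.step * x) + (data.pos_y - data.step * y);
}

/* Pointer to the snapshot: thread_id and the mutex are never read. */
double pixel_by_view(const t_view *view, int x, int y)
{
    assert(view != NULL);
    return (view->pos_x + view->step * x) + (view->pos_y - view->step * y);
}
```

Both functions return the same value for the same inputs; the difference is only in which bytes get read on each call. Whether this matches the real calculate_map_pixel depends on what that function actually uses from data, which is not shown here.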
What the program is doing is essentially calculating some complex math function over and over and filling a 2D array with the results.
Threads here are simply used to split the calculations: each thread does a different set of lines in the array. For 4 threads, thread 0 does lines 0, 4, 8..., thread 1 does lines 1, 5, 9..., etc.
They should never collide; the code is simple enough: assign a thread ID under a mutex, then use that thread ID to decide which lines to work on, which is what I did in the example code too.
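The line striding described above can be sketched as follows. rows_for_thread and every_row_has_one_owner are hypothetical helpers, not functions from the project, and the check only covers the index arithmetic, not what each thread reads while computing a line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: fills `rows` with the line indices that thread
 * `thread_id` processes when lines are dealt out round-robin, and returns
 * how many were written. Mirrors the `i += NUMTHREADS` loop. */
size_t rows_for_thread(int thread_id, int num_threads, int height,
                       int *rows, size_t capacity)
{
    size_t count = 0;

    assert(num_threads > 0 && thread_id >= 0 && thread_id < num_threads);
    for (int y = thread_id; y < height && count < capacity; y += num_threads)
        rows[count++] = y;
    return count;
}

/* Hypothetical check: returns 1 if every line in [0, height) is visited by
 * exactly one thread under the striding above, 0 otherwise. */
int every_row_has_one_owner(int num_threads, int height)
{
    int *owners;
    int ok = 1;

    assert(num_threads > 0);
    if (height <= 0)
        return 1;                   // nothing to assign
    owners = calloc((size_t)height, sizeof(*owners));
    if (owners == NULL)
        return 0;                   // allocation failed, cannot verify
    for (int t = 0; t < num_threads; t++)
        for (int y = t; y < height; y += num_threads)
            owners[y]++;            // count how many threads visit line y
    for (int y = 0; y < height; y++)
        if (owners[y] != 1)
            ok = 0;
    free(owners);
    return ok;
}
```

With 4 threads and 12 lines, rows_for_thread(1, 4, 12, rows, 8) writes 1, 5, 9, and every_row_has_one_owner(48, 400) returns 1, so the writes into the map itself should not overlap as long as each thread really gets a distinct ID.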
I am at a loss.