Thread: Problems with pthread - still reachables and data race

  1. #1
    Registered User
    Join Date
    Oct 2021
    Posts
    16

    Hi everyone,

    I'm working on a graphical program, and using pthread to speed up the calculations.

    The program works beautifully, but valgrind and -fsanitize=thread warn me of 2 potential problems:

    1. Valgrind reports "still reachable" blocks when my program exits. They seem to come from pthread. I'd like to figure out whether I can do anything about them: maybe they're inherent to pthread, maybe I made a mistake. https://i.imgur.com/nExA5PN.png

    2. There are data race warnings when using helgrind or -fsanitize=thread. What I can't figure out is why only the school computers have this problem. At home, neither tool warns of anything: I tested extensively and couldn't trigger a single data race warning. Sadly I don't have a log of the errors yet, but I will tomorrow.

    Some more details about the 2 problems:

    1. I know still reachables, especially such as these (4 blocks, 1654 bytes, always, no matter window size, amount of renders, number of threads), should not really be a problem. This is a school project and my school is very strict on these, so I'm doing my best to comply. If I can't fix this, I'd like to at least understand exactly where the still reachables come from - what does pthread do that creates these?

    2. When I implemented multi-threading, I read about mutexes of course, and I put a mutex lock around the only variable that the threads could write to simultaneously. All the rest is read-only.

    When troubleshooting with a friend, he pointed out that several threads accessing the same memory space concurrently could create problems, and that I should try to make copies of all common memory for each thread. Is that true?

    I found a way to completely prevent any data race warnings by adding a mutex lock around a single line (fractol_draw.c, line 104), which completely defeats the purpose of multithreading. And the thing is, these instructions never write anything to common memory, only read. The only non mutex'd write is compartmentalized so that each thread cannot touch the others.

    Not to mention I have no problems at home. My CPU is way better, but no matter how much I stress it by blowing up window size and SSAA strength, there are no data race warnings.

    The code is here: GitHub - m0d1nst4ll3r/42_fract-ol: 42 - common core - ring #2 (it's really just a personal repo, and it's my first big project, please forgive the messiness - I'll do better next time)

    Thank you all for your help! I'm going crazy.

  2. #2
    Registered User
    Join Date
    Oct 2021
    Posts
    16
    Here is, specifically, the pthread creates, joins, and one of the threaded functions. I suppose I must be doing something terribly wrong in one of them, but I can't figure out what.


    Edit: also, if NUMTHREADS is 1, there are no still reachables anymore.
    Edit2: if NUMTHREADS is low (like 2), the still reachables sometimes show and sometimes don't. The more threads, the more likely they are to show.

    Code:
    void    thread_task(t_fract *data, char task)
    {
        int            err;
        int            i;
        pthread_t    threads[NUMTHREADS];
    
        if (pthread_mutex_init(&(data->mutex), NULL))
            return ;
        data->thread = -1;
        i = -1;
        while (++i < NUMTHREADS)
        {
            if (task == 'c')
                err = pthread_create(threads + i, NULL, &calculate_map, data);
            else if (task == 'd')
                err = pthread_create(threads + i, NULL, &draw_fractal, data);
            else if (task == 's')
                err = pthread_create(threads + i, NULL, &render_ssaa, data);
            if (err)
                return ;
        }
        i = -1;
        while (++i < NUMTHREADS)
            pthread_join(threads[i], NULL);
        pthread_mutex_destroy(&data->mutex);
    }
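    One detail worth noting in thread_task: when pthread_create fails, it returns without joining the threads already started or destroying the mutex, and err is never assigned if task is not 'c', 'd' or 's'. Below is a minimal sketch of a launcher that always joins what it started. The names launch_all, count_up and t_shared are hypothetical and not taken from the repository.

```c
#include <pthread.h>
#include <stddef.h>

#define NUMTHREADS 4

typedef struct s_shared
{
    pthread_mutex_t mutex;
    int             counter;
}   t_shared;

/* Example task: bump a shared counter under the mutex. */
void *count_up(void *arg)
{
    t_shared *shared = arg;

    pthread_mutex_lock(&shared->mutex);
    shared->counter++;
    pthread_mutex_unlock(&shared->mutex);
    return NULL;
}

/* Returns 0 on success, -1 if the mutex or any thread could not be created. */
int launch_all(t_shared *shared, void *(*task)(void *))
{
    pthread_t threads[NUMTHREADS];
    int       created;
    int       i;
    int       status;

    if (task == NULL || pthread_mutex_init(&shared->mutex, NULL) != 0)
        return -1;
    status = 0;
    created = 0;
    while (created < NUMTHREADS)
    {
        if (pthread_create(&threads[created], NULL, task, shared) != 0)
        {
            status = -1;
            break;
        }
        created++;
    }
    /* Join every thread that did start, even after a failure. */
    i = 0;
    while (i < created)
        pthread_join(threads[i++], NULL);
    pthread_mutex_destroy(&shared->mutex);
    return status;
}
```

    The caller can then tell a clean run from a partial one by the return value, and no thread outlives the stack frame that holds its pthread_t.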
    Code:
    void    *calculate_map(void *arg)
    {
        int            x;
        int            y;
        int            thread;
        t_fract        *data;
    
        data = (t_fract *)arg;
        pthread_mutex_lock(&(data->mutex));
        data->thread++;
        thread = data->thread;
        pthread_mutex_unlock(&(data->mutex));
        y = thread;
        while (y < data->winy)
        {
            x = -1;
            while (++x < data->winx)
            {
                if (data->highest_iter && (int)data->map[y][x] < data->highest_iter)
                    continue ;
                data->map[y][x] = calculate_map_pixel(*data,
                        data->pos.x + data->step * x, data->pos.y - data->step * y);
            }
            y += NUMTHREADS;
        }
        pthread_exit(EXIT_SUCCESS);
    }
    Last edited by Modin; 06-13-2022 at 04:21 PM.

  3. #3
    Registered User
    Join Date
    Oct 2021
    Posts
    16
    Code to reproduce still reachables, at least on my end: (sorry about the chain posts)

    Code:
    #include <stdlib.h>
    #include <pthread.h>
    #include <unistd.h>
    
    
    #define NUMTHREADS  2 //can be anything higher than 1 to cause still reachables
    
    
    void    *thread(void *arg)
    {
        (void)arg;
        pthread_exit(EXIT_SUCCESS);
    }
    
    
    int main(void)
    {
        int         i;
        pthread_t   threads[NUMTHREADS];
    
    
        i = -1;
        while (++i < NUMTHREADS)
            pthread_create(threads + i, NULL, &thread, NULL);
        i = -1;
        while (++i < NUMTHREADS)
            pthread_join(threads[i], NULL);
        return (0);
    }
    Any NUMTHREADS higher than 1 causes 1654 bytes in 4 blocks of still reachables, for me. Sometimes. I'm having trouble correlating the frequency with NUMTHREADS; it seems very unstable.

    Am I doing anything bad there?

  4. #4
    Registered User
    Join Date
    Dec 2017
    Posts
    1,628
    This is a long-standing valgrind bug (if it can even be considered a bug) with pthreads, related to calling pthread_exit at the end of the thread function. You can avoid it by actually returning from the function: write return NULL instead of pthread_exit(0).
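    A minimal sketch of that suggestion, assuming nothing about the real program: the thread function simply returns instead of calling pthread_exit. The names worker and run_workers are made up for illustration.

```c
#include <pthread.h>
#include <stddef.h>

#define NUMTHREADS 4

/* Thread entry point: return normally instead of calling pthread_exit(). */
static void *worker(void *arg)
{
    (void)arg;
    return NULL;
}

/* Start NUMTHREADS workers, join them, and return how many were joined. */
int run_workers(void)
{
    pthread_t threads[NUMTHREADS];
    int       created;
    int       joined;
    int       i;

    created = 0;
    while (created < NUMTHREADS
        && pthread_create(&threads[created], NULL, worker, NULL) == 0)
        created++;
    joined = 0;
    i = 0;
    while (i < created)
    {
        if (pthread_join(threads[i], NULL) == 0)
            joined++;
        i++;
    }
    return joined;
}
```

    The value returned from the thread function is what pthread_join hands back through its second argument, just like the argument given to pthread_exit.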

    BTW, it is a rather ancient practice to define variables at the top of a function, and your loop structure is bizarre. Normally we simply say the more common and understandable:
    Code:
    for (int i = 0; i < NUMTHREADS; ++i)
    A little inaccuracy saves tons of explanation. - H.H. Munro

  5. #5
    Registered User
    Join Date
    Oct 2021
    Posts
    16
    Ah, yes.

    This is a school assignment and for some reason at my school, for loops are forbidden. We have to do everything with while loops. I suppose maybe... for loops are harder to read? Or while loops are closer to assembler? I'm not sure myself.

    We're also forced to define all of our variables (all of them) at the top of our functions, with an empty line separating declarations from other instructions. No assignments in declarations, no multiple declarations, 25 lines per function, etc etc etc. Tons of rules that make our code easily readable, at least between us.

    Similarly, I would have paid no heed to this still reachable thing as it's clearly not a problem, BUT... when I turned my program in, the first thing that was tested was valgrind, and there was a debate. I wasn't able to explain where this came from exactly, hence I had to figure it out.

    Anyway, thanks a ton, your solution works beautifully. Is there any problem in exiting threads without pthread_exit?

    I'll come back tomorrow with the data race warnings. Do you have any idea why I could be getting the warnings at school but not at home? Or why I'm getting them at all since I'm only reading from shared memory and writing to different parts of an array in each thread?

  6. #6
    Registered User
    Join Date
    Dec 2017
    Posts
    1,628
    > Is there any problem in exiting threads without pthread_exit?
    It should be okay. It's strange that pthread_exit doesn't clean up properly, though. Maybe it's the way that valgrind attaches to the process that screws it up and it normally cleans up properly. valgrind apparently doesn't have a problem with a regular return, though. However, it's idiotic that you would need to bend to valgrind's deficiencies.

    > I'll come back tomorrow with the data race warnings.
    Example runnable code that demonstrates the problem would be good.

    I just tried compiling your program but fractol_printflol.c is missing from your source code (is it needed?), and mlx_mouse_get_pos seems to be missing from the libmlx_Linux.a library (at least the version I stumbled upon). Still, we shouldn't have to deal with a large piece of code like that unless absolutely necessary.

    > Do you have any idea why I could be getting the warnings at school but not at home?
    It smells like undefined behavior.

    > Or why I'm getting them at all since I'm only reading from shared memory and writing to different parts of an array in each thread?
    Maybe you think you're writing to different parts but there's actually a little overlap (off-by-one error?).

  7. #7
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    > This is a school assignment and for some reason at my school, for loops are forbidden.
    A lot of tutors seem to have a perverse delight in making students 'do x' without using the most rational thing you would use to 'do x'.

    A while loop might have saved you a couple of instructions - in the 1970s!
    Modern compilers are far smarter now.

    Or they have some stupid code checking tool that can't cope with for loops.
    So rather than fix the tool, just give all the students brain damage instead.

    > Do you have any idea why I could be getting the warnings at school but not at home?
    - OS version differences
    - compiler version differences
    - valgrind version differences
    - valgrind command line differences

    How is data->map allocated?
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  8. #8
    Registered User
    Join Date
    Oct 2021
    Posts
    16
    @john.c

    Thanks for the explanation. I agree it's idiotic - I clearly won't butcher my program for something so trivial, but at least now I can explain it.

    Yes I'm trying to make example code - I don't want you to have to trudge through my messy code. I'm having trouble, though. More below

    The .c was not pushed, my mistake. That missing function, I have no idea, but you can comment it out for our purposes. Or I'll just try and make it simple for you with example code.

    @Salem

    My school has this special thing where all students are technically tutors since we grade each other. We often have to argue about differences and this is one of them.

    data->map is malloced but not zero filled. It's used over and over by threads and freed at the end of the program.

    ------------------------

    -fsanitize=thread message:

    Code:
    ==================
    WARNING: ThreadSanitizer: data race (pid=809255)
      Write of size 4 at 0x7ffdf4b1423c by thread T2 (mutexes: write M7):
        #0 calculate_map /mnt/nfs/homes/rpohlen/Documents/......../fractol_draw.c:92:18 (fractol+0x4b8ef6)
    
    
      Previous read of size 8 at 0x7ffdf4b14238 by thread T1:
        #0 memcpy <null> (fractol+0x42ff4e)
        #1 calculate_map /mnt/nfs/homes/rpohlen/Documents/......../fractol_draw.c:102:48 (fractol+0x4b8fe8)
    
    
      Location is stack of main thread.
    
    
      Location is global '??' at 0x7ffdf4af6000 ([stack]+0x00000001e238)
    
    
      Mutex M7 (0x7ffdf4b14248) created at:
        #0 pthread_mutex_init <null> (fractol+0x4276bd)
        #1 thread_task /mnt/nfs/homes/rpohlen/Documents/......../fractol_draw3.c:142:6 (fractol+0x4bae80)
        #2 render_fractal /mnt/nfs/homes/rpohlen/Documents/......../fractol_draw4.c:42:3 (fractol+0x4bb275)
        #3 main /mnt/nfs/homes/rpohlen/Documents/......../fractol.c:22:2 (fractol+0x4b7236)
    
    
      Thread T2 (tid=809258, running) created by main thread at:
        #0 pthread_create <null> (fractol+0x425e8b)
        #1 thread_task /mnt/nfs/homes/rpohlen/Documents/......../fractol_draw3.c:149:10 (fractol+0x4baf02)
        #2 render_fractal /mnt/nfs/homes/rpohlen/Documents/......../fractol_draw4.c:42:3 (fractol+0x4bb275)
        #3 main /mnt/nfs/homes/rpohlen/Documents/......../fractol.c:22:2 (fractol+0x4b7236)
    
    
      Thread T1 (tid=809257, finished) created by main thread at:
        #0 pthread_create <null> (fractol+0x425e8b)
        #1 thread_task /mnt/nfs/homes/rpohlen/Documents/......../fractol_draw3.c:149:10 (fractol+0x4baf02)
        #2 render_fractal /mnt/nfs/homes/rpohlen/Documents/......../fractol_draw4.c:42:3 (fractol+0x4bb275)
        #3 main /mnt/nfs/homes/rpohlen/Documents/......../fractol.c:22:2 (fractol+0x4b7236)
    
    
    SUMMARY: ThreadSanitizer: data race /mnt/nfs/homes/rpohlen/Documents/......../fractol_draw.c:92:18 in calculate_map
    ==================
    helgrind message:

    Code:
    ==811402== ---Thread-Announcement------------------------------------------
    ==811402== 
    ==811402== Thread #3 was created
    ==811402==    at 0x4C6C122: clone (clone.S:71)
    ==811402==    by 0x4B2F2EB: create_thread (createthread.c:101)
    ==811402==    by 0x4B30E0F: pthread_create@@GLIBC_2.2.5 (pthread_create.c:817)
    ==811402==    by 0x4842917: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
    ==811402==    by 0x4048C2: thread_task (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
    ==811402==    by 0x404BF8: render_fractal (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
    ==811402==    by 0x4025F3: main (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
    ==811402== 
    ==811402== ---Thread-Announcement------------------------------------------
    ==811402== 
    ==811402== Thread #2 was created
    ==811402==    at 0x4C6C122: clone (clone.S:71)
    ==811402==    by 0x4B2F2EB: create_thread (createthread.c:101)
    ==811402==    by 0x4B30E0F: pthread_create@@GLIBC_2.2.5 (pthread_create.c:817)
    ==811402==    by 0x4842917: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
    ==811402==    by 0x4048C2: thread_task (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
    ==811402==    by 0x404BF8: render_fractal (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
    ==811402==    by 0x4025F3: main (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
    ==811402== 
    ==811402== ----------------------------------------------------------------
    ==811402== 
    ==811402==  Lock at 0x1FFEFFFC78 was first observed
    ==811402==    at 0x4843D9D: pthread_mutex_init (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
    ==811402==    by 0x404853: thread_task (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
    ==811402==    by 0x404BF8: render_fractal (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
    ==811402==    by 0x4025F3: main (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
    ==811402==  Address 0x1ffefffc78 is on thread #1's stack
    ==811402== 
    ==811402== Possible data race during write of size 4 at 0x1FFEFFFC6C by thread #3
    ==811402== Locks held: 1, at address 0x1FFEFFFC78
    ==811402==    at 0x403A4D: calculate_map (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
    ==811402==    by 0x4842B1A: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
    ==811402==    by 0x4B30608: start_thread (pthread_create.c:477)
    ==811402==    by 0x4C6C132: clone (clone.S:95)
    ==811402== 
    ==811402== This conflicts with a previous read of size 4 by thread #2
    ==811402== Locks held: none
    ==811402==    at 0x4845AD7: memcpy@GLIBC_2.2.5 (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
    ==811402==    by 0x403B02: calculate_map (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
    ==811402==    by 0x4842B1A: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
    ==811402==    by 0x4B30608: start_thread (pthread_create.c:477)
    ==811402==    by 0x4C6C132: clone (clone.S:95)
    ==811402==  Address 0x1ffefffc6c is on thread #1's stack
    ==811402== 
    ==811402== ---Thread-Announcement------------------------------------------
    ==811402== 
    ==811402== Thread #4 was created
    ==811402==    at 0x4C6C122: clone (clone.S:71)
    ==811402==    by 0x4B2F2EB: create_thread (createthread.c:101)
    ==811402==    by 0x4B30E0F: pthread_create@@GLIBC_2.2.5 (pthread_create.c:817)
    ==811402==    by 0x4842917: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
    ==811402==    by 0x4048C2: thread_task (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
    ==811402==    by 0x404BF8: render_fractal (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
    ==811402==    by 0x4025F3: main (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
    ==811402== 
    ==811402== ----------------------------------------------------------------
    ==811402== 
    ==811402== Possible data race during read of size 4 at 0x1FFEFFFC80 by thread #4
    ==811402== Locks held: none
    ==811402==    at 0x4845ACF: memcpy@GLIBC_2.2.5 (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
    ==811402==    by 0x403B02: calculate_map (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
    ==811402==    by 0x4842B1A: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
    ==811402==    by 0x4B30608: start_thread (pthread_create.c:477)
    ==811402==    by 0x4C6C132: clone (clone.S:95)
    ==811402== 
    ==811402== This conflicts with a previous write of size 4 by thread #3
    ==811402== Locks held: none
    ==811402==    at 0x4B347D1: __pthread_mutex_unlock_usercnt (pthread_mutex_unlock.c:52)
    ==811402==    by 0x4B347D1: pthread_mutex_unlock (pthread_mutex_unlock.c:357)
    ==811402==    by 0x4840458: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
    ==811402==    by 0x403A5B: calculate_map (in /mnt/nfs/homes/rpohlen/Documents/......../fractol)
    ==811402==    by 0x4842B1A: ??? (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_helgrind-amd64-linux.so)
    ==811402==    by 0x4B30608: start_thread (pthread_create.c:477)
    ==811402==    by 0x4C6C132: clone (clone.S:95)
    ==811402==  Address 0x1ffefffc80 is on thread #1's stack
    ==811402==  in frame #4, created by main (???:)
    ==811402==
    I tried reproducing the behavior with this example code but no go. At least -fsanitize=thread doesn't catch anything:

    Code:
    #include <stdlib.h>
    #include <pthread.h>
    #include <unistd.h>
    
    
    #define NUMTHREADS	48 // > 1 to cause data race
    #define WINX		400 // even 0 causes data race
    #define WINY		200
    
    
    typedef struct s_data
    {
    	int				**map;
    	int				thread_id;
    	int				something;
    	pthread_mutex_t	mutex;
    }					t_data;
    
    
    int	return_something(t_data data, int i, int j)
    {
    	(void)data;
    	(void)i;
    	(void)j;
    	return (0);
    }
    
    
    void	*thread(void *arg)
    {
    	int		i;
    	int		j;
    	t_data	*data;
    
    
    	data = (t_data *)arg;
    	pthread_mutex_lock(&(data->mutex));
    	i = data->thread_id++;
    	pthread_mutex_unlock(&(data->mutex));
    	while (i < WINX)
    	{
    		j = -1;
    		while (++j < WINY)
    			data->map[i][j] = return_something(*data, data->something + i, data->something + j);
    		i += NUMTHREADS;
    	}
    	return (NULL);
    }
    
    
    int	main(void)
    {
    	int			i;
    	t_data		data;
    	pthread_t	threads[NUMTHREADS];
    
    
    // malloc map
    
    
    	data.map = malloc(WINX * sizeof(*data.map));
    	i = -1;
    	while (++i < WINX)
    		data.map[i] = malloc(WINY * sizeof(**data.map));
    
    
    // create threads
    
    
    	pthread_mutex_init(&(data.mutex), NULL);
    	data.thread_id = 0;
    	data.something = 0;
    	i = -1;
    	while (++i < NUMTHREADS)
    		pthread_create(threads + i, NULL, &thread, &data);
    
    
    // join threads
    
    
    	i = -1;
    	while (++i < NUMTHREADS)
    		pthread_join(threads[i], NULL);
    
    
    // free map
    
    
    	i = -1;
    	while (++i < WINX)
    		free(data.map[i]);
    	free(data.map);
    
    
    	return (0);
    }
    My reasoning was that since the data race happens around the

    Code:
                data->map[y][x] = calculate_map_pixel(*data,
                        data->pos.x + data->step * x, data->pos.y - data->step * y);
    line in my code, somehow reproducing this would create the same issue.

    What the program is doing is essentially calculating some complex math function over and over and filling a 2D array with the results.

    Threads here are simply used to split the calculations: each thread does a different set of lines in the array. For 4 threads, thread 0 does lines 0 4 8..., thread 1 does lines 1 5 9..., etc.

    They should never collide, the code is simple enough. Assign a thread ID with mutex, use that thread ID to decide what lines to work on - what I did in the example code too.
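    The striding scheme described above can be checked without any threads. This small sketch, with made-up names (rows_owned_once, MAX_ROWS), replays the loop "start at the thread ID, step by the thread count" for every ID and counts how many times each row is visited:

```c
#define MAX_ROWS 4096

/* Returns 1 if every row in [0, rows) is visited by exactly one thread ID
   under the "y = id; y += nthreads" scheme, 0 otherwise. */
int rows_owned_once(int rows, int nthreads)
{
    static int visits[MAX_ROWS];
    int        id;
    int        y;

    if (rows < 0 || rows > MAX_ROWS || nthreads < 1)
        return 0;
    y = 0;
    while (y < rows)
        visits[y++] = 0;
    id = 0;
    while (id < nthreads)
    {
        y = id;
        while (y < rows)
        {
            visits[y]++;
            y += nthreads;
        }
        id++;
    }
    y = 0;
    while (y < rows)
    {
        if (visits[y] != 1)
            return 0;
        y++;
    }
    return 1;
}
```

    If this check passes for the window height and thread count in use, the rows themselves do not overlap, so any reported race has to involve some other memory.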

    I am at a loss.

  9. #9
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    > data->map is malloced but not zero filled. It's used over and over by threads and freed at the end of the program.
    I noticed your test code allocated
    Code:
        data.map = malloc(WINX * sizeof(*data.map));
        i = -1;
        while (++i < WINX)
            data.map[i] = malloc(WINY * sizeof(**data.map));
    But your real code would require WINY to be the outer loop.

    > data->map is malloced but not zero filled. It's used over and over by threads and freed at the end of the program.
    Yeah, that might be the problem.

    It might be fine for a single iteration, but the globally malloc'ed memory is going to end up being used by different threads at very different times.

    Like T1, T2, T3 are all fine when you run calculate_map for the first time.
    But if you try the same again with T4, T5, T6 calling some other function (or even calculate_map again), it looks to valgrind like two separate threads accessing the same memory without a lock.

    Do you see problems with a single pass through thread_task?

  10. #10
    Registered User
    Join Date
    Oct 2021
    Posts
    16
    Yes, WINX and WINY were inverted. I swapped them to match my code, and this still didn't get me a warning.

    What my program does, basically, is run 2 threaded tasks one after the other.

    1. Fill the 2D array in a threaded way like I described - no collisions

    2. Use that 2D array to fill another 2D array (handled by the mlx functions) - still no collisions

    This is run once on startup, then every time the user presses a key.

    fsanitize finds a problem specifically (at least) at startup during the 1st task - the 2nd seems to be free of problems.

    Task 2 begins only when task 1 is over, task 1 begins only once task 2 is over. These 2 tasks are sequential - not parallel.

    I added some mutex locked printfs to analyze what goes on, here's an example case (it changes a bit but the warning always happens in the first few threads) - this is for 48 threads

    Code:
    thread 00 starts
    thread 00 doing line 0
    thread 00 doing line 48
    thread 00 doing line 96
    thread 00 doing line 144
    thread 00 doing line 192
    thread 00 ends
    thread 01 starts
    thread 01 doing line 1
    thread 01 doing line 49
    thread 01 doing line 97
    thread 01 doing line 145
    LLVMSymbolizer: error reading file: No such file or directory
    ==================
    WARNING: ThreadSanitizer: data race (pid=817585)
      Write of size 4 at 0x7ffc9cf005cc by thread T3 (mutexes: write M7):
        #0 calculate_map /mnt/nfs/homes/rpohlen/Documents/......../fractol_draw.c:93:18 (fractol+0x4b8ef6)
    
    
      Previous read of size 8 at 0x7ffc9cf005c8 by thread T2:
        #0 memcpy <null> (fractol+0x42ff4e)
        #1 calculate_map /mnt/nfs/homes/rpohlen/Documents/......../fractol_draw.c:110:48 (fractol+0x4b9063)
    
    
      Location is stack of main thread.
    
    
      Location is global '??' at 0x7ffc9cee2000 ([stack]+0x00000001e5c8)
    
    
      Mutex M7 (0x7ffc9cf005d8) created at:
        #0 pthread_mutex_init <null> (fractol+0x4276bd)
        #1 thread_task /mnt/nfs/homes/rpohlen/Documents/......../fractol_draw3.c:142:6 (fractol+0x4baf30)
        #2 render_fractal /mnt/nfs/homes/rpohlen/Documents/......../fractol_draw4.c:42:3 (fractol+0x4bb325)
        #3 main /mnt/nfs/homes/rpohlen/Documents/......../fractol.c:22:2 (fractol+0x4b7236)
    
    
      Thread T3 (tid=817589, running) created by main thread at:
        #0 pthread_create <null> (fractol+0x425e8b)
        #1 thread_task /mnt/nfs/homes/rpohlen/Documents/......../fractol_draw3.c:149:10 (fractol+0x4bafb2)
        #2 render_fractal /mnt/nfs/homes/rpohlen/Documents/......../fractol_draw4.c:42:3 (fractol+0x4bb325)
        #3 main /mnt/nfs/homes/rpohlen/Documents/......../fractol.c:22:2 (fractol+0x4b7236)
    
    
      Thread T2 (tid=817588, running) created by main thread at:
        #0 pthread_create <null> (fractol+0x425e8b)
        #1 thread_task /mnt/nfs/homes/rpohlen/Documents/......../fractol_draw3.c:149:10 (fractol+0x4bafb2)
        #2 render_fractal /mnt/nfs/homes/rpohlen/Documents/......../fractol_draw4.c:42:3 (fractol+0x4bb325)
        #3 main /mnt/nfs/homes/rpohlen/Documents/......../fractol.c:22:2 (fractol+0x4b7236)
    
    
    SUMMARY: ThreadSanitizer: data race /mnt/nfs/homes/rpohlen/Documents/......../fractol_draw.c:93:18 in calculate_map
    ==================
    thread 02 starts
    thread 02 doing line 2
    thread 01 doing line 193
    thread 02 doing line 50
    thread 01 ends
    thread 02 doing line 98
    thread 02 doing line 146
    thread 03 starts
    thread 03 doing line 3
    thread 02 doing line 194
    thread 03 doing line 51
    thread 02 ends
    thread 03 doing line 99
    thread 03 doing line 147
    thread 04 starts
    thread 04 doing line 4
    thread 03 doing line 195
    thread 04 doing line 52
    thread 03 ends
    thread 04 doing line 100
    thread 04 doing line 148
    thread 05 starts
    thread 05 doing line 5
    thread 04 doing line 196
    thread 05 doing line 53
    thread 04 ends
    thread 05 doing line 101
    thread 05 doing line 149
    thread 05 doing line 197
    thread 05 ends
    thread 06 starts
    thread 06 doing line 6
    thread 06 doing line 54
    thread 06 doing line 102
    thread 06 doing line 150
    thread 06 doing line 198
    thread 06 ends
    thread 07 starts
    thread 07 doing line 7
    thread 07 doing line 55
    thread 07 doing line 103
    thread 08 starts
    thread 08 doing line 8
    thread 07 doing line 151
    thread 08 doing line 56
    thread 07 doing line 199
    thread 08 doing line 104
    thread 07 ends
    thread 08 doing line 152
    thread 08 ends
    thread 09 starts
    thread 09 doing line 9
    thread 09 doing line 57
    thread 09 doing line 105
    thread 09 doing line 153
    thread 09 ends
    thread 10 starts
    thread 10 doing line 10
    thread 10 doing line 58
    thread 10 doing line 106
    thread 11 starts
    thread 11 doing line 11
    thread 10 doing line 154
    thread 10 ends
    thread 12 starts
    thread 12 doing line 12
    thread 11 doing line 59
    thread 12 doing line 60
    thread 11 doing line 107
    thread 12 doing line 108
    thread 11 doing line 155
    thread 12 doing line 156
    thread 11 ends
    thread 12 ends
    thread 14 starts
    thread 14 doing line 14
    thread 14 doing line 62
    thread 14 doing line 110
    thread 14 doing line 158
    thread 14 ends
    thread 13 starts
    thread 13 doing line 13
    thread 13 doing line 61
    thread 13 doing line 109
    thread 13 doing line 157
    thread 13 ends
    thread 15 starts
    thread 15 doing line 15
    thread 15 doing line 63
    thread 15 doing line 111
    thread 16 starts
    thread 16 doing line 16
    thread 15 doing line 159
    thread 16 doing line 64
    thread 15 ends
    thread 16 doing line 112
    thread 17 starts
    thread 17 doing line 17
    thread 18 starts
    thread 18 doing line 18
    thread 18 doing line 66
    thread 17 doing line 65
    thread 18 doing line 114
    thread 17 doing line 113
    thread 18 doing line 162
    thread 17 doing line 161
    thread 17 ends
    thread 18 ends
    thread 16 doing line 160
    thread 16 ends
    thread 20 starts
    thread 20 doing line 20
    thread 20 doing line 68
    thread 20 doing line 116
    thread 20 doing line 164
    thread 20 ends
    thread 19 starts
    thread 19 doing line 19
    thread 19 doing line 67
    thread 19 doing line 115
    thread 19 doing line 163
    thread 19 ends
    thread 21 starts
    thread 21 doing line 21
    thread 21 doing line 69
    thread 21 doing line 117
    thread 21 doing line 165
    thread 21 ends
    thread 22 starts
    thread 22 doing line 22
    thread 23 starts
    thread 23 doing line 23
    thread 23 doing line 71
    thread 24 starts
    thread 24 doing line 24
    thread 23 doing line 119
    thread 24 doing line 72
    thread 23 doing line 167
    thread 24 doing line 120
    thread 23 ends
    thread 24 doing line 168
    thread 25 starts
    thread 25 doing line 25
    thread 24 ends
    thread 25 doing line 73
    thread 25 doing line 121
    thread 25 doing line 169
    thread 25 ends
    thread 26 starts
    thread 26 doing line 26
    thread 26 doing line 74
    thread 26 doing line 122
    thread 27 starts
    thread 27 doing line 27
    thread 26 doing line 170
    thread 27 doing line 75
    thread 26 ends
    thread 27 doing line 123
    thread 28 starts
    thread 28 doing line 28
    thread 27 doing line 171
    thread 28 doing line 76
    thread 28 doing line 124
    thread 28 doing line 172
    thread 28 ends
    thread 27 ends
    thread 22 doing line 70
    thread 22 doing line 118
    thread 22 doing line 166
    thread 22 ends
    thread 29 starts
    thread 29 doing line 29
    thread 29 doing line 77
    thread 29 doing line 125
    thread 30 starts
    thread 30 doing line 30
    thread 29 doing line 173
    thread 30 doing line 78
    thread 29 ends
    thread 31 starts
    thread 31 doing line 31
    thread 30 doing line 126
    thread 31 doing line 79
    thread 30 doing line 174
    thread 31 doing line 127
    thread 30 ends
    thread 31 doing line 175
    thread 31 ends
    thread 32 starts
    thread 32 doing line 32
    thread 32 doing line 80
    thread 32 doing line 128
    thread 32 doing line 176
    thread 33 starts
    thread 33 doing line 33
    thread 32 ends
    thread 33 doing line 81
    thread 33 doing line 129
    thread 34 starts
    thread 34 doing line 34
    thread 33 doing line 177
    thread 34 doing line 82
    thread 33 ends
    thread 34 doing line 130
    thread 34 doing line 178
    thread 34 ends
    thread 35 starts
    thread 35 doing line 35
    thread 35 doing line 83
    thread 35 doing line 131
    thread 36 starts
    thread 36 doing line 36
    thread 35 doing line 179
    thread 36 doing line 84
    thread 35 ends
    thread 36 doing line 132
    thread 36 doing line 180
    thread 37 starts
    thread 37 doing line 37
    thread 38 starts
    thread 38 doing line 38
    thread 37 doing line 85
    thread 36 ends
    thread 37 doing line 133
    thread 38 doing line 86
    thread 37 doing line 181
    thread 38 doing line 134
    thread 37 ends
    thread 38 doing line 182
    thread 38 ends
    thread 39 starts
    thread 39 doing line 39
    thread 40 starts
    thread 39 doing line 87
    thread 39 doing line 135
    thread 39 doing line 183
    thread 39 ends
    thread 41 starts
    thread 41 doing line 41
    thread 41 doing line 89
    thread 40 doing line 40
    thread 41 doing line 137
    thread 40 doing line 88
    thread 41 doing line 185
    thread 40 doing line 136
    thread 41 ends
    thread 40 doing line 184
    thread 40 ends
    thread 42 starts
    thread 42 doing line 42
    thread 42 doing line 90
    thread 42 doing line 138
    thread 42 doing line 186
    thread 42 ends
    thread 43 starts
    thread 43 doing line 43
    thread 43 doing line 91
    thread 43 doing line 139
    thread 43 doing line 187
    thread 43 ends
    thread 44 starts
    thread 44 doing line 44
    thread 44 doing line 92
    thread 44 doing line 140
    thread 45 starts
    thread 45 doing line 45
    thread 44 doing line 188
    thread 45 doing line 93
    thread 44 ends
    thread 45 doing line 141
    thread 45 doing line 189
    thread 46 starts
    thread 46 doing line 46
    thread 46 doing line 94
    thread 45 ends
    thread 46 doing line 142
    thread 46 doing line 190
    thread 46 ends
    thread 47 starts
    thread 47 doing line 47
    thread 47 doing line 95
    thread 47 doing line 143
    thread 47 doing line 191
    thread 47 ends
    As you can see, the threads behave. And yet, I receive a warning. For what, officer?

    The offending line (the one that, once wrapped in a mutex, makes the warning go away) is the one that writes to the 2D array. It calls another function, but that function never has access to the shared memory: it receives a copy (the data is passed by value, not by address).
    Last edited by Modin; 06-14-2022 at 06:09 AM.

  11. #11
    and the hat of int overfl Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    Code:
      Write of size 4 at 0x7ffc9cf005cc by thread T3 (mutexes: write M7):
        #0 calculate_map /mnt/nfs/homes/rpohlen/Documents/......../fractol_draw.c:93:18 (fractol+0x4b8ef6)
     
     
      Previous read of size 8 at 0x7ffc9cf005c8 by thread T2:
        #0 memcpy <null> (fractol+0x42ff4e)
        #1 calculate_map /mnt/nfs/homes/rpohlen/Documents/......../fractol_draw.c:110:48 (fractol+0x4b9063)
    How do these line numbers relate to the code posted in your repo in the first post?
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  12. #12
    Registered User
    Join Date
    Dec 2017
    Posts
    1,628
    I did just comment out that one file of yours, but as I said, the library is also apparently missing a function (unless it's one of your missing functions that you named like one of the library functions). I don't feel like bothering with it anymore. Did you bother looking for an off-by-one error? Clearly you are making a mistake somewhere. The "reachable" problem wasn't really a problem, but the "data race" problem probably is.

    You could try changing to the "smart" way to malloc a non-ragged 2D array. It requires only two malloc calls (and two frees) no matter how many rows the array has. It's basically insane to do it the way you've done it. This shouldn't actually fix your problem though.
    Code:
    #include <stdio.h>   // perror
    #include <stdlib.h>  // malloc, free, exit
     
    void *xmalloc(size_t size) {
        void *p = malloc(size);
        if (!p) { // simplistic error handling (but shouldn't occur for reasonable sizes)
            perror("xmalloc");
            exit(EXIT_FAILURE);
        }
        return p;
    }
     
    int **malloc2D(int rows, int cols) {
        int **a = xmalloc(rows * sizeof *a);       // malloc row pointers
        a[0] = xmalloc((size_t)rows * cols * sizeof **a);  // malloc data block (size_t avoids int overflow)
        for (int r = 1; r < rows; ++r)             // set row pointers
            a[r] = a[r - 1] + cols;
        return a;
    }
     
    void free2D(int ***a) {
        free((*a)[0]);  // free data block
        free(*a);       // free row pointers
        *a = NULL;
    }
     
    void example_usage() {
        int **a = malloc2D(10, 20);
        //...
        free2D(&a); // pass pointer to 'a' so it can be set to NULL
    }
    A little inaccuracy saves tons of explanation. - H.H. Munro

  13. #13
    Registered User
    Join Date
    Feb 2019
    Posts
    1,078
    There are several wrong assumptions here:

    1 - Each call to pthread_create can fail (return a nonzero value);
    2 - Thread ID isn't a sequential value. It can be anything;
    3 - A pointer to a pointer isn't the same as a bidimensional array.

    A test:
    Code:
    #include <unistd.h>
    #include <stdio.h>
    #include <stdint.h>
    #include <inttypes.h>
    #include <pthread.h>
    
    struct threads_s {
      pthread_t tid;
      int ok;
    };
    
    // The thread waits 3 seconds and exits.
    void *thread_proc( void *arg )
    { sleep(3); return NULL; }
    
    int main( void )
    {
      struct threads_s tids[10] = { 0 };
    
      // Create 10 threads...
      // Notice pthread_create can fail!
      for ( int i = 0; i < sizeof tids / sizeof tids[0]; i++ )
        if ( ! pthread_create( &tids[i].tid, NULL, thread_proc, NULL ) )
        {
          tids[i].ok = 1;
    
          printf( "Thread %d (thread_id = 0x%" PRIx64 ") created.\n",
            i, (uint64_t)tids[i].tid );
        }
    
      // Join only valid threads.
      for ( int i = 0; i < sizeof tids / sizeof tids[0]; i++ )
        if ( tids[i].ok )
          pthread_join( tids[i].tid, NULL );
    }
    On Linux (Debian bullseye x86-64):
    Code:
    $ cc -pthread -O2 -o test test.c -lpthread
    $ ./test
    Thread 0 (thread_id = 0x7f559ae82700) created.
    Thread 1 (thread_id = 0x7f559a681700) created.
    Thread 2 (thread_id = 0x7f5599e80700) created.
    Thread 3 (thread_id = 0x7f559967f700) created.
    Thread 4 (thread_id = 0x7f5598e7e700) created.
    Thread 5 (thread_id = 0x7f559867d700) created.
    Thread 6 (thread_id = 0x7f5597e7c700) created.
    Thread 7 (thread_id = 0x7f559767b700) created.
    Thread 8 (thread_id = 0x7f5596e7a700) created.
    Thread 9 (thread_id = 0x7f5596679700) created.
    On Windows 10 x86 (MSYS2 [MinGW64]):
    Code:
    Thread 0 (thread_id = 0x800000440) created.
    Thread 1 (thread_id = 0x8000586d0) created.
    Thread 2 (thread_id = 0x8000587d0) created.
    Thread 3 (thread_id = 0x8000588d0) created.
    Thread 4 (thread_id = 0x8000589d0) created.
    Thread 5 (thread_id = 0x800058ad0) created.
    Thread 6 (thread_id = 0x800058bd0) created.
    Thread 7 (thread_id = 0x800058cd0) created.
    Thread 8 (thread_id = 0x800058dd0) created.
    Thread 9 (thread_id = 0x800058ed0) created.
    Last edited by flp1969; 06-14-2022 at 11:49 AM.

  14. #14
    Registered User
    Join Date
    Oct 2021
    Posts
    16
    I managed to find a solution to the data race warnings!

    I was using O3, removing it fixed all the warnings.

    I'm not familiar with what O3 does exactly but I know it can cause weird errors so I should've known better. I'll still keep it for the speed, since everything otherwise works perfectly, but at least I can also justify this.

    Do you guys think I shouldn't use O3 even though my program runs flawlessly (other than warnings in debuggers)?

    @Salem

    Sorry, I ........ed up. Those warnings were generated with a modified source where I included printfs so the lines were wrong.

    The first line is the one protected by my mutex where I increment my thread id to separate my threads.

    The second line is the offending one, where, if I put mutex around it, it would fix the data race, the one where I write into my map[y][x].

    @john.c

    Please don't bother! I know I have to debug my own code and make it easy for everyone else when I ask for help.

    A question for you: in this case, could I not create a "2D" array with only 1 malloc, and using map[y * WINY + x] to access its contents? Is there a reason to have another array of addresses?

    I agree it's simpler when mallocing and freeing to simply call malloc and free once or twice regardless of the x y size. I think I'll do that from now on.

    @flp1969

    1. Oh yes thanks for pointing that out. I should've protected my pthread calls for sure. For my next pthread project I'll make sure to do so. (for now it would make things messier since I'm limited to 25 lines per function and 5 functions per .c)

    2. By thread id you must mean their address in memory? When I say "thread id" I just mean the identifier I have them assign themselves with that mutex locked line. For 48 threads the "ids" I'm talking about go from 0 to 47 and they are sequential. Should I pay attention to their address in memory?

    3. Hmmm I guess I just meant an array of arrays which can be visualised as 2D space. What would be a bidimensional array in C?

  15. #15
    Registered User
    Join Date
    Feb 2019
    Posts
    1,078
    Quote: Originally Posted by Modin
    @flp1969

    2. By thread id you must mean their address in memory? When I say "thread id" I just mean the identifier I have them assign themselves with that mutex locked line. For 48 threads the "ids" I'm talking about go from 0 to 47 and they are sequential. Should I pay attention to their address in memory?
    No... the thread id isn't a pointer per se. Look at the example. The thread id is the value pthread_create stores in the object its first argument points to, and it isn't sequential as in 0,1,2,3...!
    Quote: Originally Posted by Modin
    3. Hmmm I guess I just meant an array of arrays which can be visualised as 2D space. What would be a bidimensional array in C?
    If you know in advance how many elements you need to allocate, why use that loop to do partial allocations? It is simpler to allocate the whole array at once. And this line:
    Code:
      data->map[i][j] = return_something(*data, data->something + i, data->something + j);
    Can be confusing... Let's say you have:
    Code:
    struct S {
      int map1[10][10];
      int **map2;
    };
    ...
    struct S *data = malloc( sizeof *data );
    data->map2 = malloc( 10 * sizeof( int * ) );
    for ( int i = 0; i < 10; i++ )
      data->map2[i] = malloc( 10 * sizeof(int) );
    ...
    x = data->map1[3][2];
    y = data->map2[3][2];
    ...
    In the last 2 lines, without looking at the structure definition, which one is a pointer and which one is an array?
    Last edited by flp1969; 06-15-2022 at 02:56 PM.
