Thread: Advice on refactoring legacy code to be threadsafe

  1. #1
    Registered User
    Join Date
    Jun 2020
    Posts
    6

    Advice on refactoring legacy code to be threadsafe

    Hi all. New to the forums, let me know if this is not the right place for this kind of question. I'm looking for guidance on some extensive refactoring of a multithreaded (pthreads) C application I've inherited at work.

    The code is riddled with global state variables which are modified in many places, often with no synchronization. Some of these are necessary due to the nature of the application. It runs in an embedded Linux environment, but memory/CPU constraints aren't extremely tight.

    I'm now familiar with its many ins and outs, and have started cleaning up sections, adding synchronization and writing tests. Tests are challenging because of the large number of globals.

    What I don't feel completely comfortable with is spending time refactoring the overall structure, like how data is shared between threads. The problem is I don't know what is considered good practice/standard for this, and am having trouble finding resources to consult.

    For example, pthread_create's last argument arg is currently NULL in all calls in the code, and variables are instead shared via extern in header files. I would prefer the variables be declared in main, then shared via a struct through arg. This would tighten up their scope, and force us to think about what really needs to be shared vs. just making a new global variable.

    Another technique I've started to use is switching from a variable with global scope to file scope, then writing get/set functions with a file scope mutex. I can reason why this is an improvement, but again, I don't know if there's a standard, superior way of doing this.

    Sorry for the rambling question, I felt background was important. I guess it could be summarized as 'Do you have any recommendations on best practices for multithreaded C programming, when some global state is necessary?'

  2. #2
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,661
    Mmm, where to begin...

    Are your threads arranged in producer/consumer pairs, where one thread does some work and then passes it on.

    Or as a worker pool, where a controller divides up work into a number of segments, passes each work unit to a thread and then waits around to collect the results.

    > The code is riddled with global state variables which are modified in many places, often with no synchronization.
    The problem with your incremental approach is that you don't know how finely balanced the existing system is. Any change you make can have far reaching consequences that could take a long time before they become apparent.

    > Tests are challenging because of the large number of globals.
    The tests you need at this stage should be treating your code as a black box. Data goes in, results come out.
    A large corpus of input (both real world and simulated) is about the best you can hope for.
    So for each change you make, you have a known data set to feed in, to compare with a known set of corresponding results.

    > Another technique I've started to use is switching from a variable with global scope to file scope, then writing get/set functions with a file scope mutex.
    This risks turning your multi-threaded code into a poorly performing lock-happy single-threaded application because every single thread winds up in someone else's queue for no good reason.

    One thing to establish is which variables are used by which threads, and when.
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <pthread.h>
    #include <unistd.h>
    #include <sys/types.h>
    #include <time.h>
    
    #define MAX_EVENTS 1000000
    typedef struct {
      pthread_t       tid;
      struct timespec time;
    } event_t;
    event_t events[MAX_EVENTS];
    int current_event = 0;
    void record_time(void) {
      int n = __atomic_fetch_add(&current_event,1,__ATOMIC_SEQ_CST);
      if( n < MAX_EVENTS ) {
        events[n].tid = pthread_self();
        clock_gettime(CLOCK_REALTIME, &events[n].time);
      }
    }
    void print_times(FILE *fp) {
      for(int i = 0 ; i < current_event; i++ ) {
        fprintf(fp,"%lx %ld:%ld\n", events[i].tid,
                events[i].time.tv_sec, events[i].time.tv_nsec);
      }
    }
    
    // Some global variable which may (or may not) be shared.
    int global_var;
    
    #define MAX_LOOP 10000
    
    void *worker(void *p) {
      for(int i = 0 ; i < MAX_LOOP ; i++ ) {
        if( rand() % 100 < 50 ) {
          global_var++;
          record_time();
        }
        usleep(1);
      }
      return NULL;
    }
    
    int main()
    {
      pthread_t tids[5];
    
      for(int i = 0 ; i < 5 ; i++ ) {
        pthread_create(&tids[i],NULL,worker,NULL);
      }
      for(int i = 0 ; i < 5 ; i++ ) {
        pthread_join(tids[i],NULL);
      }
    
      printf("Final result in global=%d\n", global_var);
      printf("Number of debug records=%d\n", current_event);
      print_times(stdout);
    }
    
    $ ./a.out | head
    Final result in global=24862
    Number of debug records=24867
    7f37e89eb700 1592026468:827352674
    7f37ea1ee700 1592026468:827395393
    7f37e89eb700 1592026468:827410460
    7f37e81ea700 1592026468:827431857
    7f37ea1ee700 1592026468:827460899
    7f37e81ea700 1592026468:827499052
    7f37e91ec700 1592026468:827500700
    7f37ea1ee700 1592026468:827519506
    Maybe you'll find that particular variables are used by only one thread.
    Maybe you'll find that particular variables are used by two threads, but at very different times.

    Perhaps tweak the tracer to
    - record whether it was a read or write access
    - record only the first and last access times on a per thread basis
    - record the number of times on a per thread basis


    Another option (maybe) is to use helgrind, part of valgrind.
    Code:
    $ gcc -g bar.c -pthread
    $ valgrind --tool=helgrind ./a.out 
    ==6496== Helgrind, a thread error detector
    ==6496== Copyright (C) 2007-2015, and GNU GPL'd, by OpenWorks LLP et al.
    ==6496== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
    ==6496== Command: ./a.out
    ==6496== 
    ==6496== ---Thread-Announcement------------------------------------------
    ==6496== 
    ==6496== Thread #3 was created
    ==6496==    at 0x51643DE: clone (clone.S:74)
    ==6496==    by 0x4E46149: create_thread (createthread.c:102)
    ==6496==    by 0x4E47E83: pthread_create@@GLIBC_2.2.5 (pthread_create.c:679)
    ==6496==    by 0x4C34BB7: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
    ==6496==    by 0x400A7F: main (bar.c:52)
    ==6496== 
    ==6496== ---Thread-Announcement------------------------------------------
    ==6496== 
    ==6496== Thread #2 was created
    ==6496==    at 0x51643DE: clone (clone.S:74)
    ==6496==    by 0x4E46149: create_thread (createthread.c:102)
    ==6496==    by 0x4E47E83: pthread_create@@GLIBC_2.2.5 (pthread_create.c:679)
    ==6496==    by 0x4C34BB7: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
    ==6496==    by 0x400A7F: main (bar.c:52)
    ==6496== 
    ==6496== ----------------------------------------------------------------
    ==6496== 
    ==6496== Possible data race during read of size 4 at 0x6010A0 by thread #3
    ==6496== Locks held: none
    ==6496==    at 0x400A09: worker (bar.c:37)
    ==6496==    by 0x4C34DB6: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
    ==6496==    by 0x4E476B9: start_thread (pthread_create.c:333)
    ==6496== 
    ==6496== This conflicts with a previous write of size 4 by thread #2
    ==6496== Locks held: none
    ==6496==    at 0x400A12: worker (bar.c:37)
    ==6496==    by 0x4C34DB6: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
    ==6496==    by 0x4E476B9: start_thread (pthread_create.c:333)
    ==6496==  Address 0x6010a0 is 0 bytes inside data symbol "global_var"
    ==6496== 
    ==6496== ----------------------------------------------------------------
    ==6496== 
    ==6496== Possible data race during write of size 4 at 0x6010A0 by thread #3
    ==6496== Locks held: none
    ==6496==    at 0x400A12: worker (bar.c:37)
    ==6496==    by 0x4C34DB6: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
    ==6496==    by 0x4E476B9: start_thread (pthread_create.c:333)
    ==6496== 
    ==6496== This conflicts with a previous write of size 4 by thread #2
    ==6496== Locks held: none
    ==6496==    at 0x400A12: worker (bar.c:37)
    ==6496==    by 0x4C34DB6: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
    ==6496==    by 0x4E476B9: start_thread (pthread_create.c:333)
    ==6496==  Address 0x6010a0 is 0 bytes inside data symbol "global_var"
    ==6496== 
    Final result in global=9
    Number of debug records=9
    5c27700 1592027935:822210625
    5c27700 1592027935:822605329
    6828700 1592027935:823178972
    5c27700 1592027935:823200774
    5c27700 1592027935:823276338
    5c27700 1592027935:823405036
    6828700 1592027935:823456170
    5c27700 1592027935:823470497
    5c27700 1592027935:823535862
    ==6496== 
    ==6496== For counts of detected and suppressed errors, rerun with: -v
    ==6496== Use --history-level=approx or =none to gain increased speed, at
    ==6496== the cost of reduced accuracy of conflicting-access information
    ==6496== ERROR SUMMARY: 14 errors from 2 contexts (suppressed: 101 from 7)
    For your program, you're going to get a LOT of output.

    I think before you do too much actual code changing, you need to gather as much information as possible.
    - probing the thread scope of each global variable.
    - creating a design for what you have (warts and all).
    - creating a design for what you want (as if you were starting from afresh).
    - work out a planned transformation from where you are to where you want to be.

    Bit of a long ramble I'm afraid.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    Registered User
    Join Date
    Jun 2020
    Posts
    6
    Thanks, that's a lot of helpful advice. Determining thread scope sounds like a good starting place. I've found many variables already which have unnecessarily broad scope.

    Some of the threads are producers in that they read in sensor data/user input, which is then shared via globals with other threads. The bulk of the threads are continuously running loops which take various actions and modify global/file scope state variables. Actually, your question is making me consider why the threads were even structured this way in the first place. Separate threads for processing/sharing inputs makes sense, but I bet many of the other threads could be consolidated. It is possible they were originally split off to separate the state/logic of different modules, but they bleed into the others enough that this probably isn't even a meaningful distinction any more.

    One of the challenges with black box testing is the application is not meant to terminate under normal circumstances, nor does it really have outputs. It's an infinite loop that runs a device. Inputs can occur at any time (not just the start of the program). I could come up with a list of inputs and corresponding required actions, and use that for testing.

    Thanks again. This has given me a good place to start.

  4. #4
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,661
    > nor does it really have outputs.
    There must be an observable consequence of doing all this work surely.
    Otherwise, you could just write "while(1);" and call it a day.

    > Some of the threads are producers in that they read in sensor data/user input
    Is this all because someone couldn't figure out how to do a non-blocking read?

    Assuming that all your sensors and user input are accessed via file descriptors, then the select() system call is your friend.
    Not only can it monitor a whole set of descriptors to see if anything interesting has happened, it can also timeout if nothing happens within some interval.

    You end up with a nice single-threaded function with a loop dispatcher watching all the descriptors, calling whatever handler function is appropriate for the active descriptors.

    > One of the challenges with black box testing is the application is not meant to terminate under normal circumstances
    But you can package 99% of the code to make it testable.
    Code:
    int main ( ) {
    #ifdef TEST_MODE
      for(int i = 0 ; i < NUM_TESTS ; i++ ) {
        do_work();
      }
    #else
      while(1) {
        do_work();
      }
    #endif
    }
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  5. #5
    Registered User
    Join Date
    Jun 2020
    Posts
    6
    It is entirely possible that this is because no one thought of the single threaded, non-blocking read approach. I used select to add a feature and was asked what it did, so I'm sure no one else was aware of it before. Code was originally written a long time ago and has been added to over many years.

    I'll have to do a bit more thinking and sit down with pen and paper to sketch out what that would look like exactly, but I can't think of any reason off the top of my head that it wouldn't work.

    Thanks again. This is exactly what I was looking for.

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. legacy code problem
    By baxy in forum C++ Programming
    Replies: 3
    Last Post: 04-09-2015, 10:48 AM
  2. Attempting to make a threadsafe construct
    By golfinguy4 in forum C++ Programming
    Replies: 14
    Last Post: 04-06-2010, 01:07 PM
  3. C problem with legacy code
    By andy_baptiste in forum C Programming
    Replies: 4
    Last Post: 05-19-2008, 06:14 AM
  4. Refactoring
    By kawshik in forum C Programming
    Replies: 13
    Last Post: 06-28-2007, 03:52 AM
  5. C++ Refactoring Tools
    By SpaceCadet in forum C++ Programming
    Replies: 3
    Last Post: 10-11-2006, 06:03 PM

Tags for this Thread