Thread: Rewriting function into a more generic form.

  1. #1
    Registered User
    Join Date
    Sep 2020
    Posts
    15

    Rewriting function into a more generic form.

    Usually I don't pay much attention to how generic or flexible my code is. If something doesn't jump out at first sight as obviously reusable, I don't make it reusable. That's mainly because I don't share the pervasive OOP philosophy of devising highly abstracted, generic constructs wherever you can. I don't care about building "flexible" code that may help solve a variety of problems I may or may not face in the future; I want to solve my own current problem, now, in a simple, clear and efficient manner.

    Yet because my "sight" is rather short (I have the literal proficiency of an imbecile monkey when it comes to programming), some "obvious" things often escape me.
    For instance today I found myself writing this:
    Code:
    static const char *find_duplicate(const char *list, const char *items)
    {
        const unsigned  listlen = (unsigned)strlen(list);
        unsigned        itemlen;
        const char      *start = items;
        char            *delimiter;
    
        if (list == items)
        {    while (*items)
            {    delimiter = strchrnul(items, ',');
                itemlen = (unsigned)(delimiter - items);
                if ((items = text_contains(delimiter, listlen - (unsigned)(delimiter - start), items, itemlen) ) )
                    return (items);
                items = delimiter + !!*delimiter;
            }
        }
        else
        {    while (*items)
            {    delimiter = strchrnul(items, ',');
                itemlen = (unsigned)(delimiter - items);
                if ((items = text_contains(list, listlen, items, itemlen) ) ) //memmem wrapper with additional code to ensure exact rather than partial matches.
                    return (items);
                items = delimiter + !!*delimiter;
            }
        }
        return ((void*)0);
    }
    Later today, I found I needed to do the opposite: find the first differing, i.e. non-duplicate, item of two strings. That led me to revisit this code and wonder whether there is a good way to 1. merge both loops into one, in order to eliminate the awfully redundant code, and 2. alter the function to return the first non-duplicate item instead.

    I came up with this untested solution but I'm not thrilled about it. I wonder what cool alternatives the great minds that roam these forums may come up with.
    Code:
    static const char *find_item
    (const char *list, const char *items, const _Bool find_duplicate)
    {
        const unsigned  listlen = (unsigned)strlen(list);
        unsigned        itemlen;
        const char      *start = items;
        char            *delimiter;
    
        while (*items)
        {    const char *item = items; // start of the current item; items is overwritten below
            delimiter = strchrnul(items, ',');
            itemlen = (unsigned)(delimiter - items);
            if (list == start)
                items = text_contains(delimiter, listlen - (unsigned)(delimiter - start), items, itemlen);
            else
                items = text_contains(list, listlen, items, itemlen); //memmem wrapper with additional code to ensure exact rather than partial matches.
            if ((find_duplicate && items) || (!find_duplicate && !items) )
                return (items ? items : item); // the match, or the item that had none (items is NULL in that case)
            items = delimiter + !!*delimiter;
        }
        return ((void*)0);
    }

  2. #2
    Registered User
    Join Date
    Sep 2020
    Posts
    15
    Bonus question: I wonder if there is a way to simplify the ((find_duplicate && items) || (!find_duplicate && !items) ) expression.
    I've played around with it and can transform it into one form or another, but found no way to reduce it further, e.g.:
    Code:
    (a && b) || (!a && !b) = (a || !b) && (!a || b)

  3. #3
    Registered User
    Join Date
    Sep 2020
    Posts
    425
    You can use bitwise XOR (^) to reduce your expression, but only if both values are strictly 1 or 0, so the pointer has to be normalised first (e.g. with !! or a comparison against NULL).
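
    A minimal sketch of that reduction, assuming find_duplicate is a _Bool and items is a pointer, as in the find_item code earlier in the thread. The helper names (wanted_long, wanted_short, wanted_xor, check_equivalence) are made up for illustration. Comparing against NULL first turns the pointer into a 1 or 0, after which either == or a negated ^ gives the same truth table as the long expression:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The test from the thread, written out in full. */
static bool wanted_long(bool find_duplicate, const char *items)
{
    return (find_duplicate && items) || (!find_duplicate && !items);
}

/* Same truth table: true exactly when both operands have the same truth value. */
static bool wanted_short(bool find_duplicate, const char *items)
{
    return find_duplicate == (items != NULL);
}

/* XOR form: (items != NULL) is already 0 or 1, so the XOR is safe; negate it. */
static bool wanted_xor(bool find_duplicate, const char *items)
{
    return !(find_duplicate ^ (items != NULL));
}

/* Exhaustively compare the three forms over all four input combinations. */
static void check_equivalence(void)
{
    const char *ptrs[] = { NULL, "x" };
    for (int a = 0; a <= 1; a++) {
        for (int b = 0; b <= 1; b++) {
            bool expected = wanted_long(a != 0, ptrs[b]);
            assert(wanted_short(a != 0, ptrs[b]) == expected);
            assert(wanted_xor(a != 0, ptrs[b]) == expected);
        }
    }
}
```

    Inside find_item the condition could then read if (find_duplicate == (items != NULL)); whether that is clearer than the long form is a matter of taste.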

  4. #4
    Registered User
    Join Date
    Sep 2020
    Posts
    425
    Here's how I would do it, given that I do a lot of embedded code and that the standard library is aimed at null-terminated strings, and assuming I understand your requirements.

    Code:
    #include <stddef.h>  // NULL
    
    static int matches
    (const char *a, const char *b)
    {
      // Compare one element; an element ends at ',' or '\0'
      while(*a == *b) {
        // Check for the end before advancing, so we never read past a '\0'
        if(*a == ',' || *a == '\0')
          return 1;
        a++;
        b++;
      }
      // Still a match if one element ended with ',' and the other with '\0'
      return (*a == ',' || *a == '\0') && (*b == ',' || *b == '\0');
    }
    
    static const char *next_element
    (const char *a) {
       // Scan for next element separator
       while(*a != '\0' && *a != ',')
          a++;
    
       // End of string?
       if(*a == '\0')
          return NULL;
    
       // Skip over a comma
       a++;
    
       return a;
    }
    
    static const char *find_item
    (const char *list, const char *items, const _Bool find_duplicate)
    {
       const char *haystack = list;
       while(haystack) {  // For each item in the list being searched
          const char *needle = items;
          while (needle) {  // Check whether it is in the list of things we are comparing with
             if(matches(haystack,needle))
                break;
             needle = next_element(needle);
          }
          if(find_duplicate && needle != NULL)
              return haystack;  // Element in 'list' has been found in 'items'
    
          if(!find_duplicate && needle == NULL)
              return haystack;  // Element in 'list' was not found in 'items'
          haystack = next_element(haystack);
       }
       return NULL;
    }

  5. #5
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by hamster_nz
    assuming I understand your requirements
    I think Exosomes's find_duplicate function actually does two things:
    • Find the first duplicate token in a list of tokens given as a comma-separated string
    • Find the first matching token between two lists of tokens, each given as a comma-separated string

    The first behaviour is invoked when the caller passes the same string as both the first and second parameters; otherwise the second behaviour is invoked. Exosomes's find_item function extends find_duplicate with the option of negating the original behaviours.
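
    To make the two behaviours and their negation concrete, here is a rough, self-contained sketch. It is not Exosomes's implementation: the names find_item_sketch, find_token, next_token and token_len are hypothetical, it compares whole tokens with strcspn and memcmp instead of calling text_contains, and it always returns a pointer into items (the original returns whatever text_contains hands back when a duplicate is found).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Length of the token starting at s (up to the next ',' or the end). */
static size_t token_len(const char *s)
{
    return strcspn(s, ",");
}

/* Start of the token after the one at s, or NULL if s is the last token. */
static const char *next_token(const char *s)
{
    s += token_len(s);
    return *s == ',' ? s + 1 : NULL;
}

/* First token in list equal to tok[0..len), or NULL if there is none. */
static const char *find_token(const char *list, const char *tok, size_t len)
{
    for (const char *p = list; p != NULL; p = next_token(p))
        if (token_len(p) == len && memcmp(p, tok, len) == 0)
            return p;
    return NULL;
}

/*
 * want_match == true : first token of items that has a match.
 * want_match == false: first token of items that has no match.
 * When list and items are the same pointer, each token is compared only
 * with the tokens that follow it, so the last copy of a token counts as
 * "no match". An empty string is treated as one empty token.
 */
static const char *find_item_sketch(const char *list, const char *items,
                                    bool want_match)
{
    assert(list != NULL && items != NULL);
    for (const char *tok = items; tok != NULL; tok = next_token(tok)) {
        const char *where = list;
        if (list == items)
            where = next_token(tok);  /* NULL when tok is the last token */
        const char *hit = where ? find_token(where, tok, token_len(tok)) : NULL;
        if ((hit != NULL) == want_match)
            return tok;
    }
    return NULL;
}
```

    Passing the same pointer twice selects the single-list behaviour, e.g. find_item_sketch(s, s, true) with s pointing at "a,b,a,c" returns s itself, because the first "a" occurs again later.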
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  6. #6
    Registered User
    Join Date
    Sep 2020
    Posts
    15
    laserlight is correct, but I still like your solution.
    I find the different ways people go about solving the same problem interesting; it's always cool to see new approaches. Perhaps I'd update needle along with haystack after a successful match, saving an unnecessary call to matches (they are different; needle is one element behind).
    And props for not using library functions, although searching characters one by one is probably much slower.

  7. #7
    Registered User
    Join Date
    Sep 2020
    Posts
    425
    Oops - I got the requirements slightly off.

    On the speed front it really depends on your environment, and testing would be needed to say either way. The standard library functions (e.g. strlen(), strcmp()) aren't magical.

    I also have a slight worry that text_contains() is a problem. If you use it to look for "cat" or "dog" in "book,catalogue,dogma,index", does it report a match?
    Last edited by hamster_nz; 10-16-2020 at 10:14 PM.
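
    The concern above is about partial matches. Since the thread never shows text_contains, the following is only a guess at what such a check could look like: text_contains_sketch is a hypothetical stand-in, and a plain memcmp scan takes the place of memmem (a GNU extension) so that the example builds as standard C. A hit only counts if it is bounded by commas or by the ends of the text:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for text_contains(): returns the first occurrence of
 * item[0..itemlen) in text[0..textlen) that is a whole comma-separated token,
 * or NULL if there is none.
 */
static const char *text_contains_sketch(const char *text, size_t textlen,
                                        const char *item, size_t itemlen)
{
    assert(text != NULL && item != NULL);
    if (itemlen > textlen)
        return NULL;
    for (size_t i = 0; i + itemlen <= textlen; i++) {
        if (memcmp(text + i, item, itemlen) != 0)
            continue;
        /* The hit must start at the beginning of the text or right after a comma... */
        int starts_token = (i == 0 || text[i - 1] == ',');
        /* ...and end at the end of the text or right before a comma. */
        int ends_token = (i + itemlen == textlen || text[i + itemlen] == ',');
        if (starts_token && ends_token)
            return text + i;
    }
    return NULL;
}
```

    With this check, looking for "cat" or "dog" in "book,catalogue,dogma,index" yields NULL, while "dogma" is found at offset 15.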

  8. #8
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    I'd say another consideration might be: just how large are these lists of tokens in production, and how long does each token tend to be? If these lists are long enough and comparing tokens expensive enough, then it might make sense to explicitly parse the source strings first into lists of token objects, then sort these lists of token objects in order to do single pass matching, at the cost of additional memory proportional to the size of the lists.

    EDIT:
    Oh, but that will change the behaviour though, since you wouldn't be finding the same first match/duplicate.
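
    A rough sketch of the parse-then-sort idea, under some assumptions: struct token, token_cmp, parse_sorted and lists_share_token are invented names, tokens are kept as pointer-plus-length views into the original strings, and the function only reports whether any token is shared. As the EDIT notes, sorting loses the original order, so it cannot say which match came first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* A view of one token inside the original comma-separated string. */
struct token { const char *text; size_t len; };

/* Order tokens by content; when one is a prefix of the other, shorter first. */
static int token_cmp(const void *pa, const void *pb)
{
    const struct token *a = pa, *b = pb;
    size_t n = a->len < b->len ? a->len : b->len;
    int c = memcmp(a->text, b->text, n);
    if (c != 0)
        return c;
    return (a->len > b->len) - (a->len < b->len);
}

/* Split s into a sorted, malloc'd array of tokens; NULL on allocation failure. */
static struct token *parse_sorted(const char *s, size_t *count)
{
    size_t n = 1;                       /* number of tokens = commas + 1 */
    for (const char *p = s; *p; p++)
        n += (*p == ',');
    struct token *toks = malloc(n * sizeof *toks);
    if (toks == NULL)
        return NULL;
    size_t i = 0;
    for (;;) {
        size_t len = strcspn(s, ",");
        toks[i].text = s;
        toks[i].len = len;
        i++;
        if (s[len] == '\0')
            break;
        s += len + 1;                   /* step over the comma */
    }
    assert(i == n);
    qsort(toks, n, sizeof *toks, token_cmp);
    *count = n;
    return toks;
}

/* Returns 1 if the lists share a token, 0 if not, -1 on allocation failure. */
static int lists_share_token(const char *a, const char *b)
{
    size_t na = 0, nb = 0;
    struct token *ta = parse_sorted(a, &na);
    struct token *tb = parse_sorted(b, &nb);
    int result = -1;
    if (ta != NULL && tb != NULL) {
        size_t i = 0, j = 0;
        result = 0;
        while (i < na && j < nb) {      /* single pass over both sorted arrays */
            int c = token_cmp(&ta[i], &tb[j]);
            if (c == 0) { result = 1; break; }
            if (c < 0) i++; else j++;
        }
    }
    free(ta);
    free(tb);
    return result;
}
```

    The sort costs O(n log n) per list plus the extra arrays, so it only pays off when the lists are long; for a handful of short tokens the nested scans earlier in the thread are likely simpler and fast enough.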
    Last edited by laserlight; 10-16-2020 at 10:54 PM.
