Thread: Two int Arrays, remove duplicates from second

  1. #1
    Registered User
    Join Date
    Nov 2008
    Location
    Phoenix
    Posts
    70

    Two int Arrays, remove duplicates from second

    Code:
    /*
        generals is the first array. Max 10 elements.
        numGenerals is the element count of generals. 
    
        genBuff is the second array; it is to be checked/pruned.
        genCount is the element count of genBuff.
        genBuff will be a max of 171, but be pruned to no more than 10, and no more than the complement of the element count of generals.
    */
    
        for(int i = 0; i < numGenerals; i++) 
            for(int j = 0; j < genCount; j++)
                if(generals[i] == genBuff[j])
                {
                    for(int k = j; k < genCount - 1; k++)
                        genBuff[k] = genBuff[k + 1];
                    genCount--;
                }
    Is there a better way to do this that is still easy for someone to read and understand? (I do have comments in the actual source, different from above).

    I have two int arrays. They hold values from 0 to 170. The first one will never be more than 10. The second will be at most 171, but will be whittled down to at most 10, usually less. 171 is worst case, most users of this particular program will probably be reasonable and not try to add all 171 (max is 10 anyway). The first array is the original array. The second array is a temporary array. Any value in the second array that is also found in the first array, is removed from the second array, since all values in the first one must be unique. After this pruning process, both arrays will collectively contain no more than 10 unique elements; the elements from the second will be added to the first.

    So right now I have three nested loops. I figured with the miniscule array sizes it wouldn't be a big deal. I can think of a way to remove one or two of them, but I want to be sure that I'm still writing clean, legible, good-practice code. The first loop walks through the first array. For each element in the first array, there is a second loop to walk through the second array to check for duplicates. If a duplicate is found, the third loop walks through the second array to overwrite the duplicate while preserving the second loop's position (j).

    Is this dumb? I know that the big O gets worse and worse the deeper you go with nested loops. Even though the arrays are really tiny, is this still a thing to avoid?

  2. #2
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    You could sort both arrays, then a single parallel pass through both arrays would be enough to find the elements from the second array that are in the first array. Or since the first array will always be tiny, sorting the second array will suffice, upon which you can use binary search to find the elements from the second array that match those in the first array. However, "removing" a duplicate from the second array by shifting elements will take linear time, i.e., the same time complexity as finding if a given element from the first array is in the second array without sorting the second array. Therefore, this sorting will not be all that useful.

    Quote Originally Posted by Sorin
    I figured with the miniscule array sizes it wouldn't be a big deal.
    Probably.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  3. #3
    Registered User
    Join Date
    Aug 2005
    Location
    Austria
    Posts
    1,990
    Your function doesn't work. After shifting the elements back you need to decrement j as well. Otherwise you will not detect if the next element is a duplicate as well.
    Kurt

  4. #4
    Stoned Witch Barney McGrew's Avatar
    Join Date
    Oct 2012
    Location
    astaylea
    Posts
    420
    You don't need to continuously shift the elements back when you find one in the array. All you really need to do is one pass through the array like so:


    Code:
    size_t rm(int *a, size_t n, int v)
    {
        size_t to, from;
    
        for (to = from = 0; from < n; to += a[from++] != v)
            a[to] = a[from];
        return to;
    }
    I suggest you break your code up into fewer smaller functions that each do a single thing. You'll find it easier to manage that way.

  5. #5
    Registered User
    Join Date
    Nov 2008
    Location
    Phoenix
    Posts
    70
    I guess there's a few things I should have mentioned that I took for granted.

    Quote Originally Posted by laserlight View Post
    You could sort both arrays, then a single parallel pass through both arrays would be enough to find the elements from the second array that are in the first array. Or since the first array will always be tiny, sorting the second array will suffice, upon which you can use binary search to find the elements from the second array that match those in the first array. However, "removing" a duplicate from the second array by shifting elements will take linear time, i.e., the same time complexity as finding if a given element from the first array is in the second array without sorting the second array. Therefore, this sorting will not be all that useful.
    First is that the arrays are both already sorted. The first array is built from a binary file, and the elements are always present in the file in sorted order (lowest to highest).

    Quote Originally Posted by ZuK View Post
    Your function doesn't work. After shifting the elements back you need to decrement j as well. Otherwise you will not detect if the next element is a duplicate as well.
    Kurt
    Second, is that the second array will never contain duplicates in itself. The array is generated by Windows through a listbox with the multi-selection style, so the array will always contain unique elements within itself since it is just a list of selected indexes. Plus, they're sorted.

    BUT, I do see your point in that if I was working with an array where duplicates were possible within itself, I'd need to decrement j after moving everything over in order to not skip over elements, so thanks for pointing that out.

    Thanks to both of you for the pointers, I appreciate the input.

    Quote Originally Posted by Barney McGrew View Post
    You don't need to continuously shift the elements back when you find one in the array. All you really need to do is one pass through the array like so:


    Code:
    size_t rm(int *a, size_t n, int v)
    {
        size_t to, from;
    
        for (to = from = 0; from < n; to += a[from++] != v)
            a[to] = a[from];
        return to;
    }
    I suggest you break your code up into fewer smaller functions that each do a single thing. You'll find it easier to manage that way.
    I'll have to decipher what's going on here in the morning. I assume this is still considered good code, and that I just need more practice reading other peoples code.
    Last edited by Sorin; 04-03-2013 at 02:23 AM.

  6. #6
    Lurking whiteflags's Avatar
    Join Date
    Apr 2006
    Location
    United States
    Posts
    9,612
    Barney's code can be rewritten so it is easier to understand.

    Code:
    for ( from = to = 0; from < n; from++ ) {
      a[to] = a[from];
      if ( a[from] != v ) to++;
    }
    If it makes any difference to you this code will result in as many assignments as there are duplicates. So in the array (1, 2, 2, 2, 3) the two in the second position is actually assigned three times, once for each 2, and then the to counter gets incremented once to properly place the 3, e.g. (1, 2, 3).

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. C++ compare two arrays and display duplicates
    By codechick in forum C++ Programming
    Replies: 21
    Last Post: 02-09-2012, 04:39 PM
  2. Strings and Structs (remove duplicates)
    By christianB in forum C Programming
    Replies: 44
    Last Post: 07-25-2011, 09:01 PM
  3. recursive remove duplicates from a string
    By transgalactic2 in forum C Programming
    Replies: 47
    Last Post: 12-20-2009, 02:02 AM
  4. Replies: 6
    Last Post: 11-28-2004, 11:01 AM
  5. Really basic remove duplicates from array
    By Chris Fowler in forum C++ Programming
    Replies: 7
    Last Post: 11-25-2002, 10:35 PM