# Thread: Intersection of two arrays in O(n) complexity

1. ## Intersection of two arrays in O(n) complexity

The question is to find the common elements in two arrays. The code I wrote has O(n²) complexity.
Code:
```
#include <stdio.h>

int main(void)
{
    int i, j;
    int a[] = {1, 2, 3, 4};
    int b[] = {4, 5, 7, 1};

    for (i = 0; i < 4; i++)
        for (j = 0; j < 4; j++)
            if (a[i] == b[j])
                printf("%d ", a[i]);

    return 0;
}
```
How can it be done in linear time?

2. Originally Posted by malwaregeek
The question is to find the common elements in two arrays. The code I wrote has O(n²) complexity.

How can it be done in linear time?
The obvious answer to me is that it can be done if the arrays are sorted.

Tim S.

Yes, in that case it could definitely be done in linear time. But it can also be done for unsorted arrays: a friend of mine suggested using a hash map, but although I know it theoretically, I have no idea how to implement it.

4. Originally Posted by malwaregeek
Yes, in that case it could definitely be done in linear time. But it can also be done for unsorted arrays: a friend of mine suggested using a hash map, but although I know it theoretically, I have no idea how to implement it.
With a hash map you would get O(n log n) time with O(n) additional memory. Pretty good, but not O(n) time.

You can achieve O(n) time if the range of values is reasonably small: you set up an array of bits, one for each possible value, all initialized to zero. Then mark with ones the bits of the values present in one array. Then traverse the other array, checking whether its values were marked...

5. Oops. Sorry, I was thinking in a traditional [tree] map... With a hash map you can also get O(n) time, but only if you make the hash table reasonably big (proportional to the number of different values) and you avoid storing duplicates separately.

6. Originally Posted by comocomocomo
Oops. Sorry, I was thinking in a traditional [tree] map... With a hash map you can also get O(n) time, but only if you make the hash table reasonably big (proportional to the number of different values) and you avoid storing duplicates separately.
I was going to say. In practice the memory and computational overhead of a hash table is too high, and the way to do it is to sort both arrays. But a hash table is an O(N) way of retrieving a database entry by content.

7. I once began writing a modified IntroSelect-style algorithm that operates on both arrays at once to check whether their contents are the same, but allows early outs, which could mean it typically beats the hashing option in practice. Of course, it requires that the array ordering need not be maintained. I never quite got it working properly in all cases, and I can't remember where the code went.

I'd probably just sort both and compare. Radix sort can be used to make that linear time, but if I were that intent on having it linear I'd probably just use a hash table.