Hello,
I'm working with a large data set and was wondering if there were some statistical shortcuts. I'm trying to compute the mode of a data set and can only think of an O(n^2) solution. Ordinarily, this wouldn't be too bad, but I feel like there might be something else that I could do. This is what I was thinking of trying:
Code:
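int i, j, count = 0, temp, mode = nums[0];
for (i = 0; i < setMax; i++)
{
    temp = 0;
    /* count how many entries equal nums[i] */
    for (j = 0; j < setMax; j++)
        if (nums[j] == nums[i]) temp++;
    if (temp > count)
    {
        count = temp;   /* remember the highest frequency seen so far */
        mode = nums[i];
    }
}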
When I used the MODE function in Excel, there was a noticeable delay (approx. 0.25 sec) between pressing Enter and the function returning its result, which was around 1.7. Since I have to compute the mode many times, I hope that I can optimize it without having to do something too complicated. Obviously, I could use a linked list with frequencies, but then other parts of my code would slow down unreasonably.
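One option that might avoid both the nested loop and a separate frequency list is to sort a copy of the data and count runs of equal values, which should be O(n log n) overall. Below is a rough sketch under some assumptions: the values are doubles (the 1.7 result suggests so, but the type of nums isn't shown above), and mode_sorted and cmp_double are made-up names for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Comparison callback for qsort: orders doubles ascending. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper: returns the most frequent value in nums[0..n-1].
   Assumes n > 0 and that the data are doubles.
   Sorts a copy, so the caller's array keeps its original order.
   On ties the smallest value wins; if malloc fails it returns nums[0]. */
double mode_sorted(const double *nums, size_t n)
{
    double *copy = malloc(n * sizeof *copy);
    if (copy == NULL)
        return nums[0];
    memcpy(copy, nums, n * sizeof *copy);
    qsort(copy, n, sizeof *copy, cmp_double);

    double mode = copy[0];
    size_t best = 0, run = 0;
    for (size_t i = 0; i < n; i++) {
        /* extend the current run of equal values, or start a new one */
        run = (i > 0 && copy[i] == copy[i - 1]) ? run + 1 : 1;
        if (run > best) {
            best = run;
            mode = copy[i];
        }
    }
    free(copy);
    return mode;
}
```

With this, something like mode_sorted(nums, setMax) would replace the double loop. If the array's order doesn't matter elsewhere, sorting it in place would save the copy.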
Thanks for your help.