Originally Posted by jmass16
Hey guys,
I have a little problem with a program that I wrote and I am a little stuck. I have a vector that consists of 20,000 sample points. These sample points form a sine wave that consists of only the negative half of the wave. Sometimes there is a "glitch" or "spike" in this wave that I want to filter out of my measurements, but I am stuck. I take a frequency measurement and a Min and Max measurement. Also, this has to work for the general case, i.e. a larger wave, a smaller wave, or even a wave with a smaller amplitude or higher frequency. Below is the Excel file that I used to graph the points; my program reads in these values one by one and populates a vector with them. I have also added the spike to this file to get a visual of the graph. This graph repeats itself all the way to 20,000, but the spike does NOT get repeated.
Thanks!
I was going to suggest a kind of filtering, but I think there's a better way that meets your requirement of working at different frequencies.
There is some FINITE number of glitches, right? And ALL the glitches have absolute values LARGER than any legitimate part of the signal. Suppose that there will never be more than about 10 glitches. As you process the samples, keep an array of the 10 largest samples (and their indices) that you have seen so far. This is not as slow as it sounds: you only need to do any real work when a sample's value is at least as large as the smallest of those 10 largest values, and that comparison is very cheap to make on every sample.
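Here's a rough sketch of that bookkeeping in Python (my choice for illustration; your program may well be in another language). The function name `largest_samples` and the parameter `k` are made up for this example. It assumes "largest" means largest absolute value, and it uses a min-heap so the smallest of the current top `k` is always at the front for the cheap comparison. It replaces an entry only on a strictly larger magnitude, which avoids pointless work on ties.

```python
import heapq

def largest_samples(samples, k=10):
    """Return up to k (magnitude, index) pairs with the largest |value|, largest first.

    Hypothetical helper: the name and the default k=10 are assumptions.
    """
    heap = []  # min-heap: heap[0] is the smallest of the current top k
    for index, value in enumerate(samples):
        magnitude = abs(value)
        if len(heap) < k:
            # Still filling the table with the first k samples.
            heapq.heappush(heap, (magnitude, index))
        elif magnitude > heap[0][0]:
            # Rare path: this sample displaces the smallest of the top k.
            heapq.heapreplace(heap, (magnitude, index))
    return sorted(heap, reverse=True)
```

For a long signal with only a handful of spikes, almost every sample fails the `magnitude > heap[0][0]` test, so the loop is essentially one comparison per sample.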
Once you've looked at the entire signal, look at the amplitudes of those largest 10 values. They might look something like this:
45.7
45.9
41.5
46.8
51.7
49.9
13.5
13.5
13.5
13.5
It's obvious that there is a "shelf" between the glitches (all in the range of about 41-52) and the "real" signal (13.5). You just need to find a way to detect this shelf. The samples above the shelf precisely pinpoint the glitches, which you can then go back and remove by averaging the surrounding samples in the manner matsp described.
The advantage here is that you need to do very little processing per sample -- basically:
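One possible way to detect the shelf, sketched in Python: sort the magnitudes from largest to smallest and look for the biggest drop between neighbours. This is only one heuristic, not the only way to do it. The `min_ratio` parameter is an assumption of mine: it rejects drops that are small relative to the signal, so a clean capture with no glitches reports zero.

```python
def detect_shelf(magnitudes, min_ratio=1.5):
    """Return how many leading magnitudes sit above the shelf.

    magnitudes must be sorted from largest to smallest.
    min_ratio is a hypothetical tuning knob: the value just above the
    shelf must be at least min_ratio times the value just below it.
    """
    best_count, best_drop = 0, 0.0
    for i in range(len(magnitudes) - 1):
        upper, lower = magnitudes[i], magnitudes[i + 1]
        drop = upper - lower
        # Keep the largest drop that is also big relative to the signal.
        if drop > best_drop and upper >= min_ratio * lower:
            best_count, best_drop = i + 1, drop
    return best_count

# The ten values from the example above, sorted largest first.
top = sorted([45.7, 45.9, 41.5, 46.8, 51.7, 49.9, 13.5, 13.5, 13.5, 13.5],
             reverse=True)
glitch_count = detect_shelf(top)  # the six values from 41.5 upward
```

Note that if there are more glitches than slots in the table, every entry is a glitch and no shelf is visible, so the table size should comfortably exceed the number of glitches you expect.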
Code:
// maxima is some fixed-size array, sorted by magnitude (largest first)
for each sample:
    if abs(sample.value) >= maxima.smallest():
        maxima.insert(abs(sample.value), sample.index)  // This line executes only RARELY

// Entries 0..last_glitch of maxima lie above the shelf
last_glitch = detect_shelf(maxima)
for each max in maxima[0..last_glitch]:
    // Correct the sample (assumes both neighbours exist and are not glitches):
    samples[max.index] = (samples[max.index - 1] + samples[max.index + 1]) / 2.0
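The correction step above assumes each glitch is a single sample with two good neighbours. Here is a slightly more defensive Python sketch (the name `repair_glitches` is made up) that also copes with a glitch at either end of the buffer and with spikes several samples wide, by interpolating linearly between the nearest good samples on each side. For an isolated one-sample glitch it reduces to the same neighbour average.

```python
def repair_glitches(samples, glitch_indices):
    """Return a copy of samples with the glitch samples interpolated away.

    Hypothetical helper; glitch_indices are the positions above the shelf.
    """
    bad = set(glitch_indices)
    fixed = list(samples)
    n = len(samples)
    for i in sorted(bad):
        # Walk outwards to the nearest good sample on each side.
        left = i - 1
        while left >= 0 and left in bad:
            left -= 1
        right = i + 1
        while right < n and right in bad:
            right += 1
        if left >= 0 and right < n:
            # Linear interpolation between the two good neighbours.
            t = (i - left) / (right - left)
            fixed[i] = samples[left] + t * (samples[right] - samples[left])
        elif left >= 0:
            fixed[i] = samples[left]   # glitch runs to the end of the buffer
        elif right < n:
            fixed[i] = samples[right]  # glitch starts at the first sample
    return fixed
```

It returns a new list rather than editing in place, so the original capture is still around if you want to compare the Min and Max measurements before and after.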