# bandpass filtering

• 08-01-2007
jmass16
bandpass filtering
Hey guys,
I have a little problem with a program that I wrote and I am a little stuck. I have a vector that consists of 20,000 sample points. These sample points create a sine wave that only consists of the negative half of the wave. Sometimes there is a "glitch" or "spike" in this wave that I want to filter out of my measurements but am stuck. I take a frequency measurement and a Min and Max measurement. Also, this has to work for a general case, i.e. a larger wave/smaller wave or even a wave with a smaller amplitude or higher frequency. Below is the excel file that I used to graph the points, then my program reads in these values one by one that populates a vector. I have also added the spike to this file to get a visual of the graph. This graph repeats itself all the way to 20000, but the spike does NOT get repeated.
Thanks!
• 08-01-2007
jmass16
sorry guys, there are 60 waves.
• 08-01-2007
matsp
So you are looking for data where a very few samples are "out of line", that is, big changes from one sample to the next while the rest of the data is changing slowly. Calculate the difference between each pair of consecutive samples. If a difference is more than <some arbitrarily chosen factor>, say 8 times, larger than the previous difference, and the next one or two samples also show a BIG difference, then discard those samples and replace them with the average of the samples before and after the "glitch" or "spike".
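
A minimal C++ sketch of this difference test, restricted to single-sample spikes. The function name `removeSingleSampleSpikes`, the `minJump` floor (which stops a near-zero previous difference from making everything look like a spike), and the extra check that the signal reverses direction at the spike are assumptions added for illustration, not part of the suggestion above. The factor of 8 is the example value from the post.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Replaces isolated single-sample spikes in place and returns how many were fixed.
// ratio   : how many times larger than the previous difference a jump must be.
// minJump : hypothetical floor for the previous difference (avoids a zero limit).
std::size_t removeSingleSampleSpikes(std::vector<double>& s,
                                     double ratio = 8.0,
                                     double minJump = 1e-6)
{
    std::size_t fixed = 0;
    for (std::size_t i = 2; i + 1 < s.size(); ++i) {
        const double prevDiff = std::fabs(s[i - 1] - s[i - 2]); // normal step just before
        const double jumpIn   = s[i] - s[i - 1];                 // step into the suspect sample
        const double jumpOut  = s[i + 1] - s[i];                 // step back out of it
        const double limit    = ratio * std::max(prevDiff, minJump);

        // A spike goes one way and immediately comes back the other way.
        const bool reverses = (jumpIn > 0.0) != (jumpOut > 0.0);

        if (reverses && std::fabs(jumpIn) > limit && std::fabs(jumpOut) > limit) {
            s[i] = 0.5 * (s[i - 1] + s[i + 1]); // average of the two neighbours
            ++fixed;
        }
    }
    return fixed;
}
```

This is one pass with a handful of arithmetic operations per sample. Spikes wider than one sample, or spikes in the first two or the last sample, are not handled by this sketch.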

--
Mats
• 08-01-2007
jmass16
Thanks for getting back to me Matsp. I also thought of doing it that way. The problem arises when you have a sample of, say, 2,500,000 points. That's a lot of overhead comparing each sample point, or even comparing every 5th sample point. Another problem is that when you're reading the wave, you can't take the average of the samples before and after if the data starts with a spike.

--Joe
• 08-01-2007
matsp
Passing through 2.5M samples should be doable - depends on how often you get new samples really - but a modern 2GHz processor should be able to do it in milliseconds - perhaps not single-digit milliseconds, but definitely less than a second (unless you do something stupid in your calculation).

As to the implementation, I'd try out a method that looks at a "small number of samples" for detecting spikes, and if there is a "sudden change" replace it - if you are at the very edge, just use the values after(when at the beginning)/before(when at the end) the glitch/spike to replace the values with - this may be "not perfect", but it's definitely better than the spike.
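
A rough C++ sketch of that edge handling, assuming the glitch has already been located as an inclusive index range. The name `patchRun` is hypothetical, and using a straight line between the two clean neighbours for runs longer than one sample is an assumption (for a single sample it reduces to the plain average).

```cpp
#include <cstddef>
#include <vector>

// Overwrites the glitch run s[first..last] (inclusive) using the nearest clean samples.
// Interior run : straight line between the sample before and the sample after.
// Run at start : copy the first clean sample after the run.
// Run at end   : copy the last clean sample before the run.
void patchRun(std::vector<double>& s, std::size_t first, std::size_t last)
{
    if (s.empty() || first > last || last >= s.size()) return;

    const bool hasBefore = first > 0;
    const bool hasAfter  = last + 1 < s.size();
    if (!hasBefore && !hasAfter) return; // nothing clean to copy from

    for (std::size_t i = first; i <= last; ++i) {
        if (hasBefore && hasAfter) {
            // t runs from just above 0 to just below 1 across the run.
            const double t = static_cast<double>(i - first + 1) /
                             static_cast<double>(last - first + 2);
            s[i] = s[first - 1] + t * (s[last + 1] - s[first - 1]);
        } else if (hasAfter) {
            s[i] = s[last + 1];
        } else {
            s[i] = s[first - 1];
        }
    }
}
```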

--
Mats
• 08-01-2007
brewbuck
Quote:

Originally Posted by jmass16
Hey guys,
I have a little problem with a program that I wrote and I am a little stuck. I have a vector that consists of 20,000 sample points. These sample points create a sine wave that only consists of the negative half of the wave. Sometimes there is a "glitch" or "spike" in this wave that I want to filter out of my measurements but am stuck. I take a frequency measurement and a Min and Max measurement. Also, this has to work for a general case, i.e. a larger wave/smaller wave or even a wave with a smaller amplitude or higher frequency. Below is the excel file that I used to graph the points, then my program reads in these values one by one that populates a vector. I have also added the spike to this file to get a visual of the graph. This graph repeats itself all the way to 20000, but the spike does NOT get repeated.
Thanks!

I was going to suggest a kind of filtering, but I think there's a better way, that meets your requirement of working on different frequencies.

There are some FINITE number of glitches, right? And ALL the glitches have absolute values LARGER than any legitimate part of the signal. Suppose that there will never be more than about 10 glitches. As you process the samples, keep an array of the 10 largest samples (and their indices) that you have seen. This is not as slow as it sounds, since you only need to do processing if the sample value is at least as large as the smallest of the 10 largest values (that's a mouthful) -- and that's a very cheap thing to check on every sample.

Once you've looked at the entire signal, look at the amplitudes of those largest 10 values. They might look something like this:

45.7
45.9
41.5
46.8
51.7
49.9
13.5
13.5
13.5
13.5

It's obvious that there is a "shelf" between the glitches (all in the range of about 41-51) and the "real" signal (13.5). You just need to find a way to detect this shelf. The samples all the way down to the shelf precisely pinpoint the glitches, which you can then go back and remove by averaging the surrounding samples in the manner matsp described.

The boon here is that you need to do very little processing per sample -- basically:

Code:

```
for each sample:
    // maxima is some fixed-size sorted array
    if sample.value >= maxima.smallest():
        maxima.insert(sample.value, sample.index) // This line executes only RARELY

last_glitch = detect_shelf(maxima)

for each max in maxima[0..last_glitch]:
    // Correct the sample:
    sample[max.index] = (sample[max.index - 1] + sample[max.index + 1]) / 2.0
```
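
One way the pseudocode might look in C++, as a sketch only. The names `Peak`, `largestPeaks`, `detectShelf` and `removeGlitches` are hypothetical, and the shelf rule used here (the largest ratio between neighbouring entries of the sorted list, accepted only if it exceeds `minRatio`) is an assumption, since the post leaves shelf detection open. Magnitudes are compared with `std::fabs` because the signal in question is negative.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct Peak {
    double magnitude;  // |sample value|
    std::size_t index; // position in the signal
};

// Keeps the `capacity` largest magnitudes seen, sorted in descending order.
std::vector<Peak> largestPeaks(const std::vector<double>& s, std::size_t capacity)
{
    std::vector<Peak> maxima;
    if (capacity == 0) return maxima;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const double m = std::fabs(s[i]);
        // Cheap test that rejects almost every sample once the list is full.
        if (maxima.size() == capacity && m <= maxima.back().magnitude) continue;
        auto pos = std::upper_bound(maxima.begin(), maxima.end(), m,
            [](double v, const Peak& p) { return v > p.magnitude; });
        maxima.insert(pos, Peak{m, i});
        if (maxima.size() > capacity) maxima.pop_back();
    }
    return maxima;
}

// Returns how many leading entries lie above the biggest gap (0 if no clear gap).
std::size_t detectShelf(const std::vector<Peak>& maxima, double minRatio = 2.0)
{
    std::size_t count = 0;
    double bestRatio = minRatio;
    for (std::size_t k = 0; k + 1 < maxima.size(); ++k) {
        if (maxima[k + 1].magnitude <= 0.0) break;
        const double r = maxima[k].magnitude / maxima[k + 1].magnitude;
        if (r > bestRatio) { bestRatio = r; count = k + 1; }
    }
    return count;
}

// Finds the glitches and replaces each with the average of its two neighbours.
std::size_t removeGlitches(std::vector<double>& s, std::size_t capacity = 10)
{
    const std::vector<Peak> maxima = largestPeaks(s, capacity);
    const std::size_t glitches = detectShelf(maxima);
    for (std::size_t k = 0; k < glitches; ++k) {
        const std::size_t i = maxima[k].index;
        if (i == 0 || i + 1 >= s.size()) continue; // no neighbour on one side
        s[i] = 0.5 * (s[i - 1] + s[i + 1]);
    }
    return glitches;
}
```

If there are more glitches than `capacity`, the list holds only glitches and no shelf is found, so the capacity should be chosen generously. Two glitches on adjacent samples would also need a wider repair than the neighbour average used here.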
• 08-02-2007
VirtualAce
If the distance between the samples is small, and the change between samples is therefore equally small, you can get away with some type of interpolation between the samples to minimize the glitch.

LI1=v1+xi*(v2-v1)
LI2=v3+xi*(v4-v3)
Final=LI1+yi*(LI2-LI1)

Where xi and yi are the linear interpolation values in range 0 to 1 for x and y respectively.

The following are true about these formulas.
• As xi approaches 0, LI1 approaches v1 and LI2 approaches v3
• As xi approaches 1, LI1 approaches v2 and LI2 approaches v4
• As yi approaches 0, Final approaches LI1
• As yi approaches 1, Final approaches LI2
• The midpoint is at xi=0.5, yi=0.5
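
The formulas above written out in C++, as a small sketch. The function names `linearInterp` and `bilinear` are made up for this example; the variables follow the post (`v1`..`v4`, `xi`, `yi`).

```cpp
// Linear interpolation: returns a when t is 0, b when t is 1,
// and a proportional blend of the two in between.
double linearInterp(double a, double b, double t)
{
    return a + t * (b - a);
}

// Two linear interpolations along x, then one along y between their results.
double bilinear(double v1, double v2, double v3, double v4, double xi, double yi)
{
    const double li1 = linearInterp(v1, v2, xi); // LI1 = v1 + xi*(v2 - v1)
    const double li2 = linearInterp(v3, v4, xi); // LI2 = v3 + xi*(v4 - v3)
    return linearInterp(li1, li2, yi);           // Final = LI1 + yi*(LI2 - LI1)
}
```

For a one-dimensional signal only the first step is needed: `linearInterp(before, after, 0.5)` is the plain average of the two neighbouring samples, and other values of `t` place the replacement proportionally between them when several samples in a row have to be filled in.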

This is a much better filter than a plain average of n samples:
Final=(s(0)+s(1)+....+s(n-1))/n

For a pattern of repeating identical waveforms the following is true:
f(x)=f(x+wavelength)
f(x+spikelength)=f(x+wavelength+spikelength)

Or more simply, the value at a point x will be the same as a value at a point x+wavelength since the wave cycles once from x to x+wavelength.

Therefore a simple copy operation is all that is needed to correctly reconstruct the affected wave.

Code:

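```
// Overwrite the span one wavelength after iStartSample with the matching
// samples starting at iStartSample (which are assumed to be clean).
for (int i = iStartSample; i < iStartSample + iSpikeLength; i++)
{
    Wave[i + iWaveLength] = Wave[i];
}
```
A sine function will create identical waveforms provided the wavelength is a whole number of samples, i.e. an integer multiple of the sampling interval.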
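
A slightly more defensive C++ sketch of the same copy, assuming the spike position and the wavelength (in samples) are already known. The function name `repairFromNeighbourCycle` and its parameters are hypothetical. It copies from the cycle before the spike and falls back to the cycle after it when the spike sits in the first cycle.

```cpp
#include <cstddef>
#include <vector>

// Repairs wave[spikeStart .. spikeStart + spikeLength) from a neighbouring cycle.
// Returns false if the arguments are out of range or no full neighbouring cycle exists.
bool repairFromNeighbourCycle(std::vector<double>& wave,
                              std::size_t spikeStart,
                              std::size_t spikeLength,
                              std::size_t waveLength)
{
    if (waveLength == 0 || spikeLength > waveLength) return false;
    if (spikeStart + spikeLength > wave.size()) return false;

    if (spikeStart >= waveLength) {
        // Use the same part of the previous cycle: f(x) = f(x - wavelength).
        for (std::size_t i = spikeStart; i < spikeStart + spikeLength; ++i)
            wave[i] = wave[i - waveLength];
        return true;
    }
    if (spikeStart + spikeLength + waveLength <= wave.size()) {
        // Spike is in the first cycle: use the next cycle, f(x) = f(x + wavelength).
        for (std::size_t i = spikeStart; i < spikeStart + spikeLength; ++i)
            wave[i] = wave[i + waveLength];
        return true;
    }
    return false;
}
```

This only gives an exact repair when every cycle really is identical and the wavelength is a whole number of samples.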
• 08-02-2007
matsp
Brewbuck - I like your idea (a little bit along the lines of the method used for benchmarking: take 5 samples, take away the highest and lowest, then average).

I would make sure that the "number of high samples" is definitely "too big" - not massive, but make sure you don't run out of space here, just in case there is a burst of spikes/glitches.

Bubba also has a good suggestion - look at the next / previous waveform - but that assumes that they are really supposed to be similar - if you have a decaying or increasing waveform (on purpose) then it doesn't quite work - likewise if the frequency changes all of a sudden.

The other solution is a hardware filter, of course... A suitable size capacitor over the input line would do the trick! But it may be difficult to do that without affecting the actual readout.

--
Mats
• 08-02-2007
VirtualAce
According to the plot of the waveform the trough of A matches the exact trough of B. So since A=B then the value at A+wavelength equals the value of B.

In actuality the plot of the waveform is all points in the first full cycle+wavelength*(wavenumber).

That is all points in the first wave exactly match all points in the following waves. In other words the signal is not changing throughout the plot, the volume is not changing since the amplitudes of the waves are identical, and the frequency is not changing since the wavelengths of all the wave forms are the same length - in this case trough to trough.

Quote:

These sample points create a sine wave that only consists of the negative half of the wave.
A sine wave will produce the exact same waveforms for each cycle of the wave as long as the frequency of the sine wave is constant throughout the plot and the volume is constant. Even if the volume was not constant you could still reproduce the original wave using simple algebra.

The first capture is an absolute 100 Hz stereo sine wave (only the positive upper half is produced) at -0.9 dB.
You could reproduce the entire length of the sample as long as you knew the values of the first full cycle. In fact if you split the first wave (first cycle) in half and duplicate it for the second half you really only need to produce 1/2 the cycle and the rest of the sample can be computed rather than generated.

The second capture is the same wave but has been pitch bent downward near the end of the sound sample. Notice the wavelengths are getting larger and larger and thus the frequency is getting lower and lower. However if the amount of pitch bend is known and is linear throughout the sample you could still compute all of the wave forms as long as the first half cycle was known.

The last capture illustrates the problem of smoothing the sample too much. This is Sony Sound Forge so I doubt they are doing a simple average but it illustrates the problem. In short, if you average the samples too much you will begin to lose volume. Average sample smoothing will affect the final volume of the samples in the interval of samples being averaged. As well it could induce clicks or pops at the transition from the normal sample values to the averaged sample values and vice versa. So at the start of the 'averaged patch' you may get a click and at the end you may also get a click since the transition is not completely accurate.
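
The loss of volume can be checked numerically. The following C++ sketch is an illustration only and says nothing about what Sound Forge does internally: it applies a simple centred moving average (hypothetical helper `movingAverage`) and measures the peak with `peakMagnitude`. For a sine with 100 samples per cycle, a 21-sample window lowers the peak from 1.0 to roughly 0.93.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Centred moving average; the window shrinks at the two ends of the signal.
std::vector<double> movingAverage(const std::vector<double>& s, std::size_t halfWidth)
{
    std::vector<double> out(s.size());
    for (std::size_t i = 0; i < s.size(); ++i) {
        const std::size_t lo = (i >= halfWidth) ? i - halfWidth : 0;
        const std::size_t hi = std::min(s.size() - 1, i + halfWidth);
        double sum = 0.0;
        for (std::size_t j = lo; j <= hi; ++j) sum += s[j];
        out[i] = sum / static_cast<double>(hi - lo + 1);
    }
    return out;
}

// Largest absolute sample value, i.e. the peak "volume" of the signal.
double peakMagnitude(const std::vector<double>& s)
{
    double peak = 0.0;
    for (double v : s) peak = std::max(peak, std::fabs(v));
    return peak;
}
```

The wider the window relative to the wavelength, the more the peak drops, which is why averaging only the few samples around a glitch is preferable to smoothing the whole signal.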

But in this instance the solution is simple since it is a sine wave. The only time it is difficult to reproduce the exact waveform is when the sound sample does not follow any simple mathematical formula. For instance if you record your voice saying "Hello" and look at the resulting waveforms....it would be very difficult, albeit not impossible with the correct software, to fix any portion of the sound sample.
• 08-02-2007
matsp
Bubba,

I didn't mean to say your idea wasn't workable, I was just pointing out that it works best/easiest when the waveform is at least relatively consistent.

Your further explanation shows how it would probably work really well for what the original poster wanted... Assuming the waveform is reasonably predictable that is.

--
Mats
• 08-02-2007
jmass16
You guys are AWESOME!!!! I will try these and let you guys know how it works out. Thanks again!!!
• 08-06-2007
jmass16
Bubba,
could you explain a little more on your first idea? I am having trouble understanding the formulas.