# Thread: Best use of FFT

1. ## Best use of FFT

Well, there isn't really a proper board for this, so I'll post it here... I'm sorry if this is the wrong section .

I've implemented FFT. I understand it. But my question is: how best to use it?
I'd like to input some audio samples and get the frequencies that are actually most present in this sound (or the part of the sound). I've tried to input a simple sine function and while it does report the proper frequency to be most present, the frequencies before and after it seem to be pretty 'present' as well. That is, the intensity of the proper frequency is highest, but those of a few hundred Hz more or less are quite high as well. Is there a proper way to distinguish properly between them?
Also, what is the best number of points (samples) to use?
And finally, what is the best way to input the data. Let's say I'm using 4096 points (samples) (I currently am). Should I input the first 4096 samples, then the next 4096? Or is there a better method, for instance: first input 4096, then the last 2048 of the last buffer and the next 2048 bytes, and so on? So, should the input overlap or not?

EVOEx

2. >> the frequencies before and after it seem to be pretty 'present' as well
I'd imagine that's because the signal isn't periodic. If the signal that is transformed is not periodic (i.e. an integer number of periods), power leaks into non fundamental frequencies, it's called spectral leakage. It can be reduced by windowing. A window forces the signal to be periodic by convolving the function kernel in time domain so the non-periodicy of the signal contributes less to the side bands.

As for N, I can't remember to be honest. 4096 seems high, but it depends on what it's being used for. The overlap you describe is called Welch's Method, I think. It gives good resolution by virtue of the overlap, but doesn't require as many points. Of course it increases computation but gives better results. If your application demands use it. Otherwise don't.

Also, if you don't have 4096 (for example) data points you can always zero-pad the buffer with 0's at the end. Can make the spectrum look better.

3. Thank you for your response. I still have to read the link you send more thoroughly to understand it properly, but I think I found another bug.
I generated the input for the FFT like this:
Code:
`sin(i * 2.0 * M_PI / 98)`
So the wave has a period of 98 samples, at 44100 Hz, so it should generate an output of 450 Hz.
Code:
```0.944744 on frequency 0
0.953641 on frequency 43.0664
0.981347 on frequency 86.1328
1.03121 on frequency 129.199
1.11001 on frequency 172.266
1.23057 on frequency 215.332
1.41819 on frequency 258.398
1.72835 on frequency 301.465
2.30799 on frequency 344.531
3.71084 on frequency 387.598
11.4264 on frequency 430.664     <-- Obvious peek
8.90731 on frequency 473.73
3.03495 on frequency 516.797
1.77379 on frequency 559.863
1.22751 on frequency 602.93
0.924448 on frequency 645.996
** Lots of lines **
0.00590028 on frequency 21963.9
0.00590019 on frequency 22006.9```
This looks OK to me (although I'm really unexperienced with audio processing, this is the first thing I'm doing). Even though the other frequencies show up a bit, it's easy to see around 450 Hz is most present.

When I change the function to 947.461 Hz, using:
Code:
`sin(i * 2.0 * M_PI / 46.545451475);`
And the output is quite good:
Code:
```1.42468e-07 on frequency 0
1.7196e-07 on frequency 43.0664
2.40527e-07 on frequency 86.1328
** Lots of lines ***
1.10611e-05 on frequency 861.328
2.26838e-05 on frequency 904.395
16 on frequency 947.461                 <-- Obvious peek
2.37334e-05 on frequency 990.527
1.21123e-05 on frequency 1033.59
** Lots of lines ***
1.42481e-07 on frequency 21963.9
1.42479e-07 on frequency 22006.9```

So they both look good to me. However, now I try to combine them using the following function:
Code:
`sin(i * 2.0 * M_PI / 46.545451475)*sin(i * 2.0 * M_PI / 98)`
I would think this should output peeks at the frequencies 450 and 947.461. However, what I receive is this:
Code:
```0.0940861 on frequency 0
0.100144 on frequency 43.0664
0.11747 on frequency 86.1328
** Lots of lines **
0.831364 on frequency 387.598
1.46475 on frequency 430.664
4.40354 on frequency 473.73
5.7609 on frequency 516.797       <-- Peek
1.90085 on frequency 559.863
1.19735 on frequency 602.93
0.905608 on frequency 645.996
0.748753 on frequency 689.062
** Lots of lines **
0.509344 on frequency 904.395
0.503752 on frequency 947.461
0.507132 on frequency 990.527
** Lots of lines **
0.889341 on frequency 1248.93
1.17847 on frequency 1291.99
1.87924 on frequency 1335.06
5.73641 on frequency 1378.12    <-- Peek
4.43105 on frequency 1421.19
1.49541 on frequency 1464.26
0.865353 on frequency 1507.32
0.592714 on frequency 1550.39
** Lots of lines **
0.000333131 on frequency 21963.9
0.00033312 on frequency 22006.9```

This output just doesn't seem right to me. So, am I doing something completely wrong here? Or is my algorithm wrong? I programmed it with some help of a tutorial and bit of the tutorials code at
http://www.relisoft.com/Science/Physics/sampling.html

I'm pretty sure that code produces the same output.

So, any idea what I'm doing wrong? Or is the error in the FFT algorithm?

Thanks,
EVOEx

4. So, sin alpha*sin beta = (1/2)(cos (alpha-beta) - cos(alpha+beta)) (hooray for trig). So you would get two different frequencies than what you had before. I would expect, if you want to combine what you've got, you should just add rather than multiply.

5. Again, it goes back to the periodicy thing I mentioned before. You need an integer number of periods to get the best results without windowing. I'd guess that if you did:

Code:
`sin(i * 2.0 * M_PI / 46.545451475)*sin(i * 4.0 * M_PI / (46.545451475)); /* changed second 2.0 to 4.0 */`
you'd get better results. It seems to me that 46.whatever is quite close to X sine periods, so you get great spikes from that. But 98 clearly isn't, which is why there is so much spectral leakage. If i*2.0*M_PI/46.545451475 is close to a period, then X*i*2.0*M_PI/46.545451475 will be too, with X being an integer. It's hard to think of a reason for why, but here's some top level maths about it.

Fourier transforms will convert any signal in the time domain to an infinite sum of sinusoids in the frequency domain. So if the input doesn't consist of a whole sine wave then the the input isn't periodic, which it needs to be for best results. This is why audio sample buffers you'd take would normally be small - your voice is repetitive over small times, but over larger (>one second), it isn't. This is also why the signal is often convolved in the time domain to force periodicy.

The two peaks is correct.

6. there isn't really a proper board for this,
try the kvraudio dsp forum

http://www.kvraudio.com/forum/viewforum.php?f=33

7. Thanks guys!

@twomers:
I could understand leakage may increase the intensity of the frequencies near the actual frequency, thinking about how FFT works. However, moving the peaks, I can't. True, I haven't studied it mathematically, just intuition.

Besides, Tabstop seems to be right here. Changing it to add up the sinusoids fixed it and put the peaks into exactly the right positions. It fixed my problem completely .

m37h0d: Thanks for the link. Although I wouldn't just join for one single question, unless I have to. Besides, I figured since there are a few really smart guys here, some will know the answer. And I was right . But should I have more questions, I'll certainly join there.

Thanks again

8. Originally Posted by EVOEx
I've tried to input a simple sine function and while it does report the proper frequency to be most present, the frequencies before and after it seem to be pretty 'present' as well. That is, the intensity of the proper frequency is highest, but those of a few hundred Hz more or less are quite high as well. Is there a proper way to distinguish properly between them?
You need to learn some more basics. A discrete FFT obviously only has a finite number of bins. Yet you can represent more than this number of frequencies. So if some frequency is present which does not EXACTLY fall into one of the bins, it will be smeared across several bins.

If you need finer frequency resolution, you need a larger sized FFT. But a larger sized FFT decreases your temporal resolution. Tada, you've discovered the Uncertainty Principle (it's mathematically the same thing as the Uncertainty Principle from quantum mechanics). This mathematical fact is inescapable.

EDIT: The comment about non-periodicity also applies, but the effects of a signal which is not periodic-smooth is to manifest as very high frequencies, NOT frequencies close to the one you are looking for. You're probably seeing a combination of both effects.

The non-periodic noise can be reduced by applying a windowing function to your signal... This is all very basic DSP stuff. I suggest this online book: http://www.dspguide.com/