I want to compress voice data for transfer over a
network. Can anyone tell me about a suitable algorithm?
The simplest approach would be to use an existing compression library (one that does MP3, for example). A decent "home-grown" method is to do an FFT on the data and filter out non-essential frequencies (outside of the range of voice, that is). I did that a while back and got pretty good results.
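To make the "filter out non-essential frequencies" idea concrete, here is a minimal sketch in plain Python. It is not the code I actually used. The 300 to 3400 Hz pass band is an assumption borrowed from the classic telephone voice band, the function names are made up for illustration, and the naive O(n^2) DFT is only there to keep the example free of dependencies (a real implementation would use a proper FFT library).

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (O(n^2)); fine for short frames."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse of dft(); returns complex samples."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def band_limit(samples, sample_rate, low_hz=300.0, high_hz=3400.0):
    """Zero every spectral bin outside [low_hz, high_hz] and resynthesize.

    The default band is an assumed voice band, not a value from this thread.
    """
    n = len(samples)
    spectrum = dft(samples)
    for k in range(n):
        # For real input, bins k and n - k describe the same frequency.
        freq = min(k, n - k) * sample_rate / n
        if not (low_hz <= freq <= high_hz):
            spectrum[k] = 0
    # The kept bins are conjugate-symmetric, so the result is real
    # up to rounding error; drop the tiny imaginary parts.
    return [value.real for value in idft(spectrum)]
```

For example, feeding it an 8 kHz frame containing a 1000 Hz tone plus 100 Hz hum should return (approximately) just the 1000 Hz tone, since the hum falls below the assumed pass band.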
Basically, what I did was take the FFT of a 16-bit audio stream and discard non-voice frequencies, as well as frequencies with small amplitudes. Then I mapped the three components (frequency, amplitude, and phase) to 8-bit representations (so 3 bytes per spectral component). Finally, I compressed the result using arithmetic coding.

The last step helped, but wasn't domain-specific enough to produce really spectacular compression; the correlation between successive frames within the sample wasn't really exploited properly. For example, consider the ZIP archive format. It works really well with text because there is a direct correlation between neighboring bytes. But if you apply it to a buffer of 32-bit data, the efficiency typically goes down because the relationship between the data at the byte level is somewhat artificial. The key to achieving really good compression, then, is to model those inter-relationships as faithfully as possible.

That said, the result was definitely satisfactory. The compressed size averaged roughly 10% to 33% of the uncompressed size, and the voice quality was decent. Had the second step been better modeled, I imagine those figures would have been half as much.
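A rough sketch of the pipeline described above, for one frame, might look like the following. Everything specific here is an assumption made for illustration: the function names, the 300 to 3400 Hz band, the `min_amp` threshold, samples already scaled to [-1, 1], and storing the frequency as a bin index (which limits frames to 512 samples). It also uses `zlib` as a stand-in for the arithmetic coder, since Python's standard library has no arithmetic coder; that is a different entropy coder than the one I used.

```python
import cmath
import math
import zlib

def encode_frame(samples, sample_rate, low_hz=300.0, high_hz=3400.0, min_amp=0.01):
    """Pack the significant in-band components as (bin, amplitude, phase) bytes."""
    n = len(samples)
    if n > 512:
        raise ValueError("bin index must fit in one byte, so frames are limited to 512 samples")
    packed = bytearray()
    for k in range(1, n // 2):  # positive-frequency bins only
        freq = k * sample_rate / n
        if not (low_hz <= freq <= high_hz):
            continue  # outside the assumed voice band
        coeff = sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        amp = 2 * abs(coeff) / n  # amplitude of the matching cosine
        if amp < min_amp:
            continue  # too quiet to keep
        amp_byte = round(min(amp, 1.0) * 255)
        phase_byte = round((cmath.phase(coeff) % (2 * math.pi)) / (2 * math.pi) * 256) % 256
        packed += bytes([k, amp_byte, phase_byte])
    return bytes(packed)

def decode_frame(packed, n):
    """Rebuild n samples by summing the stored cosines."""
    out = [0.0] * n
    for i in range(0, len(packed), 3):
        k, amp_byte, phase_byte = packed[i], packed[i + 1], packed[i + 2]
        amp = amp_byte / 255
        phase = phase_byte / 256 * 2 * math.pi
        for t in range(n):
            out[t] += amp * math.cos(2 * math.pi * k * t / n + phase)
    return out

def compress_frames(frames, sample_rate):
    """Encode each frame, then entropy-code the lot (zlib standing in for arithmetic coding)."""
    body = b"".join(encode_frame(frame, sample_rate) for frame in frames)
    return zlib.compress(body)
```

Note that this treats every frame independently, which is exactly the weakness mentioned above: nothing in it models how the spectrum of one frame relates to the next. A real stream format would also need a per-frame component count so the decoder can tell frames apart; that is left out to keep the sketch short.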
If I use 8-bit or 16-bit samples, what frequency range counts as non-voice?