I want to compress voice data for transfer over a
network. Could anyone tell me about a suitable algorithm?
The simplest approach would be to use an existing compression library (one that does MP3, for example). A decent "home-grown" method is to do an FFT on the data and filter out non-essential frequencies (outside of the range of voice, that is). I did that a while back and got pretty good results.
Basically, what I did was take the FFT of a 16-bit audio stream and discard the non-voice frequencies, as well as any frequencies with small amplitudes. Then I mapped the three components (frequency, amplitude, and phase) to 8-bit representations (so 3 bytes per spectral component). Finally, I compressed the result using arithmetic coding.
The last step helped, but wasn't domain-specific enough to produce really spectacular compression; the correlation between the frames within the sample wasn't really exploited properly. For example, consider the ZIP archive format. It works really well with text because there is a direct correlation between neighbouring bytes of data. But if you apply it to a buffer of 32-bit data, the efficiency typically goes down because the relationship between the data on a byte level is somewhat artificial. The key to achieving really good compression, then, is to model those inter-relationships as faithfully as possible.
That said, the result was definitely satisfactory. Compressed sizes averaged out to roughly 10% - 33% of the uncompressed size, and the voice quality was decent. Had the second step been better modeled, I imagine those figures would have been half as much.
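To make the filtering and 8-bit mapping steps a bit more concrete, here is a rough sketch in Python. It is not the code I actually used, just an illustration under some assumptions: the 300-3400 Hz band is the traditional telephone voice band (my original cutoffs aren't given here), the 5% amplitude threshold and the 256-sample frame are arbitrary, and the names `dft`, `encode_frame`, and `decode_frame` are made up for this example. It uses a naive DFT so it runs with the standard library only; a real FFT library would replace that part, and the arithmetic coding stage is left out entirely.

```python
import cmath
import math

def dft(samples):
    """Naive O(n^2) DFT. Fine for one short demo frame; use a real FFT in practice."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def encode_frame(samples, sample_rate, low_hz=300.0, high_hz=3400.0, min_rel_amp=0.05):
    """Keep only strong in-band components, each as (bin, amplitude, phase) in 8 bits.

    The band limits and threshold are assumptions chosen for illustration.
    With a 256-sample frame the positive-frequency bin index fits in one byte.
    """
    n = len(samples)
    spectrum = dft(samples)
    half = n // 2
    mags = [abs(spectrum[k]) for k in range(half)]
    peak = max(mags) or 1.0  # avoid dividing by zero on a silent frame
    components = []
    for k in range(1, half):
        freq = k * sample_rate / n
        if not (low_hz <= freq <= high_hz):
            continue  # outside the assumed voice band
        if mags[k] < min_rel_amp * peak:
            continue  # too quiet to be worth keeping
        amp8 = round(255 * mags[k] / peak)
        phase = cmath.phase(spectrum[k])  # in [-pi, pi]
        phase8 = round((phase + math.pi) / (2 * math.pi) * 255)
        components.append((k, amp8, phase8))
    return peak, components

def decode_frame(peak, components, n):
    """Rebuild an approximate frame from the quantized components."""
    out = [0.0] * n
    for k, amp8, phase8 in components:
        mag = peak * amp8 / 255
        phase = phase8 / 255 * 2 * math.pi - math.pi
        for t in range(n):
            out[t] += (2 * mag / n) * math.cos(2 * math.pi * k * t / n + phase)
    return out

# Demo frame: a 1000 Hz tone (in band) plus 125 Hz hum (out of band), 8 kHz sampling.
SAMPLE_RATE = 8000
FRAME = 256
frame = [round(10000 * math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE)
               + 8000 * math.sin(2 * math.pi * 125 * t / SAMPLE_RATE))
         for t in range(FRAME)]
peak, components = encode_frame(frame, SAMPLE_RATE)
restored = decode_frame(peak, components, FRAME)
print(len(components), "component(s) kept,", 3 * len(components), "bytes before entropy coding")
```

Note that the per-frame `peak` value has to be sent along with the components so the decoder can rescale the amplitudes. Also, every frame is encoded independently here, which is exactly the weakness mentioned above: nothing exploits the similarity between neighbouring frames.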
Last edited by gardhr; 12-23-2011 at 08:59 PM.
If I use 8-bit or 16-bit samples, what range of frequencies counts as non-voice?