Abstract
We develop a parametric sinusoidal analysis/synthesis model that can be applied to both speech and audio signals. Within a short analysis frame, these signals are characterised by large amplitude variations and small frequency variations. The model comprises a Gaussian mixture representation for the envelope and a sum of linear chirps for the frequency components. A closed-form solution, based on the spectral moments, is derived for the frequency-domain parameters of a chirp with a Gaussian-mixture envelope. An iterative algorithm is developed to select and estimate prominent chirps based on the psychoacoustic masking threshold. The model can adaptively select the number of time-domain and frequency-domain parameters to suit a particular type of signal. Experimental evaluation shows that about 2 to 4 parameters/ms are sufficient for near-transparent quality reconstruction of a variety of wide-band music and speech signals.
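
As an illustration of the synthesis side only, the following minimal sketch builds one frame as a Gaussian-mixture envelope multiplied by a sum of linear chirps. It rests on several assumptions that the abstract does not state: a single envelope shared by all chirps, a parameterisation of (weight, centre, width) per Gaussian and (amplitude, start frequency, chirp rate, phase) per chirp, and arbitrary example values. The function names are hypothetical, and the moment-based estimation and masking-based chirp selection are not reproduced here.

```python
import math

def gaussian_mixture_envelope(t, components):
    """Evaluate the envelope at time t (seconds).

    components: list of (weight, centre_s, width_s) tuples (assumed layout).
    """
    return sum(w * math.exp(-0.5 * ((t - mu) / sigma) ** 2)
               for w, mu, sigma in components)

def linear_chirp(t, amplitude, f0, chirp_rate, phase):
    """Linear chirp with instantaneous frequency f0 + chirp_rate * t (Hz)."""
    angle = 2.0 * math.pi * (f0 * t + 0.5 * chirp_rate * t * t) + phase
    return amplitude * math.cos(angle)

def synthesise_frame(n_samples, sample_rate, envelope, chirps):
    """Return one frame: envelope(t) times the sum of all chirps at t."""
    frame = []
    for n in range(n_samples):
        t = n / sample_rate
        env = gaussian_mixture_envelope(t, envelope)
        frame.append(env * sum(linear_chirp(t, *c) for c in chirps))
    return frame

# Illustrative values only (not taken from the paper).
sample_rate = 16000                      # Hz
envelope = [(1.0, 0.005, 0.002),         # (weight, centre in s, width in s)
            (0.5, 0.015, 0.003)]
chirps = [(1.0, 440.0, 2000.0, 0.0),     # (amplitude, f0 Hz, rate Hz/s, phase)
          (0.4, 1200.0, -5000.0, 0.0)]

frame = synthesise_frame(320, sample_rate, envelope, chirps)  # 20 ms frame

# Parameter count under the assumed parameterisation above.
parameter_count = 3 * len(envelope) + 4 * len(chirps)
```

Under this assumed layout, each Gaussian adds three time-domain parameters and each chirp adds four frequency-domain parameters, so the total per frame grows with the number of envelope components and chirps that are retained.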