Advances in Non-Linear Modeling for Speech Processing by Raghunath S. Holambe

By Raghunath S. Holambe

Advances in Non-Linear Modeling for Speech Processing covers advanced topics in non-linear estimation and modeling techniques, along with their applications to speaker recognition.

A non-linear aeroacoustic modeling approach is used to estimate the important fine-structure speech events, which are not revealed by the short-time Fourier transform (STFT). This aeroacoustic modeling approach provides the impetus for the high-resolution Teager energy operator (TEO). This operator is characterized by a time resolution that can track rapid signal energy changes within a glottal cycle.
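As a rough illustration of why the TEO can follow energy changes sample by sample, the sketch below uses the commonly cited discrete form psi[n] = x[n]^2 - x[n-1]*x[n+1], which needs only three neighboring samples. The function name `teager_energy` and the test tone parameters are assumptions made for this example and are not taken from the book. For a pure tone A*cos(omega*n + phi), this form gives the constant value (A*sin(omega))^2.

```python
import math

def teager_energy(x):
    """Discrete Teager energy psi[n] = x[n]^2 - x[n-1]*x[n+1] for interior samples."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

# Hypothetical test tone: amplitude 2.0, frequency 0.3 rad/sample, phase 0.5 rad.
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.cos(omega * n + 0.5) for n in range(50)]

psi = teager_energy(tone)

# For a pure tone the operator output is constant: (A * sin(omega))^2.
expected = (amplitude * math.sin(omega)) ** 2
max_error = max(abs(p - expected) for p in psi)
```

Because each output sample depends on only three input samples, the operator reacts to changes far faster than a windowed spectral estimate, which averages over a whole frame.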

Cepstral features such as linear prediction cepstral coefficients (LPCC) and mel frequency cepstral coefficients (MFCC) are computed from the magnitude spectrum of the speech frame, and the phase spectrum is neglected. To overcome the problem of neglecting the phase spectrum, the speech production system can be represented as an amplitude modulation-frequency modulation (AM-FM) model. To demodulate the speech signal, that is, to estimate the amplitude envelope and instantaneous frequency components, the energy separation algorithm (ESA) and the Hilbert transform demodulation (HTD) algorithm are discussed.
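To make the idea of energy separation concrete, here is a minimal sketch of one well-known discrete variant, usually called DESA-2, which recovers amplitude and frequency from Teager energies of the signal and of its symmetric difference. The book may present a different ESA variant; the function names `teager` and `desa2`, the zero fallback for non-positive energies, and the test tone are all assumptions for this example.

```python
import math

def teager(x, n):
    """Teager energy of sequence x at index n."""
    return x[n] * x[n] - x[n - 1] * x[n + 1]

def desa2(x):
    """DESA-2 estimates for samples 2 .. len(x)-3.

    Returns (amplitudes, frequencies), with frequencies in rad/sample.
    """
    N = len(x)
    # Symmetric difference y[n] = x[n+1] - x[n-1]; end values are padding only.
    y = [0.0] + [x[n + 1] - x[n - 1] for n in range(1, N - 1)] + [0.0]
    amps, freqs = [], []
    for n in range(2, N - 2):
        psi_x = teager(x, n)
        psi_y = teager(y, n)
        if psi_x <= 0.0 or psi_y <= 0.0:
            # Assumed fallback: report zeros where the estimate is undefined.
            amps.append(0.0)
            freqs.append(0.0)
            continue
        # Clamp guards against rounding pushing the argument outside [-1, 1].
        arg = max(-1.0, min(1.0, 1.0 - psi_y / (2.0 * psi_x)))
        freqs.append(0.5 * math.acos(arg))
        amps.append(2.0 * psi_x / math.sqrt(psi_y))
    return amps, freqs

# Hypothetical test tone: amplitude 1.5, frequency 0.4 rad/sample.
signal = [1.5 * math.cos(0.4 * n + 0.2) for n in range(64)]
amps, freqs = desa2(signal)
```

For a constant-amplitude, constant-frequency tone the estimates are exact up to rounding. For real AM-FM signals they are approximations that assume the modulations vary slowly relative to the carrier, and the frequency estimate is only valid below a quarter of the sampling rate.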

Different features derived using the above non-linear modeling techniques are used to develop a speaker identification system. Finally, it is shown that the fusion of speech production and speech perception mechanisms can lead to a robust feature set.

Another form of model discussed is the dynamic system model, whose canonical form is called the state-space model. Finally, this model is extended to time-invariant and time-varying models, and some approximation methods are discussed.
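For orientation, one common textbook way to write a discrete-time state-space model is sketched below. The symbols (state $x_k$, observation $y_k$, input $u_k$, noise terms $w_k$ and $v_k$, maps $g$ and $h$, matrices $A$, $B$, $C$) are assumptions chosen for this illustration and may differ from the book's notation.

```latex
% General (possibly non-linear) state-space form
x_{k+1} = g(x_k, u_k, w_k), \qquad y_k = h(x_k, v_k)

% Linear time-invariant special case
x_{k+1} = A x_k + B u_k + w_k, \qquad y_k = C x_k + v_k
```

In a time-varying version the maps or matrices carry the time index as well, for example $A_k$ in place of $A$.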

Depending on the angle between the surface of the obstacle and the direction of flow, the surface roughness and the obstacle geometry, the noise generated can be up to 20 dB higher than that generated by the same jet in free space. Because of the spatially concentrated source, modeling obstacle noise is easier than modeling the noise in a free jet. Experiments reveal that obstacle noise can be approximated by a dipole source located at the obstacle. The above theoretical findings qualitatively explain the observed phenomenon that the fricatives “th” and “f” (and the corresponding voiced “dh” and “v”) are weak compared to the fricatives “s” and “sh”.

