Publications
Rapid prototyping of radar algorithms
Summary
Rapid prototyping of advanced signal processing algorithms is critical to developing new radars. Signal processing engineers usually use high-level languages like MATLAB, IDL, or Python to develop advanced algorithms and to determine the optimal parameters for these algorithms. Many of these algorithms have very long execution times due to...
Sinewave parameter estimation using the fast Fan-Chirp Transform
Summary
Sinewave analysis/synthesis has long been an important tool for audio analysis, modification and synthesis [1]. The recently introduced Fan-Chirp Transform (FChT) [2,3] has been shown to improve the fidelity of sinewave parameter estimates for a harmonic audio signal with rapid frequency modulation [4]. A fast version of the FChT [3]...
Towards co-channel speaker separation by 2-D demodulation of spectrograms
Summary
This paper explores a two-dimensional (2-D) processing approach for co-channel speaker separation of voiced speech. We analyze localized time-frequency regions of a narrowband spectrogram using 2-D Fourier transforms and propose a 2-D amplitude modulation model based on pitch information for single and multi-speaker content in each region. Our model maps...
A log-frequency approach to the identification of the Wiener-Hammerstein model
Summary
In this paper we present a simple closed-form solution to the Wiener-Hammerstein (W-H) identification problem. The identification process occurs in the log-frequency domain where magnitudes and phases are separable. We show that the theoretically optimal W-H identification is unique up to an amplitude, phase and delay ambiguity, and that the...
2-D processing of speech for multi-pitch analysis
Summary
This paper introduces a two-dimensional (2-D) processing approach for the analysis of multi-pitch speech sounds. Our framework invokes the short-space 2-D Fourier transform magnitude of a narrowband spectrogram, mapping harmonically related signal components to multiple concentrated entities in a new 2-D space. First, localized time-frequency regions of the spectrogram are...
A comparison of query-by-example methods for spoken term detection
Summary
In this paper we examine an alternative interface for phonetic search, namely query-by-example, that avoids out-of-vocabulary (OOV) issues associated with both standard word-based and phonetic search methods. We develop three methods that compare query lattices derived from example audio against a standard n-gram-based phonetic index and we analyze factors affecting the...
A framework for discriminative SVM/GMM systems for language recognition
Summary
Language recognition with support vector machines and shifted-delta cepstral features has been an excellent performer in NIST-sponsored language evaluation for many years. A novel improvement of this method has been the introduction of hybrid SVM/GMM systems. These systems use GMM supervectors as an SVM expansion for classification. In prior work...
Discriminative N-gram selection for dialect recognition
Summary
Dialect recognition is a challenging and multifaceted problem. Distinguishing between dialects can rely upon many tiers of interpretation of speech data - e.g., prosodic, phonetic, spectral, and word. High-accuracy automatic methods for dialect recognition typically rely upon either phonetic or spectral characteristics of the input. A challenge with spectral system...
Large-scale analysis of formant frequency estimation variability in conversational telephone speech
Summary
We quantify how the telephone channel and regional dialect influence formant estimates extracted from Wavesurfer in spontaneous conversational speech from over 3,600 native American English speakers. To the best of our knowledge, this is the largest scale study on this topic. We found that F1 estimates are higher in cellular...
The MIT Lincoln Laboratory 2008 speaker recognition system
Summary
In recent years, methods for modeling and mitigating variational nuisances have been introduced and refined. A primary emphasis in this year's NIST 2008 Speaker Recognition Evaluation (SRE) was to greatly expand the use of auxiliary microphones. This offered the additional channel variations, which have been a historical challenge to speaker...