Publications

Refine Results

(Filters Applied) Clear All

Linear prediction modulation filtering for speaker recognition of reverberant speech

Published in:
Odyssey 2012, The Speaker and Language Recognition Workshop, 25-28 June 2012.

Summary

This paper proposes a framework for spectral enhancement of reverberant speech based on inversion of the modulation transfer function. All-pole modeling of modulation spectra of clean and degraded speech are utilized to derive the linear prediction inverse modulation transfer function (LP-IMTF) solution as a low-order IIR filter in the modulation envelope domain. By considering spectral estimation under speech presence uncertainty, speech presence probabilities are derived for the case of reverberation. Aside from enhancement, the LP-IMTF framework allows for blind estimation of reverberation time by extracting a minimum phase approximation of the short-time spectral channel impulse response. The proposed speech enhancement method is used as a front-end processing step for speaker recognition. When applied to the microphone condition of the NISTSRE 2010 with artificially added reverberation, the proposed spectral enhancement method yields significant improvements across a variety of performance metrics.
READ LESS

Summary

This paper proposes a framework for spectral enhancement of reverberant speech based on inversion of the modulation transfer function. All-pole modeling of modulation spectra of clean and degraded speech are utilized to derive the linear prediction inverse modulation transfer function (LP-IMTF) solution as a low-order IIR filter in the modulation...

READ MORE

The MITLL NIST LRE 2011 language recognition system

Summary

This paper presents a description of the MIT Lincoln Laboratory (MITLL) language recognition system developed for the NIST 2011 Language Recognition Evaluation (LRE). The submitted system consisted of a fusion of four core classifiers, three based on spectral similarity and one based on tokenization. Additional system improvements were achieved following the submission deadline. In a major departure from previous evaluations, the 2011 LRE task focused on closed-set pairwise performance so as to emphasize a system's ability to distinguish confusable language pairs. Results are presented for the 24-language confusable pair task at test utterance durations of 30, 10, and 3 seconds. Results are also shown using the standard detection metrics (DET, minDCF) and it is demonstrated the previous metrics adequately cover difficult pair performance. On the 30 s 24-language confusable pair task, the submitted and post-evaluation systems achieved average costs of 0.079 and 0.070 and standard detection costs of 0.038 and 0.033.
READ LESS

Summary

This paper presents a description of the MIT Lincoln Laboratory (MITLL) language recognition system developed for the NIST 2011 Language Recognition Evaluation (LRE). The submitted system consisted of a fusion of four core classifiers, three based on spectral similarity and one based on tokenization. Additional system improvements were achieved following...

READ MORE

A stochastic system for large network growth

Published in:
IEEE Signal Process. Lett., Vol. 19, No. 6, June 2012, pp. 356-359.

Summary

This letter proposes a new model for preferential attachment in dynamic directed networks. This model consists of a linear time-invariant system that uses past observations to predict future attachment rates, and an innovation noise process that induces growth on vertices that previously had no attachments. Analyzing a large citation network in this context, we show that the proposed model fits the data better than existing preferential attachment models. An analysis of the noise in the dataset reveals power-law degree distributions often seen in large networks, and polynomial decay with respect to age in the probability of citing yet-uncited documents.
READ LESS

Summary

This letter proposes a new model for preferential attachment in dynamic directed networks. This model consists of a linear time-invariant system that uses past observations to predict future attachment rates, and an innovation noise process that induces growth on vertices that previously had no attachments. Analyzing a large citation network...

READ MORE

Continuous security metrics for prevalent network threats - introduction and first four metrics

Summary

The goal of this work is to introduce meaningful security metrics that motivate effective improvements in network security. We present a methodology for directly deriving security metrics from realistic mathematical models of adversarial behaviors and systems and also a maturity model to guide the adoption and use of these metrics. Four security metrics are described that assess the risk from prevalent network threats. These can be computed automatically and continuously on a network to assess the effectiveness of controls. Each new metric directly assesses the effect of controls that mitigate vulnerabilities, continuously estimates the risk from one adversary, and provides direct insight into what changes must be made to improve security. Details of an explicit maturity model are provided for each metric that guide security practitioners through three stages where they (1) Develop foundational understanding, tools and procedures, (2) Make accurate and timely measurements that cover all relevant network components and specify security conditions to test, and (3) Perform continuous risk assessments and network improvements. Metrics are designed to address specific threats, maintain practicality and simplicity, and motivate risk reduction. These initial four metrics and additional ones we are developing should be added incrementally to a network to gradually improve overall security as scores drop to acceptable levels and the risks from associated cyber threats are mitigated.
READ LESS

Summary

The goal of this work is to introduce meaningful security metrics that motivate effective improvements in network security. We present a methodology for directly deriving security metrics from realistic mathematical models of adversarial behaviors and systems and also a maturity model to guide the adoption and use of these metrics...

READ MORE

FY11 Line-Supported Bio-Next Program - Multi-modal Early Detection Interactive Classifier (MEDIC) for mild traumatic brain injury (mTBI) triage

Summary

The Multi-modal Early Detection Interactive Classifier (MEDIC) is a triage system designed to enable rapid assessment of mild traumatic brain injury (mTBI) when access to expert diagnosis is limited as in a battlefield setting. MEDIC is based on supervised classification that requires three fundamental components to function correctly; these are data, features, and truth. The MEDIC system can act as a data collection device in addition to being an assessment tool. Therefore, it enables a solution to one of the fundamental challenges in understanding mTBI: the lack of useful data. The vision of MEDIC is to fuse results from stimulus tests in each of four modalitites - auditory, occular, vocal, and intracranial pressure - and provide them to a classifier. With appropriate data for training, the MEDIC classifier is expected to provide an immediate decision of whether the subject has a strong likelihood of having sustained an mTBI and therefore requires an expert diagnosis from a neurologist. The tests within each modalitity were designed to balance the capacity of objective assessment and the maturity of the underlying technology against the ability to distinguish injured from non-injured subjects according to published results. Selection of existing modalities and underlying features represents the best available, low cost, portable technology with a reasonable chance of success.
READ LESS

Summary

The Multi-modal Early Detection Interactive Classifier (MEDIC) is a triage system designed to enable rapid assessment of mild traumatic brain injury (mTBI) when access to expert diagnosis is limited as in a battlefield setting. MEDIC is based on supervised classification that requires three fundamental components to function correctly; these are...

READ MORE

A scalable signal processing architecture for massive graph analysis

Published in:
ICASSP 2012, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, 25-30 March 2012, pp. 5329-32.

Summary

In many applications, it is convenient to represent data as a graph, and often these datasets will be quite large. This paper presents an architecture for analyzing massive graphs, with a focus on signal processing applications such as modeling, filtering, and signal detection. We describe the architecture, which covers the entire processing chain, from data storage to graph construction to graph analysis and subgraph detection. The data are stored in a new format that allows easy extraction of graphs representing any relationship existing in the data. The principal analysis algorithm is the partial eigendecomposition of the modularity matrix, whose running time is discussed. A large document dataset is analyzed, and we present subgraphs that stand out in the principal eigenspace of the time varying graphs, including behavior we regard as clutter as well as small, tightly-connected clusters that emerge over time.
READ LESS

Summary

In many applications, it is convenient to represent data as a graph, and often these datasets will be quite large. This paper presents an architecture for analyzing massive graphs, with a focus on signal processing applications such as modeling, filtering, and signal detection. We describe the architecture, which covers the...

READ MORE

Autoregressive HMM speech synthesis

Author:
Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, ICASSP, 25-30 March 2012, pp. 4021-4.

Summary

Autoregressive HMM modeling of spectral features has been proposed as a replacement for standard HMM speech synthesis. The merits of the approach are explored, and methods for enforcing stability of the estimated predictor coefficients are presented. It appears that rather than directly estimating autoregressive HMM parameters, greater synthesis accuracy is obtained by estimating the autoregressive HMM parameters by using a more traditional HMM recognition system to compute state-level posterior probabilities that are then used to accumulate statistics to estimate predictor coefficients. The result is a simplified mathematical framework that requires no modeling of derivatives and still provides smooth synthesis without unnatural spectral discontinuities. The resulting synthesis algorithm involves no matrix solves and may be formulated causally, and appears to result in quality very similar to that of more traditional HMM synthesis approaches. This paper describes the implementation of a complete Autoregressive HMM LVCSR system and its application for synthesis, and describes the preliminary synthesis results.
READ LESS

Summary

Autoregressive HMM modeling of spectral features has been proposed as a replacement for standard HMM speech synthesis. The merits of the approach are explored, and methods for enforcing stability of the estimated predictor coefficients are presented. It appears that rather than directly estimating autoregressive HMM parameters, greater synthesis accuracy is...

READ MORE

Goodness-of-fit statistics for anomaly detection in Chung-Lu random graphs

Published in:
ICASSP 2012, Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, 25-30 March 2012, pp. 3265-8.

Summary

Anomaly detection in graphs is a relevant problem in numerous applications. When determining whether an observation is anomalous with respect to the model of typical behavior, the notion of "goodness of fit" is important. This notion, however, is not well understood in the context of graph data. In this paper, we propose three goodness-of-fit statistics for Chung-Lu random graphs, and analyze their efficacy in discriminating graphs generated by the Chung-Lu model from those with anomalous topologies. In the results of a Monte Carlo simulation, we see that the most powerful statistic for anomaly detection depends on the type of anomaly, suggesting that a hybrid statistic would be the most powerful.
READ LESS

Summary

Anomaly detection in graphs is a relevant problem in numerous applications. When determining whether an observation is anomalous with respect to the model of typical behavior, the notion of "goodness of fit" is important. This notion, however, is not well understood in the context of graph data. In this paper...

READ MORE

Topic identification based extrinsic evaluation of summarization techniques applied to conversational speech

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech and Signal Processing, ICASSP, 25-30 March 2012, pp. 5073-6.

Summary

Document summarization algorithms are most commonly evaluated according to the intrinsic quality of the summaries they produce. An alternate approach is to examine the extrinsic utility of a summary, measured by the ability of the summary to aid a human in the completion of a specific task. In this paper, we use topic identification as a proxy for relevancy determination in the context of an information retrieval task, and a summary is deemed effective if it enables a user to determine the topical content of a retrieved document. We utilize Amazon's Mechanical Turk service to perform a large-scale human study contrasting four different summarization systems applied to conversational speech from the Fisher Corpus. We show that these results appear to be correlated with the performance of an automated topic identification system, and argue that this automated system can act as a low-cost proxy for a human evaluation during the development stages of a summarization system.
READ LESS

Summary

Document summarization algorithms are most commonly evaluated according to the intrinsic quality of the summaries they produce. An alternate approach is to examine the extrinsic utility of a summary, measured by the ability of the summary to aid a human in the completion of a specific task. In this paper...

READ MORE

Moments of parameter estimates for Chung-Lu random graph models

Published in:
ICASSP 2012, Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, 25-30 March 2012, pp. 3961-4.

Summary

As abstract representations of relational data, graphs and networks find wide use in a variety of fields, particularly when working in non- Euclidean spaces. Yet for graphs to be truly useful in in the context of signal processing, one ultimately must have access to flexible and tractable statistical models. One model currently in use is the Chung- Lu random graph model, in which edge probabilities are expressed in terms of a given expected degree sequence. An advantage of this model is that its parameters can be obtained via a simple, standard estimator. Although this estimator is used frequently, its statistical properties have not been fully studied. In this paper, we develop a central limit theory for a simplified version of the Chung-Lu parameter estimator. We then derive approximations for moments of the general estimator using the delta method, and confirm the effectiveness of these approximations through empirical examples.
READ LESS

Summary

As abstract representations of relational data, graphs and networks find wide use in a variety of fields, particularly when working in non- Euclidean spaces. Yet for graphs to be truly useful in in the context of signal processing, one ultimately must have access to flexible and tractable statistical models. One...

READ MORE