Publications

Technical challenges of supporting interactive HPC

Published in:
Ann. High Performance Computer Modernization Program Users Group Conf., 19-21 June 2007.

Summary

Users' demand for interactive, on-demand access to a large pool of high performance computing (HPC) resources is increasing. The majority of users at Massachusetts Institute of Technology Lincoln Laboratory (MIT LL) are involved in the interactive development of sensor processing algorithms. This development often requires a large amount of computation due to the complexity of the algorithms being explored and/or the size of the data set being analyzed. These researchers also require rapid turnaround of their jobs because each iteration directly influences code changes made for the following iteration. Historically, batch queue systems have not been a good match for this kind of user. The Lincoln Laboratory Grid (LLGrid) system at MIT LL is the largest dedicated interactive, on-demand HPC system in the world. While the system also accommodates some batch queue jobs, the vast majority of jobs submitted are interactive, on-demand jobs. Choosing between running a system with a batch queue or in an interactive, on-demand manner involves tradeoffs. This paper discusses the tradeoffs between operating a cluster as a batch system, an interactive, on-demand system, or a hybrid system. The LLGrid system has been operational for over three years, and now serves over 200 users from across Lincoln. The system has run over 100,000 interactive jobs. It has become an integral part of many researchers' algorithm development workflows. For instance, in batch queue systems, an individual user commonly can gain access to 25% of the processors in the system after the job has waited in the queue; in our experience with on-demand, interactive operation, individual users often can also gain access to 20-25% of the cluster processors. This paper will share a variety of new data on our experiences with running an interactive, on-demand system that also provides some batch queue access.
Keywords: grid computing, on-demand, interactive high performance computing, cluster computing, parallel MATLAB.

Automatic language identification

Published in:
Wiley Encyclopedia of Electrical and Electronics Engineering, Vol. 2, pp. 104-9, 2007.

Summary

Automatic language identification is the process by which the language of digitized spoken words is recognized by a computer. It is one of several processes in which information is extracted automatically from a speech signal.

Making network intrusion detection work with IPsec

Published in:
MIT Lincoln Laboratory Report TR-1121

Summary

Network-based intrusion detection systems (NIDSs) are one component of a comprehensive network security solution. The use of IPsec, which encrypts network traffic, renders network intrusion detection virtually useless unless traffic is decrypted at network gateways. One alternative to NIDSs, host-based intrusion detection systems (HIDSs), provides some of the functionality of NIDSs but with limitations. HIDSs cannot perform a network-wide analysis and can be subverted if a host is compromised. We propose an approach to intrusion detection that combines HIDS, NIDS, and a version of IPsec that encrypts the header and the body of IP packets separately. We refer to the latter generically as Two-Key IPsec. We show that all of the network events currently detectable by the Snort NIDS on unencrypted network traffic are also detectable on encrypted network traffic using this approach. The NIDS detects network-level events that HIDSs have trouble detecting and HIDSs detect application-level events that can't be detected by the NIDS.

MIT Lincoln Laboratory multimodal person identification system in the CLEAR 2007 Evaluation

Published in:
2nd Annual Classification of Event Activities and Relationships/Rich Transcription Evaluations, 8-11 May 2008, pp. 240-247.

Summary

This paper describes the MIT Lincoln Laboratory system used in the person identification task of the recent CLEAR 2007 Evaluation. This task is broken into audio, visual, and multimodal subtasks. The audio identification system utilizes both a GMM and an SVM subsystem, while the visual (face) identification system utilizes an appearance-based [Kernel] approach for identification. The audio channels, originating from a microphone array, were preprocessed with beamforming and noise preprocessing.

Low-bit-rate speech coding

Published in:
Chapter 16 in Springer Handbook of Speech Processing and Communication, 2007, pp. 331-50.

Summary

Low-bit-rate speech coding, at rates below 4 kb/s, is needed for both communication and voice storage applications. At such low rates, full encoding of the speech waveform is not possible; therefore, low-rate coders rely instead on parametric models to represent only the most perceptually relevant aspects of speech. While there are a number of different approaches for this modeling, all can be related to the basic linear model of speech production, where an excitation signal drives a vocal-tract filter. The basic properties of the speech signal and of human speech perception can explain the principles of parametric speech coding as applied in early vocoders. Current speech modeling approaches, such as mixed excitation linear prediction, sinusoidal coding, and waveform interpolation, use more-sophisticated versions of these same concepts. Modern techniques for encoding the model parameters, in particular using the theory of vector quantization, allow the encoding of the model information with very few bits per speech frame. Successful standardization of low-rate coders has enabled their widespread use for both military and satellite communications, at rates from 4 kb/s all the way down to 600 b/s. However, the goal of toll-quality low-rate coding continues to provide a research challenge.
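
As a rough illustration of the basic linear model of speech production mentioned in this summary, the sketch below drives an all-pole filter with a periodic impulse train. It is a minimal sketch, not the chapter's method: the function names `synthesize` and `impulse_train`, the sign convention for the coefficients, and the example values are all assumptions made for illustration.

```python
def impulse_train(length, period):
    """Hypothetical voiced excitation: a unit impulse every `period` samples."""
    return [1.0 if n % period == 0 else 0.0 for n in range(length)]


def synthesize(excitation, a):
    """Run an excitation through an all-pole filter (assumed convention):
    s[n] = e[n] - sum_{k=1..p} a[k-1] * s[n-k]
    """
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                # Feed back previously synthesized samples.
                acc -= ak * out[n - k]
        out.append(acc)
    return out


# Example: a single impulse through a one-pole filter decays geometrically.
response = synthesize([1.0, 0.0, 0.0, 0.0], [-0.5])
```

In a parametric coder, only the filter coefficients and a compact description of the excitation (for example, pitch period and voicing) would be transmitted per frame, rather than the waveform itself.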

Nuisance attribute projection

Published in:
Chapter in Speech Communication, May 2007.

Summary

Cross-channel degradation is one of the significant challenges facing speaker recognition systems. We study this problem in the support vector machine (SVM) context and nuisance variable compensation in high-dimensional spaces more generally. We present an approach to nuisance variable compensation by removing nuisance attribute-related dimensions in the SVM expansion space via projections. Training to remove these dimensions is accomplished via an eigenvalue problem. The eigenvalue problem attempts to reduce multisession variation for the same speaker, reduce different channel effects, and increase "distance" between different speakers. Experiments show significant improvement in performance for the cross-channel case.
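
The projection idea in this summary can be sketched as follows. This is a simplified sketch under stated assumptions, not the paper's training procedure: it estimates a single nuisance direction as the leading eigenvector of a scatter matrix built from differences between sessions of the same speaker (one of the criteria the summary lists), using power iteration, and then applies the projection P = I - u u^T. The function names and toy data are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def top_nuisance_direction(diffs, iters=200):
    """Power iteration for the leading eigenvector of S = sum_k d_k d_k^T,
    where each d_k is assumed to be a difference between two sessions
    of the same speaker in the expansion space."""
    dim = len(diffs[0])
    v = normalize([1.0] * dim)
    for _ in range(iters):
        # Compute S v without forming S: S v = sum_k d_k (d_k . v)
        w = [0.0] * dim
        for d in diffs:
            c = dot(d, v)
            for i in range(dim):
                w[i] += c * d[i]
        v = normalize(w)
    return v


def project_out(x, u):
    """Apply P = I - u u^T to x, removing the component along unit vector u."""
    c = dot(x, u)
    return [xi - c * ui for xi, ui in zip(x, u)]


# Toy data: session-to-session variation lies mostly along the first axis.
diffs = [[2.0, 0.0, 0.1], [-1.5, 0.0, 0.0]]
u = top_nuisance_direction(diffs)
cleaned = project_out([3.0, 4.0, 0.0], u)
```

After projection, the vector has no component along the estimated nuisance direction, while components in other directions (here, the second axis) are left unchanged.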

Text-independent speaker recognition

Published in:
Springer Handbook of Speech Processing and Communication, 2007, pp. 763-81.

Summary

In this chapter, we focus on the area of text-independent speaker verification, with an emphasis on unconstrained telephone conversational speech. We begin by providing a general likelihood ratio detection task framework to describe the various components in modern text-independent speaker verification systems. We next describe the general hierarchy of speaker information conveyed in the speech signal and the issues involved in reliably exploiting these levels of information for practical speaker verification systems. We then describe specific implementations of state-of-the-art text-independent speaker verification systems utilizing low-level spectral information and high-level token sequence information with generative and discriminative modeling techniques. Finally, we provide a performance assessment of these systems using the National Institute of Standards and Technology (NIST) speaker recognition evaluation telephone corpora.

ILR-based MT comprehension test with multi-level questions

Published in:
Human Language Technology, North American Chapter of the Association for Computational Linguistics, HLT/NAACL, 22-27 April 2007.

Summary

We present results from a new Interagency Language Roundtable (ILR) based comprehension test. This new test design presents questions at multiple ILR difficulty levels within each document. We incorporated Arabic machine translation (MT) output from three independent research sites, arbitrarily merging these materials into one MT condition. We contrast the MT condition, for both text and audio data types, with high quality human reference Gold Standard (GS) translations. Overall, subjects achieved 95% comprehension for GS and 74% for MT, across all genres and difficulty levels. Interestingly, comprehension rates do not correlate highly with translation error rates, suggesting that we are measuring an additional dimension of MT quality.

A new approach to achieving high-performance power amplifier linearization

Published in:
IEEE Radar Conf., 17-20 April 2007. doi: 10.1109/RADAR.2007.374329

Summary

Digital baseband predistortion (DBP) is not particularly well suited to linearizing wideband power amplifiers (PAs); this is due to the exorbitant price paid in computational complexity. One of the underlying reasons for the computational complexity of DBP is the inherent inefficiency of using a sufficiently deep memory and a high enough polynomial order to span the multidimensional signal space needed to mitigate PA-induced nonlinear distortion. Therefore we have developed a new mathematical method to efficiently search for and localize those regions in the multidimensional signal space that enable us to invert PA nonlinearities with a significant reduction in computational complexity. Using a wideband code division multiple access (CDMA) signal we demonstrate and compare the PA linearization performance and computational complexity of our algorithm to that of conventional DBP techniques using measured results.

Language recognition with word lattices and support vector machines

Published in:
Proc. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, ICASSP, 15-20 April 2007, Vol. IV, pp. 989-992.

Summary

Language recognition is typically performed with methods that exploit phonotactics--a phone recognition language modeling (PRLM) system. A PRLM system converts speech to a lattice of phones and then scores a language model. A standard extension to this scheme is to use multiple parallel phone recognizers (PPRLM). In this paper, we modify this approach in two distinct ways. First, we replace the phone tokenizer by a powerful speech-to-text system. Second, we use a discriminative support vector machine for language modeling. Our goals are twofold. First, we explore the ability of a single speech-to-text system to distinguish multiple languages. Second, we fuse the new system with an SVM PRLM system to see if it complements current approaches. Experiments on the 2005 NIST language recognition corpus show the new word system accomplishes these goals and has significant potential for language recognition.