Publications

Experimental facility for measuring the impact of environmental noise and speaker variation on speech-to-speech translation devices

Published in:
Proc. IEEE Spoken Language Technology Workshop, 10-13 December 2006, pp. 250-253.

Summary

We describe the construction and use of a laboratory facility for testing the performance of speech-to-speech translation devices. Approximately 1500 English phrases from various military domains were recorded as spoken by each of 30 male and 12 female English speakers with variation in speaker accent, for a total of approximately 60,000 phrases available for experimentation. We describe an initial experiment using the facility which shows the impact of environmental noise and speaker variability on phrase recognition accuracy for two commercially available one-way speech-to-speech translation devices configured for English-to-Arabic.

The security of OpenBSD: milk or wine?

Published in:
;login:, Vol. 31, No. 6, December 2006, pp. 26-32.

Summary

Purchase a fine wine, place it in a cellar, and wait a few years: The aging will have resulted in a delightful beverage, a product far better than the original. Purchase a gallon of milk, place it in a cellar, and wait a few years. You will be sorry. We know how the passing of time affects milk and wine, but how does aging affect the security of software? Many in the security research community have criticized software developers both for releasing software with so many vulnerabilities and for the lack of any apparent improvement in this software over time. However, critics have lacked quantitative evidence that applying effort over time will result in software with fewer vulnerabilities. In short, we don't know whether software security is destined to age like milk or has the potential to become wine. We thus investigated whether or not the rate at which vulnerabilities are reported in OpenBSD is decreasing over time.

An efficient graph search decoder for phrase-based statistical machine translation

Published in:
Int. Workshop on Spoken Language Translation, 28 November 2006.

Summary

In this paper we describe an efficient implementation of a graph search algorithm for phrase-based statistical machine translation. Our goal was to create a decoder that could be used for both our research system and a real-time speech-to-speech machine translation demonstration system. The search algorithm is based on a Viterbi graph search with an A* heuristic. We were able to increase the speed of our decoder substantially through the use of on-the-fly beam pruning and other algorithmic enhancements. The decoder supports a variety of reordering constraints as well as arbitrary n-gram decoding. In addition, we have implemented disk-based translation models and a messaging interface to communicate with other components for use in our real-time speech translation system.

The MIT-LL/AFRL IWSLT-2006 MT system

Published in:
Proc. Int. Workshop on Spoken Language Translation, IWSLT, 27-28 November 2006.

Summary

The MIT-LL/AFRL MT system is a statistical phrase-based translation system that implements many modern SMT training and decoding techniques. Our system was designed with the long-term goal of dealing with corrupted ASR input and limited amounts of training data for speech-to-speech MT applications. This paper will discuss the architecture of the MIT-LL/AFRL MT system, improvements over our 2005 system, and experiments with manual and ASR transcription data that were run as part of the IWSLT-2006 evaluation campaign.

The JHU Workshop 2006 IWSLT System

Published in:
Int. Workshop on Spoken Language Translation, IWSLT, 27-28 November 2006.

Summary

This paper describes the SMT system we built during the 2006 JHU Summer Workshop for the IWSLT 2006 evaluation. Our effort focuses on two parts of the speech translation problem: 1) efficient decoding of word lattices and 2) novel applications of factored translation models to IWSLT-specific problems. In this paper, we present results from the open-track Chinese-to-English condition. Improvements of 5-10% relative BLEU are obtained over a high-performing baseline. We introduce a new open-source decoder that implements the state-of-the-art in statistical machine translation.

High productivity computing and usable petascale systems

Published in:
SC '06: Proceedings of the 2006 ACM/IEEE conference on Supercomputing

Summary

High Performance Computing has seen extraordinary growth in peak performance, which has been accompanied by a significant increase in the difficulty of using these systems. High Productivity Computing Systems (HPCS) seek to address this gap by producing petascale computers that are usable by a broader range of scientists and engineers. One of the most important HPCS innovations is the concept of a flatter memory hierarchy, which means that data from remote processors can be retrieved and used very efficiently. A flatter memory hierarchy increases performance and is easier to program.

Validating and restoring defense in depth using attack graphs

Summary

Defense in depth is a common strategy that uses layers of firewalls to protect Supervisory Control and Data Acquisition (SCADA) subnets and other critical resources on enterprise networks. A tool named NetSPA is presented that analyzes firewall rules and vulnerabilities to construct attack graphs. These show how inside and outside attackers can progress by successively compromising exposed vulnerable hosts with the goal of reaching critical internal targets. NetSPA generates attack graphs and automatically analyzes them to produce a small set of prioritized recommendations to restore defense in depth. Field trials on networks with up to 3,400 hosts demonstrate that firewalls often do not provide defense in depth due to misconfigurations and critical unpatched vulnerabilities on hosts. In all cases, a small number of recommendations was provided to restore defense in depth. Simulations on networks with up to 50,000 hosts demonstrate that this approach scales well to enterprise-size networks.

Securing communication of dynamic groups in dynamic network-centric environments

Summary

We developed a new approach and designed a practical solution for securing communication of dynamic groups in dynamic network-centric environments, such as airborne and terrestrial on-the-move networks. The solution is called Public Key Group Encryption (PKGE). In this paper, we define the problem of group encryption, motivate the need for decentralized group encryption services, and explain our vision for designing such services. We then describe our solution, PKGE, at a high level, and report on the prototype implementation, performance experiments, and a demonstration with GAIM/Jabber chat.

Analysis of nonmodal phonation using minimum entropy deconvolution

Published in:
Proc. Int. Conf. on Spoken Language Processing, ICSLP INTERSPEECH, 17-21 September 2006, pp. 1702-1705.

Summary

Nonmodal phonation occurs when glottal pulses exhibit nonuniform pulse-to-pulse characteristics such as irregular spacings, amplitudes, and/or shapes. The analysis of regions of such nonmodality has application to automatic speech, speaker, language, and dialect recognition. In this paper, we examine the usefulness of a technique called minimum-entropy deconvolution, or MED, for the analysis of pulse events in nonmodal speech. Our study presents evidence for both natural and synthetic speech that MED decomposes nonmodal phonation into a series of sharp pulses and a set of mixed-phase impulse responses. We show that the estimated impulse responses are quantitatively similar to those in our synthesis model. A hybrid method incorporating aspects of both MED and linear prediction is also introduced. We show preliminary evidence that the hybrid method has benefit over MED alone for composite impulse-response estimation by being more robust to short-time windowing effects as well as a speech aspiration noise component.

Reducing speech coding distortion for speaker identification

Published in:
Int. Conf. on Spoken Language Processing, ICSLP, 17-21 September 2006.

Summary

In this paper, we investigate the degradation of speaker identification performance due to speech coding algorithms used in digital telephone networks, cellular telephony, and voice over IP. By analyzing the difference between front-end feature vectors derived from coded and uncoded speech in terms of spectral distortion, we are able to quantify this coding degradation. This leads to two novel methods for distortion compensation: codebook and LPC compensation. Both are shown to significantly reduce front-end mismatch, with the second approach providing the most encouraging results. Full experiments using a GMM-UBM speaker ID system confirm the usefulness of both the front-end distortion analysis and the LPC compensation technique.