Publications

Spectral representations of nonmodal phonation

Published in:
IEEE Trans. Audio, Speech, Language Proc., Vol. 16, No. 1, January 2008, pp. 34-46.

Summary

Regions of nonmodal phonation, which exhibit deviations from uniform glottal-pulse periods and amplitudes, occur often in speech and convey information about linguistic content, speaker identity, and vocal health. Some aspects of these deviations are random, including small perturbations, known as jitter and shimmer, as well as more significant aperiodicities. Other aspects are deterministic, including repeating patterns of fluctuations such as diplophonia and triplophonia. These deviations are often the source of misinterpretation of the spectrum. In this paper, we introduce a general signal-processing framework for interpreting the effects of both stochastic and deterministic aspects of nonmodality on the short-time spectrum. As an example, we show that the spectrum is sensitive to even small perturbations in the timing and amplitudes of glottal pulses. In addition, we illustrate important characteristics that can arise in the spectrum, including apparent shifting of the harmonics and the appearance of multiple pitches. For stochastic perturbations, we arrive at a formulation of the power-spectral density as the sum of a low-pass line spectrum and a high-pass noise floor. Our findings are relevant to a number of speech-processing areas including linear-prediction analysis, sinusoidal analysis-synthesis, spectrally derived features, and the analysis of disordered voices.

Performance metrics and software architecture

Published in:
High Performance Embedded Computing Handbook, Chapter 15

Summary

This chapter presents high performance embedded computing (HPEC) software architectures and evaluation metrics. A canonical HPEC application is used to illustrate basic concepts. The chapter reviews different types of parallelism and discusses performance analysis techniques. It presents a typical programmable multicomputer and explores the performance trade-offs of different parallel mappings on this computer using key system performance metrics. HPEC systems are amongst the most challenging systems in the world to build. Synthetic Aperture Radar (SAR) is one of the most common modes in a radar system and one of the most computationally stressing to implement. Often the first step in the development of a system is to produce a rough estimate of how many processors will be needed. The parallel opportunities at each stage of the calculation discussed in the previous section show that there are many different ways to exploit parallelism in this application. The chapter concludes with a discussion of the impact of different software implementation approaches.

Radar Signal Processing: An Example of High Performance Embedded Computing

Published in:
High Performance Embedded Computing Handbook, Chapter 6

Summary

This chapter focuses on the computational complexity of the front-end of the surface moving-target indication (SMTI) radar application. SMTI radars can require over one trillion operations per second of computation for wideband systems. The adaptive beamforming performed in SMTI radars is one of the major computational complexity drivers. The goal of the SMTI radar is to process the received signals to detect targets while rejecting clutter returns and noise. The radar must also mitigate interference from unintentional sources such as RF systems transmitting in the same band and from jammers that may be intentionally trying to mask targets. The pulse compression stage filters the data to concentrate the signal energy of a relatively long transmitted radar pulse into a short pulse response. The relative range rate between the radar and the ground along the line of sight of the sidelobe may be the same as the range rate of the target detected in the mainbeam.

Parallel and Distributed Processing

Published in:
High Performance Embedded Computing Handbook, Chapter 18

Summary

This chapter discusses parallel and distributed programming technologies for high performance embedded systems. Computational or memory constraints can be overcome with parallel processing. The primary goal of parallel processing is to improve performance by distributing computation across multiple processors or increasing dataset sizes by distributing data across multiple processors’ memory. The typical programmer has little to no experience writing programs that run on multiple processors. The transition from serial to parallel programming requires significant changes in the programmer’s way of thinking. For example, the programmer must worry about how to distribute data and computation across multiple processors to maximize performance and how to synchronize and communicate between processors. Although most programmers will likely admit to having no experience with parallel programming, many have indeed had exposure to a rudimentary type in the form of threads. A typical threaded program starts execution as a single thread.

Topic identification from audio recordings using word and phone recognition lattices

Published in:
2007 IEEE Workshop on Automatic Speech Recognition and Understanding, 9-13 December 2007, pp. 659-664.

Summary

In this paper, we investigate the problem of topic identification from audio documents using features extracted from speech recognition lattices. We are particularly interested in the difficult case where the training material is minimally annotated with only topic labels. Under this scenario, the lexical knowledge that is useful for topic identification may not be available, and automatic methods for extracting linguistic knowledge useful for distinguishing between topics must be relied upon. Towards this goal we investigate the problem of topic identification on conversational telephone speech from the Fisher corpus under a variety of increasingly difficult constraints. We contrast the performance of systems that have knowledge of the lexical units present in the audio data, against systems that rely entirely on phonetic processing.

An interactive attack graph cascade and reachability display

Published in:
VizSEC 2007, Proc. of the Workshop on Visualization for Computer Security, 29 October 2007, pp. 221-236.

Summary

Attack graphs for large enterprise networks improve security by revealing critical paths used by adversaries to capture network assets. Even with simplification, current attack graph displays are complex and difficult to relate to the underlying physical networks. We have developed a new interactive tool intended to provide a simplified and more intuitive understanding of key weaknesses discovered by attack graph analysis. Separate treemaps are used to display host groups in each subnet and hosts within each treemap are grouped based on reachability, attacker privilege level, and prerequisites. Users position subnets themselves to reflect their own intuitive grasp of network topology. Users can also single-step the attack graph to successively add edges that cascade to show how attackers progress through a network and learn what vulnerabilities or trust relationships allow critical steps. Finally, an integrated reachability display demonstrates how filtering devices affect host-to-host network reachability and influence attacker actions. This display scales to networks with thousands of hosts and many subnets. Rapid interactivity has been achieved because of an efficient C++ computation engine (a program named NetSPA) that performs attack graph and reachability computations, while a Java application manages the display and user interface.

Tuning intrusion detection to work with a two encryption key version of IPsec

Published in:
IEEE MILCOM 2007, 29-31 October 2007, pp. 3977-3983.

Summary

Network-based intrusion detection systems (NIDSs) are one component of a comprehensive network security solution. The use of IPsec, which encrypts network traffic, renders network intrusion detection virtually useless unless traffic is decrypted at network gateways. Host-based intrusion detection systems (HIDSs) can provide some of the functionality of NIDSs but with limitations. HIDSs cannot perform a network-wide analysis and can be subverted if a host is compromised. We propose an approach to intrusion detection that combines HIDS, NIDS, and a version of IPsec that encrypts the header and the body of IP packets separately ("Two-Zone IPsec"). We show that all of the network events currently detectable by the Snort NIDS on unencrypted network traffic are also detectable on encrypted network traffic using this approach. The NIDS detects network-level events that HIDSs have trouble detecting and HIDSs detect application-level events that can't be detected by the NIDS.

Sinewave analysis/synthesis based on the fan-chirp transform

Published in:
Proc. IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA, 21-24 October 2007, pp. 247-250.

Summary

There have been numerous recent strides in making sinewave analysis consistent with time-varying sinewave models. This is particularly important in high-frequency speech regions where harmonic frequency modulation (FM) can be significant. One notable approach is through the Fan Chirp transform that provides a set of FM-sinewave basis functions consistent with harmonic FM. In this paper, we develop a complete sinewave analysis/synthesis system using the Fan Chirp transform. With this system we are able to obtain more accurate sinewave frequencies and phases, thus creating more accurate frequency tracks, in contrast to a system derived from the short-time Fourier transform, particularly for high-frequency regions of large-bandwidth analysis. With synthesis, we show an improvement in segmental signal-to-noise ratio with respect to waveform matching with the largest gains during rapid pitch dynamics.

The MIT-LL/AFRL IWSLT-2007 MT System

Published in:
Int. Workshop on Spoken Language Translation, IWSLT, 15-16 October 2007.

Summary

The MIT-LL/AFRL MT system implements a standard phrase-based, statistical translation model. It incorporates a number of extensions that improve performance for speech-based translation. During this evaluation our efforts focused on the rapid porting of our SMT system to a new language (Arabic) and novel approaches to translation from speech input. This paper discusses the architecture of the MIT-LL/AFRL MT system, improvements over our 2006 system, and experiments we ran during the IWSLT-2007 evaluation. Specifically, we focus on 1) experiments comparing the performance of confusion network decoding and direct lattice decoding techniques for speech machine translation, 2) the application of lightweight morphology for Arabic MT pre-processing and 3) improved confusion network decoding.

Classification methods for speaker recognition

Published in:
Chapter in Springer Lecture Notes in Artificial Intelligence, 2007.

Summary

Automatic speaker recognition systems have a foundation built on ideas and techniques from the areas of speech science for speaker characterization, pattern recognition and engineering. In this chapter we provide an overview of the features, models, and classifiers derived from these areas that are the basis for modern automatic speaker recognition systems. We describe the components of state-of-the-art automatic speaker recognition systems, discuss application considerations and provide a brief survey of accuracy for different tasks.