


Published in:
Journal of Parallel and Distributed Computing, Vol. 64, No. 8, pp. 997-1005.


In many projects the true costs of high performance computing are currently dominated by software. Addressing these costs may require shifting to higher level languages such as Matlab. MatlabMPI is a Matlab implementation of the Message Passing Interface (MPI) standard and allows any Matlab program to exploit multiple processors. MatlabMPI currently implements the basic six functions that are the core of the MPI point-to-point communications standard. The key technical innovation of MatlabMPI is that it implements the widely used MPI “look and feel” on top of standard Matlab file I/O, resulting in an extremely compact (~350 lines of code) and “pure” implementation which runs anywhere Matlab runs, and on any heterogeneous combination of computers. The performance has been tested on both shared and distributed memory parallel computers (e.g. Sun, SGI, HP, IBM, Linux, MacOSX and Windows). MatlabMPI can match the bandwidth of C based MPI at large message sizes. A test image filtering application using MatlabMPI achieved a speedup of ~300 using 304 CPUs and ~15% of the theoretical peak (450 Gigaflops) on an IBM SP2 at the Maui High Performance Computing Center. In addition, this entire parallel benchmark application was implemented in 70 software-lines-of-code, illustrating the high productivity of this approach. MatlabMPI is available for download on the web.
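The idea of layering point-to-point messaging on ordinary file I/O can be illustrated with a minimal Python sketch. This is not the MatlabMPI code: the function names `send` and `recv`, the file-naming scheme, and the use of a separate lock file to signal that a message is complete are all assumptions made for illustration, and it presumes the processes share a common directory.

```python
import os
import pickle
import tempfile
import time


def send(comm_dir, source, dest, tag, data):
    """Write the message to a buffer file, then create an empty lock file."""
    base = os.path.join(comm_dir, f"p{source}_p{dest}_t{tag}")
    with open(base + ".buf", "wb") as f:
        pickle.dump(data, f)
    # The lock file is created only after the buffer file is closed,
    # so a receiver that sees the lock can read the buffer safely.
    open(base + ".lock", "wb").close()


def recv(comm_dir, source, dest, tag, poll_s=0.01, timeout_s=5.0):
    """Poll for the lock file, then read and return the message."""
    base = os.path.join(comm_dir, f"p{source}_p{dest}_t{tag}")
    deadline = time.monotonic() + timeout_s
    while not os.path.exists(base + ".lock"):
        if time.monotonic() > deadline:
            raise TimeoutError(f"no message at {base}")
        time.sleep(poll_s)
    with open(base + ".buf", "rb") as f:
        data = pickle.load(f)
    # Remove both files so the same (source, dest, tag) triple can be reused.
    os.remove(base + ".buf")
    os.remove(base + ".lock")
    return data


if __name__ == "__main__":
    shared = tempfile.mkdtemp()  # stands in for a directory visible to all ranks
    send(shared, source=0, dest=1, tag=7, data={"rows": [1, 2, 3]})
    print(recv(shared, source=0, dest=1, tag=7))
```

Because the only requirement in this sketch is a file system visible to every process, the same pattern would run on any mix of machines that can mount the shared directory, which is the portability argument the abstract makes.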




Beyond cepstra: exploiting high-level information in speaker recognition


Traditionally speaker recognition techniques have focused on using short-term, low-level acoustic information such as cepstra features extracted over 20-30 ms windows of speech. But speech is a complex behavior conveying more information about the speaker than merely the sounds that are characteristic of his vocal apparatus. This higher-level information includes speaker-specific prosodics, pronunciations, word usage and conversational style. In this paper, we review some of the techniques to extract and apply these sources of high-level information with results from the NIST 2003 Extended Data Task.




Exploiting nonacoustic sensors for speech enhancement


Nonacoustic sensors such as the general electromagnetic motion sensor (GEMS), the physiological microphone (P-mic), and the electroglottograph (EGG) offer multimodal approaches to speech processing and speaker and speech recognition. These sensors provide measurements of functions of the glottal excitation and, more generally, of the vocal tract articulator movements that are relatively immune to acoustic disturbances and can supplement the acoustic speech waveform. This paper describes an approach to speech enhancement that exploits these nonacoustic sensors according to their capability in representing specific speech characteristics in different frequency bands. Frequency-domain sensor phase, as well as magnitude, is found to contribute to signal enhancement. Preliminary testing involves the time-synchronous multi-sensor DARPA Advanced Speech Encoding Pilot Speech Corpus collected in a variety of harsh acoustic noise environments. The enhancement approach is illustrated with examples that indicate its applicability as a pre-processor to low-rate vocoding and speaker authentication, and for enhanced listening from degraded speech.




Multimodal speaker authentication using nonacoustic sensors

Published in:
Proc. Workshop on Multimodal User Authentication, 11-12 December 2003, pp. 215-222.


Many nonacoustic sensors are now available to augment user authentication. Devices such as the GEMS (glottal electromagnetic micro-power sensor), the EGG (electroglottograph), and the P-mic (physiological mic) all have distinct methods of measuring physical processes associated with speech production. A potentially exciting aspect of the application of these sensors is that they are less influenced by acoustic noise than a microphone. A drawback of having many sensors available is the need to develop features and classification technologies appropriate to each sensor. We therefore learn feature extraction based on data. State of the art classification with Gaussian Mixture Models and Support Vector Machines is then applied for multimodal authentication. We apply our techniques to two databases: the Lawrence Livermore GEMS corpus and the DARPA Advanced Speech Encoding Pilot corpus. We show the potential of nonacoustic sensors to increase authentication accuracy in realistic situations.




Passive operating system identification from TCP/IP packet headers

Published in:
ICDM Workshop on Data Mining for Computer Security, DMSEC, 19 November 2003.


Accurate operating system (OS) identification by passive network traffic analysis can continuously update less-frequent active network scans and help interpret alerts from intrusion detection systems. The most recent open-source passive OS identification tool (ettercap) rejects 70% of all packets and has a high 75-class error rate of 30% for non-rejected packets on unseen test data. New classifiers were developed using machine-learning approaches including cross-validation testing, grouping OS names into fewer classes, and evaluating alternate classifier types. Nearest neighbor and binary tree classifiers provide a low 9-class OS identification error rate of roughly 10% on unseen data without rejecting packets. This error rate drops to nearly zero when 10% of the packets are rejected.




Biometrically enhanced software-defined radios


Software-defined radios and cognitive radios offer tremendous promise, while having great need for user authentication. Authenticating users is essential to ensuring authorized access and actions in private and secure communications networks. User authentication for software-defined radios and cognitive radios is our focus here. We present various means of authenticating users to their radios and networks, authentication architectures, and the complementary combination of authenticators and architectures. Although devices can be strongly authenticated (e.g., cryptographically), reliably authenticating users is a challenge. To meet this challenge, we capitalize on new forms of user authentication combined with new authentication architectures to support features such as continuous user authentication and varying levels of trust-based authentication. We generalize biometrics to include recognizing user behaviors and use them in concert with knowledge- and token-based authenticators. An integrated approach to user authentication and user authentication architectures is presented here to enhance trusted radio communications networks.




Auditory signal processing as a basis for speaker recognition

Published in:
IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 19-22 October, 2003, pp. 111-114.


In this paper, we exploit models of auditory signal processing at different levels along the auditory pathway for use in speaker recognition. A low-level nonlinear model, at the cochlea, provides accentuated signal dynamics, while a high-level model, at the inferior colliculus, provides frequency analysis of modulation components that reveals additional temporal structure. A variety of features are derived from the low-level dynamic and high-level modulation signals. Fusion of likelihood scores from feature sets at different auditory levels with scores from standard mel-cepstral features provides an encouraging speaker recognition performance gain over use of the mel-cepstrum alone with corpora from land-line and cellular telephone communications.




System adaptation as a trust response in tactical ad hoc networks

Published in:
IEEE MILCOM 2003, 13-16 October 2003, pp. 209-214.


While mobile ad hoc networks offer significant improvements for tactical communications, these networks are vulnerable to node capture and other forms of cyberattack. In this paper we evaluated via simulation the impact of a passive attacker, a denial of service (DoS) attack, and a data swallowing attack. We compared two different adaptive network responses to these attacks against a baseline of no response for 10 and 20 node networks. Each response reflects a level of trust assigned to the captured node. Our simulation used a responsive variant of the ad hoc on-demand distance vector (AODV) routing algorithm and focused on the response performance. We assumed that the attacks had been detected and reported. We compared performance tradeoffs of attack, response, and network size by focusing on metrics such as "goodput", i.e., percentage of messages that reach the intended destination untainted by the captured node. We showed, for example, that under general conditions a DoS attack response should minimize attacker impact while a response to a data swallowing attack should minimize risk to the system and trust of the compromised node with most of the response benefit. We showed that the best network response depends on the mission goals, network configuration, density, network performance, attacker skill, and degree of compromise.




Acoustic, phonetic, and discriminative approaches to automatic language identification


Formal evaluations conducted by NIST in 1996 demonstrated that systems that used parallel banks of tokenizer-dependent language models produced the best language identification performance. Since that time, other approaches to language identification have been developed that match or surpass the performance of phone-based systems. This paper describes and evaluates three techniques that have been applied to the language identification problem: phone recognition, Gaussian mixture modeling, and support vector machine classification. A recognizer that fuses the scores of three systems that employ these techniques produces a 2.7% equal error rate (EER) on the 1996 NIST evaluation set and a 2.8% EER on the NIST 2003 primary condition evaluation set. An approach to dealing with the problem of out-of-set data is also discussed.
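For readers unfamiliar with the metric, the equal error rate is the operating point at which the miss rate on true trials equals the false-alarm rate on impostor trials. The sketch below estimates it with a simple threshold sweep over made-up scores. The function name `equal_error_rate` and the scores are illustrative assumptions, not the NIST scoring procedure or data from the paper.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Estimate the EER by sweeping the decision threshold over all scores.

    A trial is accepted when its score is >= the threshold. The estimate is
    the mean of the miss and false-alarm rates at the threshold where the
    two rates are closest.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        miss = sum(s < t for s in target_scores) / len(target_scores)
        false_alarm = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - false_alarm)
        if gap < best_gap:
            best_gap, eer = gap, (miss + false_alarm) / 2
    return eer


if __name__ == "__main__":
    # Made-up scores: one target falls below one non-target.
    targets = [0.9, 0.8, 0.7, 0.4]
    nontargets = [0.6, 0.3, 0.2, 0.1]
    print(equal_error_rate(targets, nontargets))  # 0.25
```

With only a handful of trials the estimate is coarse. Evaluations on large trial sets typically interpolate between neighboring thresholds rather than taking the nearest one.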




Fusing high- and low-level features for speaker recognition


The area of automatic speaker recognition has been dominated by systems using only short-term, low-level acoustic information, such as cepstral features. While these systems have produced low error rates, they ignore higher levels of information beyond low-level acoustics that convey speaker information. Recently published works have demonstrated that such high-level information can be used successfully in automatic speaker recognition systems by improving accuracy and potentially increasing robustness. Wide ranging high-level-feature-based approaches using pronunciation models, prosodic dynamics, pitch gestures, phone streams, and conversational interactions were explored and developed under the SuperSID project at the 2002 JHU CLSP Summer Workshop (WS2002). In this paper, we show how these novel features and classifiers provide complementary information and can be fused together to drive down the equal error rate on the 2001 NIST Extended Data Task to 0.2%, a 71% relative reduction in error over the previous state of the art.

