Publications

Creating a cyber moving target for critical infrastructure applications

Published in:
5th IFIP Int. Conf. on Critical Infrastructure Protection, ICCIP 2011, 19-21 March 2011.

Summary

Despite the significant amount of effort that often goes into securing critical infrastructure assets, many systems remain vulnerable to advanced, targeted cyber attacks. This paper describes the design and implementation of the Trusted Dynamic Logical Heterogeneity System (TALENT), a framework for live-migrating critical infrastructure applications across heterogeneous platforms. TALENT permits a running critical application to change its hardware platform and operating system, thus providing cyber survivability through platform diversity. TALENT uses containers (operating-system-level virtualization) and a portable checkpoint compiler to create a virtual execution environment and to migrate a running application across different platforms while preserving the state of the application (execution state, open files, and network connections). TALENT is designed to support general applications written in the C programming language. By changing the platform on the fly, TALENT creates a cyber moving target and significantly raises the bar for a successful attack against a critical application. Experiments demonstrate that a full migration completes in about one second.

USSS-MITLL 2010 human assisted speaker recognition

Summary

The United States Secret Service (USSS) teamed with MIT Lincoln Laboratory (MIT/LL) in the US National Institute of Standards and Technology's 2010 Speaker Recognition Evaluation of Human Assisted Speaker Recognition (HASR). We describe our qualitative and automatic speaker comparison processes and our fusion of these processes, which are adapted from USSS casework. The USSS-MIT/LL 2010 HASR results are presented. We also present post-evaluation results. The results are encouraging within the resolving power of the evaluation, which was limited to enable reasonable levels of human effort. Future ideas and efforts are discussed, including new features and capitalizing on naive listeners.

Information security for situational awareness in computer network defense

Published in:
Chapter Six, Situational Awareness in Computer Network Defense: Principles, Methods, and Applications, 2011, pp. 86-103.

Summary

Situational awareness (SA) - the perception of "what's going on" - is crucial in every field of human endeavor, especially so in the cyber world, where most of the protections afforded by physical time and distance are taken away. Since ancient times, military science has emphasized the importance of preserving your awareness of the battlefield while preventing your adversary from learning the true situation for as long as possible. Today cyber is officially recognized as a contested military domain like air, land, and sea. Therefore situational awareness in computer networks will be under attacks of military strength and will require military-grade protection. This chapter describes the emerging threats to computer SA and the potential avenues of defense against them.

Using United States government language proficiency standards for MT evaluation

Published in:
Chapter 5.3.3 in Handbook of Natural Language Processing and Machine Translation, 2011, pp. 775-82.

Summary

The purpose of this section is to discuss a method of measuring the degree to which the essential meaning of the original text is communicated in the MT output. We view this test as a measurement of the fundamental goal of MT; that is, to convey information accurately from one language to another. We conducted a series of experiments in which educated native readers of English responded to test questions about translated versions of texts originally written in Arabic and Chinese. We compared the results for those subjects using machine translations of the texts with those using professional reference translations. These comparisons serve as a baseline for determining the level of foreign language reading comprehension that can be achieved by a native English reader relying on machine translation technology. They also allow us to explore the relationship between the current, broadly accepted automatic measures of performance for machine translation and a test derived from the Defense Language Proficiency Test, which is used throughout the Defense Department for measuring foreign language proficiency. Our goal is to put MT system performance evaluation into terms that are meaningful to US government consumers of MT output.

Topic identification

Published in:
Chapter 12, Spoken Language Understanding: Systems for Extracting from Speech, Gokhan Tur and Renato De Mori, eds., 2011, pp. 319-356.

Summary

In this chapter we discuss the problem of identifying the underlying topics being discussed in spoken audio recordings. We focus primarily on the issues related to supervised topic classification or detection tasks using labeled training data, but we also discuss approaches for other related tasks including novel topic detection and unsupervised topic clustering. The chapter provides an overview of the common tasks and data sets, evaluation metrics, and algorithms most commonly used in this area of study.

Direct and latent modeling techniques for computing spoken document similarity

Published in:
SLT 2010, IEEE Workshop on Spoken Language Technology, 12-15 December 2010.

Summary

Document similarity measures are required for a variety of data organization and retrieval tasks including document clustering, document link detection, and query-by-example document retrieval. In this paper we examine existing and novel document similarity measures for use with spoken document collections processed with automatic speech recognition (ASR) technology. We compare direct vector space approaches using the cosine similarity measure applied to feature vectors constructed with various forms of term frequency inverse document frequency (TF-IDF) normalization against latent topic modeling approaches based on latent Dirichlet allocation (LDA). In document link detection experiments on the Fisher Corpus, we find that an approach that applies bagging to models derived from LDA substantially outperforms the direct vector space approach.
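
For readers unfamiliar with the direct vector space baseline named in this summary, the following is a minimal sketch of cosine similarity over TF-IDF vectors. It is not the paper's implementation: the summary mentions "various forms" of TF-IDF normalization without specifying them, so the weighting below (relative term frequency times log inverse document frequency) is one textbook variant chosen as an assumption, and the three toy sentences merely stand in for ASR transcripts. The function names `tfidf_vectors` and `cosine_similarity` are illustrative.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per tokenized document.

    Assumed weighting: (count / doc length) * log(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an all-zero vector is treated as similar to nothing
    return dot / (norm_u * norm_v)


# Toy stand-ins for ASR transcripts (hypothetical data).
docs = [
    "the game went into overtime last night".split(),
    "our team lost the game in overtime".split(),
    "the recipe needs two cups of flour".split(),
]
vecs = tfidf_vectors(docs)
sim_01 = cosine_similarity(vecs[0], vecs[1])  # share "game" and "overtime"
sim_02 = cosine_similarity(vecs[0], vecs[2])  # share only "the"
```

Because "the" occurs in every toy document, its inverse document frequency is zero, so the first and third documents end up with no weighted terms in common, while the first two remain similar through "game" and "overtime".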

Subgraph detection using eigenvector L1 norms

Published in:
23rd Int. Conf. on Neural Info. Process. Syst., NIPS, 6-9 December 2010, pp. 1633-41.

Summary

When working with network datasets, the theoretical framework of detection theory for Euclidean vector spaces no longer applies. Nevertheless, it is desirable to determine the detectability of small, anomalous graphs embedded into background networks with known statistical properties. Casting the problem of subgraph detection in a signal processing context, this article provides a framework and empirical results that elucidate a "detection theory" for graph-valued data. Its focus is the detection of anomalies in unweighted, undirected graphs through L1 properties of the eigenvectors of the graph's so-called modularity matrix. This metric is observed to have relatively low variance for certain categories of randomly-generated graphs, and to reveal the presence of an anomalous subgraph with reasonable reliability when the anomaly is not well-correlated with stronger portions of the background graph. An analysis of subgraphs in real network datasets confirms the efficacy of this approach.

The MIT-LL/AFRL IWSLT-2010 MT system

Published in:
Proc. Int. Workshop on Spoken Language Translation, IWSLT, 2 December 2010.

Summary

This paper describes the MIT-LL/AFRL statistical MT system and the improvements that were developed during the IWSLT 2010 evaluation campaign. As part of these efforts, we experimented with a number of extensions to the standard phrase-based model that improve performance on the Arabic and Turkish to English translation tasks. We also participated in the new French to English BTEC and English to French TALK tasks. We discuss the architecture of the MIT-LL/AFRL MT system, improvements over our 2008 system, and experiments we ran during the IWSLT-2010 evaluation. Specifically, we focus on 1) cross-domain translation using MAP adaptation, 2) Turkish morphological processing and translation, 3) improved Arabic morphology for MT preprocessing, and 4) system combination methods for machine translation.

Design, implementation and evaluation of covert channel attacks

Published in:
2010 IEEE Int. Conf. on Technologies for Homeland Security, 8 November 2010, pp. 481-487.

Summary

Covert channel attacks pose a threat to the security of critical infrastructure and key resources (CIKR). To design defenses and countermeasures against this threat, we must understand all classes of covert channel attacks along with their properties. Network-based covert channels have been studied in great detail in previous work, although several other classes of covert channels (hardware-based and operating system-based) are largely unexplored. One of our contributions is investigating these classes by designing, implementing, and experimentally evaluating several specific covert channel attacks. We implement and evaluate hardware-based and operating system-based attacks and show significant differences in their properties and mechanisms. We also present channel capacity differences among the various attacks, which span three orders of magnitude. Furthermore, we present the concept of hybrid covert channel attacks, which use two or more communication categories to transport data. Hybrid covert channels can be qualitatively harder to detect and counter than traditional covert channels. Finally, we summarize the lessons learned through covert channel attack design and implementation, which have important implications for critical asset protection and risk analysis. The study also facilitates the development of countermeasures to protect CIKR systems against covert channel attacks.

Temporally oblivious anomaly detection on large networks using functional peers

Published in:
IMC'10, Proc. of the ACM SIGCOMM Internet Measurement Conf., 1 November 2010, pp. 465-471.

Summary

Previous methods of network anomaly detection have focused on defining a temporal model of what is "normal," and flagging the "abnormal" activity that does not fit into this pre-trained construct. When monitoring traffic to and from IP addresses on a large network, this problem can become computationally complex, and potentially intractable, as a state model must be maintained for each address. In this paper, we present a method of detecting anomalous network activity without providing any historical context. By exploiting the size of the network along with the minimal overhead of NetFlow data, we are able to model groups of hosts performing similar functions to discover anomalous behavior. As a collection, these anomalies can be further described with a few high-level characterizations and we provide a means for creating and labeling these categories. We demonstrate our method on a very large-scale network consisting of 30 million unique addresses, focusing specifically on traffic related to web servers.