Publications

Refine Results

(Filters Applied) Clear All

Finding focus in the blur of moving-target techniques

Published in:
IEEE Security and Privacy, Vol. 12, No. 2, March/April 2014, pp. 16-26.

Summary

Moving-target (MT) techniques seek to randomize system components to reduce the likelihood of a successful attack, add dynamics to a system to reduce the lifetime of an attack, and diversify otherwise homogeneous collections of systems to limit the damage of a large-scale attack. In this article, we review the five dominant domains of MT techniques, consider the advantages and weaknesses of each, and make recommendations for future research.
READ LESS

Summary

Moving-target (MT) techniques seek to randomize system components to reduce the likelihood of a successful attack, add dynamics to a system to reduce the lifetime of an attack, and diversify otherwise homogeneous collections of systems to limit the damage of a large-scale attack. In this article, we review the five...

READ MORE

Effective parallel computation of eigenpairs to detect anomalies in very large graphs

Published in:
SIAM Conference on Parallel Processing for Scientific Computing

Summary

The computational driver for an important class of graph analysis algorithms is the computation of leading eigenvectors of matrix representations of the graph. In this presentation, we discuss the challenges of calculating eigenvectors of modularity matrices derived from very large graphs (upwards of a billion vertices) and demonstrate the scaling properties of parallel eigensolvers when applied to these matrices.
READ LESS

Summary

The computational driver for an important class of graph analysis algorithms is the computation of leading eigenvectors of matrix representations of the graph. In this presentation, we discuss the challenges of calculating eigenvectors of modularity matrices derived from very large graphs (upwards of a billion vertices) and demonstrate the scaling...

READ MORE

Authenticated broadcast with a partially compromised public-key infrastructure

Published in:
Info. and Comput., Vol. 234, February 2014, pp. 17-25.

Summary

Given a public-key infrastructure (PKI) and digital signatures, it is possible to construct broadcast protocols tolerating any number of corrupted parties. Existing protocols, however, do not distinguish between corrupted parties who do not follow the protocol, and honest parties whose secret (signing) keys have been compromised but continue to behave honestly. We explore conditions under which it is possible to construct broadcast protocols that still provide the usual guarantees (i.e., validity/agreement) to the latter. Consider a network of n parties, where an adversary has compromised the secret keys of up to tc honest parties, where an adversary has compromised the secret keys of up to tc honest parties and, in addition, fully controls the behavior of up to ta other parties. We show that for any fixed tc>0 and any fixed ta, there exists an efficient protocol for broadcast if and only if 2 ta + min (ta, tc) < n. (When tc = 0, standard results imply feasibility for all ta < n.) We also show that if tc, ta are not fixed, but are only guaranteed to satisfy the above bound, then broadcast is impossible to achieve except for a few specific values of n; for these "exceptional" values of n, we demonstrate broadcast protocols. Taken together, our results give a complete characterization of this problem.
READ LESS

Summary

Given a public-key infrastructure (PKI) and digital signatures, it is possible to construct broadcast protocols tolerating any number of corrupted parties. Existing protocols, however, do not distinguish between corrupted parties who do not follow the protocol, and honest parties whose secret (signing) keys have been compromised but continue to behave...

READ MORE

Characterizing phonetic transformations and acoustic differences across English dialects

Published in:
IEEE Trans. Audio, Speech, and Lang. Process., Vol. 22, No. 1, January 2014, pp. 110-24.

Summary

In this work, we propose a framework that automatically discovers dialect-specific phonetic rules. These rules characterize when certain phonetic or acoustic transformations occur across dialects. To explicitly characterize these dialect-specific rules, we adapt the conventional hidden Markov model to handle insertion and deletion transformations. The proposed framework is able to convert pronunciation of one dialect to another using learned rules, recognize dialects using learned rules, retrieve dialect-specific regions, and refine linguistic rules. Potential applications of our proposed framework include computer-assisted language learning, sociolinguistics, and diagnosis tools for phonological disorders.
READ LESS

Summary

In this work, we propose a framework that automatically discovers dialect-specific phonetic rules. These rules characterize when certain phonetic or acoustic transformations occur across dialects. To explicitly characterize these dialect-specific rules, we adapt the conventional hidden Markov model to handle insertion and deletion transformations. The proposed framework is able to...

READ MORE

Content + context networks for user classification in Twitter

Published in:
Frontiers of Network Analysis, NIPS Workshop, 9 December 2013.

Summary

Twitter is a massive platform for open communication between diverse groups of people. While traditional media segregates the world's population on lines of language, age, physical location, social status, and many other characteristics, Twitter cuts through these divides. The result is an extremely diverse social network. In this work, we combine features of this network structure with content analytics on the tweets in order to create a content + context network, capturing the relations not only between people, but also between people and content and between content and content. This rich structure allows deep analysis into many aspects of communication over Twitter. We focus on predicting user classifications by using relational probability trees with features from content + context networks. Experiments demonstrate that these features are salient and complementary for user classification.
READ LESS

Summary

Twitter is a massive platform for open communication between diverse groups of people. While traditional media segregates the world's population on lines of language, age, physical location, social status, and many other characteristics, Twitter cuts through these divides. The result is an extremely diverse social network. In this work, we...

READ MORE

Optimizing media access strategy for competing cognitive radio networks

Published in:
GLOBECOM 2013: 2013 IEEE Global Communications Conf., 9-13 December 2013.

Summary

This paper describes an adaptation of cognitive radio technology for tactical wireless networking. We introduce Competing Cognitive Radio Network (CCRN) featuring both communicator and jamming cognitive radio nodes that strategize in taking actions on an open spectrum under the presence of adversarial threats. We present the problem in the Multi-armed Bandit (MAB) framework and develop the optimal media access strategy consisting of mixed communicator and jammer actions in a Bayesian setting for Thompson sampling based on extreme value theory. Empirical results are promising that the proposed strategy seems to outperform Lai & Robbins and UCB, some of the most important MAB algorithms known to date.
READ LESS

Summary

This paper describes an adaptation of cognitive radio technology for tactical wireless networking. We introduce Competing Cognitive Radio Network (CCRN) featuring both communicator and jamming cognitive radio nodes that strategize in taking actions on an open spectrum under the presence of adversarial threats. We present the problem in the Multi-armed...

READ MORE

The MIT-LL/AFRL IWSLT-2013 MT System

Summary

This paper describes the MIT-LL/AFRL statistical MT system and the improvements that were developed during the IWSLT 2013 evaluation campaign [1]. As part of these efforts, we experimented with a number of extensions to the standard phrase-based model that improve performance on the Russian to English, Chinese to English, Arabic to English, and English to French TED-talk translation task. We also applied our existing ASR system to the TED-talk lecture ASR task. We discuss the architecture of the MIT-LL/AFRL MT system, improvements over our 2012 system, and experiments we ran during the IWSLT-2013 evaluation. Specifically, we focus on 1) cross-entropy filtering of MT training data, and 2) improved optimization techniques, 3) language modeling, and 4) approximation of out-of-vocabulary words.
READ LESS

Summary

This paper describes the MIT-LL/AFRL statistical MT system and the improvements that were developed during the IWSLT 2013 evaluation campaign [1]. As part of these efforts, we experimented with a number of extensions to the standard phrase-based model that improve performance on the Russian to English, Chinese to English, Arabic...

READ MORE

Systematic analysis of defenses against return-oriented programming

Published in:
RAID 2013: 16th Int. Symp. on Research in Attacks, Intrusions, and Defenses, LNCS 8145, 23-25 October 2013.

Summary

Since the introduction of return-oriented programming, increasingly compiles defenses and subtle attacks that bypass them have been proposed. Unfortunately the lack of a unifying threat model among code reuse security papers makes it difficult to evaluate the effectiveness of defenses, and answer critical questions about the interoperability, composability, and efficacy of existing defensive techniques. For example, what combination of defenses protect against every known avenue of code reuse? What is the smallest set of such defenses? In this work, we study the space of code reuse attacks by building a formal model of attacks and their requirements, and defenses and their assumptions. We use a SAT solver to perform scenario analysis on our model in two ways. First, we analyze the defense configurations of a real-world system. Second, we reason about hypothetical defense bypasses. We prove by construction that attack extensions implementing the hypothesized functionality are possible even if a 'perfect' version of the defense is implemented. Our approach can be used to formalize the process of threat model definition, analyze defense configurations, reason about composability and efficacy, and hypothesize about new attacks and defenses.
READ LESS

Summary

Since the introduction of return-oriented programming, increasingly compiles defenses and subtle attacks that bypass them have been proposed. Unfortunately the lack of a unifying threat model among code reuse security papers makes it difficult to evaluate the effectiveness of defenses, and answer critical questions about the interoperability, composability, and efficacy...

READ MORE

Competing Mobile Network Game: embracing antijamming and jamming strategies with reinforcement learning

Published in:
2013 IEEE Conf. on Communications and Network Security (CNS), 14-16 October 2013, pp. 28-36.

Summary

We introduce Competing Mobile Network Game (CMNG), a stochastic game played by cognitive radio networks that compete for dominating an open spectrum access. Differentiated from existing approaches, we incorporate both communicator and jamming nodes to form a network for friendly coalition, integrate antijamming and jamming subgames into a stochastic framework, and apply Q-learning techniques to solve for an optimal channel access strategy. We empirically evaluate our Q-learning based strategies and find that Minimax-Q learning is more suitable for an aggressive environment than Nash-Q while Friend-or-for Q-learning can provide the best solution under distributed mobile ad hoc networking scenarios in which the centralized control can hardly be available.
READ LESS

Summary

We introduce Competing Mobile Network Game (CMNG), a stochastic game played by cognitive radio networks that compete for dominating an open spectrum access. Differentiated from existing approaches, we incorporate both communicator and jamming nodes to form a network for friendly coalition, integrate antijamming and jamming subgames into a stochastic framework...

READ MORE

D4M 2.0 Schema: a general purpose high performance schema for the Accumulo database

Summary

Non-traditional, relaxed consistency, triple store databases are the backbone of many web companies (e.g., Google Big Table, Amazon Dynamo, and Facebook Cassandra). The Apache Accumulo database is a high performance open source relaxed consistency database that is widely used for government applications. Obtaining the full benefits of Accumulo requires using novel schemas. The Dynamic Distributed Dimensional Data Model (D4M) [http://www.mit.edu.ezproxyberklee.flo.org/~kepner/D4M] provides a uniform mathematical framework based on associative arrays that encompasses both traditional (i.e., SQL) and non-traditional databases. For non-traditional databases D4M naturally leads to a general purpose schema that can be used to fully index and rapidly query every unique string in a dataset. The D4M 2.0 Schema has been applied with little or no customization to cyber, bioinformatics, scientific citation, free text, and social media data. The D4M 2.0 Schema is simple, requires minimal parsing, and achieves the highest published Accumulo ingest rates. The benefits of the D4M 2.0 Schema are independent of the D4M interface. Any interface to Accumulo can achieve these benefits by using the D4M 2.0 Schema.
READ LESS

Summary

Non-traditional, relaxed consistency, triple store databases are the backbone of many web companies (e.g., Google Big Table, Amazon Dynamo, and Facebook Cassandra). The Apache Accumulo database is a high performance open source relaxed consistency database that is widely used for government applications. Obtaining the full benefits of Accumulo requires using...

READ MORE