Publications
Dynamic Distributed Dimensional Data Model (D4M) database and computation system
Summary
Summary
A crucial element of large web companies is their ability to collect and analyze massive amounts of data. Tuple store databases are a key enabling technology employed by many of these companies (e.g., Google Big Table and Amazon Dynamo). Tuple stores are highly scalable and run on commodity clusters, but...
Creating a cyber moving target for critical infrastructure applications using platform diversity
Summary
Summary
Despite the significant effort that often goes into securing critical infrastructure assets, many systems remain vulnerable to advanced, targeted cyber attacks. This paper describes the design and implementation of the Trusted Dynamic Logical Heterogeneity System (TALENT), a framework for live-migrating critical infrastructure applications across heterogeneous platforms. TALENT permits a running...
Fundamental Questions in the Analysis of Large Graphs
Summary
Summary
Graphs are a general approach for representing information that spans the widest possible range of computing applications. They are particularly important to computational biology, web search, and knowledge discovery. As the sizes of graphs increase, the need to apply advanced mathematical and computational techniques to solve these problems is growing...
Visualizing Large Kronecker Graphs
Summary
Summary
Kronecker graphs have been shown to be one of the most promising models for real-world networks. Visualization of Kronecker graphs is an important challenge. This chapter describes an interactive framework to assist scientists and engineers in generating, analyzing, and visualizing Kronecker graphs with as little effort as possible.
A knowledge-based operator for a genetic algorithm which optimizes the distribution of sparse matrix data
Summary
Summary
We present the Hogs and Slackers genetic algorithm (GA) which addresses the problem of improving the parallelization efficiency of sparse matrix computations by optimally distributing blocks of matrices data. The performance of a distribution is sensitive to the non-zero patterns in the data, the algorithm, and the hardware architecture. In...
Linear algebraic notation and definitions
Summary
Summary
This chapter presents notation, definitions, and conventions for graphs, matrices, arrays, and operations upon them.
Subgraph Detection
Summary
Summary
Detecting subgraphs of interest in larger graphs is the goal of many graph analysis techniques. The basis of detection theory is computing the probability of a “foreground” with respect to a model of the “background” data. Hidden Markov Models represent one possible foreground model for patterns of interaction in a...
The Kronecker theory of power law graphs
Summary
Summary
An analytical theory of power law graphs is presented based on the Kronecker graph generation technique. Explicit, stochastic, and instance Kronecker graphs are used to highlight different properties. The analysis uses Kronecker exponentials of complete bipartite graphs to formulate the substructure of such graphs. The Kronecker theory allows various high-level...
Graphs and matrices
Summary
Summary
A linear algebraic approach to graph algorithms that exploits the sparse adjacency matrix representation of graphs can provide a variety of benefits. These benefits include syntactic simplicity, easier implementation, and higher performance. Selected examples are presented illustrating these benefits. These examples are drawn from the remainder of the book in...
Topic modeling for spoken documents using only phonetic information
Summary
Summary
This paper explores both supervised and unsupervised topic modeling for spoken audio documents using only phonetic information. In cases where word-based recognition is unavailable or infeasible, phonetic information can be used to indirectly learn and capture information provided by topically relevant lexical items. In some situations, a lack of transcribed...