Characterizing and Exploiting Tree-Like Structure in Large Social and Information Networks

Principal Investigator(s): 
Michael Mahoney

In this project, researchers are developing methods to characterize and exploit "tree-like" structure in realistic social and information networks. In particular, they are focused on two related but complementary notions of tree-like-ness, as well as related heuristic variants, for graphs. These notions will be used to develop tools to characterize the manner in which realistic complex networks are coarsely tree-like, and this characterization will be used to develop tools for improved analytics on realistic networks. Particular attention will be paid to how this can shed light on intermediate-scale, i.e., not very small-scale or local and not very large-scale or global, structure in realistic networks.

Recent technological advances have led to an explosive growth of social and information network data that have been analyzed across a wide range of disciplines. This has led to a wide range of metrics, e.g., degree distributions, clustering coefficients, homophily measures, and so on, to describe and extract insight from realistic networks. Ultimately, the hope is that these metrics will help domain scientists to extract information and actionable insight from these networks. This enthusiasm has not yet been fully realized, however, for several reasons: since existing metrics do not perfectly capture properties of interest; since the coarse structural properties of existing networks are subtle and not always fully understood; since the properties of popular algorithms applied to realistic networks are only partially understood; and so on. For these and related reasons, an ongoing challenge is to develop finer actionable metrics to understand the properties of realistic networks. This work is timely since, although there is a large body of recent theoretical and empirical work suggesting that such networks have tree-like properties, existing algorithmic and statistical tools for identifying tree-like structure were developed for more structured applications, thus limiting their usefulness when applied to realistic networks.

Funding provided by NSF grant #1423621