Today, governments, businesses, and the critical infrastructure are facing cyberattacks of increasing frequency, intensity, and complexity. With the widespread use of sensitive data in applications such as social network based systems and enterprise systems, the chances of breaching the privacy of individuals and corporations have increased. It has thus become essential to protect networks and data from security threats and privacy leakage. Towards this end, an important line of research has focused on enhancing security and privacy using the analysis of graphs, such as social graphs in reputation systems and anonymous communication systems, and causal dependency graphs in enterprise forensic systems.
In this thesis, we explore solutions to the limitations of two widely adopted graph analytics techniques in security and privacy applications, i.e., analysis of trust relationships in social graphs, and causal relationships in dependency graphs. We observe that classical paradigms for graph analytics, such as use of random walks on social graphs and use of breadth-first search on dependency graphs, have induced poor trade-offs between security/privacy and functionality due to lack of flexibility. Our insight is that leveraging graph-theoretic characteristics and machine learning techniques, we can make conventional paradigms adaptive to improve the security/privacy without harming the utility.
We present three systems for network security and privacy based on our advanced graph analytics: i) we introduce an adaptive random walk algorithm that uses a heterogeneous random walk length across nodes in a graph based on local structural characteristics, and propose SmartWalk, a security enhancing system which incorporates adaptive random walks in social network security applications. ii) We introduce a prioritized search algorithm that considers topological properties of nodes in a graph to accelerate the search, and propose PrioTracker, a causality tracker that automatically prioritizes the search for abnormal causal dependencies within a time constraint. iii) We introduce an interruptible context-based prioritized search algorithm that propagates and re-assesses node priorities by uncovering paths between nodes on a graph, and propose RAPID, an automated real-time alert triage system that helps scale up the alarm processing capability in enterprises.