Immunization and Summarization of Large Graphs
In the thesis, we present results for two problems related to graphs.
Immunizing a subset of nodes in a network - enabling them to identify and withstand the spread of harmful content - is one of the most effective ways to counter the spread of malicious content. It has applications in network security, public health policy, and social media surveillance. Finding a subset of nodes whose immunization results in the least vulnerability of the network is a computationally challenging task. In this work, we establish a relationship between a widely used network vulnerability measure and the combinatorial properties of networks. Using this relationship and graph summarization techniques, we propose an efficient approximation algorithm to find a set of nodes to immunize. We provide theoretical justifications for the proposed solution and analytical bounds on the runtime of our algorithm. We empirically demonstrate on various real-world networks that the performance of our algorithm is an order of magnitude better than the state-of-the-art solution. We also show that in practice the runtime of our algorithm is significantly lower than that of the best-known solution.
Massive sizes of real-world graphs, such as social networks and web graphs, impose serious challenges to process and perform analytics on them. These issues can be resolved by working on a small summary of the graph instead. A summary is a compressed version of the graph that removes several details, yet preserves its essential structure. Generally, some predefined quality measure of the summary is optimized to bound the approximation error incurred by working on the summary instead of the whole graph. All known summarization algorithms are computationally prohibitive and do not scale to large graphs. In this paper, we present an efficient randomized algorithm to compute graph summaries with the goal to minimize reconstruction error. We propose a novel weighted sampling scheme to sample vertices for merging that will result in the least reconstruction error. We provide analytical bounds on the running time of the algorithm and prove an approximation guarantee for our score computation. The efficiency of our algorithm makes it scalable to very large graphs on which known algorithms cannot be applied. We test our algorithm on several real-world graphs to empirically demonstrate the quality of summaries produced and compare them to state-of-the-art algorithms. We use the summaries to answer several structural queries about the original graph and report their accuracies.
List of Publications:
Muhammad Ahmad, Sarwan Ali, Juvaria Tariq, Imdadullah Khan, Mudassir Shabbir, Arif Zaman, Combinatorial trace method for network immunization in Information Sciences, 2020.
Muhammad Ahmad, Juvaria Tariq, Mudassir Shabbir, Imdadullah Khan, Spectral Methods for Immunization of Large Networks in Australasian Journal of Information Systems, 2017
Maham Anwar Beg, Muhammad Ahmad, Arif Zaman, Imdadullah Khan, Scalable Approximation Algorithm for Graph Summarization in Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD, 2018.
Juvaria Tariq, Muhammad Ahmad, Imdadullah Khan, Mudassir Shabbir Scalable Approximation Algorithm for Graph Immunization in Pacific-Asia Conference on Information Systems, PACIS, 2017.
Muhammad Ahmad, Juvaria Tariq, Muhammad Farhan, Mudassir Shabbir, Imdadullah Khan Who Should Receive the Vaccine? in The Australasian Data Mining Conference, AusDM, 2016.