Effective Keyword Search over Weighted Graphs
Real graphs often contain edge and node weights, representing, for instance, penalty, distance or uncertainty. We study the problem of keyword search over weighted node-labeled graphs, in which a query consists of a set of keywords and an answer is a subgraph whose nodes contain the keywords. We evaluate answers using three ranking strategies: optimizing edge weights, optimizing node weights, and a bi-objective combination of both node and edge weights. We prove that optimizing node weights and the bi-objective function are NP-hard. We propose an algorithm that optimizes edge weights and has an approximation ratio of two for the unique node enumeration paradigm. To optimize node weights and the bi-objective function, we propose transformations that distribute node weights onto the edges. We then prove that our transformations allow our algorithm to also optimize node weights and the bi-objective function with the same approximation ratio of two. Notably, the proposed transformations are compatible with existing algorithms that only optimize edge weights. We empirically show that in many natural examples, incorporating node weights (both keyword holders and middle nodes) produces more relevant answers than ranking methods based only on edge weights. Extensive experiments over real-life datasets verify the effectiveness and efficiency of our solution.
Branch: CSE Domain: Data Mining
Developed In: Java