本文已被:浏览 3768次 下载 3887次
中文摘要: 排序是信息检索、数据挖掘以及社会网络分析的基础工作之一。 在线社交网络和社
会媒体的快速发展积累了大量的图数据——由表示实体的节点和表示实体间关系的连边构
成。 图数据中节点之间连接关系复杂, 通常缺少显式的全序结构, 使得图排序在图数据分析
中显得尤为重要。 图排序算法主要包括 2 大类, 面向节点中心度的图排序算法和面向节点集
合多样性的图排序算法。 与传统的图排序不同 , 多样性图排序考虑排序和聚类的融合, 体现
为节点集合对网络整体的覆盖程度。 近年来, 多样性图排序得到了广泛的关注, 取得了一系
列研究进展,研究成果成功应用到了搜索结果排序、文档自动摘要、信息推荐系统和影响最大
化等诸多场景中。 文章评述了多样性图排序的研究现状及主要进展, 将现有的多样性图排序
方法按照研究思路的不同分为边际效益最大化、竞争随机游走、聚类与排序互增强 3 类, 分别
评述了每类方法的优势和不足。 最后指出 , 设计有效的评价指标和标准测试集、克服多样性
图排序面临的精度和速度的矛盾等是多样性图排序未来的研究重点。
中文关键词: 图数据,多样性图排序,社交网络
Abstract:Ranking is a fundamental task in information retrieval, data mining, and social network analysis.
With the rapid proliferation of online social network and social media, a great deal of graph data has been accumulated. Graph data is made up of nodes, representing entities, and edges, characterizing relationships among
entities. In graph data, nodes are connected through heterogeneous relationships, lacking explicit order among
them. Therefore, graph ranking is particularly important for graph data analysis.
Existing graph ranking algorithms could be roughly classified into two categories, respectively graph ranking based on centrality of nodes and graph ranking on diversity of a set of nodes. For centrality-based graph
ranking, spectral analysis is main-stream technique, where nodes are ranked according to their magnitude in primary eigenvectors of certain matrix, like adjacency matrix, Google matrix, and modularity matrix. These methods are particularly useful at offering a global ranking of nodes. However, a global ranking is not sufficient in
many scenarios, such as personalized recommendation, influence maximization, and ranking of search results.
Hence, researchers begin to study the diversified ranking on graphs. Different from traditional ranking on
graph, diversified graph ranking focuses on the interplay of ranking and clustering, capturing the intrinsic structure regularity of graph. In recent years, diversified ranking on graph attracts great attention, and several methods are proposed and successfully applied in many scenarios, including search results ranking, information recommendation system, document summarization, and influence maximization. In this paper, we summarize
main research progress about diversified ranking on graph.
We roughly classified existing methods into three categories, respectively based on marginal-benefit maximization, random walk with node competition, and reinforcement between clustering and ranking. For marginal-benefit maximization, a benefit function is generally defined for optimization. With such a benefit function,
nodes are selected one by one according to their marginal benefit. Each time, the node with the maximum marginal benefit is selected. In this way, nodes are ranked in the decreasing order of marginal benefit. The performance of this kind of methods depends on the benefit function. The other kind of methods are based on random
walk. Traditional random walk is essentially a kind of centrality-based method. To guarantee the diversity of
ranking, researchers try to introduce competition among adjacent nodes into traditional random walk. Two typical examples are DivRank and Grasshopper. DivRank revises the transition probability among nodes to implement a rich-get-richer mechanism, forcing adjacent nodes to compete for scores and consequently form a diversified ranking. Grasshopper adopts a multi-step manner, selecting one node at each iteration and taking the selected nodes as sink point to penalize its adjacent nodes in following iterations. The third kind of diversified graph ranking exploits the reinforcement between ranking and clustering. Cluster-dependent ranking and ranking-based clustering are iteratively updated to form a diversified ranking.
Finally, we introduce some potential research trends for diversified graph ranking. We point out the community lacks a
commonly-available and widely-accepted benchmark for diversified ranking on graphs. A benchmark is particularly valuable
for filtering out state-of-the-art methods. Meanwhile, we also need to design good evaluation metrics for diversified graph
ranking. With the rapid increase of the scale of graph data, to offer an off-the-shelf method, scalability is another key issue for
future research.
文章编号: 中图分类号: 文献标志码:
基金项目:
作者 | 单位 |
程学旗 |
Author Name | Affiliation |
Cheng Xueqi |
引用文本:
程学旗.[2015年第2期] 多样性图排序的研究现状及展望[J].中国科学院院刊,2015,30(2):.
Cheng Xueqi.Research status and trends of diversified graph ranking[J].Bulletin of Chinese Academy of Sciences,2015,30(2):.
程学旗.[2015年第2期] 多样性图排序的研究现状及展望[J].中国科学院院刊,2015,30(2):.
Cheng Xueqi.Research status and trends of diversified graph ranking[J].Bulletin of Chinese Academy of Sciences,2015,30(2):.