hierarchical 阶层式;分层的;分等级的

Community定义image.png

Edge Overlap:04 Community Structure in Networks - 图3,其中04 Community Structure in Networks - 图4表示节点04 Community Structure in Networks - 图5的邻居集合。
Overlap=0表示一条边是局部bridge。
image.pngimage.png

image.pngimage.png
image.pngimage.png
如下图所示,可以看作是三个社区,社区内部是强连接,社区之间弱连接。
image.png
Communities: sets of tightly connected nodes

Modularity Q

Modularity Q:衡量网络划分成社区的效果。
给定网络的分区为不相干的分组04 Community Structure in Networks - 图13,则04 Community Structure in Networks - 图14
其中expected # edges within group s需要一个null model

Null Model

基于给定04 Community Structure in Networks - 图15个节点和04 Community Structure in Networks - 图16条边的真实图04 Community Structure in Networks - 图17,创建网络04 Community Structure in Networks - 图18

  • 04 Community Structure in Networks - 图1904 Community Structure in Networks - 图20有相同的度分布,但是04 Community Structure in Networks - 图21中的边随机连接的;
  • 考虑04 Community Structure in Networks - 图22是一个multigraph;
  • 度分别为04 Community Structure in Networks - 图2304 Community Structure in Networks - 图24的节点04 Community Structure in Networks - 图2504 Community Structure in Networks - 图26之间的期望边数为04 Community Structure in Networks - 图27,则04 Community Structure in Networks - 图28中期望边数为04 Community Structure in Networks - 图2904 Community Structure in Networks - 图30

    Modularity

    modularity of partitioning S of graph G:

  • 04 Community Structure in Networks - 图31

  • 04 Community Structure in Networks - 图32
  • 04 Community Structure in Networks - 图33表示节点04 Community Structure in Networks - 图3404 Community Structure in Networks - 图35之间边的权重;
  • 04 Community Structure in Networks - 图36表示与节点04 Community Structure in Networks - 图37相连的边的权重之和;
  • 04 Community Structure in Networks - 图38表示所有边权重之和;
  • 04 Community Structure in Networks - 图39表示节点04 Community Structure in Networks - 图40所属的社区;
  • 当节点04 Community Structure in Networks - 图4104 Community Structure in Networks - 图42属于同一个社区,即04 Community Structure in Networks - 图43,则04 Community Structure in Networks - 图44

Modularity取值范围04 Community Structure in Networks - 图45,若为正值则表示组间边数超出预期,Q值大于0.3~0.7意味着是显著的社区结构。
通过极大化modularity来区分社区。

Louvain Algorithm

一种贪婪的社区检测方法,时间复杂度为04 Community Structure in Networks - 图46,支持加权图,提供分层社区,由于该算法快速迭代聚合、有较高的Modularity输出,因此被广泛用来研究大网络。
hierarchical 阶层式;分层的;分等级的

Louvain算法贪婪地极大化Modularity,该算法由两个部分组成:

  • phase1 为每个节点选择最优的社区,使局部模块度达到最大
  • phase2 对划分好社区的网络进行重构
  • 上述两个步骤不断迭代,直至Modularity不再发生变化

image.png

phase1 Partitioning

  • 将图中每个节点都视为一个独特的社区(一个节点一个社区)。
  • 对每个节点04 Community Structure in Networks - 图48,计算当把节点04 Community Structure in Networks - 图49放入一些邻居节点04 Community Structure in Networks - 图50所属社区时的04 Community Structure in Networks - 图51,将取得最大04 Community Structure in Networks - 图52所对应的节点04 Community Structure in Networks - 图53与邻居节点所属社区合并。
  • 重复至04 Community Structure in Networks - 图54无变化。

    将节点04 Community Structure in Networks - 图55移至社区04 Community Structure in Networks - 图56时,04 Community Structure in Networks - 图57

  • 04 Community Structure in Networks - 图58 表示04 Community Structure in Networks - 图59中节点之间连接权重之和;

  • 04 Community Structure in Networks - 图60 表示04 Community Structure in Networks - 图61中所有连接权重之和;
  • 04 Community Structure in Networks - 图62 表示节点04 Community Structure in Networks - 图6304 Community Structure in Networks - 图64之间所有连接权重之和;
  • 04 Community Structure in Networks - 图65 表示节点04 Community Structure in Networks - 图66所有权重之和。

04 Community Structure in Networks - 图67表示将节点04 Community Structure in Networks - 图68移除社区04 Community Structure in Networks - 图69
04 Community Structure in Networks - 图70
04 Community Structure in Networks - 图71
image.png

phase2 Restructuring

将phase1得到的communities定义为super-nodes。

  • 若相关社区的节点之间至少有一个边,则super-nodes是连接的。
  • 两个super-nodes的边权重是相关社区之间所有边权重之和。

image.png
image.png

Detecting Overlapping Communities: BigCLAM

image.pngimage.png
image.png

  • step1 基于节点社区关系给图定义一个生成模型,Community Affiliation Graph Model (AGM)
  • step2 给定图G,假设G是由AGM生成的
  • step2 周到能生成G的最优的AGM
  • step2 由此可发现社区

    Community-Affiliation Graph Model (AGM)

    image.png
    模型参数:

  • 节点04 Community Structure in Networks - 图79,社区04 Community Structure in Networks - 图80,成员04 Community Structure in Networks - 图81

  • 每个社区04 Community Structure in Networks - 图82有个单独的概率04 Community Structure in Networks - 图83

给定参数04 Community Structure in Networks - 图84,社区04 Community Structure in Networks - 图85中的节点以概率04 Community Structure in Networks - 图86相连,节点属于多个社区的概率是多个社区概率相乘(If the miss the first time, they get another chance through the next community.)。
如,两个节点都在社区A和B中,那么这两个节点相连的概率为04 Community Structure in Networks - 图87
两个节点在图中相连的概率为04 Community Structure in Networks - 图88,若节点04 Community Structure in Networks - 图8904 Community Structure in Networks - 图90没有共同的社区,则04 Community Structure in Networks - 图91
如果按照这种方法,AGM可以适用于多种图,如下图所示,重叠图、非重叠图、覆盖图利用上述AGM算法都可以生成同构图。
image.png

Detecting Communities with AGM

给定一个图,找到模型04 Community Structure in Networks - 图93

  • Affiliation graph 04 Community Structure in Networks - 图94
  • 社区04 Community Structure in Networks - 图95的数量
  • 参数04 Community Structure in Networks - 图96

如何基于给定图04 Community Structure in Networks - 图97拟合模型04 Community Structure in Networks - 图98

  • 极大似然估计
  • 给定真实图04 Community Structure in Networks - 图99
  • 找到模型/参数04 Community Structure in Networks - 图100使得04 Community Structure in Networks - 图101

image.png
image.png

那BigCLAM是什么呢?它是AGM上的一种Relax,BigCLAM认为在二分图中,节点属于社区是有权重的,比如说我和A社区关系更紧密,权重大一点,和B社区关系一般,权重小一点,那么节点属于社区的权重越大,这个节点就越容易和其他节点相连。利用04 Community Structure in Networks - 图104表示节点04 Community Structure in Networks - 图105和各个社区的权重向量,比如04 Community Structure in Networks - 图106表示节点04 Community Structure in Networks - 图107和社区04 Community Structure in Networks - 图108的权重,那么节点04 Community Structure in Networks - 图109和节点04 Community Structure in Networks - 图110有边相连的概率为:04 Community Structure in Networks - 图111

给定一个网络04 Community Structure in Networks - 图112,极大化04 Community Structure in Networks - 图113
梯度下降:04 Community Structure in Networks - 图114

04 Community Structure in Networks - 图115

关于AGM与BigCLAM的两篇文章:
Yang J , Leskovec J . Community-Affiliation Graph Model for Overlapping Network Community Detection[C]// IEEE International Conference on Data Mining. IEEE, 2012.
Yang J , Leskovec J . Overlapping community detection at scale: A nonnegative matrix factorization approach[C]// Proceedings of the sixth ACM international conference on Web search and data mining. ACM, 2013.

参考链接

知乎笔记:https://zhuanlan.zhihu.com/p/138824980
PPT:http://web.stanford.edu/class/cs224w/slides/04-communities.pdf