Spark GraphX official documentation
    GraphX extends the Spark RDD and introduces a new abstraction:
    a directed multigraph with properties attached to each vertex and edge.
    Property Graph:
    1. A directed multigraph is a directed graph with potentially multiple parallel edges
    sharing the same source and destination vertex.
    2. Each vertex is keyed by a unique 64-bit long identifier (VertexId).
    GraphX does not impose any ordering constraints on the vertex identifiers.
    3. GraphX optimizes the storage of vertex and edge attributes when they are primitive data types (e.g. Int, Double).
    4. Like RDDs, property graphs are immutable, distributed, and fault-tolerant. Changing a graph produces a new graph, and the unaffected parts of the original are reused in the result.
    5. The classes VertexRDD[VD] and EdgeRDD[ED] extend, and are optimized versions of, RDD[(VertexId, VD)] and RDD[Edge[ED]] respectively.
    For now they can be thought of as simply RDDs of the form RDD[(VertexId, VD)] and RDD[Edge[ED]].
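The points above can be checked on a tiny graph. The following is a minimal sketch, assuming a local Spark installation with GraphX on the classpath; the vertex and edge data are made up for illustration, and the block is written to be pasted into spark-shell (`:paste`) or run as a Scala script.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, EdgeRDD, Graph, VertexId, VertexRDD}
import org.apache.spark.rdd.RDD

// Reuse an existing context (e.g. in spark-shell) or start a local one (assumption: local mode).
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("PropertyGraphNotes").setMaster("local[*]"))

// Vertex ids are plain 64-bit longs; no ordering is required.
val vertices: RDD[(VertexId, String)] =
  sc.parallelize(Seq((1L, "a"), (2L, "b")))

// Two parallel edges with the same source and destination: allowed in a multigraph.
val edges: RDD[Edge[Int]] =
  sc.parallelize(Seq(Edge(1L, 2L, 10), Edge(1L, 2L, 20)))

val graph: Graph[String, Int] = Graph(vertices, edges)

// The graph exposes its contents through the optimized RDD subclasses.
val vs: VertexRDD[String] = graph.vertices
val es: EdgeRDD[Int] = graph.edges

// Transformations return a new graph; `graph` itself is left unchanged.
val upper: Graph[String, Int] = graph.mapVertices((_, name) => name.toUpperCase)

println(s"edges: ${es.count()}, vertices: ${vs.count()}")
```

Because `VertexRDD` and `EdgeRDD` are subclasses of `RDD`, ordinary RDD operations such as `count`, `filter`, and `collect` work on them directly.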
    Example Property Graph:
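    import org.apache.spark.graphx.VertexId
    import org.apache.spark.rdd.RDD
    // Vertex RDD of (id, (username, occupation)); assumes `sc` is an existing SparkContext
    val users: RDD[(VertexId, (String, String))] =
      sc.parallelize(Array((3L, ("rxin", "student")), (7L, ("jgonzal", "postdoc")),
        (5L, ("franklin", "prof")), (2L, ("istoica", "prof"))))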
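The notes stop at the vertex RDD. A graph also needs an edge RDD and, optionally, a default vertex attribute. The sketch below completes the example in the style of the GraphX programming guide; the `relationships` data, the `defaultUser` value, and the local SparkContext setup are assumptions added for illustration and do not appear in the notes above.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

// Reuse an existing context (e.g. in spark-shell) or start a local one (assumption: local mode).
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("ExamplePropertyGraph").setMaster("local[*]"))

// Vertices: (id, (username, occupation)), as in the notes.
val users: RDD[(VertexId, (String, String))] =
  sc.parallelize(Seq((3L, ("rxin", "student")), (7L, ("jgonzal", "postdoc")),
                     (5L, ("franklin", "prof")), (2L, ("istoica", "prof"))))

// Edges: the String attribute names the relationship (illustrative data).
val relationships: RDD[Edge[String]] =
  sc.parallelize(Seq(Edge(3L, 7L, "collab"), Edge(5L, 3L, "advisor"),
                     Edge(2L, 5L, "colleague"), Edge(5L, 7L, "pi")))

// Attribute assigned to any vertex that appears in an edge but not in `users`.
val defaultUser = ("John Doe", "Missing")

val graph: Graph[(String, String), String] = Graph(users, relationships, defaultUser)

// Count the users whose occupation is "postdoc".
val postdocCount = graph.vertices.filter { case (_, (_, pos)) => pos == "postdoc" }.count()

// Count the edges whose source id is greater than the destination id.
val backEdgeCount = graph.edges.filter(e => e.srcId > e.dstId).count()

// Triplets join each edge with the attributes of both endpoints.
val facts = graph.triplets
  .map(t => s"${t.srcAttr._1} is the ${t.attr} of ${t.dstAttr._1}")
  .collect()
facts.foreach(println)
```

The `triplets` view is convenient here because it exposes `srcAttr`, `attr`, and `dstAttr` together, so no explicit join between `users` and `relationships` is needed.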
    Graph Operators:
    Some of the supported algorithms:
    PageRank
    Connected components
    Label propagation
    SVD++ (a recommendation model based on singular value decomposition)
    Strongly connected components
    Triangle count
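Three of the algorithms listed above are available directly as methods on a graph. The following is a minimal sketch on a made-up five-vertex graph, assuming a local Spark installation. The edge list, the PageRank tolerance of 0.0001, and the choice of `RandomVertexCut` partitioning are illustrative assumptions, not recommendations from the notes.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Graph, PartitionStrategy, VertexId}
import org.apache.spark.rdd.RDD

// Reuse an existing context (e.g. in spark-shell) or start a local one (assumption: local mode).
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("GraphAlgorithmsSketch").setMaster("local[*]"))

// Toy edge list (made up): a triangle 1-2-3 plus a separate pair 4-5.
// Every edge is written with srcId < dstId.
val rawEdges: RDD[(VertexId, VertexId)] =
  sc.parallelize(Seq((1L, 2L), (2L, 3L), (1L, 3L), (4L, 5L)))

// Build a graph from bare edge tuples; every vertex gets the attribute 1.
val graph: Graph[Int, Int] = Graph.fromEdgeTuples(rawEdges, 1)
  .partitionBy(PartitionStrategy.RandomVertexCut)
  .cache()

// PageRank, iterated until ranks change by less than the tolerance.
val ranks = graph.pageRank(0.0001).vertices.collect().toMap

// Connected components: each vertex is labeled with the lowest vertex id in its component.
val components = graph.connectedComponents().vertices.collect().toMap

// Triangle count: number of triangles passing through each vertex.
val triangles = graph.triangleCount().vertices.collect().toMap

ranks.toSeq.sortBy(_._1).foreach { case (id, r) => println(s"vertex $id rank $r") }
println(s"components: $components")
println(s"triangles:  $triangles")
```

Each call returns a new graph whose vertex attribute holds the result, which is why the sketch reads `.vertices` after every algorithm.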
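    For most algorithms, the number of partitions in your input RDD should be at least equal to the number of CPU cores in the cluster; only then can the job run fully in parallel.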
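One way to apply this rule is to compare the partition count of the input RDD with a target and repartition when it is too low. The sketch below uses `sc.defaultParallelism` as a stand-in for the cluster's core count. That is an assumption: depending on the cluster manager and configuration it may not equal the real number of cores, in which case a hand-chosen number would replace it.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Graph, VertexId}
import org.apache.spark.rdd.RDD

// Reuse an existing context (e.g. in spark-shell) or start a local one (assumption: local mode).
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("PartitionCountSketch").setMaster("local[*]"))

// Deliberately under-partitioned input: everything in a single partition.
val rawEdges: RDD[(VertexId, VertexId)] =
  sc.parallelize(Seq((1L, 2L), (2L, 3L), (3L, 4L)), numSlices = 1)

// Assumption: defaultParallelism approximates the total number of cores.
val targetPartitions = sc.defaultParallelism

// Repartition only when the input has fewer partitions than the target.
val edges =
  if (rawEdges.getNumPartitions < targetPartitions) rawEdges.repartition(targetPartitions)
  else rawEdges

val graph = Graph.fromEdgeTuples(edges, 1)

println(s"input partitions: ${rawEdges.getNumPartitions}")
println(s"edge partitions used for the graph: ${edges.getNumPartitions}")
```

`repartition` triggers a shuffle, so it is worth doing once on the input rather than repeatedly inside an iterative algorithm.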
    GraphX is the Spark component for graphs and graph-parallel computation.
    The Property Graph
    A directed multigraph with user-defined objects attached to each vertex and edge.
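"User-defined objects" means the vertex and edge attribute types of `Graph[VD, ED]` can be ordinary Scala classes. The following is a minimal sketch, assuming a local Spark installation; the types `VertexProperty`, `UserProperty`, `ProductProperty`, and `Rating` are hypothetical names invented for this illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD

// Hypothetical attribute types: vertices are either users or products, edges carry a rating.
trait VertexProperty extends Serializable
case class UserProperty(name: String) extends VertexProperty
case class ProductProperty(name: String, price: Double) extends VertexProperty
case class Rating(stars: Int)

// Reuse an existing context (e.g. in spark-shell) or start a local one (assumption: local mode).
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("UserDefinedAttributes").setMaster("local[*]"))

// A common supertype lets one graph hold vertices with different kinds of properties.
val vertices: RDD[(VertexId, VertexProperty)] =
  sc.parallelize(Seq[(VertexId, VertexProperty)](
    (1L, UserProperty("alice")),
    (2L, ProductProperty("book", 12.5))))

val edges: RDD[Edge[Rating]] =
  sc.parallelize(Seq(Edge(1L, 2L, Rating(5))))

val graph: Graph[VertexProperty, Rating] = Graph(vertices, edges)

// Read the user-defined attributes back on the driver.
val attrs = graph.vertices.collect().toMap
val stars = graph.triplets.map(t => t.attr.stars).collect()

println(attrs(1L))
println(stars.mkString(","))
```

Using a shared supertype for the vertex attribute is one way to model a bipartite graph (users and products here) inside a single `Graph[VD, ED]`.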