1.3 - Choice of Graph Representation - 《图机器学习2》

Bipartite Graphs

In this part of Stanford CS224W, the Machine Learning with Graphs course, I want to talk about the choice of graph representation.

So what are the components of a graph or a network? A network is composed of two types of objects. First, we have the objects or entities themselves, referred to as nodes or vertices. Then we have the interactions between them, called links or edges. The entire system, the entire domain, we then call a network or a graph.

Usually, for nodes, we will use the capital letter N or V, and for edges we usually use the capital letter E, so that the graph G is composed of a set of nodes N and a set of edges E.

What is important about graphs is that they are a common language. I can take actors and connect them based on which movies they appeared in, or I can take people and connect them based on the relationships they have with each other, or I can take molecules, like proteins, and build a network based on which proteins interact with each other. If I look at the structure of this network, at the underlying mathematical representation, it is the same in all these cases. This means that the same machine learning algorithm will be able to make predictions whether the nodes correspond to actors, to people, or to molecules like proteins.
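
To make the "common language" point concrete, here is a minimal sketch in Python. The node names and edges are made-up toy data, and `degree_counts` is a hypothetical helper, not something from the course. The same function runs unchanged on an actor network and on a protein network, because both are just sets of nodes and edges.

```python
from collections import Counter

def degree_counts(edges):
    """Count how many edges touch each node of an undirected graph."""
    counts = Counter()
    for u, v in edges:
        counts[u] += 1
        counts[v] += 1
    return counts

# Hypothetical toy data: the labels differ, the structure is identical.
actor_edges = [("Alice", "Bob"), ("Bob", "Carol"), ("Alice", "Carol"), ("Carol", "Dan")]
protein_edges = [("P1", "P2"), ("P2", "P3"), ("P1", "P3"), ("P3", "P4")]

actor_degrees = degree_counts(actor_edges)
protein_degrees = degree_counts(protein_edges)

# Both graphs have the same shape, so the sorted degree sequences match.
print(sorted(actor_degrees.values()))
print(sorted(protein_degrees.values()))
```

The function never looks at what a node means, only at how nodes are connected.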

Of course, choosing a proper graph representation is very important. For example, if you have a set of people, we can connect individuals who work with each other, and we will have a professional network. However, we can also take the same set of individuals and connect them based on sexual relationships, and then we will be creating a sexual network. Or, for example, if we have a set of scientific papers, we can connect them based on citations, that is, which paper cites which other paper. But if we were to connect them based on whether they use the same word in the title, the quality of the underlying network and of the underlying representation might be much worse. So the choice of what the nodes are and what the links are is very important.

So whenever we are given a data set, we need to decide how we are going to design the underlying graph: what the objects of interest, the nodes, will be, and what the relationships between them, the edges, will be. The choice of the proper network representation for a given domain or a given problem will determine our ability to use networks successfully. In some cases there will be a unique, unambiguous way to represent the problem or domain as a graph, while in other cases the representation is by no means unique. And the way we assign links between the objects will determine the nature of the questions we will be able to study and the nature of the predictions we will be able to make.

To show you some examples of the design choices we face when creating graphs, I will now go through some concepts and different types of graphs that we can create from data.

First, I will distinguish between directed and undirected graphs. Undirected graphs have links that are undirected, which makes them useful for modeling symmetric or reciprocal relationships, like collaboration, friendship, interaction between proteins, and so on. Directed relationships are captured by directed links, where every link has a direction: it has a source and a destination, denoted by an arrow. Examples of these types of links occurring in the real world would be phone calls, financial transactions, and following on Twitter, where there is a source and there is a destination.

The next thing we are going to talk about is that, once we have created an undirected graph, we can talk about the notion of node degree. Node degree is simply the number of edges adjacent to a given node. For example, node A in this example has degree 4. The average node degree is simply the average over the degrees of all the nodes in the network. If you work this out, it turns out to be twice the number of edges divided by the number of nodes in the network. The reason for the factor of 2 is that when we compute the degrees of the nodes, each edge gets counted twice: an edge has two endpoints, and each endpoint gets counted once. This also means that a self-edge, or self-loop, adds a degree of two to the node, not a degree of one, because both endpoints attach to the same node.
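
The degree bookkeeping above can be checked with a small sketch. The graph below is a hypothetical one chosen so that node "a" has degree 4. It is not the graph on the slide, and `node_degrees` is an illustrative helper.

```python
def node_degrees(nodes, edges):
    """Degree of every node in an undirected graph given as an edge list."""
    degree = {node: 0 for node in nodes}
    for u, v in edges:
        degree[u] += 1  # one endpoint of the edge
        degree[v] += 1  # the other endpoint; for a self-loop this is the same node
    return degree

# Hypothetical graph in which node "a" has four neighbors.
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("a", "e"), ("b", "c")]

degree = node_degrees(nodes, edges)
average_degree = sum(degree.values()) / len(nodes)

# Each edge contributes 2 to the degree sum, so the average is 2|E| / |N|.
assert average_degree == 2 * len(edges) / len(nodes)

# Adding a self-loop on "d" raises its degree by two, from 1 to 3.
degree_with_loop = node_degrees(nodes, edges + [("d", "d")])
print(degree["a"], average_degree, degree_with_loop["d"])
```

Because the loop over edges increments both endpoints, the factor of 2 in the average falls out of the counting itself.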

This is for undirected networks. In directed networks, we distinguish between in-degree and out-degree. In-degree is the number of edges pointing towards the node, and out-degree is the number of edges pointing outward from the node. For example, node C has in-degree 2 and out-degree 1.
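
As a rough sketch of in-degree and out-degree, assume a directed edge is stored as a `(source, target)` pair. The edges below are hypothetical and were picked only so that node "C" ends up with in-degree 2 and out-degree 1, as in the example just mentioned.

```python
def in_out_degrees(nodes, directed_edges):
    """Return (in_degree, out_degree) dictionaries for a directed graph."""
    in_degree = {node: 0 for node in nodes}
    out_degree = {node: 0 for node in nodes}
    for source, target in directed_edges:
        out_degree[source] += 1  # the edge leaves its source
        in_degree[target] += 1   # and arrives at its target
    return in_degree, out_degree

# Hypothetical directed graph: A and B point to C, and C points to D.
nodes = ["A", "B", "C", "D"]
directed_edges = [("A", "C"), ("B", "C"), ("C", "D")]

in_degree, out_degree = in_out_degrees(nodes, directed_edges)
print(in_degree["C"], out_degree["C"])
```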

Another very popular type of graph structure, which is used a lot and is very natural in different domains, is called a bipartite graph. A bipartite graph is a graph with nodes of two different types, where nodes only interact with nodes of the other type and not with nodes of their own type. In other words, a bipartite graph is a graph whose nodes can be split into two partitions such that edges only go from the left partition to the right partition and never stay inside the same partition.

Examples of bipartite graphs that occur naturally are scientific authors linked to the papers they authored, actors linked to the movies they appeared in, users linked to the movies they rated or watched, and so on. Customers buying products is also a bipartite graph: we have a set of customers and a set of products, and we link each customer to the products she purchased.

Now that we have defined a bipartite network, we can also define the notion of a folded, or projected, network, where we can create, for example, an author collaboration network or a movie co-rating network. The idea is as follows. If I have a bipartite graph, I can project it either to the left side or to the right side. When I project it, I only use the nodes from one side in my projection graph, and I create a connection between a pair of nodes if they have at least one neighbor in common. So if these are authors and these are scientific papers, I create a collaboration, or co-authorship, graph where I connect a pair of authors if they co-authored at least one paper. For example, 1, 2, and 3 co-authored this paper, so they are all connected with each other. 3 and 4 did not co-author a paper, so there is no link between them. But 5 and 2 co-authored this paper here, so there is a link between them. In an analogous way, you can also create a projection of this bipartite network to the right-hand side, and then you would obtain a graph like this.
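
The projection rule (connect two nodes if they share at least one neighbor on the other side) can be sketched as follows. The paper names and the paper-to-author assignment are assumptions made for illustration and only loosely mirror the example above. `project` is a hypothetical helper.

```python
from itertools import combinations

def project(memberships):
    """Project a bipartite graph onto one side.

    memberships maps each node on the other side (for example a paper)
    to the set of nodes it is linked to (for example its authors).
    Two nodes are connected if they share at least one such neighbor.
    """
    projected_edges = set()
    for linked_nodes in memberships.values():
        for u, v in combinations(sorted(linked_nodes), 2):
            projected_edges.add((u, v))
    return projected_edges

# Hypothetical bipartite data: authors 1..5 on the left, papers on the right.
papers = {
    "paper_A": {1, 2, 3},
    "paper_B": {2, 5},
    "paper_C": {4},
}

coauthorship = project(papers)
print(sorted(coauthorship))
```

Projecting to the other side works the same way after inverting the mapping, so that each author maps to the set of papers they wrote.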

As I said, bipartite graphs, or multipartite graphs if you have more than two types of nodes, are very popular, especially when you have two different types of nodes, like users and products, users and movies, authors and papers, and so on.

Another interesting question about graphs is how we represent them. One way to represent a graph is with an adjacency matrix. For a given graph, for example an undirected graph on n nodes, in our case 4, we create a square matrix that is binary: it only takes entries of 0 and 1. Entry ij of the matrix is set to 1 if nodes i and j are connected, and to 0 if they are not. For example, 1 and 2 are connected, so at row 1, column 2 there is a 1. And because 2 is connected to 1, at row 2, column 1 we also have a 1. This means that adjacency matrices of undirected graphs are naturally symmetric. If the graph is directed, the matrix won't be symmetric: 2 links to 1, so we have a 1 here, but 1 does not link back to 2, so there is a 0.

In a similar way, we can think of node degrees simply as a summation across a given row or a given column of the graph adjacency matrix. Rather than thinking about how many edges are adjacent, we can just count the number of ones, that is, the number of other nodes that the given node is connected to. This is for undirected graphs. For directed graphs, in-degrees and out-degrees are sums over the columns and sums over the rows of the adjacency matrix, as I show in this illustration.
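
Here is a minimal sketch of the adjacency matrix and of degrees as row and column sums. It assumes nodes are numbered from 0, whereas the lecture numbers them from 1, and it uses the convention that entry `A[i][j] = 1` means an edge from i to j. The two small graphs are hypothetical.

```python
def adjacency_matrix(n, edges, directed=False):
    """Binary n x n adjacency matrix; A[i][j] == 1 means an edge from i to j."""
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = 1
        if not directed:
            A[j][i] = 1  # undirected edges are stored in both directions
    return A

def is_symmetric(A):
    n = len(A)
    return all(A[i][j] == A[j][i] for i in range(n) for j in range(n))

# Hypothetical undirected graph on 4 nodes.
undirected = adjacency_matrix(4, [(0, 1), (0, 3), (1, 2), (1, 3)])
degrees = [sum(row) for row in undirected]  # row sums

# Hypothetical directed graph on 4 nodes.
directed = adjacency_matrix(4, [(1, 0), (1, 2), (3, 2)], directed=True)
out_degrees = [sum(row) for row in directed]           # row sums
in_degrees = [sum(column) for column in zip(*directed)]  # column sums

print(degrees, is_symmetric(undirected))
print(out_degrees, in_degrees, is_symmetric(directed))
```

With the opposite storage convention (entry ij meaning an edge from j to i), the roles of rows and columns would swap.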

One important property of real-world networks is that they are extremely sparse. If you look at the adjacency matrix of a real-world network, where for every row i and column j we put a dot if there is an edge and leave the cell empty otherwise, you get these super sparse matrices in which large parts of the matrix are empty, white. This has important consequences for the properties of these matrices.

To show you an example: if you have a network on n nodes, then the maximum degree of a node, the maximum number of connections a node can have, is n minus 1, because in principle a node can connect to every other node in the network. For example, if you think about the human social network, the maximum degree you could have, the maximum number of friends, is every other human in the world. However, nobody has seven billion friends. Our number of friendships is much, much smaller. So the human social network is extremely sparse, and it turns out that many other types of networks, such as power grids, Internet connections, science collaborations, email graphs, and so on, are extremely sparse too. They have average degrees of around 10, maybe up to 100.

What is the consequence? The underlying adjacency matrices are extremely sparse. So we would never represent the matrix as a dense matrix; we always represent it as a sparse matrix.
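
To get a feel for the numbers, the sketch below builds a synthetic graph with 1,000 nodes and average degree 10 and compares the size of a dense matrix with the number of entries a sparse, neighbors-only storage needs. The ring-like construction is just an assumption to get a fixed average degree. It is not a model of any real network.

```python
n = 1000  # number of nodes
k = 5     # each node links to its next k neighbors on a ring

# Undirected edges of the synthetic graph, one tuple per edge.
edges = {(i, (i + step) % n) for i in range(n) for step in range(1, k + 1)}

average_degree = 2 * len(edges) / n
dense_cells = n * n             # cells a dense adjacency matrix would store
nonzero_cells = 2 * len(edges)  # symmetric matrix: each edge fills two cells
fill_fraction = nonzero_cells / dense_cells

# Sparse storage: keep only the neighbors that actually exist.
neighbors = {i: set() for i in range(n)}
for u, v in edges:
    neighbors[u].add(v)
    neighbors[v].add(u)
stored_entries = sum(len(adjacent) for adjacent in neighbors.values())

print(average_degree, dense_cells, stored_entries, fill_fraction)
```

Even at this small scale only one cell in a hundred is nonzero, and the gap widens as n grows while the average degree stays roughly constant.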

There are two other ways to represent graphs. One is simply as an edge list, a list of edges. This representation is quite popular in deep learning frameworks, because we can simply store it as a two-dimensional matrix. The problem with this representation is that it is very hard to do any kind of graph manipulation or analysis, because even computing the degree of a given node is non-trivial.

A much better representation for graph analysis and manipulation is the adjacency list. Adjacency lists are good because they are easier to work with for large and sparse networks. An adjacency list allows us to quickly retrieve all the neighbors of a given node. You can think of it this way: for every node, you simply store a list of its neighbors, that is, the list of nodes the given node is connected to. If the graph is undirected, you store the neighbors. If the graph is directed, you can store both the outgoing neighbors and the incoming neighbors, based on the direction of the edge.
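
The trade-off between the two representations can be sketched like this. The edge list is hypothetical toy data, and the two-row layout is only one plausible way of storing an edge list as a two-dimensional matrix. No specific framework's format is implied.

```python
from collections import defaultdict

# Hypothetical edge list for a small graph.
edge_list = [(0, 1), (0, 2), (1, 2), (2, 3)]

# Edge list as a two-dimensional matrix: one row of sources, one row of targets.
two_row_matrix = [[u for u, _ in edge_list], [v for _, v in edge_list]]

def degree_from_edge_list(edge_list, node):
    """Computing a degree from an edge list requires scanning every edge."""
    return sum((u == node) + (v == node) for u, v in edge_list)

def to_adjacency_list(edge_list):
    """Undirected adjacency list: each node maps to the list of its neighbors."""
    adjacency = defaultdict(list)
    for u, v in edge_list:
        adjacency[u].append(v)
        adjacency[v].append(u)
    return adjacency

def to_directed_adjacency_lists(edge_list):
    """For a directed graph, keep outgoing and incoming neighbors separately."""
    out_neighbors = defaultdict(list)
    in_neighbors = defaultdict(list)
    for source, target in edge_list:
        out_neighbors[source].append(target)
        in_neighbors[target].append(source)
    return out_neighbors, in_neighbors

adjacency = to_adjacency_list(edge_list)
out_neighbors, in_neighbors = to_directed_adjacency_lists(edge_list)

# With the adjacency list, neighbors and degree are a direct lookup.
print(adjacency[2], len(adjacency[2]), degree_from_edge_list(edge_list, 2))
```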

The last important thing I want to mention here is that these graphs can have attributes attached to them. Nodes, edges, and entire graphs can all have attributes or properties. For example, an edge can have a weight: how strong is the relationship? It can have a ranking. It can have a type. It can have a sign, saying whether this is a friendship-based relationship or one based on animosity or distrust. Edges can have many other types of properties; if an edge is a phone call, it could be its duration, for example. Nodes can have properties too. If the nodes are people, these could be age, gender, interests, location, and so on. If a node is a chemical, then its mass, its chemical formula, and other properties could be represented as attributes of the node. And of course entire graphs can also have features or attributes, based on the properties of the underlying object that the graph structure is modeling. What this means is that the graphs you will be considering are not just the topology, the nodes and edges, but also the attributes attached to them.
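
One simple way to picture attributes at the three levels is a nested dictionary. This is only a sketch: the layout, the attribute names, and the values are all made up for illustration, and `edge_attribute` is a hypothetical helper.

```python
# Hypothetical attributed graph: attributes on the graph, the nodes, and the edges.
graph = {
    "attributes": {"name": "toy call network"},
    "nodes": {
        "ann": {"age": 34, "location": "Paris"},
        "bob": {"age": 29, "location": "Lima"},
    },
    "edges": {
        ("ann", "bob"): {"type": "phone_call", "duration_s": 120, "weight": 1.0, "sign": 1},
    },
}

def edge_attribute(graph, u, v, key, default=None):
    """Look up one attribute of the edge (u, v), or return the default."""
    return graph["edges"].get((u, v), {}).get(key, default)

print(edge_attribute(graph, "ann", "bob", "duration_s"))
print(graph["nodes"]["ann"]["age"], graph["attributes"]["name"])
```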

As I mentioned, some of these properties can be represented directly in the adjacency matrix. For example, edge properties like weights can simply be stored in the adjacency matrix. Rather than having a binary adjacency matrix, we can have an adjacency matrix with real values, where the strength of a connection is simply the value in that entry. So 2 and 4 are more strongly linked, and the value is 4, while 1 and 3, for example, are linked by a weak connection that has weight only 0.5.
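
A weighted adjacency matrix can be sketched as below. The two weights mentioned above (4 between nodes 2 and 4, and 0.5 between nodes 1 and 3) are used as given. The third edge is a made-up addition, and the helper name is hypothetical. Nodes are labeled from 1, so the code subtracts 1 when indexing.

```python
def weighted_adjacency(n, weighted_edges):
    """Symmetric n x n matrix whose entries are edge weights (0.0 means no edge)."""
    W = [[0.0] * n for _ in range(n)]
    for i, j, weight in weighted_edges:
        W[i - 1][j - 1] = weight  # nodes are labeled 1..n
        W[j - 1][i - 1] = weight  # undirected graph: mirror the entry
    return W

weighted_edges = [
    (2, 4, 4.0),  # strong link
    (1, 3, 0.5),  # weak link
    (1, 2, 2.0),  # hypothetical extra edge, not taken from the lecture
]

W = weighted_adjacency(4, weighted_edges)
for row in W:
    print(row)
```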

Another important thing when we create graphs is that we can also think about nodes having self-loops. For example, here node 4 has a self-loop, and now the degree of node 4 equals 3. Self-loops simply correspond to the entries on the diagonal of the adjacency matrix. In some cases we may actually create a multi-graph, where we allow multiple edges between a pair of nodes. Sometimes we can think of a multi-graph as a weighted graph, where the entry in the matrix counts the number of edges. But sometimes you want to represent every edge individually, because the edges might have different properties and different attributes. Both self-loops and multi-graphs occur quite frequently in nature. For example, if you think about phone calls or transactions, there can be multiple transactions between a pair of nodes, and we can accurately represent this as a multi-graph.
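
The two views of a multi-graph can be sketched as follows: a list that keeps every edge with its own attributes, and a count matrix that collapses parallel edges into a number. The call records, the names, and the self-loop are hypothetical.

```python
from collections import Counter

# Every edge kept individually, each with its own attributes.
calls = [
    ("ann", "bob", {"duration_s": 60}),
    ("ann", "bob", {"duration_s": 300}),
    ("bob", "cy", {"duration_s": 45}),
]

# Collapsed view: how many parallel edges go from one node to another.
edge_counts = Counter((caller, callee) for caller, callee, _ in calls)

names = ["ann", "bob", "cy"]
index = {name: position for position, name in enumerate(names)}

count_matrix = [[0] * len(names) for _ in names]
for (caller, callee), count in edge_counts.items():
    count_matrix[index[caller]][index[callee]] = count

# A self-loop sits on the diagonal (hypothetical self-loop on "cy").
count_matrix[index["cy"]][index["cy"]] += 1

for row in count_matrix:
    print(row)
```

The count matrix loses the individual durations, which is exactly the case where keeping every edge separately is the better choice.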

Now that we have these graphs, I also want to talk about the notion of connectivity, that is, whether a graph is connected or disconnected. A graph is connected if any pair of nodes in the graph can be joined by a path along the edges of the graph. For example, this particular graph is connected, while this other graph is not connected: it has three connected components. This is one connected component, this is the second, and the third connected component is node H, which is an isolated node.
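
Connected components can be found with a breadth-first search, sketched below. The graph is a hypothetical one with three components, including an isolated node "h". It is not the graph from the slide.

```python
from collections import deque

def connected_components(nodes, edges):
    """Return the connected components of an undirected graph as sorted lists."""
    neighbors = {node: set() for node in nodes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)

    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from start.
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            u = queue.popleft()
            component.append(u)
            for v in neighbors[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        components.append(sorted(component))
    return components

# Hypothetical disconnected graph; "h" has no edges at all.
nodes = ["a", "b", "c", "d", "e", "f", "g", "h"]
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("e", "f"), ("f", "g")]

components = connected_components(nodes, edges)
is_connected = len(components) == 1
print(components, is_connected)
```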

This is the notion of connectivity for undirected graphs. What is interesting is that when a graph is disconnected and we look at the structure of the underlying adjacency matrix, we see a block-diagonal structure. If the graph is composed of two components, the matrix is block diagonal: edges only go between nodes inside the same connected component, and there are no entries in the off-diagonal blocks, which means there is no edge between the red part and the blue part of the graph.
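
The block-diagonal picture can be reproduced with a tiny sketch. It assumes the nodes are numbered so that each component occupies a contiguous range of indices, which is what makes the blocks visible. The two components ("red" and "blue") and their edges are hypothetical.

```python
def adjacency_matrix(n, edges):
    """Binary symmetric adjacency matrix of an undirected graph on nodes 0..n-1."""
    A = [[0] * n for _ in range(n)]
    for i, j in edges:
        A[i][j] = 1
        A[j][i] = 1
    return A

red = [0, 1, 2]  # first component
blue = [3, 4]    # second component
edges = [(0, 1), (1, 2), (0, 2), (3, 4)]  # no edge joins red and blue

A = adjacency_matrix(5, edges)

# The off-diagonal blocks (red rows with blue columns, and the reverse) are all zero.
off_blocks_empty = all(A[i][j] == 0 and A[j][i] == 0 for i in red for j in blue)

for row in A:
    print(row)
print(off_blocks_empty)
```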

The notion of connectivity also generalizes to directed graphs. Here we talk about two types of connectivity, strong and weak. A weakly connected directed graph is simply a graph that is connected if we ignore the directions of the edges. A graph is strongly connected if for every pair of nodes there exists a directed path between them in both directions. So if the graph is strongly connected, there has to exist a directed path from, for example, node A to node B, as well as from node B back to node A. This also means that we can talk about strongly connected components, which are sets of nodes in the graph such that every node in the set can reach every other node in the set via a directed path. For example, in this case here, nodes A, B, and C form a strongly connected component because they are on a cycle, so from any node we can visit any other node. The example here shows a directed graph with two strongly connected components, again two cycles on three nodes.
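
A simple way to test both notions is sketched below. It relies on a standard fact: a directed graph is strongly connected exactly when some node can reach every node and every node can reach it, and the second condition is checked by searching the reversed graph. The helper names and the example graphs are hypothetical.

```python
def build_neighbors(edges, reverse=False, undirected=False):
    """Neighbor sets for a list of (source, target) edges."""
    neighbors = {}
    for source, target in edges:
        if reverse:
            source, target = target, source
        neighbors.setdefault(source, set()).add(target)
        if undirected:
            neighbors.setdefault(target, set()).add(source)
    return neighbors

def reachable(start, neighbors):
    """All nodes reachable from start, found with a depth-first search."""
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in neighbors.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def is_weakly_connected(nodes, edges):
    # Ignore edge directions and check ordinary connectivity.
    return reachable(nodes[0], build_neighbors(edges, undirected=True)) == set(nodes)

def is_strongly_connected(nodes, edges):
    # The start node must reach everyone, and everyone must reach the start node.
    forward = reachable(nodes[0], build_neighbors(edges))
    backward = reachable(nodes[0], build_neighbors(edges, reverse=True))
    return forward == set(nodes) and backward == set(nodes)

cycle = [("A", "B"), ("B", "C"), ("C", "A")]  # three nodes on a directed cycle
with_tail = cycle + [("C", "D")]              # D can be reached but cannot get back

print(is_strongly_connected(["A", "B", "C"], cycle))
print(is_weakly_connected(["A", "B", "C", "D"], with_tail))
print(is_strongly_connected(["A", "B", "C", "D"], with_tail))
```

In the second graph, nodes A, B, and C still form a strongly connected component on their own, while D forms a component by itself.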

So this concludes the discussion of graph representations and of the ways we can create graphs from real data. In this lecture, we first talked about machine learning with graphs and its various applications and use cases. We talked about node-level, edge-level, and graph-level machine learning prediction tasks. Then we discussed the choice of a graph representation in terms of directed and undirected graphs, bipartite graphs, weighted and unweighted graphs, and adjacency matrices, as well as some definitions from graph theory, like weak and strong connectivity of graphs and the notion of node degree. Thank you very much.