[Stanford] CS224W: Machine Learning with Graphs (Chinese-English subtitles | Fall 2019)

Hello everyone
大家好

So, hello everyone
所以大家好

Welcome to Stanford
欢迎来到斯坦福

Some of you have already been here
你们有些人已经来过这里

for some of you this is the first time
对于某些人来说,这是第一次

or maybe even the first lecture at Stanford
甚至是斯坦福大学的第一次演讲

So I hope you had a good summer and
所以我希望你过得愉快

you are well rested and full of energy
您休息好,精力充沛

What we’re doing right now is
我们现在正在做的是

we are handing out the course handouts
我们正在分发课程讲义

If you don’t get it they will be also at the exit
如果不明白,他们也会在出口处

from the lecture hall when the lecture ends
演讲结束后从演讲厅

But TAs are handing out the course information sheets
但是助教正在分发课程信息表

that we are handing out
我们在分发

Where you can get information about the course
在哪里可以获得有关课程的信息

So to start
所以开始

My name is Jure Leskovec
我叫Jure Leskovec

and I will be your instructor
我会当你的老师

I’m an associate professor
我是副教授

in the computer science department
在计算机科学系

And this year we are doing something new
今年,我们正在做一些新的事情

What we’re doing new this year is that
今年我们要做的是

we actually renamed the course
我们实际上将课程重命名

and we will, in a very big part
我们将在很大程度上

redesign the course from previous years
重新设计前几年的课程

Where we kind of shift the focus a bit from
我们在某种程度上将焦点从

analysis of social networks and analysis of graphs
社交网络分析和图分析

In particular
尤其是

focus more on predictive modeling of graphs
更专注于图的预测建模

So we call this course Machine Learning with Graphs
因此,我们将此课程称为图机器学习
So what I wanna do today is
所以我今天想做的是

I want first to give you a motivation
我想先给你动力

what we are going to do
我们要做什么

and what we are going to learn this quarter
以及本季度我们将学习什么

And then I’ll talk about the course logistics and then we will start
然后我将讨论课程后勤,然后我们开始

with some basics of graph theory so we all get on the
关于图论的一些基础知识,所以我们都处于相同的状态

same page
同一页

If you have any questions feel free to stop me at any time
如果您有任何问题,随时可以阻止我

I’ll be very happy to answer them
我很高兴回答他们

You can also come talk to me after the lecture
演讲结束后你也可以跟我说话

if you are auditing the course
如果您正在审核课程

you are very welcome to audit and sit in
非常欢迎您参加我们的审核和会议

everything is good
万事皆安


Why Networks?

Okay, so the course
好的,当然

so what we will be doing is we will be working with
所以我们要做的就是与

graph representations of data
数据的图形表示

and we will call these graph representations of data
我们将这种数据的图形表示

we will call them networks
我们称之为网络

Networks are a general language
网络是一种通用语言

for describing complex systems of interacting entities
用于描述交互实体的复杂系统

So the way we can think of this is
所以我们可以想到的是

if I show you pictorially, rather than
如果我以图片形式向您展示,而不是

thinking of our data, our domain
思考我们的数据,我们的领域

that we are interested in, as
那就是我们感兴趣的是

a set of isolated data points
一组隔离的数据点

or a set of isolated objects
或一组孤立的对象
these objects interact with each other
该对象彼此交互

which means that they are connected
这意味着它们已连接

so rather than analyzing
因此,而不是分析

it as a cloud of data, a cloud of points
它有一个数据云,一个点云

we think of it as a network
我们将其视为网络

so a network has a set of nodes
所以网络有一组节点

a set of entities, a set of objects
一组实体,一组对象

And then a set of relationships or a set of connections
然后是一组关系或一组连接

that we denote as lines
我们表示为线

And that’s basically the data we will be talking about
这基本上就是我们将要讨论的数据

throughout this quarter
在整个季度
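
As a rough illustration of the nodes-and-edges view described above, here is a minimal Python sketch. The names and friendships are invented toy data, and the adjacency-set layout is only one common way to store a graph, not something prescribed in the lecture.

```python
# Hypothetical toy data: four people (nodes) and their friendships (edges).
nodes = ["ana", "bo", "cy", "dee"]
edges = [("ana", "bo"), ("bo", "cy"), ("cy", "ana"), ("cy", "dee")]


def build_adjacency(nodes, edges):
    """Return an undirected adjacency structure: node -> set of neighbours."""
    adjacency = {node: set() for node in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


adjacency = build_adjacency(nodes, edges)

# The degree of a node is the number of connections it has.
degree = {node: len(neighbours) for node, neighbours in adjacency.items()}
print(degree)
```

With this layout, asking who a given node is connected to is a single dictionary lookup.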
And, of course, these things can be humongous
当然,这些事情可能是巨大的

and very interesting and very complex
非常有趣,非常复杂

and it’s super rewarding to go look at them
和超级有收获的去看看他们

and learn what kind of properties
并了解什么样的属性

of the underlying domain or underlying system
基础域或基础系统的

they are explaining
他们在解释


Two Types of Networks/Graphs

I wanna first make a kind of a philosophical distinction
我想先做一个哲学上的区分

in how we can think of graphs or networks
我们如何看待图或网络?

We can think of them as networks
我们可以将它们视为网络

as essentially examples of phenomena
作为现象的本质例子

that appear in real life
出现在现实生活中

and there is a very nice or a very
并且有一个非常好的或非常的

natural representation of that phenomenon
该现象的自然表示

that domain in terms of a graph, right?
用图表示那个域,对不对?

So if you think about how you want to model
因此,如果您考虑使用模型

a society of seven billion people
70亿人的社会
we are interacting with each other
我们正在互相交流

and we can describe these interactions
我们可以描述这些相互作用

through the social network called the Social Graph
通过称为社交图的社交网络

and you can start asking
你可以开始问

what are the properties of the human social graph
人类社会图谱的特性是什么

or the human social network
或人类社交网络

If you think about trying to analyze the
如果您想尝试分析

robustness of Internet connectivity
Internet连接的健壮性

the way you could study
你学习的方式

that is to study the underlying communication network
那就是研究底层的通讯网络

of… of…
的…的…

let’s say that allows us to communicate across the planet
假设这使我们能够在整个地球上进行交流

and if you think about how would you describe
如果您考虑如何形容

how cells in our bodies work
人体细胞的运作方式

or how the life works?
或生活如何运作?

It’s actually an interaction diagram of different types
它实际上是不同类型的交互图

of molecules that come together to do useful things in the cell
一起在细胞中发挥有用作用的分子

So again,
再说一遍

you could take data from the cell and represent it as a network
您可以从细胞中获取数据,将其表示为网络

To say this is how components of the cells
说这是细胞的成分

have to work together in order for the cell to be alive
必须一起工作才能使细胞存活

and this becomes very interesting
这变得非常有趣

because then you can say
因为那样你就可以说

if I wanna design a drug
如果我想设计一种药物

I should design a drug so that
我应该设计一种药物

it changes the underlying network
它改变了底层网络

the underlying machine of how the cell works
细胞工作原理的基础机器

Right? And essentially the same
对?特别是一样

for example
例如

in neuroscience
神经科学

if you think about modeling the brain
如果您考虑对大脑建模
the brain is again a network of neurons, right?
大脑又是神经元的网络,对吗?

So if we would like to model thoughts
所以如果我们想现代思想

we have to first map out the networks of how neurons
我们必须先绘制出已连接神经元的网络

are connected and then start modeling that system
然后开始对该系统建模

and this is what I will call networks
这就是我所说的网络

because these are essentially in some sense
因为这些本质上在某种意义上

graphs that appear in the natural world
自然世界中出现的图形

so we could call it natural graphs
所以我们可以称其为自然图

which is a bit different from what I would call
这跟我所说的有点不同

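Two Types of Networks/Graphs

  • Networks (also known as natural graphs):
    • Society is a collection of 7+ billion individuals
    • Communication systems link electronic devices
    • Interactions between genes/proteins regulate life
    • Our thoughts are hidden in the connections between billions of neurons in our brain
  • Information graphs:
    • Information/knowledge is organized and linked
    • Scene graphs: how objects in a scene relate
    • Similarity networks: take data, connect similar points
  • Sometimes the distinction is blurred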

information graphs, where essentially the graph representation
信息图,本质上是图表示

can be kind of just a way or a heuristic
它可以只是遥远的或启发式的

for you to be able to solve a given prediction problem, right?
使您能够解决给定的预测问题,对吗?

so while here you are really thinking about
所以在这里你真的在想


how do I model a given domain
如何为给定域建模

and what could that network
该网络可以做什么?

reveal and teach me about the domain
揭示并教我有关领域的信息

Down here
在这里


We are more interested in
我们更感兴趣

what are the relationships between entities
实体之间的关系是什么

so we can do our prediction tasks well
这样我们就可以做好预测任务

so many times
很多次

for example
例如

people will create similarity networks where you have a lot of objects
在您拥有很多对象的地方创建相似性网络的人

data points and you connect the ones that are similar
数据点,然后连接相似的点

and that again will allow you to better learn over the data
再次,我们将允许它更好地学习数据

because you can learn how to share the data
因为您可以学习如何共享数据

across similar points
跨类似点
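
To make the similarity-network idea concrete, here is a minimal Python sketch that connects data points lying within a chosen distance of each other. The points, the Euclidean metric, and the `max_distance` threshold are all assumptions made for illustration; the lecture does not specify how similarity is measured.

```python
import math

# Hypothetical 2-D data points; in practice these would be feature vectors.
points = {"a": (0.0, 0.0), "b": (0.1, 0.0), "c": (5.0, 5.0), "d": (5.0, 5.2)}


def similarity_edges(points, max_distance):
    """Connect every pair of points whose Euclidean distance is small enough."""
    names = sorted(points)
    edges = []
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            if math.dist(points[u], points[v]) <= max_distance:
                edges.append((u, v))
    return edges


similarity_graph = similarity_edges(points, max_distance=0.5)
print(similarity_graph)
```

Choosing the threshold (or, alternatively, a fixed number of nearest neighbours per point) is a modeling decision that changes the resulting graph.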

And as I think it’s already clear
而且我认为已经很清楚了

the distinction between these two will many times be blurred
两者之间的区别将多次模糊

But essentially the first goal we’ll talk about is
但实际上,我们要谈论的第一个目标是

how do we take data and how do you represent it as a network
我们如何获取数据以及如何将其表示为网络

so that you can then learn or analyze it
这样您就可以学习或分析它

by exploiting the connections between data points
通过利用数据点之间的连接

So what are some examples of natural graphs or natural networks?
那么自然图或自然网络的一些例子是什么?


Many Types of Data are Networks

Right?
对?

You can think of social networks as very nice natural graphs
您可以认为社交网络是非常好的自然图

of who’s a friend of whom
谁是谁的朋友

or who is following whom
或谁在关注谁

things like that
像这样的东西
you can think, for example
你可以想到,例如

of the entire World Wide Web as a giant network of
整个万维网作为一个庞大的网络

how web pages point to each other
网页如何相互指向

where there is now a directed edge between two web pages
现在两个网页之间有一个定向边缘

if one web page hyperlinks to the other web page
是一个网页超链接另一个网页

If you are interested,
如果您对此感兴趣

for example
例如

in the structure of science and how science progresses
在科学结构中以及科学如何发展

then you can start thinking about modeling citation graphs
然后您可以开始考虑对引文图进行建模

where basically a paper cites or refers to prior works
基本上是一篇论文引用或提及先前的作品

and this is very interesting
这很有趣

if you are interested in innovation
如果您对创新感兴趣

if you wanna model how science evolves
如果您想模拟科学的发展

or if you wanna understand how patents
或者如果您想了解专利的方式

get created and how patents cite prior work
获得创造以及专利如何引用先前的工作

so this would be called citation networks
所以这被称为引文网络
and there’s many other examples of networks
还有很多其他的网络例子

anything from networks in economics
网络经济学中的任何东西

where you can have sets of companies and map out
您可以在那里拥有公司集并进行规划

how they relate to each other
它们如何相互联系

to as I was saying communication networks
就像我所说的通讯网络
this, for example
例如

is a network of neurons of a little worm
是小蠕虫的神经元网络

So, actually
所以,实际上

people dissected the worm to really know what neuron maps to what other neurons
人们解剖了蠕虫,以真正知道什么神经元映射到其他神经元

so this is a real neural network
所以这是一个真正的神经网络

A physical neural network of a little worm
蠕虫的物理神经网络
and these are some examples of networks
这些是网络的一些例子

and one type of question we will be asking in this class are questions about
我们将在本课中提出的一种问题是关于

you know, how are these systems that I have down here?
你知道,我在这里的这些系统怎么样?

How are they organized and what are their design principles, right?
它们是如何组织的,它们的设计原理是什么?

You can ask what are the properties of social networks?
您可以问一下社交网络的属性是什么?

How do we humans connect
人类如何联系

Main questions

How are these systems organized?
What are their design properties?

how humans connect and relate to each other
相互联系的人

or you could say
或者你可以说

how do I design robust communication networks
我如何设计健壮的通信网络

and what do they reveal to me about the robustness of the Internet, right?
他们为什么向我透露互联网的健壮性,对吗?

or you can be asking about evolution of science
或者您可以询问科学的发展

through this representation through the network
通过网络通过这种表示

so that’s one example of kind of domains where our methods will be applicable
这就是我们的方法将适用的领域类型的一个例子

Networks: Knowledge Discovery

So essentially here, the idea is that
所以基本上在这里,想法是

behind each of these kinds of complex systems that I gave you
我给你的每种复杂系统的背后

as an example, there is a wiring diagram
接线图就是一个例子
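Behind many systems there is an intricate wiring diagram, a network,
that defines the interactions between the components
We will never be able to model and predict these systems
unless we understand the networks behind them.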

There is a network that defines the interactions of
有一个网络定义了

how these components work and interact together
这些组件如何一起工作和交互

And if we want to model or predict the behavior of the systems
如果我们想要模型或预测系统的行为

we first need to understand and map out the networks that govern their behavior
我们首先需要了解并绘制出控制其行为的网络

So that’s the main motivation why are we doing this
这就是我们这样做的主要动机

The way we are doing
我们的工作方式


Somebody is having a coffee or stuff
有人在喝咖啡或东西

so now the other thing
所以现在另一件事


Many Types of Data are Graphs

which is if you just think about general graphs
如果您只是考虑一般图形

What, for example, becomes amazingly useful
例如,什么变得非常有用

when you are trying to model data is to represent the domain
当您尝试对数据进行建模时,就是要代表领域

through some kind of set of relations that are true
通过一系列真实的关系
So, for example
因此,例如

if you wanna reason about events, this would be
如果您想了解事件发生的原因,这将是

this is an example of how you can create
这是一个如何创建作品的例子

a graph of events in between flights
两次航班之间的事件图

What was delayed and what flight went well
什么延误了,什么飞行顺利
and you can create networks like this
您可以创建这样的网络

You can create what is called knowledge graphs
您可以创建所谓的知识图
where you could basically go and encode all the knowledge
您基本上可以去编码所有知识的地方

you have about a given domain, right?
你有一个给定的域名,对吗?

You could say you know that Bob is interested in Mona Lisa
您可能会说您知道Bob对Mona Lisa感兴趣

that Mona Lisa was created by this other node called
蒙娜丽莎(Mona Lisa)是由另一条音符创造的

Leonardo da Vinci and so on and so forth
达芬奇(Leonardo da Vinci)等

Right and you can map out this huge graph of
对,您可以绘制出这张

basically everything we know about the world around us
基本上我们对周围世界的了解

and these are called knowledge graphs
这些被称为知识图
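
One common way to store a knowledge graph like the one just described is as a list of (subject, relation, object) triples. The sketch below is only illustrative: the relation names `is_interested_in` and `was_created_by` are made up for this example, and only the facts mentioned in the lecture are encoded.

```python
# Each fact is a (subject, relation, object) triple. Relation names are hypothetical.
triples = [
    ("Bob", "is_interested_in", "Mona Lisa"),
    ("Mona Lisa", "was_created_by", "Leonardo da Vinci"),
]


def objects_of(triples, subject, relation):
    """Return all objects linked to `subject` by `relation`."""
    return [o for s, r, o in triples if s == subject and r == relation]


def creators_of_interests(triples, person):
    """Two-hop query: who created the things this person is interested in?"""
    creators = []
    for item in objects_of(triples, person, "is_interested_in"):
        creators.extend(objects_of(triples, item, "was_created_by"))
    return creators


print(creators_of_interests(triples, "Bob"))
```

The two-hop query shows the benefit of the graph form: facts that were stored separately can be chained by following edges.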


You can also think, for example
您也可以考虑,例如

if you wanna do computer vision or in image understanding
如果您想进行计算机视觉或图像理解
There are a lot of objects in the image
图像中有很多物体

and these objects are not kind of isolated; they
这些对象不是孤立的

somehow relate to each other and you can now
彼此之间有某种联系,您现在可以

represent these relationships through a graph
通过我们的图表表示这些关系

and this would be called scene graphs
这将称为场景图

and the last example I mentioned is if you think about molecules
我提到的最后一个例子是,如果您考虑分子

molecules are graphs as well
分子也是图

So state of the art models for predicting properties of molecules
因此,用于预测分子特性的最新模型
are based on the graph representation of those molecules, right?
是基于那些分子的图形表示,对吧?

so you can think of these junctions or corners, these are the atoms
所以你可以想到这个结点或一个角,这些是原子

and these are the bonds
这些是纽带

and that’s your little graph on a few nodes
那就是你在几个节点上的小图

that you can now try to model and understand
您现在可以尝试建模和了解


Main questions

So here the question is
所以这里的问题是

how do I take advantage of this relational structure
我如何利用这些关系结构

to better model the underlying domain to better make predictions?
更好地为基础领域建模以更好地做出预测?

Graphs: Machine Learning

So essentially
所以本质上

what I wanna say in this case is to say that in complex domains
在这种情况下,我想说的是在复杂域中

we have rich relational structure
我们已经建立了关系结构

which can be represented as this type of relational graph
可以表示为这种类型的关系图

and by explicitly modeling these relationships
并通过明确建模这些关系

we can achieve better models
我们可以获得更好的模型

better performance
更好的性能

we are able to solve tasks better
我们能够更好地解决任务

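Complex domains (knowledge, text, images, etc.) have rich relational structure,
which can be represented as a relational graph
By explicitly modeling relationships we achieve better performance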

So this is what I wanted to say about motivation for what we are doing
这就是我想说的关于我们正在做的事情的动力


but you can also say
但是你也可以说

Hey, you know
嘿,你知道

but why should I care about these things?
但是我为什么要关心这些事情?
So I tell you two reasons why you should care
所以我告诉你两个原因


Why Networks? Why Now?

So the first thing is that networks
所以第一件事就是网络

as you have noticed, are essentially a universal language
您已经注意到我们本质上是我们的通用语言

for describing complex systems, right?
描述复杂的系统,对吗?


I already gave you examples all the way
我已经给你例子了

from social science
来自社会科学

to chemistry
化学

to biology
对生物学

to neuroscience and knowledge graphs
对神经科学并承认图

I dare say AI as well, right?
我也敢说对吧?

So the methods
因此方法将是

we will be working on will be applicable to all these domains
我们将致力于将适用于所有这些域

because the underlying data representation will be the same
因为基础数据表示形式将是相同的

It will be the representation of the network
这将是网络的代表

so this means that there are many different domains from
因此,这意味着存在许多不同域的网络

science, nature, technology
科学,自然技术

that can be represented as networks
可以表示为网络的那种

and it becomes interesting to say
变得有趣起来

how are these network representations different
这些网络表示形式有何不同

and can we design algorithms that work kind of across
我们可以设计出可以在

all these different networks from different domains?
来自不同域的所有这些不同网络?

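  • Universal language for describing complex data
    • Networks from science, nature, and technology are more similar than one would expect
  • Shared vocabulary between fields
    • Computer science, social science, physics, economics, statistics, biology
  • Data availability and computational challenges
    • Web/mobile, bio, health, and medical
  • Impact!
    • Social networking, drug design, AI reasoning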

Another thing is that the analysis of networks is a very nice interdisciplinary field
另一件事是,网络分析是一个非常好的跨学科领域


Where computer science meets social science
计算机科学与社会科学相遇的地方

physics
物理

economics
经济学

statistics
统计

biology and so on
生物学等

right? through this notion of network representations
对?通过这种网络表示的概念

another big important thing why should we worry about
另一个重要的重要事情是我们为什么要担心


these types of data now is because, in some sense, it’s finally available, right?
这些类型的数据现在是因为从某种意义上说它终于可以使用了吧?

We have huge amounts of network data and there are very interesting
我们拥有大量的网络数据,并且非常有趣

and hard computational challenges to solve.
以及难以解决的计算难题。

and then there is big impact of this technology
然后这项技术的影响很大


right?
对?


Networks: Impact

If you think about the new wave of digital companies
如果您考虑一下数字化公司的新潮流

they are all exploiting the graph structure one way or another, right?
他们都以一种或另一种方式利用图结构,对吗?
The reason why Google became Google as a search engine was that
Google成为Google成为搜索引擎的原因是

they realized for the first time that the Web is a graph
他们第一次意识到Web是我们的图

before that, people were modeling Web search as a set of documents
在他们将网络搜索建模为一组文档之前

and Google said: no this is a set of documents that are connected
和谷歌说:不,这是一组连接的文档

and that led to everything there is today
导致了今天的一切

Similarly
相似地

if you think about social networks, product
如果您考虑社交网络产品

graphs and things like that
图之类的东西

So this was the high level, where I wanted to kind of motivate what we will be doing
所以这是一个很高的层次,我想激发我们将要做的事情


Networks and Applications

What I wanna do in the next few minutes is
我接下来几分钟想做的是

I wanna show you some examples of applications or tasks
我想给你看一些应用程序或任务的例子
that you can do with networks, right?
可以与网络一起使用,对吗?

So generally
所以一般

you can think of the way to analyze networks as follows, right?
您可以考虑以下分析网络的方式,对吗?


Ways to Analyze Networks

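
  • Predict the type/color of a given node
    • Node classification
  • Predict whether two nodes are linked
    • Link prediction
  • Identify densely linked clusters of nodes
    • Community detection
  • Measure the similarity of two nodes/networks
    • Network similarity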

you can think about trying to predict the type of a node
您可以考虑尝试对节点的类型进行模型预测

so this will be called node classification
因此,这将称为节点分类

You can think about predicting whether two nodes should be connected or not
您可以考虑预测是否应该连接两个节点

This is called link prediction
这称为链接预测

You can think about identifying densely connected
您可以考虑确定密集连接

or in-densely connected sets of nodes
或密集连接的节点集

in some sense
在某种意义上

some kind of a clustering task
某种集群任务

This is called community detection
这称为社区检测

and you can also think about measuring and quantifying
您还可以考虑测量和量化

similarity between different graphs
不同图之间的相似性

and this is called network similarity
这就是所谓的网络相似度
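
To give a feel for one of these tasks, here is a minimal Python sketch of link prediction using a common-neighbours score: two unlinked nodes that share many neighbours are guessed to be more likely to connect. This heuristic is an assumption chosen for illustration, not the method taught later in the course, and the graph is invented toy data.

```python
# Hypothetical undirected graph stored as node -> set of neighbours.
adjacency = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}


def common_neighbour_scores(adjacency):
    """Score every currently unlinked pair by its number of shared neighbours."""
    nodes = sorted(adjacency)
    scores = {}
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            if v in adjacency[u]:
                continue  # already linked, nothing to predict
            scores[(u, v)] = len(adjacency[u] & adjacency[v])
    return scores


scores = common_neighbour_scores(adjacency)
print(scores)
```

Ranking the unlinked pairs by score gives a simple list of candidate links to suggest.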


And, for example
而且,例如

where this would be useful is if you wanna compare molecules
在哪里有用,如果您想比较分子

how similar or different they are
它们有多相似或不同

I will show you an example of community detection next in social networks
接下来,我将向您展示在社交网络中进行社区检测的示例

where we want to identify social circles of a person
只有一个人可以识别一个人的社交圈

I’ll show you examples of link prediction
我将向您展示链接预测的示例

for predicting side effects of drugs
用于预测药物的副作用

And I’ll show you some examples of node classification as well
我也将向您展示音符分类的一些示例

So let me kind of show you some examples
因此,让我为您展示一些示例

what can you do with these types of things?
您可以用这些类型的东西做什么?


(1) Networks: Social Networks

So here is a kind of a very famous picture of the social networks of Facebook
因此,这是Facebook社交网络的一张非常著名的图片
It was actually created by one of the students in this class several years ago
它实际上是几年前由该班的一位学生创建的

He got hired to Facebook later and created this beautiful picture of their network
后来他被雇用到Facebook,并为他们的网络创建了这张美丽的照片

And one question is, that you can ask is
一个问题是,你可以问的是

how is this network organized
这个网络是如何组织的

Application: Social Circle Detection

and one way you could try to understand the organization of the network is to say
而您可以尝试了解网络组织的一种方法是说

Can I identify the social circles in the network
我可以识别网络中的社交圈吗
and you could imagine that every person, right? If this is the person
您可以想象每个人,对不对?如果这个人

you take their friends and you say
你带他们的朋友,你说

how is their social network composed of social circles?
他们的社交网络如何由社交圈组成?

And it can become very interesting, right?
它会变得非常有趣,对吗?

You could say here’s a set of my friends from high school
你可以说这是我高中时期的一组朋友

maybe there are some family members up there
也许那里有一些家庭成员

but then I’ve been a few years at Stanford
但是那时我已经在斯坦福大学呆了几年了

so I have this kind of
所以我对此很有兴趣

an interesting nested structure where, you know
一个有趣的嵌套结构,您知道

these are maybe my friends from the same university from Stanford
这些也许是我来自斯坦福大学的朋友

these are my friends from the same department
这些是我来自同一部门的朋友

and up there in the innermost circle are the friends from the same research group
在最里面的那个圈子里是同一个研究小组的朋友

And for example
例如

here the social circles overlap
在这里,社会圈子重叠

which means that these two nodes are my two friends
这意味着这两个节点是我的两个朋友

from high school who also are attending Stanford
也是从斯坦福大学就读的高中生

for example
例如

so you can see how these social circles can have very different and complex structure
这样您就可以看到这些社交圈如何具有截然不同和复杂的结构

and the question is given a network
问题被赋予了一个网络

can I identify these social circles?
我可以识别这些社交圈吗?

How well could I do that and how would I even
我做得如何,他们怎么会

define the problem if what I want is this kind of organization of
定义问题,如果我想要这样的某种组织

how my friends group together, ok?
我的朋友们如何在一起,好吗?

So that’s one example
这就是一个例子


(2) Networks: Infrastructure

another example
另一个例子
It’s
它的

you can study infrastructure as networks
您可以将基础设施作为网络进行研究

and here is an example from 2003
这是2003年的一个例子

and this was a big blackout on the East coast
这是东海岸的一个大停电

So this is a satellite picture at 9:29 p.m. East Coast time
这是东海岸时间晚上9:29的卫星图片

and this is the picture 7 hours later
这是7小时后的照片

And what you see is that here, you know, you see the lights from the cities
而且您所看到的是,在这里,您看到的是城市的灯光

and here you see very little of that light
在这里你几乎看不到那种光

And essentially
基本上

what happened was, there was a big
发生的是,有一个很大的

a kind of power outage across the entire East coast
整个东海岸的停电

And why are networks important?
以及为什么我们的网络很重要

because you can map the power grid as a graph
因为您可以将电网映射为图形

and then you can start modeling these types of cascading failures
然后可以开始对这些类型的级联故障建模

where failure in one sub part of the system causes failures in other parts of the system
系统一个子部分中的故障导致系统其他部分中的故障
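
As a rough illustration of a cascading failure, the Python sketch below uses a simple threshold rule: a node fails once at least a given fraction of its neighbours have failed. This rule, the threshold value, and the toy grid are all assumptions for illustration; real power-grid models are more involved and the lecture does not specify one here.

```python
# Hypothetical grid: a chain a-b-c-d attached to a tightly knit triangle e-f-g.
grid = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d"},
    "d": {"c", "e"},
    "e": {"d", "f", "g"},
    "f": {"e", "g"},
    "g": {"e", "f"},
}


def cascade(adjacency, initially_failed, threshold):
    """Spread failures until no additional node meets the failure threshold."""
    failed = set(initially_failed)
    changed = True
    while changed:
        changed = False
        for node, neighbours in adjacency.items():
            if node in failed or not neighbours:
                continue
            failed_fraction = len(neighbours & failed) / len(neighbours)
            if failed_fraction >= threshold:
                failed.add(node)
                changed = True
    return failed


failed_nodes = cascade(grid, initially_failed={"a"}, threshold=0.5)
print(sorted(failed_nodes))
```

In this toy case the failure runs down the sparsely connected chain but stops at the triangle, where each node has enough healthy neighbours to stay up.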

And then you can start asking
然后你可以开始问

how would I design networks that are robust to failures
我将如何设计对故障稳健的网络

and don’t lead to this kind of cascading failures?
并不会导致这种级联失败吗?

And you know
而且你知道

we will learn some laws about what it means for the network
我们了解一些有关网络含义的法律

to be robust and how do we design a robust network?
变得强大,我们如何设计一个强大的网络?

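This reveals two important themes of this class:

  • We must understand how network structure affects the robustness of a system
  • Develop quantitative tools to assess the interplay between network structure and the dynamical processes on the networks, and their impact on failures
  • We will learn that, in reality, failures follow reproducible laws that can be quantified and even predicted using the tools of networks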

(3) Networks: Knowledge

The next thing I wanna show is to talk about networks
我想展示的下一件事是谈论网络
that encode knowledge that we have about a given domain
编码我们对给定领域的知识

and the idea here is that these networks can be highly heterogeneous
这里的想法是这些网络可以是高度异构的

heterogeneous in the sense that you can have different types of nodes and different types of relationships
从某种意义上讲,您可以具有不同类型的节点和不同类型的关系

Right? So for example
对?所以举个例子

this one up here is kind of a knowledge graph of cars
如果您想要这个,这里是汽车的知识图

where you know that Honda are produced in Japan
你知道本田是在日本生产的

and they make automobiles
他们制造汽车

and Japan is an Asian country and China is also an Asian country
日本是亚洲国家,中国也是亚洲国家

and Toyota is also in Japan
丰田也在日本

and so on and so forth
等等等等

Right and you are encoding this into a graph
正确,您正在将其编码为图形
Or, for example
或者,例如

if you are interested about universities
如果您对大学感兴趣

authors
作家

research papers
调查报告

research
研究

conferences and instantiation of those research conferences
会议和这些研究会议的实例化

here could be a graph that would allow you to model the relationships
这可能是一个可以让您对关系进行建模的图形

between all these different types of entities
在所有这些不同类型的实体之间

And once you have this type of representation
一旦有了这种表示形式

you can learn in very interesting ways
你可以以非常有趣的方式学习


and I’ll just give you one example
我只举一个例子

So in the afternoons
所以在下午

I also work at this company called Pinterest that you may have heard of, right?
我也在这家名为pinterest的公司工作,您可能已经听够了,对吗?

And what people do there is that they take these types of images
我在人们那里所做的就是他们拍摄这些类型的图像

and they save them into collections
他们将它们保存到收藏夹中

and what is interesting is that you can take the same image
有趣的是,您可以拍摄相同的图像

for example
例如

this one and you can save it in two collections you create that have different names, right?
您可以将其保存在您创建的两个具有不同名称的集合中,对吗?
so somebody might be interested in identifying vintage kitchens
所以有人可能会对识别老式厨房感兴趣

because maybe they want their kitchen to look vintage
因为也许他们希望厨房看起来复古

so they would save this into their collection of vintage kitchens
因此他们会将其保存到他们的老式厨房中

Maybe some architect is interested in this kind of bright blue color
也许有些建筑师对这种鲜艳的蓝色感兴趣

and the blue accents and how they fit into a kitchen
和蓝色的口音,以及它们如何融入我们的厨房

So they would create another board that would be called blue accents
因此,他们会说他们将创建另一个称为蓝色口音的面板

and and so on and so forth, right?
等等等等,对吧?

and the way you can think of the entire Pinterest is that
您对整个Pinterest的看法是什么

you can think of it as this type of bipartite graph
您可以将其视为这种二部图

where on one end you have images
一端有图像

and on the other end you have these collections
另一方面,您有这些收藏
and an image can belong to multiple collections at once
一个图像可以一次属于多个集合
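
The bipartite structure described above can be sketched in a few lines of Python. The pin and board names below are invented, and treating "shares a board" as the notion of relatedness is an assumption for illustration, not a description of how Pinterest works in production.

```python
from collections import defaultdict

# Hypothetical (image, collection) save events.
saves = [
    ("pin_kitchen", "vintage kitchens"),
    ("pin_kitchen", "blue accents"),
    ("pin_sofa", "blue accents"),
    ("pin_stove", "vintage kitchens"),
]


def build_bipartite(saves):
    """Index the bipartite graph from both sides: pin -> boards, board -> pins."""
    boards_of_pin = defaultdict(set)
    pins_on_board = defaultdict(set)
    for pin, board in saves:
        boards_of_pin[pin].add(board)
        pins_on_board[board].add(pin)
    return boards_of_pin, pins_on_board


def related_pins(pin, boards_of_pin, pins_on_board):
    """Pins that share at least one board with `pin` (two hops away in the graph)."""
    related = set()
    for board in boards_of_pin.get(pin, set()):
        related |= pins_on_board[board]
    related.discard(pin)
    return related


boards_of_pin, pins_on_board = build_bipartite(saves)
print(related_pins("pin_kitchen", boards_of_pin, pins_on_board))
```

Because edges only run between images and collections, any two images are related through the collections they share.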

and basically
基本上

what you can use this for now is to say
你现在可以用这个说什么

can I use this representation, where I know how these different images
我可以使用这种表示法吗,我知道这些不同的图像

fit into different collections to do better pin recommendations
适合不同的收藏品以提供更好的针脚建议

Okay
好的

and the way you can do this is to say the following
而您可以做到的方式是说以下

let me represent entire Pinterest as a graph
让我将整个Pinterest表示为图表

you have about four billion nodes here, about three billion nodes here
你这里有大约40亿张钞票

and about two hundred billion connections
和大约2,000亿个连接

That’s the size of the thing
那是东西的大小


Embedding Nodes

and then you can say
然后你可以说

can I use this notion of embeddings
我可以使用这种嵌入概念吗

where the idea is that you will take the graph
想法是您将绘制图形

and you will try to map every node of the graph into some d-dimensional space
然后您将尝试在某个d维空间中映射图的每个节点

so that the nodes that are related in the graph are also close in this space
因此图中的相关节点在此空间中也很靠近
And you can also do that
你也可以做到

and in this course I will show you how we can do that
在本课程中,我将向您展示我们如何做到这一点

So now
所以现在

when you say
当你说

once you’ve done this mapping
完成此映射后

you can pick an arbitrary point in the space and say
您可以在空间中选择一个任意点然后说

what are other points around it in this space and those will be your recommendations
在这个空间中还有其他哪些要点,这些将是您的建议
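
Once every node has a position in the embedding space, recommendation reduces to a nearest-neighbour lookup. The Python sketch below assumes the embeddings are already given (the 2-D vectors are invented) and uses Euclidean distance, which is an assumption; how the embeddings are learned is covered later in the course and is not shown here.

```python
import math

# Hypothetical 2-D embeddings; real systems use many more dimensions.
embeddings = {
    "doll_1": (0.9, 0.1),
    "doll_2": (1.0, 0.2),
    "cat_1": (0.1, 0.9),
    "cat_2": (0.2, 1.0),
}


def nearest_neighbours(query, embeddings, k):
    """Return the k items closest to `query` in the embedding space."""
    query_vector = embeddings[query]
    distances = [
        (math.dist(query_vector, vector), name)
        for name, vector in embeddings.items()
        if name != query
    ]
    return [name for _, name in sorted(distances)[:k]]


print(nearest_neighbours("doll_1", embeddings, k=1))
```

A linear scan like this is fine for a toy example; at the scale of billions of nodes one would need an approximate nearest-neighbour index instead.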

So now I wanna show you some examples of these types of recommendations
所以现在我想向您展示这些建议类型的一些例子


So how are we doing this?
那么我们在做什么呢?

Somebody will give us a query image here it is
有人会给我们一个查询图像
and then we can go and say
然后我们可以说

what are the images that are nearby in this embedding space, right?
这个嵌入空间附近的图像是什么,对吗?

we will take the graph
我们将采取图

where every node of the graph will now become a point in this space
图的每个节点现在将成为该空间中的一个点

and what we can learn is how to do this mapping and you know
我们可以学到的是如何进行这种映射

five lectures from now, we’ll talk about how to learn these types of mappings
从现在开始的五个讲座中,我们将讨论如何学习这些类型的映射

but once you have your data embedded in here
但是一旦您将数据嵌入此处

you can ask who are the nearest neighbors of a given pin, of a given image
您可以询问谁是给定图钉,给定图像的最近邻居

So here’s an example, this is the image
所以这是一个例子,这是图片

this is what would happen if we would be only using computer vision features
如果他仅使用计算机视觉功能,就会发生这种情况

If you only do the image features
如果仅执行图像功能

and here is what happens if you do the graph, right?
如果您制作图表,会发生什么事,对吗?

and this is some porcelain doll type thingy or little statue
这是一些薄或小的雕像的瓷娃娃

and you see these are the nearest neighbors in the space
你看到这些是空间中最近的邻居

and they are all kind of similar porcelain dolls
他们都是相似的瓷娃娃

If you are into this type of thing
如果你喜欢这种东西

right

no recommender system application is complete without cats
没有猫,没有推荐的系统应用程序完整

So I give you a cat example, right?
所以我举个例子,对吧?


So if you are into cats with little heads
所以如果你猫头小

then right here are our recommendations, right?
那我们的建议对吗?

You see very cute cats with very cute cats
你看到非常可爱的猫和非常可爱的猫
so again
再来一次

the graph representation allows you to do this
图形表示法允许您执行此操作

If you are excited
如果你很兴奋

please don’t use Pinterest during the lecture
请在演讲期间不要使用Pinterest

OK

you can do it after
你可以以后做

but these are the things you can do
但是这些是你可以做的

and this is actually something that is running in production so you can try it out, right?
这实际上是在生产中运行的,因此您可以尝试一下,对吗?

So that’s one example of how you can take something
所以这是一个如何拿东西的例子

represent it as a graph
将其表示为图形

not a social network and let’s do recommendations
不是社交网络,让我们做一些建议


Another place where networks are important is in understanding
网络在理解中很重要的另一个地方

(4) Networks: Online Media

online media and in particular social media
在线媒体,尤其是社交媒体
So it is very interesting
所以这很有趣

if you say
如果你说

let me try to understand how these networks polarize around different topics
让我尝试了解这些网络如何围绕不同主题进行两极化

and you can represent, for example
例如,您可以代表

the network of Twitter or the network of blogs
Twitter网络或博客网络

and try to study how is it organized
并尝试研究其组织方式

based on political opinions and leanings
基于政治见解和宽大处理

and it’s very interesting
它非常有趣


Application: Polarization on Twitter

for example
例如

that you find that whenever the topics are polarizing
您会发现,每当主题发生分歧时

you get this type of two sides that kind of talk a lot
你会得到很多这种类型的双方

each with its own members
每个都有自己的成员

but there is very little talking across the two sides
但是两个站点之间的对话很少

and this is when the topics are polarizing
每当这是两极分化的时候

and this is when topics are not polarizing
这是话题不容置疑的时候
so you can kind of identify topics that polarize by
因此您可以识别出两极分化的主题

mapping the underlying structure of the network of who is talking to whom
映射与谁聊天的人的网络的基础结构
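
One simple way to quantify the pattern just described is the fraction of conversations that cross between the two sides: a low value suggests a polarized topic. The Python sketch below assumes each account has already been assigned to a side, which is itself a hard problem that is not addressed here; the accounts and edges are invented.

```python
# Hypothetical side assignment for each account.
side_of = {"a": "left", "b": "left", "c": "left", "x": "right", "y": "right", "z": "right"}

# Hypothetical "who talks to whom" edges.
talk_edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),  # within the first side
    ("x", "y"), ("y", "z"), ("x", "z"),  # within the second side
    ("c", "x"),                          # the only cross-side conversation
]


def cross_side_fraction(edges, side_of):
    """Fraction of edges whose endpoints are on different sides."""
    if not edges:
        return 0.0
    crossing = sum(1 for u, v in edges if side_of[u] != side_of[v])
    return crossing / len(edges)


fraction = cross_side_fraction(talk_edges, side_of)
print(round(fraction, 3))
```

Comparing this fraction across topics gives a crude ranking of how polarized each discussion is.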


another example where networks help a lot is to
网络帮助很大的另一个例子是

identify misinformation or to identify fake news
识别错误信息或识别虚假新闻
And we were working here with Wikipedia and on Wikipedia
我们在这里与Wikipedia和Wikipedia一起工作

You have articles that are completely fake
您有完全伪造的文章

that you know somebody came and wrote
你知道有人来写了

and there’s nothing in there is true
里面什么都没有

so how can you really detect that? If you only look at the text
因此,您真正可以做到这一点的方法是只看文字

it’s very
这是很

very hard to do
很难做

But if you kind of map out how the article is referring to different entities across Wikipedia
但是,如果您可以确定本文是如何引用整个Wikipedia中的不同实体的,

you can map out how it refers to other parts of Wikipedia
您可以找出它如何指代维基百科的其他部分

and what you find is that these kinds of hoaxes tend to refer to
而您发现的是,当这种骗局倾向于指

kind of an incoherent set of entities, and that allows you to distinguish between hoaxes
一种不连贯的实体集,使您可以区分恶作剧

fake articles and non fake articles again
伪造物品和非伪造物品

this is just at the high level
这只是高水平的

and when you do that
当你这样做的时候

you can go to Wikipedia and identify fake articles
你可以去维基百科,找出假冒的文章

For example
例如

here is one about a new language Balboa Creole French that doesn’t exist
这是关于不存在的新语言Balboa Creole法语的一种

and there is actually a story that somebody traveled to this island to study this language
实际上有一个故事,有人去这个岛学习这种语言

But after they arrived
但是他们到达之后

they realized that the language doesn’t exist
他们意识到语言不存在

and if you show this to a human
如果你把这个展示给人类

humans do just a bit better than random, right?
人类比随机做的好一点,对吧?

So if you ask
所以这是你问

is this article a hoax or not?
这是骗局吗?

If you’d be guessing at random,
如果您随机猜测,

let’s say your accuracy would be 50%, if you show it to humans
假设您向人类展示您的准确性为50%

it’s only 66%
只有66%

We can do it with 86%, right?
您可以做到86%,对不对?

by, again
再来一次

mapping out the network of what concepts this article refers to
映射出本文所指概念的网络

and how these concepts are related to each other so again
以及这些概念如何相互关联

trying to capture relationships
试图捕捉关系

If you’d like more
如果您想要更多

here’s the link to the paper
这是论文的链接

but that’s the idea
但这就是想法
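
As a rough illustration of the idea above, here is a minimal sketch of one way to measure how coherent the set of entities referenced by an article is: the fraction of referenced-entity pairs that link to each other. The function name `link_coherence`, the toy `links` dictionary, and both example articles are hypothetical, and this is not claimed to be the feature set used in the paper mentioned in the lecture.

```python
from itertools import combinations


def link_coherence(outlinks, links):
    """Fraction of pairs of referenced entities that link to each other (either direction)."""
    entities = sorted(set(outlinks))
    pairs = list(combinations(entities, 2))
    if not pairs:
        return 0.0
    connected = sum(
        1
        for a, b in pairs
        if b in links.get(a, set()) or a in links.get(b, set())
    )
    return connected / len(pairs)


# Hypothetical toy link graph: entity -> entities it links to.
links = {
    "Paris": {"France", "Seine"},
    "France": {"Paris", "Europe"},
    "Seine": {"France"},
    "Europe": {"France"},
    "Jazz": set(),
    "Volcano": set(),
}

genuine = ["Paris", "France", "Seine", "Europe"]  # entities that belong together
hoax = ["Paris", "Jazz", "Volcano", "Europe"]     # an incoherent mix of entities

print(link_coherence(genuine, links))  # 4 of 6 pairs are linked
print(link_coherence(hoax, links))     # no pair is linked
```

A score like this would be one input feature among several for a classifier; on its own it only captures the "incoherent set of entities" intuition.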


Another thing we’ll talk about, so far I was talking about how to model network structures
到目前为止,我们将要讨论的另一件事是如何对网络结构建模

What is interesting is that there is also a phenomenon that spreads over networks
有趣的是,还有一种现象遍布网络

You guys recognize that right
你们认识到那权利

That’s an example of a spreading phenomenon
那是传播现象的一个例子

a little social virus or a network virus that spreads over the social graph
传播到社交图谱中的一些社交病毒或我们的网络病毒

And you can study how this information cascades
您可以研究这些信息如何层叠

For example
例如

this Gangnam Style leader
这位江南时尚领袖

how they spread and how people get adopted and how people get exposed to it
他们如何传播以及人们如何被采用以及人们如何接触它
how they learn about it and how they then
他们如何了解它,然后如何

let’s say, really share it
假设,真的分享一下

This is one example
这是一个例子

Another example
另一个例子

you can also study is product adoption
您还可以研究产品的采用

right?
对?

There are a lot of products that get adopted
有很多产品被采用

kind of virally, virally in the sense that people invite each other
以病毒式的方式,也就是人们相互邀请的方式

and one of the most successful products that spread through
也是传播最成功的产品之一

these types of invitations is LinkedIn, right?
这些邀请类型是LinkedIn,对吗?
so membership to LinkedIn the way
因此成为LinkedIn会员

80% of the LinkedIn members joined LinkedIn
80%的领英会员已加入领英

was by somebody explicitly inviting them to join the network
被某人明确邀请他们加入网络

And here is one invitation cascade where the person on the top was the first person
这是一个邀请级联,顶部的人是第一人

they invited a couple of other people at the next level, who invited other people at the next level
他们邀请了下一个级别的其他几个人,他们邀请了下一个级别的其他人

and you see how this membership of LinkedIn is now kind of
您会看到LinkedIn的这种会员现在如何

cascading through the underlying network
通过底层网络进行级联

So if you wanna create products that get adopted
因此,如果您想创建被采用的产品

without advertising, basically by having people who use your product
但基本上没有使用您产品的人做广告

invite others
邀请他人

you are interested in this kind of spreading phenomena across networks
您对这种跨网络传播现象感兴趣

So, and of course, we’ll talk about this and we’ll even talk about
所以,当然,我们会谈论这个,甚至会谈论

who do you infect with your product so that
您会向谁感染您的产品,以便

you get the maximum spreading power and things like that
您将获得最大的传播能力,诸如此类
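
The invitation cascade described above is a tree rooted at the first member, so a breadth-first traversal recovers the "levels" of who was invited at each step. The following is only a sketch on made-up data: the `invites` dictionary, the member names, and the function name `cascade_levels` are hypothetical.

```python
from collections import deque


def cascade_levels(invites, root):
    """Group members of an invitation cascade by their distance from the root (BFS)."""
    depth_of = {root: 0}
    queue = deque([root])
    while queue:
        person = queue.popleft()
        for invitee in invites.get(person, []):
            if invitee not in depth_of:  # each member joins only once
                depth_of[invitee] = depth_of[person] + 1
                queue.append(invitee)
    by_level = {}
    for person, depth in depth_of.items():
        by_level.setdefault(depth, []).append(person)
    return by_level


# Hypothetical cascade: inviter -> people who joined through that invitation.
invites = {
    "ann": ["bob", "cat"],
    "bob": ["dan", "eve"],
    "cat": ["fay"],
    "eve": ["gus"],
}

levels = cascade_levels(invites, "ann")
for depth in sorted(levels):
    print(depth, levels[depth])
```

The number of members per level and the maximum depth are simple summaries of how far a product spread through invitations.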


And then the last thing I wanna mention
然后我想提的最后一件事

I wanna show you
我想告诉你

A quick example is about networks and representations from kind of biomedicine
一个简单的例子是关于生物医学的网络和表征

and in particular
特别是

one network that is amazingly powerful is
一个功能强大的网络是

what is called a protein-protein interaction network, right?
什么叫做蛋白质-蛋白质相互作用网络,对吗?

(5)Networks:Biomedicine

So in our bodies we have close to basically we have
因此,在我们体内,我们基本上已经接近

We have 20,000 protein-coding genes
我们拥有20,000个蛋白质编码基因

so there are around 20,000 proteins in our bodies
所以我们体内大约有20,000种蛋白质

and these proteins physically come together in our cells to do certain tasks
这些蛋白质在我们的细胞中物理结合在一起以完成某些任务

So what you can do is if you’re a biologist
所以如果您是一名生物学家,您可以做的是

you can go in the lab and actually measure
您可以去实验室进行实际测量

painstakingly for every pair of proteins
为每对蛋白质付出艰辛

do they come together in the cell physically to do something
他们会聚在一起吗?

And this gives rise to the protein-protein interaction network
这产生了蛋白质-蛋白质相互作用网络

right?
对?

It has 20,000 nodes,
它有20,000个节点,

and each of the edges is basically hand
每个边缘基本上都是手

measured by one of your colleagues in the lab it takes about
由您的一位同事在实验室进行测量,大约需要

I don’t know, a few months to get one edge out, right? And the possible number of edges is
我不知道,大概几个月才能得到一条边,对吧?而可能的边数是

20,000 ^ 2, right?
20,000 ^ 2,对吗?

So, but, you know people are doing this
所以,但是,您知道人们正在这样做

and this is a very useful network
这是一个非常有用的网络


Application:Side effects

So let me show you one example of how this network can be used for a very real problem, right?
因此,让我向您展示一个示例,说明如何将此网络用于一个非常实际的问题,对吗?

So here’s the problem, patients today take many drugs at once
这就是问题所在,今天的患者一次要服用多种药物

Right? To treat, let’s say, coexisting or complex diseases
对?比如说,用来治疗并存的或复杂的疾病
for example
例如

in the US about fifty percent of the people who are older than 70
在美国,70岁以上的人中约有50%

take 5 drugs or take more than 5 drugs
服用5种药物或服用5种以上药物

And there are many patients who take 20 drugs together
还有很多病人一起服用20种药物

all at once
一次全部

right?
对?

and the problem is that drugs can have side effects
问题是药物可能有副作用

and the problem is when people take combinations of drugs
问题是当人们服用多种药物时

some kind of new adverse side effects can emerge that are not
可能会出现一些新的不良副作用

just side effects of one drug and side effects of another drug
只是一种药物的副作用和另一种药物的副作用

it’s new: due to drug interactions
由于药物相互作用,它是新的

new side effects can emerge
可能出现新的副作用

and of course you can now create clinical trials to
当然,您现在可以创建临床试验来

test every pair of drugs to see what side effects would happen
测试每对药物,看看会发生什么副作用

It would just be too expensive, too huge a combinatorial explosion
巨大的组合爆炸太昂贵了

There are around 5,000 registered drugs in the United States
美国大约有5,000种药品注册

so the number of pairs you can compute, right?
所以可以计算的对数,对不对?
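
To make the size of this combinatorial explosion concrete, the number of unordered pairs among n items is n(n-1)/2. The short sketch below uses the approximate counts quoted in the lecture (about 5,000 drugs, and about 20,000 proteins from the earlier protein-protein interaction example); the variable names are only illustrative.

```python
from math import comb

num_drugs = 5_000      # "around 5,000" registered drugs, as quoted in the lecture
num_proteins = 20_000  # "around 20,000" proteins, as quoted in the lecture

drug_pairs = comb(num_drugs, 2)        # unordered pairs of distinct drugs
protein_pairs = comb(num_proteins, 2)  # unordered pairs of distinct proteins

print(f"{drug_pairs:,}")     # 12,497,500
print(f"{protein_pairs:,}")  # 199,990,000
```

The figure 20,000 ^ 2 = 400,000,000 counts ordered pairs and self-pairs, so it is roughly twice the number of distinct protein pairs; either way the order of magnitude is what rules out testing every pair by hand.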

so you cannot do that, so the question would be how could you go and
所以你不能那样做,所以问题是你怎么去

and given a pair of drugs predict what kind of side effects can occur
并给定了一对药物,可以预测会发生什么样的副作用

And to tell you one other thing
告诉你另一件事

the way we learn about these things today is that doctors prescribe medication
今天我们了解这些事情的方式是医生开了药

if enough patients complain that something is wrong
如果有足够多的患者抱怨出了问题

the doctor will investigate
医生会调查

the doctor will then report this to FDA
医生然后将其报告给FDA

If you know
如果你知道的话

if FDA kind of collects enough of these types of reports
如果FDA收集了足够多的此类报告

they have a spreadsheet where they would add a drug combination and the side effects
他们有一个电子表格,可以在其中添加药物组合和副作用

So basically we learn about this by luck or by kind of non luck
所以基本上我们是通过运气或非运气来学习的

If you see what I mean
如果你明白我的意思


OK

so the question is
所以问题是

could I predict these types of things, OK?
我可以预测这类事情好吗?

So how are we going to do that, right?
那么我们该怎么做,对吧?

The way we’ll do this is we will do this as a link prediction task over a heterogeneous graph
我们将执行此操作的方式是将其作为异构图上的链接预测任务

Application:Biomedical Graphs

So let me now show you how this will work
现在让我向您展示这将如何工作

I will have two types of nodes
我将有两种类型的节点

I will have the circles, which are my proteins, and how proteins interact in the cell
我将了解谁是我的蛋白质以及蛋白质如何在细胞中相互作用的圈子

So this you can think of this
所以你可以想到这个

this is, this maps
这是这张地图

how different protein molecules in the cell come together for the cell to function
细胞中的不同蛋白质分子如何聚集在一起以使细胞发挥功能

and the way drugs work, right?
毒品的运作方式,对吗?

drugs are triangles and drugs work
毒品是三角形,毒品在起作用

so in such a way that they change the behavior of proteins
这样就改变了蛋白质的行为

so drugs target proteins
所以药物靶向蛋白质

So for every drug on the market
因此,对于市场上的每种药物

we actually know what protein it targets
我们实际上知道它靶向的蛋白质

which proteins’ behavior does it change, right?
它会改变哪种蛋白质行为,对吗?

So this drug, C, targets these four proteins
因此,该药靶向这四种蛋白质

Okay
好的

and then, now we have the PPI network
然后,我们有了PPI网络

we have the drugs now we need to encode side effects
我们现在有药物,我们需要编码副作用

and the way we can encode side effects
以及我们编码副作用的方式

is as a kind of labelled relationship between drugs, right?
就是作为药物之间带标签的关系,对吧?

so this would mean that drug C and M
所以这意味着药物C和M

if taken together
如果一起

lead to side effect r1, and there are around 1,000 different side effects
会导致副作用r1,而大约有1,000种不同的副作用

so there are about 1,000 different edge types up here
所以这里大约有1000种不同的边缘类型

and of course there can be multiple side effects on each edge
当然在每个边缘上都会有多种副作用

but if we have this graph and this representation
但是如果我们有这个图和这个表示

we can now go and formulate it as a link prediction task
我们现在可以将其公式化为链接预测任务

where essentially we can say, given a pair of drugs
从本质上讲,我们可以说给了一对药

what kind of side effects could we expect to happen
我们可以预期会发生什么样的副作用
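
The graph described above has two node types (proteins and drugs) and several edge types (protein-protein, drug-protein, and one drug-drug edge type per side effect). A minimal sketch of such a heterogeneous graph follows. The class `HeteroGraph`, the node names, and the edge-type labels such as `side-effect:r1` are hypothetical and chosen only for illustration; this is not the data structure used in the work the lecture describes.

```python
from collections import defaultdict


class HeteroGraph:
    """Tiny typed graph: every node has a type and every edge has a type."""

    def __init__(self):
        self.node_type = {}            # node -> "protein" or "drug"
        self.edges = defaultdict(set)  # edge type -> set of undirected node pairs

    def add_node(self, node, ntype):
        self.node_type[node] = ntype

    def add_edge(self, u, v, etype):
        self.edges[etype].add(frozenset((u, v)))

    def edge_types_between(self, u, v):
        """All edge types that connect u and v, in sorted order."""
        pair = frozenset((u, v))
        return sorted(t for t, pairs in self.edges.items() if pair in pairs)


g = HeteroGraph()
for p in ["p1", "p2", "p3", "p4"]:
    g.add_node(p, "protein")
for d in ["C", "M"]:
    g.add_node(d, "drug")

g.add_edge("p1", "p2", "protein-protein")  # proteins that interact in the cell
g.add_edge("p2", "p3", "protein-protein")
for p in ["p1", "p2", "p3", "p4"]:
    g.add_edge("C", p, "drug-protein")     # drug C targets four proteins
g.add_edge("M", "p3", "drug-protein")
g.add_edge("C", "M", "side-effect:r1")     # several side effects on one drug pair
g.add_edge("C", "M", "side-effect:r2")

print(g.edge_types_between("C", "M"))
```

In this representation, link prediction means: for a drug pair whose side-effect edges are unknown, predict which `side-effect:*` edge types should exist between the two drugs.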

and you can do these types of things
你可以做这些事情

and I will show you one way how this method has been validated is that
我将向您展示如何验证此方法的一种方法是

the model made 10 predictions that it was most certain about
该模型做出了10个最确定的预测

and for 5 out of those 10 predictions
在这10个预测中有5个

those were new side effects that were validated in the lab later, right?
这些是新的副作用,后来在实验室中进行了验证,对吧?

so this gave very high accuracy
所以这给了很高的准确性

Prediction Task

even though it’s a very hard in some sense prediction task
即使在某种意义上来说很难预测

where a prediction is link prediction between different drugs
预测是不同药物之间的链接预测

and when you say it’s not only whether they are connected
当你说不仅是它们是否连接

but also what type of connection it is
而且是什么类型的连接

so you need to predict 1,000 different numbers
所以你需要预测1,000个不同的数字

whatever for every edge
无论什么边缘
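
To show what "predict 1,000 different numbers for every edge" can look like in code, here is a sketch that scores one drug pair against every side-effect type at once. It assumes each drug already has an embedding vector and uses a DistMult-style score (an elementwise product weighted per relation, passed through a sigmoid). That scoring function is a stand-in chosen for brevity, the embeddings are random rather than learned, and all names are hypothetical, so the printed ranking carries no meaning.

```python
import math
import random


def side_effect_scores(z_u, z_v, relation_weights):
    """One probability per side-effect type for the drug pair (u, v).

    score_r = sigmoid(sum_k z_u[k] * w_r[k] * z_v[k])
    """
    scores = []
    for w_r in relation_weights:
        logit = sum(a * w * b for a, w, b in zip(z_u, w_r, z_v))
        scores.append(1.0 / (1.0 + math.exp(-logit)))
    return scores


rng = random.Random(0)  # fixed seed so the sketch is reproducible
dim, num_side_effects = 8, 1000

# Random stand-ins for learned drug embeddings and per-side-effect weights.
z_c = [rng.gauss(0, 1) for _ in range(dim)]
z_m = [rng.gauss(0, 1) for _ in range(dim)]
weights = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_side_effects)]

scores = side_effect_scores(z_c, z_m, weights)

# Indices of the 10 side-effect types the (untrained) model is most certain about.
top10 = sorted(range(num_side_effects), key=lambda r: scores[r], reverse=True)[:10]
print(len(scores), top10)
```

With a trained model, the highest-scoring entries would be the predictions to check first, in the spirit of the validation of the 10 most confident predictions mentioned above.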

So that’s what I wanted to say about some examples of what we
这就是我想说的关于我们的一些例子

what we will do in this class and what will be able to achieve now
我们将在本节课中做什么以及现在将能够实现什么

I wanna switch gears and I wanna talk briefly about the logistics for the course
我想换档,我想简单谈一下课程的物流

Are there any questions?
有没有问题?

[Student]
[学生]

Great question
好问题

so the question was right now we only did two drugs
所以问题是现在我们只做了两种药物

so we were able to do this kind of link prediction between a pair
所以我们能够在一对之间进行这种链接预测

of drugs. As I will show you later
我稍后会告诉你的是什么

there are graphs
有图

and there are what is called hyper graph, and in hyper graph you can have edges
还有所谓的超图,在超图中,您可以拥有边

that connect more than two nodes so you could represent everything as a hyper graph
连接两个以上的节点,因此您可以将所有内容表示为一个超图

and then you could start making predictions over triples of nodes
然后可以开始对节点三元组进行预测

but you know computation
但是你知道计算

If you think about that computationally
如果您以计算方式考虑

it would be more expensive
那会更贵

So here it was only focused on pairs
所以这里只专注于对
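
A hypergraph can be sketched as a mapping from sets of nodes (of any size) to labels, where an ordinary edge is just the size-2 case. The example below also counts candidate pairs versus candidate triples for roughly 5,000 drugs, which illustrates why moving beyond pairs is computationally more expensive. The drug names, side-effect labels, and the helper `side_effects_of` are hypothetical.

```python
from math import comb

# A hyperedge is a set of nodes of any size; a size-2 set is an ordinary edge.
hyperedges = {
    frozenset({"drug_A", "drug_B"}): {"side_effect_1"},
    frozenset({"drug_A", "drug_B", "drug_C"}): {"side_effect_2"},
}


def side_effects_of(drugs):
    """Labels attached to the hyperedge over exactly this set of drugs, if any."""
    return hyperedges.get(frozenset(drugs), set())


print(side_effects_of(["drug_B", "drug_A"]))            # the pair
print(side_effects_of(["drug_C", "drug_A", "drug_B"]))  # the triple

num_drugs = 5_000
print(comb(num_drugs, 2))  # 12497500 candidate pairs
print(comb(num_drugs, 3))  # 20820835000 candidate triples
```

Going from pairs to triples multiplies the number of candidates by more than a thousand here, which matches the remark that prediction over triples would cost more.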

but great point, why?
但是好点,为什么呢?

good, yes, go ahead
好,是的,继续

[Student]
[学生]

So the question was why would I even care about mapping out this graph?
所以问题是为什么我什至还要关心映射此图?

Why wouldn’t I just fit
我为什么不适合

the data and let the model automatically figure it out
数据并引导模型自动找出来

Why?
为什么?

What’s the problem with that?
这是什么问题?

The problem is that you know that you don’t have enough data
问题是您知道没有足够的数据

You will never have enough data
您将永远没有足够的数据

You talk to people from Google
您与Google的人聊天

and they say their main problem is they don’t have enough data, right?
他们说他们的主要问题是他们没有足够的数据,对吗?

which is kind of right, so you don’t have enough data
哪种权利使您没有足够的数据

So why do you do this?
那为什么要这么做呢?

You do this because you are in some sense capturing prior knowledge, right?
之所以这样做,是因为您在某种意义上是在获取先验知识,对吗?

hundreds of man-years went into identifying and mapping out this network
识别和绘制这个网络花费了数百人年的工作量

But this network is also given to you, right?
但是这个网络也可以给你,对吗?

so you’re essentially exploiting prior knowledge to make the model be efficient with the amount of data you have
因此,您实质上是在利用先验知识来利用所拥有的数据量来提高模型的效率

Right?
对?

and if you think about the side effects
如果考虑副作用

That is
那是

we don’t have all the side effects
我们没有所有的副作用

we only know a small number and you cannot collect more
我们只知道一小部分,而您不能收集更多

because you’d have to start giving random drugs to people and see what happens to them
因为你必须开始随机地给人们服药,看看他们发生了什么

and that’s kind of ethically and otherwise
这在道德上是不道德的

maybe not the best idea, right?
也许不是最好的主意,对不对?

So you wanna create the graph
所以你想创建图

[Student]
[学生]

Let me give you some examples and you will see why this is a good idea
让我给你举一些例子,你会明白为什么这是一个好主意

So let me give you some examples later, alright
所以让我稍后给你一些例子,好吧

Let me continue and tell you about the class
让我继续讲讲课程


So for the class I’ll be the instructor and then we have a great TA team together with Michele
因此,在课程中,我将担任讲师,然后我们将与Michele一起组成一支出色的TA团队

who will be my co-instructor and he will teach part of the course, the TA team is here
他将是我的联合讲师,他将教授课程的一部分,助教团队在这里

You guys can stand up and wave. OK, great.
你们可以站起来挥手。好的太棒了。

They are nicely waving, super
他们很好地挥舞着,超级

so one thing I will say that is
所以我要说的是

I think very important is we are here to help you learn, right?
我认为非常重要的是我们在这里可以帮助您学习,对吗?

so we are here to help you learn, and kind of in some sense
所以我们在这里可以帮助您学习,并且在某种意义上

learn as much as possible
尽可能多地学习

and the way the class is
以及上课的方式

Yes

In some sense
在某种意义上

things will be hard
事情会很难

but we are here to help you, right?
但是我们在这里可以帮助您,对吧?

so everyone who is here is committed to helping you, to making you learn
所以我们所有人都致力于让您,帮助您,让您学习

We will be holding office hours
我们将举行办公时间

where you will be able to come and ask us questions
您可以去哪里问我们问题

we will be able to, if you are an SCPD student
如果您是SCPD学生,我们将能够

you’ll be able to talk to us remotely
您将可以与我们远程对话

if need be
如果需要的话

we can also do and hold office if there will be interest
如果有兴趣,我们也可以做并担任办公室

hold office hours over the weekend and things like that
在周末上班时间之类的

So basically
所以基本上

what I’m trying to say is we are really here to help you
我想说的是我们真的在这里为您提供帮助

so you will have to put in energy and work, and we’ll help you to
因此您将不得不投入精力和工作,并将帮助您

achieve things that you don’t think you can do today
实现您今天认为无法做到的事情

so it will be kind of a win win situation
所以这将是双赢的局面

So that’s the first thing I wanted to say
这是我想说的第一件事

The second thing I wanted to give you is the outline for the course
我想给你的第二件事是课程大纲

so what will happen is we will have 19 lectures
所以将会发生的是,我们将举行19场讲座

We have a 10-week quarter, so there will be 19 lectures
我们每个季度有10周的时间,因此将有19堂课

from the lecture today to the last lecture on December 5
从今天的演讲到12月5日的最后一次演讲

we will have an exam on November 19
我们将在11月19日进行考试

This is a Thursday
这是星期四

So the lecture on that Thursday
所以那个星期四的演讲

Sorry
抱歉

That’s a Tuesday
那是星期二

The lecture on that Tuesday won’t happen
那个星期二的演讲不会举行

but we will have a two hour long exam in the evening
但是晚上我们要进行两个小时的考试

we will announce the details
我们将宣布细节

the way we will structure the course is
我们构建课程的方式是

we will talk about algorithms for analyzing networks that’s kind of in blue
我们将讨论用于分析网络的算法

We’ll talk about statistical machine learning on networks
我们将讨论网络上的统计机器学习

this is the kind of the middle part of the course
这是课程的中间部分

and then we will also talk more kind of on about
然后我们还将讨论更多有关

applications, with lectures focused on the use cases across those applications
公共应用程序和讲座侧重于这些应用程序的用例

So that’s kind of the high level of structure and
所以这是一种高层次的结构

especially in this part
特别是在这部分

we will be covering some of the very much kind of state of the art work and papers
我们将介绍一些最先进的艺术作品和论文

and research that has only recently been published
研究只是最近才发表

but we will make things very accessible
但我们将使事情变得非常容易

so we’ll do our best to explain things to you very clearly
所以我们会尽力向您解释清楚

As I said early on, a big part
正如我早先所说的那样

of the class we are reengineering and many lectures will be
我们正在重新设计的班级中,许多讲座将

kind of brand new and taught for the first time
您将是全新的,并且是第一次学习

so please give us feedback and please kind of
所以请给我们反馈,并请

sometimes things may not be perfect, so have a bit of patience with us
有时候事情可能并不完美,请对我们多一点耐心

but we really want your feedback because we wanna
但我们真的很想得到您的反馈,因为我们想

improve this lecture so that next year things are even better
改进本讲座,以便明年情况更好

So that’s the idea here
这就是这里的想法

now in terms of the course logistics there is the course website
现在在课程后勤方面有课程网站

and basically on the course website
基本上在课程网站上

what we have is, we have the readings
我们有的是,我们有读数

which is basically we post the slides before the lecture
基本上是我们在讲座前发布幻灯片

and then I usually update them after the lecture
然后我通常在演讲后更新它们

so slides are posted
所以幻灯片被发布了

let’s say half an hour before the lecture
比方说讲课前半小时

the latest and after the lecture
最新和演讲之后

based on your questions and feedback
根据您的问题和反馈

I may quickly update them and repost them so that there will be a new version after the lecture if needed
我可能会迅速更新它们并重新发布它们,以便在讲座后如有需要,将有新版本

Readings here are research papers posted on the website for every topic
阅读,这是网站上针对每个主题发布的研究论文

This will be very useful for you when you work on the course project
当您从事课程项目时,这对您将非常有用

because for the course project you wanna have a set of related work
因为对于课程项目,您想要进行一系列相关的工作

a set of research work that has already been done on the topic
关于该主题的一组研究工作已经完成。

and this would be a very good kind of starting point to start thinking about the project
这将是开始考虑该项目的一个很好的起点

or if you are more interested about a given topic
或者如果您对给定主题更感兴趣

these optional readings will be very useful to you
这个可选的读物对您将非常有用

so that’s about the website
那就是关于网站

Please check the website
请检查网站

Slides will be posted there before the class
幻灯片将在上课之前张贴在那里

Then
然后

in terms of communication
在交流方面

like the majority of Stanford courses, we will be using Piazza
因为斯坦福大学的大部分课程都将使用广场

please use your Stanford email address
请使用您的斯坦福大学的电子邮件地址

to register to Piazza
注册到广场

please participate and help each other
请参与并互相帮助

please don’t post code
请不要发布代码

please annotate your questions
请注释您的问题

search for answers before you ask
在问之前先寻找答案

Please help each other answering each other’s questions
请帮助彼此回答对方的问题

and what we will do is we’ll do our best to monitor Piazza constantly
因此,我们将竭尽所能不断监控广场

and answer questions as quickly as possible
并尽快回答问题

So usually our average response time is about 20 minutes and we’ll try to keep it that way
因此,通常我们的平均响应时间约为20分钟,我们会尽量保持这种状态

but this means we really have to be diligent on our end
但这意味着我们真的必须对我们的和

but if you help each other
但是如果你们互相帮助

it helps us keep that time even lower
它可以帮助我们缩短时间

Right

but that means that you guys are not stuck
但这意味着你们不会被困住

but you can make progress faster
但是你可以进步更快

so please help each other
所以请大家互相帮助

We will give extra credit for people who answer questions on Piazza
对于在广场上回答问题的人,我们将给予额外的奖励

If you wanna reach anyone from the course
如果您想达到课程中的任何一个

staff, meaning me
工作人员意味着我

or professor or the TAs, don’t email us individually
或教授或助教,请勿给我们发送电子邮件

Always use this mailing list, right?
始终使用此邮件列表,对吗?

So whenever you wanna communicate with someone that is in the course
因此,只要您想与课程中的某人交流

don’t email them individually
不要单独给他们发送电子邮件

use the course mailing list
使用课程邮件列表

Use this staff mailing list and we will get the question and we’ll be able to respond
使用此工作人员邮件列表,我们会收到问题,我们将能够进行答复

We will post course announcements on Piazza
我们将在广场上发布课程公告

so make sure to check it regularly
所以一定要定期检查

so that’s about Piazza
那就是关于广场

About work, for the course we have three big parts
关于工作,本课程分为三大部分

we have homeworks; we will have three large homeworks
我们有作业我们将有三个大型作业

and then you also have a starter homework called Homework 0
然后您还有一个名为Homework 0的入门功课

So each of the large homeworks will be worth about 10 percent of the final grade
因此,大型家庭作业的价值将是最终成绩的10%

or 9.6, so that Homework 0 is worth 1% of the grade
或9.6,以使作业0的成绩达到学分的1%

and the idea for Homework 0
和作业的想法0

is just for you to install the software and spend
只供您安装软件并花费

I know an hour and it should be done so it should be easy
我知道一个小时,应该这样做,所以应该很容易

and if Homework 0 is not easy
如果作业0不容易

then please come talk to us
那请跟我们谈谈

OK

So Homework 0 is not meant to be rocket science
所以作业0并不意味着要成为火箭科学

It’s meant to be easy to get you started
入门很容易

Talk to us if it’s not like that
跟我们聊聊

then the exam that will be as I said on November 19
然后是我11月19日所说的考试

will be the other 30% of the grade
将是该年级的其他30%

And then the course project is a substantial part of your course grade
然后,课程项目是您课程成绩的重要组成部分

it’s 40% of the course grade,
它占课程成绩的40%,

And this will be broken down into the proposal, the project milestone,
这将在提案项目里程碑中分解

then there is the final report
然后是最终报告

and then there will be the poster presentation, which you all will attend to present your posters
然后将有海报展示,大家都将参加以展示海报

And

as I said
就像我说的

we will be generous with extra credit for the people who go beyond
我们将为超越范围的人们提供慷慨的信誉

what’s expected in terms of Piazza participation code contribution
广场参与代码贡献的期望值

dataset contributions
数据集贡献

identifying issues in the lecture notes things like that so
在讲课中确定问题要注意类似的事情

and this will be used if you are at the boundary between the grades to kind of bump you up
如果您处于年级之间的边界以提高您的水平,则将使用此选项

now about homeworks and write-ups
现在说说作业和书面报告

The three big homeworks are relatively big
三大功课比较大

so they take between 10 and 20 hours of work each
因此他们每个人要花10到20个小时

and this means that you need to start early and please plan for that
这意味着您需要尽早开始,请为此进行计划

and this homework should be a combination of data analysis
这个作业应该是数据分析的结合

algorithm design, and mathematical derivations
算法设计与推导数学

and generally they will be due on Thursdays at midnight Pacific time
通常他们定于太平洋时间的周四午夜

OK

you will be using Gradescope to submit your homeworks
您将使用Gradescope提交作业

write-ups as well as your code
包括书面报告以及您的代码

and please read carefully the instructions on how to do this
请仔细阅读有关操作方法的说明

because we need to be able to grade this
因为我们需要能够给它评分

So every answer should start on a separate page and things like that
因此,每个答案都应在单独的页面上开始,诸如此类

It’s all written on the website
全部写在网站上

As I say it
正如我所说的

you need to submit code as well as the write-up
您需要提交代码以及书面报告

both through Gradescope. For code,
两者都通过Gradescope提交。至于代码,

we don’t go and run your code
我们不去运行您的代码

but we take as I will show you later
但是我们以我稍后会告诉你的

the Stanford Honor code very strictly
斯坦福荣誉守则非常严格

so we check for duplication and plagiarism
所以我们检查出重复和抄袭

I’ll talk about that later
以后再说

and then every student gets two late periods for the quarter
然后每个学生在本学期有两次延期机会

and what the late period means is that rather than handing
延期的意思是,您不必在

something in on a Thursday night, you can wait till Monday night to hand it in, OK?
周四晚上提交,而可以等到周一晚上再交,好吗?

so you can take one late period per assignment
所以每次作业您可以使用一次延期

you can use one late period
您可以使用一次延期

per assignment. The only place where we don’t allow late periods is the final report
每次作业一次。我们唯一不允许延期的是最终报告

If an entire team wants to be late
如果整个团队都想迟到

every member of the team will be charged one late period
团队的每个成员将被收取一个后期费用

But you can use this or not
但是你可以使用这个

However
然而

however
然而

you like
你喜欢

but basically we allow it to be three days late
但基本上我们允许它晚三天

in some sense
在某种意义上

right rather than Thursday
而不是星期四

you submit on a Monday on two of these things that you have to submit
您在星期一提交了其中两项必须提交的内容

So for the exam
所以对于考试

what I wanted to say is on the November 19
我想说的是11月19日

we’ll have the exam in the evening
我们晚上要考试

Duration will be around two hours
持续时间约两个小时

It will basically cover the content up to and including the lecture on November 14, and the exam is open book, so
它基本上涵盖了所有内容,包括11月14日的讲座和考试是公开的,因此

and basically this will be exercises where you will have to explain solution
基本上,这将是练习,您必须在其中解释解决方案

solutions and process
解决方案和过程

do some derivations and things like that
他们做一些推导和类似的事情

We’ll give you examples
会给你例子

One thing I wanted to mention is especially for people
我想说的一件事特别是对人来说

who are at Stanford for the first time, is the Stanford Honor Code
斯坦福荣誉守则的人第一次来斯坦福

So please make sure to to read what Stanford Honor Code says
因此,请务必阅读斯坦福荣誉准则所说的内容

It’s basically a document that was written in 1921
这基本上是1921年写的文件

and this university has functioned by the principles of that document since then
从那以后,这所大学一直按照其文件原则运作

so please really make sure to read it
所以请务必确保阅读

I copied some examples from the Web page describing the honor code
我从网页上复制了一些描述荣誉代码的示例

what kind of things are not allowed?
不允许什么样的事情?

And the site says that a standard sanction for a first time offense
该网站说,这是对初犯的标准制裁

is a one-quarter suspension and forty hours of community service
是停学一个学期,外加四十小时的社区服务

Right

So these things are quite serious
所以这些事情很严重

So
所以

What I would say is you are very welcome to work with each other
我要说的是,非常欢迎你们彼此合作

You

every person on the team for the assignments has to be able to produce the solution by themselves
团队中的每个人都必须能够自己制定解决方案

But you can talk to each other and you
但是你可以互相交谈,你

but you have to acknowledge people that you talk to on your homework
但您必须承认在作业中与之交谈的人

And it’s very important to acknowledge because
承认这一点非常重要,因为

if our plagiarism detection software on homeworks or on code
如果我们在家庭作业或代码上的抄袭检测软件

comes up with something and identifies certain plagiarism
发现了问题并识别出抄袭行为

it’d be very hard for us to do anything but to give this to the
除了将其交给

office of Community standards and then really bad things can happen
社区标准办公室,然后真的可能发生坏事

and nobody wants that
没有人想要

So just please read this, obey it
所以,请阅读此服从

and everyone’s life will be better
大家的生活会更好

Alright
好的

so that’s what I wanted to say about honor code
这就是我想说的荣誉代码

I have a few more things I wanted to say about course projects
关于课程项目,我还有其他要说的话

essentially they are
本质上他们是

they can be on many different topics involving networks
他们可以是涉及网络的许多不同主题

anything from empirical analysis to algorithms and machine learning models
从实证分析到算法和机器学习模型

scalable algorithms
可扩展的算法

all the way to theory and mathematics of networks
一直到网络理论和数学

It is done in groups of up to three students
最多由三个学生组成的小组

You can also do groups of one or two
您也可以做一两个小组

and we’ll take the size of the group into account
我们会考虑小组的规模

but we would really say that a group of three is
但是我们真的会说三个

most optimal because you can kind of help each other and work together, and it’s more efficient
最理想的,因为您可以互相帮助,并在效率更高的情况下一起工作

project is an important part of the class
项目是课堂的重要组成部分

We will help you with the ideas
我们会帮助您的想法

data and mentoring
数据和指导

and you can start thinking about this now
您现在就可以开始考虑

another thing to say
另一件事要说

because many of you are taking many different courses that have projects
因为你们中的许多人正在上很多有项目的不同课程

it’s okay to combine projects
可以合并项目

Just please be very
请非常

very clear ahead of time that you are also proposing part of the project somewhere else and say
提前很清楚地知道您也在其他地方提议项目的一部分,并说

what will you do for this class?
您将为这堂课做什么?

what are you going to do for some other class?
你要为其他班做什么?

Ah

note that we will have a poster session on December 12 from noon to 3 p.m.
请注意,我们将于12月12日中午至下午3点举行海报发布会

this is the final exam time for this for this class
这是本课程的最后考试时间

So this means that everyone
所以这意味着每个人

everyone who is taking this class should be free at this time to attend the poster session
参加此课程的每个人此时都应有空参加海报发布会

If that’s not the case
如果不是这样

come talk to us
来和我们谈谈

but our expectation is that everyone will attend the poster session
但是我们希望每个人都可以参加发布会

everyone presenting their work
所有展示他们工作的东西

One thing that I am amazingly excited about this year is that we will actually give you computing infrastructure
我今年感到非常兴奋的一件事是实际上将为您提供计算基础架构

So we’ve been able to talk to Google
这样我们就可以与Google交流了

so this class is around 300 students
因此这堂课大约有300名学生

so you can compute how much money is that
这样你就可以计算出多少钱

But so Google was very generous to us
但是Google对我们非常慷慨

and basically we will be able to offer 1,800 dollars of cloud
基本上我们将能够提供1800美元的云

Google Cloud credits for you to do your projects
Google Cloud功劳可助您完成专案

so this means that you will all have essentially the same hardware
因此,这意味着您将拥有基本上相同的硬件

at your disposal to do your projects
其他人去做你的项目

So we have in some sense democratized that
所以从某种意义上说我们一直在民主化

and give access to everyone to be able to use Google Cloud credits to do to do projects
并允许所有人使用Google Cloud积分来做项目

I think this will be great for you
我认为这对您来说很棒

because you will also learn how to use these types of cloud services
因为您还将学习如何使用这些类型的云服务

how to reserve machine
如何预订机器

how machines
机器如何

how to do the analysis, and as you know the industry is moving to the cloud
如何进行分析以及与行业一样向云迁移的独特性

This should be very good experience
他们应该是很好的经验

Note that we are doing this for the first time
请注意,我们是第一次这样做

And also
并且

it’s never been done at Stanford at such a scale, so there might be some hiccups
在斯坦福大学从来没有做过这么大的规模,所以可能会有一些小问题

Please work with us and be patient
请与我们合作并耐心等待

We don’t know what will go wrong
我们不知道会出什么问题

but I’m sure something will. So Michele
但我确定总会有问题出现。所以Michele

will give you a Google Cloud tutorial on Friday
将在周五为您提供Google Cloud教程

and please refer to him
并请他

if you have any questions or any problems
如果您有任何疑问或任何问题

we are here to help you and debug
我们在这里可以帮助您进行调试

so this is this is something we are extremely excited about
所以这是我们非常兴奋的事情

And then course schedule
然后安排课程

here are the weeks
这是一周的

here are the things that will be that you will have to hand in
这是您必须要交的东西

and these are these are the due dates
这些是到期日

The only place, the only thing where you cannot take
唯一的地方,唯一的没有你不能去的地方

a late period is the final project report
后期是最终项目报告

which will be due on December 10 again at midnight Pacific time
将于12月10日太平洋时间午夜再次到期

So that means 11:59 p.m., and then
就是说晚上11:59,然后

as I mentioned
如我所说

poster sessions in between
之间的海报会议

there are the three homeworks project proposal, milestone
有三个家庭作业项目建议书,里程碑

and then the final project report
然后是最终项目报告

The last thing I want to quickly say is about prerequisites
我想快速说的最后一件事是关于先决条件

so our goal is to kind of cover everything from ground up
所以我们的目标是从头开始涵盖所有内容

so in some sense
所以在某种意义上

no single topic will be too hard by itself
没有一个主题本身会太难

But what will maybe be hard is that we will draw upon and cover many different topics
但可能比较难的是,我们会借鉴并涵盖许多不同的主题

and will draw ideas from many parts of computer science
并将从计算机科学的许多方面产生新的想法

So it’s important to have good background in algorithms and graph theory
因此,拥有良好的算法和图论背景非常重要

probability statistics and linear algebra, right?
概率统计和线性代数,对吗?

So understanding matrices
所以了解矩阵

understanding probabilities
了解概率

and knowing graph algorithms is a prerequisite
并且了解图算法是一个好前提

And then
然后

in terms of programming
在编程方面

you should be able to write nontrivial programs
您应该能够编写非平凡的程序

What that means in practice is you should have some experience writing this much code
在实践中,这意味着您应该有一定的经验来编写尽可能多的代码

if you have written a program this long
如果您已经编写了这么长时间的程序

I think you’ll be fine
我想你会没事的

But not font size 32, OK?
但是不是32号字体,好吗?

I don’t know, so one hundred lines, something like that
我知道一百行之类的东西

and to bring everyone up to speed
让每个人都快

We will hold two recitation sessions that will be recorded
我们将举行两次朗诵课,并将其记录下来

One will be about software
一个与软件有关

We will use this library called Snap.py in Python
我们将在Python中使用名为Snap.py的库

and we’ll tell you about Google Cloud
我们会告诉您有关Google Cloud的信息

This will be this coming Friday at 3 p.m. in the Skilling Auditorium
这将在下周五的下午3点在技术大礼堂举行

Again
再次

it will be recorded and accessible online
它将被记录并在线访问

and then we’ll do a review of theory in terms of probability
然后我们将根据概率对理论进行回顾

linear algebra and proof techniques
线性代数和证明技术

and this would be then next week
然后是下周

On Friday
在周五

October 4 again at 3 p.m.
10月4日下午3点再次

And the TAs will give these lectures
助教将进行这些讲座

and our goal is kind of to cover things that we expect you to know
我们的目标是涵盖我们希望您知道的事情

Now
现在

given that we’ll be working with networks, we need libraries; the library we’ll be using is called Snap.py
鉴于我们要处理网络,我们需要库;我们将使用的库叫做Snap.py

so it’s essentially a piece of software that we are developing in our group
所以这实际上是我们小组中正在开发的软件

there are two versions of it
有两个版本

there is the C++-based core
有基于C ++的核心

and then there is a Python frontend for it
然后有一个Python前端

So essentially you have something that is very scalable
所以从本质上讲,您拥有非常可扩展的功能

that has a very scalable C++ backend
具有非常可扩展的C ++后端

but you can still use in Python
但您仍然可以在Python中使用

and you are of course also free to use other tools
而且您当然也可以自由使用其他工具

like NetworkX or others
像networkX或其他

But we don’t support this
但是我们不支持

we’ll be able to debug you in using these tools
我们将能够使用这些工具调试您

for this we cannot
为此,我们不能

OK

so that’s what I wanted to say about the class
所以这就是我要说的

Did I forget anything, or any questions?
我忘记了什么或任何疑问吗?

The question is
问题是

is there, are there examples of previous course projects?
那里有以前课程的例子吗?

Yes

there are so for every year
每年都有

If you go to the course website
如果您访问课程网站

there should be links to the course website from previous years and for every year we publish all the
应该有前几年课程网站的链接,并且每年我们都会发布所有

all the older course projects
所有较旧的课程项目

so you can use that to get ideas and then on the website you will also see that when we ask you what to write
因此您可以使用它来获取想法,然后在网站上,当我们问您要写些什么时,您也会看到

we give you examples
我们给你例子

I think, of four different project proposals and project milestones
我想到了四个不同的项目提案项目里程碑

When we say this is what we
当我们说这就是我们

what we thought was a good project
我们认为这是一个好项目

so you can use previous projects to get ideas
因此您可以使用以前的项目来获得想法

and there are especially selected prior projects on the course website
并且课程网站上特别精选了一些先前的项目

that will show you kind of what do we expect from you for the proposal
这将向您展示我们对提案的期望

and so on
等等

and so forth
依此类推

Ok? Sounds great.
好?听起来不错。

There was something there on the back
后面有东西

Yes

Do you have documentation for Snap.py
您是否有Snap.py的文档

Great

The question is do we have documentation for Snap.py?
问题是我们是否有Snap.py的文档?

Yes

Of course
当然

So what we did a few years ago is part of one homework
所以我们几年前所做的是一项家庭作业的一部分

every student in the class wrote documentation for one function
班上的每个学生都为一个功能编写了文档

And it was done in three days
它在三天内完成

so it was great
所以很棒

So it’s really good documentation
所以这真的是很好的文档

If it’s not good
如果不好

it’s not my fault
这不是我的错

But there is documentation
但是有文档

yes, prepared by Stanford students
是的,由斯坦福大学的学生准备

and it’s good
这很好

I think it’s really good
我觉得真的很好

OK