LG - Machine Learning CV - Computer Vision CL - Computation and Language AS - Audio and Speech RO - Robotics (* marks items worth particular attention)

1、[AI] Counter-Strike Deathmatch with Large-Scale Behavioural Cloning

T Pearce, J Zhu
[Tsinghua University]
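Large-scale behavioural cloning for Counter-Strike deathmatch. The paper proposes an AI agent that plays the first-person shooter CSGO from pixel input alone. The agent, a deep neural network, matches the game's rule-based bots at medium difficulty in the deathmatch game mode while adopting a humanlike play style. Unlike many earlier games, CSGO has no API, so algorithms must train and run in real time. This limits how much on-policy data can be generated and rules out many reinforcement learning algorithms. The authors instead use behavioural cloning: they scrape human play from online servers, mainly gameplay footage captured by joining games as a spectator, train on this large noisy dataset (4 million frames, 70 hours, comparable in size to ImageNet), and then fine-tune on a smaller dataset of high-quality expert demonstrations. This scale is an order of magnitude larger than previous imitation learning work in FPS games. The resulting agent performs slightly below the strongest players (the top 10% of CSGO players) in aim-training mode, and reaches the level of the built-in AI (CSGO's rule-based bots) at medium difficulty in deathmatch mode. It is one of the largest behavioural cloning efforts in games to date, and among the first to tackle a modern video game without API support.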

This paper describes an AI agent that plays the popular first-person-shooter (FPS) video game 'Counter-Strike: Global Offensive' (CSGO) from pixel input. The agent, a deep neural network, matches the performance of the medium difficulty built-in AI on the deathmatch game mode, whilst adopting a humanlike play style. Unlike much prior work in games, no API is available for CSGO, so algorithms must train and run in real time. This limits the quantity of on-policy data that can be generated, precluding many reinforcement learning algorithms. Our solution uses behavioural cloning - training on a large noisy dataset scraped from human play on online servers (4 million frames, comparable in size to ImageNet), and a smaller dataset of high-quality expert demonstrations. This scale is an order of magnitude larger than prior work on imitation learning in FPS games.
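To make the two-stage recipe concrete, here is a minimal sketch of behavioural cloning as plain supervised learning on (observation, action) pairs, first on noisy data and then on a small expert set. It is a toy under stated assumptions, not the paper's method: the real agent is a deep network on raw pixels, whereas this uses a linear softmax policy, and the two-feature observations, the two actions, the learning rates, and the names `train_bc` and `act` are all invented for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def logits_for(weights, obs):
    """Linear policy: one weight row per discrete action."""
    return [sum(w * o for w, o in zip(row, obs)) for row in weights]

def train_bc(weights, demos, lr, epochs):
    """Behavioural cloning: SGD on cross-entropy against the demonstrated action."""
    for _ in range(epochs):
        for obs, action in demos:
            probs = softmax(logits_for(weights, obs))
            for a, row in enumerate(weights):
                # Gradient of cross-entropy w.r.t. logit a is (p_a - one_hot_a).
                grad = probs[a] - (1.0 if a == action else 0.0)
                for i, o in enumerate(obs):
                    row[i] -= lr * grad * o
    return weights

def act(weights, obs):
    """Greedy action: index of the largest logit."""
    logits = logits_for(weights, obs)
    return max(range(len(logits)), key=logits.__getitem__)

# Hypothetical toy data: observation = [enemy_left, enemy_right];
# action 0 = turn left, action 1 = turn right.
noisy_demos = [([1.0, 0.0], 0), ([0.0, 1.0], 1), ([1.0, 0.0], 0),
               ([1.0, 0.0], 1),  # a mislabeled (noisy) frame
               ([0.0, 1.0], 1)]
expert_demos = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]

policy = [[0.0, 0.0], [0.0, 0.0]]                    # 2 actions x 2 features
train_bc(policy, noisy_demos, lr=0.1, epochs=50)     # stage 1: large noisy dataset
train_bc(policy, expert_demos, lr=0.05, epochs=20)   # stage 2: fine-tune on expert data
```

Calling `act(policy, [1.0, 0.0])` then picks the action the demonstrations favoured for that observation. No environment interaction is needed during training, which is why the approach suits a game that cannot be run faster than real time.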

[Figures 1-5, 爱可可AI前沿推介 (4.18)]

2、[CL] QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering

M Yasunaga, H Ren, A Bosselut, P Liang, J Leskovec
[Stanford University]

The problem of answering questions using knowledge from pre-trained language models (LMs) and knowledge graphs (KGs) presents two challenges: given a QA context (question and answer choice), methods need to (i) identify relevant knowledge from large KGs, and (ii) perform joint reasoning over the QA context and KG. Here we propose a new model, QA-GNN, which addresses the above challenges through two key innovations: (i) relevance scoring, where we use LMs to estimate the importance of KG nodes relative to the given QA context, and (ii) joint reasoning, where we connect the QA context and KG to form a joint graph, and mutually update their representations through graph-based message passing. We evaluate QA-GNN on the CommonsenseQA and OpenBookQA datasets, and show its improvement over existing LM and LM+KG models, as well as its capability to perform interpretable and structured reasoning, e.g., correctly handling negation in questions.
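The joint-reasoning idea can be sketched as a small graph computation: add a QA-context node to the KG subgraph, then run one round of message passing in which each neighbour's contribution is weighted by its relevance score. This is only an illustration under assumptions. The relevance numbers are hand-set stand-ins for LM scores, node features are single floats where the model would use vectors, the context node is linked only to entities assumed to appear in the question, and `build_joint_graph` and `message_pass` are hypothetical names.

```python
def build_joint_graph(kg_edges, linked_entities, context_node="QA_CONTEXT"):
    """Undirected adjacency dict with the QA-context node attached to the linked entities."""
    adj = {}

    def link(a, b):
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    for a, b in kg_edges:
        link(a, b)
    for entity in linked_entities:
        link(context_node, entity)
    return adj

def message_pass(adj, feats, relevance):
    """One round: each node takes a relevance-weighted mean of itself and its neighbours."""
    updated = {}
    for node, neighbours in adj.items():
        total, weight = feats[node], 1.0
        for n in neighbours:
            r = relevance.get(n, 1.0)   # nodes without a score (the context node) get weight 1
            total += r * feats[n]
            weight += r
        updated[node] = total / weight
    return updated

# Hypothetical example for a question mentioning "penguin" and "fly".
kg_edges = [("penguin", "bird"), ("bird", "fly")]
adj = build_joint_graph(kg_edges, linked_entities=["penguin", "fly"])
feats = {"QA_CONTEXT": 1.0, "penguin": 0.0, "bird": 0.0, "fly": 0.0}
relevance = {"penguin": 0.9, "fly": 0.8, "bird": 0.3}   # stand-ins for LM relevance scores
updated = message_pass(adj, feats, relevance)
```

After one round the context signal has reached `penguin` and `fly` but not yet `bird`, which sits two hops from the context node; further rounds would propagate it onward.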


3、[LG] Meta-Learning Bidirectional Update Rules

M Sandler, M Vladymyrov, A Zhmoginov, N Miller, A Jackson, T Madams, B A y Arcas
[Google Research]

In this paper, we introduce a new type of generalized neural network where neurons and synapses maintain multiple states. We show that classical gradient-based backpropagation in neural networks can be seen as a special case of a two-state network where one state is used for activations and another for gradients, with update rules derived from the chain rule. In our generalized framework, networks have neither explicit notion of nor ever receive gradients. The synapses and neurons are updated using a bidirectional Hebb-style update rule parameterized by a shared low-dimensional “genome”. We show that such genomes can be meta-learned from scratch, using either conventional optimization techniques, or evolutionary strategies, such as CMA-ES. Resulting update rules generalize to unseen tasks and train faster than gradient descent based optimizers for several standard computer vision and synthetic tasks.


4、[CL] Sparse Attention with Linear Units

B Zhang, I Titov, R Sennrich
[University of Edinburgh & University of Zurich]

Recently, it has been argued that encoder-decoder models can be made more interpretable by replacing the softmax function in the attention with its sparse variants. In this work, we introduce a novel, simple method for achieving sparsity in attention: we replace the softmax activation with a ReLU, and show that sparsity naturally emerges from such a formulation. Training stability is achieved with layer normalization with either a specialized initialization or an additional gating function. Our model, which we call Rectified Linear Attention (ReLA), is easy to implement and more efficient than previously proposed sparse attention mechanisms. We apply ReLA to the Transformer and conduct experiments on five machine translation tasks. ReLA achieves translation performance comparable to several strong baselines, with training and decoding speed similar to that of the vanilla attention. Our analysis shows that ReLA delivers high sparsity rate and head diversity, and the induced cross attention achieves better accuracy with respect to source-target word alignment than recent sparsified softmax-based models. Intriguingly, ReLA heads also learn to attend to nothing (i.e. ‘switch off’) for some queries, which is not possible with sparsified softmax alternatives.
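The core substitution is small enough to sketch in plain Python: compute scaled dot-product scores, apply ReLU instead of softmax, and normalize the output. This is a simplified single-head illustration, not the authors' implementation. It assumes the usual 1/sqrt(d) score scaling and a parameter-free layer normalization, and it omits the specialized initialization and gating function the abstract mentions; `relu_attention` is a hypothetical name.

```python
import math

def relu_attention(queries, keys, values, eps=1e-6):
    """Single-head attention with ReLU in place of softmax (illustrative only)."""
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Scaled dot-product scores passed through ReLU: negative scores become exactly 0.
        row = [max(0.0, sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)) for k in keys]
        weights.append(row)
        # Weighted sum of value vectors; the weights are not forced to sum to 1.
        ctx = [sum(w * v[j] for w, v in zip(row, values)) for j in range(len(values[0]))]
        # Parameter-free layer normalization to keep the output scale stable.
        mean = sum(ctx) / len(ctx)
        var = sum((c - mean) ** 2 for c in ctx) / len(ctx)
        outputs.append([(c - mean) / math.sqrt(var + eps) for c in ctx])
    return outputs, weights

keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0]]
queries = [[1.0, 0.0],      # aligned with the first key only
           [-1.0, -1.0]]    # negative score against every key
outputs, weights = relu_attention(queries, keys, values)
```

The second query gets an all-zero weight row, so that head attends to nothing for it. Softmax weights always sum to 1, so this 'switch off' behaviour needs the unnormalized ReLU weights shown here.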


5、[CL] Hierarchical Learning for Generation with Long Source Sequences

T Rohde, X Wu, Y Liu
[Birch AI]
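Hierarchical learning for generation with long source sequences. The paper proposes a new Hierarchical Attention Transformer-based architecture (HAT) that outperforms standard Transformers on several sequence-to-sequence tasks, especially summarization datasets with long source documents. The model achieves state-of-the-art results on four summarization tasks (ArXiv, CNN/DM, SAMSum, and AMI), and on the WMT19 EN-DE document translation task it outperforms the document-level machine translation baseline by 28 BLEU.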

One of the challenges for current sequence to sequence (seq2seq) models is processing long sequences, such as those in summarization and document level machine translation tasks. These tasks require the model to reason at the token level as well as the sentence and paragraph level. We design and study a new Hierarchical Attention Transformer-based architecture (HAT) that outperforms standard Transformers on several sequence to sequence tasks. In particular, our model achieves state-of-the-art results on four summarization tasks, including ArXiv, CNN/DM, SAMSum, and AMI, and we push PubMed R1 & R2 SOTA further. Our model significantly outperforms our document-level machine translation baseline by 28 BLEU on the WMT19 EN-DE document translation task. We also investigate what the hierarchical layers learn by visualizing the hierarchical encoder-decoder attention. Finally, we study hierarchical learning on encoder-only pre-training and analyze its performance on classification downstream tasks.



[CL] From partners to populations: A hierarchical Bayesian account of coordination and convention

R D. Hawkins, M Franke, M C. Frank, K Smith, T L. Griffiths, N D. Goodman
[Princeton University & University of Osnabrück & Stanford University & University of Edinburgh]

Languages are powerful solutions to coordination problems: they provide stable, shared expectations about how the words we say correspond to the beliefs and intentions in our heads. Yet language use in a variable and non-stationary social environment requires linguistic representations to be flexible: old words acquire new ad hoc or partner-specific meanings on the fly. In this paper, we introduce a hierarchical Bayesian theory of convention formation that aims to reconcile the long-standing tension between these two basic observations. More specifically, we argue that the central computational problem of communication is not simply transmission, as in classical formulations, but learning and adaptation over multiple timescales. Under our account, rapid learning within dyadic interactions allows for coordination on partner-specific common ground, while social conventions are stable priors that have been abstracted away from interactions with multiple partners. We present new empirical data alongside simulations showing how our model provides a cognitive foundation for explaining several phenomena that have posed a challenge for previous accounts: (1) the convergence to more efficient referring expressions across repeated interaction with the same partner, (2) the gradual transfer of partner-specific common ground to novel partners, and (3) the influence of communicative context on which conventions eventually form.


[LG] mlf-core: a framework for deterministic machine learning

L Heumos, P Ehmele, K Menden, L K Cuellar, E Miller, S Lemke, G Gabernet, S Nahnsen
[University of Tübingen & University of Hamburg]

[CV] CodedStereo: Learned Phase Masks for Large Depth-of-field Stereo

S Tan, Y Wu, S Yu, A Veeraraghavan
[Rice University & Facebook Reality Labs]

[CV] Adversarial Open Domain Adaption for Sketch-to-Photo Synthesis

X Xiang, D Liu, X Yang, Y Zhu, X Shen, J P. Allebach
[Purdue University & ByteDance Inc]