Archive - 爱可可 AI Paper Recommendations (October 27) - 《爱可可老师分享》

LG - Machine Learning, CV - Computer Vision, CL - Computation and Language, AS - Audio and Speech, RO - Robotics

1、[LG] Meta-trained agents implement Bayes-optimal agents
V Mikulik, G Delétang, T McGrath, T Genewein, M Martic, S Legg, P A. Ortega
[DeepMind]
Memory-based meta-learning numerically approximates Bayes-optimal agents. Drawing on ideas from theoretical computer science, the authors find that meta-learned and Bayes-optimal agents not only behave alike but also share a similar computational structure, in that one agent system can approximately simulate the other. They also show that Bayes-optimal agents are fixed points of the meta-learning dynamics.
Memory-based meta-learning is a powerful technique to build agents that adapt fast to any task within a target distribution. A previous theoretical study has argued that this remarkable performance is because the meta-training protocol incentivises agents to behave Bayes-optimally. We empirically investigate this claim on a number of prediction and bandit tasks. Inspired by ideas from theoretical computer science, we show that meta-learned and Bayes-optimal agents not only behave alike, but they even share a similar computational structure, in the sense that one agent system can approximately simulate the other. Furthermore, we show that Bayes-optimal agents are fixed points of the meta-learning dynamics. Our results suggest that memory-based meta-learning might serve as a general technique for numerically approximating Bayes-optimal agents - that is, even for task distributions for which we currently don’t possess tractable models.
https://weibo.com/1402400261/Jr4aNhc6A
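As a point of reference for what "Bayes-optimal" means on a simple prediction task, here is a minimal sketch of the exact Bayesian predictor for a coin with unknown bias. The coin-flip task, the uniform Beta(1, 1) prior, and the function name `bayes_optimal_prediction` are assumptions chosen for illustration; they are not taken from the paper's tasks or code.

```python
from fractions import Fraction


def bayes_optimal_prediction(observations, alpha=1, beta=1):
    """Posterior predictive P(next flip = 1) for a coin with unknown bias.

    Assumes a Beta(alpha, beta) prior over the bias (illustrative choice).
    `observations` is a sequence of 0/1 outcomes seen so far.
    """
    heads = sum(observations)
    tails = len(observations) - heads
    # Beta-Bernoulli conjugacy: the predictive probability is the posterior mean.
    return Fraction(alpha + heads, alpha + beta + heads + tails)


# With no data the prediction is the prior mean; it then adapts to the history.
print(bayes_optimal_prediction([]))            # prints 1/2
print(bayes_optimal_prediction([1, 1, 0, 1]))  # prints 2/3
```

A meta-trained agent that behaves Bayes-optimally on this task distribution would be expected to output predictions close to these values after seeing the same history.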

2、[CV] Lightweight Generative Adversarial Networks for Text-Guided Image Manipulation
B Li, X Qi, P H. S. Torr, T Lukasiewicz
[University of Oxford & University of Hong Kong]
A lightweight GAN for text-guided image manipulation. It proposes a new word-level discriminator that uses explicit word-level supervision labels to give the generator fine-grained training feedback tied to each word, which makes it possible to build a lightweight architecture for image manipulation guided by natural language descriptions. Compared with previous methods, it has far fewer parameters yet still achieves competitive manipulation performance, making it friendlier to memory-constrained devices.
We propose a novel lightweight generative adversarial network for efficient image manipulation using natural language descriptions. To achieve this, a new word-level discriminator is proposed, which provides the generator with fine-grained training feedback at word-level, to facilitate training a lightweight generator that has a small number of parameters, but can still correctly focus on specific visual attributes of an image, and then edit them without affecting other contents that are not described in the text. Furthermore, thanks to the explicit training signal related to each word, the discriminator can also be simplified to have a lightweight structure. Compared with the state of the art, our method has a much smaller number of parameters, but still achieves a competitive manipulation performance. Extensive experimental results demonstrate that our method can better disentangle different visual attributes, then correctly map them to corresponding semantic words, and thus achieve a more accurate image modification using natural language descriptions.
https://weibo.com/1402400261/Jr4elae9R

3、[CV] Noise2Same: Optimizing A Self-Supervised Bound for Image Denoising
Y Xie, Z Wang, S Ji
[Texas A&M University]
Noise2Same, a self-supervised image denoising framework. It proposes a new self-supervised loss that removes the J-invariance assumption and the excessive restriction it places on the neural network. Because it requires neither J-invariance nor extra information about the noise model, it can be used in a wider range of denoising applications. Experimental results show that it outperforms previous self-supervised denoising methods in both denoising performance and training efficiency.
Self-supervised frameworks that learn denoising models with merely individual noisy images have shown strong capability and promising performance in various image denoising tasks. Existing self-supervised denoising frameworks are mostly built upon the same theoretical foundation, where the denoising models are required to be J-invariant. However, our analyses indicate that the current theory and the J-invariance may lead to denoising models with reduced performance. In this work, we introduce Noise2Same, a novel self-supervised denoising framework. In Noise2Same, a new self-supervised loss is proposed by deriving a self-supervised upper bound of the typical supervised loss. In particular, Noise2Same requires neither J-invariance nor extra information about the noise model and can be used in a wider range of denoising applications. We analyze our proposed Noise2Same both theoretically and experimentally. The experimental results show that our Noise2Same remarkably outperforms previous self-supervised denoising methods in terms of denoising performance and training efficiency. Our code is available at this https URL.
https://weibo.com/1402400261/Jr4iqokct

4、[LG] Counterfactual Explanations for Machine Learning: A Review
S Verma, J Dickerson, K Hines
[University of Washington & Arthur AI]
A review of counterfactual explanations for machine learning. It surveys 39 papers that propose algorithmic solutions for finding counterfactual explanations of decisions produced by automated systems, machine learning systems in particular. Evaluating all of the papers within a unified framework makes it easier to grasp the characteristics, strengths, and weaknesses of each method quickly, and to choose the algorithm best suited to an application's constraints in practice.
Machine learning plays a role in many deployed decision systems, often in ways that are difficult or impossible to understand by human stakeholders. Explaining, in a human-understandable way, the relationship between the input and output of machine learning models is essential to the development of trustworthy machine-learning-based systems. A burgeoning body of research seeks to define the goals and methods of explainability in machine learning. In this paper, we seek to review and categorize research on counterfactual explanations, a specific class of explanation that provides a link between what could have happened had input to a model been changed in a particular way. Modern approaches to counterfactual explainability in machine learning draw connections to the established legal doctrine in many countries, making them appealing to fielded systems in high-impact areas such as finance and healthcare. Thus, we design a rubric with desirable properties of counterfactual explanation algorithms and comprehensively evaluate all currently-proposed algorithms against that rubric. Our rubric provides easy comparison and comprehension of the advantages and disadvantages of different approaches and serves as an introduction to major research themes in this field. We also identify gaps and discuss promising research directions in the space of counterfactual explainability.
https://weibo.com/1402400261/Jr4tL1Yo7
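To make the idea of a counterfactual explanation concrete, the sketch below finds the smallest (Euclidean) change to an input that flips the decision of a linear classifier. This is a textbook special case chosen for illustration, not one of the algorithms evaluated in the review; the names `score`, `predict`, and `linear_counterfactual`, the `overshoot` parameter, and the example weights are all hypothetical.

```python
def score(x, w, b):
    """Linear decision score w.x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(x, w, b):
    """Label 1 if the score is positive, else 0."""
    return 1 if score(x, w, b) > 0 else 0


def linear_counterfactual(x, w, b, overshoot=1e-3):
    """Closest point to x (in L2 distance) that lands just across the boundary.

    Moves x along the weight vector w, which is the shortest path to the
    hyperplane w.x + b = 0, and overshoots slightly so the label flips.
    Assumes x is not exactly on the boundary and w is not all zeros.
    """
    s = score(x, w, b)
    norm_sq = sum(wi * wi for wi in w)
    scale = -(1 + overshoot) * s / norm_sq
    return [xi + scale * wi for xi, wi in zip(x, w)]


# Hypothetical two-feature model and a rejected input.
w, b = [1.0, -2.0], -0.5
x = [1.0, 1.0]
x_cf = linear_counterfactual(x, w, b)
print(predict(x, w, b), predict(x_cf, w, b))
print([round(v, 2) for v in x_cf])
```

Reading the result as an explanation: the decision would have been different had the first feature been about 0.3 higher and the second about 0.6 lower. Practical methods add further constraints such as plausibility or immutable features, which this sketch ignores.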

5、[LG] Reinforcement Learning with Combinatorial Actions: An Application to Vehicle Routing
A Delarue, R Anderson, C Tjandraatmadja
[MIT Operations Research Center & Google Research]
Reinforcement learning with combinatorial actions, applied to the capacitated vehicle routing problem (CVRP). The paper proposes a value-function-based deep reinforcement learning framework with a combinatorial action space, in which action selection is explicitly formulated as a mixed-integer optimization problem. Applied to CVRP, the framework models an action as the construction of a single route and considers a deterministic policy that is improved through a simple policy iteration algorithm.
Value-function-based methods have long played an important role in reinforcement learning. However, finding the best next action given a value function of arbitrary complexity is nontrivial when the action space is too large for enumeration. We develop a framework for value-function-based deep reinforcement learning with a combinatorial action space, in which the action selection problem is explicitly formulated as a mixed-integer optimization problem. As a motivating example, we present an application of this framework to the capacitated vehicle routing problem (CVRP), a combinatorial optimization problem in which a set of locations must be covered by a single vehicle with limited capacity. On each instance, we model an action as the construction of a single route, and consider a deterministic policy which is improved through a simple policy iteration algorithm. Our approach is competitive with other reinforcement learning methods and achieves an average gap of 1.7% with state-of-the-art OR methods on standard library instances of medium size.
https://weibo.com/1402400261/Jr4wRpVMQ
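The action-selection step described above (choose one route so that its cost plus the estimated cost of serving the remaining customers is minimal) can be sketched on a toy instance. Two substitutions are made here and should be read as assumptions: brute-force enumeration replaces the paper's mixed-integer optimization, and `value_estimate` is a hand-written stand-in for a learned value function. All names and the instance data are hypothetical.

```python
from itertools import combinations, permutations
from math import dist


def route_cost(depot, stops):
    """Length of the shortest depot -> stops -> depot tour (brute force)."""
    best = float("inf")
    for order in permutations(stops):
        path = [depot, *order, depot]
        best = min(best, sum(dist(a, b) for a, b in zip(path, path[1:])))
    return best


def value_estimate(depot, remaining):
    """Stand-in for a learned value function (assumption, not the paper's).

    Guesses the cost-to-go as one round trip per unvisited customer.
    """
    return sum(2 * dist(depot, c) for c in remaining)


def best_action(depot, customers, demand, capacity):
    """Pick the feasible route minimising route cost + estimated cost-to-go.

    Enumerates every customer subset within capacity; only viable for tiny
    instances, which is why the paper uses mixed-integer optimization instead.
    Returns (total, route_customers), or None if no route is feasible.
    """
    best = None
    for k in range(1, len(customers) + 1):
        for subset in combinations(customers, k):
            if sum(demand[c] for c in subset) > capacity:
                continue
            remaining = [c for c in customers if c not in subset]
            total = route_cost(depot, subset) + value_estimate(depot, remaining)
            if best is None or total < best[0]:
                best = (total, subset)
    return best


depot = (0, 0)
customers = [(1, 0), (2, 0), (0, 3)]
demand = {c: 1 for c in customers}
print(best_action(depot, customers, demand, capacity=2))
```

On this instance the two customers lying on the same axis are grouped into one route, since serving them together is cheaper than the estimate for serving them separately.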

Other papers worth noting:

[LG] BYOL works even without batch statistics
BYOL (Bootstrap Your Own Latent) works even without batch statistics
P H. Richemond, J Grill, F Altché, C Tallec, F Strub, A Brock, S Smith, S De, R Pascanu, B Piot, M Valko
[DeepMind]
https://weibo.com/1402400261/Jr4nxatCb

[LG] Adaptive Gradient Quantization for Data-Parallel SGD
Adaptive gradient quantization for data-parallel SGD
F Faghri, I Tabrizian, I Markov, D Alistarh, D Roy, A Ramezani-Kebrya
[University of Toronto & IST Austria]
https://weibo.com/1402400261/Jr4BUnJxT

[CL] RECONSIDER: Re-Ranking using Span-Focused Cross-Attention for Open Domain Question Answering
RECONSIDER: span-focused cross-attention for re-ranking in open-domain question answering
S Iyer, S Min, Y Mehdad, W Yih
[Facebook AI & University of Washington]
https://weibo.com/1402400261/Jr4EFqSNA

[CL] Complaint Identification in Social Media with Transformer Networks
Identifying complaints in social media with Transformer networks
M Jin, N Aletras
[University of Sheffield]
https://weibo.com/1402400261/Jr4Gj8chw

[CL] Transition-based Parsing with Stack-Transformers
Transition-based parsing with Stack-Transformers
R F Astudillo, M Ballesteros, T Naseem, A Blodgett, R Florian
[IBM Research & Amazon AI & Georgetown University]