Archive - 爱可可 AI Frontier Picks (December 15) - 《爱可可老师分享》

LG - Machine Learning; CV - Computer Vision; CL - Computation and Language; AS - Audio and Speech; RO - Robotics

(* marks papers worth special attention)

1. [CV] Amodal Segmentation Based on Visible Region Segmentation and Shape Prior
Y Xiao, Y Xu, Z Zhong, W Luo, J Li, S Gao
[ShanghaiTech University & A*STAR]
Amodal segmentation based on visible-region segmentation and a shape prior. Proposes a new model that mimics human amodal perception: starting from visible-region features, it uses a shape prior to imagine the invisible region. Most existing methods infer the amodal mask from the appearance of the whole region of interest, which runs counter to human amodal perception, since the same occluded appearance may call for different predictions. To mimic imagination from the visible region and the shape prior, the visible mask is used as attention that focuses on the visible region, and a codebook is built to store the collected shape-prior embeddings for refinement and post-processing. Using the shape prior makes amodal mask estimation more robust and plausible.
Almost all existing amodal segmentation methods make the inferences of occluded regions by using features corresponding to the whole image. This is against the human’s amodal perception, where human uses the visible part and the shape prior knowledge of the target to infer the occluded region. To mimic the behavior of human and solve the ambiguity in the learning, we propose a framework, it firstly estimates a coarse visible mask and a coarse amodal mask. Then based on the coarse prediction, our model infers the amodal mask by concentrating on the visible region and utilizing the shape prior in the memory. In this way, features corresponding to background and occlusion can be suppressed for amodal mask estimation. Consequently, the amodal mask would not be affected by what the occlusion is given the same visible regions. The leverage of shape prior makes the amodal mask estimation more robust and reasonable. Our proposed model is evaluated on three datasets. Experiments show that our proposed model outperforms existing state-of-the-art methods. The visualization of shape prior indicates that the category-specific feature in the codebook has certain interpretability.
https://weibo.com/1402400261/Jyw6vjENC

2. [CL] Discriminating Between Similar Nordic Languages
R Haas, L Derczynski
[IT University of Copenhagen]
Discriminating between similar Nordic languages. Presents a machine learning approach to automatic language identification for the Nordic languages, focusing on discriminating between six of them: Danish, Swedish, Norwegian (Nynorsk), Norwegian (Bokmål), Faroese, and Icelandic. Releases a dataset together with detailed baselines and an analysis of the problem; using character-level n-grams as features improves the performance of a FastText supervised classifier.
Automatic language identification is a challenging problem. Discriminating between closely related languages is especially difficult. This paper presents a machine learning approach for automatic language identification for the Nordic languages, which often suffer miscategorisation by existing state-of-the-art tools. Concretely we will focus on discrimination between six Nordic languages: Danish, Swedish, Norwegian (Nynorsk), Norwegian (Bokmål), Faroese and Icelandic.
https://weibo.com/1402400261/Jywf1gNri
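The role of character-level n-grams in separating closely related languages can be illustrated with a small sketch. This is not the paper's FastText setup or its dataset: the toy sentences, the helper names `char_ngrams`, `cosine`, and `identify`, and the nearest-profile rule are all assumptions chosen only to show the idea.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n_min=2, n_max=3):
    """Count character n-grams (lengths n_min..n_max) of a lowercased, space-padded string."""
    padded = f" {text.lower()} "
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(padded) - n + 1):
            counts[padded[i:i + n]] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Toy training sentences (hypothetical, not taken from the paper's dataset).
TRAIN = {
    "danish": ["jeg hedder Anna og jeg bor i en lille by", "hvad laver du i dag"],
    "swedish": ["jag heter Anna och jag bor i en liten stad", "vad gör du idag"],
    "icelandic": ["ég heiti Anna og ég bý í litlum bæ", "hvað ert þú að gera í dag"],
}

# One aggregated n-gram profile per language.
PROFILES = {
    lang: sum((char_ngrams(s) for s in sentences), Counter())
    for lang, sentences in TRAIN.items()
}


def identify(text):
    """Return the language whose n-gram profile is closest to the text."""
    query = char_ngrams(text)
    return max(PROFILES, key=lambda lang: cosine(query, PROFILES[lang]))
```

On this toy data, `identify("jag bor i en stad")` picks `"swedish"` and `identify("jeg bor i en by")` picks `"danish"`, because short n-grams such as `jag` versus `jeg` differ even where the sentences are otherwise almost identical.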

3. [CV] Relighting Images in the Wild with a Self-Supervised Siamese Auto-Encoder
Y Liu, A Neophytou, S Sengupta, E Sommerlade
[University of Surrey & Microsoft]
Relighting images in the wild with a self-supervised Siamese auto-encoder. Proposes an automatic, unsupervised relighting algorithm trained on large amounts of unlabeled data: it splits the source image information into a content embedding and an illumination embedding, and relights with a Siamese auto-encoder. Several auto-encoder networks are trained for reconstruction, comparison, and illumination estimation. Introduces a spherical harmonic loss and applies four kinds of image augmentation. Achieves performance similar to supervised methods while avoiding common lighting artifacts.
We propose a self-supervised method for image relighting of single view images in the wild. The method is based on an auto-encoder which deconstructs an image into two separate encodings, relating to the scene illumination and content, respectively. In order to disentangle this embedding information without supervision, we exploit the assumption that some augmentation operations do not affect the image content and only affect the direction of the light. A novel loss function, called spherical harmonic loss, is introduced that forces the illumination embedding to convert to a spherical harmonic vector. We train our model on large-scale datasets such as Youtube 8M and CelebA. Our experiments show that our method can correctly estimate scene illumination and realistically re-light input images, without any supervision or a prior shape model. Compared to supervised methods, our approach has similar performance and avoids common lighting artifacts.
https://weibo.com/1402400261/JywiZtPrK
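The assumption that some augmentations change only the lighting can be made concrete for a horizontal flip. Under one common real spherical-harmonic convention (an assumption here; the paper's ordering and normalization may differ), the nine second-order coefficients are proportional to 1, y, z, x, xy, yz, 3z^2-1, xz, x^2-y^2, so mirroring the scene in x negates exactly the terms that are odd in x. The sketch below uses hypothetical names (`flip_sh_horizontal`, `sh_consistency_loss`) and shows a consistency penalty of that kind; it is not the paper's loss.

```python
# Assumed coefficient order: [1, y, z, x, xy, yz, 3z^2-1, xz, x^2-y^2].
# Entries are -1 where the basis function is odd in x, +1 otherwise.
X_PARITY = [1, 1, 1, -1, -1, 1, 1, -1, 1]


def flip_sh_horizontal(sh):
    """Return the lighting vector of the horizontally mirrored scene."""
    if len(sh) != 9:
        raise ValueError("expected 9 second-order SH coefficients")
    return [sign * coeff for sign, coeff in zip(X_PARITY, sh)]


def sh_consistency_loss(sh_original, sh_predicted_for_flipped):
    """Mean squared error between the lighting predicted for a flipped image
    and the analytically flipped lighting of the original image."""
    target = flip_sh_horizontal(sh_original)
    squared = [(p - t) ** 2 for p, t in zip(sh_predicted_for_flipped, target)]
    return sum(squared) / len(squared)
```

A lighting estimator that respects the symmetry gives zero loss on a flipped pair, so the penalty supervises the illumination embedding without any labels.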

4. [CL] Towards Neural Programming Interfaces
Z C. Brown, N Robinson, D Wingate, N Fulda
[Duke University & Brigham Young University]
Exploring neural programming interfaces. Recasts the problem of controlling natural language generation as one of learning to interface with a pretrained language model, much as an application programming interface (API) controls program behavior by altering hyperparameters. In this new paradigm, a specialized neural network (a Neural Programming Interface, or NPI) learns to interface with a pretrained language model by manipulating its hidden activations to produce the desired output. Importantly, no permanent changes are made to the original model's weights, so a pretrained model can be repurposed for new tasks without overwriting any aspect of the language model. Proposes a new dataset construction algorithm and a GAN-inspired loss function for training NPI models to control the output of autoregressive transformers. Demonstrates the approach on OpenAI's GPT-2, successfully controlling noun selection, topic aversion, offensive speech filtering, and other aspects of language while largely preserving the controlled model's fluency under deterministic settings.
It is notoriously difficult to control the behavior of artificial neural networks such as generative neural language models. We recast the problem of controlling natural language generation as that of learning to interface with a pretrained language model, just as Application Programming Interfaces (APIs) control the behavior of programs by altering hyperparameters. In this new paradigm, a specialized neural network (called a Neural Programming Interface or NPI) learns to interface with a pretrained language model by manipulating the hidden activations of the pretrained model to produce desired outputs. Importantly, no permanent changes are made to the weights of the original model, allowing us to re-purpose pretrained models for new tasks without overwriting any aspect of the language model. We also contribute a new data set construction algorithm and GAN-inspired loss function that allows us to train NPI models to control outputs of autoregressive transformers. In experiments against other state-of-the-art approaches, we demonstrate the efficacy of our methods using OpenAI’s GPT-2 model, successfully controlling noun selection, topic aversion, offensive speech filtering, and other aspects of language while largely maintaining the controlled model’s fluency under deterministic settings.
https://weibo.com/1402400261/JywmAdzxR
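The core idea of steering a frozen model through its hidden activations, without touching its weights, can be shown on a toy network. Everything in this sketch is an assumption for illustration: the two-layer network, its weights, and the hand-written `push_towards_class_0` function stand in for GPT-2 and for the trained NPI network described in the paper.

```python
import math

# Frozen toy "pretrained" weights (hypothetical values); never modified below.
W1 = [[0.5, -0.2], [0.1, 0.8]]
W2 = [[1.0, -1.0], [-1.0, 1.0]]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def forward(x, perturb=None):
    """Run the frozen network; optionally add a perturbation to the hidden layer."""
    hidden = [math.tanh(h) for h in matvec(W1, x)]
    if perturb is not None:
        delta = perturb(hidden)  # computed from the activations, as an interface would
        hidden = [h + d for h, d in zip(hidden, delta)]
    return matvec(W2, hidden)


def predict(x, perturb=None):
    """Index of the largest output logit."""
    logits = forward(x, perturb)
    return logits.index(max(logits))


def push_towards_class_0(hidden):
    """Hand-written stand-in for a learned interface: a fixed shift in activation space."""
    return [1.0, -1.0]
```

With the input `[0.0, 1.0]` the unperturbed network predicts class 1, while `predict([0.0, 1.0], perturb=push_towards_class_0)` yields class 0, and `W1` and `W2` are left exactly as they were.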

5. [CL] Mapping the Space of Chemical Reactions Using Attention-Based Neural Networks
P Schwaller, D Probst, A C. Vaucher, V H. Nair, D Kreutter, T Laino, J Reymond
[IBM Research & University of Bern]
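Mapping the space of chemical reactions with attention-based neural networks. Organic reactions are usually assigned to classes that group reactions with similar reagents and mechanisms; these classes help communicate complex concepts and navigate chemical reaction space efficiently. Classification is tedious, however: identifying the corresponding reaction class template requires annotating the number of molecules in the reaction and the reaction center, and distinguishing reactants from reagents. This work shows that transformer-based models can infer reaction classes from non-annotated, simple text-based representations of chemical reactions, reaching a classification accuracy of 98.2%. The learned representations can serve as reaction fingerprints that capture fine-grained differences between reaction classes better than traditional reaction fingerprints. The insights these fingerprints give into chemical reaction space are further illustrated with an interactive reaction atlas that provides visual clustering and similarity search.
Organic reactions are usually assigned to classes containing reactions with similar reagents and mechanisms. Reaction classes facilitate the communication of complex concepts and efficient navigation through chemical reaction space. However, the classification process is a tedious task. It requires the identification of the corresponding reaction class template via annotation of the number of molecules in the reactions, the reaction center, and the distinction between reactants and reagents. This work shows that transformer-based models can infer reaction classes from non-annotated, simple text-based representations of chemical reactions. Our best model reaches a classification accuracy of 98.2%. We also show that the learned representations can be used as reaction fingerprints that capture fine-grained differences between reaction classes better than traditional reaction fingerprints. The insights into chemical reaction space enabled by our learned fingerprints are illustrated by an interactive reaction atlas providing visual clustering and similarity searching.
https://weibo.com/1402400261/JywrJ76gS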
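Similarity search over reaction fingerprints amounts to a nearest-neighbor lookup in vector space. The sketch below assumes the fingerprints are plain numeric vectors compared by cosine similarity; the three-dimensional vectors, the reaction identifiers, and the `most_similar` helper are hypothetical and are not taken from the paper or its atlas.

```python
from math import sqrt

# Hypothetical 3-dimensional fingerprints; learned fingerprints would be much longer.
FINGERPRINTS = {
    "amide_coupling_1": [0.9, 0.1, 0.0],
    "amide_coupling_2": [0.8, 0.2, 0.1],
    "suzuki_coupling_1": [0.1, 0.9, 0.2],
    "suzuki_coupling_2": [0.0, 0.8, 0.3],
}


def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(query_id, k=1):
    """Return the ids of the k reactions closest to the query, excluding the query itself."""
    query = FINGERPRINTS[query_id]
    scored = [
        (cosine(query, vec), rid)
        for rid, vec in FINGERPRINTS.items()
        if rid != query_id
    ]
    scored.sort(reverse=True)
    return [rid for _, rid in scored[:k]]
```

In this toy table, `most_similar("amide_coupling_1")` returns `["amide_coupling_2"]`: reactions of the same class sit close together, which is the property that makes clustering and similarity search on fingerprints useful.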

A few more papers worth noting:

[LG] Hardware Beyond Backpropagation: a Photonic Co-Processor for Direct Feedback Alignment
J Launay, I Poli, K Müller, G Pariente, I Carron, L Daudet, F Krzakala, S Gigan
[LightOn]
https://weibo.com/1402400261/JywwlpLnF

[LG] Neural Dynamic Mode Decomposition for End-to-End Modeling of Nonlinear Dynamics
T Iwata, Y Kawahara
[NTT Communication Science Laboratories & Kyushu University]
https://weibo.com/1402400261/Jywxu4oiA

[LG] Sample-efficient proper PAC learning with approximate differential privacy
B Ghazi, N Golowich, R Kumar, P Manurangsi
[Google Research & MIT]
https://weibo.com/1402400261/JywyOeevJ

[LG] Improved Contrastive Divergence Training of Energy Based Models
Y Du, S Li, J Tenenbaum, I Mordatch
[MIT CSAIL & Google]
https://weibo.com/1402400261/JywBioxvg
