LG - Machine Learning CV - Computer Vision CL - Computation and Language AS - Audio and Speech RO - Robotics

1. [CL] A Survey of Data Augmentation Approaches for NLP

S Y. Feng, V Gangal, J Wei, S Chandar, S Vosoughi, T Mitamura, E Hovy
[CMU & Google Research & Mila & Dartmouth College]

Data augmentation has recently seen increased interest in NLP due to more work in low-resource domains, new tasks, and the popularity of large-scale neural networks that require large amounts of training data. Despite this recent upsurge, this area is still relatively underexplored, perhaps due to the challenges posed by the discrete nature of language data. In this paper, we present a comprehensive and unifying survey of data augmentation for NLP by summarizing the literature in a structured manner. We first introduce and motivate data augmentation for NLP, and then discuss major methodologically representative approaches. Next, we highlight techniques that are used for popular NLP applications and tasks. We conclude by outlining current challenges and directions for future research. Overall, our paper aims to clarify the landscape of existing literature in data augmentation for NLP and motivate additional work in this area.


2. [CV] ResMLP: Feedforward networks for image classification with data-efficient training

H Touvron, P Bojanowski, M Caron, M Cord, A El-Nouby, E Grave, A Joulin, G Synnaeve, J Verbeek, H Jégou
[Facebook AI]

We present ResMLP, an architecture built entirely upon multi-layer perceptrons for image classification. It is a simple residual network that alternates (i) a linear layer in which image patches interact, independently and identically across channels, and (ii) a two-layer feed-forward network in which channels interact independently per patch. When trained with a modern training strategy using heavy data-augmentation and optionally distillation, it attains surprisingly good accuracy/complexity trade-offs on ImageNet. We will share our code based on the Timm library and pre-trained models.
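
To make the two alternating sublayers concrete, here is a minimal NumPy sketch of one residual block as the abstract describes it. It is an illustration under stated assumptions, not the authors' code: the function names, the initialisation scale, and the GELU nonlinearity are choices made for this example, and any normalisation or scaling layers the full model may use are left out.

```python
import numpy as np


def gelu(x):
    # Tanh approximation of GELU (the choice of activation is an assumption).
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))


def init_block(num_patches, dim, hidden_dim, rng, scale=0.02):
    # Hypothetical parameter layout for one block.
    return {
        "w_patch": rng.normal(0.0, scale, (num_patches, num_patches)),
        "b_patch": np.zeros((num_patches, 1)),
        "w1": rng.normal(0.0, scale, (dim, hidden_dim)),
        "b1": np.zeros(hidden_dim),
        "w2": rng.normal(0.0, scale, (hidden_dim, dim)),
        "b2": np.zeros(dim),
    }


def cross_patch_sublayer(x, p):
    # (i) Linear layer mixing patches. x has shape (num_patches, dim); the same
    # (num_patches x num_patches) matrix is applied to every channel column.
    return x + p["w_patch"] @ x + p["b_patch"]


def cross_channel_sublayer(x, p):
    # (ii) Two-layer feed-forward network mixing channels, applied to each
    # patch (row) independently.
    hidden = gelu(x @ p["w1"] + p["b1"])
    return x + hidden @ p["w2"] + p["b2"]


def resmlp_block(x, p):
    # One residual block: patch mixing followed by channel mixing.
    return cross_channel_sublayer(cross_patch_sublayer(x, p), p)
```

A full network would stack several such blocks on top of a patch-embedding layer and finish with pooling and a linear classifier; those parts are omitted here.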


3. [CV] LASR: Learning Articulated Shape Reconstruction from a Monocular Video

G Yang, D Sun, V Jampani, D Vlasic, F Cole, H Chang, D Ramanan, W T. Freeman, C Liu
[CMU & Google Research]

Remarkable progress has been made in 3D reconstruction of rigid structures from a video or a collection of images. However, it is still challenging to reconstruct nonrigid structures from RGB inputs, because the problem is under-constrained. While template-based approaches, such as parametric shape models, have achieved great success in modeling the “closed world” of known object categories, they cannot handle the “open world” of novel object categories or outlier shapes well. In this work, we introduce a template-free approach to learn 3D shapes from a single video. It adopts an analysis-by-synthesis strategy that forward-renders object silhouette, optical flow, and pixel values to compare with video observations, which generates gradients to adjust the camera, shape and motion parameters. Without using a category-specific shape template, our method faithfully reconstructs nonrigid 3D structures from videos of humans, animals, and objects of unknown classes. Our code is available at lasr-google.github.io.


4. [LG] What Kinds of Functions do Deep Neural Networks Learn? Insights from Variational Spline Theory

R Parhi, R D. Nowak
[University of Wisconsin–Madison]

We develop a variational framework to understand the properties of functions learned by deep neural networks with ReLU activation functions fit to data. We propose a new function space, which is reminiscent of classical bounded variation spaces, that captures the compositional structure associated with deep neural networks. We derive a representer theorem showing that deep ReLU networks are solutions to regularized data fitting problems in this function space. The function space consists of compositions of functions from the (non-reflexive) Banach spaces of second-order bounded variation in the Radon domain. These are Banach spaces with sparsity-promoting norms, giving insight into the role of sparsity in deep neural networks. The neural network solutions have skip connections and rank-bounded weight matrices, providing new theoretical support for these common architectural choices. The variational problem we study can be recast as a finite-dimensional neural network training problem with regularization schemes related to the notions of weight decay and path-norm regularization. Finally, our analysis builds on techniques from variational spline theory, providing new connections between deep neural networks and splines.


5. [CV] Contrastive Learning for Unsupervised Image-to-Image Translation

H Lee, J Seol, S Lee
[Seoul National University]

Image-to-image translation aims to learn a mapping between different groups of visually distinguishable images. While recent methods have shown an impressive ability to change even the intricate appearance of images, they still rely on domain labels in training a model to distinguish between distinct visual features. Such dependency on labels often significantly limits the scope of applications since consistent and high-quality labels are expensive. Instead, we wish to capture visual features from images themselves and apply them to enable realistic translation without human-generated labels. To this end, we propose an unsupervised image-to-image translation method based on contrastive learning. The key idea is to learn a discriminator that differentiates between distinctive styles and let the discriminator supervise a generator to transfer those styles across images. During training, we randomly sample a pair of images and train the generator to change the appearance of one towards another while keeping the original structure. Experimental results show that our method outperforms the leading unsupervised baselines in terms of visual quality and translation accuracy.
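
For readers new to contrastive learning, the sketch below shows a generic InfoNCE-style loss in NumPy. It is not the objective used in this paper, whose exact loss the abstract does not specify. The function name `info_nce`, the temperature value, and the use of cosine similarity are assumptions made for illustration. The idea is that each anchor embedding should score its own positive higher than every other sample in the batch.

```python
import numpy as np


def info_nce(anchors, positives, temperature=0.1):
    # anchors, positives: arrays of shape (n, d); row i of `positives` is the
    # positive example for row i of `anchors`, all other rows act as negatives.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    b = positives / np.linalg.norm(positives, axis=1, keepdims=True)

    # Cosine similarities scaled by the temperature; diagonal = positive pairs.
    logits = (a @ b.T) / temperature

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Average negative log-probability assigned to the correct pair.
    return -np.mean(np.diag(log_prob))
```

The loss is small when matching pairs are the most similar entries in their rows and grows when positives are confused with other samples in the batch.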



[CL] DEXPERTS: On-the-Fly Controlled Text Generation with Experts and Anti-Experts

A Liu, M Sap, X Lu, S Swayamdipta, C Bhagavatula, N A. Smith, Y Choi
[University of Washington & Allen Institute for Artificial Intelligence]

[CL] A Dataset of Information-Seeking Questions and Answers Anchored in Research Papers

P Dasigi, K Lo, I Beltagy, A Cohan, N A. Smith, M Gardner
[Allen Institute for AI & University of Washington]

[LG] Hierarchical Graph Neural Networks

S Sobolevsky
[New York University]

[CL] How (Non-)Optimal is the Lexicon?

T Pimentel, I Nikkarinen, K Mahowald, R Cotterell, D Blasi
[University of Cambridge]