- Topics: Transfer Learning, Domain Adaptation, CV, Explicit Feature Distribution Alignment, Adversarial
- Paper title: Transfer Learning with Dynamic Adversarial Adaptation Network
- Paper link: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8970703
1. Summary
This paper addresses image classification across datasets. Many image datasets are large but unlabeled, and labeling them by hand is slow. A labeled image dataset can instead be used to help predict labels for an unlabeled one, provided the two datasets share the same set of labels. Simply pre-training a model on the labeled dataset and applying it to the unlabeled one may perform poorly: the two datasets belong to different domains, so their styles and other features differ, and this degrades the final predictions. To solve this problem, the paper proposes a model called DAAN. DAAN uses domain-adversarial training to extract features common to both domains and thereby align the marginal probability distributions. In addition, it sets up one domain discriminator per image label to pull together features that share a label, which aligns the conditional probability distributions. Because the two alignments are not always equally important, a dynamic parameter weighs the two domain-adversarial losses against each other. To deepen my understanding of this paper, I plan to read other papers that use domain-adversarial methods to align conditional probability distributions.
2. Thoughts on the paper
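This is an image classification paper that transfers knowledge from a labeled dataset to an unlabeled one. Many earlier methods also used domain-adversarial training, but they aligned only the marginal probability distributions. This paper uses domain-adversarial training as well; unlike those earlier methods, it additionally sets up a domain discriminator for each image class to pull images with the same label closer together, thereby aligning the conditional probability distributions. Since the marginal and conditional probability distributions may differ in importance, it also introduces a dynamic parameter to weigh them.
3. Other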
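3.1 Problem addressed
As shown in the figure below, suppose the dataset on the left is labeled and the dataset on the right is unlabeled. The datasets differ in style, but their image labels all coincide. The goal is to use the dataset on the left and transfer its knowledge to the dataset on the right, helping the latter predict its labels.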
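Thus, the source domain (labeled), in the usual notation: $\mathcal{D}_s = \{(x_i^s, y_i^s)\}_{i=1}^{n_s}$
Target domain (unlabeled): $\mathcal{D}_t = \{x_j^t\}_{j=1}^{n_t}$
3.2 The DAAN model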
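The loss function has three parts:
(1) Label classification loss:
(2) Global domain classification loss (aligns the marginal probability distributions):
(3) Local subdomain classification loss (aligns the conditional probability distributions):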
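The formula images for these three losses did not survive in these notes. As a hedged reconstruction, following the formulation as I recall it from the DAAN paper, they can be sketched as below. All symbols are assumptions introduced here: $G_f$ is the feature extractor, $G_y$ the label classifier, $G_d$ the global domain discriminator, $G_d^c$ the discriminator for class $c$, $d_i$ the domain label of sample $x_i$, $\hat{y}_i^c$ the predicted probability that $x_i$ belongs to class $c$, $C$ the number of classes, and $n_s$, $n_t$ the numbers of source and target samples.

```latex
\begin{align}
L_y &= -\frac{1}{n_s} \sum_{x_i \in \mathcal{D}_s} \sum_{c=1}^{C}
       P_{x_i \to c} \, \log \big[ G_y\big(G_f(x_i)\big) \big]_c \\
L_g &= \frac{1}{n_s + n_t} \sum_{x_i \in \mathcal{D}_s \cup \mathcal{D}_t}
       L_d\big( G_d(G_f(x_i)),\, d_i \big) \\
L_l &= \frac{1}{n_s + n_t} \sum_{c=1}^{C} \sum_{x_i \in \mathcal{D}_s \cup \mathcal{D}_t}
       L_d^c\big( G_d^c(\hat{y}_i^c \, G_f(x_i)),\, d_i \big)
\end{align}
```

Here $P_{x_i \to c}$ is the probability that $x_i$ belongs to class $c$ (one-hot for labeled source samples), and $L_d$, $L_d^c$ are cross-entropy losses on the domain label. Scaling the feature by $\hat{y}_i^c$ is what lets unlabeled target samples contribute softly to every class-level discriminator.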
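Because the marginal and conditional probability distributions may differ in importance, a dynamic parameter is designed to weigh them: if Lg was larger than Ll in the previous training epoch, the marginal probability distribution should receive more weight; otherwise, the conditional probability distribution should receive more weight. The total loss function and the computation of the dynamic parameter are therefore as follows: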
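The formulas themselves are missing from these notes, so the block below is a hedged reconstruction based on my recollection of the paper rather than a verbatim copy. The symbols are assumptions: $\lambda$ is a trade-off hyperparameter, $\omega$ the dynamic parameter, $d_{A,g}$ the global A-distance between the two domains, and $d_{A,l}$ the A-distance between the class-$c$ subsets of the two domains.

```latex
\begin{align}
L &= L_y - \lambda \big( (1 - \omega)\, L_g + \omega\, L_l \big) \\
\hat{\omega} &= \frac{d_{A,g}(\mathcal{D}_s, \mathcal{D}_t)}
                     {d_{A,g}(\mathcal{D}_s, \mathcal{D}_t)
                      + \frac{1}{C} \sum_{c=1}^{C} d_{A,l}(\mathcal{D}_s^c, \mathcal{D}_t^c)}
\end{align}
```

The minus sign reflects the adversarial setup: the discriminators minimize their own losses while the feature extractor is trained to increase them. Since $L_g$ carries the weight $1 - \omega$, a smaller $\omega$ puts more emphasis on marginal alignment and a larger $\omega$ puts more on conditional alignment.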
A-distance:
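The A-distance is commonly approximated as $d_A = 2(1 - 2\epsilon)$, where $\epsilon$ is the error of a classifier trained to tell the two domains apart. The sketch below assumes, as I understand the paper to do, that the discriminator losses from the previous epoch stand in for $\epsilon$. The function names, the `lam` default, and the equal-weight fallback are hypothetical choices for illustration, not taken from the paper or its code.

```python
from typing import Sequence


def a_distance(domain_loss: float) -> float:
    """Proxy A-distance d_A = 2 * (1 - 2 * loss), with the loss standing in for the error."""
    return 2.0 * (1.0 - 2.0 * domain_loss)


def dynamic_factor(global_loss: float, local_losses: Sequence[float]) -> float:
    """Estimate omega from last epoch's global loss and per-class local losses."""
    if len(local_losses) == 0:
        raise ValueError("need at least one per-class local loss")
    d_global = a_distance(global_loss)
    # Average the per-class A-distances, one per class-level discriminator.
    d_local = sum(a_distance(loss) for loss in local_losses) / len(local_losses)
    denom = d_global + d_local
    if denom == 0.0:
        # Assumption (not from the paper): fall back to equal weighting.
        return 0.5
    return d_global / denom


def total_loss(label_loss: float, global_loss: float, local_loss: float,
               omega: float, lam: float = 1.0) -> float:
    """Combine the three losses: L_y - lam * ((1 - omega) * L_g + omega * L_l)."""
    return label_loss - lam * ((1.0 - omega) * global_loss + omega * local_loss)


if __name__ == "__main__":
    # L_g larger than the local losses: omega falls below 0.5,
    # so the marginal term (weight 1 - omega) dominates.
    omega = dynamic_factor(0.4, [0.1, 0.1])
    print(omega, total_loss(1.0, 0.4, 0.1, omega))
```

Note that the proxy only stays in a sensible range when the loss behaves like an error rate in [0, 0.5]; a raw cross-entropy value above 0.5 would make the distance negative, so a real implementation may need to rescale or clip it.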
3.3 Experiments
Two image datasets:
ImageCLEF-DA
Office-Home
(1) Experiments on the ImageCLEF-DA dataset
(2) Experiments on the Office-Home dataset