Reposted from: WeChat official account 蘑菇先生学习记

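This post collects the industry papers on search, recommendation, advertising, and NLP from the KDD21 Accepted Papers [1]. The list reflects my personal taste: a paper is included when its author list suggests the work was led by a company, but that judgment is subjective, so some papers may have been missed. Papers are grouped by company and by direction, and the order within each group implies no ranking.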

1. By direction

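I picked the directions I am most interested in and listed the corresponding paper titles. Readers can skim the titles to judge whether they match their own research or work, then choose the interesting ones for close reading.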

1.1 Recommender systems

1.1.1 Samples

Covers sampling, negative samples, and related topics.

  • Google: Bootstrapping for Batch Active Sampling
  • Google: Bootstrapping Recommendations at Chrome Web Store
  • Alibaba: Real Negatives Matter: Continuous Training with Real Negatives for Delayed Feedback Modeling

1.1.2 Representation learning

  • Google: Learning to Embed Categorical Features without Embedding Tables for Recommendation
  • Huawei: An Embedding Learning Framework for Numerical Features in CTR Prediction
  • Tencent: Learning Reliable User Representations from Volatile and Sparse Data to Accurately Predict Customer Lifetime Value
  • Alibaba: Representation Learning for Predicting Customer Orders

1.1.3 Cross-domain recommendation

  • Alibaba: Debiasing Learning based Cross-domain Recommendation
  • Tencent: Adversarial Feature Translation for Multi-domain Recommendation

1.1.4 Debiasing

  • Alibaba: Contrastive Learning for Debiased Candidate Generation in Large-Scale Recommender Systems
  • Alibaba: Debiasing Learning based Cross-domain Recommendation

1.1.5 Graph neural networks

  • Huawei: Dual Graph enhanced Embedding Neural Network for CTR Prediction
  • Meituan: Signed Graph Neural Network with Latent Groups
  • Alibaba: DMBGN: Deep Multi-Behavior Graph Networks for Voucher Redemption Rate Prediction
  • Baidu: MugRep: A Multi-Task Hierarchical Graph Representation Learning Framework for Real Estate Appraisal

1.1.6 Multi-task learning

  • Google: Understanding and Improving Fairness-Accuracy Trade-offs in Multi-Task Learning
  • Meituan: Modeling the Sequential Dependence among Audience Multi-step Conversions with Multi-task Learning for Customer Acquisition
  • Baidu: MugRep: A Multi-Task Hierarchical Graph Representation Learning Framework for Real Estate Appraisal

1.1.7 Multimodal / short-video recommendation

  • Alibaba: SEMI: A Sequential Multi-Modal Information Transfer Network for E-Commerce Micro-Video Recommendations

1.1.8 Knowledge graphs

  • Microsoft: Reinforced Anchor Knowledge Graph Generation for News Recommendation Reasoning

1.1.9 Recommender system architecture

  • Facebook: Training Recommender Systems at Scale: Communication-Efficient Model and Data Parallelism
  • Facebook: Hierarchical Training: Scaling Deep Recommendation Models on Large CPU Clusters
  • Alibaba: FleetRec: Large-Scale Recommendation Inference on Hybrid GPU-FPGA Clusters
  • Tencent: Large-Scale Network Embedding in Apache Spark
  • Microsoft: On Post-Selection Inference in A/B Testing

1.2 Search

1.2.1 Vector retrieval

  • Alibaba: Embedding-based Product Retrieval in Taobao Search

1.2.2 Query/content understanding

  • Facebook: Que2Search: Fast and Accurate Query and Document Understanding for Search at Facebook

1.2.3 Concept graphs

  • Alibaba: AliCG: Fine-grained and Evolvable Conceptual Graph Construction for Semantic Search at Alibaba
  • Alibaba: AliCoCo2: Commonsense Knowledge Extraction, Representation and Application in E-commerce

1.2.4 Pretraining

  • Baidu: Pretrained Language Models for Web-scale Retrieval in Baidu Search
  • Microsoft: Domain-Specific Pretraining for Vertical Search: Case Study on Biomedical Literature

1.2.5 Query rewriting / auto-completion

  • Microsoft: Diversity driven Query Rewriting in Search Advertising
  • Baidu: Meta-Learned Spatial-Temporal POI Auto-Completion for the Search Engine at Baidu Maps

1.2.6 Graph neural networks

  • Baidu: HGAMN: Heterogeneous Graph Attention Matching Network for Multilingual POI Retrieval at Baidu Maps

1.2.7 Multimodal

  • Google: Mondegreen: A Post-Processing Solution to Speech Recognition Error Correction for Voice Search Queries
  • Facebook: VisRel: Media Search at Scale

1.2.8 Edge computing

  • Alibaba: FIVES: Feature Interaction Via Edge Search for Large-Scale Tabular Data

1.2.9 Search engine architecture

  • Baidu: Norm Adjusted Proximity Graph for Fast Inner Product Retrieval
  • Baidu: JIZHI: A Fast and Cost-Effective Model-As-A-Service System for Web-Scale Online Inference at Baidu

1.3 Advertising

There are not many papers in this area, so it is not subdivided further.

  • Google: Clustering for Private Interest-based Advertising
  • Alibaba: A Unified Solution to Constrained Bidding in Online Display Advertising
  • Alibaba: Exploration in Online Advertising Systems with Deep Uncertainty-Aware Learning
  • Alibaba: Neural Auction: End-to-End Learning of Auction Mechanisms for E-Commerce Advertising
  • Alibaba: We Know What You Want: An Advertising Strategy Recommender System for Online Advertising

1.4 NLP

1.4.1 Pretraining

  • Microsoft: NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search
  • Alibaba: M6: Multi-Modality-to-Multi-Modality Multitask Mega-transformer for Unified Pretraining
  • Microsoft: TUTA: Tree-based Transformers for Generally Structured Table Pre-training

1.4.2 Named entity recognition

  • Microsoft: Reinforced Iterative Knowledge Distillation for Cross-Lingual Named Entity Recognition

1.4.3 Few-shot learning

  • Microsoft: Generalized Zero-Shot Extreme Multi-label Learning
  • Microsoft: Zero-shot Multi-lingual Interrogative Question Generation for “People Also Ask” at Bing

1.4.4 Summarization

  • Microsoft: Reinforcing Pretrained Models for Generating Attractive Text Advertisements

1.4.5 Intent recognition

  • Alibaba: MeLL: Large-scale Extensible User Intent Classification for Dialogue Systems with Meta Lifelong Learning

1.4.6 Multimodal

  • Alibaba: M6: Multi-Modality-to-Multi-Modality Multitask Mega-transformer for Unified Pretraining

2. By company

2.1 Google

  • Learning to Embed Categorical Features without Embedding Tables for Recommendation
  • NewsEmbed: Modeling News through Pre-trained Document Representations
  • Understanding and Improving Fairness-Accuracy Trade-offs in Multi-Task Learning
  • Bootstrapping for Batch Active Sampling
  • Bootstrapping Recommendations at Chrome Web Store
  • Clustering for Private Interest-based Advertising
  • Dynamic Language Models for Continuously Evolving Content
  • Mondegreen: A Post-Processing Solution to Speech Recognition Error Correction for Voice Search Queries
  • On Training Sample Memorization: Lessons from Benchmarking Generative Modeling with a Large-scale Competition

2.2 Facebook

  • Training Recommender Systems at Scale: Communication-Efficient Model and Data Parallelism
  • Preference Amplification in Recommender Systems
  • Hierarchical Training: Scaling Deep Recommendation Models on Large CPU Clusters
  • Network Experimentation at Scale
  • Que2Search: Fast and Accurate Query and Document Understanding for Search at Facebook
  • VisRel: Media Search at Scale
  • Balancing Consistency and Disparity in Network Alignment

2.3 Microsoft

  • Generalized Zero-Shot Extreme Multi-label Learning
  • Learning Multiple Stock Trading Patterns with Temporal Routing Adaptor and Optimal Transport
  • NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search
  • Reinforced Anchor Knowledge Graph Generation for News Recommendation Reasoning
  • Table2Charts: Recommending Charts by Learning Shared Table Representations
  • TabularNet: A Neural Network Architecture for Understanding Semantic Structures of Tabular Data
  • TUTA: Tree-based Transformers for Generally Structured Table Pre-training
  • Contextual Bandit Applications in a Customer Support Bot
  • Diversity driven Query Rewriting in Search Advertising
  • Domain-Specific Pretraining for Vertical Search: Case Study on Biomedical Literature
  • On Post-Selection Inference in A/B Testing
  • Reinforced Iterative Knowledge Distillation for Cross-Lingual Named Entity Recognition
  • Reinforcing Pretrained Models for Generating Attractive Text Advertisements
  • Zero-shot Multi-lingual Interrogative Question Generation for “People Also Ask” at Bing

2.4 Alibaba

  • A Unified Solution to Constrained Bidding in Online Display Advertising
  • AliCG: Fine-grained and Evolvable Conceptual Graph Construction for Semantic Search at Alibaba
  • AliCoCo2: Commonsense Knowledge Extraction, Representation and Application in E-commerce
  • Contrastive Learning for Debiased Candidate Generation in Large-Scale Recommender Systems
  • Debiasing Learning based Cross-domain Recommendation
  • Device-Cloud Collaborative Learning for Recommendation
  • Deep Inclusion Relation-aware Network for User Response Prediction at Fliggy
  • DMBGN: Deep Multi-Behavior Graph Networks for Voucher Redemption Rate Prediction
  • Dual Attentive Sequential Learning for Cross-Domain Click-Through Rate Prediction
  • Embedding-based Product Retrieval in Taobao Search
  • Exploration in Online Advertising Systems with Deep Uncertainty-Aware Learning
  • FIVES: Feature Interaction Via Edge Search for Large-Scale Tabular Data
  • FleetRec: Large-Scale Recommendation Inference on Hybrid GPU-FPGA Clusters
  • Intention-aware Heterogeneous Graph Attention Networks for Fraud Transactions Detection
  • Live-Streaming Fraud Detection: A Heterogeneous Graph Neural Network Approach
  • M6: Multi-Modality-to-Multi-Modality Multitask Mega-transformer for Unified Pretraining
  • Markdowns in E-Commerce Fresh Retail: A Counterfactual Prediction and Multi-Period Optimization Approach
  • MeLL: Large-scale Extensible User Intent Classification for Dialogue Systems with Meta Lifelong Learning
  • Multi-Agent Cooperative Bidding Games for Multi-Objective Optimization in e-Commercial Sponsored Search
  • Neural Auction: End-to-End Learning of Auction Mechanisms for E-Commerce Advertising
  • Real Negatives Matter: Continuous Training with Real Negatives for Delayed Feedback Modeling
  • Representation Learning for Predicting Customer Orders
  • SEMI: A Sequential Multi-Modal Information Transfer Network for E-Commerce Micro-Video Recommendations
  • We Know What You Want: An Advertising Strategy Recommender System for Online Advertising

2.5 Baidu

  • Norm Adjusted Proximity Graph for Fast Inner Product Retrieval
  • Curriculum Meta-Learning for Next POI Recommendation
  • Pretrained Language Models for Web-scale Retrieval in Baidu Search
  • HGAMN: Heterogeneous Graph Attention Matching Network for Multilingual POI Retrieval at Baidu Maps
  • JIZHI: A Fast and Cost-Effective Model-As-A-Service System for Web-Scale Online Inference at Baidu
  • Meta-Learned Spatial-Temporal POI Auto-Completion for the Search Engine at Baidu Maps
  • MugRep: A Multi-Task Hierarchical Graph Representation Learning Framework for Real Estate Appraisal
  • SSML: Self-Supervised Meta-Learner for En Route Travel Time Estimation at Baidu Maps
  • Talent Demand Forecasting with Attentive Neural Sequential Model

2.6 Tencent

  • Why Attentions May Not Be Interpretable?
  • Adversarial Feature Translation for Multi-domain Recommendation
  • Large-Scale Network Embedding in Apache Spark
  • Learn to Expand Audience via Meta Hybrid Experts and Critics
  • Learning Reliable User Representations from Volatile and Sparse Data to Accurately Predict Customer Lifetime Value

2.7 Meituan

  • Modeling the Sequential Dependence among Audience Multi-step Conversions with Multi-task Learning for Customer Acquisition
  • User Consumption Intention Prediction in Meituan
  • Signed Graph Neural Network with Latent Groups
  • A Deep Learning Method for Route and Time Prediction in Food Delivery Service

2.8 Huawei

  • An Embedding Learning Framework for Numerical Features in CTR Prediction
  • Dual Graph enhanced Embedding Neural Network for CTR Prediction
  • Discrete-time Temporal Network Embedding via Implicit Hierarchical Learning
  • Retrieval & Interaction Machine for Tabular Data Prediction
  • A Multi-Graph Attributed Reinforcement Learning Based Optimization Algorithm for Large-scale Hybrid Flow Shop Scheduling Problem