Recommender Systems (1): FM - "Recommender Systems"

Background
Analysis
FM Definition
Formula Improvement
FM Derivatives

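FM (Factorization Machine) is a model proposed in 2010. It still delivers strong performance when the data set is large and the features are sparse. FM was designed to solve the feature-combination problem on sparse data, a problem that polynomial models handle poorly.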

Background

In recommendation, click-through rate (CTR) and conversion rate (CVR) are two key metrics. Common industry approaches to CTR/CVR prediction include manual feature engineering + LR (Logistic Regression), GBDT (Gradient Boosting Decision Tree) + LR [1][2][3], FM (Factorization Machine) [2][7], and FFM (Field-aware Factorization Machine) [9].

Analysis

[Figure 1]
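
For reference, the standard second-order polynomial model, which is presumably what Figure 1 shows, can be written as below. The symbols $n$ (number of features), $\omega_0$ (bias) and $\omega_i$ (linear weights) are assumptions following common notation; only $x_i x_j$ and $\omega_{ij}$ appear in the surrounding text.

```latex
y(\mathbf{x}) = \omega_0 + \sum_{i=1}^{n} \omega_i x_i
  + \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \omega_{ij}\, x_i x_j ,
\qquad
\frac{\partial y}{\partial \omega_{ij}} = x_i x_j
```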

When a polynomial model is used, one-hot encoding makes the data sparse, and many feature values are 0. In the formula above, the derivative with respect to the weight $\omega_{ij}$ is $x_i x_j$, so $\omega_{ij}$ is only meaningful (it only receives a nonzero gradient) when $x_i x_j$ is nonzero. On sparse data very few samples have both features active at once, so most pairwise weights cannot be learned reliably. The FM algorithm was proposed to solve this problem.
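
A minimal sketch of how severe this is, using hypothetical toy data (a user field with 4 values and an item field with 5 values, neither of which comes from the original text). It counts how many pairwise weights would ever see a nonzero $x_i x_j$:

```python
import numpy as np

def one_hot(indices, depth):
    """Return a (len(indices), depth) one-hot matrix."""
    out = np.zeros((len(indices), depth))
    out[np.arange(len(indices)), indices] = 1.0
    return out

# Hypothetical toy data: 6 samples, each with one user id and one item id.
users = np.array([0, 1, 2, 3, 0, 1])
items = np.array([0, 1, 2, 3, 4, 0])
X = np.hstack([one_hot(users, 4), one_hot(items, 5)])  # shape (6, 9)

n = X.shape[1]
total_pairs = n * (n - 1) // 2  # number of pairwise weights w_ij with i < j

# co_occurrence[i, j] counts the samples in which x_i * x_j is nonzero.
co_occurrence = X.T @ X
upper = np.triu_indices(n, k=1)
trainable_pairs = int(np.count_nonzero(co_occurrence[upper]))

sparsity = 1.0 - np.count_nonzero(X) / X.size
print(f"zero entries in X: {sparsity:.0%}")
print(f"pairs with a nonzero gradient: {trainable_pairs} of {total_pairs}")  # 6 of 36
```

Only the (user, item) pairs that actually co-occur in a sample get any gradient; every other $\omega_{ij}$ stays at its initial value.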

FM Definition

The formula is as follows:
[Figure 2]
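
Assuming Figure 2 shows the usual second-order FM model, it can be written as below. Here $\mathbf{v}_i \in \mathbb{R}^k$ is the latent vector of feature $i$ and $k$ is the latent dimension (`hidden_num` in the code further down); these symbols are assumptions based on common notation rather than taken from the figure.

```latex
\hat{y}(\mathbf{x}) = \omega_0 + \sum_{i=1}^{n} \omega_i x_i
  + \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \langle \mathbf{v}_i, \mathbf{v}_j \rangle\, x_i x_j ,
\qquad
\langle \mathbf{v}_i, \mathbf{v}_j \rangle = \sum_{f=1}^{k} v_{i,f}\, v_{j,f}
```

The pairwise weight is replaced by an inner product of two latent vectors, so a pair of features can get a sensible interaction weight even if the two never appear together in training.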

Formula Improvement

The pairwise term is a sum over the upper triangle of the interaction matrix, which leads to the following derivation:
[Figure 3]
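
A sketch of that derivation in the usual notation (assumed symbols: $n$ features, latent dimension $k$, latent entries $v_{i,f}$). The upper-triangle sum equals half of the full sum minus half of the diagonal:

```latex
\begin{aligned}
\sum_{i=1}^{n-1}\sum_{j=i+1}^{n} \langle \mathbf{v}_i,\mathbf{v}_j\rangle x_i x_j
&= \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} \langle \mathbf{v}_i,\mathbf{v}_j\rangle x_i x_j
 - \frac{1}{2}\sum_{i=1}^{n} \langle \mathbf{v}_i,\mathbf{v}_i\rangle x_i x_i \\
&= \frac{1}{2}\sum_{f=1}^{k}\left( \sum_{i=1}^{n}\sum_{j=1}^{n} v_{i,f}\, v_{j,f}\, x_i x_j
 - \sum_{i=1}^{n} v_{i,f}^{2}\, x_i^{2} \right) \\
&= \frac{1}{2}\sum_{f=1}^{k}\left( \Big(\sum_{i=1}^{n} v_{i,f}\, x_i\Big)^{2}
 - \sum_{i=1}^{n} v_{i,f}^{2}\, x_i^{2} \right)
\end{aligned}
```

The last line costs $O(kn)$ instead of $O(kn^2)$, and it is the form that `interaction_part` computes in the code.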

FM Derivatives

[Figure 4]
[Figure 5]
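
For reference, the gradients of the standard FM model with respect to each parameter are given below, on the assumption that the figures show the usual result. The notation ($\omega_0$, $\omega_i$, $v_{i,f}$) is assumed from common usage.

```latex
\frac{\partial \hat{y}}{\partial \theta} =
\begin{cases}
1, & \theta = \omega_0 \\[2pt]
x_i, & \theta = \omega_i \\[2pt]
x_i \sum_{j=1}^{n} v_{j,f}\, x_j \;-\; v_{i,f}\, x_i^{2}, & \theta = v_{i,f}
\end{cases}
```

The sum $\sum_{j} v_{j,f}\, x_j$ does not depend on $i$, so it can be computed once per sample and reused for every $v_{i,f}$.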

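# -*- coding: utf-8 -*-
# @Time: 2020/11/5 17:38
# @Author: lizhe
# @File: FM.py
# @Email: bylz0213@gmail.com
import tensorflow.compat.v1 as tf1

# Placeholders only work in graph mode, so turn eager execution off under TensorFlow 2.x.
tf1.disable_eager_execution()


class FM:
    """Second-order factorization machine with a squared-error loss (TF1 graph mode)."""

    def __init__(self, feat_num, hidden_num):
        self.feat_num = feat_num      # number of input features n
        self.hidden_num = hidden_num  # dimension k of each latent vector
        # Inputs: a batch of feature rows and the matching regression targets.
        self.x = tf1.placeholder(dtype=tf1.float32, shape=[None, feat_num], name='input_x')
        self.y = tf1.placeholder(dtype=tf1.float32, shape=[None, 1], name='input_y')
        # Parameters: global bias, linear weights, and the n x k latent matrix.
        w_bias = tf1.get_variable(name="weight_bias", shape=[1], dtype=tf1.float32)
        w_linear = tf1.get_variable(name='linear_weight', shape=[feat_num], dtype=tf1.float32)
        w_h = tf1.get_variable(name="interaction_w", shape=(self.feat_num, self.hidden_num), dtype=tf1.float32)
        # Linear part: bias + sum_i w_i * x_i, shape [batch, 1].
        linear = w_bias + tf1.reduce_sum(tf1.multiply(self.x, w_linear), axis=-1, keepdims=True)
        # Pairwise part: 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i v_if^2 x_i^2], shape [batch, 1].
        interaction_part = 0.5 * tf1.reduce_sum(
            tf1.square(tf1.matmul(self.x, w_h)) - tf1.matmul(tf1.square(self.x), tf1.square(w_h)),
            axis=-1, keepdims=True)
        y_hat = linear + interaction_part
        # Mean squared error between targets and predictions.
        self.loss = tf1.reduce_mean(tf1.square(self.y - y_hat))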
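
The `interaction_part` expression above relies on the square-of-sum minus sum-of-squares identity. A small NumPy sketch can check it against an explicit double loop for a single sample; the function names `fm_pairwise_naive` and `fm_pairwise_fast` are hypothetical helpers, not part of the original file:

```python
import numpy as np

def fm_pairwise_naive(x, V):
    """Sum <v_i, v_j> * x_i * x_j over all pairs i < j with an explicit double loop."""
    n = len(x)
    total = 0.0
    for i in range(n - 1):
        for j in range(i + 1, n):
            total += np.dot(V[i], V[j]) * x[i] * x[j]
    return total

def fm_pairwise_fast(x, V):
    """Same quantity in O(k * n), mirroring interaction_part for one sample."""
    return 0.5 * np.sum(np.square(x @ V) - np.square(x) @ np.square(V))

rng = np.random.default_rng(0)
x = rng.normal(size=5)       # one sample with n = 5 features
V = rng.normal(size=(5, 3))  # latent matrix with k = 3
naive = fm_pairwise_naive(x, V)
fast = fm_pairwise_fast(x, V)
print(np.isclose(naive, fast))
```

The two results should agree up to floating-point error, which is why the TensorFlow code can skip the quadratic pair loop.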