Attention
An attention mechanism lets the model focus on the relevant parts of the input sequence as needed
https://jalammar.github.io/visualizing-neural-machine-translation-mechanics-of-seq2seq-models-with-attention/
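As a quick illustration (my own sketch, not taken from the linked posts), here is minimal NumPy code for scaled dot-product attention, the core operation these visualizations walk through; the function name and toy shapes are illustrative:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Minimal scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax -> attention weights
    return weights @ V                                # weighted sum of the values

# Toy example: 3 query positions attending over 4 key/value positions, d_k = 8.
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 8)), rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)   # (3, 8)
```

Each output row is a convex combination of the value vectors, with weights determined by how well the corresponding query matches each key; this is the "focus on relevant parts" behavior described above.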
Transformer
http://jalammar.github.io/illustrated-transformer/
Loss functions
Cross-entropy https://colah.github.io/posts/2015-09-Visual-Information/
Relative entropy (KL divergence) https://www.countbayesie.com/blog/2017/5/9/kullback-leibler-divergence-explained
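A small numeric sketch (made-up numbers, not from the linked posts) tying the two together: relative entropy is exactly cross-entropy minus the entropy of the true distribution, D_KL(p || q) = H(p, q) - H(p).

```python
import numpy as np

p = np.array([0.5, 0.25, 0.25])   # "true" distribution
q = np.array([0.7, 0.2, 0.1])     # model's predicted distribution

entropy       = -np.sum(p * np.log2(p))       # H(p)
cross_entropy = -np.sum(p * np.log2(q))       # H(p, q)
kl_divergence =  np.sum(p * np.log2(p / q))   # D_KL(p || q)

# KL divergence is the gap between cross-entropy and entropy.
assert np.isclose(kl_divergence, cross_entropy - entropy)
print(entropy, cross_entropy, kl_divergence)  # 1.5, ~1.668, ~0.168 bits
```

This is why minimizing cross-entropy against a fixed target distribution is equivalent to minimizing the KL divergence: H(p) is a constant, so only the gap term depends on the model.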
Follow-up works to Attention Is All You Need:
- Depthwise Separable Convolutions for Neural Machine Translation
- One Model To Learn Them All
- Discrete Autoencoders for Sequence Models
- Generating Wikipedia by Summarizing Long Sequences
- Image Transformer
- Training Tips for the Transformer Model
- Self-Attention with Relative Position Representations
- Fast Decoding in Sequence Models using Discrete Latent Variables
- Adafactor: Adaptive Learning Rates with Sublinear Memory Cost