Prerequisites for this post: BERT: see ELMo, GPT, BERT. StructBERT: Incorporating Language Structures into Pre-training for Deep Language U
2021-05-04
StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding
BART and mBART
PyTorch Implementation: BERT
Introduction: Graph Neural Network
ConvBERT: Improving BERT with Span-based Dynamic Convolution
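Replacing the Attention Mechanism with Lightweight and Dynamic Convolutions
Depthwise Separable Convolution and Grouped Convolution
PyTorch Implementation: Transformer
PyTorch Implementation: Skip-Gram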
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Transformer-XL and XLNet
ELMo, GPT, BERT