DaNing
SpanBERT: Improving Pre-training by Representing and Predicting Spans
Prerequisites: BERT (see ELMo, GPT, BERT). This post is a reading note on the paper SpanBERT…
2021-05-13
StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding
Prerequisites: BERT (see ELMo, GPT, BERT). …
2021-05-04
BART and mBART
Prerequisites: Transformer (see the Transformer deep-dive post); BERT, GPT (see ELMo, GPT, BERT). This post contains reading notes and my own understanding of the following papers: BART: Denoising…
2021-04-26
PyTorch Implementation: BERT
Prerequisites: basic PyTorch operations; Transformer (see the Transformer deep-dive post); BERT (see ELMo, GPT, BERT); PyTorch Implementation: Transformer. 2022.04.03: corrected the claim that Pre Norm works better than Post Norm…
2021-03-12
Introduction: Graph Neural Network
Prerequisites: basic knowledge of graph structures (standard data-structures material; look it up as needed). 2021.04.06: updated the explanation of GraphSAGE. This post introduces entry-level knowledge about GNNs…
2021-03-04
ConvBERT: Improving BERT with Span-based Dynamic Convolution
Prerequisites: Lightweight Convolution (see Replacing Attention with Lightweight and Dynamic Convolutions).
2021-02-12
基于轻量级卷积和动态卷积替代的注意力机制 基于轻量级卷积和动态卷积替代的注意力机制
本文前置知识: Depthwise Convolution: 详见深度可分离卷积与分组卷积. Attention: 详见Seq2Seq和Attention. Transformer: 详见Transformer精讲. 本文是论文PA
2020-12-05
Depthwise Separable Convolution and Grouped Convolution
Prerequisites: CNN (see the CNN overview post). This post focuses on two operations, depthwise separable convolution and grouped convolution. Depthwise separable convolution is used in MobileNet and Xception… (a minimal sketch of the operation follows below)
2020-11-26
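The depthwise/pointwise factorization this post is about can be shown in a few lines of PyTorch. The sketch below is only an illustration under my own naming and shapes (the class DepthwiseSeparableConv and the 32→64 channel sizes are made up for the example), not the code from the post:

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise separable convolution: a per-channel (depthwise) conv
    followed by a 1x1 (pointwise) conv that mixes channels."""
    def __init__(self, in_channels, out_channels, kernel_size=3, stride=1):
        super().__init__()
        # groups=in_channels makes each filter see exactly one input channel
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size,
                                   stride=stride, padding=kernel_size // 2,
                                   groups=in_channels, bias=False)
        # the 1x1 convolution then recombines information across channels
        self.pointwise = nn.Conv2d(in_channels, out_channels, 1, bias=False)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))

x = torch.randn(1, 32, 56, 56)        # (batch, channels, height, width)
block = DepthwiseSeparableConv(32, 64)
print(block(x).shape)                  # torch.Size([1, 64, 56, 56])
```

Setting groups equal to the number of input channels is what turns an ordinary Conv2d into the depthwise step; grouped convolution is the same idea with a group count between 1 and the channel count.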
PyTorch Implementation: Transformer
Prerequisites: basic PyTorch operations; Transformer (see the Transformer deep-dive post). 2022.04.03: removed the statement that Pre Norm works better than Post Norm (both orderings are sketched below). This post is T…
2020-11-23
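The 2022.04.03 update notes on the two PyTorch implementation posts above concern where LayerNorm sits relative to the residual connection. A minimal sketch of the two orderings, using illustrative names (PreNormBlock, PostNormBlock) and dimensions rather than the posts' actual code:

```python
import torch
import torch.nn as nn

class PostNormBlock(nn.Module):
    """Post-Norm (original Transformer): LayerNorm applied after the residual add."""
    def __init__(self, dim, sublayer):
        super().__init__()
        self.sublayer = sublayer
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):
        return self.norm(x + self.sublayer(x))

class PreNormBlock(nn.Module):
    """Pre-Norm: LayerNorm applied to the sublayer input, residual kept un-normalized."""
    def __init__(self, dim, sublayer):
        super().__init__()
        self.sublayer = sublayer
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):
        return x + self.sublayer(self.norm(x))

x = torch.randn(2, 10, 64)
ffn = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 64))
print(PostNormBlock(64, ffn)(x).shape, PreNormBlock(64, ffn)(x).shape)
```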
PyTorch Implementation: Skip-Gram
Prerequisites: basic PyTorch operations; Word2Vec. This post implements Skip-Gram, one of the Word2Vec variants, in PyTorch (a rough sketch follows below). The implementation references the tutorial PyTorch 实现 Word2Ve…
2020-11-19
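As a rough idea of what a Skip-Gram model looks like in PyTorch (the class name, embedding size, and the binary cross-entropy scoring below are my own placeholders, not the post's code):

```python
import torch
import torch.nn as nn

class SkipGram(nn.Module):
    """Minimal Skip-Gram: score (center, context) word pairs with the dot
    product of two embedding tables."""
    def __init__(self, vocab_size, embed_dim=100):
        super().__init__()
        self.in_embed = nn.Embedding(vocab_size, embed_dim)    # center-word vectors
        self.out_embed = nn.Embedding(vocab_size, embed_dim)   # context-word vectors

    def forward(self, center, context):
        # higher score when the pair co-occurs in the corpus
        v = self.in_embed(center)       # (batch, dim)
        u = self.out_embed(context)     # (batch, dim)
        return (v * u).sum(dim=-1)      # (batch,) dot-product logits

model = SkipGram(vocab_size=5000)
center = torch.tensor([1, 2, 3])
pos_ctx = torch.tensor([4, 5, 6])
# positive pairs get target 1; negative-sampled pairs would get target 0
loss = nn.functional.binary_cross_entropy_with_logits(
    model(center, pos_ctx), torch.ones(3))
print(loss.item())
```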
RoBERTa: A Robustly Optimized BERT Pretraining Approach
Prerequisites: BERT (see ELMo, GPT, BERT). This post covers the paper RoBERTa…
2020-11-18
Transformer-XL and XLNet
Prerequisites: Transformer (Masked Self-Attention and FFN); BERT (for comparison with XLNet); Seq2Seq (AutoRegressive & AutoEncoding). 2020.10.2…
2020-10-14