Introduction: Variational Auto-Encoder
2021-07-09 · Deep Learning, VAE
The Variational Auto-Encoder (VAE) is a deep generative model built on the auto-encoder architecture. This post does not explore the VAE's deeper mathematical foundations…

Knowledge Distillation: Distilling the Knowledge in a Neural Network
2021-07-03 · Deep Learning, KD
Reading notes and personal understanding of the paper Distilling the Knowledge in a Neural Network. Basic Idea: existing machine le…

ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
2021-06-29 · Deep Learning, NLP, BERT
Prerequisite: BERT (see the post "ELMo, GPT, BERT"). Reading notes on the paper AL…

UniLM: Unified Language Model Pre-training for Natural Language Understanding and Generation
2021-06-18 · Deep Learning, NLP, BERT
Prerequisite: BERT (see the post "ELMo, GPT, BERT").

MASS: Masked Sequence to Sequence Pre-training for Language Generation
2021-06-08 · Deep Learning, NLP, BERT
Prerequisites: BERT (see the post "ELMo, GPT, BERT"); Transformer (see the post "Transformer精讲").

SpanBERT: Improving Pre-training by Representing and Predicting Spans
2021-05-13 · Deep Learning, NLP, BERT
Prerequisite: BERT (see the post "ELMo, GPT, BERT"). This post is reading notes on the paper SpanBERT:…

StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding
2021-05-04 · Deep Learning, NLP
Prerequisite: BERT (see the post "ELMo, GPT, BERT").

BART and mBART
2021-04-26 · Deep Learning, NLP
Prerequisites: Transformer (see the post "Transformer精讲"); BERT and GPT (see the post "ELMo, GPT, BERT"). Reading notes and personal understanding of the following papers: BART: Denoising…

PyTorch Implementation: BERT
2021-03-12 · Deep Learning, NLP, Transformer, Pytorch
Prerequisites: Transformer (see the post "Transformer精讲"); BERT (see the post "ELMo, GPT, BERT"); the post "Pytorch实现: Transformer". 2022.04.03: corrected the claim that Pre Norm outperforms Post No…

Introduction: Graph Neural Network
2021-03-04 · Deep Learning, GNN
Prerequisite: basic knowledge of graph structures (data-structures material; review on your own). 2021.04.06: updated the discussion of GraphSAGE. This post presents introductory GNN material; …

ConvBERT: Improving BERT with Span-based Dynamic Convolution
2021-02-12 · Deep Learning, NLP, CNN, Attention
Prerequisite: Light Weight Convolution (see the post "基于轻量级卷积和动态卷积替代的注意力机制").

基于轻量级卷积和动态卷积替代的注意力机制 (Replacing Attention with Lightweight and Dynamic Convolutions)
2020-12-05 · Deep Learning, CNN, Attention
Prerequisites: Depthwise Convolution (see the post "深度可分离卷积与分组卷积"); Attention (see the post "Seq2Seq和Attention"); Transformer (see the post "Transformer精讲"). This post is reading notes on the paper PA…