Distilling the Knowledge in a Neural Network
This post is a reading note on, and my personal understanding of, the paper Distilling the Knowledge in a Neural Network. Basic Idea: Existing machine learning …
2021-07-03
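For context on the post's topic, the sketch below illustrates the paper's soft-target idea: a small student network is trained to match the teacher's temperature-softened output distribution while also fitting the hard labels. This is only a minimal illustration, not code from the post; the PyTorch framing, the function name distillation_loss, and the default values of T and alpha are my own assumptions.

```python
# Minimal sketch of the soft-target distillation loss (Hinton et al.).
# Assumptions: PyTorch; student_logits/teacher_logits are raw class logits;
# T and alpha are illustrative defaults, not values from the post.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    # Soft targets: both distributions are softened by temperature T,
    # and the KL term is scaled by T^2 to keep gradient magnitudes comparable.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: standard cross-entropy against the ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```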
Knowledge Distillation: Distilling the Knowledge in a Neural Network
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
UniLM: Unified Language Model Pre-training for Natural Language Understanding and Generation
MASS: Masked Sequence to Sequence Pre-training for Language Generation
SpanBERT: Improving Pre-training by Representing and Predicting Spans
StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding
BART and mBART
GAKE: Graph Aware Knowledge Embedding
HAKE: Learning Hierarchy-Aware Knowledge Graph Embeddings for Link Prediction
ReInceptionE: Relation-Aware Inception Network with Joint Local-Global Structural Information for KGE
KBAT: Learning Attention-based Embeddings for Relation Prediction in KGs
CompGCN: Composition-based Multi-Relational Graph Convolutional Networks