RoPE / RoFormer: Enhanced Transformer with Rotary Position Embedding
This post covers the paper RoFormer: Enhanced Transformer with Rotary Pos…
2025-02-12
CLAP: Large-scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation
This post covers the paper Large-Sca…
2025-02-07
EVA-GAN: Enhanced Various Audio Generation via Scalable Generative Adversarial Networks
Prerequisite: HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis. EVA-…
2025-01-17
Whisper: Robust Speech Recognition via Large-Scale Weak Supervision
This post covers the paper Robust Speech Recognition via Large-Scale Weak Supervisio…
2025-01-14 DaNing
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
This post covers the paper HiFi-GAN: Generative Adve…
2025-01-03
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism
Prerequisite: DDPM: Denoising Diffusion Probabilistic Model. DiffSinger: Singing Voice Synthesis via Shallow Diffusion Me…
2024-10-18
DDPM: Denoising Diffusion Probabilistic Model
DDPM Overview. DDPM: Denoising Diffusion Probabilistic Models. Diffusion probabilistic models (Diffu…
2024-10-07
PyTorch Implementation: VQ-VAE
Prerequisite: VQ basics, see Introduction: Vector Quantization. This post is a PyTorch implementation of VQ-VAE, and…
2024-07-28 DaNing
Introduction: Vector Quantization
An AutoEncoder (AE) consists of an Encoder and a Decoder. The Encoder compresses an image into a low-dimensional latent vector (Latent), which is then…
2024-07-16
Multimodal Large Language Model Summary
Prerequisite: Vision & Language Pretrained Model Summary. MLLMs have been advancing so quickly lately that I really had to get a blog post out…
2024-07-03
Universal Information Extraction (Part 2) - UniEX, Mirror, RexUIE
Prerequisite: Universal Information Extraction (Part 1) - UIE, USM, InstructUIE. This post is the second part of an introduction to classic models in universal information extraction, and will cover UniEX, Mirror…
2024-05-29
Universal Information Extraction (Part 1) - UIE, USM, InstructUIE
2024.5.27: Added a short note on MetaRetriever, one of the improved versions of UIE. Prerequisite: T5: Exploring the Limits of Transfer Learning with a Unified Text-to…
2024-01-21