DDIM: Denoising Diffusion Implicit Models
Prerequisite for this post: DDPM (DDPM: Denoising Diffusion Probabilistic Model). Denoising Diffusion Implicit Models. Paper: Denoising Diffu...
2025-03-21 | Deep Learning | Diffusion, DDIM

RoPE / RoFormer: Enhanced Transformer with Rotary Position Embedding
This post covers the paper RoFormer: Enhanced Transformer with Rotary Pos...
2025-02-12 | Deep Learning | RoPE, LLM

CLAP: Large-scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation
This post covers the paper Large-Sca...
2025-02-07 | Deep Learning | Audio

EVA-GAN: Enhanced Various Audio Generation via Scalable Generative Adversarial Networks
Prerequisite for this post: HiFi-GAN (HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis). EVA-...
2025-01-17 | Deep Learning | Audio, SVS, Vocoder, TTS

Whisper: Robust Speech Recognition via Large-Scale Weak Supervision
This post covers the paper Robust Speech Recognition via Large-Scale Weak Supervisio...
2025-01-14 | DaNing | Audio, ASR

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
This post covers the paper HiFi-GAN: Generative Adve...
2025-01-03 | Deep Learning | Audio, SVS, Vocoder, TTS