MeanFlow: Mean Flows for One-step Generative Modeling
Prerequisites for this post: Flow Matching: Flow Matching for Generative Modeling. Or: Rectified Flow: ReFlow: Flow Straight and Fast: Le
2025-05-22 | Deep Learning | Diffusion Flow Flow Matching Mean Flow

ReFlow: Flow Straight and Fast-Learning to Generate and Transfer Data with Rectified Flow
Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow. Paper: Flow Straight and Fast: Learning
2025-05-20 | Deep Learning | Diffusion Flow ReFlow

AlignSTS: Speech-to-Singing Conversion via Cross-Modal Alignment
AlignSTS: Speech-to-Singing Conversion via Cross-Modal Alignment. Paper: AlignSTS: Speech-to-Singing Conversion via Cross-Mo
2025-05-13 | Deep Learning | STS Audio

DDIM: Denoising Diffusion Implicit Models
Prerequisites for this post: DDPM: DDPM: Denoising Diffusion Probabilistic Model. Denoising Diffusion Implicit Models. Paper: Denoising Diffu
2025-03-21 | Deep Learning | Diffusion DDIM

RoPE / RoFormer: Enhanced Transformer with Rotary Position Embedding
RoPE / RoFormer: Enhanced Transformer with Rotary Position Embedding. This post is on the paper RoFormer: Enhanced Transformer with Rotary Pos
2025-02-12 | Deep Learning | RoPE LLM

CLAP: Large-scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation
Large-scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation. This post is on the paper Large-Sca
2025-02-07 | Deep Learning | Audio

EVA-GAN: Enhanced Various Audio Generation via Scalable Generative Adversarial Networks
Prerequisites for this post: HiFi-GAN: HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis. EVA-
2025-01-17 | Deep Learning | Audio SVS Vocoder TTS

Whisper: Robust Speech Recognition via Large-Scale Weak Supervision
Robust Speech Recognition via Large-Scale Weak Supervision. This post is on the paper Robust Speech Recognition via Large-Scale Weak Supervisio
2025-01-14 | DaNing | Audio ASR

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis. This post is on the paper HiFi-GAN: Generative Adve
2025-01-03 | Deep Learning | Audio SVS Vocoder TTS

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism
Prerequisites for this post: DDPM: Denoising Diffusion Probabilistic Model. DiffSinger: Singing Voice Synthesis via Shallow Diffusion Me
2024-10-18 | Deep Learning | Audio SVS DDPM Diffusion

DDPM: Denoising Diffusion Probabilistic Model
2025.03.17: Updated the description in Reverse Process. DDPM: Denoising Diffusion Probabilistic Model. DDPM Overview. DDPM: Denoising Diffusio
2024-10-07 | Deep Learning | DDPM Diffusion

PyTorch Implementation: VQ-VAE
Prerequisites for this post: VQ basics: Introduction: Vector Quantization Vector Quantization. PyTorch Implementation: VQ-VAE. This post is a PyTorch implementation of VQ-VAE, and
2024-07-28 | DaNing | VQ-VAE Pytorch