MeanFlow: Mean Flows for One-step Generative Modeling
Prerequisites for this post: Flow Matching: Flow Matching for Generative Modeling. Or: Rectified Flow: ReFlow: Flow Straight and Fast: Le
2025-05-22 | Deep Learning | Diffusion, Flow, Flow Matching, Mean Flow

ReFlow: Flow Straight and Fast-Learning to Generate and Transfer Data with Rectified Flow
Paper: Flow Straight and Fast: Learning
2025-05-20 | Deep Learning | Diffusion, Flow, ReFlow

AlignSTS: Speech-to-Singing Conversion via Cross-Modal Alignment
Paper: AlignSTS: Speech-to-Singing Conversion via Cross-Mo
2025-05-13 | Deep Learning | STS, Audio

DDIM: Denoising Diffusion Implicit Models
Prerequisites for this post: DDPM: Denoising Diffusion Probabilistic Model. Paper: Denoising Diffu
2025-03-21 | Deep Learning | Diffusion, DDIM

RoPE / RoFormer: Enhanced Transformer with Rotary Position Embedding
This post covers the paper RoFormer: Enhanced Transformer with Rotary Pos
2025-02-12 | Deep Learning | RoPE, LLM

CLAP: Large-scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation
This post covers the paper Large-Sca
2025-02-07 | Deep Learning | Audio

EVA-GAN: Enhanced Various Audio Generation via Scalable Generative Adversarial Networks
Prerequisites for this post: HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis. EVA-
2025-01-17 | Deep Learning | Audio, SVS, Vocoder, TTS

HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis
This post covers the paper HiFi-GAN: Generative Adve
2025-01-03 | Deep Learning | Audio, SVS, Vocoder, TTS

DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism
Prerequisites for this post: DDPM: Denoising Diffusion Probabilistic Model. DiffSinger: Singing Voice Synthesis via Shallow Diffusion Me
2024-10-18 | Deep Learning | Audio, SVS, DDPM, Diffusion

DDPM: Denoising Diffusion Probabilistic Model
2025.03.17: Updated the description in the Reverse Process section. DDPM Overview DDPM: Denoising Diffusio
2024-10-07 | Deep Learning | DDPM, Diffusion

Introduction: Vector Quantization
An AutoEncoder (AE) consists of an Encoder and a Decoder. The Encoder compresses an image into a low-dimensional latent vector (Latent), which is then
2024-07-16 | Deep Learning | VQ, VQ-VAE, VQ-GAN

Multimodal Large Language Model Summary
Prerequisites for this post: Vision & Language Pretrained Model Summary. 2025.05.06: At the request of commenters, added the Qwen-VL series (Qwen-VL, Qwen2-VL, Qwen2.5-VL). Mu
2024-07-03 | Deep Learning | MLLM, MM