Prerequisites for this post: Flow Matching: Flow Matching for Generative Modeling. TCSinger: Zero-Shot Singing Voi
2025-08-07
TCSinger 2: Customizable Multilingual Zero-shot Singing Voice Synthesis
TCSinger: Zero-Shot Singing Voice Synthesis with Style Transfer and Multi-Level Style Control
Flow Matching: Flow Matching for Generative Modeling
MeanFlow: Mean Flows for One-step Generative Modeling
ReFlow: Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow
AlignSTS: Speech-to-Singing Conversion via Cross-Modal Alignment
DDIM: Denoising Diffusion Implicit Models
RoPE / RoFormer: Enhanced Transformer with Rotary Position Embedding
CLAP: Large-scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation
EVA-GAN: Enhanced Various Audio Generation via Scalable Generative Adversarial Networks
Whisper: Robust Speech Recognition via Large-Scale Weak Supervision
HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis