Lyrics Transcription

Created: Jul 1, 2025 - Last updated: Jul 1, 2025

Evergreen 🌳

phd singing-voice

I explore automatic lyrics transcription (ALT), the task of transcribing lyrics directly from sung audio, with a focus on multilingual coverage, musical accompaniment, and real-world variability in vocal delivery. My research investigates architectural choices (CRNNs, Transformers, and Wav2Vec2-based models), training objectives (CTC, seq2seq, and hybrid losses), and the data-centric challenges of building robust ALT systems.
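
As a rough illustration of how a CTC-trained model's frame-level output turns into text, here is a minimal sketch of greedy CTC decoding: merge consecutive repeated labels, then drop the blank symbol. It is not taken from any of the systems mentioned above. The `_` blank symbol, the function name `ctc_greedy_collapse`, and the toy input are assumptions made for illustration only.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol; real systems usually use a reserved token id


def ctc_greedy_collapse(frame_labels, blank=BLANK):
    """Collapse per-frame labels the CTC way: merge repeats, then remove blanks."""
    # groupby yields one key per run of identical consecutive labels
    merged = [label for label, _ in groupby(frame_labels)]
    return "".join(label for label in merged if label != blank)


# Toy per-frame argmax output for a sung "hello" with held notes (made-up data)
frames = list("__hh_e_ll_ll_oo__")
print(ctc_greedy_collapse(frames))
```

The blank between the two runs of `l` is what lets the double letter survive the merge step. Sustained vowels in singing produce long runs of the same label, which the same merge step absorbs.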