A Hybrid Algorithm for Robust Pitch Estimation in Emotional Speech Synthesis

  • Metadata
Content: http://hdl.handle.net/10890/58920
Archive: Műegyetem Digitális Archívum
Collection: 1. Scientific papers, publications
Conference collections
Workshop on Intelligent Infocommunication Networks, Systems and Services
3rd Workshop on Intelligent Infocommunication Networks, Systems and Services, 2025
Title:
A Hybrid Algorithm for Robust Pitch Estimation in Emotional Speech Synthesis
Creator:
Zineb, Hammadi
Al-Radhi, Mohammed Salah
Date:
2025-02-20T13:52:05Z
2025-02-20T13:52:05Z
2025
Description:
Emotional intelligence in synthetic speech remains a critical challenge in human-machine interaction, despite significant advances in speech synthesis naturalness and intelligibility. Current systems struggle to accurately capture the nuanced emotional expressions characteristic of human speech, including rapid pitch transitions, wide frequency variations, and irregular vibrato patterns. While pitch estimation algorithms like PESTO and FCPE have proven effective for standard speech, their performance on emotional content remains largely unexplored. We present ESCAPE (Emotion Self-Supervised Context-Aware Pitch Estimation), a novel algorithm specifically designed for emotional speech processing. ESCAPE synthesizes PESTO's precise frequency variation handling with FCPE's context-aware processing through a hybrid architecture that achieves robust pitch tracking in expressive vocal content. Our approach maintains computational efficiency while excelling at capturing complex acoustic patterns unique to emotional utterances. This paper provides the first comprehensive evaluation of PESTO and FCPE on emotional speech datasets and introduces ESCAPE as a transformative solution for pitch estimation in emotionally expressive speech synthesis. Our results demonstrate significant progress toward bridging the gap between human-like emotional expression and machine-generated speech, marking an important advancement in emotional speech synthesis technology.
Language:
English
Type:
Book chapter
Format:
application/pdf
Identifier: