International Journal of Scientific Research and Engineering Development (International Peer Reviewed Open Access Journal), ISSN (Online): 2581-7175

📑 Paper Information
| 📑 Paper Title | Affective State Representation and Prosodic Modulation Strategies in Deep Neural–Based Speaker and Voice Cloning Systems |
| 👤 Authors | Chunduri Raghavendra, Chirumamilla Praveen Kumar, Kola Sri Venkata Sai Manichand, Borra Naga Siva Shankar Vara Prasad, Kurra Leela Mahesh |
| 📘 Published Issue | Volume 9 Issue 2 |
| 📅 Year of Publication | 2026 |
| 🆔 Unique Identification Number | IJSRED-V9I2P154 |
📝 Abstract
Neural voice cloning models now achieve high speaker similarity, yet most still produce emotionally flat speech that fails to convey natural expression. This paper presents an emotion-sensitive voice-cloning model based on variational representation learning that combines affective-state modeling with prosodic control. The proposed model uses a speaker encoder to preserve speaker identity, a reference encoder to extract emotional prosodic features, and a variational autoencoder to disentangle speaker and emotional attributes. The extracted features are fed into Tacotron and FastSpeech models, and the output is synthesized with neural vocoders. Experiments on the IEMOCAP, CREMA-D, and VCTK corpora show improvements in both emotional expressiveness and speaker similarity scores. Objective evaluation shows gains of 17.8% in pitch variability and 21.3% in energy expression over baseline models, while mean opinion scores (MOS) for perceived emotional naturalness rise from 3.61 to 4.18. The results indicate that affective representation and prosody modeling substantially improve the realism of neural voice cloning models.
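To make the objective figures above concrete, the following is a minimal sketch of how a relative gain in pitch variability might be computed. The abstract does not define the metric, so the definition used here (standard deviation of F0 in semitones over voiced frames) is an assumption, as are the function names and the toy F0 contours, which are illustrative values rather than data from the paper.

```python
import math
import statistics


def voiced(f0_hz):
    """Keep only voiced frames; unvoiced frames are assumed to be marked as 0."""
    return [f for f in f0_hz if f > 0]


def pitch_variability(f0_hz):
    """Assumed metric: population std of F0 in semitones around the median F0."""
    frames = voiced(f0_hz)
    ref = statistics.median(frames)
    semitones = [12.0 * math.log2(f / ref) for f in frames]
    return statistics.pstdev(semitones)


def relative_gain(baseline, proposed):
    """Percentage change of `proposed` relative to `baseline`."""
    return 100.0 * (proposed - baseline) / baseline


# Hypothetical F0 contours in Hz (not taken from the paper).
flat_f0 = [0, 118, 120, 121, 119, 0, 120, 122]
expressive_f0 = [0, 105, 120, 140, 128, 0, 112, 150]

pitch_gain = relative_gain(
    pitch_variability(flat_f0), pitch_variability(expressive_f0)
)

# The two opinion scores quoted in the abstract, expressed as a relative gain.
score_gain = relative_gain(3.61, 4.18)
```

The same `relative_gain` helper applies to an energy-based measure. Applied to the reported opinion scores, it gives a relative improvement of roughly 15.8%.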
📝 How to Cite
Chunduri Raghavendra, Chirumamilla Praveen Kumar, Kola Sri Venkata Sai Manichand, Borra Naga Siva Shankar Vara Prasad, Kurra Leela Mahesh, "Affective State Representation and Prosodic Modulation Strategies in Deep Neural–Based Speaker and Voice Cloning Systems," International Journal of Scientific Research and Engineering Development, V9(2): Page(991-994) Mar-Apr 2026. ISSN: 2581-7175. www.ijsred.com. Published by Scientific and Academic Research Publishing.
