Implementing a Text-to-Speech synthesis model on a Raspberry Pi for Industrial Applications - TUdományos DOkumentumok Közös Keresője

Implementing a Text-to-Speech synthesis model on a Raspberry Pi for Industrial Applications

Metadata

Content:	http://hdl.handle.net/10890/40715
Archive:	Műegyetem Digitális Archívum
Collection:	1. Scientific papers, publications; Conference collections; 1st Workshop on Intelligent Infocommunication Networks, Systems and Services, 2023 Workshop on Intelligent Infocommunication Networks, Systems and Services
Title:	Implementing a Text-to-Speech synthesis model on a Raspberry Pi for Industrial Applications
Creator:	Mandeel, Ali Raheem; Aggar, Ammar Abdullah; Al-Radhi, Mohammed Salah; Csapó, Tamás Gábor
Date:	2023-03-13T16:07:10Z; 2023-03-13T16:07:10Z; 2023
Description:	Text-to-Speech (TTS) produces human-like speech from input text. It has recently gained prominence through the application of deep neural networks. End-to-end TTS models now produce highly natural synthesized speech but require extremely high computational resources. Deploying such high-quality TTS models in a real-time environment has been a challenging problem due to the limited resources of embedded systems and cell phones. This paper demonstrates the implementation of an end-to-end TTS model (FastSpeech 2) on an embedded device (Raspberry Pi 4 B+). The objective experimental results show that the TTS model is compatible with the Raspberry Pi, delivering high-quality synthesized speech and acceptable processing speed. Used together with a caching mechanism, our proposed model could serve many real-life applications, such as railway announcements and industrial purposes.
Language:	English
Type:	Conference paper
Format:	application/pdf
Identifier:	http://hdl.handle.net/10890/40715