Dysphonia detection using a fully convolutional neural network adapted to dynamic speech lengths - TUdományos DOkumentumok Közös Keresője

Dysphonia detection using a fully convolutional neural network adapted to dynamic speech lengths

Metadata

Content:	http://hdl.handle.net/10890/54982
Archive:	Műegyetem Digitális Archívum
Collection:	1. Scientific papers, publications / Conference collections / 2nd Workshop on Intelligent Infocommunication Networks, Systems and Services, 2024
Title:	Dysphonia detection using a fully convolutional neural network adapted to dynamic speech lengths
Creator:	Aziz, Dosti; Sztahó, Dávid
Date:	2024-02-26T15:42:02Z; 2024
Description:	Various conditions affect human speech, causing individuals to produce speech that differs in pitch, quality, and clarity. Numerous deep learning (DL) methods have been proposed to detect speech disorders. Speech-based DL detection methods require fixed-length input, which can be difficult to obtain even for normal speakers, particularly with continuous speech rather than sustained vowels or one-word utterances. In this paper, we propose a fully convolutional approach to dysphonia detection. Unlike previous work that relies on fixed-length samples, the proposed method can accommodate any speech duration: the model consists exclusively of convolutional layers, with no fully connected layers, which enables it to handle varying speech lengths. Our results show that the proposed model outperforms other DL approaches that used the same dataset for dysphonia detection. Specifically, it achieved an accuracy of 91.69%, an improvement of over 6% compared with previous methods.
Language:	English
Type:	Conference paper
Format:	application/pdf
Identifier:	http://hdl.handle.net/10890/54982
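
The description above states that the model uses only convolutional layers, with no fully connected layers, so that it can accept speech of any duration. The following is a minimal sketch of why that works, not the authors' architecture: the class name TinyFCN, the layer sizes, the random untrained weights, and the use of global average pooling over time are all assumptions made for illustration. A convolution slides over however many frames it is given, and averaging the per-frame outputs collapses the time axis to a single score regardless of input length.

```python
import math
import random


def conv1d(x, kernels, biases, relu=True):
    """Valid 1-D convolution over time.

    x is [channels][frames]; kernels is [out_channels][in_channels][width].
    The output has frames - width + 1 time steps, whatever frames is.
    """
    n_frames = len(x[0])
    width = len(kernels[0][0])
    out = []
    for kernel, bias in zip(kernels, biases):
        row = []
        for start in range(n_frames - width + 1):
            total = bias
            for channel, taps in zip(x, kernel):
                for j, tap in enumerate(taps):
                    total += tap * channel[start + j]
            row.append(max(0.0, total) if relu else total)
        out.append(row)
    return out


class TinyFCN:
    """Hypothetical two-layer fully convolutional classifier (untrained)."""

    def __init__(self, n_features=4, hidden=3, width=5, seed=0):
        rng = random.Random(seed)

        def init(n_out, n_in, k):
            return [[[rng.uniform(-0.5, 0.5) for _ in range(k)]
                     for _ in range(n_in)] for _ in range(n_out)]

        self.width = width
        self.w1, self.b1 = init(hidden, n_features, width), [0.0] * hidden
        # A width-1 convolution plays the role a dense layer would otherwise
        # play, but is applied independently at every time step.
        self.w2, self.b2 = init(1, hidden, 1), [0.0]

    def predict(self, x):
        if len(x[0]) < self.width:
            raise ValueError("input must have at least `width` frames")
        hidden = conv1d(x, self.w1, self.b1)
        frame_logits = conv1d(hidden, self.w2, self.b2, relu=False)[0]
        # Global average pooling removes the time axis, so the output size
        # does not depend on the utterance length.
        pooled = sum(frame_logits) / len(frame_logits)
        return 1.0 / (1.0 + math.exp(-pooled))


model = TinyFCN()
rng = random.Random(1)
for n_frames in (20, 57, 300):
    # Stand-in acoustic features: 4 channels of random values per frame.
    features = [[rng.uniform(-1, 1) for _ in range(n_frames)] for _ in range(4)]
    print(n_frames, round(model.predict(features), 3))
```

The same model object scores all three inputs without padding or cropping. A fully connected layer placed after the convolutions would instead fix the expected number of frames, which is the constraint the description says the proposed method avoids.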