Aggregation Strategies of Wav2vec 2.0 Embeddings for Computational Paralinguistic Tasks |
Content: | https://real.mtak.hu/203111/ |
---|---|
Archive: | REAL |
Collection: |
Status = Published
Subject = P Language and Literature: P0 Philology. Linguistics
Subject = Q Science: QA Mathematics: QA75 Electronic computers. Computer science
Type = Article
Subject = Q Science: QA Mathematics: QA76.76 Software Design and Development |
Title: |
Aggregation Strategies of Wav2vec 2.0 Embeddings for Computational Paralinguistic Tasks
|
Creator: |
Kiss-Vetráb, Mercedes
Gosztolya, Gábor
|
Date: |
2023
|
Subject: |
P0 Philology. Linguistics
QA75 Electronic computers. Computer science
QA76.76 Software Design and Development
|
Description: |
Throughout the history of computational paralinguistics, numerous feature extraction, preprocessing and classification techniques have been used. One of the important challenges in this subfield of speech technology is handling utterances of different durations. Since standard speech processing features (such as filter banks or DNN embeddings) are typically frame-level ones and we would like to classify whole utterances, a set of frame-level features has to be converted into fixed-sized utterance-level features. The choice of this aggregation method is often overlooked, and simple functions like mean and/or standard deviation are used without solid experimental support. In this study we take wav2vec 2.0 deep embeddings and aggregate them with 11 different functions. We sought to obtain a subset of potentially optimal aggregation functions, because there are no general rules yet that can be applied universally across subtopics. Besides testing both standard and non-traditional aggregation strategies individually, we also combined them to improve the classification performance. By using multiple aggregation functions, we were able to achieve significant improvements on three public paralinguistic corpora.
|
Language: |
English
|
Type: |
Article
PeerReviewed
info:eu-repo/semantics/article
|
Format: |
text
|
Identifier: |
Kiss-Vetráb, Mercedes and Gosztolya, Gábor (2023) Aggregation Strategies of Wav2vec 2.0 Embeddings for Computational Paralinguistic Tasks. LECTURE NOTES IN COMPUTER SCIENCE, 14338. pp. 79-93. ISSN 0302-9743
|
Relation: |
MTMT:34511444 10.1007/978-3-031-48309-7_7
|
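
The aggregation step described in the abstract, turning a variable number of frame-level vectors into one fixed-size utterance-level vector, can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not list the 11 functions used in the paper, so the four aggregators below, the `aggregate` helper, and the toy 3-dimensional "embeddings" are all assumptions chosen for illustration. Combining several functions is shown here as simple concatenation, which is one plausible reading of the abstract.

```python
from statistics import fmean, pstdev

# Illustrative aggregation functions only (an assumption): the record does
# not say which 11 functions the paper evaluates.
AGGREGATORS = {
    "mean": fmean,
    "std": pstdev,  # population standard deviation over the frames
    "min": min,
    "max": max,
}


def aggregate(frames, names=("mean", "std")):
    """Turn a variable-length list of frame vectors into one fixed-size vector.

    frames: list of equal-length lists of floats, one list per frame.
    names:  aggregation functions to apply; their outputs are concatenated.
    """
    if not frames:
        raise ValueError("an utterance needs at least one frame")
    # Transpose so that each entry holds one feature dimension over time.
    dims = list(zip(*frames))
    utterance_vector = []
    for name in names:
        func = AGGREGATORS[name]
        utterance_vector.extend(float(func(dim)) for dim in dims)
    return utterance_vector


# Two toy "utterances" with different numbers of 3-dimensional frames.
short_utt = [[0.0, 1.0, 2.0], [2.0, 1.0, 0.0]]
long_utt = [[1.0, 1.0, 1.0], [3.0, 1.0, 5.0], [2.0, 4.0, 0.0]]

short_vec = aggregate(short_utt)
long_vec = aggregate(long_utt)

# Both vectors have 3 dimensions x 2 functions = 6 entries,
# regardless of how many frames each utterance contained.
print(len(short_vec), len(long_vec))
```

Whatever the number of frames, the output length depends only on the embedding dimension and the number of aggregation functions, which is what lets a standard classifier operate on whole utterances.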