Adaptive Temporal Convolutional Network for language modeling

Content: http://hdl.handle.net/10890/54994
Archive: Műegyetem Digitális Archívum
Collection:
1. Scientific articles, publications
Conference collections: 2nd Workshop on Intelligent Infocommunication Networks, Systems and Services, 2024
Title: Adaptive Temporal Convolutional Network for language modeling
Creators:
Abed, Hamdi M H
Gyires-Tóth, Bálint
Date:
2024-02-26T15:42:49Z
2024
Description:
Temporal Convolutional Networks (TCNs) are one-dimensional convolutional neural networks for modelling sequential data. A key component of a TCN is dilation, which is used to increase the receptive field while keeping the number of parameters low. In a TCN, dilation rates are predetermined. In this paper, an adaptive method is introduced for learning dilation rates by utilizing trainable binary masks with sparsity constraints (named Adaptive TCN, AdaTCN). The binary masks are applied to the convolutional layers to select the connections that are deemed important. We introduce structured sparsity into the mask using the Gumbel-Softmax in order to control the number of active connections. Four models, namely TCN, randomly masked TCN, AdaTCN, and an AdaTCN initialized with a TCN-like mask, are trained and evaluated. Experiments are conducted on word-level language models with the Penn TreeBank (PTB) and WikiText-2 (WT2) datasets.
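The description above outlines the core mechanism of AdaTCN: convolutional connections are gated by trainable binary masks sampled with a Gumbel-Softmax estimator, so the effective dilation pattern is learned rather than fixed. The sketch below is only an illustrative reconstruction of that idea in PyTorch, not the authors' implementation; the module name MaskedDilatedConv1d, the parameters max_receptive and tau, and the simple penalty on the keep probabilities are assumptions introduced for the example, and the paper's structured sparsity constraint may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MaskedDilatedConv1d(nn.Module):
    """Causal 1-D convolution whose taps over a wide receptive field are gated
    by trainable binary masks sampled with the straight-through Gumbel-Softmax
    estimator (illustrative sketch only, not the authors' code)."""

    def __init__(self, channels: int, max_receptive: int = 16, tau: float = 1.0):
        super().__init__()
        # One weight per tap position inside the maximal receptive field.
        self.weight = nn.Parameter(0.02 * torch.randn(channels, channels, max_receptive))
        # Two logits (keep / drop) per tap position parameterize the binary mask.
        self.mask_logits = nn.Parameter(torch.zeros(max_receptive, 2))
        self.max_receptive = max_receptive
        self.tau = tau

    def sample_mask(self) -> torch.Tensor:
        # Hard (0/1) samples in the forward pass, soft gradients in the backward pass.
        sample = F.gumbel_softmax(self.mask_logits, tau=self.tau, hard=True)
        return sample[:, 0]  # the "keep" column acts as the binary mask

    def keep_probabilities(self) -> torch.Tensor:
        # Differentiable keep probabilities, usable in a sparsity penalty.
        return torch.softmax(self.mask_logits, dim=-1)[:, 0]

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        mask = self.sample_mask()                  # shape: (max_receptive,)
        w = self.weight * mask.view(1, 1, -1)      # zero out dropped taps
        x = F.pad(x, (self.max_receptive - 1, 0))  # left-pad to stay causal
        return F.conv1d(x, w)


# Minimal usage: a penalty on the mean keep probability stands in for the
# sparsity constraint that controls the number of active connections.
layer = MaskedDilatedConv1d(channels=8)
y = layer(torch.randn(2, 8, 100))                  # -> shape (2, 8, 100)
sparsity_penalty = layer.keep_probabilities().mean()
```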
Language: English
Type: Conference paper
Format: application/pdf
Identifier: