Adaptive Temporal Convolutional Network for language modeling

  • Metadata
Content: http://hdl.handle.net/10890/54994
Archive: Műegyetem Digitális Archívum
Collection: 1. Scientific papers, publications
Conference collections
2nd Workshop on Intelligent Infocommunication Networks, Systems and Services, 2024
Title:
Adaptive Temporal Convolutional Network for language modeling
Creator:
Abed, Hamdi M H
Gyires-Tóth, Bálint
Date:
2024-02-26T15:42:49Z
2024
Description:
Temporal Convolutional Networks (TCNs) are one-dimensional convolutional neural networks for modelling sequential data. A key component of a TCN is dilation, which increases the receptive field while keeping the number of parameters low. In a standard TCN, dilation rates are predetermined. This paper introduces an adaptive method, named Adaptive TCN (AdaTCN), that learns dilation rates using trainable binary masks with sparsity constraints. The binary masks are applied to the convolutional layers to select the connections deemed important. Structured sparsity is introduced into the mask using Gumbel Sharp softmax in order to control the number of active connections. Four models are trained and evaluated: TCN, random masked TCN, AdaTCN, and an AdaTCN initialized with a TCN-like mask. Experiments are conducted on word-level language modelling with the Penn Treebank (PTB) and WikiText-2 (WT2) datasets.
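The description's two core ideas, dilation widening the receptive field and a binary mask selecting which connections are active, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names (`receptive_field`, `dilation_mask`, `masked_causal_conv`, `dilated_causal_conv`) are hypothetical, the mask is fixed by hand rather than learned, and the Gumbel-based sparsity mechanism is not modelled. It only shows that a fixed dilation can be viewed as one particular binary mask over a dense causal kernel.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field of a stack of dilated causal convolutions (one layer per dilation)."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def dilation_mask(kernel_size, dilation):
    """Binary mask over a dense kernel whose active taps mimic a fixed dilation (assumed layout)."""
    span = (kernel_size - 1) * dilation + 1
    return [1 if i % dilation == 0 else 0 for i in range(span)]


def masked_causal_conv(x, weights, mask):
    """Causal 1D convolution y[t] = sum_i mask[i] * weights[i] * x[t - i], zero-padded on the left."""
    assert len(weights) == len(mask)
    y = []
    for t in range(len(x)):
        acc = 0.0
        for i, (w, m) in enumerate(zip(weights, mask)):
            if m and t - i >= 0:
                acc += w * x[t - i]
        y.append(acc)
    return y


def dilated_causal_conv(x, taps, dilation):
    """Reference dilated causal convolution y[t] = sum_j taps[j] * x[t - j * dilation]."""
    y = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(taps):
            if t - j * dilation >= 0:
                acc += w * x[t - j * dilation]
        y.append(acc)
    return y


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
taps = [0.5, -1.0, 2.0]                      # kernel_size = 3
dilation = 2
mask = dilation_mask(len(taps), dilation)    # active positions 0, 2 and 4 of a 5-wide kernel

# Place the three taps on the active positions of the dense kernel.
dense = [0.0] * len(mask)
for j, w in enumerate(taps):
    dense[j * dilation] = w

masked_out = masked_causal_conv(x, dense, mask)
dilated_out = dilated_causal_conv(x, taps, dilation)
rf = receptive_field(3, [1, 2, 4, 8])        # four layers with doubling dilation
```

Here `masked_out` and `dilated_out` coincide, which is the sense in which a predetermined dilation is a special case of a masked dense kernel. A learned mask would instead be free to activate any subset of positions, subject to the sparsity constraint.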
Language:
English
Type:
Conference paper
Format:
application/pdf
Identifier: