Adaptive Temporal Convolutional Network for language modeling - TUdományos DOkumentumok Közös Keresője

Metadata

Content:	http://hdl.handle.net/10890/54994
Archive:	Műegyetem Digitális Archívum
Collection:	1. Scientific papers, publications / Conference collections / 2nd Workshop on Intelligent Infocommunication Networks, Systems and Services, 2024
Title:	Adaptive Temporal Convolutional Network for language modeling
Creator:	Abed, Hamdi M H; Gyires-Tóth, Bálint
Date:	2024-02-26T15:42:49Z; 2024
Description:	Temporal Convolutional Networks (TCNs) are one-dimensional convolutional neural networks for modelling sequential data. A key component of a TCN is dilation, which enlarges the receptive field while keeping the number of parameters low. In a standard TCN the dilation rates are predetermined. This paper introduces an adaptive method, named Adaptive TCN (AdaTCN), that learns dilation rates by means of trainable binary masks with sparsity constraints. The binary masks are applied to the convolutional layers to select the connections that are deemed important. Structured sparsity is introduced into the masks using Gumbel Sharp softmax in order to control the number of active connections. Four models are trained and evaluated: TCN, a randomly masked TCN, AdaTCN, and an AdaTCN initialized with a TCN-like mask. Experiments are conducted on word-level language modeling with the Penn TreeBank (PTB) and WikiText-2 (WT2) datasets.
Language:	English
Type:	Conference paper
Format:	application/pdf
Identifier:	http://hdl.handle.net/10890/54994
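
Illustration (not taken from the paper): the description says that dilation enlarges the receptive field at a fixed parameter count, and that binary masks on convolutional layers select connections. The sketch below shows both ideas in plain Python under stated assumptions: one causal convolution per layer, and a hand-built fixed mask over a dense kernel window. All function names are hypothetical, and the mask here is not learned; the paper's trainable masks and Gumbel-based sparsity are not reproduced.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field of stacked dilated causal convolutions.

    Assumes one convolution per layer; each layer adds (kernel_size - 1) * d steps.
    """
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def dilation_mask(kernel_size, dilation):
    """Binary mask over a dense window that keeps every `dilation`-th tap."""
    window = (kernel_size - 1) * dilation + 1
    return [1 if i % dilation == 0 else 0 for i in range(window)]


def masked_causal_conv(x, weights, mask):
    """Causal 1-D convolution: tap i looks back i steps and is gated by mask[i]."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for i, (w, m) in enumerate(zip(weights, mask)):
            if m and t - i >= 0:
                acc += w * x[t - i]
        out.append(acc)
    return out


def dilated_causal_conv(x, taps, dilation):
    """Reference causal convolution with a fixed dilation rate."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(taps):
            idx = t - j * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


# Four layers, kernel size 3, dilations doubling per layer.
rf = receptive_field(3, [1, 2, 4, 8])

# A "TCN-like" mask for kernel size 3 and dilation 2 over a dense window of 5 taps.
mask = dilation_mask(3, 2)

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
dense_weights = [0.5, 9.0, -1.0, 9.0, 2.0]  # masked-out taps hold arbitrary values
y_masked = masked_causal_conv(x, dense_weights, mask)
y_dilated = dilated_causal_conv(x, [0.5, -1.0, 2.0], 2)
```

With this mask the dense kernel behaves exactly like a dilation-2 kernel, which is one way to read the "TCN-like mask" initialization; a different binary pattern over the same window would give an irregular connection pattern instead.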