Canonical convolutional neural networks

We introduce canonical weight normalization for convolutional neural networks. Inspired by the canonical tensor decomposition, we express the weight tensors in so-called canonical networks as scaled sums of outer vector products. In particular, we train network weights in the decomposed form, where scale weights are optimized separately for each mode. As in weight normalization, we additionally include a global scaling parameter. We study the initialization of the canonical form by running the power method and by drawing randomly from Gaussian or uniform distributions. Our results indicate that we can replace the power method with cheaper initializations drawn from standard distributions. The canonical re-parametrization leads to competitive normalization performance on the MNIST, CIFAR-10, and SVHN data sets. Moreover, the formulation simplifies network compression: once training has converged, the canonical form allows convenient model compression by truncating the parameter sums.
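As a rough illustration of the idea in the abstract, the following NumPy sketch rebuilds a four-mode convolution weight from a canonical (CP-style) parametrization and truncates the sum for compression. It is not the authors' implementation. The function name `canonical_weight`, the layout of the factor matrices, the unit-norm column normalization, and the choice to multiply the per-mode scales into one coefficient per term are all assumptions made for this example.

```python
import numpy as np


def canonical_weight(factors, mode_scales, g, rank=None):
    """Rebuild a 4-mode weight tensor from a canonical (CP-style) form.

    Assumed layout (hypothetical, for illustration only):
      factors     -- four arrays of shape (dim_m, R), one per mode
                     (out channels, in channels, kernel height, kernel width)
      mode_scales -- four arrays of shape (R,), one scale vector per mode
      g           -- global scalar scale, as in weight normalization
      rank        -- number of leading terms to keep (None keeps all R)
    """
    full_rank = factors[0].shape[1]
    r = full_rank if rank is None else min(rank, full_rank)

    # Normalize every factor column to unit length, so that magnitude
    # is carried only by the scale parameters.
    unit = []
    for f in factors:
        cols = f[:, :r]
        unit.append(cols / np.linalg.norm(cols, axis=0, keepdims=True))

    # Assumption: the per-mode scales combine multiplicatively per term.
    sigma = np.prod(np.stack([s[:r] for s in mode_scales]), axis=0)

    # Sum over r of sigma_r * (a_r outer b_r outer c_r outer d_r).
    weight = np.einsum("r,or,ir,hr,wr->oihw", sigma, *unit)
    return g * weight


# Usage: a 3x3 convolution with 3 input and 16 output channels.
rng = np.random.default_rng(0)
dims = (16, 3, 3, 3)
R = 8
factors = [rng.normal(size=(d, R)) for d in dims]
mode_scales = [rng.uniform(0.5, 1.5, size=R) for _ in dims]

w_full = canonical_weight(factors, mode_scales, g=1.0)
w_small = canonical_weight(factors, mode_scales, g=1.0, rank=4)  # truncated sum
```

Truncation here simply keeps the first `rank` terms. A practical variant would probably first order the terms by the magnitude of their combined scale, but that ordering is likewise an assumption rather than something stated in the abstract.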

Metadata
Document Type:	Preprint
Language:	English
Author:	Lokesh Veeramacheneni, Moritz Wolter, Reinhard Klein, Jochen Garcke
DOI:	https://doi.org/10.48550/arXiv.2206.01509
ArXiv Id:	http://arxiv.org/abs/2206.01509
Date of first publication:	2022/06/03
Publication status:	Preprint accepted at the International Joint Conference on Neural Networks (IJCNN) 2022
Departments, institutes and facilities:	Fachbereich Informatik
Dewey Decimal Classification (DDC):	0 Computer science, information science, general works / 00 Computer science, knowledge, systems / 004 Data processing; computer science
Entry in this database:	2022/06/24

Open Access