Please use this identifier to cite or link to this item:
http://hdl.handle.net/10397/107657
DC Field | Value | Language |
---|---|---|
dc.contributor | Department of Computing | en_US |
dc.contributor | Department of Data Science and Artificial Intelligence | en_US |
dc.creator | Ma, C | en_US |
dc.creator | Wu, J | en_US |
dc.creator | Si, C | en_US |
dc.creator | Tan, KC | en_US |
dc.date.accessioned | 2024-07-09T03:54:33Z | - |
dc.date.available | 2024-07-09T03:54:33Z | - |
dc.identifier.uri | http://hdl.handle.net/10397/107657 | - |
dc.description | The Twelfth International Conference on Learning Representations, ICLR 2024, Vienna, Austria, May 07 2024 | en_US |
dc.language.iso | en | en_US |
dc.publisher | OpenReview.net | en_US |
dc.rights | Posted with permission of the author. | en_US |
dc.title | Scaling supervised local learning with augmented auxiliary networks | en_US |
dc.type | Conference Paper | en_US |
dcterms.abstract | Deep neural networks are typically trained using global error signals that backpropagate (BP) end-to-end, which is not only biologically implausible but also suffers from the update locking problem and incurs huge memory consumption. Local learning, which updates each layer independently with a gradient-isolated auxiliary network, offers a promising alternative to address the above problems. However, existing local learning methods suffer a large accuracy gap relative to their BP counterparts, particularly for large-scale networks. This is due to the weak coupling between local layers and their subsequent network layers, as there is no gradient communication across layers. To tackle this issue, we put forward an augmented local learning method, dubbed AugLocal. AugLocal constructs each hidden layer’s auxiliary network by uniformly selecting a small subset of layers from its subsequent network layers to enhance their synergy. We also propose to linearly reduce the depth of auxiliary networks as the hidden layer goes deeper, ensuring sufficient network capacity while reducing the computational cost of auxiliary networks. Our extensive experiments on four image classification datasets (i.e., CIFAR-10, SVHN, STL-10, and ImageNet) demonstrate that AugLocal can effectively scale up to tens of local layers with accuracy comparable to BP-trained networks while reducing GPU memory usage by around 40%. The proposed AugLocal method, therefore, opens up a myriad of opportunities for training high-performance deep neural networks on resource-constrained platforms. | en_US |
dcterms.accessRights | open access | en_US |
dcterms.bibliographicCitation | The Twelfth International Conference on Learning Representations, ICLR 2024, Vienna, Austria, May 07 2024, https://openreview.net/forum?id=Qbf1hy8b7m&noteId=Qbf1hy8b7m | en_US |
dcterms.issued | 2024 | - |
dc.relation.conference | International Conference on Learning Representations [ICLR] | en_US |
dc.description.validate | 202406 bcch | en_US |
dc.description.oa | Version of Record | en_US |
dc.identifier.FolderNumber | a2887b | - |
dc.identifier.SubFormID | 48654 | - |
dc.description.fundingSource | RGC | en_US |
dc.description.fundingSource | Others | en_US |
dc.description.fundingText | National Natural Science Foundation of China | en_US |
dc.description.pubStatus | Published | en_US |
dc.description.oaCategory | Copyright retained by author | en_US |
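The abstract describes two mechanisms for building each hidden layer's auxiliary network: uniformly selecting a small subset of the subsequent network layers, and linearly reducing the auxiliary depth as the hidden layer goes deeper. A minimal sketch of how such a layer-selection schedule could look is given below; the function names (`aux_depth`, `select_aux_layers`) and the exact rounding and interpolation choices are illustrative assumptions, not the paper's reference implementation.

```python
def aux_depth(i, num_layers, d_max, d_min=1):
    """Assumed linear schedule: auxiliary-network depth shrinks from
    d_max at the first hidden layer to d_min at the last one."""
    frac = i / (num_layers - 1)  # 0.0 at first layer, 1.0 at last
    return round(d_max - (d_max - d_min) * frac)

def select_aux_layers(i, num_layers, depth):
    """Uniformly pick `depth` layer indices from the layers after layer i,
    which (per the abstract) form hidden layer i's auxiliary network."""
    subsequent = list(range(i + 1, num_layers))
    if depth >= len(subsequent):
        return subsequent
    if depth == 1:
        return [subsequent[0]]
    # Evenly spaced positions across the subsequent layers.
    step = (len(subsequent) - 1) / (depth - 1)
    return [subsequent[round(j * step)] for j in range(depth)]

# Example: a 10-layer primary network with a maximum auxiliary depth of 3.
for i in (0, 4, 8):
    d = aux_depth(i, 10, d_max=3)
    print(i, d, select_aux_layers(i, 10, d))
```

Early layers get deeper auxiliary networks drawn from many subsequent layers, while the deepest hidden layers fall back to a single-layer auxiliary head, matching the trade-off the abstract describes between network capacity and computational cost.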
Appears in Collections: | Conference Paper |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Ma_Scaling_Supervised_Local.pdf | | 557.04 kB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.