Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/107657
View/Download Full Text
DC Field | Value | Language
dc.contributor | Department of Computing | en_US
dc.contributor | Department of Data Science and Artificial Intelligence | en_US
dc.creator | Ma, C | en_US
dc.creator | Wu, J | en_US
dc.creator | Si, C | en_US
dc.creator | Tan, KC | en_US
dc.date.accessioned | 2024-07-09T03:54:33Z | -
dc.date.available | 2024-07-09T03:54:33Z | -
dc.identifier.uri | http://hdl.handle.net/10397/107657 | -
dc.description | The Twelfth International Conference on Learning Representations, ICLR 2024, Vienna, Austria, May 07 2024 | en_US
dc.language.iso | en | en_US
dc.publisher | OpenReview.net | en_US
dc.rights | Posted with permission of the author. | en_US
dc.title | Scaling supervised local learning with augmented auxiliary networks | en_US
dc.type | Conference Paper | en_US
dcterms.abstract | Deep neural networks are typically trained using global error signals that backpropagate (BP) end-to-end, which is not only biologically implausible but also suffers from the update-locking problem and incurs huge memory consumption. Local learning, which updates each layer independently with a gradient-isolated auxiliary network, offers a promising alternative to address these problems. However, existing local learning methods face a large accuracy gap with their BP-trained counterparts, particularly for large-scale networks. This is due to the weak coupling between local layers and their subsequent network layers, as there is no gradient communication across layers. To tackle this issue, we put forward an augmented local learning method, dubbed AugLocal. AugLocal constructs each hidden layer’s auxiliary network by uniformly selecting a small subset of layers from its subsequent network layers to enhance their synergy. We also propose to linearly reduce the depth of auxiliary networks as the hidden layer goes deeper, ensuring sufficient network capacity while reducing the computational cost of auxiliary networks. Our extensive experiments on four image classification datasets (CIFAR-10, SVHN, STL-10, and ImageNet) demonstrate that AugLocal effectively scales up to tens of local layers with accuracy comparable to BP-trained networks while reducing GPU memory usage by around 40%. The proposed AugLocal method therefore opens up a myriad of opportunities for training high-performance deep neural networks on resource-constrained platforms. (A minimal sketch of the auxiliary-network construction appears after this metadata table.) | en_US
dcterms.accessRights | open access | en_US
dcterms.bibliographicCitation | The Twelfth International Conference on Learning Representations, ICLR 2024, Vienna, Austria, May 07 2024, https://openreview.net/forum?id=Qbf1hy8b7m&noteId=Qbf1hy8b7m | en_US
dcterms.issued | 2024 | -
dc.relation.conference | International Conference on Learning Representations [ICLR] | en_US
dc.description.validate | 202406 bcch | en_US
dc.description.oa | Version of Record | en_US
dc.identifier.FolderNumber | a2887b | -
dc.identifier.SubFormID | 48654 | -
dc.description.fundingSource | RGC | en_US
dc.description.fundingSource | Others | en_US
dc.description.fundingText | National Natural Science Foundation of China | en_US
dc.description.pubStatus | Published | en_US
dc.description.oaCategory | Copyright retained by author | en_US
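The abstract specifies two construction rules for AugLocal: (1) build each hidden layer's auxiliary network from a uniformly selected subset of its subsequent network layers, and (2) linearly reduce the auxiliary depth for deeper hidden layers. The Python sketch below illustrates only those two rules as stated in the abstract; the function names (`aux_depth`, `aux_layer_indices`) and the `d_max`/`d_min` schedule endpoints are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of the two AugLocal construction rules described in the
# abstract. All names and the d_max/d_min endpoints are assumptions made
# for illustration; they are not taken from the paper's code.

def aux_depth(l: int, num_hidden: int, d_max: int, d_min: int = 1) -> int:
    """Linearly shrink auxiliary-network depth from d_max at the first
    hidden layer (l = 0) to d_min at the last hidden layer."""
    frac = l / max(num_hidden - 1, 1)  # 0.0 at the first hidden layer, 1.0 at the last
    return round(d_max - frac * (d_max - d_min))

def aux_layer_indices(l: int, num_layers: int, depth: int) -> list[int]:
    """Uniformly pick `depth` layer indices from the layers after layer l,
    always ending at the final layer (index num_layers - 1)."""
    remaining = num_layers - 1 - l     # number of subsequent layers
    if remaining <= 0:
        return []
    depth = min(depth, remaining)
    step = remaining / depth           # even spacing over (l, num_layers - 1]
    return [l + round((i + 1) * step) for i in range(depth)]

# Example: a 10-layer network whose auxiliary depth decays from 4 to 1.
if __name__ == "__main__":
    for l in range(9):                 # hidden layers 0..8
        d = aux_depth(l, num_hidden=9, d_max=4)
        print(f"layer {l}: auxiliary layers {aux_layer_indices(l, 10, d)}")
```

Under this schedule, early hidden layers get deeper auxiliary networks whose selected layers span all the way to the output layer, while the deepest hidden layer keeps a single auxiliary layer, consistent with the abstract's stated aim of sufficient capacity at reduced computational cost.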
Appears in Collections: Conference Paper
Files in This Item:
File | Description | Size | Format
Ma_Scaling_Supervised_Local.pdf |  | 557.04 kB | Adobe PDF
Open Access Information
Status: open access
File Version: Version of Record

Page views: 85 (as of Apr 13, 2025)
Downloads: 26 (as of Apr 13, 2025)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.