Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/93926
DC Field | Value | Language
dc.contributor | Department of Applied Mathematics | en_US
dc.creator | Sun, D | en_US
dc.creator | Toh, KC | en_US
dc.creator | Yuan, Y | en_US
dc.date.accessioned | 2022-08-03T01:24:14Z | -
dc.date.available | 2022-08-03T01:24:14Z | -
dc.identifier.issn | 1532-4435 | en_US
dc.identifier.uri | http://hdl.handle.net/10397/93926 | -
dc.language.iso | en | en_US
dc.publisher | MIT Press | en_US
dc.rights | © 2021 Defeng Sun, Kim-Chuan Toh and Yancheng Yuan. | en_US
dc.rights | License: CC-BY 4.0, see https://creativecommons.org/licenses/by/4.0/. Attribution requirements are provided at http://jmlr.org/papers/v22/18-694.html. | en_US
dc.rights | The following publication Sun, D., Toh, K. C., & Yuan, Y. (2021). Convex Clustering: Model, Theoretical Guarantee and Efficient Algorithm. Journal of Machine Learning Research, 22(9), 1-32 is available at https://www.jmlr.org/papers/v22/18-694.html | en_US
dc.subject | Convex clustering | en_US
dc.subject | Augmented Lagrangian method | en_US
dc.subject | Semismooth Newton method | en_US
dc.subject | Conjugate gradient method | en_US
dc.subject | Unsupervised learning | en_US
dc.title | Convex clustering: model, theoretical guarantee and efficient algorithm | en_US
dc.type | Journal/Magazine Article | en_US
dc.identifier.volume | 22 | en_US
dcterms.abstract | Clustering is a fundamental problem in unsupervised learning. Popular methods such as K-means may suffer from poor performance because they are prone to getting stuck at local minima. Recently, the sum-of-norms (SON) model (also known as the convex clustering model) has been proposed by Pelckmans et al. (2005), Lindsten et al. (2011) and Hocking et al. (2011). The perfect recovery properties of the convex clustering model with uniformly weighted all-pairwise-differences regularization have been proved by Zhu et al. (2014) and Panahi et al. (2017). However, no theoretical guarantee has been established for the general weighted convex clustering model, for which better empirical results have been observed. On the numerical optimization side, although algorithms such as the alternating direction method of multipliers (ADMM) and the alternating minimization algorithm (AMA) have been proposed to solve the convex clustering model (Chi and Lange, 2015), it remains very challenging to solve large-scale problems. In this paper, we establish sufficient conditions for the perfect recovery guarantee of the general weighted convex clustering model, which include and improve the existing theoretical results of Zhu et al. (2014) and Panahi et al. (2017) as special cases. In addition, we develop a semismooth Newton-based augmented Lagrangian method for solving large-scale convex clustering problems. Extensive numerical experiments on both simulated and real data demonstrate that our algorithm is highly efficient and robust for solving large-scale problems. Moreover, the numerical results also show the superior performance and scalability of our algorithm compared to existing first-order methods. In particular, our algorithm is able to solve a convex clustering problem with 200,000 points in R^3 in about 6 minutes. | en_US
dcterms.accessRights | open access | en_US
dcterms.bibliographicCitation | Journal of machine learning research, 2021, v. 22, 9, p. 1-32 | en_US
dcterms.isPartOf | Journal of machine learning research | en_US
dcterms.issued | 2021 | -
dc.identifier.eissn | 1533-7928 | en_US
dc.identifier.artn | 9 | en_US
dc.description.validate | 202208 bcfc | en_US
dc.description.oaVersion | Version of Record | en_US
dc.identifier.FolderNumber | AMA-0089 | -
dc.description.fundingSource | RGC | en_US
dc.description.pubStatus | Published | en_US
dc.identifier.OPUS | 54170844 | -
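For context, the weighted sum-of-norms (SON) model summarized in the abstract above is commonly written as the following convex program. The symbols used here (data points a_i, centroid variables x_i, pairwise weights w_{ij}, and regularization parameter \gamma) are illustrative notation rather than taken from this record; consult the paper itself for its exact formulation.

\min_{X \in \mathbb{R}^{d \times n}} \ \frac{1}{2} \sum_{i=1}^{n} \| x_i - a_i \|^2 + \gamma \sum_{i < j} w_{ij} \, \| x_i - x_j \|

With all weights w_{ij} = 1 this reduces to the uniformly weighted all-pairwise-differences regularization mentioned in the abstract; as \gamma increases, more of the centroids x_i coincide, and points whose centroids coincide are assigned to the same cluster.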
Appears in Collections: Journal/Magazine Article
Files in This Item:
File | Description | Size | Format
18-694.pdf | | 1.04 MB | Adobe PDF
Open Access Information
Status: open access
File Version: Version of Record