Title: Feature representation learning in complex networks
Authors: Shen, Xiao
Degree: Ph.D.
Issue Date: 2019
Abstract: Complex networks are ubiquitous in the real world. Learning appropriate feature representations for complex networks is important for a wide variety of graph mining tasks. Motivated by this, this thesis proposes four models that learn informative feature vector representations for the nodes or edges in a network, which can effectively and efficiently address several canonical graph mining tasks. In the first work, we take a feature-engineering approach and define explicit topological features for nodes and edges in the influence maximization (IM) scenario. Next, we propose three deep network embedding models that learn low-dimensional latent node representations which preserve the original network structures and properties. Preserving diverse network properties is important for learning informative feature representations for different graph mining tasks. Accordingly, the first two deep network embedding models focus on preserving asymmetric network transitivity and the signed network property, respectively, to effectively address the typical graph mining tasks within a single network, including node classification, node clustering and link prediction. The third deep network embedding model incorporates a domain adaptation technique into deep network embedding to learn generalized and comparable feature representations that can effectively address cross-network prediction tasks.

In the first work, we propose a cross-network learning (CNL) framework that leverages the greedy seed selection and influence propagation knowledge pre-learned from a smaller source network to select seed nodes and remove inactive edges in multiple larger target networks. To address domain discrepancy, we assign lower weights to the explicit topological features that behave less similarly between the source network and the target network.
In addition, we utilize a fuzzy self-training algorithm to iteratively retrain the prediction model on not only the fully labeled instances from the source network, but also the most confidently predicted instances in the target network, together with their predicted fuzzy labels. Extensive experiments demonstrate that the proposed CNL model achieves a good trade-off between efficiency and effectiveness on the IM task in the target networks.
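The fuzzy self-training loop described above can be sketched as follows. This is only an illustrative sketch: the toy nearest-centroid classifier, the distance-margin confidence measure, and the hyperparameters are assumptions for demonstration, not the prediction model or settings actually used in the thesis.

```python
import numpy as np

def fuzzy_self_train(Xs, ys, Xt, rounds=2, top_k=3):
    """Sketch of fuzzy self-training with a toy weighted nearest-centroid
    classifier (binary labels 0/1). Each round, the most confidently
    predicted target instances are added to the training pool with a fuzzy
    weight in (0, 1], and the model is retrained on the enlarged pool."""
    X, y, w = Xs.copy(), ys.astype(int).copy(), np.ones(len(ys))
    remaining = np.arange(len(Xt))          # target instances not yet labeled
    for _ in range(rounds):
        if len(remaining) == 0:
            break
        # retrain: weighted class centroids over the current (soft) pool
        c0 = np.average(X[y == 0], axis=0, weights=w[y == 0])
        c1 = np.average(X[y == 1], axis=0, weights=w[y == 1])
        d0 = np.linalg.norm(Xt[remaining] - c0, axis=1)
        d1 = np.linalg.norm(Xt[remaining] - c1, axis=1)
        conf = np.abs(d0 - d1)               # distance margin as confidence
        order = np.argsort(-conf)[:top_k]    # most confident target instances
        labels = (d1[order] < d0[order]).astype(int)
        fuzzy = conf[order] / (conf[order].max() + 1e-12)  # fuzzy weights
        X = np.vstack([X, Xt[remaining[order]]])
        y = np.concatenate([y, labels])
        w = np.concatenate([w, fuzzy])
        remaining = np.delete(remaining, order)
    return X, y, w
```

The fuzzy weights let later retraining rounds trust the original source labels fully while discounting the pseudo-labeled target instances in proportion to their prediction confidence.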
The three proposed deep network embedding models address several open issues in current network embedding research, namely asymmetric network embedding, signed network embedding and cross-network embedding. Firstly, an asymmetry-aware deep network embedding (AsDNE) model is proposed, which comprises two semi-supervised stacked auto-encoders (SAEs) that preserve the asymmetric outward and inward network proximities. To capture the asymmetric relationships well, we design pairwise constraints that map node pairs with bi-directionally strong connections much closer than those with a strong connection in only one direction. Extensive experiments demonstrate that AsDNE learns task-independent network representations that outperform state-of-the-art network embedding algorithms in both directed and undirected networks.

Secondly, we propose a deep network embedding model with structural balance preservation (DNE-SBP) for signed networks. A semi-supervised SAE is employed to reconstruct the signed adjacency matrix, with a larger penalty imposed so that the SAE focuses more on reconstructing the scarce negative links than the abundant positive links. To preserve the structural balance property well, we design pairwise constraints that map positively connected nodes much closer than negatively connected nodes. Extensive experiments demonstrate the superiority of DNE-SBP over state-of-the-art network embedding algorithms for graph representation learning in signed networks.

Finally, we propose a cross-network deep network embedding (CDNE) model, which innovatively integrates deep network embedding and domain adaptation techniques to learn label-discriminative and network-invariant node vector representations. Two semi-supervised SAEs are employed to embed nodes from the source network and the target network into a unified low-dimensional latent space.
In addition, similar nodes within each network and across the two networks are mapped closer to each other, based on their network structures, attributes and labels. Extensive experiments demonstrate that CDNE significantly outperforms state-of-the-art network embedding algorithms for node classification in the target network.
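Two of the ingredients above can be made concrete with toy numpy sketches: the weighted reconstruction penalty that DNE-SBP places on scarce negative links, and a distribution-alignment term of the kind cross-network models such as CDNE rely on. The penalty value and the linear-kernel Maximum Mean Discrepancy (MMD) used here are illustrative assumptions, not the exact loss terms from the thesis.

```python
import numpy as np

def weighted_reconstruction_loss(A, A_hat, neg_penalty=5.0):
    """Squared reconstruction error over a signed adjacency matrix, with a
    larger weight on negative links so the auto-encoder focuses on the
    scarce negative class (neg_penalty is an illustrative value)."""
    W = np.ones_like(A, dtype=float)
    W[A < 0] = neg_penalty               # weight negative links more heavily
    return float(np.sum(W * (A - A_hat) ** 2))

def mmd_linear(Xs, Xt):
    """Linear-kernel MMD between source and target node embeddings: the
    squared distance between the two embedding means. Driving this toward
    zero encourages network-invariant representations."""
    delta = Xs.mean(axis=0) - Xt.mean(axis=0)
    return float(delta @ delta)

# Toy signed adjacency: +1 positive link, -1 negative link, 0 no link.
A = np.array([[0.,  1., -1.],
              [1.,  0.,  0.],
              [-1., 0.,  0.]])
loss = weighted_reconstruction_loss(A, np.zeros_like(A))
```

In a full model these two terms would be summed with the pairwise constraints into a single objective and minimized by gradient descent over the SAE parameters; here they are computed in isolation to show what each one measures.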
Subjects: Hong Kong Polytechnic University -- Dissertations
Data mining -- Graphic methods
Machine learning
Pages: xiv, 157 pages : color illustrations
Appears in Collections: Thesis
