Graph Convolutional Networks: An In-Depth Discussion of GNNs (Part 2 of 4)

 

1. Introduction

        Among the various types of GNNs, the graph convolutional network (GCN) has become the most general and widely used model. GCNs are innovative because they exploit both the features of nodes and their locality for prediction, providing an efficient way to deal with graph-structured data. In this article, we give an overview of graph theory and graph neural networks (GNNs) in the context of recommender systems.

2. Classical matrix completion method

        A popular technique for recommender systems is matrix completion, a collaborative filtering method based on classical machine learning. Given m users and n items, it aims to fill in the missing values of a user-item interaction matrix R (of dimension m×n). To achieve this, we map each user and each item to an embedding of size k, an abstract representation in a vector space. These embeddings might capture features like movie genre or user demographics, but they often correspond to latent features with no obvious interpretation. This gives a user embedding matrix U (of dimension m×k) and an item embedding matrix I (of dimension n×k). To predict the score for a user-item pair, we take the dot product of the corresponding user and item embeddings; in matrix form, R ≈ U Iᵀ. The embedding matrices are initialized randomly, and we optimize them with a loss function based on the known user-item interactions.

        Figure 1: This figure shows the user-item interaction matrix R and how we can take the dot product between the user and item embedding matrices to predict specific values in the R matrix.
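To make this concrete, here is a minimal NumPy sketch of matrix completion by gradient descent. The toy interaction matrix, the embedding size k, the learning rate, and the iteration count are all illustrative assumptions, not values from a real system.

```python
import numpy as np

rng = np.random.default_rng(0)

m, n, k = 4, 5, 2           # toy sizes: users, items, embedding dimension
R = np.array([              # observed ratings; 0 marks a missing entry
    [5, 3, 0, 1, 4],
    [4, 0, 0, 1, 0],
    [1, 1, 0, 5, 0],
    [0, 1, 5, 4, 2],
], dtype=float)
mask = R > 0                # only known interactions contribute to the loss

U = rng.normal(scale=0.1, size=(m, k))  # user embedding matrix, random init
I = rng.normal(scale=0.1, size=(n, k))  # item embedding matrix, random init

lr = 0.01
for _ in range(5000):
    err = np.where(mask, U @ I.T - R, 0.0)  # error on observed entries only
    # gradients of 0.5 * sum(err**2) with respect to U and I
    U -= lr * (err @ I)
    I -= lr * (err.T @ U)

print(np.round(U @ I.T, 2))  # dense predictions, including the missing entries
```

Only the observed entries drive the gradients; once U and I have converged, the product U Iᵀ supplies predictions for the entries that were missing.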

        However, this method suffers from performance issues when dealing with sparse matrices. In cases where a user interacts with only a few items out of millions of available items, classical matrix completion methods may not be sufficient, since they only consider direct connections between users and items. To address this limitation, recommender systems based on graph neural networks (GNNs) have emerged as a more effective alternative.

        GNNs provide improved performance on sparse datasets by not only considering individual user preferences but also integrating information from neighboring users. By exploiting the graph structure, GNNs can capture the relationships between users and items more comprehensively, leading to more accurate and personalized recommendations. Let's start by reminding ourselves a little about graph theory.

3. Overview of Graph Theory

3.1 What is a graph?

        A graph is a data structure that represents collections of entities as nodes (vertices) and their relationships as edges. It is a powerful tool for modeling and understanding a variety of real-world scenarios. For example, a graph could represent bank transactions, where nodes symbolize bank accounts and edges represent transactions between them. Likewise, a social network graph has people as nodes, and edges depict relationships between individuals.

Figure 2: Example of a graph.

3.2 Types of graphs

        There are different types of graphs, according to their characteristics. A directed graph has edges with a specific direction. For example, in a bank transaction graph, each edge represents a transaction from a sender to a receiver, establishing a clear direction. On the other hand, undirected graphs do not assign directions to edges. In social networks, an undirected edge represents a connection or acquaintance between two people without any inherent directionality.

        Graphs can also be classified as homogeneous or heterogeneous. Homogeneous graphs have a single type of node and edge, while heterogeneous graphs may contain multiple types. For example, in an e-commerce scenario, there might be two types of nodes: one representing items available for sale and one representing users. Different types of edges can then represent different interactions, such as a user clicking on an item or making a purchase.

Figure 3: Examples of directed, undirected, homogeneous and heterogeneous graphs

        A bipartite graph is a specific type of heterogeneous graph that is very useful for modeling recommender systems. It involves two distinct sets of nodes, such as users and items, with edges only connecting nodes from different sets. Bipartite graphs effectively capture user-item interactions and enable recommendation algorithms to exploit the rich network structure; a small sketch follows Figure 4 below.

Figure 4: Bipartite graph example.
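As a small illustration, the sketch below builds a bipartite user-item graph out of a list of interactions. The user and item names are made up for the example.

```python
from collections import defaultdict

# Hypothetical interactions: each (user, item) pair is one edge.
interactions = [
    ("alice", "movie_1"), ("alice", "movie_3"),
    ("bob",   "movie_1"), ("carol", "movie_2"),
]

# One adjacency list per side of the bipartition; edges only ever
# connect a user to an item, never user-user or item-item.
items_of_user = defaultdict(set)
users_of_item = defaultdict(set)
for user, item in interactions:
    items_of_user[user].add(item)
    users_of_item[item].add(user)

print(items_of_user["alice"])    # {'movie_1', 'movie_3'}
print(users_of_item["movie_1"])  # {'alice', 'bob'}
```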

3.3 How do we store graph data?

        There are various ways to store graph data. One approach is to use an adjacency matrix, denoted A ∈ {0, 1}^(n×n), where n is the number of nodes in the graph. The (i, j) entry A_ij of the matrix represents the connectivity between nodes v_i and v_j, with A_ij = 1 if there is an edge connecting v_i and v_j. For undirected graphs, the adjacency matrix is symmetric, i.e., A_ij = A_ji. However, for large and sparse graphs (such as social networks), adjacency matrices can be memory intensive, because their size scales with the square of the number of nodes. In a social network with millions of nodes, most people don't know each other, which results in a huge matrix where most cells are zero.

        To solve this problem, the adjacency list representation is more memory efficient. It describes the edges of the graph as tuples (i, j), where, for example, (0, 1) represents an edge between nodes 0 and 1. For the graph in Figure 5, the adjacency list is [(A, B), (B, D), (B, C), (D, C)].

Figure 5a: Example of a graph - Figure 5b: Adjacency matrix for the graph in Figure 5a.

        The adjacency list representation is more memory efficient, especially for sparse graphs, since it only stores information about connected nodes. This makes it the first choice for processing large-scale graph data, such as social networks, where the number of connections is usually small compared to the total number of possible node pairs. A small sketch of both representations follows below.
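The sketch below stores the graph of Figure 5 both ways, using the node labels A-D from the figure; it is a minimal illustration of the memory trade-off, not a production graph library.

```python
import numpy as np

nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "D"), ("B", "C"), ("D", "C")]  # the adjacency list
idx = {name: i for i, name in enumerate(nodes)}

# Adjacency matrix: n x n cells, symmetric because the graph is undirected.
n = len(nodes)
A = np.zeros((n, n), dtype=int)
for u, v in edges:
    A[idx[u], idx[v]] = 1
    A[idx[v], idx[u]] = 1
print(A)

# The list stores one tuple per edge (4 here); the matrix always stores
# n * n cells (16 here), mostly zeros -- the gap explodes on sparse graphs.
```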

4. Graph Neural Networks in Recommender Systems

        Similar to traditional matrix completion methods, GNNs generate embeddings for users and items to predict unseen user-item interactions. However, they provide a way to explicitly incorporate higher-order graph structure, and they can capture latent correlations that are not directly visible in the data itself.

        Given a graph, our goal is to map each node v to its own d-dimensional final embedding, where nodes that are similar, in terms of both their neighborhood structure and their own features, end up close to each other in the embedding space.

Figure 6: Node encoding into the embedding space.

4.1 Graph Neural Network Layer

        A layer of a GNN exchanges information between all immediate neighbors in the graph, generating a new embedding for each node. In a 2-layer GNN model, each node generates its layer-2 embedding from its 2-hop neighborhood; the K-hop neighborhood of a node consists of all nodes that are K edges away from it. This is an iterative process in which neighbors "talk" to each other by passing messages, hence the name message passing.

Figure 7: Input graph and computation graph for a specific target node in a 2-layer GNN

        In this image, we see that the layer-2 representation of node A is generated by aggregating the layer-1 embeddings of its immediate neighbors [B, C, D] and applying a transformation, i.e., a neural network, to the result. Those layer-1 embeddings are in turn generated from the layer-0 embeddings [X_A, X_B, ..., X_F] of their own immediate neighbors, which are simply the initial input features. Each layer produces a new embedding per node, and a node's layer-K embedding incorporates information from all nodes up to K hops away.
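Below is a minimal NumPy sketch of such a layer: each node averages the embeddings of its neighbors (plus its own, via a self-loop) and passes the result through a learned linear map and a ReLU. Mean pooling, the random weights, and the toy sizes are illustrative assumptions; real GCN layers use a slightly different normalization.

```python
import numpy as np

def gnn_layer(A, H, W):
    """One message-passing layer: every node averages its neighbors'
    embeddings (self-loop included), then applies a linear map + ReLU."""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    deg = A_hat.sum(axis=1, keepdims=True)  # neighborhood sizes
    H_agg = (A_hat @ H) / deg               # mean aggregation (order invariant)
    return np.maximum(0.0, H_agg @ W)       # transform + ReLU nonlinearity

rng = np.random.default_rng(0)
n, d0, d1, d2 = 6, 8, 16, 4                 # 6 nodes, as in Figure 7; toy dims
A = rng.integers(0, 2, (n, n))
A = np.triu(A, 1); A = A + A.T              # random undirected graph
X = rng.normal(size=(n, d0))                # layer-0 embeddings = raw features

# Two layers: each node's final embedding mixes its 2-hop neighborhood.
H1 = gnn_layer(A, X,  rng.normal(size=(d0, d1)))
H2 = gnn_layer(A, H1, rng.normal(size=(d1, d2)))
print(H2.shape)  # (6, 4): one final 4-dimensional embedding per node
```

Stacking two such layers reproduces the 2-hop behavior described above: the final embedding of node A depends on the input features of every node within two hops of A.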

4.2 Characteristics, Advantages, and Limitations of Graph Neural Networks

        Graph Neural Networks (GNNs) have several notable features and advantages that set them apart from traditional matrix completion methods. These features contribute to their effectiveness in recommender systems. Let's explore these features:

  • Order invariance: GNNs are order invariant, meaning that the order in which nodes are labeled does not affect the result. Computation graphs consider node connectivity rather than node order, using order-invariant aggregation functions (e.g. average, max/min pooling) for message passing; a small code check after this list demonstrates this.
  • Size invariance: Each node in a GNN has its own computation graph, which makes GNNs invariant to the size of the graph. This allows individual nodes to process and integrate information from their local neighborhoods, enabling personalized and flexible learning. The figure below shows the computation graph for each node in the figure above.

Figure 8: Computational graph for each node in the input graph of Figure 7.

  • Handling sparse matrices: Unlike classical matrix completion methods, GNNs are good at handling sparse matrices. They go beyond direct node interactions and capture hidden dependencies present in higher-order graph structures. This enhances their performance in scenarios with limited interactions.
  • End-to-end learning: GNNs provide end-to-end learning, simultaneously optimizing the embeddings and the prediction task. This alleviates the need for manual feature engineering, simplifying the recommendation pipeline. Additionally, GNNs adapt well to evolving user/item features, reducing the need for major code modifications.
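To see order invariance concretely, the check below relabels the nodes of a toy graph with a random permutation and verifies that mean aggregation produces the same embeddings, just relabeled the same way. The graph and features are arbitrary examples.

```python
import numpy as np

def mean_aggregate(A, X):
    """Order-invariant aggregation: average over each node's neighborhood."""
    A_hat = A + np.eye(A.shape[0])  # include the node's own features
    return (A_hat @ X) / A_hat.sum(axis=1, keepdims=True)

rng = np.random.default_rng(1)
n, d = 5, 3
A = rng.integers(0, 2, (n, n))
A = np.triu(A, 1); A = A + A.T      # toy undirected graph
X = rng.normal(size=(n, d))         # toy node features

perm = rng.permutation(n)           # relabel the nodes in a random order
out = mean_aggregate(A, X)
out_perm = mean_aggregate(A[perm][:, perm], X[perm])

# Same embeddings, just listed in the permuted order:
print(np.allclose(out[perm], out_perm))  # True
```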

Although GNNs have advantages, they also have limitations that should be considered:

  • Computational complexity: GNNs can be computationally intensive, especially for large graphs and deep architectures. Training GNNs can require significant computational resources and longer training times than simpler models.
  • Interpretability: The complexity of GNNs can make them less interpretable than traditional methods. Understanding the inner workings and the reasoning behind GNN-based recommendations can be challenging.

5. Conclusion

        In this article, we explored the potential of graph neural networks (GNNs) in recommender systems, emphasizing their advantages over traditional matrix completion methods. GNNs provide a powerful framework for leveraging graph theory to improve recommender systems.

        By exploiting the rich information embedded in the graph structure, GNNs can capture complex patterns, discover latent features, and consider the influence of neighboring users during the recommendation process. This approach enhances the ability of recommender systems to make accurate predictions, even in sparse datasets where classical methods struggle.

        As the field of recommender systems continues to grow, GNNs have emerged as a promising solution to the limitations of traditional methods. Their ability to adapt to different domains and automatically learn from data makes them ideal for providing relevant and tailored recommendations in a variety of situations.

        In the next part of this series, we will delve into the mathematical basis of GNNs, with a special focus on the application of LightGCN to movie recommendation systems. By understanding the fundamentals and algorithms, we can learn more about how GNNs can change the landscape of recommender systems.
