#Reading Paper# 【Sequential Recommendation Review】 IJCAI'19: Sequential Recommender Systems: Challenges, Progress and Prospects

#Paper Title: Sequential Recommender Systems: Challenges, Progress and Prospects
#Paper Link: https://www.researchgate.net/publication/337183009_Sequential_Recommender_Systems_Challenges_Progress_and_Prospects
#Open-Source Code: none released
#Venue: IJCAI 2019
#Affiliations: University of Shanghai for Science and Technology, Macquarie University, University of Technology Sydney

1. Introduction

A sequential recommender system (SRS) differs from traditional recommender systems such as content-based and collaborative filtering approaches, which model user-item interactions in a static way and therefore capture only users' general preferences. In contrast, an SRS models user-item interactions as a dynamic sequence and exploits sequential dependencies to capture users' current and recent preferences.
(Figure from the paper: an example of a user's sequential interactions — booking a flight, then a hotel, then a rental car.)

The authors first analyze the motivation behind the rise of SRSs, making three main points:

  1. The interaction between the user and the item is sequence-dependent.
    For example, in the figure above, Jimmy first buys a plane ticket and then books a hotel. His choice of hotel clearly depends on the flight he has booked: he will tend to pick a hotel not far from the airport. After booking the hotel, his car rental is in turn affected by that booking, since he will probably choose a rental company whose pick-up location is close to the hotel. In this series of interactions, each of Jimmy's actions depends on the previous ones; such dependencies are very common in transaction data.
  2. Users' interests and items' popularity change dynamically over time.
    For example, I used to like using an iPhone but now prefer Huawei; Nokia phones used to be very popular but are now rarely seen. Such dynamic changes can only be captured effectively by an SRS.
  3. User-item interactions usually occur in a specific sequential context, and different contexts lead to different user behaviors.
    For example, when scrolling through Douyin, if I suddenly come across a video I am very interested in, I will probably like it; but if the app keeps recommending similar videos after I like a few of them, I will probably just swipe them away out of boredom. Compared with traditional recommender systems, an SRS can more easily diversify recommendation results and avoid homogeneity.

Next, the authors formalize sequential recommendation: a sequential recommender system generates recommendations by maximizing the following utility function:
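The original post shows this function only as an image. As a hedged reconstruction based on the survey's description (notation assumed here, not copied from the image): given a user-item interaction sequence S = {i_1, i_2, ..., i_{|S|}}, the SRS learns a utility (scoring) function f over candidate next interactions and outputs the recommendations R that maximize it:

$$
R = \arg\max f(S), \qquad S = \{i_1, i_2, \dots, i_{|S|}\}
$$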
Sequential recommendation differs from sequence modeling tasks in the general sense mainly because the sequence structure of those tasks is relatively simple, often containing only atomic elements (e.g., the words of a text), whereas in sequential recommendation each element of the sequence has a more complex structure (a tuple, e.g., a user-item interaction with its associated information).

2. Data characteristics and challenges

(Figure from the paper: data characteristics of user-item interaction sequences and the corresponding challenges.)

2.1 Handling long user-item interaction sequences

A longer user-item interaction sequence contains more interactions and more complex dependencies. This data characteristic presents two challenges.

  1. The challenge of learning higher-order sequential dependencies
    In long sequences, dependencies are often higher-order and cannot be handled well by Markov chains or factorization machines, because higher-order relationships imply more complex, multi-level cascading dependencies. The main current approaches to this challenge are higher-order Markov chains and RNNs, but both have limitations: the number of parameters of a higher-order Markov chain grows exponentially with its order, and the rigid order assumption of RNNs limits their use on sequences with flexible order.
  2. The challenge of learning long-term sequential dependencies
    In long sequences, two dependent items may be far apart. For example, in the shopping sequence {rose, egg, bread, milk, vase}, the rose and the vase are far apart yet clearly dependent. If an LSTM or GRU is used to capture such long-term relationships, false dependencies are easily produced, e.g., a spurious dependency between milk and vase, because these models tend to assume that adjacent items in a sequence are highly dependent. Current work on this problem mainly combines multiple sub-models into a mixture model to exploit their complementary strengths, but it is still limited overall.

2.2 Handling User-Item Interaction Sequences with Flexible Order

In the real world, some user-item interaction sequences are strictly ordered while others are not, i.e., not every pair of adjacent interactions is order-dependent.

For example, in the shopping sequence S2 = {milk, butter, flour}, it does not matter whether milk or butter is bought first, but buying both makes it more likely that flour will be bought next; that is, there is no strict order between milk and butter, while the purchase of flour depends on the combination of the two. Thus, for sequences with flexible order, capturing collective (set-level) sequential dependencies works much better than capturing point-wise dependencies, since collective dependencies do not assume a strict order over user-item interactions. How to capture collective sequential dependencies under a flexible-order assumption therefore becomes a key issue for SRSs handling such sequences.

There are not yet many studies on this issue. Existing SRSs based on Markov chains, factorization machines, or RNNs can only handle point-wise dependencies and are not good at modeling collective dependencies. Some recent works try to capture both local and global dependencies by exploiting the strengths of CNNs, applying them to the embedding matrix of a series of interactions.

2.3 Handling Noisy User-Item Interactions

In a user-item interaction sequence, some historical interactions are strongly correlated with the next interaction, while others may be weakly correlated or even uncorrelated.

For example, in the sequence {bacon, rose, egg, bread}, the rose is a noise item with no dependency on the other three items. The user is likely to buy milk next, which is unrelated to the rose but related to the other three items. The challenge raised by this characteristic is therefore how to learn sequential dependencies attentively and discriminatively; existing work addresses it mainly with attention models and memory networks.

2.4 Handling User-Item Interaction Sequences with Heterogeneous Relationships

When dealing with user-item interaction sequences associated with heterogeneous relations, the challenge is how to effectively capture the different types of relations embedded in the sequences and make them work together for sequential recommendation.

Heterogeneous relations are different types of relations that transfer different types of information and should be modeled differently in SRSs. For example, in a user-item interaction sequence, besides the ubiquitous occurrence-based sequential dependencies between interactions, there are also similarity-based relations among the interacted items in terms of their features. Furthermore, although both are sequential dependencies, long-term sequential dependencies are quite different from short-term ones and cannot be modeled in the same way.

  • Current work on this challenge is mainly based on mixture models.

2.5 Handling User-Item Interaction Sequences with Hierarchies

The authors argue that interaction sequences involve two kinds of hierarchy. One is the hierarchical structure between metadata and interactions, i.e., the influence of user demographic attributes and item features on user behavior and preferences; the other is the hierarchical relationship between subsequences and interactions, since a long sequence often contains many subsequences. The corresponding challenge is how to model the dependencies across these two hierarchies to generate more accurate recommendations.

Work in this direction mainly uses user and item features to enhance the models; in addition, there are structures such as hierarchical RNNs and hierarchical attention networks.

3. Research status

(Figure from the paper: a taxonomy of sequential recommendation approaches — traditional sequence models, latent representation models, and deep neural network models.)

3.1 Traditional sequence models

Traditional sequence models fall into two categories: sequential pattern mining and Markov chain models.

  1. Sequential pattern mining
    Sequential pattern-based recommendation first mines frequent patterns from sequence data and then uses the mined patterns to guide subsequent recommendations. Although simple, this approach often produces a large number of redundant patterns, which adds unnecessary time and space costs. Moreover, such methods tend to discard infrequent patterns and items, so recommendations for less popular items are limited.

  2. Markov chain models
    Markov chain-based recommender systems use a Markov chain to model the transitions between user-item interactions in order to predict the next interaction. Depending on the technique used, they can be divided into recommenders based on basic Markov chains and recommenders based on latent Markov embeddings.

The former calculates transition probabilities directly from explicitly observed transitions, while the latter first embeds the Markov chain into a Euclidean space and then computes transition probabilities between interactions from their Euclidean distances. Markov chain-based recommenders have two main drawbacks: on the one hand, because the Markov property assumes that the current interaction depends on only one or a few recent interactions, they capture only short-term dependencies and ignore long-term ones; on the other hand, they capture only point-wise dependencies and ignore the collective dependencies among user-item interactions.
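To make the basic (explicit) Markov chain approach concrete, below is a minimal first-order sketch in Python: transition probabilities are estimated by counting observed item-to-item transitions, and candidates for the next item are ranked by P(next | last item). The function names and toy data are illustrative, not taken from the paper.

```python
from collections import defaultdict

def fit_transitions(sequences):
    """Estimate first-order transition probabilities P(next | current)
    by counting observed item-to-item transitions."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    probs = {}
    for prev, nxts in counts.items():
        total = sum(nxts.values())
        probs[prev] = {item: c / total for item, c in nxts.items()}
    return probs

def recommend_next(probs, last_item, k=3):
    """Rank candidate next items by their transition probability from the last item."""
    candidates = probs.get(last_item, {})
    return sorted(candidates, key=candidates.get, reverse=True)[:k]

# Toy usage (illustrative data only)
sequences = [["flight", "hotel", "car_rental"],
             ["flight", "hotel", "restaurant"],
             ["hotel", "car_rental"]]
probs = fit_transitions(sequences)
print(recommend_next(probs, "hotel"))  # e.g. ['car_rental', 'restaurant']
```

A higher-order variant would condition on the last k items instead of one, which is exactly where the exponential growth in parameters mentioned earlier comes from.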

3.2 Latent representation models for sequential recommendation

Latent representation models first learn a latent representation of each user and item and then use the learned representations to predict the next user-item interaction. In doing so, more implicit relationships can be captured.

  1. Factorization machines
    Sequential recommendation based on factorization machines typically uses matrix or tensor factorization to decompose the observed user-item interactions into latent factors of users and items. The difference from collaborative filtering is that the matrix or tensor being factorized consists of interactions rather than the ratings used in CF. Such models are easily affected by the sparsity of the observed data, which limits their recommendation quality.

  2. Embedding
    Embedding-based sequential recommendation encodes user-item interaction sequences into a latent space, learning a latent representation of each user and item for subsequent recommendation. Specifically, some works feed the learned representations into a network to compute user-item interaction scores or to predict subsequent user behavior, while others use them directly to compute metrics such as Euclidean distance as the interaction score. These models are simple and efficient and have shown great potential in recent years.

3.3 Deep neural network models for sequential recommendation

  • Basic neural network models
  1. RNN-based SRSs
    These include basic RNNs, LSTM and GRU variants, as well as hierarchical RNNs. This type of model has almost dominated SRS research, but shortcomings remain: first, it easily produces false dependencies because its assumption is too strong (it assumes every pair of adjacent items in the sequence is dependent); second, it captures only point-wise dependencies and ignores collective dependencies. A minimal GRU-based sketch appears after this list.

  2. CNN-based SRSs
    A CNN treats the embedding matrix of a sequence like an image (similar to CNN usage in NLP). Its advantage is that it makes no strong assumption about the order of the sequence and instead learns patterns among local regions of the matrix, which avoids the over-strong assumption of RNNs. Its disadvantage is that it does not easily capture long-term dependencies.

  3. GNN-based SRSs
    The GNN-based approach treats each interaction as a node and each sequence as a path in a graph. Its advantage is the ability to provide more explainable recommendations. GNN-based SRSs are still in their infancy.
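As referenced in the RNN item above, here is a minimal GRU-based sketch in PyTorch, in the spirit of GRU4Rec-style models: item IDs are embedded, a GRU encodes the interaction sequence, and the final hidden state is projected to scores over all items. The layer sizes and single-layer design are illustrative assumptions, not the exact architecture of any cited paper.

```python
import torch
import torch.nn as nn

class GRURecommender(nn.Module):
    """Sketch of an RNN-based SRS: embed item IDs, encode the sequence with a GRU,
    and score every item as a candidate next interaction."""
    def __init__(self, num_items, dim=64, hidden=128):
        super().__init__()
        self.item_emb = nn.Embedding(num_items, dim)
        self.gru = nn.GRU(dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, num_items)

    def forward(self, seq_item_ids):              # (batch, seq_len)
        emb = self.item_emb(seq_item_ids)         # (batch, seq_len, dim)
        _, h_n = self.gru(emb)                    # h_n: (1, batch, hidden)
        return self.out(h_n.squeeze(0))           # (batch, num_items) logits

model = GRURecommender(num_items=1000)
logits = model(torch.tensor([[3, 17, 42, 8]]))    # one sequence of 4 interactions
next_item = logits.argmax(dim=-1)                 # predicted next item ID
```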

  • Advanced models
  1. Attention models
    Attention can be used to emphasize the interactions that are relevant and important to the next interaction and to downplay those that are irrelevant. It is widely combined with models such as RNNs to handle noisy interaction sequences (a minimal sketch appears after this list).
  2. Memory networks
    Memory networks capture the dependencies between past interactions and the next interaction by relying on a memory matrix. Storing and updating historical interactions in the memory matrix improves model performance and reduces the influence of irrelevant interactions.
  3. Mixture models
    Mixture models capture multiple kinds of dependencies by combining several sub-models. A typical example combines multiple encoders to capture long-term and short-term dependencies separately and then learns an accurate sequence representation. Such models are also in their infancy.
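As referenced in the attention item above, a minimal sketch of attention-based weighting: each historical interaction is scored against a query (here, the embedding of the most recent interaction), and the softmax-normalized weights form a sequence representation in which noisy or irrelevant interactions contribute less. The dot-product scoring and the choice of query are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def attention_pool(item_embs):
    """Weight past interactions by their relevance to the most recent one,
    so noisy/irrelevant items contribute less to the sequence representation.
    item_embs: (seq_len, dim) embeddings of the historical interactions."""
    query = item_embs[-1]                  # use the latest interaction as the query
    scores = item_embs @ query             # (seq_len,) dot-product relevance scores
    weights = F.softmax(scores, dim=0)     # normalize to attention weights
    return weights @ item_embs             # (dim,) attention-weighted summary

seq = torch.randn(5, 64)                   # 5 past interactions, 64-dim embeddings
summary = attention_pool(seq)              # representation used to score next items
```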

4. Future work

  1. Context-aware sequential recommender systems
    The current context of a user or item can greatly affect the user's choice of items and should be taken into account when making recommendations. This is even more necessary in sequential recommendation, where the context may change over time. However, most existing sequential recommenders ignore this important aspect, so context-aware sequential recommendation will be an important direction for future work;

  2. Social-aware sequential recommender systems
    Users live in a society and are connected with many people both online and offline, and the actions or opinions of others often strongly influence a user's choices. Existing research on sequential recommendation, however, largely ignores such social influence, leaving it as an open direction;

  3. Interactive sequential recommender systems
    Most real-world shopping behaviors are sequential rather than isolated events; in other words, there is a continuous sequential interaction between the user and the shopping platform (e.g., Amazon). However, existing sequential recommenders often ignore this interaction and generate recommendations for only one action at a single time step. How to incorporate user-platform interactions to generate multi-time-step recommendations is a promising research direction;

  4. Cross-domain sequential recommendation
    In the real world, the items a user purchases within a certain period usually come from multiple domains rather than a single one, and there are sequential dependencies between items from different domains, such as buying car insurance after buying a car. This cross-domain sequential dependency is ignored by most sequential recommenders. Cross-domain SRS is therefore another promising direction: by leveraging information from other domains it can generate more accurate recommendations, and by drawing on different domains, more diverse ones.


Source: blog.csdn.net/CRW__DREAM/article/details/128181772