Papers Read | A Survey on Multi-Task Learning

Summary 

Multi-task learning (MTL) is a learning paradigm in machine learning that aims to leverage the useful information contained in multiple related tasks to help improve the generalization performance of all the tasks.

First, MTL algorithms are divided into several categories: the feature learning approach, the low-rank approach, the task clustering approach, the task relation learning approach, and the decomposition approach; the characteristics of each approach are then discussed. To further improve the performance of learning tasks, MTL can be combined with other learning paradigms, including semi-supervised learning, active learning, unsupervised learning, reinforcement learning, multi-view learning, and graphical models. When the number of tasks is large or the data lie in a high-dimensional space, batch MTL models are difficult to apply, so the paper reviews online, parallel, and distributed MTL models as well as dimensionality reduction and feature hashing, revealing their advantages in computation and storage. Many real-world applications use MTL to improve their performance. Finally, theoretical analyses are presented and several future directions are discussed.

 

1 Introduction

When training data are difficult to collect, MTL is a good solution.

In MTL there are multiple learning tasks, each of which can be a general learning task such as a supervised task (e.g., a classification or regression problem), an unsupervised task (e.g., clustering), a semi-supervised task, a reinforcement learning task, a multi-view learning task, or a graphical model. Among these tasks, all of them, or at least a subset, are assumed to be related. In this case, learning the tasks jointly turns out to improve performance more effectively than learning each task alone. This observation led to the birth of MTL: when multiple tasks are related, MTL aims to improve the generalization performance of all of them.

MTL is similar to transfer learning [2], but there are significant differences. In MTL, no distinction is made among the tasks, and the goal is to improve the performance of all of them. Transfer learning, by contrast, aims to improve the performance of a target task with the help of source tasks, so the target task plays a more important role than the source tasks.

Therefore, MTL treats all tasks equally, while in transfer learning the target task attracts most of the attention. In [3], [4], [5], researchers studied a new MTL setting called asymmetric multi-task learning. This setting considers a different scenario: after multiple tasks have been learned jointly by some MTL method, a new task arrives. A simple solution is to learn the old and new tasks together from scratch, but that requires a lot of computation. Asymmetric multi-task learning instead learns only the new task with the help of the old tasks, so its core issue is how to transfer the knowledge contained in the old tasks to the new task. In this sense, this setting is more similar to transfer learning than to MTL.

In this paper, we study MTL. After giving a definition of MTL, we divide MTL algorithms into several categories: the feature learning approach (which can be further divided into the feature transformation and feature selection approaches), the low-rank approach, the task clustering approach, the task relation learning approach, and the decomposition approach, and we discuss the characteristics of each approach. MTL can be combined with other learning paradigms to further improve performance, so we also discuss combinations of MTL with semi-supervised learning, active learning, unsupervised learning, reinforcement learning, multi-view learning, and graphical models. When the number of tasks is large, the total amount of training data over all tasks is likely to be very large, which calls for online and parallel MTL models. In this case, the training data of different tasks may also reside on different machines, so distributed MTL models are a good solution. Within MTL, dimensionality reduction and feature hashing are important tools for reducing the dimensionality of high-dimensional data, so we review techniques that are useful for handling big data in multiple tasks. As a general learning paradigm, MTL has broad applications in many areas, and the paper briefly reviews its applications in computer vision, bioinformatics, health informatics, speech, natural language processing, web applications, and ubiquitous computing.

 

2 MTL model

Definition (MTL): given m learning tasks Ti (i = 1, ..., m), where all the tasks or a subset of them are related, multi-task learning aims to improve the learning of the model for each task Ti by using the knowledge contained in all or some of the m tasks.
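To make the definition concrete, here is a minimal sketch of the common regularized formulation for linear MTL models (my illustration, not an equation quoted from the survey):

    min_{W}  sum_{i=1}^m (1/n_i) sum_{j=1}^{n_i} l( y_j^i, (w^i)^T x_j^i )  +  lambda * R(W)

where (x_j^i, y_j^i) is the j-th of the n_i training examples of task Ti, w^i is the parameter vector of task Ti, W = (w^1, ..., w^m), and the regularizer R(W) encodes the assumed task relatedness. Most of the approaches surveyed below differ mainly in their choice of R(W).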

Here we need to distinguish between homogeneous MTL and heterogeneous MTL. In [7], heterogeneous MTL is taken to consist of different types of supervised tasks, including classification and regression problems; here we extend it to the more general setting in which the tasks can be of different types, including supervised learning, unsupervised learning, semi-supervised learning, reinforcement learning, multi-view learning, and graphical models. The opposite of heterogeneous MTL is homogeneous MTL, which contains only one type of task. In summary, homogeneous and heterogeneous MTL differ in the type of the learning tasks, while homogeneous-feature and heterogeneous-feature MTL differ in the original feature representations. Unless otherwise specified, the default setting is homogeneous MTL with homogeneous features.

To characterize the task relatedness in the definition of MTL, three issues need to be addressed: when to share, what to share, and how to share.

"When sharing" issue is a problem for more than one task to choose between single-tasking and multitasking request model. Currently, such decisions are made by human experts, very few people learning study its methods. A simple solution is to calculate such decisions represented as a model selection problem, then use the model selection techniques (such as cross-validation) to make decisions, but this solution is often computationally intensive and may require more training data . Another solution is to use a multi-task model, the model can be reduced to single-task model, such problem (34), when learning becomes different tasks (plus and symbols) can be decoupled. In this case, we can make the training data to determine (and plus sign) in the form, in order to make the implicit choice.

"What to share" determines the form through which knowledge can be shared among all the tasks. Typically, the shared content takes three forms: features, parameters, and instances. Feature-based MTL aims to learn common features across different tasks as the way to share knowledge. Instance-based MTL hopes to identify, for each task, the useful data instances in the other tasks and then share knowledge through the identified instances. Parameter-based MTL uses the model parameters of some tasks (e.g., the coefficients of linear models) to help learn the model parameters of other tasks, for example via regularization. Conventional MTL research focuses on feature-based and parameter-based methods, and instance-based methods have received less study. A representative instance-based method is the multi-task distribution matching method proposed in [8], which first estimates, for each instance, the ratio between the probability that the instance together with its label comes from its own task and the probability that it comes from the mixture of all tasks; after determining the density ratios, it learns the model parameters of each task from the weighted training data of all tasks. Since instance-based MTL has received little attention, we mainly review feature-based and parameter-based MTL models; a small instance-weighting sketch is given below for intuition.
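For intuition only, here is a rough sketch of instance-based sharing using the standard probabilistic-classifier trick for density-ratio estimation; [8] uses a different estimator, so treat every detail below as an assumption of this illustration:

    # Weight instances from other tasks by an estimated density ratio
    # p_target(x) / p_others(x), then fit the target task on the
    # weighted, pooled data.
    import numpy as np
    from sklearn.linear_model import LogisticRegression, Ridge

    rng = np.random.default_rng(0)
    X_t = rng.normal(loc=0.0, size=(40, 5))    # target-task inputs
    X_s = rng.normal(loc=0.5, size=(200, 5))   # other tasks' inputs
    y_t = X_t.sum(axis=1) + 0.1 * rng.normal(size=40)
    y_s = X_s.sum(axis=1) + 0.1 * rng.normal(size=200)

    # Classifier distinguishing "target task" (1) from "other tasks" (0).
    X_mix = np.vstack([X_t, X_s])
    z = np.concatenate([np.ones(len(X_t)), np.zeros(len(X_s))])
    clf = LogisticRegression().fit(X_mix, z)

    # Density ratio via p/(1-p), corrected for the class-size imbalance.
    p = clf.predict_proba(X_s)[:, 1]
    weights = p / (1.0 - p) * (len(X_s) / len(X_t))

    # Fit the target task on its own data plus re-weighted foreign data.
    X_all = np.vstack([X_t, X_s])
    y_all = np.concatenate([y_t, y_s])
    w_all = np.concatenate([np.ones(len(X_t)), weights])
    model = Ridge(alpha=1.0).fit(X_all, y_all, sample_weight=w_all)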

After determining "what to share", "how to share" specifies the concrete ways to share knowledge among tasks. In feature-based MTL, there is one main approach: feature learning. Feature learning approaches focus on learning common feature representations for multiple tasks based on shallow or deep models, where the learned common representation can be a transformation of the original features or a subset of them. In parameter-based MTL, there are four main approaches: the low-rank approach, the task clustering approach, the task relation learning approach, and the decomposition approach. The low-rank approach interprets the relatedness of multiple tasks as the low rank of the parameter matrix of these tasks. The task clustering approach assumes that the tasks form several clusters, where the tasks within a cluster are related to one another. The task relation learning approach aims to learn quantitative task relations from the data automatically. The decomposition approach decomposes the model parameters of all tasks into two or more components, which are penalized by different regularizers. In total, there are five main approaches for feature-based and parameter-based MTL. In the following sections, we review these approaches roughly in chronological order, in order to reveal the relations among them and the evolution of the different models.
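As a rough illustration of how the four parameter-based approaches differ (my summary in the notation of the sketch above, not equations quoted from the survey), typical choices of the regularizer R(W) look like:

    Low-rank:               R(W) = ||W||_*                                  (trace norm; encourages a low-rank W)
    Task clustering:        R(W) = sum_k sum_{i in C_k} ||w^i - c_k||_2^2   (tasks gather around cluster centers c_k)
    Task relation learning: R(W) = tr( W Sigma^{-1} W^T )                   (Sigma is a task covariance learned from data)
    Decomposition:          W = W_1 + W_2, with separate penalties on W_1 and W_2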

 

Feature transformation approach

Samples from all tasks are fed into the input layer; after passing through the hidden layers, the prediction for a sample is read from the output unit of the task the sample belongs to.

The outputs of the hidden layer can be viewed as the common feature representation learned for the m tasks, and the transformation from the original features to this representation depends on the weights connecting the input and hidden layers as well as the activation function of the hidden units. Thus, if the hidden-layer activation function is linear, the transformation is linear; otherwise it is nonlinear. Compared with the multilayer feedforward neural network for single-task learning, the difference lies in the output layer: the single-task network has only one output unit, while the MTL network has m. In [9], a radial basis function network with a single hidden layer is extended to MTL by greedily determining the structure of the hidden layer. Different from these neural network models, Silver et al. proposed a context-sensitive multi-task neural network, which has only one output unit shared by all tasks but takes a task-specific context as an additional input.
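A minimal sketch of such a network, assuming PyTorch (the layer sizes and m = 3 tasks are illustrative choices, not values from the survey):

    import torch
    import torch.nn as nn

    class MultiTaskNet(nn.Module):
        def __init__(self, d_in=10, d_hidden=32, m_tasks=3):
            super().__init__()
            # Shared hidden layer: its outputs are the common feature
            # representation learned for all m tasks.
            self.hidden = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
            # One output unit per task; a single-task network would
            # have just one of these heads.
            self.heads = nn.ModuleList(
                [nn.Linear(d_hidden, 1) for _ in range(m_tasks)]
            )

        def forward(self, x, task_id):
            h = self.hidden(x)             # nonlinear transformation (ReLU)
            return self.heads[task_id](h)  # prediction for the sample's task

    net = MultiTaskNet()
    x = torch.randn(4, 10)                 # a mini-batch from task 0
    print(net(x, task_id=0).shape)         # torch.Size([4, 1])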

Different from these connectionist models, i.e., multilayer feedforward neural networks, the multi-task feature learning (MTFL) method works under the regularization framework, with the objective function given below.
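A reconstruction of the MTFL objective, following Argyriou et al.'s formulation (the notation may differ slightly from the survey's):

    min_{A, U, b}  sum_{i=1}^m sum_{j=1}^{n_i} l( y_j^i, (a^i)^T U^T x_j^i + b_i )  +  lambda * ||A||_{2,1}^2
    s.t.  U U^T = I

Here U is an orthogonal transformation of the original features, A = (a^1, ..., a^m) collects the task parameters in the transformed space, b_i is the offset of task Ti, and the squared l_{2,1} norm of A drives the tasks to select a common subset of the transformed features.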

