Using the MATLAB Tensor Toolbox

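The Tensor Toolbox for MATLAB is a library for working with tensors. It provides several tensor storage classes together with functions that operate on them.
Download: see the download page for the latest release.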

Tensor storage formats

The toolbox supports many storage formats for tensors; only two are covered here.

Tucker format

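The Tucker format decomposes a tensor $\mathcal{X}$ into a core tensor $\mathcal{G}$ multiplied by a matrix along each mode (for example, $A$, $B$, $C$). In other words, $\mathcal{X}$ can be written as:
$\mathcal{X}=\mathcal{G} \times_{1} A \times_{2} B \times_{3} C$
This expression can look opaque at first, so let us write it as a sum:
$\mathcal{X} = \sum_{i=1}^{R_1}\sum_{j=1}^{R_2}\sum_{k=1}^{R_3}\mathcal{G}(i,j,k)\,A(:,i)\otimes B(:,j)\otimes C(:,k),$
where $R_1, R_2, R_3$ are the numbers of columns of $A, B, C$, which are also the sizes of the core $\mathcal{G}$.

In short: take every combination of one column from each of $A, B, C$, form their outer product, and sum the results. The weight of each term is the entry of $\mathcal{G}$ at position $(i, j, k)$, where $(i, j, k)$ are the indices of the columns taken from $A, B, C$. The tensor product $\otimes$ is written $\circ$ in some references; here it denotes the outer product.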
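The sum above can be checked numerically. The following is a minimal sketch in base MATLAB that does not use the toolbox; the sizes and variable names are assumptions chosen for illustration. It builds the tensor term by term from outer products and compares the result with the equivalent Kronecker identity $\mathrm{vec}(\mathcal{X}) = (C \otimes B \otimes A)\,\mathrm{vec}(\mathcal{G})$, where $\otimes$ in this identity is the matrix Kronecker product.

```matlab
% Base MATLAB only; sizes and names are illustrative assumptions.
G = rand(3,2,2);                               % core tensor, R1 x R2 x R3
A = rand(5,3); B = rand(4,2); C = rand(3,2);   % one factor matrix per mode

% Sum of weighted outer products, term by term.
X = zeros(size(A,1), size(B,1), size(C,1));
for i = 1:size(A,2)
    for j = 1:size(B,2)
        for k = 1:size(C,2)
            % Outer product of the three columns, stored as a 5 x 4 x 3 array.
            outer = reshape(kron(C(:,k), kron(B(:,j), A(:,i))), size(X));
            X = X + G(i,j,k) * outer;
        end
    end
end

% Same tensor from the vectorized identity vec(X) = (C kron B kron A) * vec(G).
Xkron = reshape(kron(C, kron(B, A)) * G(:), size(X));
err = max(abs(X(:) - Xkron(:)));               % should be at round-off level
```

The triple loop mirrors the formula directly and is only meant for small examples; the toolbox's ttensor class avoids forming the full array at all.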

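In MATLAB this is written $\mathrm{X}=\mathrm{ttm}(\mathrm{G},\{\mathrm{A}, \mathrm{B}, \mathrm{C}\})$. The ttensor format stores the components of $\mathrm{X}$ (the core and the factor matrices) and supports many operations, such as $\mathrm{ttm}$, without explicitly forming the tensor $\mathrm{X}$.

Basic usage is shown below (much of it carries over from ordinary MATLAB matrices):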

clc
clear
close all
core = tensor(rand(3,2,1),[3 2 1]); %<-- The core tensor.
U = {rand(5,3), rand(4,2), rand(3,1)}; %<-- The matrices.
X = ttensor(core,U) %<-- Create the ttensor.
core1 = sptenrand([3 2 1],3); %<-- Create a 3 x 2 x 1 sptensor (sparse tensor) with 3 nonzeros.
Y = ttensor(core1,U) %<-- Core is a sptensor.
V = {rand(3,2),rand(2,2),rand(1,2)}; %<-- Create some random matrices.
core2 = ktensor(V); %<-- Create a 3 x 2 x 1 ktensor.
Y = ttensor(core2,U) %<-- Core is a ktensor.
core3 = ttensor(tensor(1:8,[2 2 2]),V); %<-- Create a 3 x 2 x 1 ttensor.
Y = ttensor(core3,U) %<-- Core is a ttensor.
Z = ttensor(tensor(rand(2,1),2), rand(4,2)) %<-- One-dimensional ttensor.
X.core %<-- Core tensor.
X.U %<-- Cell array of matrices.
Y = ttensor(X.core,X.U) %<-- Recreate a tensor from its parts.
X = ttensor %<-- empty ttensor
X = ttensor(core,U) %<-- Create a tensor
full(X) %<-- Converts to a tensor (expands the factored form).
tensor(X) %<-- Also converts to a tensor.
double(X) %<-- Converts to a MATLAB array.
ndims(X) %<-- Number of dimensions.
size(X) %<-- Row vector of the sizes.
size(X,2) %<-- Size of the 2nd mode.
X.core(1,1,1) %<-- Access an element of the core.
X.U{2} %<-- Extract a matrix.
X{2} %<-- Same as above.
X.core = tenones(size(X.core)) %<-- Insert a new core (a tensor of all ones).
X.core(2,2,1) = 7 %<-- Change a single element.
X{3}(1:2,1) = [1;1] %<-- Change the matrix for mode 3.
X{end}  %<-- The same as X{3}.
X = ttensor(tenrand([2 2 2]),{rand(3,2),rand(1,2),rand(2,2)}) %<-- Data.
+X %<-- Calls uplus.
-X %<-- Calls uminus.
5*X %<-- Calls mtimes.
permute(X,[3 2 1]) %<-- Reverses the modes of X.
disp(X) %<-- Prints out the ttensor.

Kruskal format

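The Kruskal format decomposes a tensor $\mathcal{X}$ into a sum of outer products of matrix columns. For example, we can write
$\mathcal{X}=\sum_{r} a_{r} \circ b_{r} \circ c_{r}$
where the subscript is the column index and the circle denotes the outer product. In other words, the tensor $\mathcal{X}$ is built from the columns of the matrices $A$, $B$ and $C$. It is often helpful to state the weight of each outer product explicitly, which we do here:
$\mathcal{X}=\sum_{r} \lambda_{r} a_{r} \circ b_{r} \circ c_{r}$
The toolbox's ktensor type stores the components of $\mathcal{X}$ (the weights and the factor matrices) and supports many operations, such as ttm, without explicitly forming the tensor $\mathcal{X}$.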
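To make the weighted sum concrete, here is a minimal sketch in base MATLAB that does not use the toolbox; the sizes, weights and variable names are assumptions for illustration. It assembles the tensor from weighted outer products and checks it against the standard matricized form $X_{(1)} = A\,\mathrm{diag}(\lambda)\,(C \odot B)^{T}$, where $\odot$ is the Khatri-Rao (column-wise Kronecker) product.

```matlab
% Base MATLAB only; sizes, weights and names are illustrative assumptions.
lambda = [5.0; 0.25];                          % one weight per component
A = rand(4,2); B = rand(3,2); C = rand(2,2);   % column r holds a_r, b_r, c_r
R = numel(lambda);

% Weighted sum of outer products a_r o b_r o c_r.
X = zeros(size(A,1), size(B,1), size(C,1));
for r = 1:R
    outer = reshape(kron(C(:,r), kron(B(:,r), A(:,r))), size(X));
    X = X + lambda(r) * outer;
end

% Khatri-Rao product of C and B, built column by column.
KR = zeros(size(C,1) * size(B,1), R);
for r = 1:R
    KR(:,r) = kron(C(:,r), B(:,r));
end

% Mode-1 unfolding computed from the factors alone (4 x 6 matrix).
X1 = A * diag(lambda) * KR.';
err = max(max(abs(reshape(X, size(A,1), []) - X1)));   % round-off level
```

The second form is the reason factored storage is convenient: many operations reduce to products of small matrices.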

Basic usage of this storage format:

clc
clear
close all
rng(0); %<-- Seed the random number generator (replaces the legacy rand('state',0)).
A = rand(4,2); %<-- First column is a_1, second is a_2.
B = rand(3,2); %<-- Likewise for B.
C = rand(2,2); %<-- Likewise for C.
X = ktensor({A,B,C}) %<-- Create the ktensor from a cell array of matrices.
Y = ktensor({rand(4,1),rand(2,1),rand(3,1)}) %<-- Another ktensor.
lambda = [5.0; 0.25]; %<-- Weights for each factor.
X = ktensor(lambda,{A,B,C}) %<-- Create the ktensor.
Y = ktensor({rand(4,5)}) %<-- A one-dimensional ktensor.
X.lambda %<-- Weights or multipliers.
X.U %<-- Cell array of matrices.
Y = ktensor(X.lambda,X.U) %<-- Recreate X.
Z = ktensor %<-- Empty ktensor.
full(X) %<-- Converts to a tensor.
tensor(X) %<-- Same as above.
double(X) %<-- Converts to a (multidimensional) array.
R = length(X.lambda);  %<-- Number of factors in X.
core = tendiag(X.lambda, repmat(R,1,ndims(X))); %<-- Create a diagonal core tensor.
Y = ttensor(core, X.U) %<-- Assemble the ttensor.
norm(full(X)-full(Y)) %<-- They are the same.
core = sptendiag(X.lambda, repmat(R,1,ndims(X))); %<-- Sparse diagonal core.
Y = ttensor(core, X.U) %<-- Assemble the ttensor.
norm(full(X)-full(Y)) %<-- They are the same.
ndims(X) %<-- Number of dimensions.
size(X) %<-- Row vector of the sizes.
size(X,2) %<-- Size of the 2nd mode.
X(1,1,1) %<-- Assemble the (1,1,1) element (requires computation).
X.lambda(2) %<-- Weight of 2nd factor.
X.U{2} %<-- Extract a matrix.
X{2} %<-- Same as above.
X.lambda = ones(size(X.lambda)) %<-- Insert new multipliers.
X.lambda(1) = 7 %<-- Change a single element of lambda.
X{3}(1:2,1) = [1;1] %<-- Change the matrix for mode 3.
X(3:end,1,1)  %<-- Calculates X(3,1,1) and X(4,1,1); indexing works as for arrays.
X(1,1,1:end-1)  %<-- Calculates X(1,1,1).
X{end}  %<-- Or use inside of curly braces. This is X{3}.
X = ktensor({rand(4,2),rand(2,2),rand(3,2)}) %<-- Data.
Y = ktensor({rand(4,2),rand(2,2),rand(3,2)}) %<-- More data.
Z = X + Y %<-- Concatenates the factor matrices.
Z = X - Y %<-- Concatenates as with plus, but changes the weights.
norm( full(Z) - (full(X)-full(Y)) ) %<-- Should be zero.
5*X %<-- Calls mtimes; only the weights are scaled.
permute(X,[2 3 1]) %<-- Reorders the modes of X into the given order.
X = ktensor({rand(3,2),rand(4,2),rand(2,2)})  % <-- Unit weights.
arrange(X) %<-- Normalized and rearranged.
Y = X;
Y.U{1}(:,1) = -Y.U{1}(:,1);  % switch the sign on a pair of columns
Y.U{2}(:,1) = -Y.U{2}(:,1)
fixsigns(Y)
A = rand(4,3) %<-- A random matrix.
[U,S,V] = svd(A,0); %<-- Compute the SVD.
X = ktensor(diag(S),{U,V}) %<-- Store the SVD as a ktensor.
double(X) %<-- Reassemble the original matrix.
disp(X) %<-- Displays the vector lambda and each factor matrix.
X = ktensor({[0.8 0.1 1e-10]',[1e-5 2 3 1e-4]',[0.5 0.5]'}); %<-- Create tensor.
X = arrange(X) %<-- Normalize the factors.
labelsDim1 = {'one','two','three'}; %<-- Labels for mode 1.
labelsDim2 = {'A','B','C','D'}; %<-- Labels for mode 2.
labelsDim3 = {'on','off'}; %<-- Labels for mode 3.
datadisp(X,{labelsDim1,labelsDim2,labelsDim3}) %<-- Display with labels.

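Using the functions (to be extended)

The Tensor Toolbox contains many useful functions; a few are listed here as an illustration. For a ktensor, the following functions are available: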

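    arrange      - rearrange the rank-1 components of a ktensor
    datadisp     - special display of a ktensor
    disp         - command window display
    display      - command window display
    double       - convert to a multidimensional double array
    end          - last index, used as with matrices
    extract      - create a new ktensor from the specified components
    fixsigns     - fix the sign ambiguity
    full         - convert to a dense tensor
    innerprod    - inner product with a ktensor
    isequal      - test whether two ktensors are equal
    isscalar     - always false for a ktensor
    issymmetric  - test whether the ktensor is symmetric in all modes
    ktensor      - create a ktensor
    mask         - extract values specified by a mask tensor
    minus        - binary subtraction of ktensors
    mtimes       - scalar multiplication of a ktensor
    mttkrp       - matricized tensor times Khatri-Rao product
    ncomponents  - number of components of a ktensor
    ndims        - number of dimensions of a ktensor
    norm         - Frobenius norm of a ktensor
    normalize    - normalize the columns of the factor matrices
    nvecs        - compute the leading mode-n vectors
    permute      - permute the dimensions of a ktensor
    plus         - binary addition of ktensors
    redistribute - distribute the lambda values to a specified mode
    score        - check whether two ktensors match (up to permutation)
    size         - size of a ktensor
    subsasgn     - subscripted assignment for a ktensor
    subsref      - subscripted reference for a ktensor
    symmetrize   - symmetrize a ktensor in all modes
    times        - element-wise multiplication of ktensors
    tocell       - convert to a cell array
    tovec        - convert a ktensor to a vector
    ttm          - tensor times matrix
    ttv          - tensor times vector
    uminus       - unary minus
    update       - update the values of one mode
    uplus        - unary plus
    viz          - visualize a ktensor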
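As an illustration of why functions such as norm can work on the factored form, the sketch below uses the standard identity $\|\mathcal{X}\|_F^2 = \lambda^{T}\,(A^{T}A * B^{T}B * C^{T}C)\,\lambda$, where $*$ is the element-wise product. It is written in base MATLAB with assumed sizes and names, and it is not claimed to be the toolbox's actual implementation.

```matlab
% Base MATLAB only; sizes, weights and names are illustrative assumptions.
lambda = [2; 0.5];
A = rand(4,2); B = rand(3,2); C = rand(2,2);

% Reference value: form every entry explicitly as vec(X), then take its 2-norm.
xvec = zeros(size(A,1) * size(B,1) * size(C,1), 1);
for r = 1:numel(lambda)
    xvec = xvec + lambda(r) * kron(C(:,r), kron(B(:,r), A(:,r)));
end
normFull = norm(xvec);

% Factored value: only R x R Gram matrices of the factors are needed.
gram = (A' * A) .* (B' * B) .* (C' * C);
normFactored = sqrt(lambda' * gram * lambda);
```

For R components the factored computation works with R x R matrices, so its cost does not grow with the number of entries of the full tensor.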

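CP decomposition of tensors (tensor completion)

The toolbox implements the CP Weighted Optimization (CP-WOPT) method. The method allows some entries of the tensor to be missing (which turns the task into a tensor completion problem), and noise can also be added to the entries that are present. For details see: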

E. Acar, D. M. Dunlavy, T. G. Kolda and M. Mørup, Scalable Tensor Factorizations for Incomplete Data, Chemometrics and Intelligent Laboratory Systems, 106(1):41-56, 2011

Some example code:

clc
clear
close all
R = 2;
info = create_problem('Size', [15 10 5], 'Num_Factors', R, ...
    'M', 0.25, 'Noise', 0.10);
X = info.Data;
P = info.Pattern;
M_true= info.Soln;
M_init = create_guess('Data', X, 'Num_Factors', R, ...
    'Factor_Generator', 'nvecs');
[M,~,output] = cp_wopt(X, P, R, 'init', M_init);
exitmsg = output.ExitMsg
scr = score(M,M_true)

clc
clear
close all
R = 2;
info = create_problem('Size', [150 100 50], 'Num_Factors', R, ...
    'M', 0.95, 'Sparse_M', true, 'Noise', 0.00);
X = info.Data;
P = info.Pattern;
M_true= info.Soln;
M_init = create_guess('Data', X, 'Num_Factors', R, ...
    'Factor_Generator', 'nvecs');
[M,~,output] = cp_wopt(X, P, R, 'init', M_init);
exitmsg = output.ExitMsg
scr = score(M,M_true)

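The code is largely self-explanatory, but a few points are worth spelling out. In the info object returned by create_problem, info.Soln is the complete data, not contaminated by noise; we call it the exact solution, and it is stored in the Kruskal (ktensor) format described above. info.Pattern is an indicator tensor marking which entries are present (not missing). info.Data is the data after entries have been removed and noise has been added, stored as an ordinary tensor. Num_Factors is the upper limit of the sum in the CP decomposition, that is, the number of rank-1 components. create_guess supplies an initial guess for the iterative method, and M is the completed tensor in Kruskal (CP) format. score measures how well the recovered model matches the original one; it is not restricted to the missing entries.

Judging from the results, the entries that were originally missing may not be recovered particularly well. It does appear, however, that the larger tensor is completed somewhat better.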
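The role of the pattern tensor can be made concrete with the weighted least-squares objective described in the CP-WOPT paper cited above, $f = \tfrac{1}{2}\,\| P * (\mathcal{X} - \mathcal{M}) \|^2$, where $*$ is the element-wise product. The following is a minimal sketch in base MATLAB, not the toolbox's implementation: the sizes, the deterministic missing pattern, the storage of missing entries as zeros, and the helper names kr, cpfull and fw are all assumptions for illustration.

```matlab
% Base MATLAB only; sizes and helper names are illustrative assumptions.
I = 6; J = 5; K = 4; R = 2;
A = rand(I,R); B = rand(J,R); C = rand(K,R);

% Hypothetical helper: Khatri-Rao (column-wise Kronecker) product.
kr = @(C,B) reshape(bsxfun(@times, reshape(C, 1, size(C,1), []), ...
                           reshape(B, size(B,1), 1, [])), [], size(B,2));
% Hypothetical helper: dense tensor from CP factors via the mode-1 unfolding.
cpfull = @(A,B,C) reshape(A * kr(C,B).', [size(A,1), size(B,1), size(C,1)]);

Mtrue = cpfull(A, B, C);          % noise-free "exact solution"
P = true(I,J,K);
P(1:4:end) = false;               % mark every 4th entry (25%) as missing
X = Mtrue .* P;                   % missing entries stored as zero (assumption)

% Weighted objective: only entries with P == true contribute.
fw = @(M) 0.5 * sum(sum(sum((P .* (X - M)).^2)));
f_true = fw(Mtrue);                       % exact model fits the observed data
f_pert = fw(cpfull(A + 0.1, B, C));       % perturbed model gives a positive value

% Relative error measured on the missing entries only.
Mpert = cpfull(A + 0.1, B, C);
relerr_missing = norm(Mpert(~P) - Mtrue(~P)) / norm(Mtrue(~P));
```

Because the objective ignores the missing positions, a small objective value does not by itself say how well those positions are recovered; an error computed on the missing entries alone, as in the last line, is one way to check that separately.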