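Data Operations

This section introduces common tensor operations.

In PyTorch, the tensor is the main tool for storing and computing on data. A tensor is very similar to a NumPy multidimensional array; the difference is that a tensor also supports GPU computation and automatic differentiation.

Tensor: a single scalar is a 0-dimensional tensor, a vector is a 1-dimensional tensor, a matrix is a 2-dimensional tensor, and so on.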

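Creating Tensors

1. Creating tensors with built-in torch functions

torch.Tensor(): constructor of the Tensor class (produces the default float type)
torch.tensor(): builds a tensor from existing data; the argument can be a list, a tuple, a NumPy array, or a scalar (a Python set is not accepted)
torch.ones(): tensor filled with ones
torch.zeros(): tensor filled with zeros
torch.eye(): identity matrix
torch.arange(s, e, step): values from s up to (but not including) e, with step size step
torch.linspace(s, e, steps): steps evenly spaced values from s to e, both ends included
torch.full(size, fill_value): tensor of shape size with every element set to fill_value
torch.rand(): uniform distribution on [0, 1)
torch.randn(): standard normal distribution
torch.normal(): normal distribution with the given mean and standard deviation
torch.randperm(n): random permutation of the integers 0 to n-1

Most of these functions accept a dtype argument that sets the data type.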
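A few of the creation functions listed above can be tried out as follows. This is a minimal sketch; the shapes and values are chosen only for illustration, and dtype is passed explicitly where the default could differ between PyTorch versions.

```python
import torch

ones = torch.ones(2, 3, dtype=torch.float32)      # 2x3 tensor of ones
eye = torch.eye(3)                                # 3x3 identity matrix
steps = torch.arange(0, 10, 2)                    # end value 10 is excluded
points = torch.linspace(0, 1, 5)                  # 5 points, both ends included
filled = torch.full((2, 2), 7, dtype=torch.int64) # 2x2 tensor filled with 7
perm = torch.randperm(5)                          # random ordering of 0..4

print(steps.tolist())         # [0, 2, 4, 6, 8]
print(points.tolist())        # [0.0, 0.25, 0.5, 0.75, 1.0]
print(filled.tolist())        # [[7, 7], [7, 7]]
print(sorted(perm.tolist()))  # [0, 1, 2, 3, 4]
```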

2. Creating tensors from NumPy

import numpy as np
import torch

a = np.arange(1, 10)
a_torch = torch.from_numpy(a)  # shares memory with a
print(type(a_torch), a_torch)

Tensor Operations

1. Arithmetic operations

x + y, add(), and add_()

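x = torch.tensor([[1, 2], [3, 4]])
y = torch.tensor([1, 1])
print(x, y, x + y)
print(torch.add(x, y))  # same result as x + y; x is unchanged
print(x.add_(y))        # in-place: x itself now holds the sum
Output (first print):
x: tensor([[1, 2],[3, 4]]);
y: tensor([1, 1]);
x + y: tensor([[2, 3],[4, 5]])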

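As the output shows, the + operator does not require x and y to have the same shape: y is broadcast (expanded automatically) to the shape of x before the element-wise addition.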
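The difference between add() and add_() is easy to miss in the printed output, so here is a small sketch that makes it explicit. It reuses the same x and y; the variable z is introduced only for illustration.

```python
import torch

x = torch.tensor([[1, 2], [3, 4]])
y = torch.tensor([1, 1])

z = torch.add(x, y)      # out-of-place: returns a new tensor, x is unchanged
print(x.tolist())        # [[1, 2], [3, 4]]
print(z.tolist())        # [[2, 3], [4, 5]]

x.add_(y)                # in-place (trailing underscore): x is overwritten
print(x.tolist())        # [[2, 3], [4, 5]]
```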

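2. Indexing

The result of basic indexing (integers and slices) shares memory with the original data, so modifying one also modifies the other. To leave the original data untouched, call clone() to make a copy first.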

x = torch.tensor([[1, 2], [3, 4]])
y = x[0,:]
y += 1
print(y,x)
Output:
y:tensor([2, 3]) 
x:tensor([[2, 3],[3, 4]])
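For comparison, a minimal sketch of the same example with clone(), which is the PyTorch method for copying a tensor. With the copy in place, the in-place update of y no longer touches x.

```python
import torch

x = torch.tensor([[1, 2], [3, 4]])
y = x[0, :].clone()   # clone() copies the data, so y has its own memory
y += 1

print(y.tolist())  # [2, 3]
print(x.tolist())  # [[1, 2], [3, 4]]
```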

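Advanced selection functions in PyTorch

Function	Description
index_select(input, dim, index)	Selects entries along dimension dim at the positions given by index
masked_select(input, mask)	Selects the elements where a boolean mask (BoolTensor) is True; returns a 1-D tensor
nonzero(input)	Indices of the non-zero elements
gather(input, dim, index)	Gathers values along dimension dim according to index; the output has the same size as index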
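The four selection functions in the table can be compared on one small tensor. This is only a sketch; the tensor x and the index values are made up for illustration.

```python
import torch

x = torch.tensor([[1, 0, 3],
                  [0, 5, 6]])

# index_select: pick columns 0 and 2 along dim=1
cols = torch.index_select(x, 1, torch.tensor([0, 2]))
# masked_select: keep elements where the boolean mask is True (1-D result)
big = torch.masked_select(x, x > 2)
# nonzero: one (row, col) index pair per non-zero element
idx = torch.nonzero(x)
# gather along dim=1: picked[i][j] = x[i][index[i][j]]
picked = torch.gather(x, 1, torch.tensor([[2], [1]]))

print(cols.tolist())    # [[1, 3], [0, 6]]
print(big.tolist())     # [3, 5, 6]
print(idx.tolist())     # [[0, 0], [0, 2], [1, 1], [1, 2]]
print(picked.tolist())  # [[3], [5]]
```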

3. Changing shape

view() changes the shape of a tensor

x = torch.randn(3, 4)  # assumption: the original line is cut off; randn(3, 4) matches the 3x4 output below
print(x)  
y = x.view(-1,6)
print(y)
Output:
tensor([[-1.2974, -0.6428, -0.0371,  0.5693],
        [ 0.8977,  0.1022,  0.3800,  0.9273],
        [ 0.5164,  0.6040, -0.8866, -1.1613]])
tensor([[-1.2974, -0.6428, -0.0371,  0.5693,  0.8977,  0.1022],
        [ 0.3800,  0.9273,  0.5164,  0.6040, -0.8866, -1.1613]])

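Note: view() returns a new tensor that shares its data with the original, so changing one changes the other. To convert a one-element tensor to a Python number, use item().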
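Both points in the note can be checked with a short sketch. The values are arbitrary, and total is a name introduced only for this example.

```python
import torch

x = torch.arange(6)          # tensor([0, 1, 2, 3, 4, 5])
y = x.view(2, 3)             # same data, seen as a 2x3 tensor
y[0, 0] = 100                # writing through the view also changes x

print(x.tolist())            # [100, 1, 2, 3, 4, 5]

total = x.sum().item()       # item() turns a one-element tensor into a Python number
print(type(total).__name__, total)   # int 115
```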

4. Linear algebra functions

trace()
diag()
triu()
mm()
t()
...
These are used less often, so they are not covered in detail here.
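For reference, a minimal sketch of what these functions return on a small matrix. The matrix a is made up for illustration, and mm() and t() are assumed to be the matrix-multiplication and transpose functions the list refers to.

```python
import torch

a = torch.tensor([[1., 2.], [3., 4.]])

print(torch.trace(a).item())     # 5.0  (sum of the diagonal)
print(torch.diag(a).tolist())    # [1.0, 4.0]  (the diagonal itself)
print(torch.triu(a).tolist())    # [[1.0, 2.0], [0.0, 4.0]]  (upper triangle)
print(torch.mm(a, a).tolist())   # [[7.0, 10.0], [15.0, 22.0]]  (matrix product)
print(a.t().tolist())            # [[1.0, 3.0], [2.0, 4.0]]  (transpose)
```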

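Broadcasting

When an element-wise operation is applied to two tensors of different shapes, broadcasting may be triggered: elements are replicated as needed so that both tensors have the same shape, and the operation is then applied element by element.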
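A small sketch of broadcasting in both directions at once: a (1, 2) tensor and a (3, 1) tensor are each expanded to (3, 2) before the addition. The names row, col, and out are chosen only for this example.

```python
import torch

row = torch.arange(1, 3).view(1, 2)   # shape (1, 2): [[1, 2]]
col = torch.arange(1, 4).view(3, 1)   # shape (3, 1): [[1], [2], [3]]

out = row + col                       # both operands are expanded to shape (3, 2)
print(out.shape)                      # torch.Size([3, 2])
print(out.tolist())                   # [[2, 3], [3, 4], [4, 5]]
```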

Converting between tensors and NumPy

1. Tensor to NumPy

x = torch.ones(4)   
y = x.numpy()       
print(x,y)          
x += 1              
print(x,y)          
Output:
tensor([1., 1., 1., 1.]) [1. 1. 1. 1.]
tensor([2., 2., 2., 2.]) [2. 2. 2. 2.]

The tensor and the NumPy array share the same memory.

2. NumPy array to tensor

Use from_numpy() to convert a NumPy array to a tensor.

x = np.ones(4)             
y = torch.from_numpy(x)    
x += 1                     
print(x,y)                 
Output:
[2. 2. 2. 2.] tensor([2., 2., 2., 2.], dtype=torch.float64)

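All tensors on the CPU (except CharTensor) can be converted to and from NumPy arrays.

3. Using torch.tensor() directly

torch.tensor() also converts a NumPy array to a tensor, but it copies the data, so the array and the tensor do not share memory.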
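The contrast between the sharing and the copying conversion can be sketched as follows. The names shared and copied are hypothetical and used only here.

```python
import numpy as np
import torch

a = np.ones(3)
shared = torch.from_numpy(a)   # shares memory with a
copied = torch.tensor(a)       # copies the data

a += 1                         # in-place update of the NumPy array
print(shared.tolist())  # [2.0, 2.0, 2.0]
print(copied.tolist())  # [1.0, 1.0, 1.0]
```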

Tensors on the GPU

The to() method moves a tensor between the CPU and the GPU.

x = torch.tensor([1, 2])              
if torch.cuda.is_available():         
    device = torch.device('cuda')     
    y = torch.ones_like(x, device=device)
    x = x.to(device)                  
    z = x + y                         
    print(z)                          
    print(z.to('cpu', torch.double))  # to() can also change the dtype in the same call
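A common variation, shown here only as a sketch, picks the device once and falls back to the CPU, so the same code runs on machines without a GPU. This pattern is not from the original text.

```python
import torch

# Use the GPU when one is available, otherwise stay on the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

x = torch.tensor([1, 2], device=device)
y = torch.ones_like(x)               # ones_like inherits dtype and device from x
z = (x + y).to('cpu', torch.double)  # move back to the CPU and convert the dtype
print(z)  # tensor([2., 3.], dtype=torch.float64)
```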
