Preface
This series will consist of two articles, mainly recording and organizing some basic PyTorch usage and experimental code.
Python and PyTorch
1. Type comparison
| Python | PyTorch |
|---|---|
| int | IntTensor of size() |
| float | FloatTensor of size() |
| int array | IntTensor of size [d1,d2,d3…] |
| float array | FloatTensor of size [d1,d2,d3…] |
| string | – |
Question 1. How to represent the string type
- one-hot: with as many categories as there are, the vector has that many dimensions, one per category. This has two disadvantages: first, when the number of categories is very large, the data becomes sparse (most positions are 0); second, for data such as text, the semantic relationships of the original text cannot be preserved after the transformation.
- Embedding: e.g. Word2vec, GloVe
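As a minimal sketch of one-hot encoding (the tiny vocabulary here is hypothetical, just for illustration), PyTorch's F.one_hot builds exactly these vectors:
import torch
import torch.nn.functional as F
words = {"cat": 0, "dog": 1, "fish": 2, "bird": 3}  # hypothetical 4-word vocabulary
idx = torch.tensor([words["cat"], words["fish"]])   # indices of the words to encode
vec = F.one_hot(idx, num_classes=len(words))        # one dimension per category
print(vec)
# tensor([[1, 0, 0, 0],
#         [0, 0, 1, 0]])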
2. Code example
Note that within PyTorch itself, CPU tensors and GPU tensors are also different types. The following code illustrates this. The ipynb versions of all the code are in the GitHub repository for this column; you are welcome to star it and download the code.
import torch
a = torch.randn(2,3) # randomly initialize a 2x3 matrix; randn samples from N(0,1)
print(a)
print(a.type())
print(type(a)) # Python's built-in type() is not recommended, as it does not show the detailed tensor type
print(isinstance(a, torch.FloatTensor)) # isinstance checks whether a is of the given type
tensor([[-0.4170, -0.5086, 0.0340],
[-1.8330, 0.3811, -0.3105]])
torch.FloatTensor
<class 'torch.Tensor'>
True
# the difference between CPU and GPU tensor types
print(isinstance(a, torch.cuda.FloatTensor))
a = a.cuda()
print(isinstance(a, torch.cuda.FloatTensor))
False
True
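A more portable sketch (an addition here, assuming CUDA may not be available on every machine) checks torch.cuda.is_available() and uses .to() instead of calling .cuda() unconditionally:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
a = torch.randn(2, 3).to(device)  # moves to the GPU only when one exists
print(a.device)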
Scalar creation in PyTorch
Just look at the code below; the explanations are in the comments.
# scalar representation in PyTorch
a = torch.tensor(1.1) # a scalar is a 0-dimensional tensor
print(a.shape)
print(len(a.shape))
print(a.size())
torch.Size([])
0
torch.Size([])
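A related point (an addition, not from the original): a 0-dimensional tensor can be queried with .dim() and converted back to a plain Python number with .item():
a = torch.tensor(1.1)
print(a.dim())   # 0, consistent with len(a.shape) above
print(a.item())  # 1.1, back to a plain Python float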
Tensor creation in PyTorch
Three creation methods:
# first method: create directly from values
torch.tensor([1,2,3])
# second method: specify the number of elements to allocate
torch.Tensor(3) # note the capital T, distinguishing this from the scalar form above
# third method: create with numpy, then convert
import numpy
data = numpy.ones(2)
data
torch.from_numpy(data)
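To make the capital-T distinction concrete, a small sketch contrasting the two forms:
print(torch.tensor([1, 2, 3]))  # tensor([1, 2, 3]): exactly the given values
print(torch.Tensor(3))          # a FloatTensor with 3 uninitialized entries
print(torch.Tensor([1, 2, 3]))  # a list also works with capital T, but yields floats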
The concepts behind several terms
Example:
[[1,1],[2,2]] # a matrix with 2 rows and 2 columns
- dim: the number of dimensions, corresponding to rows or columns
- size/shape: for the data above this is [2,2], i.e. a matrix with 2 rows and 2 columns
- tensor: the variable name, referring specifically to the data above
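A quick sketch of how these terms show up in code:
t = torch.tensor([[1, 1], [2, 2]])
print(t.dim())   # 2 -> dim: the number of dimensions
print(t.shape)   # torch.Size([2, 2]) -> shape
print(t.size())  # torch.Size([2, 2]) -> size, the same information as shape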
Importing data
The most common way to import data is from numpy:
# import data from numpy
import numpy,torch
a = numpy.array([2,3,3])
print(a)
b = torch.from_numpy(a)
print(b)
# import from a list
torch.tensor([1,2,3,4,5])
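One caveat worth knowing (an addition, not from the original text): torch.from_numpy shares memory with the source array, so modifying one modifies the other:
import numpy, torch
a = numpy.array([2., 3., 3.])
b = torch.from_numpy(a)  # b shares a's memory rather than copying it
a[0] = 100.0
print(b)                 # tensor([100., 3., 3.], dtype=torch.float64)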
Data initialization
1. Uninitialized allocation
Although the uppercase-T form can also be given values directly, for the sake of readability and to avoid confusion, use uppercase torch.Tensor for uninitialized allocation and lowercase torch.tensor for initialization with values (which can also be understood as converting a list into a tensor).
# API for uninitialized allocation
torch.Tensor(2,2)
# after allocating the memory, remember to initialize it; otherwise all kinds of problems can appear, such as extremely large or extremely small values
The setting torch.set_default_tensor_type(torch.DoubleTensor) is used to increase precision. If nothing is changed, the default is the torch.FloatTensor type.
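A small sketch of the effect (note that this changes a global setting):
print(torch.tensor([1.2, 3]).type())  # torch.FloatTensor, the default
torch.set_default_tensor_type(torch.DoubleTensor)
print(torch.tensor([1.2, 3]).type())  # torch.DoubleTensor from now on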
2. Random number initialization
# random-number initialization: rand / rand_like / randint
a = torch.rand(3, 3)
print(a)
b = torch.rand_like(a)
print(b)
c = torch.randint(1, 10, [3,3])
print(c)
d = 10*torch.rand(3, 3)
print(d)
Two points to note:
- The _like family of functions is equivalent to extracting the shape of the given tensor and handing it to the corresponding random-initialization function;
- rand initializes in the range [0, 1), and randint only produces integers, so random floats in a larger range must be obtained by multiplication, as with d above.
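Extending the multiplication trick above, a sketch for drawing floats uniformly from an arbitrary range such as [5, 10): scale and shift rand's [0, 1) output:
e = 5 + 5 * torch.rand(3, 3)  # uniform in [5, 10)
print(e)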
# normal-distribution random initialization: randn / normal
a = torch.randn(3,3) # N(0,1)
print(a)
b = torch.normal(mean=torch.full([10],1.0), std=torch.arange(1, 0, -0.1))
print(b)
Note: each value generated by normal is a random number drawn from its own N(mean, std). Here I supplied 10 means and 10 stds, so 10 random numbers are generated; the result is one-dimensional, and you can reshape it into a multi-dimensional tensor.
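For example, a minimal sketch of reshaping the 10 generated values into a 2x5 tensor:
b = torch.normal(mean=torch.full([10], 1.0), std=torch.arange(1, 0, -0.1))
print(b.reshape(2, 5).shape)  # torch.Size([2, 5])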
3. Initialization with specified values
# fill a given shape with a specified value; dtype specifies the type
print(torch.full([10],1, dtype=torch.int))
print(torch.full([2,3],1, dtype=torch.int))
print(torch.full([],1, dtype=torch.int))
torch.arange(100,90,-1)
# evenly spaced values
print(torch.linspace(0,10, steps=5))
print(torch.logspace(0,10, steps=5)) # each returned value is 10 raised to the corresponding evenly spaced exponent
P.S. Finally, torch has no shuffle function; to work around this, randperm generates random indices:
# generate random indices, mainly for shuffling
torch.randperm(10)
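As a minimal sketch of the shuffling use case (the tensors here are made up for illustration), one permutation can shuffle two tensors in unison:
x = torch.rand(4, 3)
y = torch.rand(4, 1)
idx = torch.randperm(4)          # a random ordering of the row indices
x_shuf, y_shuf = x[idx], y[idx]  # both shuffled with the same permutation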
Indexing and slicing
a = torch.rand(4,3,28,28)
# indexing from the leftmost dimension
print(a[0].shape)
print(a[0,0].shape)
print(a[0,0,2,4])
# colon indexing, much like slicing Python lists
print(a.shape)
print(a[:2].shape)
print(a[:1,:1].shape)
print(a[:1,1:].shape)
print(a[:1,-1:].shape)
# strided sampling, also the same as Python: start:end:step
print(a[:1,:1,0:10:2].shape)
print(a[:1,:1,::2].shape)
# select the given indices along one dimension
print(a.index_select(2, torch.arange(28)).shape)
print(a.index_select(2, torch.arange(1,8,1)).shape)
# using ..., which simply saves writing several colons
print(a[...].shape)
print(a[:,1,...].shape)
print(a[:,1,:,:].shape)
x = torch.randn(4,4)
print(x)
mark = x.ge(0.5) # mark all elements greater than or equal to 0.5
print(mark)
print(torch.masked_select(x, mark)) # select the elements where the mask is True
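Note that masked_select returns a flattened 1-D result. A related helper (an addition here) is torch.take, which indexes into the flattened tensor by global position:
src = torch.tensor([[4, 3, 5], [6, 7, 8]])
print(torch.take(src, torch.tensor([0, 2, 5])))  # tensor([4, 5, 8])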
Dimension changes
1. Irreversible changes
a = torch.rand(4,1,28,28)
print(a.shape)
b = a.view(4,28*28)
print(b.shape)
b = a.reshape(4,28*28)
print(b.shape)
reshape behaves the same as view (reshape can additionally handle non-contiguous tensors by copying). When using these two functions, pay attention to three issues:
- the total number of elements must match the original tensor
- don't make changes that are meaningless (unintelligible)
- after the operation, the original dimension information is lost, so there is no way to reshape back; the original dimension/storage order is very important
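A sketch of the third pitfall: after flattening, viewing back into a different layout is legal but logically wrong, because the original (1, 28, 28) structure has been forgotten:
a = torch.rand(4, 1, 28, 28)
b = a.view(4, 784)        # flattens; the channel/height/width structure is lost
c = b.view(4, 28, 28, 1)  # no error, but the data no longer matches this layout
print(c.shape)            # torch.Size([4, 28, 28, 1])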
2. Increased dimensions
a = torch.rand(4,32,28,28)
b = torch.rand(32)
c = b.unsqueeze(1).unsqueeze(2).unsqueeze(0)
print(a.shape)
print(b.shape)
print(c.shape)
b = torch.rand(32,2)
c = b.unsqueeze(1).unsqueeze(2).unsqueeze(0)
print(a.shape)
print(b.shape)
print(c.shape)
unsqueeze inserts a new dimension of size 1 at the given position; comparing the two examples above makes this easy to understand.
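A common use (a sketch added here, not from the original): treating b as a per-channel bias for a, where the unsqueezed shape [1, 32, 1, 1] broadcasts against [4, 32, 28, 28]:
a = torch.rand(4, 32, 28, 28)  # e.g. a batch of feature maps
bias = torch.rand(32)          # one value per channel
out = a + bias.unsqueeze(0).unsqueeze(2).unsqueeze(3)  # [1, 32, 1, 1] broadcasts
print(out.shape)               # torch.Size([4, 32, 28, 28])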
3. Reduced dimensions
Here c is the c from the previous code snippet.
# squeeze is the opposite of unsqueeze: it compresses all size-1 dimensions where possible
print(c.shape)
print(c.squeeze().shape)
print(c.squeeze(0).shape)
As you can see, when no dimension is specified, all compressible dimensions (those of size 1) are squeezed; when a dimension is specified, only that one is squeezed.
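One more behavior worth noting (an addition): squeezing a dimension whose size is not 1 is a no-op rather than an error:
c = torch.rand(1, 32, 1, 1)
print(c.squeeze(1).shape)  # torch.Size([1, 32, 1, 1]): dim 1 has size 32, left as-is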
4. Dimensional expansion
Note the difference between expansion and the dimension-adding above. expand requires two preconditions:
- the number of dimensions must match
- only a dimension of size 1 can be expanded to n
Combine with the code examples below to understand:
a = torch.rand(3,3)
b = torch.rand(3,1)
print(a)
print(b)
print(a.shape)
print(b.shape)
c = b.expand(3,3)
print(c)
print(c.shape)
Now let's look at expansion with repeat:
print(b.shape)
d = b.repeat(3,3)
e = b.repeat(1,3)
print(d.shape)
print(d)
print(e.shape)
print(e)
The arguments to repeat give the number of copies along each dimension, not the final shape.
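A design note (an addition, not from the original): expand only creates a view without allocating new memory, while repeat physically copies the data, so expand is the cheaper choice when the expanded values are only read. A sketch checking this via data_ptr():
b = torch.rand(3, 1)
v = b.expand(3, 3)  # a view: no new memory is allocated
r = b.repeat(1, 3)  # a copy: new memory holding 3x the data
print(v.data_ptr() == b.data_ptr())  # True: v shares b's storage
print(r.data_ptr() == b.data_ptr())  # False: r is a separate copy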
5. Transpose operation
a = torch.rand(2,3)
a.t()
Note that .t() transposition only applies to 2-dimensional matrices.
6. Dimension swapping
# transpose can only swap two dimensions at a time
a = torch.rand(1,2,3,4)
print(a.shape)
b = a.transpose(1,3)
print(b.shape)
c = a.permute(0,3,1,2) # 0,1,2,3 here refer to the dimension positions of the original tensor
print(c.shape)
transpose can only swap dimensions in pairs, while permute can rearrange all of them at once.
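One practical note to add: after transpose/permute the tensor's memory is no longer contiguous, so a following view will fail unless contiguous() is called first. A minimal sketch:
a = torch.rand(4, 3, 28, 28)    # e.g. NCHW layout
b = a.permute(0, 2, 3, 1)       # NHWC; memory is no longer contiguous
# b.view(4, -1)                 # would raise a RuntimeError here
c = b.contiguous().view(4, -1)  # contiguous() first copies into contiguous memory
print(c.shape)                  # torch.Size([4, 2352])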