NumPy Basics (Part 1)

Creating an ndarray

The sessions below assume NumPy has already been imported as np (import numpy as np).

In [1]: data = [[1,2,3,4],[5,6,7,8]]

In [2]: arry = np.array(data)

In [3]: arry
Out[3]: 
array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

Checking the number of dimensions of an ndarray

In [4]: arry.ndim
Out[4]: 2

Getting the number of rows and columns (the shape) of an ndarray

In [5]: arry.shape
Out[5]: (2, 4)
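
Alongside ndim and shape, an array also records its total element count and element type. A minimal sketch, reusing the arry name from the session above; size and dtype are standard ndarray attributes, and the other variable names are only illustrative:

```python
import numpy as np

arry = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

n_dims = arry.ndim            # number of axes
n_rows, n_cols = arry.shape   # length of each axis
n_items = arry.size           # total number of elements
kind = arry.dtype.kind        # 'i' means signed integer

print(n_dims, (n_rows, n_cols), n_items, kind)
```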

zeros and ones create arrays of a given length or shape, filled with zeros or ones respectively

In [15]: np.zeros((3,4))
Out[15]: 
array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])
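
The session above only shows zeros with a shape tuple. As a small sketch of the other cases mentioned (a plain length, ones, and a non-default dtype), with illustrative variable names:

```python
import numpy as np

z1 = np.zeros(5)                 # a plain length gives a 1-D array of shape (5,)
o2 = np.ones((2, 3))             # a tuple gives the full shape
oi = np.ones((2, 3), dtype=int)  # the default dtype is float64; override it here

print(z1.shape, o2.dtype, oi.dtype.kind)
```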

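empty allocates an array without initializing it, so its contents are arbitrary garbage values (they may happen to be zeros, as in the output below)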

In [17]: np.empty((2,3,2))
Out[17]: 
array([[[0., 0.],
        [0., 0.],
        [0., 0.]],

       [[0., 0.],
        [0., 0.],
        [0., 0.]]])
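
Because the contents of an empty array are not defined, one reasonable pattern is to write every element before reading any of them. A sketch under that assumption (np.full is shown as a one-call alternative; the variable names are illustrative):

```python
import numpy as np

buf = np.empty((2, 3))         # contents are arbitrary until written
buf[:] = 7.0                   # overwrite every element before reading

filled = np.full((2, 3), 7.0)  # allocate and fill in a single call

print(np.array_equal(buf, filled))
```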

Arithmetic between arrays and scalars

In [1]: arry = np.array([[1,2,3,4],[5,6,7,8]])

In [2]: arry * arry
Out[2]: 
array([[ 1,  4,  9, 16],
       [25, 36, 49, 64]])

In [3]: arry - arry
Out[3]: 
array([[0, 0, 0, 0],
       [0, 0, 0, 0]])
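In [4]: 1 / arry
Out[4]: 
array([[1, 0, 0, 0],
       [0, 0, 0, 0]])

> This output comes from Python 2, where / between integers is integer division. Under Python 3 the same expression returns floats (1., 0.5, 0.33333333, ...).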

In [5]: arry ** 0.5
Out[5]: 
array([[1.        , 1.41421356, 1.73205081, 2.        ],
       [2.23606798, 2.44948974, 2.64575131, 2.82842712]])
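
Whether / truncates on integer arrays depends on the Python version, so it can help to state the intent explicitly. A minimal sketch with illustrative variable names:

```python
import numpy as np

arry = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

true_div = 1.0 / arry    # float result on any Python version
floor_div = 1 // arry    # explicit floor (integer) division
scaled = arry * 2 + 1    # the scalar is applied to every element

print(true_div[0])
print(floor_div[0])
print(scaled[0])
```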

Basic indexing and slicing

In [6]: arr = np.arange(15)

In [7]: arr
Out[7]: array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])

In [8]: arr[5]
Out[8]: 5

In [9]: arr[5:8]
Out[9]: array([5, 6, 7])
In [10]: arr[5:8] = 12

In [11]: arr
Out[11]: array([ 0,  1,  2,  3,  4, 12, 12, 12,  8,  9, 10, 11, 12, 13, 14])
In [12]: arr_slice = arr[5:8]

In [13]: arr_slice[1] = 1234

In [14]: arr
Out[14]: 
array([   0,    1,    2,    3,    4,   12, 1234,   12,    8,    9,   10,
         11,   12,   13,   14])
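
The session above shows that writing through arr_slice changed arr. When an independent array is needed instead, the slice can be copied explicitly. A short sketch (np.shares_memory is used here only to make the difference visible; the names view and copy are illustrative):

```python
import numpy as np

arr = np.arange(15)

view = arr[5:8]          # basic slice: a view on arr's data
copy = arr[5:8].copy()   # explicit, independent copy

view[0] = 100            # visible through arr
copy[1] = -1             # not visible through arr

print(arr[5:8])                     # [100   6   7]
print(np.shares_memory(arr, view))  # True
print(np.shares_memory(arr, copy))  # False
```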
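> When a scalar is assigned to a slice, the value is propagated to the whole selection. The most important difference from Python lists is that an array slice is a view on the original array: the data is not copied, and any change made through the view shows up directly in the original array. The main reason is that copying data back and forth would cause performance and memory problems.

In [19]: arr2d = np.array([[1,2,3],[4,5,6],[7,8,9]])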
In [20]: arr2d
Out[20]: 
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [21]: arr2d[:2]
Out[21]: 
array([[1, 2, 3],
       [4, 5, 6]])

In [22]: arr2d[:2,:1]
Out[22]: 
array([[1],
       [4]])
In [23]: arr2d[:2,1:]
Out[23]: 
array([[2, 3],
       [5, 6]])
In [24]: arr2d[:,:1]
Out[24]: 
array([[1],
       [4],
       [7]])
> Several slices can be passed at once, one per axis; a bare colon selects the entire axis.
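
Slices can also be mixed with integer indices. A small sketch of how that affects the result's dimensionality, using the same arr2d values as above (the other names are illustrative):

```python
import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

row_1d = arr2d[1, :2]     # an integer index drops that axis -> shape (2,)
row_2d = arr2d[1:2, :2]   # a slice keeps the axis           -> shape (1, 2)
col_1d = arr2d[:, 0]      # first column as a 1-D array

print(row_1d.shape, row_2d.shape, col_1d.shape)
```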

Boolean indexing

In [29]: names
Out[29]: array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'], dtype='|S4')
In [33]: data = np.random.randn(7,4)

In [34]: data
Out[34]: 
array([[ 1.16675978,  1.92107701,  0.26941017, -1.19809428],
       [ 0.07649339, -0.91997008, -0.21832289,  1.07434831],
       [-1.98754599, -0.60850528,  0.06302264, -0.63377773],
       [ 0.32234302,  0.32310855,  0.73595001,  0.24015727],
       [-1.09218583, -0.08907211, -1.51447014,  0.53083635],
       [-0.27048015, -0.09299773,  0.75937323, -0.71705501],
       [ 1.23737272, -0.08242024, -2.3591149 ,  0.64462961]])
In [37]: data[names=='Bob']
Out[37]: 
array([[ 1.16675978,  1.92107701,  0.26941017, -1.19809428],
       [ 0.32234302,  0.32310855,  0.73595001,  0.24015727]])
> The Boolean array must have the same length as the axis it indexes.
In [38]: data[names=='Bob',2:]
Out[38]: 
array([[ 0.26941017, -1.19809428],
       [ 0.73595001,  0.24015727]])
In [40]: data[names != 'Bob']
Out[40]: 
array([[ 0.07649339, -0.91997008, -0.21832289,  1.07434831],
       [-1.98754599, -0.60850528,  0.06302264, -0.63377773],
       [-1.09218583, -0.08907211, -1.51447014,  0.53083635],
       [-0.27048015, -0.09299773,  0.75937323, -0.71705501],
       [ 1.23737272, -0.08242024, -2.3591149 ,  0.64462961]])
In [41]: mask = (names == 'Bob') | (names == 'Will')

In [42]: data[mask]
Out[42]: 
array([[ 1.16675978,  1.92107701,  0.26941017, -1.19809428],
       [-1.98754599, -0.60850528,  0.06302264, -0.63377773],
       [ 0.32234302,  0.32310855,  0.73595001,  0.24015727],
       [-1.09218583, -0.08907211, -1.51447014,  0.53083635]])
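> Boolean masks can be combined with & (and), | (or), and ~ (not). The Python keywords and, or, and not do not work on Boolean arrays.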
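
The session above uses != and |. As a sketch of the remaining operators, the example below uses the same names array but replaces the random data with np.arange values (an assumption made so the output is reproducible):

```python
import numpy as np

names = np.array(['Bob', 'Joe', 'Will', 'Bob', 'Will', 'Joe', 'Joe'])
data = np.arange(28).reshape((7, 4))   # stand-in for the random data

not_bob = ~(names == 'Bob')                       # same mask as names != 'Bob'
joe_only = (names != 'Bob') & (names != 'Will')   # both conditions must hold

print(data[not_bob].shape)    # (5, 4)
print(data[joe_only][:, 0])   # [ 4 20 24]
```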
In [43]: data[data<0] = 0

In [44]: data
Out[44]: 
array([[1.16675978, 1.92107701, 0.26941017, 0.        ],
       [0.07649339, 0.        , 0.        , 1.07434831],
       [0.        , 0.        , 0.06302264, 0.        ],
       [0.32234302, 0.32310855, 0.73595001, 0.24015727],
       [0.        , 0.        , 0.        , 0.53083635],
       [0.        , 0.        , 0.75937323, 0.        ],
       [1.23737272, 0.        , 0.        , 0.64462961]])

> This sets every value less than zero to zero.
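
Assigning through a Boolean mask modifies data in place. If the original values need to be kept, one possible alternative is to build a new array instead. A sketch with a small made-up sample (np.where and np.maximum are used here as non-mutating options; they are not part of the original session):

```python
import numpy as np

data = np.array([[1.5, -0.2], [-3.0, 0.7]])   # small made-up sample

clipped = np.where(data < 0, 0, data)   # new array; data is left untouched
also_clipped = np.maximum(data, 0)      # equivalent for this particular case

print(clipped)
print(data[0, 1])   # still negative
```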

Fancy indexing

In [45]: arr = np.empty((8,4))
In [48]: for i in range(8):
    ...:     arr[i] = i
    ...:     

In [49]: arr
Out[49]: 
array([[0., 0., 0., 0.],
       [1., 1., 1., 1.],
       [2., 2., 2., 2.],
       [3., 3., 3., 3.],
       [4., 4., 4., 4.],
       [5., 5., 5., 5.],
       [6., 6., 6., 6.],
       [7., 7., 7., 7.]])

In [50]: arr[[4,3,0,6]]
Out[50]: 
array([[4., 4., 4., 4.],
       [3., 3., 3., 3.],
       [0., 0., 0., 0.],
       [6., 6., 6., 6.]])
In [51]: arr[[-3,-5,-7]]
Out[51]: 
array([[5., 5., 5., 5.],
       [3., 3., 3., 3.],
       [1., 1., 1., 1.]])
> Negative indices select rows counting from the end.
In [52]: arr = np.arange(32).reshape((8,4))

In [53]: arr
Out[53]: 
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27],
       [28, 29, 30, 31]])
In [54]: arr[[1,5,7,2],[0,3,1,2]]
Out[54]: array([ 4, 23, 29, 10])
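> This may not be the result we wanted: passing two index arrays pairs them up and selects the elements at (1, 0), (5, 3), (7, 1), and (2, 2), giving a 1-D array rather than a rectangular block.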
In [55]: arr[[1,5,7,2]][:,[0,3,1,2]]
Out[55]: 
array([[ 4,  7,  5,  6],
       [20, 23, 21, 22],
       [28, 31, 29, 30],
       [ 8, 11,  9, 10]])
In [57]: arr[np.ix_([1,5,7,2],[0,3,1,2])]
Out[57]: 
array([[ 4,  7,  5,  6],
       [20, 23, 21, 22],
       [28, 31, 29, 30],
       [ 8, 11,  9, 10]])
> np.ix_ converts 1-D integer arrays into an indexer that selects a rectangular region.
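
To see why np.ix_ selects a block, it can help to look at the shapes it returns. A minimal sketch using the same index lists as above (row_idx, col_idx, block, and pairs are illustrative names):

```python
import numpy as np

arr = np.arange(32).reshape((8, 4))
rows, cols = [1, 5, 7, 2], [0, 3, 1, 2]

row_idx, col_idx = np.ix_(rows, cols)
print(row_idx.shape, col_idx.shape)   # (4, 1) (1, 4)

block = arr[row_idx, col_idx]   # the two index arrays broadcast to (4, 4)
pairs = arr[rows, cols]         # element-wise pairing gives shape (4,)

print(block.shape, pairs.shape)
```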

Transposing arrays and swapping axes

In [59]: arr = np.arange(15).reshape((3,5))

In [60]: arr
Out[60]: 
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [61]: arr.T
Out[61]: 
array([[ 0,  5, 10],
       [ 1,  6, 11],
       [ 2,  7, 12],
       [ 3,  8, 13],
       [ 4,  9, 14]])
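
The heading also mentions swapping axes, which the session does not show. A short sketch of a typical use of .T and of swapaxes and transpose on a 3-D array (arr3d and the other names are illustrative, not from the original session):

```python
import numpy as np

arr = np.arange(15).reshape((3, 5))

gram = np.dot(arr.T, arr)   # (5, 3) times (3, 5) gives (5, 5)

arr3d = np.arange(24).reshape((2, 3, 4))
swapped = arr3d.swapaxes(1, 2)           # exchange the last two axes -> (2, 4, 3)
reordered = arr3d.transpose((1, 0, 2))   # explicit axis order        -> (3, 2, 4)

print(gram.shape, swapped.shape, reordered.shape)
```

Like basic slicing, .T and swapaxes return views on the same data rather than copies.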

Related concepts

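1. Vectorization: computing on whole arrays of data without writing explicit loops.
2. Broadcasting: arithmetic between arrays of different shapes.
3. Fancy indexing: indexing with integer arrays. Unlike slicing, fancy indexing always copies the data into a new array.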
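
The three concepts above can be sketched in a few lines. This is only an illustration with made-up values and variable names:

```python
import numpy as np

# 1. Vectorization: one expression instead of a Python loop
x = np.arange(5)
squares = x * x

# 2. Broadcasting: a (3, 1) column combined with a (4,) row gives (3, 4)
col = np.arange(3).reshape((3, 1))
row = np.arange(4)
table = col * 10 + row

# 3. Fancy indexing copies the data, so base is unaffected by the write
base = np.arange(6)
picked = base[[0, 2, 4]]
picked[0] = 99

print(squares)
print(table.shape)
print(base[0])
```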


Reposted from my.oschina.net/u/238361/blog/1800040