Module Development Techniques: NumPy

Development | NumPy module

NumPy is the foundational package among the data analysis modules, so it remains very important. Take the time to understand what this tool can do; along the way I will also walk through parts of its source implementation. Happy reading!
The NumPy module provides two important objects: ndarray (which solves the multidimensional array problem) and ufunc (functions that process arrays).

Foreword

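All of my articles currently follow the format: knowledge + feeling.
Knowledge: a description of every knowledge point, kept as free of personal coloring as possible.
Feeling: my own reading of each knowledge point, aiming to be easy to understand and to analyze the material thoroughly.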

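Table of Contents

NumPy introduction
NumPy arrays
    Array creation example (.array())
    array() source code analysis
Differences between arrays and lists
ndarray: common attributes
ndarray: data types
ndarray: other ways to create arrays
ndarray operations
ndarray indexing and slicing
ndarray boolean indexing
ndarray fancy indexing
NumPy array functions
    Universal functions
    Mathematical and statistical methods
    Random functions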

Main Text

NumPy Introduction

1. What is NumPy?
NumPy (Numerical Python) is an extension library for the Python language. It supports arrays and matrices of many dimensions and provides a large library of mathematical functions that operate on arrays. It is the foundation of pandas and many other tools.

2. Main features of NumPy:
1) ndarray, a multidimensional array structure that is fast and memory-efficient
2) Mathematical functions that operate quickly on whole data sets without explicit loops
3) Linear algebra, random number generation, and Fourier transforms
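
Operating on a whole data set without a loop can be illustrated with a minimal sketch. The array name and values below are made up for illustration and are not from the original article.

```python
import numpy as np

# Hypothetical sample data, chosen only for illustration.
prices = np.array([10.0, 20.0, 30.0, 40.0])

# Pure-Python approach: an explicit loop over every element.
looped = [p * 1.5 for p in prices.tolist()]

# NumPy approach: one expression applied to the whole array at once.
vectorized = prices * 1.5

print(looped)      # [15.0, 30.0, 45.0, 60.0]
print(vectorized)  # [15. 30. 45. 60.]
```

Both versions compute the same numbers; the vectorized form simply pushes the loop into NumPy's compiled code.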

3. How to install NumPy?
pip3 install numpy

4. How to import it?
import numpy as np

NumPy arrays

This section introduces how to create an array, explains array creation from the source-code perspective, compares arrays with lists, lists the common ndarray attributes, and describes the ndarray data types.

Array creation example (.array())

import numpy as np


# One-dimensional array
a = np.array([1, 2, 3, 4])
print(a)  # [1 2 3 4]
print(a.size)  # 4
print(a.dtype)  # int32 (platform-dependent; int64 on most Linux/macOS builds)
print(a.shape)  # (4,)


# Two-dimensional array
b = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(b)
# [[1 2 3]
#  [4 5 6]
#  [7 8 9]]
print(b.size)  # 9
print(b.dtype)  # int32 (platform-dependent, see above)
print(b.shape)  # (3, 3)

array() source code analysis

# Source stub of array(): signature and docstring examples (the actual implementation is written in C)
def array(p_object, dtype=None, copy=True, order='K', subok=False, ndmin=0):
    """
    Examples
    --------
    >>> np.array([1, 2, 3])  # pass a list directly inside the parentheses
    array([1, 2, 3])

    Upcasting:
    >>> np.array([1, 2, 3.0])  # when the list elements have different types, array() unifies them automatically; here they all become float
    array([ 1.,  2.,  3.])


    More than one dimension:
    >>> np.array([[1, 2], [3, 4]])  # two-dimensional array: a nested (two-level) list passed to array()
    array([[1, 2],
           [3, 4]])


    Minimum dimensions 2:
    >>> np.array([1, 2, 3], ndmin=2)  # a special way to get a 2-D array: ndmin sets the minimum number of dimensions of the result
    array([[1, 2, 3]])


    Type provided:
    >>> np.array([1, 2, 3], dtype=complex)  # dtype sets the element data type; here it is the complex type
    array([ 1.+0.j,  2.+0.j,  3.+0.j])


    Data-type consisting of more than one element:
        # The four types int8, int16, int32, int64 can be written as the strings 'i1', 'i2', 'i4', 'i8'.
        # '<i4' means little-endian int32
    >>> x = np.array([(1,2),(3,4)],dtype=[('a','<i4'),('b','<i4')])   # a data type made up of more than one field
    >>> x['a']
    array([1, 3])
    >>> x['b']
    array([2, 4])


    Creating an array from sub-classes:
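    >>> np.array(np.mat('1 2; 3 4'))  # create an array from a subclass; in the mat string, semicolons separate rows and spaces separate values (np.mat was removed in NumPy 2.0, where np.asmatrix replaces it)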
    array([[1, 2],
           [3, 4]])

    >>> np.array(np.mat('1 2; 3 4'), subok=True)  # by default (subok=False) the result is forced to a base-class ndarray; with subok=True the matrix subclass is passed through
    matrix([[1, 2],
            [3, 4]])
    """
    pass

Differences between arrays and lists

1) The elements of an array must all have the same type.
Array: as the code above shows, when one element is a floating-point number, all elements are automatically converted to floating point. Likewise, when one element is a string, all elements are converted to strings.
List: a list can store not only numbers but also strings, dictionaries, and so on.

2) The size of an array cannot be modified.
Array: when an array is defined, a block of memory of the required size is allocated up front, which is also why array operations are much faster than list operations.
List: a list stores references (memory addresses) to its elements, so every iteration first looks up the address of the element and then fetches the data it points to.
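
The two differences above can be checked with a short sketch. The variable names are illustrative only, and np.append is used here simply to show that "growing" an array produces a new array rather than resizing the old one.

```python
import numpy as np

# A list can hold elements of different types side by side.
mixed_list = [1, 2.5, "three", {"four": 4}]
list_types = [type(item).__name__ for item in mixed_list]
print(list_types)  # ['int', 'float', 'str', 'dict']

# An array converts every element to one common type.
int_and_float = np.array([1, 2, 3.0])
print(int_and_float.dtype)  # float64

with_string = np.array([1, 2, "3"])
print(with_string.dtype.kind)  # U (a Unicode string type)

# An existing array keeps its size; np.append returns a new, larger array.
grown = np.append(int_and_float, 4.0)
print(int_and_float.size, grown.size)  # 3 4
```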

ndarray: common attributes

  1. T: the transpose of the array (meaningful for arrays with two or more dimensions)
  2. size: the number of elements in the array
  3. ndim: the number of dimensions of the array
  4. shape: the size of each dimension (as a tuple)
  5. dtype: the data type of the array elements
  6. Example
import numpy as np

c = np.array([[1, 2, 3], [2, 3, 4]])

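print(c)  # the original array
# [[1 2 3]
#  [2 3 4]]

print(c.T)  # the transpose of the array
# [[1 2]
#  [2 3]
#  [3 4]]

print(c.size)  # number of elements: 6

print(c.ndim)  # number of dimensions: 2

print(c.shape)  # size of each dimension: (2, 3)

print(c.dtype)  # element data type: int32 (platform-dependent; often int64)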

ndarray: data types

  1. Boolean: bool_
  2. Integer: int_ int8 int16 int32 int64
  3. Unsigned integer: uint8 uint16 uint32 uint64
  4. Float: float_ float16 float32 float64
  5. Complex: complex_ complex64 complex128
    These are the commonly used basic NumPy data types. (The aliases float_ and complex_ were removed in NumPy 2.0; use float64 and complex128 there.)
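
As a small sketch of how these types behave in practice, the snippet below sets a dtype explicitly, converts it with astype, and reads the per-element size in bytes through itemsize. The variable names are illustrative.

```python
import numpy as np

# Request an 8-bit integer type explicitly.
small = np.array([1, 2, 3], dtype=np.int8)
print(small.dtype, small.itemsize)  # int8 1

# astype() returns a new array converted to another type.
wide = small.astype(np.float64)
print(wide.dtype, wide.itemsize)  # float64 8

# Any nonzero value becomes True when converted to boolean.
flags = np.array([0, 1, 2], dtype=np.bool_)
print(flags)  # [False  True  True]
```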

ndarray: other ways to create arrays

  1. array(): converts a list to an array; the dtype can optionally be specified explicitly
  2. arange(): the NumPy version of range, with floating-point support
  3. linspace(): similar to arange(), but the third parameter is the length of the array
  4. zeros(): creates an array of all zeros with the given shape and dtype
  5. ones(): creates an array of all ones with the given shape and dtype
  6. empty(): creates an uninitialized array (arbitrary values) with the given shape and dtype
  7. eye(): creates an identity matrix with the given side length and dtype
  8. Example: empty
import numpy as np

a = np.empty([2, 2])
print(a)
# [[6.89800058e-307 6.89799040e-307]
#  [2.33646676e-307 1.76127852e-312]]

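# Source stub of empty()
# numpy.empty creates an uninitialized array with the given shape and dtype. The values look random: they are whatever bits were left over in memory. empty() simply grabs a block of memory of the required size and shows the data that was already there, so the values follow no pattern at all.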
def empty(shape, dtype=None, order='C'):
    """
    Examples
    --------
    >>> np.empty([2, 2])
    array([[ -9.74499359e+001,   6.69583040e-309],
           [  2.13182611e-314,   3.06959433e-309]])         #random

    >>> np.empty([2, 2], dtype=int)
    array([[-1073741821, -1067949133],
           [  496041986,    19249760]])                     #random
    """
    pass

9. Example: zeros

import numpy as np

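# The default type is float
x = np.zeros(5)
print(x)  # [0. 0. 0. 0. 0.]

# Set the type to integer (np.int was removed in NumPy 1.24; use the built-in int)
y = np.zeros((5,), dtype=int)
print(y)  # [0 0 0 0 0]

# Custom (structured) type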
z = np.zeros((2, 2), dtype=[('x', 'i4'), ('y', 'i4')])
print(z)
# [[(0, 0) (0, 0)]
#  [(0, 0) (0, 0)]]


# Creates an array of the given size, filled with 0
def zeros(shape, dtype=None, order='C'):
    """
    Examples
    --------
    >>> np.zeros(5)
    array([ 0.,  0.,  0.,  0.,  0.])

    >>> np.zeros((5,), dtype=int)
    array([0, 0, 0, 0, 0])

    >>> np.zeros((2, 1))
    array([[ 0.],
           [ 0.]])

    >>> s = (2,2)
    >>> np.zeros(s)
    array([[ 0.,  0.],
           [ 0.,  0.]])

    >>> np.zeros((2,), dtype=[('x', 'i4'), ('y', 'i4')]) # custom dtype
    array([(0, 0), (0, 0)],
          dtype=[('x', '<i4'), ('y', '<i4')])
    """
    pass

10. Example: ones

import numpy as np


c = np.ones(4)
print(c)  # [1. 1. 1. 1.]


def ones(shape, dtype=None, order='C'):
    """
    Examples
    --------
    >>> np.ones(5)
    array([ 1.,  1.,  1.,  1.,  1.])

    >>> np.ones((5,), dtype=int)
    array([1, 1, 1, 1, 1])

    >>> np.ones((2, 1))
    array([[ 1.],
           [ 1.]])

    >>> s = (2,2)
    >>> np.ones(s)
    array([[ 1.,  1.],
           [ 1.,  1.]])

    """

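11. Example: arange

import numpy as np

# arange() needs at least a stop value; here start=1, stop=10, step=2
s = np.arange(1, 10, 2)
print(s)  # [1 3 5 7 9]


# Generates an ndarray from the range given by start and stop and the step size given by step.
# Note: the interval is half-open, so start is included and stop is excluded.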
def arange(start=None, *args, **kwargs):
    """
    arange([start,] stop[, step,], dtype=None)
        Examples
        --------
        >>> np.arange(3)
        array([0, 1, 2])
        >>> np.arange(3.0)
        array([ 0.,  1.,  2.])
        >>> np.arange(3,7)
        array([3, 4, 5, 6])
        >>> np.arange(3,7,2)
        array([3, 5])
    """
    pass

12. Example: linspace

import numpy as np


a = np.linspace(1, 20, 40)
print(a)  # the result is floating point
# [ 1.          1.48717949  1.97435897  2.46153846  2.94871795  3.43589744
#   3.92307692  4.41025641  4.8974359   5.38461538  5.87179487  6.35897436
#   6.84615385  7.33333333  7.82051282  8.30769231  8.79487179  9.28205128
#   9.76923077 10.25641026 10.74358974 11.23076923 11.71794872 12.20512821
#  12.69230769 13.17948718 13.66666667 14.15384615 14.64102564 15.12820513
#  15.61538462 16.1025641  16.58974359 17.07692308 17.56410256 18.05128205
#  18.53846154 19.02564103 19.51282051 20.        ]


# numpy.linspace creates a one-dimensional array whose elements form an arithmetic sequence
def linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None,
             axis=0):
    """
    Examples
    --------
    >>> np.linspace(2.0, 3.0, num=5)
    array([ 2.  ,  2.25,  2.5 ,  2.75,  3.  ])
    >>> np.linspace(2.0, 3.0, num=5, endpoint=False)
    array([ 2. ,  2.2,  2.4,  2.6,  2.8])
    >>> np.linspace(2.0, 3.0, num=5, retstep=True)
    (array([ 2.  ,  2.25,  2.5 ,  2.75,  3.  ]), 0.25)


    Graphical illustration:
    >>> import matplotlib.pyplot as plt
    >>> N = 8
    >>> y = np.zeros(N)
    >>> x1 = np.linspace(0, 10, N, endpoint=True)
    >>> x2 = np.linspace(0, 10, N, endpoint=False)
    >>> plt.plot(x1, y, 'o')
    [<matplotlib.lines.Line2D object at 0x...>]
    >>> plt.plot(x2, y + 0.5, 'o')
    [<matplotlib.lines.Line2D object at 0x...>]
    >>> plt.ylim([-0.5, 1])
    (-0.5, 1)
    >>> plt.show()

    """

13. Example: eye

import numpy as np


def eye(N, M=None, k=0, dtype=float, order='C'):
    """
    Return a 2-D array with ones on the diagonal and zeros elsewhere.

    Examples
    --------
    >>> np.eye(2, dtype=int)
    array([[1, 0],
           [0, 1]])
    >>> np.eye(3, k=1)
    array([[ 0.,  1.,  0.],
           [ 0.,  0.,  1.],
           [ 0.,  0.,  0.]])
    """
    pass
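
The eye section above only shows the docstring, so here is a minimal usage sketch that also contrasts arange and linspace on the same interval. The variable names are illustrative.

```python
import numpy as np

identity = np.eye(3, dtype=int)      # ones on the main diagonal
shifted = np.eye(3, k=1, dtype=int)  # ones on the diagonal just above it

steps = np.arange(0, 1, 0.25)        # step size given, stop value excluded
points = np.linspace(0, 1, 5)        # element count given, stop value included

print(identity.tolist())  # [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(shifted.tolist())   # [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
print(steps.tolist())     # [0.0, 0.25, 0.5, 0.75]
print(points.tolist())    # [0.0, 0.25, 0.5, 0.75, 1.0]
```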

ndarray operations

1. Operations between an array and a scalar
a+1 a*3 1//a a**0.5 a>5
Example:

import numpy as np
a = np.arange(1, 20)
a
Out[8]: 
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
       18, 19])
a + 1
Out[9]: 
array([ 2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
       19, 20])
a * 2
Out[10]: 
array([ 2,  4,  6,  8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34,
       36, 38])
1 // a
Out[11]: 
array([1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
      dtype=int32)
a**0.5
Out[12]: 
array([1.        , 1.41421356, 1.73205081, 2.        , 2.23606798,
       2.44948974, 2.64575131, 2.82842712, 3.        , 3.16227766,
       3.31662479, 3.46410162, 3.60555128, 3.74165739, 3.87298335,
       4.        , 4.12310563, 4.24264069, 4.35889894])
a>4
Out[13]: 
array([False, False, False, False,  True,  True,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True])

2. Operations between arrays of the same size
a+b a/b a**b a%b a==b

import numpy as np
a = np.arange(1, 10)
b = np.arange(11, 20)
a
Out[23]: array([1, 2, 3, 4, 5, 6, 7, 8, 9])
b
Out[24]: array([11, 12, 13, 14, 15, 16, 17, 18, 19])
a + b
Out[25]: array([12, 14, 16, 18, 20, 22, 24, 26, 28])
a - b
Out[26]: array([-10, -10, -10, -10, -10, -10, -10, -10, -10])
a > b
Out[27]: array([False, False, False, False, False, False, False, False, False])
a == b
Out[28]: array([False, False, False, False, False, False, False, False, False])
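
The session above covers +, -, > and ==. As a complementary sketch, the remaining operators from the list (/, ** and %) also work element by element; the small arrays here are made up for illustration.

```python
import numpy as np

a = np.array([2, 4, 6, 8])
b = np.array([1, 2, 3, 4])

quotient = a / b    # true division always produces floats
power = a ** b      # each element of a raised to the matching element of b
remainder = a % b   # element-wise remainder
equal = a == b      # element-wise comparison, result is a boolean array

print(quotient.tolist())   # [2.0, 2.0, 2.0, 2.0]
print(power.tolist())      # [2, 16, 216, 4096]
print(remainder.tolist())  # [0, 0, 0, 0]
print(equal.tolist())      # [False, False, False, False]
```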

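ndarray indexing and slicing

1. Indexing a one-dimensional array: a[5]
2. Indexing a multidimensional array:
1) List-style notation: a[2][3]
2) New-style notation: a[2, 3]

3. Slicing
1) Slicing a one-dimensional array: a[5:8] a[4:] a[2:10] = 1
2) Slicing a multidimensional array: a[1:2, 3:4] a[:, 3:5] a[:, 1]
3) How array slicing differs from list slicing: slicing an array does not copy the data automatically (it creates a view), so modifying the slice also modifies the original array. (Slicing a list creates a new list in memory; note that arrays and lists are implemented differently under the hood.)
4) The copy() method creates a deep copy of an array
5) Slice parameters are separated by colons, in the form start:stop:step
6) When slicing a multidimensional array, pay attention to which dimension is being sliced and to the order of the dimensions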
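
The view-versus-copy behavior described in the list above can be sketched as follows. The array contents and variable names are made up for illustration.

```python
import numpy as np

# Slicing an array returns a view: writing to it changes the original.
a = np.arange(10)
view = a[2:5]
view[:] = 99
print(a.tolist())    # [0, 1, 99, 99, 99, 5, 6, 7, 8, 9]

# copy() allocates new memory, so the original stays untouched.
b = np.arange(10)
independent = b[2:5].copy()
independent[:] = -1
print(b.tolist())    # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# Slicing a list always builds a new list.
lst = list(range(10))
sub = lst[2:5]
sub[0] = 99
print(lst)           # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# Multidimensional indexing and slicing: rows first, then columns.
grid = np.arange(12).reshape(3, 4)
print(grid[1, 2], grid[1][2])    # 6 6
print(grid[:, 1].tolist())       # [1, 5, 9]
print(grid[::2, 1:3].tolist())   # [[1, 2], [9, 10]]
```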

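ndarray boolean indexing

Boolean indexing uses boolean operations (such as comparison operators) to obtain an array of the elements that satisfy a given condition.
1)a[(a>5) & (a%2==0)]
2)a[(a>5) | (a%2==0)]

import numpy as np

x = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]])
print('Our array is:')
print(x)
print('\n')
# Now print the elements greater than 5
print('The elements greater than 5 are:')
print(x[x > 5])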
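
The two combined conditions listed at the top of this section are not exercised by the example, so here is a minimal sketch of them. Each comparison needs its own parentheses because & and | bind more tightly than the comparison operators.

```python
import numpy as np

a = np.arange(10)

# Elements that are greater than 5 AND even.
both = a[(a > 5) & (a % 2 == 0)]
print(both.tolist())    # [6, 8]

# Elements that are greater than 5 OR even.
either = a[(a > 5) | (a % 2 == 0)]
print(either.tolist())  # [0, 2, 4, 6, 7, 8, 9]
```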

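ndarray fancy indexing

Fancy indexing means indexing with an array of integers.
Fancy indexing uses the values of the index array as subscripts along one axis of the target array. With a one-dimensional integer array as the index: if the target is a one-dimensional array, the result is the elements at those positions; if the target is a two-dimensional array, the result is the rows at those indices.
Unlike slicing, fancy indexing always copies the data into a new array.

1)a[[1,3,4,6,7]]
2)a[:,[1,3]]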

import numpy as np
x = np.arange(32).reshape((8, 4))
print('Our initial array:')
print(x)
print('The array I picked out:')
print(x[[4, 2, 1, 7]])

Our initial array:
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]
 [16 17 18 19]
 [20 21 22 23]
 [24 25 26 27]
 [28 29 30 31]]
The array I picked out:
[[16 17 18 19]
 [ 8  9 10 11]
 [ 4  5  6  7]
 [28 29 30 31]]
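
The one-dimensional case, column selection, and the copy behavior mentioned above can be sketched like this. The variable names are illustrative, and np.shares_memory is used only as a convenient way to confirm that the result is a copy.

```python
import numpy as np

# One-dimensional target: the result is the elements at those positions.
a = np.arange(10) * 10
print(a[[1, 3, 4, 6, 7]].tolist())  # [10, 30, 40, 60, 70]

# Two-dimensional target: select columns 1 and 3 of every row.
x = np.arange(32).reshape((8, 4))
cols = x[:, [1, 3]]
print(cols[:2].tolist())            # [[1, 3], [5, 7]]

# Fancy indexing copies the data, so changing the result leaves x alone.
picked = x[[4, 2, 1, 7]]
picked[0, 0] = -1
print(x[4, 0])                      # 16
print(np.shares_memory(x, picked))  # False
```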

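NumPy array functions

Universal functions

1. Universal function: a function that operates on all elements of an array at once
2. Common universal functions:
Unary functions: abs, sqrt, exp, log, ceil, floor, rint, trunc, modf, isnan, isinf, cos, sin, tan
Binary functions: add, subtract, multiply, divide, power, mod, maximum, minimum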
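
A few of the functions named above, applied to small made-up arrays, give a feel for how unary and binary universal functions behave. This is only a sketch; the input values are chosen so the rounding functions differ from one another.

```python
import numpy as np

values = np.array([-1.5, 0.25, 2.75])

# Unary functions: one input array, applied element by element.
print(np.abs(values).tolist())    # [1.5, 0.25, 2.75]
print(np.ceil(values).tolist())   # [-1.0, 1.0, 3.0]
print(np.floor(values).tolist())  # [-2.0, 0.0, 2.0]
print(np.trunc(values).tolist())  # [-1.0, 0.0, 2.0]

# modf splits each element into its fractional and integral parts.
frac, whole = np.modf(values)
print(frac.tolist(), whole.tolist())  # [-0.5, 0.25, 0.75] [-1.0, 0.0, 2.0]

# Binary functions: two input arrays, combined element by element.
left = np.array([1, 5, 3])
right = np.array([4, 2, 6])
print(np.add(left, right).tolist())      # [5, 7, 9]
print(np.maximum(left, right).tolist())  # [4, 5, 6]
print(np.minimum(left, right).tolist())  # [1, 2, 3]
```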

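nan and inf

1. nan (Not a Number): not equal to any floating-point number (nan != nan)
Note: nan typically appears when a root (a fractional power) is taken of negative array elements. The mathematical result would be a complex number, which does not match the float type of the array, so nan appears instead.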

c = np.arange(-5, 5)
c
Out[23]: array([-5, -4, -3, -2, -1,  0,  1,  2,  3,  4])
c**(1/3)
Out[24]: 
C:/Users/Administrator/Desktop/Review-Python/Numpy.py:1: RuntimeWarning: invalid value encountered in power
  # -*- coding : utf-8 -*-
array([       nan,        nan,        nan,        nan,        nan,
       0.        , 1.        , 1.25992105, 1.44224957, 1.58740105])

2. inf (infinity): larger than any floating-point number

a = np.arange(1, 10, 1)
a
Out[7]: array([1, 2, 3, 4, 5, 6, 7, 8, 9])
b = np.zeros(9)
b/a
Out[12]: array([0., 0., 0., 0., 0., 0., 0., 0., 0.])
a/b
Out[13]: 
C:/Users/Administrator/Desktop/Review-Python/Numpy.py:1: RuntimeWarning: divide by zero encountered in true_divide
  # -*- coding : utf-8 -*-
array([inf, inf, inf, inf, inf, inf, inf, inf, inf])

3. Notes
Special values can be created in NumPy with np.nan and np.inf
In data analysis, nan is often used to represent missing values
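
A minimal sketch of working with these special values follows. It assumes nan marks missing data, as described above; np.isfinite, np.nanmean and np.cbrt are standard NumPy functions, but their use here is an illustration rather than part of the original article.

```python
import numpy as np

# nan is not equal to anything, including itself; inf exceeds any finite float.
print(np.nan == np.nan)  # False
print(np.inf > 1e308)    # True

# Hypothetical measurements with one missing value and one overflowed value.
data = np.array([1.0, np.nan, 3.0, np.inf])
print(np.isnan(data).tolist())  # [False, True, False, False]
print(np.isinf(data).tolist())  # [False, False, False, True]

# Keep only the ordinary numbers.
clean = data[np.isfinite(data)]
print(clean.tolist())           # [1.0, 3.0]

# nanmean ignores nan entries when averaging.
print(np.nanmean(np.array([1.0, np.nan, 3.0])))  # 2.0

# For real cube roots of negative numbers, np.cbrt avoids the nan that c**(1/3) gives.
c = np.arange(-5, 5)
roots = np.cbrt(c)
print(np.isnan(roots).any())    # False
```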

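Mathematical and statistical methods

sum: sum
mean: arithmetic mean
std: standard deviation
var: variance
min: minimum value
max: maximum value
argmin: index of the minimum value
argmax: index of the maximum value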
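
These methods can be called on the whole array or along one axis. The sketch below uses a small made-up 2x3 array; note that argmin and argmax return positions in the flattened array when no axis is given.

```python
import numpy as np

scores = np.array([[1, 2, 3],
                   [4, 5, 6]])

print(scores.sum())                  # 21
print(scores.mean())                 # 3.5
print(scores.min(), scores.max())    # 1 6
print(scores.argmin(), scores.argmax())  # 0 5

# axis=0 aggregates down the columns, axis=1 across each row.
print(scores.sum(axis=0).tolist())   # [5, 7, 9]
print(scores.mean(axis=1).tolist())  # [2.0, 5.0]

# The standard deviation is the square root of the variance.
variance = scores.var()
deviation = scores.std()
```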

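Random functions

The random number functions live in the np.random subpackage
rand: generates a random array of the given shape (values between 0 and 1)
randint: generates random integers of the given shape
choice: generates random selections of the given shape
shuffle: same as random.shuffle, shuffles the data
uniform: generates a random array of the given shape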
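
A short sketch of the five functions listed above follows. The seed value, shapes, and ranges are arbitrary choices for illustration; because the values are random, no expected outputs are shown.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed so repeated runs give the same numbers

unit = np.random.rand(2, 3)                     # floats in [0, 1), shape (2, 3)
ints = np.random.randint(0, 10, size=5)         # integers from 0 to 9
picks = np.random.choice([10, 20, 30], size=4)  # drawn with replacement
spread = np.random.uniform(-1, 1, size=3)       # floats in [-1, 1)

deck = np.arange(5)
np.random.shuffle(deck)                         # shuffles in place, returns None

print(unit.shape, ints.shape, picks.shape, spread.shape)
print(sorted(deck.tolist()))  # [0, 1, 2, 3, 4]
```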

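Conclusion

The above covers the common NumPy operations. NumPy's file operations will be explained later on the basis of the pandas toolkit.
Reference: https://www.runoob.com/numpy/numpy-tutorial.html
Personally, I feel that learning this material requires plenty of hands-on practice, working through the exercises yourself alongside the reference material. I hope everyone can become proficient at using NumPy to solve problems.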

Origin www.cnblogs.com/Kate-liu/p/11237895.html