Machine Learning: Learning numpy and matplotlib (Part 13)

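Today we look at two numpy functions: one that saves an array to a file and one that reads a file back into an array.
Later machine learning work will require loading data from all kinds of files into numpy, so these two functions are especially important. I explain their main parameters in detail below, but I still recommend trying them out yourself.
In this lesson, eye1.txt is created automatically by the program, while eye2.txt has to be created by hand.
The contents of eye2.txt are as follows: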

11,12,13,14
21,22,23,24
31,32,33,34
41,42,43,44
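
If you would rather not type the file by hand, the same contents can be generated with a few lines of Python. This is only a minimal sketch: the variable name `rows` is my own, and it assumes the script runs in the directory where eye2.txt should be placed.

```python
# Build the 4x4 table shown above: row r, column c holds the value 10*r + c.
rows = [[10 * r + c for c in range(1, 5)] for r in range(1, 5)]

# Write one comma-separated line per row.
with open("eye2.txt", "w") as f:
    for row in rows:
        f.write(",".join(str(v) for v in row) + "\n")
```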

The complete code for the experiment is as follows:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @Author  : SundayCoder-俊勇
# @File    : numpy5.py
import numpy as np
# numpy basics, lesson 5
# Today we learn some basic numpy functions
a=np.eye(3,4)
print(a)
# Output:
# [[ 1.  0.  0.  0.]
#  [ 0.  1.  0.  0.]
#  [ 0.  0.  1.  0.]]
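# 1. Save the matrix a to a txt file.
# numpy.savetxt(fname, X, fmt='%.18e', delimiter=' ', newline='\n', header='', footer='', comments='# ')
# Purpose: save an array to a text file.
# Parameters:
# fname : name of the file to write, including its extension, e.g. eye.txt
# X : the array to save
# header='header': the string written at the beginning of the file.
# footer='footer': the string written at the end of the file.
# comments: the string prepended to the header and footer lines to mark them as comments (default '# ').
# Try it (b is the 4x4 identity matrix defined further down):
# np.savetxt("eye1.txt",b,newline='\n',header='header', footer='footer', comments='@@@#')
# The contents of eye1.txt are then: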

# @@@#header
# 1.000000000000000000e+00 0.000000000000000000e+00 0.000000000000000000e+00 0.000000000000000000e+00
# 0.000000000000000000e+00 1.000000000000000000e+00 0.000000000000000000e+00 0.000000000000000000e+00
# 0.000000000000000000e+00 0.000000000000000000e+00 1.000000000000000000e+00 0.000000000000000000e+00
# 0.000000000000000000e+00 0.000000000000000000e+00 0.000000000000000000e+00 1.000000000000000000e+00
# @@@#footer

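# Example:
np.savetxt("eye1.txt",a)
# Note that savetxt overwrites the file: each call discards the file's previous contents.
# For example, if we save a and then save b, the file contains only b.
b=np.identity(4)
np.savetxt("eye1.txt",b)

# The file contains only b: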

# 1.000000000000000000e+00 0.000000000000000000e+00 0.000000000000000000e+00 0.000000000000000000e+00
# 0.000000000000000000e+00 1.000000000000000000e+00 0.000000000000000000e+00 0.000000000000000000e+00
# 0.000000000000000000e+00 0.000000000000000000e+00 1.000000000000000000e+00 0.000000000000000000e+00
# 0.000000000000000000e+00 0.000000000000000000e+00 0.000000000000000000e+00 1.000000000000000000e+00



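# 2. Reading a file
# numpy.loadtxt(fname, dtype=float, comments='#', delimiter=None, converters=None,
# skiprows=0, usecols=None, unpack=False, ndmin=0)
# Purpose: Load data from a text file.
#        Each row in the text file must have the same number of values.
# [The file can be in csv or txt format; in practice these are the two formats read most often.]
# csv (comma-separated values) is a plain-text format for tabular data that is used frequently in practical machine learning.
# The main parameters are:
# fname: name of the file to read, e.g. eye2.txt.
# delimiter: the separator between values, e.g. a comma ",".
# dtype: data type of the resulting array, e.g. float, str.
# usecols: which columns to read (0-based indices).
# unpack=True: the returned array is transposed, so that each column can be stored in its own variable.
   # unpack : bool, optional
   # If True, the returned array is transposed, so that arguments may be unpacked using
   # x, y, z = loadtxt(...). When used with a structured data-type, arrays are
   # returned for each field. Default is False.
   # In other words, every selected column needs a variable to receive it; below, the first column is stored in c and the second in v.
# skiprows=1: skip the first line of the file (for example a header row).
# In numpy.loadtxt, skiprows must be an integer: the number of leading lines to skip.
# A list such as skiprows=[0, 2] (skip the first and third lines) is accepted by pandas.read_csv, not by numpy.loadtxt.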

# The contents of eye2.txt are:
# 11,12,13,14
# 21,22,23,24
# 31,32,33,34
# 41,42,43,44

c,v=np.loadtxt('eye2.txt',delimiter=',',usecols=(0,1),dtype=int,unpack=True,skiprows=1)
print(c)
print(v)

# Output:
# [21 31 41]
# [22 32 42]



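# A few words on how to use usecols.
# To select all 4 columns of eye2.txt, use usecols=(0,1,2,3).
# To select only the 4th column, use usecols=(3,). Indices start at 0, so usecols=(4,) would ask for a 5th column, which eye2.txt does not have.
# Selecting a single column is easy to get wrong: (3,) with a trailing comma is a one-element tuple, whereas (3) is just the integer 3.

# The complete statements are:
# np.loadtxt("eye2.txt" , delimiter = "," , usecols=(0,1,2,3) , dtype=str)
# np.loadtxt("eye2.txt" , delimiter = "," , usecols=(3,) , dtype=str)

# Now let's try unpack=False.
k=np.loadtxt('eye2.txt',delimiter=',',usecols=(0,1),dtype=int,unpack=False,skiprows=1)
print(k)
# Output: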
# [[21 22]
#  [31 32]
#  [41 42]]
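
The usecols remarks in the listing above are easy to check for yourself. The following is a minimal sketch rather than part of the original experiment: it keeps the eye2.txt contents in an in-memory string (the names `data`, `last` and `out_of_range_failed` are my own) so that it runs without the file, and it assumes a Python 3 installation of numpy.

```python
import io
import numpy as np

# Same contents as eye2.txt, held in memory so the sketch is self-contained.
data = "11,12,13,14\n21,22,23,24\n31,32,33,34\n41,42,43,44\n"

# Select only the 4th column (index 3); the result is a 1-D array.
last = np.loadtxt(io.StringIO(data), delimiter=",", usecols=(3,), dtype=int)
print(last)        # [14 24 34 44]
print(last.shape)  # (4,)

# Index 4 would be a 5th column, which this data does not have.
# Depending on the numpy version this raises IndexError or ValueError.
try:
    np.loadtxt(io.StringIO(data), delimiter=",", usecols=(4,), dtype=int)
    out_of_range_failed = False
except (IndexError, ValueError) as exc:
    out_of_range_failed = True
    print("usecols=(4,) failed:", type(exc).__name__)
```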

All of the output is as follows:

[[ 1.  0.  0.  0.]
 [ 0.  1.  0.  0.]
 [ 0.  0.  1.  0.]]
[21 31 41]
[22 32 42]
[[21 22]
 [31 32]
 [41 42]]
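
The savetxt parameters fmt and delimiter appear in the signature but are not demonstrated in the experiment. As a minimal sketch (the file name eye3.txt and the header text are my own choices, not part of the original code), the following writes the identity matrix as comma-separated integers with a header line, then reads it back with loadtxt:

```python
import numpy as np

b = np.identity(4)

# fmt='%d' writes each value as an integer instead of the default '%.18e';
# delimiter=',' separates values with commas; the header line is prefixed
# with the default comments string '# '.
np.savetxt("eye3.txt", b, fmt="%d", delimiter=",", header="identity")

with open("eye3.txt") as f:
    text = f.read()
print(text)

# loadtxt treats lines starting with '#' as comments, so the header is skipped.
restored = np.loadtxt("eye3.txt", delimiter=",")
print(np.array_equal(restored, b))
```

A compact format like this makes the file much easier to read than the default scientific notation, at the cost of dropping any fractional part of the values.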
