Udacity Machine Learning Notes: Deep Learning (1)

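Deep learning is a popular branch of machine learning that uses large amounts of data to solve problems related to human perception, such as image recognition, speech recognition, and natural language understanding. Many companies, including Facebook, Google, Microsoft, and Baidu, have made deep learning part of their research. Neural networks were already studied in the 1980s and 1990s, but small datasets and limited computing power kept them from wide adoption, and the research stayed fairly marginal through the first decade of the twenty-first century. Only as neural networks took off in speech recognition (2009), computer vision (2012), and machine translation (2014) did they become popular under the name deep learning, helped by the massive datasets and much faster processors now available.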

Tools

  1. The course uses TensorFlow, Google's open-source library. It can be installed with pip, Virtualenv, or Docker.

Introduction to Deep Learning

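  1. The earlier courses covered supervised learning, unsupervised learning, and reinforcement learning. Supervised learning learns from labeled data to predict the labels of new data; unsupervised learning learns from unlabeled data how to group it; reinforcement learning builds a model that learns to reach the best outcome by interacting with an environment. Deep learning also learns from data, extracting progressively higher-level abstractions from it.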

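  2. The deep learning course first introduces deep neural networks, which are multi-layer artificial neural network models; then convolutional neural networks, which suit image and speech recognition; and finally sequence neural network models, which suit tasks such as recognizing handwriting and language text.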

TensorFlow

  1. Install TensorFlow with Conda. On OS X or Linux:
conda create -n tensorflow python=3.5
source activate tensorflow
conda install pandas matplotlib jupyter notebook scipy scikit-learn
pip install tensorflow

On Windows:

conda create -n tensorflow python=3.5
activate tensorflow
conda install pandas matplotlib jupyter notebook scipy scikit-learn
pip install tensorflow
  1. A Hello, world! program:
import tensorflow as tf

#Create TensorFlow object called tensor
hello_constant = tf.constant('Hello World!')

with tf.Session() as sess:
    #Run the tf.constant operation in the session
    output = sess.run(hello_constant)
    print(output)
  1. Defining values of TensorFlow's data type, the tensor:
#A is a 0-dimensional int32 tensor
A = tf.constant(1234) 
#B is a 1-dimensional int32 tensor
B = tf.constant([123,456,789]) 
#C is a 2-dimensional int32 tensor
C = tf.constant([ [123,456,789], [222,333,444] ])
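The ranks in the comments above can be checked without TensorFlow. The following is a minimal sketch in plain Python, assuming regular (non-ragged) nested lists; the helper name `shape` is hypothetical and is not part of any TensorFlow API.

```python
def shape(value):
    """Return the shape of a regular nested list as a tuple (illustrative helper)."""
    if not isinstance(value, list):
        return ()  # a scalar has rank 0 and an empty shape
    if not value:
        return (0,)
    # Assumes every sub-list has the same length as the first one
    return (len(value),) + shape(value[0])


a = 1234
b = [123, 456, 789]
c = [[123, 456, 789], [222, 333, 444]]

for name, value in (("A", a), ("B", b), ("C", c)):
    s = shape(value)
    print(name, "rank:", len(s), "shape:", s)
# A rank: 0 shape: ()
# B rank: 1 shape: (3,)
# C rank: 2 shape: (2, 3)
```

The rank is simply the number of dimensions, which is why a bare number counts as a 0-dimensional tensor.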
  1. A Session is the environment in which a TensorFlow computation graph runs:
with tf.Session() as sess:
    output = sess.run(hello_constant)
    print(output)

Linear Function

  1. The most common operation in a neural network is a linear function of the inputs, weights, and biases:
    y = xW + b
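  2. The goal of training a neural network is to adjust the weights and biases so that it better predicts the label or class of new inputs. In TensorFlow, weights and biases therefore have to be defined as modifiable variables (tf.Variable) rather than with tf.constant as above. Weights can be initialized with random values drawn from a normal distribution (the code below uses tf.truncated_normal); biases do not need random values and can simply start at 0.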
#Define variable x and initialize all variables
x = tf.Variable(5)
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
#Initialize weights
n_features = 120
n_labels = 5
weights = tf.Variable(tf.truncated_normal((n_features, n_labels)))
#Initialize bias
n_labels = 5
bias = tf.Variable(tf.zeros(n_labels))
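To make the shapes in y = xW + b concrete, here is a minimal sketch in plain Python rather than TensorFlow. The function name `linear` and the small numbers are assumptions chosen for illustration: x holds one sample with 3 features, W is 3 features by 2 labels, and b has one entry per label.

```python
def linear(x, w, b):
    """Compute y = xW + b for a batch x whose rows are samples."""
    n_labels = len(b)
    return [
        [sum(row[i] * w[i][j] for i in range(len(row))) + b[j]
         for j in range(n_labels)]
        for row in x
    ]


x = [[1.0, 2.0, 3.0]]                      # 1 sample, 3 features
w = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]   # 3 features x 2 labels
b = [0.5, -0.5]                            # one bias per label

y = linear(x, w, b)
print(y)  # [[22.5, 27.5]]
```

Each output column is the dot product of the sample with one column of W, plus that label's bias, so the result has one row per sample and one column per label.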
  1. The following example trains a classifier to recognize the handwritten digits 0, 1, and 2, using data from the MNIST dataset.
# Import modules
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data


# Define weights, biases, and linear function
def get_weights(n_features, n_labels):
    return tf.Variable(tf.truncated_normal((n_features, n_labels)))


def get_biases(n_labels):
    return tf.Variable(tf.zeros(n_labels))


def linear(input, w, b):
    return tf.add(tf.matmul(input, w), b)


# Extract part of MNIST data
def mnist_features_labels(n_labels):
    mnist_features = []
    mnist_labels = []
    mnist = input_data.read_data_sets('/datasets/ud730/mnist', one_hot=True)
    # We only look at 10000 images
    for mnist_feature, mnist_label in zip(*mnist.train.next_batch(10000)):
        if mnist_label[:n_labels].any():
            mnist_features.append(mnist_feature)
            mnist_labels.append(mnist_label[:n_labels])
    return mnist_features, mnist_labels


# Number of features (28*28 image is 784 features)
n_features = 784
# Number of labels
n_labels = 3

# Features and Labels
features = tf.placeholder(tf.float32)
labels = tf.placeholder(tf.float32)

# Weights and Biases
w = get_weights(n_features, n_labels)
b = get_biases(n_labels)

# Linear Function xW+b
logits = linear(features, w, b)

# Training data
train_features, train_labels = mnist_features_labels(n_labels)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    # Softmax
    prediction = tf.nn.softmax(logits)

    # Cross entropy
    cross_entropy = -tf.reduce_sum(labels * tf.log(prediction), reduction_indices=1)

    # Training loss
    loss = tf.reduce_mean(cross_entropy)

    # Rate at which the weights are changed
    learning_rate = 0.08

    # Gradient Descent
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)

    # Run optimizer and get loss
    _, l = sess.run([optimizer, loss], feed_dict={features: train_features, labels: train_labels})

# Print loss
print('Loss: {}'.format(l))
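  • As an activation function, sigmoid can slow learning down. The code above uses softmax instead, which converts its inputs into a probability distribution whose values each lie between 0 and 1. Compared with sigmoid, softmax suits multi-class classification. It has the following form: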
    \sigma(z)_{j} = \frac{e^{z_{j}}}{\sum_{k=1}^{K} e^{z_{k}}} \quad j = 1, 2, \dots, K
  • One-hot encoding turns discrete data into vectors in which exactly one value is 1 and all others are 0. The code below uses scikit-learn's LabelBinarizer to perform one-hot encoding:
import numpy as np
from sklearn import preprocessing
labels = np.array([1,5,3,2,1,4,2,1,3])
lb = preprocessing.LabelBinarizer()
lb.fit(labels)
lb.transform(labels)
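The scikit-learn snippet above does not show its result, so here is a minimal sketch of the same idea in plain Python. The helper `one_hot` is hypothetical; it assumes the classes are ordered by sorted label value, which is intended to mirror LabelBinarizer for these labels.

```python
def one_hot(labels):
    """Return the sorted classes and a one-hot row for every label."""
    classes = sorted(set(labels))
    index = {c: i for i, c in enumerate(classes)}
    encoded = []
    for label in labels:
        row = [0] * len(classes)
        row[index[label]] = 1  # exactly one position is set to 1
        encoded.append(row)
    return classes, encoded


labels = [1, 5, 3, 2, 1, 4, 2, 1, 3]
classes, encoded = one_hot(labels)

print(classes)     # [1, 2, 3, 4, 5]
print(encoded[0])  # [1, 0, 0, 0, 0]  (label 1)
print(encoded[1])  # [0, 0, 0, 0, 1]  (label 5)
```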
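  • The code above uses the cross-entropy cost function, where y is the one-hot label vector and y' is the predicted probability distribution: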
    C(y', y) = -\sum_{j} y_{j} \ln y'_{j}
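As a worked example of the softmax and cross-entropy formulas, the sketch below evaluates both in plain Python. The logits and the one-hot label are made-up values for illustration, and subtracting the maximum logit is a common numerical-stability step rather than something taken from the code above.

```python
import math


def softmax(z):
    """Convert a list of logits into a probability distribution."""
    m = max(z)  # subtracting the max does not change the result, only stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(predicted, target):
    """C(y', y) = -sum_j y_j * ln(y'_j); terms with y_j = 0 contribute nothing."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0)


logits = [2.0, 1.0, 0.1]  # hypothetical model outputs
label = [1.0, 0.0, 0.0]   # one-hot label: the first class is correct

probs = softmax(logits)
loss = cross_entropy(probs, label)

print([round(p, 3) for p in probs])  # [0.659, 0.242, 0.099]
print(round(loss, 3))                # 0.417
```

With a one-hot label, the loss reduces to the negative log of the probability assigned to the correct class, so a more confident correct prediction gives a smaller loss.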
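  1. The example above draws only 10,000 images from the training set. For larger datasets, training can use mini-batching together with stochastic gradient descent (SGD). The code is as follows: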
import math
from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf
import numpy as np


#Define print epoch status function
def print_epoch_stats(epoch_i, sess, last_features, last_labels):
    #Print cost and validation accuracy of an epoch
    current_cost = sess.run(cost, feed_dict={features: last_features, labels: last_labels})
    valid_accuracy = sess.run(accuracy, feed_dict={features: valid_features, labels: valid_labels})
    print('Epoch: {:<4} - Cost: {:<8.3} Valid Accuracy: {:<5.3}'.format(epoch_i, current_cost, valid_accuracy))

#Define batches function
def batches(batch_size, features, labels):
    assert len(features) == len(labels)
    output_batches = []

    sample_size = len(features)
    for start_i in range(0, sample_size, batch_size):
        end_i = start_i + batch_size
        batch = [features[start_i:end_i], labels[start_i:end_i]]
        output_batches.append(batch)
    return output_batches

n_input = 784
n_classes = 10

#Import MNIST data
mnist = input_data.read_data_sets('/dataset/ud730/mnist', one_hot=True)

#The features are already scaled and the data is shuffled
train_features = mnist.train.images
valid_features = mnist.validation.images
test_features = mnist.test.images

train_labels = mnist.train.labels.astype(np.float32)
valid_labels = mnist.validation.labels.astype(np.float32)
test_labels = mnist.test.labels.astype(np.float32)

#Features and Labels
features = tf.placeholder(tf.float32, [None, n_input])
labels = tf.placeholder(tf.float32, [None, n_classes])

#Weights and Biases
weights = tf.Variable(tf.random_normal([n_input, n_classes]))
bias = tf.Variable(tf.random_normal([n_classes]))

#Logits - xW+b
logits = tf.add(tf.matmul(features, weights), bias)

#Define loss and optimizer
learning_rate = tf.placeholder(tf.float32)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)

#Calculate accuracy
correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(labels, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

#Set batch size
batch_size = 128
epochs = 80
learn_rate = 0.001

assert batch_size is not None, 'You must set the batch size'
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)

    #Training cycle
    for epoch_i in range(epochs):

        #Train optimizer on all batches
        for batch_features, batch_labels in batches(batch_size, train_features, train_labels):
            sess.run(optimizer, feed_dict={features: batch_features, labels: batch_labels, learning_rate: learn_rate})
        #Print cost and validation accuracy of an epoch
        print_epoch_stats(epoch_i, sess, batch_features, batch_labels)

    #Calculate accuracy for test dataset
    test_accuracy = sess.run(accuracy, feed_dict={features: test_features, labels: test_labels})

print('Test Accuracy: {}'.format(test_accuracy))
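To see mini-batching and SGD without the MNIST download, here is a minimal sketch in plain Python on a toy problem. The data (y = 2x), the single weight `w`, the learning rate, and the epoch count are all assumptions picked for illustration; only the batching idea follows the `batches` function above.

```python
def batches(batch_size, features, labels):
    """Split features and labels into consecutive mini-batches."""
    assert len(features) == len(labels)
    return [
        (features[i:i + batch_size], labels[i:i + batch_size])
        for i in range(0, len(features), batch_size)
    ]


# Toy data (an assumption for illustration): y = 2x for x = 0..9
xs = [float(x) for x in range(10)]
ys = [2.0 * x for x in xs]

w = 0.0               # the single weight to learn
learning_rate = 0.01  # hypothetical value chosen for this toy problem

for epoch in range(20):
    for batch_x, batch_y in batches(4, xs, ys):
        # Gradient of the mean squared error computed on this batch only
        grad = sum(2.0 * (w * x - y) * x
                   for x, y in zip(batch_x, batch_y)) / len(batch_x)
        w -= learning_rate * grad

print([len(bx) for bx, _ in batches(4, xs, ys)])  # [4, 4, 2]
print(round(w, 4))                                # 2.0
```

The last batch is smaller because 10 is not a multiple of 4, which is also why the TensorFlow placeholders above use None for the batch dimension.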

Reposted from blog.csdn.net/withjeffrey/article/details/83450586