【译】Effective TensorFlow Chapter 1: TensorFlow 基础

TensorFlow 基础

本文翻译自：《TensorFlow Basics》，如有侵权请联系删除，仅限于学术交流，请勿商用。如有谬误，请联系指出。

TensorFlow和其他数值计算库（如NumPy）之间最显著的区别在于TensorFlow中的操作是基于符号运算的。这是一个强大的概念，它允许TensorFlow执行命令式库（如NumPy）所不能做的所有事情（例如，自动区分）。但这也要付出更大的代价。在我我试图揭秘TensorFlow，并提供一些指导方针和最佳实践，以便更有效地使用TensorFlow。
让我们从一个简单的例子开始，我们要乘以两个随机矩阵。首先，我们看一个在NumPy完成的实施：

import numpy as np

x = np.random.normal(size=[10, 10])
y = np.random.normal(size=[10, 10])
z = np.dot(x, y)

print(z)

现在我们在TensorFlow中执行完全相同的计算：

import tensorflow as tf

x = tf.random_normal([10, 10])
y = tf.random_normal([10, 10])
z = tf.matmul(x, y)

sess = tf.Session()
z_val = sess.run(z)

print(z_val)

与立即执行计算并生成结果的NumPy不同，tensorflow只向图中表示结果的节点提供一个handle（类型为Tensor）。如果我们尝试直接打印Z值，我们会得到这样的结果：

Tensor("MatMul:0", shape=(10, 10), dtype=float32)

由于两个输入都具有完全定义的形状，所以tensorflow能够推断张量的形状及其类型。为了计算张量的值，我们需要创建一个会话并使用Session.run()方法评估它。

提示：当使用Jupyter笔记本时，请确保在定义新节点之前在开始调用tf.reset_default_graph()来清除符号图。

为了理解符号计算有多么强大，让我们看看另一个例子。假设我们有来自曲线（例如f(x)=5x^2+3）的样本，并且我们希望基于这些样本预测f(x)。我们定义一个参数函数g(x, w) = w0 x^2 + w1 x + w2，它是输入x和潜在参数w的函数，我们的目标是找到潜在的参数，使得g(x，w)≈f(x)。这可以通过最小化损失函数来完成：L(w) = ∑ (f(x) - g(x, w))^2.。虽然这个简单问题有一个闭合解(a closed form solution)，但我们选择使用更通用的方法，可以应用于任意的可微函数，并且使用随机梯度下降。我们简单地计算L(w)相对于一组采样点上的w的平均梯度，并沿相反方向移动。

以下是利用TensorFlow完成的代码：

import numpy as np
import tensorflow as tf

# Placeholders are used to feed values from python to TensorFlow ops. We define
# two placeholders, one for input feature x, and one for output y.
x = tf.placeholder(tf.float32)
y = tf.placeholder(tf.float32)

# Assuming we know that the desired function is a polynomial of 2nd degree, we
# allocate a vector of size 3 to hold the coefficients. The variable will be
# automatically initialized with random noise.
w = tf.get_variable("w", shape=[3, 1])

# We define yhat to be our estimate of y.
f = tf.stack([tf.square(x), x, tf.ones_like(x)], 1)
yhat = tf.squeeze(tf.matmul(f, w), 1)

# The loss is defined to be the l2 distance between our estimate of y and its
# true value. We also added a shrinkage term, to ensure the resulting weights
# would be small.
loss = tf.nn.l2_loss(yhat - y) + 0.1 * tf.nn.l2_loss(w)

# We use the Adam optimizer with learning rate set to 0.1 to minimize the loss.
train_op = tf.train.AdamOptimizer(0.1).minimize(loss)

def generate_data():
    x_val = np.random.uniform(-10.0, 10.0, size=100)
    y_val = 5 * np.square(x_val) + 3
    return x_val, y_val

sess = tf.Session()
# Since we are using variables we first need to initialize them.
sess.run(tf.global_variables_initializer())
for _ in range(1000):
    x_val, y_val = generate_data()
    _, loss_val = sess.run([train_op, loss], {x: x_val, y: y_val})
    print(loss_val)
print(sess.run([w]))

通过运行这段代码，你应该看到接近这个的结果：

[4.9924135, 0.00040895029, 3.4504161]

这是我们参数的一个相对接近的近似值。

对于TensorFlow来说，这只是冰山一角。许多问题，例如优化具有数百万参数的大型神经网络，只需几行代码就可以在TensorFlow中高效实现。TensorFlow考虑到了多个设备和线程的扩展，并支持各种平台。

【译】Effective TensorFlow Chapter 1: TensorFlow 基础

TensorFlow 基础

猜你喜欢