While working through one of Andrew Ng's deep learning course assignments, I came across the TensorFlow cross-entropy function tensorflow.nn.sigmoid_cross_entropy_with_logits.
To check that this function's output agrees with the mathematical formula, I verified the calculation by hand in Excel.
Excel result (the original spreadsheet screenshot is not reproduced here):
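In its place, here is a minimal NumPy sketch of the same hand calculation, using the loss formula from the reference at the end of this post; the inputs and labels are the ones used in the test run below:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

z = np.array([0.2, 0.4, 0.7, 0.9])  # logits
y = np.array([0, 0, 1, 1])          # labels

# Cross-entropy by hand: -y*log(sigmoid(z)) - (1-y)*log(1 - sigmoid(z))
loss = -y * np.log(sigmoid(z)) - (1 - y) * np.log(1 - sigmoid(z))
print(loss)  # ≈ [0.7981 0.9130 0.4032 0.3412], matching the TensorFlow output below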
TensorFlow code:
import tensorflow as tf

# GRADED FUNCTION: cost

def cost(logits, labels):
    """
    Computes the cost using the sigmoid cross entropy

    Arguments:
    logits -- vector containing z, output of the last linear unit (before the final sigmoid activation)
    labels -- vector of labels y (1 or 0)

    Note: What we've been calling "z" and "y" in this class are respectively called "logits" and "labels"
    in the TensorFlow documentation. So logits will feed into z, and labels into y.

    Returns:
    cost -- runs the session of the cost (formula (2))
    """

    ### START CODE HERE ###

    # Create the placeholders for "logits" (z) and "labels" (y) (approx. 2 lines)
    z = tf.compat.v1.placeholder(tf.float32, name='z')
    y = tf.compat.v1.placeholder(tf.float32, name='y')

    # Use the loss function (approx. 1 line)
    cost = tf.nn.sigmoid_cross_entropy_with_logits(logits=z, labels=y)

    # Create a session (approx. 1 line). See method 1 above.
    sess = tf.compat.v1.Session()

    # Run the session (approx. 1 line).
    cost = sess.run(cost, feed_dict={z: logits, y: labels})

    # Close the session (approx. 1 line). See method 1 above.
    sess.close()

    ### END CODE HERE ###

    return cost
Test code:
import numpy as np
import tensorflow as tf

tf.compat.v1.disable_v2_behavior()

logits = np.array([0.2, 0.4, 0.7, 0.9])
print("logits = " + str(logits))
cost = cost(logits, np.array([0, 0, 1, 1]))  # note: this rebinds the name "cost" to the result array
print("cost = " + str(cost))
Output:
logits = [0.2 0.4 0.7 0.9]
cost = [0.79813886 0.91301525 0.40318602 0.34115386]
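As an aside, on TensorFlow 2.x the same check can be run eagerly, without placeholders, sessions, or disabling v2 behavior. This is a sketch of the TF2 equivalent, not part of the original assignment:

import tensorflow as tf  # TF 2.x, eager execution by default

logits = tf.constant([0.2, 0.4, 0.7, 0.9])
labels = tf.constant([0.0, 0.0, 1.0, 1.0])  # must be the same float type as logits

# The op runs immediately and returns a tensor of per-element losses
cost = tf.nn.sigmoid_cross_entropy_with_logits(labels=labels, logits=logits)
print(cost.numpy())  # ≈ [0.7981 0.9130 0.4032 0.3412]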
Conclusion: the verification passes. The assignment's original line "logits = sigmoid(np.array([0.2,0.4,0.7,0.9]))" should be changed to "logits = np.array([0.2,0.4,0.7,0.9])": as the docstring in the reference below confirms, tf.nn.sigmoid_cross_entropy_with_logits applies the sigmoid internally, so feeding in sigmoid(z) would apply it twice.
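To see why the double sigmoid matters, here is a small NumPy illustration (same formula as the hand calculation above) of what the buggy line effectively computes:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

z = np.array([0.2, 0.4, 0.7, 0.9])
y = np.array([0, 0, 1, 1])

a = sigmoid(z)  # what the buggy line passed in as "logits"
# The op applies sigmoid internally, so the loss is effectively computed on sigmoid(sigmoid(z))
double = -y * np.log(sigmoid(a)) - (1 - y) * np.log(1 - sigmoid(a))
print(double)  # does NOT match the reference values [0.7981 0.9130 0.4032 0.3412]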
Reference:
tensorflow.nn.sigmoid_cross_entropy_with_logits
Signature: tensorflow.nn.sigmoid_cross_entropy_with_logits(
    _sentinel=None, labels=None, logits=None, name=None,
)
Docstring:
Computes sigmoid cross entropy given `logits`.

Measures the probability error in discrete classification tasks in which
each class is independent and not mutually exclusive. For instance, one
could perform multilabel classification where a picture can contain both
an elephant and a dog at the same time.

For brevity, let `x = logits`, `z = labels`. The logistic loss is

      z * -log(sigmoid(x)) + (1 - z) * -log(1 - sigmoid(x))
    = z * -log(1 / (1 + exp(-x))) + (1 - z) * -log(exp(-x) / (1 + exp(-x)))
    = z * log(1 + exp(-x)) + (1 - z) * (-log(exp(-x)) + log(1 + exp(-x)))
    = z * log(1 + exp(-x)) + (1 - z) * (x + log(1 + exp(-x)))
    = (1 - z) * x + log(1 + exp(-x))
    = x - x * z + log(1 + exp(-x))

For x < 0, to avoid overflow in exp(-x), we reformulate the above

      x - x * z + log(1 + exp(-x))
    = log(exp(x)) - x * z + log(1 + exp(-x))
    = - x * z + log(1 + exp(x))

Hence, to ensure stability and avoid overflow, the implementation uses this
equivalent formulation

    max(x, 0) - x * z + log(1 + exp(-abs(x)))

`logits` and `labels` must have the same type and shape.

Args:
  _sentinel: Used to prevent positional parameters. Internal, do not use.
  labels: A `Tensor` of the same type and shape as `logits`.
  logits: A `Tensor` of type `float32` or `float64`.
  name: A name for the operation (optional).

Returns:
  A `Tensor` of the same shape as `logits` with the componentwise
  logistic losses.

Raises:
  ValueError: If `logits` and `labels` do not have the same shape.

File: d:\idev\anaconda3\lib\site-packages\tensorflow\python\ops\nn_impl.py
Type: function
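The stability reformulation at the end of that derivation is easy to confirm numerically; here is a quick NumPy check (my own illustration, not from the TensorFlow source):

import numpy as np

x = np.array([-1000.0, -1.0, 0.0, 1.0])  # logits, including an extreme negative value
z = np.array([1.0, 0.0, 1.0, 0.0])       # labels

# Naive form: exp(-x) overflows for very negative x, so the first entry becomes inf
naive = x - x * z + np.log(1 + np.exp(-x))

# Stable form used by the implementation, per the docstring above
stable = np.maximum(x, 0) - x * z + np.log(1 + np.exp(-np.abs(x)))

print(naive)   # ≈ [inf 0.3133 0.6931 1.3133]  (overflow warning on the first entry)
print(stable)  # ≈ [1000. 0.3133 0.6931 1.3133]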