What is the difference between a sigmoid followed by cross-entropy and sigmoid_cross_entropy_with_logits in TensorFlow?
You're confusing the cross-entropy for binary and multi-class problems.

**Multi-class cross-entropy**

The formula that you use is correct, and it directly corresponds to `tf.nn.softmax_cross_entropy_with_logits`:

```
-tf.reduce_sum(p * tf.log(q), axis=1)
```

Here `p` and `q` are expected to be probability distributions over N classes. In particular, N can be 2, as in the following example:

```
p = tf.placeholder(tf.float32, shape=[None, 2])
```
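To make the two-class case self-contained, here is a minimal runnable sketch, assuming TensorFlow 1.x graph mode to match the placeholder API above; the feed values and the final `tf.nn.sigmoid_cross_entropy_with_logits` comparison are my own illustration, added to connect back to the question:

```
import tensorflow as tf  # assumes TensorFlow 1.x, matching the placeholder API above

# Ground-truth distribution and raw predictions over N = 2 classes.
p = tf.placeholder(tf.float32, shape=[None, 2])
logit_q = tf.placeholder(tf.float32, shape=[None, 2])
q = tf.nn.softmax(logit_q)

# Manual multi-class cross-entropy vs. the fused op: one value per row.
manual_ce = -tf.reduce_sum(p * tf.log(q), axis=1)
fused_ce = tf.nn.softmax_cross_entropy_with_logits(labels=p, logits=logit_q)

# The op from the question, for contrast: *elementwise* binary cross-entropy,
# treating each of the N entries as an independent Bernoulli label, so the
# result keeps shape [None, 2] instead of collapsing to one value per row.
sigmoid_ce = tf.nn.sigmoid_cross_entropy_with_logits(labels=p, logits=logit_q)

feed = {p: [[0.0, 1.0], [1.0, 0.0]],
        logit_q: [[0.2, 0.8], [0.7, 0.4]]}

with tf.Session() as sess:
    print(sess.run(manual_ce, feed))   # approx. [0.4375 0.5544]
    print(sess.run(fused_ce, feed))    # same values as manual_ce
    print(sess.run(sigmoid_ce, feed))  # shape (2, 2): one loss per class, per row
```

The first two printed tensors should agree, which is the correspondence claimed above. The third keeps one loss per class, and that is exactly the binary-vs-multi-class distinction: sigmoid plus binary cross-entropy scores each class as an independent yes/no label, while softmax cross-entropy scores the whole row as a single distribution.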