Why do we have to normalize the input for an artificial neural network?

It’s explained well here. If the input variables are combined linearly, as in an MLP [multilayer perceptron], then it is rarely strictly necessary to standardize the inputs, at least in theory. The reason is that any rescaling of an input vector can be effectively undone by changing the corresponding weights and biases, leaving you with … Read more
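Even though rescaling can in theory be absorbed into the weights and biases, standardizing inputs is still the common practical default. Here is a minimal sketch of per-feature standardization (zero mean, unit variance) in plain NumPy; the feature matrix is a made-up example:

```python
# A minimal sketch of input standardization, assuming a 2-D feature
# matrix X of shape (n_samples, n_features). The values are made up.
import numpy as np

X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [3.0, 400.0]])

mean = X.mean(axis=0)
std = X.std(axis=0)
X_standardized = (X - mean) / std  # each column now has mean 0 and std 1
```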

Common causes of NaNs during training of neural networks

I have come across this phenomenon several times. Here are my observations: Gradient blow-up. Reason: large gradients throw the learning process off track. What you should expect: in the runtime log, examine the loss values per iteration. You’ll notice that the loss starts to grow significantly from iteration to iteration; eventually the loss … Read more
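As a loose sketch of two common mitigations for gradient blow-up, lowering the learning rate and clipping the gradient norm in Keras look like this; the model, shapes, and values are hypothetical placeholders, not taken from the answer:

```python
# A minimal sketch, assuming TensorFlow/Keras; the architecture and
# learning rate below are arbitrary examples.
from tensorflow import keras

model = keras.Sequential([
    keras.Input(shape=(10,)),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dense(1),
])

# Two typical countermeasures: a smaller learning rate, and clipping
# the global gradient norm so a single bad batch cannot derail training.
optimizer = keras.optimizers.SGD(learning_rate=1e-3, clipnorm=1.0)
model.compile(optimizer=optimizer, loss="mse")
```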

Clarification on a Neural Net that plays Snake

In this post, I will advise you of: How to map navigational instructions to action sequences with an LSTM neural network Resources that will help you learn how to use neural networks to accomplish your task How to install and configure neural network libraries based on what I needed to learn the hard way General … Read more
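As a loose illustration of the first bullet only (not the post’s actual model), here is a minimal Keras sketch of an LSTM that maps a padded sequence of instruction tokens to one of four actions; the vocabulary size, embedding width, and action count are all assumptions:

```python
# An illustrative sketch, not the post's model: an LSTM classifier that
# reads a token sequence and predicts one of 4 actions. The vocabulary
# size (1000), embedding width (32), and action count (4) are hypothetical.
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Embedding(input_dim=1000, output_dim=32),
    keras.layers.LSTM(64),
    keras.layers.Dense(4, activation="softmax"),  # e.g. up/down/left/right
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```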

How to choose cross-entropy loss in TensorFlow?

Preliminary facts: In a functional sense, the sigmoid is a special case of the softmax function when the number of classes equals 2. Both perform the same operation: they transform the logits (see below) into probabilities. In simple binary classification, there’s no big difference between the two; however, in the case of multinomial classification, sigmoid allows … Read more
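A minimal sketch contrasting the two TensorFlow ops the answer is comparing; the logits and labels below are made-up examples:

```python
# A minimal sketch of the two cross-entropy variants in TensorFlow.
import tensorflow as tf

logits = tf.constant([[2.0, -1.0, 0.5]])

# Multi-label case: each logit is squashed independently by a sigmoid,
# so several classes can be "on" at once.
sigmoid_loss = tf.nn.sigmoid_cross_entropy_with_logits(
    labels=tf.constant([[1.0, 0.0, 1.0]]), logits=logits)

# Multinomial case: the logits compete through a softmax, so exactly
# one class is assumed to be correct.
softmax_loss = tf.nn.softmax_cross_entropy_with_logits(
    labels=tf.constant([[0.0, 0.0, 1.0]]), logits=logits)
```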

Keras input explanation: input_shape, units, batch_size, dim, etc

Units: The number of “neurons”, or “cells”, or whatever the layer has inside it. It’s a property of each layer, and yes, it’s related to the output shape (as we will see later). In your picture, except for the input layer, which is conceptually different from the other layers, you have: Hidden layer 1: 4 units … Read more
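To see how units maps onto output shape, here is a minimal Keras sketch; only “hidden layer 1: 4 units” comes from the excerpt, while the input width and the other layer sizes are hypothetical:

```python
# A minimal sketch, assuming a hypothetical 3-feature input. Each Dense
# layer's output shape is (None, units), where None is the batch_size.
from tensorflow import keras

model = keras.Sequential([
    keras.Input(shape=(3,)),                      # input layer: 3 features (assumed)
    keras.layers.Dense(4, activation="relu"),     # hidden layer 1: 4 units
    keras.layers.Dense(2, activation="relu"),     # hidden layer 2: 2 units (assumed)
    keras.layers.Dense(1, activation="sigmoid"),  # output layer: 1 unit (assumed)
])
model.summary()  # prints the (None, units) output shape of each layer
```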

Loss & accuracy – Are these reasonable learning curves?

A little understanding of the actual meanings (and mechanics) of both loss and accuracy will help a great deal here (refer also to this answer of mine, although I will reuse some parts)… For the sake of simplicity, I will limit the discussion to the case of binary classification, but the idea is generally applicable; … Read more
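A small numeric sketch of the loss/accuracy distinction in binary classification: both predictions below count as correct for accuracy (p > 0.5), yet their cross-entropy losses differ enormously. The helper function here is my own illustration, not taken from the answer:

```python
# A minimal sketch: binary cross-entropy for a single example with true
# label y_true and predicted probability p.
import math

def binary_cross_entropy(y_true, p):
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

# Both predictions are classified correctly (accuracy = 1 in each case),
# but the confident one incurs far less loss.
print(binary_cross_entropy(1, 0.51))  # ~0.673: correct, but barely confident
print(binary_cross_entropy(1, 0.99))  # ~0.010: correct and confident
```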