Normally, a neural network should not give very different results across different runs on the same training set. So, first of all, check that your implementation works correctly on some standard benchmark problems (such as the Iris or Wisconsin breast cancer datasets from the UCI repository).
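One way to put a number on "too different" is to repeat the training with several random seeds and look at the spread of the final scores. The following is only a sketch: `train_and_score` is a hypothetical callback that you would replace with your own training routine, and `fake_train_and_score` is a made-up stand-in so that the example runs on its own.

```python
import random
import statistics


def score_spread(train_and_score, seeds):
    """Train once per seed and return (mean, sample standard deviation) of the scores.

    `train_and_score` is a hypothetical callable: it takes a seed, trains the
    network from scratch, and returns one number (e.g. test accuracy).
    """
    scores = [train_and_score(seed) for seed in seeds]
    return statistics.mean(scores), statistics.stdev(scores)


def fake_train_and_score(seed):
    # Stand-in for a real training run (an assumption for illustration only):
    # it pretends the accuracy is about 0.95 with a small seed-dependent wobble.
    rng = random.Random(seed)
    return 0.95 + rng.uniform(-0.01, 0.01)


if __name__ == "__main__":
    mean, spread = score_spread(fake_train_and_score, seeds=range(5))
    print(f"mean score: {mean:.3f}, std across runs: {spread:.3f}")
```

If the standard deviation across seeds is large compared to the mean on a simple benchmark, that points to a problem in the implementation or the training setup rather than in your data.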
Regarding when to stop the training, there are two options:
1. When the training set error falls below a threshold
2. When the validation set error starts increasing
Case (1) is clear, since the training error generally keeps decreasing as training goes on. For case (2), however, there is no absolute criterion, because the validation error can fluctuate during training. So, just plot it to see how it behaves, and then set a threshold based on your observations (for example, stop when its value becomes 10% larger than the minimum value it has reached during training).
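The two stopping rules can be sketched in a few lines. This is a minimal illustration, not a definitive implementation: the function name `stopping_epoch`, its parameters, and the error values in the usage part are all made up for the example, and it assumes the errors are positive numbers recorded once per epoch.

```python
def stopping_epoch(train_errors, val_errors, train_threshold=None, val_tolerance=0.10):
    """Return (epoch, reason) for the first epoch at which a stopping rule fires.

    Rule 1: the training error falls below `train_threshold` (skipped if None).
    Rule 2: the validation error exceeds its running minimum by more than
            `val_tolerance` (0.10 means 10% above the best value seen so far).
    Epochs are 0-based. Returns (None, "not_triggered") if neither rule fires.
    """
    best_val = float("inf")
    for epoch, (train_err, val_err) in enumerate(zip(train_errors, val_errors)):
        if train_threshold is not None and train_err < train_threshold:
            return epoch, "train_threshold"
        # Track the lowest validation error seen so far.
        best_val = min(best_val, val_err)
        if val_err > best_val * (1.0 + val_tolerance):
            return epoch, "validation_increase"
    return None, "not_triggered"


if __name__ == "__main__":
    # Made-up error curves: training error keeps falling, validation error
    # bottoms out at 0.30 and then creeps back up.
    train = [0.50, 0.40, 0.30, 0.25, 0.20, 0.15, 0.10]
    val = [0.50, 0.40, 0.30, 0.31, 0.32, 0.34, 0.40]
    print(stopping_epoch(train, val))
    print(stopping_epoch(train, val, train_threshold=0.22))
```

In a real training loop you would apply the same check after each epoch instead of on stored lists, and keep a copy of the weights from the epoch with the lowest validation error so that you can restore them after stopping.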