How to create caffe.deploy from train.prototxt

There are two main differences between a “train” prototxt and a “deploy” one: 1. Inputs: While for training the data is fixed to a pre-processed training dataset (LMDB/HDF5, etc.), deploying the net requires it to process other inputs in a more “random” fashion. Therefore, the first change is to remove the input layers (layers that push … Read more
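For reference, a minimal sketch of the kind of input declaration that replaces the removed data layers in deploy.prototxt (the blob name and the 1×3×224×224 shape here are illustrative assumptions, not values from the original post):

    layer {
      name: "data"
      type: "Input"
      top: "data"
      # assumed shape: batch of 1, 3-channel 224x224 image
      input_param { shape: { dim: 1 dim: 3 dim: 224 dim: 224 } }
    }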

Tensorflow: Memory leak even while closing Session?

TL;DR: Closing a session does not free the tf.Graph data structure in your Python program, and if each iteration of the loop adds nodes to the graph, you’ll have a leak. Since your function feedForwardStep creates new TensorFlow operations and you call it within the for loop, there is a leak in your code, albeit … Read more
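A minimal sketch of the fix under that diagnosis: construct all operations once before the loop and only call sess.run() inside it (the placeholder shape and toy graph below are illustrative, not from the original question):

    import numpy as np
    import tensorflow as tf

    # Build the graph ONCE, outside the loop.
    x = tf.placeholder(tf.float32, shape=[None, 10])
    w = tf.Variable(tf.random_normal([10, 1]))
    y = tf.matmul(x, w)

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # finalize() makes the graph read-only, so any stray node
        # creation inside the loop raises an error instead of leaking.
        sess.graph.finalize()
        for step in range(1000):
            sess.run(y, feed_dict={x: np.random.rand(32, 10)})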

TimeDistributed(Dense) vs Dense in Keras – Same number of parameters

TimeDistributedDense applies the same Dense (fully connected) layer to every time step during GRU/LSTM cell unrolling, so the error function is computed between the predicted label sequence and the actual label sequence (which is normally the requirement for sequence-to-sequence labeling problems). However, with return_sequences=False, the Dense layer is applied only once, at the last cell. This is normally … Read more
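A short Keras sketch of both setups (the sequence length of 10, feature size of 8, and 5 output classes are made-up values for illustration). Both Dense layers hold the same 32*5 + 5 = 165 parameters, since the per-timestep weights are shared:

    from tensorflow.keras import layers, models

    # Per-timestep labeling: the GRU emits an output at every step, and
    # the same Dense weights are applied at each of the 10 time steps.
    seq_model = models.Sequential([
        layers.GRU(32, return_sequences=True, input_shape=(10, 8)),
        layers.TimeDistributed(layers.Dense(5, activation='softmax')),
    ])

    # Sequence classification: return_sequences=False keeps only the
    # last cell's output, so Dense runs exactly once per sequence.
    last_model = models.Sequential([
        layers.GRU(32, return_sequences=False, input_shape=(10, 8)),
        layers.Dense(5, activation='softmax'),
    ])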

How to count the total number of trainable parameters in a TensorFlow model?

Loop over the shape of every variable in tf.trainable_variables():

    total_parameters = 0
    for variable in tf.trainable_variables():
        # shape is an array of tf.Dimension
        shape = variable.get_shape()
        print(shape)
        print(len(shape))
        variable_parameters = 1
        for dim in shape:
            print(dim)
            variable_parameters *= dim.value
        print(variable_parameters)
        total_parameters += variable_parameters
    print(total_parameters)

Update: I wrote an article to clarify the dynamic/static shapes in … Read more
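As an aside, an equivalent one-liner sketch (assuming TF 1.x, where trainable variables always have fully defined static shapes):

    import numpy as np
    import tensorflow as tf

    total_parameters = sum(
        np.prod(v.get_shape().as_list())  # product of the static dims
        for v in tf.trainable_variables()
    )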