lstm - w3toppers.com

Using pre-trained word2vec with LSTM for word generation

I’ve created a gist with a simple generator that builds on top of your initial idea: it’s an LSTM network wired to the pre-trained word2vec embeddings, trained to predict the next word in a sentence. The data is the list of abstracts from arXiv website. I’ll highlight the most important parts here. Gensim Word2Vec Your … Read more

ValueError: Input 0 is incompatible with layer lstm_13: expected ndim=3, found ndim=4

I solved the problem by making input size: (95000,360,1) and output size: (95000,22) and changed the input shape to (360,1) in the code where model is defined: model = Sequential() model.add(LSTM(22, input_shape=(360,1))) model.add(Dense(22, activation=’softmax’)) model.compile(loss=”categorical_crossentropy”, optimizer=”adam”, metrics=[‘accuracy’]) print(model.summary()) model.fit(ml2_train_input, ml2_train_output_enc, epochs=2, batch_size=500)

In Keras, what exactly am I configuring when I create a stateful `LSTM` layer with N `units`?

You can check this question for further information, although it is based on Keras-1.x API. Basically, the unit means the dimension of the inner cells in LSTM. Because in LSTM, the dimension of inner cell (C_t and C_{t-1} in the graph), output mask (o_t in the graph) and hidden/output state (h_t in the graph) should … Read more

Keras LSTM input dimension setting

For the sake of completeness, here’s what’s happened. First up, LSTM, like all layers in Keras, accepts two arguments: input_shape and batch_input_shape. The difference is in convention that input_shape does not contain the batch size, while batch_input_shape is the full input shape including the batch size. Hence, the specification input_shape=(None, 20, 64) tells keras to … Read more

How do I create a variable-length input LSTM in Keras?

I am not clear about the embedding procedure. But still here is a way to implement a variable-length input LSTM. Just do not specify the timespan dimension when building LSTM. import keras.backend as K from keras.layers import LSTM, Input I = Input(shape=(None, 200)) # unknown timespan, fixed feature size lstm = LSTM(20) f = K.function(inputs=[I], … Read more

What’s the difference between “hidden” and “output” in PyTorch LSTM?

I made a diagram. The names follow the PyTorch docs, although I renamed num_layers to w. output comprises all the hidden states in the last layer (“last” depth-wise, not time-wise). (h_n, c_n) comprises the hidden states after the last timestep, t = n, so you could potentially feed them into another LSTM. The batch dimension … Read more

Error when checking model input: expected lstm_1_input to have 3 dimensions, but got array with shape (339732, 29)

Setting timesteps = 1 (since, I want one timestep for each instance) and reshaping the X_train and X_test as: import numpy as np X_train = np.reshape(X_train, (X_train.shape[0], 1, X_train.shape[1])) X_test = np.reshape(X_test, (X_test.shape[0], 1, X_test.shape[1])) This worked!

LSTM module for Caffe

I know Jeff Donahue worked on LSTM models using Caffe. He also gave a nice tutorial during CVPR 2015. He has a pull-request with RNN and LSTM. Update: there is a new PR by Jeff Donahue including RNN and LSTM. This PR was merged on June 2016 to master.

Passing output of a CNN to BILSTM

The problem is the data passed to LSTM and it can be solved inside your network. The LSTM expects 3D data while Conv2D produces 4D. There are two possibilities you can adopt: 1) make a reshape (batch_size, H, W*channel); 2) make a reshape (batch_size, W, H*channel). In these ways, you have 3D data to use … Read more

expected ndim=3, found ndim=2

LSTM layer expects inputs to have shape of (batch_size, timesteps, input_dim). In keras you need to pass (timesteps, input_dim) for input_shape argument. But you are setting input_shape (9,). This shape does not include timesteps dimension. The problem can be solved by adding extra dimension to input_shape for time dimension. E.g adding extra dimension with value … Read more