Using pre-trained word2vec with LSTM for word generation

I’ve created a gist with a simple generator that builds on top of your initial idea: it’s an LSTM network wired to the pre-trained word2vec embeddings, trained to predict the next word in a sentence. The data is the list of abstracts from arXiv website. I’ll highlight the most important parts here. Gensim Word2Vec Your … Read more

ValueError: Input 0 is incompatible with layer lstm_13: expected ndim=3, found ndim=4

I solved the problem by making input size: (95000,360,1) and output size: (95000,22) and changed the input shape to (360,1) in the code where model is defined: model = Sequential() model.add(LSTM(22, input_shape=(360,1))) model.add(Dense(22, activation=’softmax’)) model.compile(loss=”categorical_crossentropy”, optimizer=”adam”, metrics=[‘accuracy’]) print(model.summary()) model.fit(ml2_train_input, ml2_train_output_enc, epochs=2, batch_size=500)

In Keras, what exactly am I configuring when I create a stateful `LSTM` layer with N `units`?

You can check this question for further information, although it is based on Keras-1.x API. Basically, the unit means the dimension of the inner cells in LSTM. Because in LSTM, the dimension of inner cell (C_t and C_{t-1} in the graph), output mask (o_t in the graph) and hidden/output state (h_t in the graph) should … Read more

Keras LSTM input dimension setting

For the sake of completeness, here’s what’s happened. First up, LSTM, like all layers in Keras, accepts two arguments: input_shape and batch_input_shape. The difference is in convention that input_shape does not contain the batch size, while batch_input_shape is the full input shape including the batch size. Hence, the specification input_shape=(None, 20, 64) tells keras to … Read more

How do I create a variable-length input LSTM in Keras?

I am not clear about the embedding procedure. But still here is a way to implement a variable-length input LSTM. Just do not specify the timespan dimension when building LSTM. import keras.backend as K from keras.layers import LSTM, Input I = Input(shape=(None, 200)) # unknown timespan, fixed feature size lstm = LSTM(20) f = K.function(inputs=[I], … Read more

What’s the difference between “hidden” and “output” in PyTorch LSTM?

I made a diagram. The names follow the PyTorch docs, although I renamed num_layers to w. output comprises all the hidden states in the last layer (“last” depth-wise, not time-wise). (h_n, c_n) comprises the hidden states after the last timestep, t = n, so you could potentially feed them into another LSTM. The batch dimension … Read more

expected ndim=3, found ndim=2

LSTM layer expects inputs to have shape of (batch_size, timesteps, input_dim). In keras you need to pass (timesteps, input_dim) for input_shape argument. But you are setting input_shape (9,). This shape does not include timesteps dimension. The problem can be solved by adding extra dimension to input_shape for time dimension. E.g adding extra dimension with value … Read more