In Keras, what exactly am I configuring when I create a stateful `LSTM` layer with N `units`?

You can check this question for further information, although it is based on Keras-1.x API.

Basically, the unit means the dimension of the inner cells in LSTM. Because in LSTM, the dimension of inner cell (C_t and C_{t-1} in the graph), output mask (o_t in the graph) and hidden/output state (h_t in the graph) should have the SAME dimension, therefore you output’s dimension should be unit-length as well.

And LSTM in Keras only define exactly one LSTM block, whose cells is of unit-length. If you set return_sequence=True, it will return something with shape: (batch_size, timespan, unit). If false, then it just return the last output in shape (batch_size, unit).

As for the input, you should provide input for every timestamp. Basically, the shape is like (batch_size, timespan, input_dim), where input_dim can be different from the unit. If you just want to provide input at the first step, you can simply pad your data with zeros at other time steps.

Leave a Comment