Update only part of the word embedding matrix in Tensorflow

TL;DR: With the default implementation of opt.minimize(loss), TensorFlow will generate a sparse update for word_emb that modifies only the rows of word_emb that participated in the forward pass. The gradient of the tf.gather(word_emb, indices) op with respect to word_emb is a tf.IndexedSlices object (see the implementation for more details). This object represents a sparse tensor that … Read more
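A minimal sketch of this behavior (the variable and index names are illustrative, not from the original answer):

```python
import tensorflow as tf

vocab_size, emb_dim = 1000, 64
word_emb = tf.Variable(tf.random.normal([vocab_size, emb_dim]), name="word_emb")
indices = tf.constant([3, 17, 42])            # only these rows take part in the forward pass
opt = tf.keras.optimizers.SGD(learning_rate=0.1)

with tf.GradientTape() as tape:
    looked_up = tf.gather(word_emb, indices)  # gather a subset of embedding rows
    loss = tf.reduce_sum(looked_up ** 2)

grad = tape.gradient(loss, word_emb)
print(type(grad))                             # tf.IndexedSlices, not a dense tensor
opt.apply_gradients([(grad, word_emb)])       # sparse update: only rows 3, 17, 42 change
```

Because the gradient is an IndexedSlices, the optimizer applies a sparse update and the rows that were never gathered are left untouched.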

How does mask_zero in Keras Embedding layer work?

Actually, setting mask_zero=True for the Embedding layer does not result in it returning a zero vector. Rather, the behavior of the Embedding layer does not change: it still returns the embedding vector at index zero. You can confirm this by checking the Embedding layer weights (i.e. in the example you mentioned it would be m.layers[0].get_weights()). … Read more
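A short sketch illustrating this point (the model and shapes are made up for the example): the index-0 row is still returned as-is; mask_zero only attaches a mask that downstream layers can consume.

```python
import numpy as np
import tensorflow as tf

m = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=10, output_dim=4, mask_zero=True),
])

x = np.array([[0, 1, 2]])
out = m(x)
print(out[0, 0].numpy())                     # not all zeros: this is row 0 of the weight matrix
print(m.layers[0].get_weights()[0][0])       # the same vector as above
print(m.layers[0].compute_mask(x).numpy())   # [[False  True  True]] -- the actual masking signal
```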

Using Gensim Fasttext model with LSTM nn in keras

Here is the procedure to incorporate the FastText model inside an LSTM Keras network:

# define dummy data and preprocess them
docs = ['Well done', 'Good work', 'Great effort', 'nice work', 'Excellent', 'Weak', 'Poor effort', 'not good', 'poor work', 'Could have done better']
docs = [d.lower().split() for d in docs]

# train fasttext from gensim api … Read more
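A hedged sketch of how this setup might continue (hyperparameters, labels, and helper names below are illustrative, not the original answer's): train a gensim FastText model on the tokenized docs, copy its vectors into a frozen Keras Embedding layer, and feed that into an LSTM.

```python
import numpy as np
import tensorflow as tf
from gensim.models import FastText

docs = ['Well done', 'Good work', 'Great effort', 'nice work', 'Excellent',
        'Weak', 'Poor effort', 'not good', 'poor work', 'Could have done better']
docs = [d.lower().split() for d in docs]

# train fasttext from the gensim api (tiny sizes, for illustration only)
ft = FastText(sentences=docs, vector_size=8, window=3, min_count=1, epochs=50)

# word -> index map (index 0 reserved for padding) and the matching weight matrix
vocab = {w: i + 1 for i, w in enumerate(ft.wv.index_to_key)}
emb_matrix = np.zeros((len(vocab) + 1, ft.vector_size))
for w, i in vocab.items():
    emb_matrix[i] = ft.wv[w]

# encode and pad the sentences; dummy labels for the dummy docs
seqs = [[vocab[w] for w in doc] for doc in docs]
x = tf.keras.preprocessing.sequence.pad_sequences(seqs, padding='post')
y = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0])

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(
        input_dim=emb_matrix.shape[0],
        output_dim=emb_matrix.shape[1],
        embeddings_initializer=tf.keras.initializers.Constant(emb_matrix),
        mask_zero=True,
        trainable=False),               # keep the pretrained FastText vectors fixed
    tf.keras.layers.LSTM(16),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(x, y, epochs=10, verbose=0)
```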

How does Keras 1d convolution layer work with word embeddings – text classification problem? (Filters, kernel size, and all hyperparameter)

I will try to explain how 1D convolution is applied to sequence data. I use the example of a sentence consisting of words, but obviously it is not specific to text data; it works the same way for other sequence data and time series. Suppose we have a sentence consisting of m words where each … Read more
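A small sketch of the shape arithmetic (all sizes here are made up): with m words of embedding dimension n, kernel_size=k and filters=f, a Conv1D layer with padding='valid' slides each filter over k consecutive word vectors and produces an output of shape (m - k + 1, f).

```python
import tensorflow as tf

m, n = 10, 50          # sentence length, embedding dimension
k, f = 3, 16           # kernel size (words per window), number of filters

x = tf.random.normal([1, m, n])                       # (batch, steps, channels)
conv = tf.keras.layers.Conv1D(filters=f, kernel_size=k, padding='valid', activation='relu')
out = conv(x)
print(out.shape)                                      # (1, m - k + 1, f) == (1, 8, 16)

pooled = tf.keras.layers.GlobalMaxPooling1D()(out)    # (1, f): one value per filter
print(pooled.shape)
```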