I am training a stateful RNN on variable length sequences (optional: see my previous question for more details).
I padded the sequences to a fixed length with the value -1.
When batches are loaded, some samples will be entirely -1 (e.g. the batches have shape [batch_size, ...] and samples 1, 6, 8 may be composed entirely of -1's). I would like those fully padded samples to be skipped entirely: no computation, and no contribution to the loss.
I tried using tf.keras.layers.Masking as in:

input = tf.keras.layers.Masking(mask_value=-1)(input)
But this doesn't seem to do anything: the subsequent operations are still performed, and as far as I can tell those samples are still included in the loss. Why is this?
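For reference, here is a minimal sketch of what I'm doing (the shapes and layer sizes are placeholders, not my real model). Inspecting the mask directly with compute_mask shows that fully padded samples do get an all-False mask row, which is why I expected them to be skipped:

```python
import numpy as np
import tensorflow as tf

# Toy batch: 4 samples, sequence length 5, 1 feature.
# Samples 1 and 3 are entirely padding (-1); sample 0 is partially padded.
x = np.random.rand(4, 5, 1).astype("float32")
x[1] = -1.0
x[3] = -1.0
x[0, 3:] = -1.0

inputs = tf.keras.Input(shape=(5, 1))
masked = tf.keras.layers.Masking(mask_value=-1.0)(inputs)
outputs = tf.keras.layers.LSTM(8)(masked)
model = tf.keras.Model(inputs, outputs)

# Inspect the mask the Masking layer computes: a timestep is kept (True)
# if any feature differs from mask_value.
mask = tf.keras.layers.Masking(mask_value=-1.0).compute_mask(tf.constant(x))
print(mask.numpy())  # rows 1 and 3 are all False
```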
NOTE: This question was previously asked on stackoverflow.com and datascience.stackexchange.com, but was deleted (modified, in the latter case) due to lack of response. I think this will be a better home for it.