Implementing word dropout in PyTorch

I want to add word dropout to my network so that I can have sufficient training examples for training the embedding of the "unk" token. As far as I'm aware, this is standard practice. Let's assume the index of the unk token is 0, and the index for padding is 1 (we can switch them if that's more convenient).
Here is a simple CNN that implements word dropout the way I would have expected it to work:
import torch
import torch.nn as nn
import torch.nn.functional as F

class Classifier(nn.Module):
    def __init__(self, params):
        super(Classifier, self).__init__()
        self.params = params
        self.word_dropout = nn.Dropout(params["word_dropout"])
        self.pad = torch.nn.ConstantPad1d(max(params["window_sizes"])-1, 1)
        self.embedding = nn.Embedding(params["vocab_size"], params["word_dim"], padding_idx=1)
        self.convs = nn.ModuleList([nn.Conv1d(1, params["feature_num"], params["word_dim"] * window_size, stride=params["word_dim"], bias=False) for window_size in params["window_sizes"]])
        self.dropout = nn.Dropout(params["dropout"])
        self.fc = nn.Linear(params["feature_num"] * len(params["window_sizes"]), params["num_classes"])

    def forward(self, x, l):
        x = self.word_dropout(x)  # does not work: nn.Dropout expects float input, but x is a LongTensor
        x = self.pad(x)
        embedded_x = self.embedding(x)
        embedded_x = embedded_x.view(-1, 1, x.size()[1] * self.params["word_dim"]) # [batch_size, 1, seq_len * word_dim]
        features = [F.relu(conv(embedded_x)) for conv in self.convs]
        pooled = [F.max_pool1d(feat, feat.size()[2]).view(-1, self.params["feature_num"]) for feat in features]
        pooled = torch.cat(pooled, 1)
        pooled = self.dropout(pooled)
        logit = self.fc(pooled)
        return logit
Don't mind the padding: PyTorch doesn't have an easy way of using nonzero padding in CNNs, much less trainable nonzero padding, so I'm doing it manually. Dropout also doesn't let me drop to a nonzero value, and I want to keep the padding token separate from the unk token. I'm keeping the padding in my example because it's the reason this question exists.
This doesn't work because dropout expects float tensors so that it can scale them properly, while my input is a LongTensor of indices that doesn't need to be scaled.
Is there an easy way of doing this in PyTorch? I essentially want a LongTensor-friendly dropout (bonus: better if it lets me specify a dropout constant that isn't 0, so that I could use zero padding).
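For illustration, one possible approach is a small custom module that samples a Bernoulli mask over the index tensor and overwrites the selected positions with the unk index, with no rescaling involved. This is only a minimal sketch, not a built-in PyTorch layer: the class name WordDropout and its parameters p, unk_idx and pad_idx are hypothetical names chosen here, and it assumes the input is a LongTensor of token indices.

```python
import torch
import torch.nn as nn


class WordDropout(nn.Module):
    """Hypothetical helper: replace token indices with unk_idx during training."""

    def __init__(self, p, unk_idx=0, pad_idx=1):
        super().__init__()
        if not 0.0 <= p <= 1.0:
            raise ValueError("p must be in [0, 1]")
        self.p = p              # probability of replacing a token
        self.unk_idx = unk_idx  # replacement value (assumed unk index)
        self.pad_idx = pad_idx  # positions holding this index are left alone

    def forward(self, x):
        # x: LongTensor of token indices, e.g. [batch_size, seq_len]
        if not self.training or self.p == 0.0:
            return x
        # One independent Bernoulli(p) draw per position
        drop = torch.rand(x.shape, device=x.device) < self.p
        # Never overwrite padding positions
        drop = drop & (x != self.pad_idx)
        # Returns a new tensor; the input is not modified in place
        return x.masked_fill(drop, self.unk_idx)
```

Under these assumptions, the classifier above could use self.word_dropout = WordDropout(params["word_dropout"], unk_idx=0, pad_idx=1) in place of nn.Dropout. Because the replacement value is a constructor argument, the unk and padding indices can be swapped if zero padding is preferred, and the module is a no-op in eval() mode.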
