SSL, Fine Tuning, and Linear Probing Heads

Note: none of these heads work!

Linear Probing and Fine Tuning Heads


source

RNNProbingHead


def RNNProbingHead(
    c_in, input_size, hidden_size, n_classes, contrastive:bool=False, module:str='GRU', linear_dropout:float=0.0,
    rnn_dropout:float=0.0, num_rnn_layers:int=1, act:str='gelu',
    pool:str='average', # 'average' or 'max' or 'majority'
    temperature:float=2.0, # only used if pool='majority'
    n_linear_layers:int=1, predict_every_n_patches:int=1, bidirectional:bool=True, affine:bool=False,
    shared_embedding:bool=True, augmentations:NoneType=None, augmentation_mask_ratio:float=0.0,
    augmentation_dims_to_shuffle:list=[1, 2, 3], norm:NoneType=None, # one of [None, 'pre', 'post']
):

An RNN-based probing head. The encoder's patch embeddings are passed through a GRU (or another recurrent module, per the module argument) and one or more linear layers; the per-patch predictions are then pooled ('average', 'max', or temperature-scaled 'majority' voting) so that one prediction is emitted every predict_every_n_patches patches. Optional input augmentations (random masking, dimension shuffling) and pre- or post-normalization are also supported.
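The temperature argument only matters when pool='majority'. A conceptual sketch of temperature-scaled majority (soft) voting over per-patch logits; this is an assumption about the mechanism, not the source implementation:

import torch
import torch.nn.functional as F

def majority_pool(per_patch_logits, temperature=2.0):
    # per_patch_logits: (batch, n_classes, patches_per_prediction)
    # Soften each patch's vote with a temperature-scaled softmax, then
    # average the class distributions across patches: a soft majority vote.
    probs = F.softmax(per_patch_logits / temperature, dim=1)
    return probs.mean(dim=-1)  # (batch, n_classes)

majority_pool(torch.randn(4, 4, 32)).shape
torch.Size([4, 4])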


source

RNNProbingHeadExperimental


def RNNProbingHeadExperimental(
    c_in, input_size, hidden_size, n_classes, contrastive:bool=False, # deprecated
    module:str='GRU', linear_dropout:float=0.0, rnn_dropout:float=0.0, num_rnn_layers:int=1, act:str='gelu',
    pool:str='average', # 'average' or 'max' or 'majority'
    temperature:float=2.0, # only used if pool='majority'
    predict_every_n_patches:int=1, bidirectional:bool=True, affine:bool=False, augmentations:NoneType=None,
    augmentation_mask_ratio:float=0.0, augmentation_dims_to_shuffle:list=[1, 2, 3],
    pre_norm:bool=True, # if True, apply normalization before the RNN rather than after
    mlp_final_head:bool=False
):

An experimental variant of RNNProbingHead. The contrastive flag is deprecated, normalization is controlled by a single pre_norm boolean rather than a norm mode, and mlp_final_head optionally replaces the final linear layer with an MLP. The forward pass also accepts a sequence_padding_mask, as shown below.

m = RNNProbingHeadExperimental(c_in=7, 
                                pool='average', 
                                input_size = 384, 
                                bidirectional=True,
                                affine=False, 
                                hidden_size=1200,
                                module='GRU',
                                n_classes=4,
                                predict_every_n_patches=32,
                                rnn_dropout=0.,
                                num_rnn_layers=1,
                                linear_dropout=0.,
                                mlp_final_head=False,
                                pre_norm=True)
x = torch.randn((4,7,384,960))
sequence_padding_mask = torch.zeros(4,960)
sequence_padding_mask[:,-32:] = 1
m(x, return_softmax=True, sequence_padding_mask=sequence_padding_mask).shape
torch.Size([4, 4, 30])
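Shape check: 960 input patches with predict_every_n_patches=32 gives 960/32 = 30 predictions, hence the (batch=4, n_classes=4, 30) output.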
m = RNNProbingHead(c_in=7,
                   pool='majority',
                   input_size=384,
                   contrastive=False,
                   bidirectional=True,
                   affine=True,
                   shared_embedding=False,
                   hidden_size=384,
                   module='GRU',
                   n_classes=4,
                   predict_every_n_patches=32,
                   rnn_dropout=0.,
                   num_rnn_layers=1,
                   linear_dropout=0.,
                   n_linear_layers=1,
                   norm='post')
x = torch.randn((4,7,384,960))

m(x, return_softmax=True).shape
torch.Size([4, 4, 30])
m = RNNProbingHead(c_in=7,
                   input_size=512,
                   contrastive=True,
                   bidirectional=True,
                   affine=False,
                   shared_embedding=True,
                   hidden_size=256,
                   module='GRU',
                   n_classes=5,
                   predict_every_n_patches=5,
                   rnn_dropout=0.,
                   num_rnn_layers=1,
                   linear_dropout=0.,
                   n_linear_layers=1)
x = torch.randn((4,7,512*2,3600))

m(x, return_softmax=True).shape
torch.Size([4, 5, 720])

source

TransformerDecoderProbingHead


def TransformerDecoderProbingHead(
    c_in, d_model, n_classes, norm:str='BatchNorm', dropout:float=0.0, act:str='gelu', d_ff:int=2048,
    num_layers:int=1, n_heads:int=2, predict_every_n_patches:int=1, affine:bool=False, shared_embedding:bool=True
):

A probing head built from transformer decoder layers: num_layers layers with n_heads attention heads each are applied to the patch embeddings, followed by a linear classification layer that emits one prediction every predict_every_n_patches patches.

layer = TransformerDecoderProbingHead(c_in=7,
                                      affine=True,
                                      shared_embedding=False,
                                      d_model=512,
                                      n_classes=5,
                                      dropout=0.,
                                      num_layers=1,
                                      n_heads=2,
                                      predict_every_n_patches=5)
x = torch.randn((4, 7, 512, 3600))

layer(x).shape
torch.Size([4, 5, 720])

source

DecoderFeedForward


def DecoderFeedForward(
    c_in, # the number of input channels
    predict_every_n_patches, # for a given sequence of length m with frequency f, number of predictions
    num_layers, d_ff, attn_dropout, res_attention, pre_norm, store_attn, n_heads, shared_embedding, affine,
    n_classes, # the number of classes to predict (for sleep stage - there are 6)
    d_model, # the dimension of the transformer model
    norm:str='BatchNorm', # batchnorm or layernorm between linear and convolutional layers
    act:str='gelu', # activation function to use between layers, 'gelu' or 'relu'
    dropout:float=0.0, # dropout in between linear layers
):

A transformer decoder with attention for feed-forward predictions. In practice this is just another encoder layer followed by a linear layer, a 1d convolution, and a softmax; it may nevertheless be useful for linear probing.

c_in = 7
frequency = 125
win_length = 750
overlap = 0.
hop_length = win_length - int(overlap*win_length)
max_seq_len_sec = 6*3600 # for dataloader
max_seq_len = max_seq_len_sec*frequency # for model
n_patches = (max(max_seq_len, win_length) - win_length) // hop_length + 1

x = torch.randn(2, c_in, 512, n_patches)

model = DecoderFeedForward(c_in=c_in,
                           predict_every_n_patches=5,
                           num_layers=1,
                           d_ff = 2048,
                           attn_dropout=0.,
                           res_attention = False,
                           pre_norm = False,
                           store_attn = False,
                           n_heads=2,
                           affine=False,
                           shared_embedding=False,
                           n_classes=5,
                           d_model=512,
                           norm='BatchNorm',
                           act='gelu',
                           dropout=0.
                           )

model(x).shape
torch.Size([2, 5, 720])
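Shape check: max_seq_len = 6*3600*125 = 2,700,000 samples, so n_patches = (2,700,000 - 750)//750 + 1 = 3600, and predict_every_n_patches=5 yields 3600/5 = 720 predictions: (2, 5, 720).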

source

TimeDistributedConvolutionalFeedForward


def TimeDistributedConvolutionalFeedForward(
    c_in, # the number of input channels
    frequency, # the frequency of the original channels
    predict_every_seconds, # for a given sequence of length m with frequency f, number of predictions
    n_classes, # the number of classes to predict (for sleep stage - there are 6)
    win_length, # the convolved patch length, the first step in this is to do a linear layer to this dimension
    d_model, # the dimension of the transformer model
    affine:bool=False, shared_embedding:bool=True
):

A convolutional feed-forward head that first uses a linear feed-forward network to project the features into the original convolutional dimension. A transposed convolution then upsamples the data back to its original form, and a final convolution predicts the classes.
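The source includes no usage example for this head. A minimal hypothetical sketch, assuming it consumes the same (batch, c_in, d_model, n_patches) encoder output as the other heads; the parameter values are illustrative only:

m = TimeDistributedConvolutionalFeedForward(c_in=7,
                                            frequency=125,
                                            predict_every_seconds=30,
                                            n_classes=5,
                                            win_length=750,
                                            d_model=512)
x = torch.randn((4, 7, 512, 3600))
# m(x).shape should be (4, n_classes, n_predictions); the exact time dimension
# depends on how the transposed convolution reconstructs the original signal.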


source

LinearProbingHead


def LinearProbingHead(
    c_in, # the number of input channels in the original input
    predict_every_n_patches, # for a given sequence of length m with frequency f, number of predictions
    n_classes, # the number of classes to predict (for sleep stage - there are 6)
    input_size, # the dimension of the transformer model
    n_layers, # the number of linear layers to use in the prediction head, with ReLU activation and dropout
    num_patch,
    shared_embedding:bool=True, # whether to share one dense layer across all channels or use one layer per channel
    affine:bool=True, # include learnable parameters to weight predictions
    norm:str='BatchNorm', # batchnorm or layernorm between linear and convolutional layers
    act:str='gelu', # activation function to use between layers, 'gelu' or 'relu'
    dropout:float=0.0, # dropout in between linear layers
):

A linear probing head (with an optional MLP). It assumes that each d_model embedding corresponds to a particular segment of time, makes a prediction per patch per channel, and averages the results.

m = LinearProbingHead(c_in=7, 
                      input_size = 512, 
                      predict_every_n_patches=5,
                      n_classes=5,
                      n_layers=3,
                      shared_embedding=True,
                      affine=True,
                      num_patch=3600,
                      dropout=0.1)

x = torch.randn((4,7,512,3600))

m(x, return_softmax=True).shape
torch.Size([4, 5, 720])

source

TimeDistributedFeedForward


def TimeDistributedFeedForward(
    c_in, # the number of input channels
    n_classes, # the number of classes to predict (for sleep stage - there are 6)
    n_patches, # the number of stft or time patches
    d_model, # the dimension of the transformer model
    pred_len_seconds, # the sequence multiclass prediction length in seconds
    n_linear_layers, # the number of linear layers to use in the prediction head, with ReLU activation and dropout
    conv_kernel_stride_size, # the 1d convolution kernel size and stride, in seconds. For a prediction every 30 seconds, put 30 here.
    dropout:float=0.0, # dropout in between linear layers
):

A feed-forward head that uses a convolutional layer to reduce channel dimensionality, followed by a feed-forward network to make the predictions.
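No usage example appears in the source for this head either; a hypothetical instantiation, assuming the same (batch, c_in, d_model, n_patches) input convention (all values illustrative):

m = TimeDistributedFeedForward(c_in=7,
                               n_classes=5,
                               n_patches=3600,
                               d_model=512,
                               pred_len_seconds=6*3600,
                               n_linear_layers=2,
                               conv_kernel_stride_size=30,
                               dropout=0.)
x = torch.randn((4, 7, 512, 3600))
# m(x).shape should be (4, n_classes, pred_len_seconds // conv_kernel_stride_size),
# i.e. one prediction per 30-second window, if the head behaves as documented.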


source

ConvBiGRU


def ConvBiGRU(
    input_size, hidden_sizes, kernel_sizes, n_layers, d_model, predict_every_n_patches, n_classes
):

A bidirectional convolutional GRU head: stacked ConvGRU1D layers are run over the patch sequence in both directions, and a final linear layer emits one prediction every predict_every_n_patches patches. A usage example appears at the end of this page, under ConvGRU1DCell.


source

ConvGRU1D


def ConvGRU1D(
    input_size, hidden_sizes, # if integer, the same hidden size is used for all cells.
    kernel_sizes, # if integer, the same kernel size is used for all cells.
    n_layers
):

A stack of n_layers ConvGRU1DCell modules. If hidden_sizes or kernel_sizes is an integer, the same value is used for every cell; otherwise a list with one entry per cell is expected.


source

ConvGRU1DCell


def ConvGRU1DCell(
    input_size, hidden_size, kernel_size
):

Generates a single convolutional GRU cell, i.e. a GRU whose gate computations use 1d convolutions rather than dense layers.
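The example below exercises the full ConvBiGRU stack rather than a single cell. For the cell itself, a hypothetical sketch (the forward signature and shapes are assumptions based on the usual ConvGRU convention, not taken from the source):

cell = ConvGRU1DCell(input_size=7, hidden_size=32, kernel_size=3)
x_t = torch.randn((4, 7, 512))   # one time step: (batch, input_size, length)
h = torch.zeros((4, 32, 512))    # previous hidden state: (batch, hidden_size, length)
# h_next = cell(x_t, h)          # assumed call; would return the next hidden state, (4, 32, 512)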

x = torch.randn((4,7,512,3600))

convgru = ConvBiGRU(input_size=7, hidden_sizes=32, kernel_sizes=3, n_layers=1, d_model=512, predict_every_n_patches=5, n_classes=5)

out = convgru(x)
out.shape
torch.Size([4, 5, 720])