Tokenizers

Imagine a word is an interval of time series data: just as an NLP tokenizer turns words into embeddings, a time series tokenizer slices each channel into fixed-length patches and projects every patch into a d_model-dimensional embedding. A grouped Conv1d whose kernel size and stride both equal the patch length does exactly that:
import torch.nn as nn

proj = nn.Conv1d(
    in_channels=3,        # number of input channels
    out_channels=512 * 3, # d_model per channel; with a shared embedding, d_model alone is the output dimension
    kernel_size=100,      # patch length
    stride=100,           # non-overlapping patches
    padding=0,
    groups=3,             # keep the channels separate: one independent projection per channel
)
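
On a batch of 10 series with 3 channels and 1,000 time steps, this produces 10 non-overlapping patches per channel, each embedded into 512 dimensions:

import torch

x = torch.randn(10, 3, 1000)  # (batch, channels, sequence length)
proj(x).shape
torch.Size([10, 1536, 10])    # 3 channels x 512 dims, 10 patches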

The same tokenizer can be written as one Linear(patch_size, d_model) per channel. Both parameterizations have exactly the same number of trainable parameters:

W_P = nn.ModuleList([nn.Linear(100, 512) for _ in range(3)])  # one projection per channel

l = nn.Linear(100, 512)
l.weight.shape, l.bias.shape
(torch.Size([512, 100]), torch.Size([512]))

def count_parameters(model):
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

count_parameters(W_P), count_parameters(proj)
(155136, 155136)

Each Linear holds 512 * 100 + 512 = 51,712 parameters, so three of them give 155,136; the grouped Conv1d holds 1,536 * 100 weights plus 1,536 biases, the same 155,136.
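
Beyond counting parameters, the two are numerically identical once the weights are shared. A quick sanity check (my own sketch, not library code): copy each Linear's weights into the matching Conv1d group and compare outputs.

with torch.no_grad():
    for i, lin in enumerate(W_P):
        # group i of the conv has weight shape (512, 1, 100); the Linear's is (512, 100)
        proj.weight[i * 512:(i + 1) * 512] = lin.weight.unsqueeze(1)
        proj.bias[i * 512:(i + 1) * 512] = lin.bias

x = torch.randn(10, 3, 1000)
# conv output (10, 1536, 10) -> (batch, channel, patch, d_model)
out_conv = proj(x).reshape(10, 3, 512, 10).permute(0, 1, 3, 2)
# apply each Linear to its channel's 10 patches of length 100
out_lin = torch.stack([lin(x[:, i].reshape(10, 10, 100)) for i, lin in enumerate(W_P)], dim=1)
torch.allclose(out_conv, out_lin, atol=1e-5)
True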


InceptionTokenizer


def InceptionTokenizer(
    c_in, # the number of input channels
    patch_size, # the length of the patches (either stft or interval length)
    d_model, # the dimension of the initial linear layers that feed patches into the transformer
    patch_stride=None, # the stride of the patches
    shared_embedding:bool=True, # whether to project each channel individually or together
    **tokenizer_kwargs,
):

Tokenizes a series into d_model-dimensional patch embeddings using an Inception-style block. Like the other tokenizers on this page, it returns a standard nn.Module, so it can be assigned as a submodule, nested inside other modules, and moved with .to().
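
A minimal usage sketch (assuming the module takes (batch, channels, sequence length) input like the Conv1d example above; the output layout is not documented here):

import torch

tok = InceptionTokenizer(c_in=3, patch_size=100, d_model=512)  # hypothetical call, matching the signature above
x = torch.randn(10, 3, 1000)
tok(x).shape  # expected: one d_model-dimensional embedding per patch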



LinearTokenizer


def LinearTokenizer(
    c_in, # the number of input channels
    patch_size, # the length of the patches (either stft or interval length)
    d_model, # the dimension of the initial linear layers that feed patches into the transformer
    shared_embedding:bool=False, # whether to project each channel individually or together
):

Projects each patch of length patch_size into a d_model-dimensional embedding, either with a single shared linear layer or with one per channel (see shared_embedding).
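
For intuition, here is a from-scratch sketch of what a linear patch tokenizer can look like (an illustrative stand-in, not the library's implementation):

import torch
import torch.nn as nn

class LinearPatchTokenizer(nn.Module):
    "Illustrative only: slice each channel into patches, then project each patch linearly."
    def __init__(self, c_in, patch_size, d_model, shared_embedding=False):
        super().__init__()
        self.patch_size = patch_size
        if shared_embedding:
            self.proj = nn.Linear(patch_size, d_model)  # one projection shared by all channels
        else:
            self.proj = nn.ModuleList([nn.Linear(patch_size, d_model) for _ in range(c_in)])

    def forward(self, x):  # x: (batch, channels, seq_len), seq_len divisible by patch_size
        b, c, l = x.shape
        patches = x.reshape(b, c, l // self.patch_size, self.patch_size)
        if isinstance(self.proj, nn.ModuleList):
            return torch.stack([p(patches[:, i]) for i, p in enumerate(self.proj)], dim=1)
        return self.proj(patches)  # (batch, channels, n_patches, d_model)

LinearPatchTokenizer(c_in=3, patch_size=100, d_model=512)(torch.randn(10, 3, 1000)).shape
torch.Size([10, 3, 10, 512])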



TS_Tokenizer_Complex


def TS_Tokenizer_Complex(
    c_in, patch_size, d_model, constant_pad_value:float=0.0
):

Time series 2D convolutional embedding

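The internals aren't reproduced here, but for a sense of what a 2D convolutional embedding does, here is a generic sketch over a spectrogram-like input (the (batch, channels, freq, time) layout and the square patches are my assumptions, not the module's documented behaviour):

import torch
import torch.nn as nn

c_in, patch_size, d_model = 3, 16, 512
emb = nn.Conv2d(c_in, d_model, kernel_size=patch_size, stride=patch_size)  # one token per 16x16 patch
spec = torch.randn(10, c_in, 64, 128)  # (batch, channels, freq bins, time frames) -- assumed layout
emb(spec).shape
torch.Size([10, 512, 4, 8])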


TS_Tokenizer


def TS_Tokenizer(
    c_in, patch_size, d_model, patch_stride=None, shared_embedding:bool=True
):

Tokenizer class based on a Conv1D

c_in (int): Number of input channels
patch_size (int): Size of each patch/kernel
d_model (int): Output embedding dimension
patch_stride (int, optional): Stride between patches
shared_embedding (bool): Whether to project each channel individually or together
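
Based on that description, a minimal stand-in looks like the grouped Conv1d at the top of this page (my sketch; the default for patch_stride and the exact channel handling are assumptions):

import torch
import torch.nn as nn

def conv1d_tokenizer(c_in, patch_size, d_model, patch_stride=None, shared_embedding=True):
    stride = patch_stride or patch_size  # assumed: stride defaults to non-overlapping patches
    if shared_embedding:
        # one projection that mixes all channels into d_model dimensions per patch
        return nn.Conv1d(c_in, d_model, kernel_size=patch_size, stride=stride)
    # one independent projection per channel, as in the parameter-count demo above
    return nn.Conv1d(c_in, d_model * c_in, kernel_size=patch_size, stride=stride, groups=c_in)

conv1d_tokenizer(c_in=3, patch_size=100, d_model=512)(torch.randn(10, 3, 1000)).shape
torch.Size([10, 512, 10])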


PatchEncoder


def PatchEncoder(
    c_in, # the number of input channels
    patch_len, # the length of the patches (either stft or interval length)
    d_model, # the dimension of the initial linear layers that feed patches into the transformer
    shared_embedding, # whether to project each channel individually or together
):

Encodes patches of length patch_len into d_model-dimensional embeddings, with a shared projection or one per channel depending on shared_embedding.
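
To see what the shared_embedding flag trades off, compare the size of the two parameterizations (hand-rolled stand-ins, since PatchEncoder's internals aren't reproduced here):

import torch.nn as nn

def count_parameters(model):
    return sum(p.numel() for p in model.parameters() if p.requires_grad)

c_in, patch_len, d_model = 3, 100, 512
shared   = nn.Linear(patch_len, d_model)                                        # one projection for every channel
per_chan = nn.ModuleList([nn.Linear(patch_len, d_model) for _ in range(c_in)])  # one per channel
count_parameters(shared), count_parameters(per_chan)
(51712, 155136)

Sharing the embedding divides the projection parameters by c_in, at the cost of forcing every channel through the same patch projection.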