m = PatchEncoder3D(c_in=7, patch_size=3, n_patches=12, tubelet_size=2, d_model=512)
x = torch.randn(4, 7, 10, 12, 3)
m(x).shape
torch.Size([4, 5, 512])
Series decomposition block
Moving average block to highlight the trend of time series
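Taken together, the two blocks above implement the Autoformer-style decomposition: a padded moving average extracts the trend, and the residual is kept as the seasonal component. A minimal sketch of that pattern, with illustrative names and the usual assumption of an odd kernel size (not necessarily this library's exact code):

import torch
import torch.nn as nn

class MovingAvg(nn.Module):
    """Moving average block to highlight the trend of a time series."""
    def __init__(self, kernel_size: int, stride: int = 1):
        super().__init__()
        self.kernel_size = kernel_size
        self.avg = nn.AvgPool1d(kernel_size=kernel_size, stride=stride, padding=0)

    def forward(self, x):  # x: [bs x seq_len x n_vars]
        # replicate-pad both ends so the trend has the same length as the input
        # (assumes an odd kernel_size)
        front = x[:, 0:1, :].repeat(1, (self.kernel_size - 1) // 2, 1)
        end = x[:, -1:, :].repeat(1, (self.kernel_size - 1) // 2, 1)
        x = torch.cat([front, x, end], dim=1)
        return self.avg(x.permute(0, 2, 1)).permute(0, 2, 1)

class SeriesDecomp(nn.Module):
    """Series decomposition block: split x into (seasonal, trend)."""
    def __init__(self, kernel_size: int):
        super().__init__()
        self.moving_avg = MovingAvg(kernel_size)

    def forward(self, x):
        trend = self.moving_avg(x)
        return x - trend, trend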
3D convolution for patched time series data, broken into frames
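Working backwards from the example at the top of this section (input [4, 7, 10, 12, 3] with tubelet_size=2 giving output [4, 5, 512]), one plausible implementation is a single Conv3d whose kernel and stride span tubelet_size frames and the full n_patches x patch_size grid. The sketch below reproduces those shapes but is an assumption, not the library's code:

import torch
import torch.nn as nn

class PatchEncoder3DSketch(nn.Module):  # hypothetical stand-in for PatchEncoder3D
    def __init__(self, c_in, patch_size, n_patches, tubelet_size, d_model):
        super().__init__()
        # one token = tubelet_size frames x n_patches x patch_size values
        self.proj = nn.Conv3d(
            c_in, d_model,
            kernel_size=(tubelet_size, n_patches, patch_size),
            stride=(tubelet_size, n_patches, patch_size),
        )

    def forward(self, x):  # x: [bs x c_in x frames x n_patches x patch_size]
        x = self.proj(x)          # [bs x d_model x frames//tubelet_size x 1 x 1]
        x = x.flatten(2)          # [bs x d_model x n_tokens]
        return x.transpose(1, 2)  # [bs x n_tokens x d_model]

# reproduces the shapes from the example above: (4, 7, 10, 12, 3) -> (4, 5, 512)
m = PatchEncoder3DSketch(c_in=7, patch_size=3, n_patches=12, tubelet_size=2, d_model=512)
assert m(torch.randn(4, 7, 10, 12, 3)).shape == (4, 5, 512)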
d_model = 512
n_heads = 8
batch_size = 2
n_vars = 7
max_len = 100
# Create sequences of different lengths
seq_lens = torch.randint(50, max_len, (batch_size,))
# Create input tensors with different sequence lengths
x_list = [torch.randn(length, d_model) for length in seq_lens]
x_nested = torch.nested.as_nested_tensor(x_list, layout=torch.jagged)
p = PositionalEncoding(num_patch=max_len, d_model=d_model)
out = p(x_nested)

time Absolute Position Encoding. Adapted from tsai.
d_model = 768
batch_size = 2
n_vars = 7
max_len = 14400
# Create sequences of different lengths
seq_lens = torch.randint(10000, max_len, (batch_size,))
# Create input tensors with different sequence lengths
x_list = [torch.randn(length, d_model) for length in seq_lens]
x_nested = torch.nested.as_nested_tensor(x_list, layout=torch.jagged)
p = tAPE(seq_len=max_len, d_model=d_model)
out = p(x_nested)
out.shape
Any nans: tensor(False)
Any nans in x: tensor(False)
Any nans: tensor(False)
Any nans in x: tensor(False)
torch.Size([2, j15, 768])
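For reference, tAPE (from the ConvTran paper, Foumani et al., 2023) is the standard sinusoidal encoding with the sin/cos arguments rescaled by d_model / seq_len, which keeps embedding distances better behaved for short series. A minimal dense-tensor sketch; the nested-tensor handling exercised above is omitted, and the name is illustrative:

import math
import torch
import torch.nn as nn

class tAPESketch(nn.Module):  # hypothetical stand-in for tAPE(seq_len, d_model)
    def __init__(self, seq_len: int, d_model: int, dropout: float = 0.0):
        super().__init__()
        self.dropout = nn.Dropout(dropout)
        pe = torch.zeros(seq_len, d_model)
        position = torch.arange(seq_len, dtype=torch.float).unsqueeze(1)
        div_term = torch.exp(torch.arange(0, d_model, 2).float()
                             * (-math.log(10000.0) / d_model))
        # same as vanilla sinusoidal PE, but frequencies scaled by d_model/seq_len
        pe[:, 0::2] = torch.sin(position * div_term * (d_model / seq_len))
        pe[:, 1::2] = torch.cos(position * div_term * (d_model / seq_len))
        self.register_buffer("pe", pe)  # [seq_len x d_model]

    def forward(self, x):  # x: [bs x seq_len x d_model]
        return self.dropout(x + self.pe[: x.size(1)])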
patch_len = 750
d_model = 12
n_vars = 7
max_len = 100
bs = 3
# Create sequences of different lengths
seq_lens = torch.randint(50, max_len, (bs,))
# Create input tensors with different sequence lengths
x_list = [torch.randn(length, n_vars, patch_len) for length in seq_lens]
x_nested = torch.nested.as_nested_tensor(x_list, layout=torch.jagged)
f = FFT(dim=-1)
x = f(x_nested)
x.size(0), x.dim()
(3, 4)
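The example only checks that FFT preserves the batch size and tensor rank, which is consistent with taking the magnitude of a real FFT along the chosen dim. That reading is an assumption; a sketch of it:

import torch
import torch.nn as nn

class FFTSketch(nn.Module):  # hypothetical stand-in for the FFT module above
    def __init__(self, dim: int = -1):
        super().__init__()
        self.dim = dim

    def forward(self, x):
        # rfft returns complex values; .abs() keeps a real tensor of the
        # same rank, matching x.dim() == 4 in the example
        return torch.fft.rfft(x, dim=self.dim).abs()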
batch_size = 2
n_vars = 7
max_len = 100
# Create sequences of different lengths
seq_lens = torch.randint(50, max_len, (batch_size,))
# Create input tensors with different sequence lengths
x_list = [torch.randn(n_vars, length) for length in seq_lens]
x_nested = torch.nested.as_nested_tensor(x_list, layout=torch.jagged)
print(x_nested.shape)
revin = RevIN(n_vars, dim_to_reduce=-1, affine=True)
x_norm = revin(x_nested, mode=True)
print(x_norm.shape)
x_denorm = revin(x_norm, mode=False)
print(x_denorm.shape)
torch.Size([2, 7, j26])
torch.Size([2, 7, j27])
torch.Size([2, 7, j28])
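RevIN (Kim et al., 2022) normalizes each instance per channel on the way into the model and inverts the transform on the way out, which is what the mode=True / mode=False calls above exercise. A dense-tensor sketch of the mechanism (nested-tensor support and the library's exact parameterization are omitted):

import torch
import torch.nn as nn

class RevINSketch(nn.Module):  # hypothetical stand-in for RevIN
    def __init__(self, n_vars: int, dim_to_reduce: int = -1, affine: bool = True, eps: float = 1e-5):
        super().__init__()
        self.dim, self.eps, self.affine = dim_to_reduce, eps, affine
        if affine:
            self.weight = nn.Parameter(torch.ones(n_vars, 1))
            self.bias = nn.Parameter(torch.zeros(n_vars, 1))

    def forward(self, x, mode: bool):  # x: [bs x n_vars x seq_len]
        if mode:  # normalize, remembering per-instance statistics
            self.mean = x.mean(self.dim, keepdim=True).detach()
            self.std = (x.var(self.dim, keepdim=True, unbiased=False) + self.eps).sqrt().detach()
            x = (x - self.mean) / self.std
            if self.affine:
                x = x * self.weight + self.bias
            return x
        # denormalize: invert the affine map, then restore the statistics
        if self.affine:
            x = (x - self.bias) / (self.weight + self.eps)
        return x * self.std + self.mean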
x = torch.randn(2, 7, 512, 3600)
missing_channel_indices = [0, 1]
learnable_mask_tokens = LearnableMaskedChannelTokens(missing_channel_indices, d_model=512)
learnable_mask_tokens(x).shape
torch.Size([2, 7, 512, 3600])
seq_lens = torch.tensor([50, 100])
# Create input tensors with different sequence lengths
x_list = [torch.randn(n_vars,512, length) for length in seq_lens]
x_nested = torch.nested.as_nested_tensor(x_list, layout=torch.jagged)
out = learnable_mask_tokens(x_nested)
out.shape
torch.Size([2, 7, 512, j99])
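A plausible reading of this example is that each channel listed in missing_channel_indices is overwritten with a learnable d_model-sized token broadcast across time, leaving the tensor shape unchanged. A dense-tensor sketch under that assumption:

import torch
import torch.nn as nn

class MaskedChannelTokensSketch(nn.Module):  # hypothetical stand-in
    def __init__(self, missing_channel_indices, d_model: int):
        super().__init__()
        self.idx = list(missing_channel_indices)
        # one learnable token per missing channel
        self.tokens = nn.Parameter(torch.randn(len(self.idx), d_model))

    def forward(self, x):  # x: [bs x n_vars x d_model x seq_len]
        x = x.clone()
        for i, ch in enumerate(self.idx):
            # [d_model x 1] broadcasts across the time dimension
            x[:, ch] = self.tokens[i].unsqueeze(-1)
        return x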
Inception module adapted from https://github.com/timeseriesAI/tsai/blob/main/tsai/models/InceptionTime.py
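For orientation, the tsai Inception block passes the input through a 1x1 bottleneck, three parallel convolutions with roughly halving (odd) kernel sizes, and a maxpool-plus-1x1 branch, then concatenates and applies BatchNorm + ReLU. A condensed sketch of that structure (see the linked source for the exact version):

import torch
import torch.nn as nn

class InceptionModuleSketch(nn.Module):  # simplified from tsai's InceptionTime
    def __init__(self, ni: int, nf: int = 32, ks: int = 40, bottleneck: bool = True):
        super().__init__()
        ks = [ks // (2 ** i) for i in range(3)]
        ks = [k if k % 2 != 0 else k - 1 for k in ks]  # odd kernels preserve length
        bottleneck = bottleneck if ni > 1 else False
        self.bottleneck = nn.Conv1d(ni, nf, 1, bias=False) if bottleneck else nn.Identity()
        self.convs = nn.ModuleList(
            [nn.Conv1d(nf if bottleneck else ni, nf, k, padding=k // 2, bias=False) for k in ks]
        )
        self.maxconvpool = nn.Sequential(
            nn.MaxPool1d(3, stride=1, padding=1), nn.Conv1d(ni, nf, 1, bias=False)
        )
        self.bn_act = nn.Sequential(nn.BatchNorm1d(nf * 4), nn.ReLU())

    def forward(self, x):  # x: [bs x ni x seq_len]
        bottled = self.bottleneck(x)
        branches = [conv(bottled) for conv in self.convs] + [self.maxconvpool(x)]
        return self.bn_act(torch.cat(branches, dim=1))  # [bs x 4*nf x seq_len]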
Scaled Dot-Product Attention module (Attention is All You Need, Vaswani et al., 2017) with optional residual attention from the previous layer (RealFormer: Transformer Likes Residual Attention, He et al., 2020) and locality self-attention (Vision Transformer for Small-Size Datasets, Lee et al., 2021)
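That description maps onto a compact forward pass: scores = q·kᵀ·scale, optionally plus the previous layer's scores (residual attention), masked, softmaxed, and applied to v; with LSA the scale becomes a trainable parameter. A sketch of the mechanism, not the module's exact code:

import torch
import torch.nn as nn
import torch.nn.functional as F

class SDPASketch(nn.Module):  # sketch of the module described above
    def __init__(self, d_model, n_heads, attn_dropout=0.0, res_attention=False, lsa=False):
        super().__init__()
        head_dim = d_model // n_heads
        self.res_attention = res_attention
        # LSA (Lee et al., 2021): trainable softmax temperature
        self.scale = nn.Parameter(torch.tensor(head_dim ** -0.5), requires_grad=lsa)
        self.attn_dropout = nn.Dropout(attn_dropout)

    def forward(self, q, k, v, prev=None, key_padding_mask=None):
        # q: [bs x n_heads x q_len x d_k], k: [bs x n_heads x d_k x k_len]
        scores = torch.matmul(q, k) * self.scale  # [bs x n_heads x q_len x k_len]
        if prev is not None:  # residual attention (He et al., 2020)
            scores = scores + prev
        if key_padding_mask is not None:  # True marks padded key positions
            scores = scores.masked_fill(key_padding_mask[:, None, None, :], float("-inf"))
        attn = self.attn_dropout(F.softmax(scores, dim=-1))
        out = torch.matmul(attn, v)  # [bs x n_heads x q_len x d_v]
        return (out, attn, scores) if self.res_attention else (out, attn)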
MultiheadAttentionCustom(
  (W_Q): Linear(in_features=512, out_features=512, bias=True)
  (W_K): Linear(in_features=512, out_features=512, bias=True)
  (W_V): Linear(in_features=512, out_features=512, bias=True)
  (sdp_attn): ScaledDotProductAttention(
    (attn_dropout): Dropout(p=0.0, inplace=False)
  )
  (to_out): Sequential(
    (0): Linear(in_features=512, out_features=512, bias=True)
    (1): Dropout(p=0.0, inplace=False)
  )
)
def test_attention_equivalence():
    # Set random seed for reproducibility
    torch.manual_seed(42)

    # Test parameters
    batch_size = 2
    seq_len = 10
    d_model = 64
    n_heads = 4

    # Create input tensor (only need one since we're using self-attention)
    x = torch.randn(batch_size, seq_len, d_model)

    # Create key padding mask
    key_padding_mask = torch.zeros(batch_size, seq_len, dtype=torch.bool)
    key_padding_mask[:, -2:] = True  # mask last 2 positions

    # Initialize both implementations
    custom_mha = MultiheadAttentionCustom(d_model=d_model, n_heads=n_heads)
    flash_mha = MultiHeadAttention(d_model=d_model, n_heads=n_heads)

    # Set both models to eval mode to disable dropout
    custom_mha.eval()
    flash_mha.eval()

    # Copy weights to ensure identical parameters:
    # combine QKV weights from the custom implementation into a single matrix for flash attention
    combined_weight = torch.cat([
        custom_mha.W_Q.weight,
        custom_mha.W_K.weight,
        custom_mha.W_V.weight
    ], dim=0)
    combined_bias = torch.cat([
        custom_mha.W_Q.bias,
        custom_mha.W_K.bias,
        custom_mha.W_V.bias
    ], dim=0)

    # Copy combined weights to flash attention
    flash_mha.c_attn.weight.data = combined_weight
    flash_mha.c_attn.bias.data = combined_bias

    # Output projection weights
    flash_mha.c_proj.weight.data = custom_mha.to_out[0].weight.data.clone()
    flash_mha.c_proj.bias.data = custom_mha.to_out[0].bias.data.clone()

    # Forward pass
    with torch.no_grad():
        custom_output, custom_attn = custom_mha(x, key_padding_mask=key_padding_mask)
        flash_output = flash_mha(x, attn_mask=key_padding_mask)

    # Compare outputs
    print(f"Custom output shape: {custom_output.shape}")
    print(f"Flash output shape: {flash_output.shape}")

    output_close = torch.allclose(custom_output, flash_output, rtol=0, atol=0)
    print(f"Outputs match: {output_close}")
    if not output_close:
        print("\nOutput differences:")
        print(f"Max difference: {(custom_output - flash_output).abs().max().item()}")
        print(f"Mean difference: {(custom_output - flash_output).abs().mean().item()}")

    return custom_output, flash_output

custom_output, flash_output = test_attention_equivalence()
# Max difference: 8.940696716308594e-08
# Mean difference: 1.0550138540565968e-08

Custom output shape: torch.Size([2, 10, 64])
Flash output shape: torch.Size([2, 10, 64])
Outputs match: True
d_model=512
n_heads=8
d_k = d_v = d_model // n_heads
attn = ScaledDotProductAttention(d_model=d_model, n_heads=n_heads)
mha_attn = MultiheadAttentionCustom(d_model, n_heads)
W_Q = nn.Linear(d_model, d_k * n_heads)
W_K = nn.Linear(d_model, d_k * n_heads)
W_V = nn.Linear(d_model, d_v * n_heads)
X,_,_ = ds[0]
X = create_patch(X, patch_len=(10*50), stride=(5*50), constant_pad=True)
patch_len = X.shape[-1]
X = X[None, ...].permute(0,2,1,3) # simulate batch size of 1 [bs x n_vars x num_patch x patch_len]
print(f'X input shape: {X.shape}')
W_P = nn.Linear(patch_len, d_model)
X = W_P(X) # project to d_model
print(f"Projected X shape to d_model: {X.shape}")
X = torch.reshape(X, (X.shape[0]*X.shape[1],X.shape[2],X.shape[3]))
print(f"Reshape for attention: {X.shape}")
# test multihead attention
print("\nTesting MHA and SDA attention, with just 50 elements.")
mha_output, mha_attn_weights = mha_attn(Q=X[:,:50,:])
print(f"MHA attention output shape: {mha_output.shape}, mha attn weight shape: {mha_attn_weights.shape}")
# test scaled dot product attn
K = Q = V = X
# Linear (+ split into multiple heads)
bs = 1 # 1 * 16
q_s = W_Q(Q).reshape(bs, -1, n_heads, d_k).transpose(1, 2)
k_s = W_K(K).reshape(bs, -1, n_heads, d_k).permute(0, 2, 3, 1)
v_s = W_V(V).reshape(bs, -1, n_heads, d_v).transpose(1, 2)
print(f"Q shape: {q_s.shape}, K shape: {k_s.shape}, V shape: {v_s.shape}")
to_out = nn.Linear(n_heads * d_v, d_model)
output, attn_weights = attn(q_s[:,:,:50,:],k_s[:,:,:,:50], v_s[:,:,:50,:])
output = output.transpose(1, 2).contiguous().view(bs, -1, n_heads * d_v)
print(f"Attn output shape {output.shape}, attn weight shape: {attn_weights.shape}")X input shape: torch.Size([1, 7, 10799, 500])
Projected X shape to d_model: torch.Size([1, 7, 10799, 512])
Reshape for attention: torch.Size([7, 10799, 512])
Testing MHA and SDA attention, with just 50 elements.
MHA attention output shape: torch.Size([7, 50, 512]), mha attn weight shape: torch.Size([7, 8, 50, 50])
Q shape: torch.Size([1, 8, 75593, 64]), K shape: torch.Size([1, 8, 64, 75593]), V shape: torch.Size([1, 8, 75593, 64])
Attn output shape torch.Size([1, 50, 512]), attn weight shape: torch.Size([1, 8, 50, 50])
def Attention_Rel_Scl(
d_model:int, # Embedding dimension
n_heads:int, # number of attention heads
seq_len:int, # sequence length or num patches
d_k:int=None, # key dimension
d_v:int=None, # value dimension
res_attention:bool=False, # whether to use residual attention
attn_dropout:float=0.0, # dropout for attention
lsa:bool=False, # whether to use LSA, a trainable parameter for scaling
proj_dropout:float=0.0, # dropout for projection
qkv_bias:bool=True, # bias for q, k, v
):
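The distinguishing piece of this module is the relative position term: a trainable scalar per head and per relative offset is mixed into the attention weights. Whether the bias enters before or after the softmax is not shown here, so the sketch below (bias added to the logits) is one common variant, with illustrative names:

import torch
import torch.nn as nn
import torch.nn.functional as F

class RelScaleBiasSketch(nn.Module):  # core idea behind Attention_Rel_Scl
    def __init__(self, d_model: int, n_heads: int, seq_len: int):
        super().__init__()
        self.n_heads, self.d_k = n_heads, d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.proj = nn.Linear(d_model, d_model)
        # one trainable bias per head and relative offset in [-(L-1), L-1]
        self.rel_bias = nn.Parameter(torch.zeros(n_heads, 2 * seq_len - 1))
        idx = torch.arange(seq_len)
        self.register_buffer("rel_idx", idx[None, :] - idx[:, None] + seq_len - 1)

    def forward(self, x):  # x: [bs x seq_len x d_model]
        bs, L, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q, k, v = (t.view(bs, L, self.n_heads, self.d_k).transpose(1, 2) for t in (q, k, v))
        scores = q @ k.transpose(-2, -1) / self.d_k ** 0.5  # [bs x h x L x L]
        scores = scores + self.rel_bias[:, self.rel_idx]    # relative position bias
        attn = F.softmax(scores, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(bs, L, -1)
        return self.proj(out), attn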
# test with patches [bs * c_in x num_patches x d_model]
d_model=512
c_in = 2
num_patches = 10
x_emb = torch.randn(4*c_in,num_patches, d_model)
abs_position = tAPE(d_model, seq_len=num_patches)
x_emb_pos = abs_position(x_emb)
model = Attention_Rel_Scl(d_model=d_model,
n_heads=2, # number of attention heads
seq_len=num_patches, # sequence length or num patches
)
out, attn_weights = model(x_emb_pos)