PatchTFTSimple
PatchTFTSimple (c_in:int, win_length, hop_length, max_seq_len,
                time_domain=True, pos_encoding_type='learned',
                relative_attn_type='vanilla', use_flash_attn=False,
                use_revin=True, dim1reduce=False, affine=True,
                mask_ratio=0.1,
                augmentations=['patch_mask', 'jitter_zero_mask',
                               'reverse_sequence', 'shuffle_channels'],
                n_layers:int=2, d_model=512, n_heads=2,
                shared_embedding=False, d_ff:int=2048,
                norm:str='BatchNorm', attn_dropout:float=0.0,
                dropout:float=0.1, act:str='gelu',
                res_attention:bool=True, pre_norm:bool=False,
                store_attn:bool=False, pretrain_head=True,
                pretrain_head_n_layers=1, pretrain_head_dropout=0.0)
| | Type | Default | Details |
|---|---|---|---|
| c_in | int | | the number of input channels |
| win_length | int | | the length of each time-domain patch, or the short-time FT window length (when time_domain=False) |
| hop_length | int | | the distance between the starts of consecutive patches/FT windows |
| max_seq_len | int | | maximum sequence length |
| time_domain | bool | True | whether to patch the raw time-domain signal (True) or use short-time FT features (False) |
| pos_encoding_type | str | learned | options are 'learned' or 'tAPE' |
| relative_attn_type | str | vanilla | options are 'vanilla' or 'eRPE' |
| use_flash_attn | bool | False | whether to use flash attention |
| use_revin | bool | True | if time_domain is True, whether to instance-normalize the time data (RevIN) |
| dim1reduce | bool | False | whether to normalize by timepoint in RevIN |
| affine | bool | True | if time_domain is True, whether to learn the RevIN normalization parameters |
| mask_ratio | float | 0.1 | fraction of the signal to mask |
| augmentations | list | ['patch_mask', 'jitter_zero_mask', 'reverse_sequence', 'shuffle_channels'] | the augmentations to apply; options are 'patch_mask', 'jitter_zero_mask', 'reverse_sequence' and 'shuffle_channels' |
| n_layers | int | 2 | the number of transformer encoder layers |
| d_model | int | 512 | the dimension of the input to the transformer encoder |
| n_heads | int | 2 | the number of attention heads in each layer |
| shared_embedding | bool | False | whether all channels share a single linear projection to the encoder dimension (True) or each channel is projected with its own set of linear weights (False) |
| d_ff | int | 2048 | the size of the feedforward layer in the transformer |
| norm | str | BatchNorm | normalization used during training: 'BatchNorm' or 'LayerNorm' |
| attn_dropout | float | 0.0 | dropout applied to attention |
| dropout | float | 0.1 | dropout applied to linear layers |
| act | str | gelu | activation function |
| res_attention | bool | True | whether to use residual attention |
| pre_norm | bool | False | whether to apply the batch or layer norm before attention/feedforward rather than after |
| store_attn | bool | False | whether to store attention weights |
| pretrain_head | bool | True | whether to include a pretraining head |
| pretrain_head_n_layers | int | 1 | the number of linear layers in the pretraining head |
| pretrain_head_dropout | float | 0.0 | dropout applied to the pretraining head |
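For reference, the patch count implied by win_length, hop_length and max_seq_len follows standard framing arithmetic, and an augmentation such as 'patch_mask' amounts to zeroing a random mask_ratio fraction of patches. The helpers below are an illustrative sketch under those assumptions, not part of the library:

import torch

def n_patches(max_seq_len, win_length, hop_length):
    # one patch per hop that still fits a full window
    return (max_seq_len - win_length) // hop_length + 1

def random_patch_mask(x, mask_ratio=0.1):
    # x: (batch, n_patches, channels, win_length); zero out a random
    # fraction of patches -- a sketch of a 'patch_mask'-style augmentation
    bs, n = x.shape[0], x.shape[1]
    keep = torch.rand(bs, n) >= mask_ratio  # True = keep the patch
    return x * keep[:, :, None, None]

n_patches(1*3600*100, 750, 750)  # 480, matching the example below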
import torch

# 4 samples, 7 channels, 1*3600*100 = 360,000 timepoints (e.g. 1 h at 100 Hz)
XX = torch.randn(4, 7, 1*3600*100)
# sequence padding mask flagging the first 100 timepoints of each sample
pad = torch.zeros(4, 1*3600*100)
pad[:, 0:100] = 1
model = PatchTFTSimple(c_in=7,
win_length=750,
hop_length=750,
max_seq_len=(1*3600*100),
use_revin=True,
time_domain=True,
affine=False,
dim1reduce=False,
act='gelu',
use_flash_attn=True,
relative_attn_type='vanilla',
pos_encoding_type='learned',
mask_ratio=0.1,
augmentations=['jitter_zero_mask'],
n_layers=1,
n_heads=1,
d_model=512,
d_ff=2048,
dropout=0.,
attn_dropout=0.,
pre_norm=False,
res_attention=False,
shared_embedding=False,
pretrain_head=True
)
r = model(XX, sequence_padding_mask=pad)
r[0].shape, r[1].shape, r[3].shape
(torch.Size([4, 480, 7, 750]),
torch.Size([4, 480, 7, 750]),
torch.Size([4, 480]))
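The per-patch outputs are shaped (batch, n_patches, channels, win_length): with max_seq_len = 1*3600*100 = 360,000 timepoints and win_length = hop_length = 750, the input splits into 360,000 / 750 = 480 non-overlapping patches, which is the 480 that appears in each shape above.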