Augmentations

Data augmentations for time-series training: patching utilities, patch masking, value and shuffle augmentations, and training callbacks (CutMix and Mixup).

Patching


source

create_patch

 create_patch (xb, patch_len, stride, return_patch_num=False,
               constant_pad=False, constant_pad_value=0, max_seq_len=None)

xb: [bs x n_vars x seq_len]

x = torch.randn(4, 1000)

# test patch_len > stride without padding vs patch_len == stride with constant padding
xb = create_patch(x, patch_len=505, stride=500, constant_pad=False)
xb_rep = create_patch(x, patch_len=500, stride=500, constant_pad=True)
x.shape, xb.shape, xb_rep.shape
(torch.Size([4, 1000]), torch.Size([4, 1, 505]), torch.Size([4, 2, 500]))
x = torch.randn(1,7,1350000)

# test seq_len > patch_len == stride, with and without constant padding
xb = create_patch(x, patch_len=1024, stride=1024, constant_pad=False)
xb_rep = create_patch(x, patch_len=1024, stride=1024, constant_pad=True)
x.shape, xb.shape, xb_rep.shape
(torch.Size([1, 7, 1350000]),
 torch.Size([1, 1318, 7, 1024]),
 torch.Size([1, 1319, 7, 1024]))
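The patch counts above follow the usual sliding-window arithmetic. A quick sanity check of those numbers (the formulas below are inferred from the shapes shown, not taken from the library source):

import math

seq_len, patch_len, stride = 1_350_000, 1024, 1024

# without padding, only full windows fit
n_no_pad = (seq_len - patch_len) // stride + 1          # 1318
# with constant padding, the tail is padded so the last partial window is kept
n_pad = math.ceil((seq_len - patch_len) / stride) + 1   # 1319
n_no_pad, n_pad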

source

unpatch

 unpatch (x, seq_len, remove_padding=True)

x: [bs/None x patch_num x n_vars x patch_len] returns x: [bs x n_vars x seq_len]

x = torch.randn(1,1,50)

# patch with constant padding, then reconstruct the original sequence
xb = create_patch(x, patch_len=6, stride=6, constant_pad=True)
x_rec = unpatch(xb, seq_len=50, remove_padding=True)
x_rec.shape, xb.shape
(torch.Size([1, 1, 50]), torch.Size([1, 9, 1, 6]))
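For non-overlapping patches, unpatch with remove_padding=True should undo create_patch exactly. A quick round-trip check, assuming that inverse relationship holds (which the shapes above suggest):

x = torch.randn(2, 3, 50)
xb = create_patch(x, patch_len=6, stride=6, constant_pad=True)
x_rec = unpatch(xb, seq_len=50, remove_padding=True)
torch.equal(x, x_rec)  # expected: True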

Patch Masking


source

random_masking

 random_masking (xb, mask_ratio)

source

mask_patches_simple

 mask_patches_simple (xb, mask_ratio)

*Function that masks patches in a simple way

xb: [bs x patch_num x n_vars x patch_len]; padding_mask: [bs x patch_num x 1|num_vars x patch_len]*

Details
xb input tensor (3- or 4-dimensional) to be masked
mask_ratio fraction of patches to mask
x = torch.randn(50,16,7,50)
mask_ratio = 0.4

x_new, mask = mask_patches_simple(x,mask_ratio=mask_ratio)
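To make the masking semantics concrete, here is a minimal sketch of the idea behind simple patch masking (zero out a random fraction of patches and return the corresponding mask). This is an illustration only, not the library's implementation; the real function may draw and apply the mask differently:

import torch

def mask_patches_sketch(xb, mask_ratio):
    # xb: [bs x patch_num x n_vars x patch_len]
    bs, patch_num = xb.shape[0], xb.shape[1]
    # Bernoulli mask over patches, shared across variables and positions within a patch
    mask = torch.rand(bs, patch_num, 1, 1, device=xb.device) < mask_ratio
    x_masked = xb.masked_fill(mask, 0.0)
    return x_masked, mask

x = torch.randn(50, 16, 7, 50)
x_new, mask = mask_patches_sketch(x, mask_ratio=0.4)
x_new.shape, mask.shape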

Value Augmentations


source

jitter_augmentation

 jitter_augmentation (x, mask_ratio=0.05, jitter_ratio=0.05)

source

remove_values

 remove_values (x, mask_ratio)
# note that the random number generator advances state, so reseed before each call to reproduce results
torch.manual_seed(42)
x = torch.randn(4,7,1000)

torch.manual_seed(42)
x_new, n_masks = jitter_augmentation(x)
n_masks / (4 * 7 * 1000)

torch.manual_seed(42)
x_new2, n_masks2 = jitter_augmentation(x)
torch.equal(x_new, x_new2)
True
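The exact perturbation applied by jitter_augmentation is not spelled out above. As a rough illustration of the general idea (perturb a random subset of values with small Gaussian noise and report how many values were touched), a hypothetical sketch could look like this; the parameter semantics here are assumptions, not the library's documented behaviour:

import torch

def jitter_sketch(x, mask_ratio=0.05, jitter_ratio=0.05):
    # pick a random subset of values and perturb them with Gaussian noise
    mask = torch.rand_like(x) < mask_ratio
    x_new = x + mask * torch.randn_like(x) * jitter_ratio
    return x_new, mask.sum()

x = torch.randn(4, 7, 1000)
x_new, n_masks = jitter_sketch(x)
n_masks.item() / x.numel()  # roughly mask_ratio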

Shuffle Augmentations


source

shuffle_dim

 shuffle_dim (x, dim=1, p=0.5)

Randomly shuffles x along dimension dim. x: [bs x n channels x n patches x patch len]


source

reverse_sequence

 reverse_sequence (x, seq_dim=(-1,), p=0.5)
x = torch.randn(4,1,5,5).to('cuda')

torch.equal(shuffle_dim(x), x)
False
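reverse_sequence has no example above. A minimal sketch of the idea (flip the tensor along the given sequence dimension(s) with probability p); this is an illustrative reimplementation, not the library's code:

import torch

def reverse_sequence_sketch(x, seq_dim=(-1,), p=0.5):
    # with probability p, flip x along the given dimension(s)
    if torch.rand(1).item() < p:
        return torch.flip(x, dims=seq_dim)
    return x

x = torch.randn(4, 1, 5, 5)
torch.equal(reverse_sequence_sketch(x, p=1.0), torch.flip(x, dims=(-1,)))  # True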

Callbacks


source

IntraClassCutMix1d

 IntraClassCutMix1d (mix_prob=0.5, return_y_every_sec=30, frequency=125,
                     return_sequence_padding_mask=True)

*Intra-class CutMix for 1D data (e.g., time-series).

This is a callback that can be used to apply CutMix to the training data. It is used to mix segments within the same class.*

Type Default Details
mix_prob float 0.5 probability of applying cutmix
return_y_every_sec int 30 length in seconds of the segment to mix; if one value of y corresponds to 30 seconds of signal data, set this to 30
frequency int 125 sampling frequency of the data
return_sequence_padding_mask bool True whether to return the sequence padding mask
x = torch.randn(4,7,90)
x_c = x.clone()
y = torch.randint(0, 5, size=(4,90//30))
xxt = IntraClassCutMix1d(mix_prob=1, frequency=1, return_y_every_sec=30, return_sequence_padding_mask=False)
batch = (x,y)
xxt.on_train_batch_start(None, None, batch, 0)
torch.equal(x_c, batch[0]) == False
True
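Conceptually, the callback splits each signal into segments of return_y_every_sec * frequency samples and, with probability mix_prob, swaps a segment with the same segment position from another sample in the batch that carries the same label there. The sketch below illustrates that idea under these assumptions; it is not the callback's actual implementation:

import torch

def intra_class_cutmix_sketch(x, y, seg_len, mix_prob=0.5):
    # x: [bs x n_vars x seq_len], y: [bs x n_segments] with one label per segment
    bs, _, seq_len = x.shape
    n_segments = seq_len // seg_len
    for s in range(n_segments):
        if torch.rand(1).item() > mix_prob:
            continue
        for i in range(bs):
            # candidates: other samples with the same label at this segment position
            same = (y[:, s] == y[i, s]).nonzero().flatten()
            same = same[same != i]
            if len(same) == 0:
                continue
            j = same[torch.randint(len(same), (1,))].item()
            sl = slice(s * seg_len, (s + 1) * seg_len)
            x[i, :, sl] = x[j, :, sl]
    return x

x = torch.randn(4, 7, 90)
y = torch.randint(0, 5, size=(4, 90 // 30))
x_mixed = intra_class_cutmix_sketch(x.clone(), y, seg_len=30, mix_prob=1.0)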

source

IntraClassCutMixBatch

 IntraClassCutMixBatch (mix_prob=0.5, return_y_every_sec=30,
                        frequency=125, return_sequence_padding_mask=True,
                        intra_class_only=True)

*Intra-class CutMix for 1D data (e.g., time-series).

This is a callback that can be used to apply CutMix to the training data. It is used to mix segments within the same class.

This differs from IntraClassCutMix1d in that it mixes same-class segments across batches of data, rather than only at the same segment position.*

Type Default Details
mix_prob float 0.5 probability of applying cutmix
return_y_every_sec int 30 length in seconds of the segment to mix; if one value of y corresponds to 30 seconds of signal data, set this to 30
frequency int 125 sampling frequency of the data
return_sequence_padding_mask bool True whether to return the sequence padding mask
intra_class_only bool True whether to mix only within same class (True) or across all classes (False)
x = torch.randn(4,7,90)
x_c = x.clone()
y = torch.randint(0, 5, size=(4,90//30))
xxt = IntraClassCutMixBatch(mix_prob=1, frequency=1, return_y_every_sec=30, return_sequence_padding_mask=False)
batch = (x,y)
batch = xxt.on_train_batch_start(None, None, batch, 0)
torch.equal(x_c, batch[0]) == False
intra-class CutMixBatch is being applied!
True
# Aside: tuples themselves are immutable, but mutable objects inside them can still be
# modified in place (which is why mutating the contents of batch above works). Create a tuple:
batch = ([1,2,3], [4,5,6])

# Unpack into new variables
x, y = batch

# Modify x and y
x[0] = 99  # This modifies the list because lists are mutable
y[0] = 88  # This modifies the list because lists are mutable

print(batch)  # Will show ([99,2,3], [88,5,6]) because lists are mutable
([99, 2, 3], [88, 5, 6])

source

MixupCallback

 MixupCallback (num_classes, mixup_alpha=0.4,
                return_sequence_padding_mask=True, ignore_index=-100)

*Mixup for 1D data (e.g., time-series).

This callback applies Mixup to the training data, blending both the input data and the labels.

See tsai implementation here: https://github.com/timeseriesAI/tsai/blob/bdff96cc8c4c8ea55bc20d7cffd6a72e402f4cb2/tsai/data/mixed_augmentation.py#L43

Note that this creates non-integer labels/soft labels. Loss functions should be able to handle this.*

Type Default Details
num_classes number of classes in the labels
mixup_alpha float 0.4 alpha parameter for the beta distribution
return_sequence_padding_mask bool True whether to return the sequence padding mask
ignore_index int -100 ignore index
x = torch.randn(4,7,90)
x_c = x.clone()
y_og = torch.randint(0, 5, size=(4,90//30))
y_og[1,2] = -100
y_og[2,1] = -100
y_c = y_og.clone()
xxt = MixupCallback(num_classes=5, mixup_alpha=0.4, return_sequence_padding_mask=False)
batch = (x,y_og)
batch = xxt.on_train_batch_start(None, None, batch, 0)
torch.equal(x_c, batch[0]) == False, torch.equal(y_c, batch[1]) == False
Mixup is being applied!
(True, True)
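A minimal sketch of the blending the callback performs (standard Mixup: sample lam from Beta(mixup_alpha, mixup_alpha), mix each sample with a randomly permuted partner, and blend one-hot labels into soft labels). The handling of ignore_index below (zeroing out those targets) is an assumption for illustration, not necessarily how the callback treats it:

import torch
import torch.nn.functional as F

def mixup_sketch(x, y, num_classes, alpha=0.4, ignore_index=-100):
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.shape[0])
    x_mixed = lam * x + (1 - lam) * x[perm]
    # one-hot encode labels, zeroing positions marked with ignore_index
    valid = (y != ignore_index).unsqueeze(-1)
    y_onehot = F.one_hot(y.clamp(min=0), num_classes).float() * valid
    y_soft = lam * y_onehot + (1 - lam) * y_onehot[perm]
    return x_mixed, y_soft

x = torch.randn(4, 7, 90)
y = torch.randint(0, 5, size=(4, 3))
x_mixed, y_soft = mixup_sketch(x, y, num_classes=5)
x_mixed.shape, y_soft.shape  # (torch.Size([4, 7, 90]), torch.Size([4, 3, 5]))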