Slumber
Constants
EDF Helpers
Read EDFs
read_edf
read_edf (file_path, channels=None, frequency=None)
Reads an EDF file and returns a list of signals and a header, optionally resampling all signals to a given frequency.
Parameter | Type | Default | Details
---|---|---|---
file_path | | | file path of edf
channels | NoneType | None | channels in edf to read, will raise warning if channels do not exist
frequency | NoneType | None | frequency to resample all signals to
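The channels argument is documented as raising a warning when a requested channel does not exist. The sketch below illustrates that behaviour under the assumption that the EDF's channel labels are available as a plain list. The helper name select_channels and the example labels are hypothetical and are not part of this library.

```python
import warnings


def select_channels(available, requested=None):
    """Return the requested labels that exist, warning about any that do not.

    Hypothetical helper for illustration only; not the library's implementation.
    """
    if requested is None:
        # No filter passed: keep every channel found in the file.
        return list(available)
    missing = [ch for ch in requested if ch not in available]
    if missing:
        warnings.warn(f"channels not found in EDF: {missing}")
    # Preserve the order in which the caller asked for the channels.
    return [ch for ch in requested if ch in available]


labels = ["EEG", "ECG", "EMG"]  # example labels, assumed
selected = select_channels(labels, ["EEG", "SpO2"])  # warns about "SpO2"
```

Warning instead of raising lets a batch conversion continue over studies whose montages differ, at the cost of silently shorter outputs if warnings are ignored.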
read_edf_mne
read_edf_mne (file_path, channels=None, frequency=None)
Reads an EDF file with the mne library. Not recommended; use edfio (read_edf_edfio) instead.
Parameter | Type | Default | Details
---|---|---|---
file_path | | | file path of edf
channels | NoneType | None | channels in edf to read, will raise warning if channels do not exist
frequency | NoneType | None | frequency to resample all signals to
Returns | Union | | tuple of signals and header dictionary
read_edf_edfio
read_edf_edfio (file_path, channels=None, frequency=None)
Reads an EDF file with edfio.
Parameter | Type | Default | Details
---|---|---|---
file_path | | | file path of edf
channels | NoneType | None | channels in edf to read, will raise warning if channels do not exist
frequency | NoneType | None | frequency to resample all signals to
Returns | Union | | tuple of signals and header dictionary
Read Hypnograms
read_hypnogram
read_hypnogram (file, epoch_length=None)
Function that reads a hypnogram csv and returns a numpy array of the hypnogram with optional repeats
Parameter | Type | Default | Details
---|---|---|---
file | | | file path of the hypnogram csv
epoch_length | NoneType | None | epoch length of the hypnogram measurements, if passed will repeat this many times at each element
Returns | array | | numpy array of hypnogram
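The epoch_length option is described as repeating each hypnogram element that many times. A minimal sketch of that expansion with numpy.repeat follows. The stage values are made up, and whether the library uses numpy.repeat internally is an assumption.

```python
import numpy as np

# One label per scored epoch (hypothetical values).
stages = np.array([0, 0, 2, 3])

# With epoch_length=30, each label is repeated 30 times, giving one label
# per second when epochs are 30 seconds long.
epoch_length = 30
expanded = np.repeat(stages, epoch_length)
```

After expansion the hypnogram has len(stages) * epoch_length entries, which makes it easier to index alongside signals using seconds instead of epochs.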
EDFs to Zarr
edf_signals_to_zarr
edf_signals_to_zarr (edf_file_path, write_data_dir, overwrite=False, channels=None, channel_name_map=None, frequency=None, hyp_epoch_length=30, hyp_data_dir=None)
Converts an EDF to a zarr file.
try_mne: tries to load files with mne instead of pyedflib (if there is an error). This seems dangerous, as mne converts units (and potentially resamples), while pyedflib does not.
Datasets
Self Supervised Dataset
trim_wake_epochs_from_signals
trim_wake_epochs_from_signals (X, hypnogram, sequence_padding_mask, resampled_hypnogram_length, mask_x_with_zeros=False, padding_mask=-100)
Trims wake epochs from signals (if wake is the largest class).
Expected shapes:
X: bs, channels, seq_len
hypnogram: bs, seq_len / resampled_hypnogram_length
sequence_padding_mask: bs, seq_len
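The shapes above imply that each hypnogram entry covers resampled_hypnogram_length samples of X. The sketch below shows one reading of how a range of kept epochs could map onto the sample axis, either by slicing or by zeroing everything outside the range. The function slice_signals_by_epochs and its arguments are hypothetical, and the library's actual handling of mask_x_with_zeros may differ.

```python
import numpy as np


def slice_signals_by_epochs(X, first_epoch, last_epoch, samples_per_epoch,
                            mask_with_zeros=False):
    """Keep the samples of X (bs, channels, seq_len) covered by the epochs
    first_epoch..last_epoch (inclusive). Illustrative only."""
    start = first_epoch * samples_per_epoch
    stop = (last_epoch + 1) * samples_per_epoch
    if mask_with_zeros:
        # Same shape as the input; samples outside the kept range become zero.
        out = np.zeros_like(X)
        out[..., start:stop] = X[..., start:stop]
        return out
    # Shorter output: only the kept range is returned.
    return X[..., start:stop]


X = np.ones((2, 3, 12))  # bs=2, channels=3, seq_len=12 (4 epochs of 3 samples)
trimmed = slice_signals_by_epochs(X, 1, 2, samples_per_epoch=3)
masked = slice_signals_by_epochs(X, 1, 2, samples_per_epoch=3, mask_with_zeros=True)
```

Zeroing keeps every item in a batch at the same length, while slicing produces shorter sequences that would need padding before batching.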
trim_wake_epochs_from_hypnogram
trim_wake_epochs_from_hypnogram (hypnogram, padding_mask=-100)
Trims wake epochs from hypnograms (if wake is the largest class). The wake epochs are trimmed from the beginning and/or end of the hypnogram.
Adapted from Phan et al., L-SeqSleepNet.
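A simplified sketch of the trimming idea described above: if wake is the most frequent label, drop the leading and trailing wake runs and keep everything between the first and last non-wake epoch. It assumes wake is encoded as 0 and operates on a single 1-D hypnogram. The function trim_wake is hypothetical and is not the library's implementation or the exact L-SeqSleepNet procedure.

```python
import numpy as np


def trim_wake(hypnogram, wake=0, padding_mask=-100):
    """Trim leading/trailing wake epochs when wake is the largest class."""
    hyp = np.asarray(hypnogram)
    valid = hyp[hyp != padding_mask]
    if valid.size == 0:
        return hyp
    values, counts = np.unique(valid, return_counts=True)
    if values[np.argmax(counts)] != wake:
        # Wake is not the largest class: leave the hypnogram untouched.
        return hyp
    non_wake = np.flatnonzero((hyp != wake) & (hyp != padding_mask))
    if non_wake.size == 0:
        # Nothing but wake/padding: there is no sleep period to keep.
        return hyp
    return hyp[non_wake[0]: non_wake[-1] + 1]


trimmed_hyp = trim_wake([0, 0, 0, 1, 2, 0, 0, 0, 0])
unchanged_hyp = trim_wake([0, 1, 1, 1, 2, 0])
```

Wake epochs that occur between sleep epochs are kept; only the runs at the two ends are removed.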
SelfSupervisedTimeFrequencyDataset
SelfSupervisedTimeFrequencyDataset (zarr_files, channels, max_seq_len_sec, sample_seq_len_sec, sample_stride_sec, start_offset_sec=0, trim_wake_epochs=True, include_partial_samples=True, sample_df=None, frequency=125, return_hypnogram_every_sec=30, hypnogram_padding_mask=-100, hypnogram_frequency=125, butterworth_filters=None, median_filter_kernel_size=None, voltage_channels=['ECG', 'ECG (LL-RA)', 'EKG', 'ECG (L-R)', 'EOG(L)', 'EOG-L', 'E1', 'LOC', 'E1-M2', 'E1-AVG', 'EMG', 'cchin_l', 'chin', 'EMG (L-R)', 'EMG (1-2)', 'EMG (1-3)', 'Chin3', 'C4-M1', 'C4_M1', 'EEG', 'EEG1', 'EEG2', 'EEG3', 'C3-M2', 'C3_M2', 'C4-AVG'], clip_interpolations=None, scale_channels=False, time_channel_scales=None, return_sequence_padding_mask=False)
An abstract class representing a Dataset.
All datasets that represent a map from keys to data samples should subclass it. All subclasses should overwrite __getitem__, supporting fetching a data sample for a given key. Subclasses could also optionally overwrite __len__, which is expected to return the size of the dataset by many torch.utils.data.Sampler implementations and the default options of torch.utils.data.DataLoader. Subclasses could also optionally implement __getitems__, for speedup batched samples loading. This method accepts list of indices of samples of batch and returns list of samples.
Note: torch.utils.data.DataLoader by default constructs an index sampler that yields integral indices. To make it work with a map-style dataset with non-integral indices/keys, a custom sampler must be provided.
Parameter | Type | Default | Details
---|---|---|---
zarr_files | | | zarr files that include samples
channels | | | channels to use
max_seq_len_sec | | | maximum sequence length (in seconds) to use (this is especially relevant when you are returning both stft and raw ts data to keep them in sync)
sample_seq_len_sec | | | if no sample_df, generate sequences of this length in seconds as one sample
sample_stride_sec | | | if no sample_df, stride in seconds between the starts of consecutive samples from the same array; if sample_stride_sec == sample_seq_len_sec, there is no overlap
start_offset_sec | int | 0 | number of seconds to exclude from beginning of sleep studies
trim_wake_epochs | bool | True | indicator to trim wake epochs from hypnograms, if wake is the largest class
include_partial_samples | bool | True | indicator to include data from partial samples when return_full_length is false
sample_df | NoneType | None | dataframe indicating which indices within each zarr file include a sample
frequency | int | 125 | frequency of underlying data
return_hypnogram_every_sec | int | 30 | integer value indicating the step in indexing in seconds
hypnogram_padding_mask | int | -100 | padding value added to the target; also the index to ignore when computing the loss
hypnogram_frequency | int | 125 | frequency of underlying y hypnogram data
butterworth_filters | NoneType | None | dictionary of low-pass, high-pass, and band-pass filter dictionaries to apply to channels
median_filter_kernel_size | NoneType | None | if not None, will apply median filter with this kernel size
voltage_channels | list | ['ECG', 'ECG (LL-RA)', 'EKG', 'ECG (L-R)', 'EOG(L)', 'EOG-L', 'E1', 'LOC', 'E1-M2', 'E1-AVG', 'EMG', 'cchin_l', 'chin', 'EMG (L-R)', 'EMG (1-2)', 'EMG (1-3)', 'Chin3', 'C4-M1', 'C4_M1', 'EEG', 'EEG1', 'EEG2', 'EEG3', 'C3-M2', 'C3_M2', 'C4-AVG'] | if not None, the units of these channels are checked and converted to microvolts (from mV, uV, etc.)
clip_interpolations | NoneType | None | dictionary of channels:{'phys_range':…, 'percentiles':…} for filtering and interpolation of filtered values
scale_channels | bool | False | indicator to scale channels to the mean and std of the zarr files
time_channel_scales | NoneType | None | dictionary of channel:mean and channel:std values for scaling. Should use training statistics
return_sequence_padding_mask | bool | False | indicator to return the key padding mask for attention masking
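When no sample_df is given, samples are generated from sample_seq_len_sec and sample_stride_sec. The sketch below shows one plausible way such windows could be enumerated, including the effect of start_offset_sec and include_partial_samples. The function window_bounds is hypothetical, and the library's actual indexing (for example how it treats the final partial window) is an assumption.

```python
def window_bounds(total_sec, sample_seq_len_sec, sample_stride_sec,
                  start_offset_sec=0, include_partial_samples=True):
    """Return (start_sec, end_sec) pairs for one recording. Illustrative only."""
    if sample_stride_sec <= 0:
        raise ValueError("sample_stride_sec must be positive")
    windows = []
    start = start_offset_sec
    while start < total_sec:
        end = start + sample_seq_len_sec
        if end > total_sec:
            # The last window runs past the recording: keep the remainder
            # only when partial samples are allowed.
            if include_partial_samples:
                windows.append((start, total_sec))
            break
        windows.append((start, end))
        start += sample_stride_sec
    return windows


# Stride equal to the sequence length: back-to-back windows, no overlap.
no_overlap = window_bounds(100, 40, 40)
# Stride shorter than the sequence length: consecutive windows overlap.
overlapping = window_bounds(100, 40, 20, include_partial_samples=False)
```

A stride smaller than the sequence length increases the number of samples per recording, but neighbouring samples then share data.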
Sleep Stage Supervised Dataset
HypnogramTimeFrequencyDataset
HypnogramTimeFrequencyDataset (zarr_files, channels, max_seq_len_sec, sample_seq_len_sec, sample_stride_sec, start_offset_sec=0, trim_wake_epochs=True, include_partial_samples=True, sample_df=None, frequency=125, return_y_every_sec=30, y_padding_mask=-100, y_frequency=125, butterworth_filters=None, median_filter_kernel_size=None, voltage_channels=['ECG', 'ECG (LL-RA)', 'EKG', 'ECG (L-R)', 'EOG(L)', 'EOG-L', 'E1', 'LOC', 'E1-M2', 'E1-AVG', 'EMG', 'cchin_l', 'chin', 'EMG (L-R)', 'EMG (1-2)', 'EMG (1-3)', 'Chin3', 'C4-M1', 'C4_M1', 'EEG', 'EEG1', 'EEG2', 'EEG3', 'C3-M2', 'C3_M2', 'C4-AVG'], clip_interpolations=None, scale_channels=False, time_channel_scales=None, return_sequence_padding_mask=False)
An abstract class representing a Dataset.
All datasets that represent a map from keys to data samples should subclass it. All subclasses should overwrite __getitem__, supporting fetching a data sample for a given key. Subclasses could also optionally overwrite __len__, which is expected to return the size of the dataset by many torch.utils.data.Sampler implementations and the default options of torch.utils.data.DataLoader. Subclasses could also optionally implement __getitems__, for speedup batched samples loading. This method accepts list of indices of samples of batch and returns list of samples.
Note: torch.utils.data.DataLoader by default constructs an index sampler that yields integral indices. To make it work with a map-style dataset with non-integral indices/keys, a custom sampler must be provided.
Parameter | Type | Default | Details
---|---|---|---
zarr_files | | | zarr files that include samples
channels | | | channels to use
max_seq_len_sec | | | maximum sequence length (in seconds) to use (this is especially relevant when you are returning both stft and raw ts data to keep them in sync)
sample_seq_len_sec | | | if no sample_df, generate sequences of this length in seconds as one sample
sample_stride_sec | | | if no sample_df, stride in seconds between the starts of consecutive samples from the same array; if sample_stride_sec == sample_seq_len_sec, there is no overlap
start_offset_sec | int | 0 | number of seconds to exclude from beginning of sleep studies
trim_wake_epochs | bool | True | indicator to trim wake epochs from hypnograms, if wake is the largest class
include_partial_samples | bool | True | indicator to include data from partial samples when return_full_length is false
sample_df | NoneType | None | dataframe indicating which indices within each zarr file include a sample
frequency | int | 125 | frequency of underlying data
return_y_every_sec | int | 30 | integer value indicating the step in indexing in seconds
y_padding_mask | int | -100 | padding value added to the target; also the index to ignore when computing the loss
y_frequency | int | 125 | frequency of underlying y hypnogram data
butterworth_filters | NoneType | None | dictionary of low-pass, high-pass, and band-pass filter dictionaries to apply to channels
median_filter_kernel_size | NoneType | None | if not None, will apply median filter with this kernel size
voltage_channels | list | ['ECG', 'ECG (LL-RA)', 'EKG', 'ECG (L-R)', 'EOG(L)', 'EOG-L', 'E1', 'LOC', 'E1-M2', 'E1-AVG', 'EMG', 'cchin_l', 'chin', 'EMG (L-R)', 'EMG (1-2)', 'EMG (1-3)', 'Chin3', 'C4-M1', 'C4_M1', 'EEG', 'EEG1', 'EEG2', 'EEG3', 'C3-M2', 'C3_M2', 'C4-AVG'] | if not None, the units of these channels are checked and converted to microvolts (from mV, uV, etc.)
clip_interpolations | NoneType | None | dictionary of channels:{'phys_range':…, 'percentiles':…} for filtering and interpolation of filtered values
scale_channels | bool | False | indicator to scale channels to the mean and std of the zarr files
time_channel_scales | NoneType | None | dictionary of channel:mean and channel:std values for scaling. Should use training statistics
return_sequence_padding_mask | bool | False | indicator to return the key padding mask for attention masking
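The y_frequency, return_y_every_sec, and y_padding_mask parameters together suggest that the stored hypnogram is sampled at the signal rate, read back once per epoch, and padded to a fixed length. A sketch of that interpretation follows. The function sample_labels and the target_len argument are hypothetical, and the library's exact indexing is an assumption. The default of -100 matches the default ignore_index of torch.nn.CrossEntropyLoss, so padded positions would not contribute to that loss.

```python
import numpy as np


def sample_labels(y, y_frequency, return_y_every_sec, target_len,
                  y_padding_mask=-100):
    """Take one label every return_y_every_sec seconds, then pad to target_len.

    Illustrative only; not the library's implementation.
    """
    step = int(y_frequency * return_y_every_sec)
    labels = np.asarray(y)[::step][:target_len]
    padded = np.full(target_len, y_padding_mask, dtype=labels.dtype)
    padded[: len(labels)] = labels
    return padded


# Three 30-second epochs stored at 125 Hz (hypothetical stage values).
y = np.repeat(np.array([0, 1, 2]), 125 * 30)
targets = sample_labels(y, y_frequency=125, return_y_every_sec=30, target_len=5)
```

Here 11250 stored values reduce to three labels, and the two remaining positions are filled with the padding value.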
Plotting
plot_edf_signals
plot_edf_signals (signals, signal_names, signal_comparisons=None, use_resampler=False, normalize=False, title_text='', colorscale=None)