
Merge branch 'develop'
AlessioZanga committed Jul 12, 2020
2 parents d9bb7a5 + cec0590 commit a88f62a
Showing 25 changed files with 926 additions and 62 deletions.
2 changes: 1 addition & 1 deletion Makefile
@@ -10,7 +10,7 @@ tuh_eeg_artifact:

tuh_eeg_seizure:
	echo "Request your access password at: https://www.isip.piconepress.com/projects/tuh_eeg/html/request_access.php"
	rsync -auxvL [email protected]:~/data/tuh_eeg_seizure/v1.5.1 data/tuh_eeg_seizure
	rsync -auxvL [email protected]:~/data/tuh_eeg_seizure/v1.5.2 data/tuh_eeg_seizure

eegmmidb:
	wget -r -N -c -np https://physionet.org/files/eegmmidb/1.0.0/ -P data
50 changes: 44 additions & 6 deletions README.md
@@ -47,15 +47,53 @@ If you need a bleeding edge version, you can install it directly from GitHub:

The following datasets will work once downloaded:

* [Temple University Abnormal EEG Dataset](https://www.isip.piconepress.com/projects/tuh_eeg/html/downloads.shtml)
* [Temple University Artifact EEG Dataset](https://www.isip.piconepress.com/projects/tuh_eeg/html/downloads.shtml)
* [EEG Motor Movement/Imagery Dataset](https://physionet.org/content/eegmmidb/1.0.0/)
| Dataset | Size | Class Distribution | Task | Notes |
|---------|---------------:|:------------------------|------|-------|
| [TUH Abnormal EEG Dataset](https://www.isip.piconepress.com/projects/tuh_eeg/html/downloads.shtml) | 59.0 GB | 'normal': 1521 <br /> 'abnormal': 1472 | Generic abnormal EEG events vs. normal EEG traces. | This dataset does not contain any annotations; event extraction follows other papers that used this dataset: for each record, a 60s sample is extracted and labelled with the class of the file. |
| [TUH Artifact EEG Dataset](https://www.isip.piconepress.com/projects/tuh_eeg/html/downloads.shtml) | 5.5 GB | 'null': 1940 <br /> 'eyem': 606 <br /> 'musc': 254 <br /> 'elpp': 178 <br /> 'chew': 161 <br /> 'shiv': 60 | Multiple artifacts vs. EEG baseline. | At the moment, only the '01_tcp_ar' EEG reference setup can be used (more than ~95% of all records). |
| [TUH Seizure EEG Dataset](https://www.isip.piconepress.com/projects/tuh_eeg/html/downloads.shtml) | 54.0 GB | 'fnsz': 4240 <br /> 'gnsz': 1717 <br /> 'cpsz': 1496 <br /> 'tnsz': 334 <br /> 'tcsz': 191 <br /> 'mysz': 6 <br /> 'absz': 2 | Generic unclassified seizure type vs. specific seizure types. | At the moment, only the '01_tcp_ar' EEG reference setup can be used (more than ~95% of all records). <br /> Also, the 'bckg' and 'scpz' classes are ignored: the former is just (a lot of) background noise, while the latter has a single instance, which cannot be used with stratified cross-validation. |
| [Motor Movement/Imagery EEG Dataset](https://physionet.org/content/eegmmidb/1.0.0/) | 3.4 GB | | Motor movement / imagery events. | This dataset grows considerably during preprocessing: although its download size is fairly small, its records are fully annotated, so the whole dataset is suitable for feature extraction, not just sparse events as in the other datasets. |
| [CHB-MIT Scalp EEG Dataset](https://physionet.org/content/chbmit/1.0.0/) | 43.0 GB | 'noseizure': 545 <br /> 'seizure': 184 | No-seizure events vs. seizure events. | While 'seizure' events come with (begin, end, label) annotations, the 'noseizure' class is built by extracting a 60s sample from records flagged as 'noseizure'. |
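
As a quick reference, here is a minimal sketch of how one of these datasets can be loaded with PyEEGLab (assuming the data has already been downloaded to `data/tuh_eeg_abnormal/v2.0.0/edf`, and reusing the pipeline stages from `examples/tensorboard/example_tensorboard.py`; the cache directory is an arbitrary choice):

```python
from pyeeglab import (
    TUHEEGAbnormalDataset, PickleCache, Pipeline, CommonChannelSet,
    LowestFrequency, ToDataframe, MinMaxCentralizedNormalization,
    DynamicWindow, ToNumpy
)

# Point the dataset at the downloaded EDF files and cache preprocessed results
dataset = TUHEEGAbnormalDataset('data/tuh_eeg_abnormal/v2.0.0/edf')
dataset.set_cache_manager(PickleCache('export'))

# Keep the channels common to all records, resample to the lowest frequency,
# normalize, and split each sample into 8 frames
preprocessing = Pipeline([
    CommonChannelSet(),
    LowestFrequency(),
    ToDataframe(),
    MinMaxCentralizedNormalization(),
    DynamicWindow(8),
    ToNumpy()
])

# As in the example script, load() returns a dict with 'data', 'labels'
# and 'labels_encoder' entries
data = dataset.set_pipeline(preprocessing).load()
```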

## Class Meanings - From the TUH Seizure docs

| **Class&nbsp;Code** | **Event&nbsp;Name** | **Description** |
| -------------- | -------------------------------------------- | ------------------------------------------------------------------------------------------------------------------ |
| _NULL_ | No Event | An unclassified event |
| _SPSW_ | Spike/Sharp and Wave | Spike and wave/complexes, sharp and wave/complexes |
| _GPED_ | Generalized Periodic Epileptiform Discharges | Diffused periodic discharges |
| _PLED_ | Periodic Lateralized Epileptiform Discharges | Focal periodic discharges |
| _EYBL_ | Eye blink | A specific type of sharp, high amplitude eye movement artifact corresponding to blinks |
| _ARTF_ | Artifacts (All) | Any non-brain activity electrical signal, such as those due to equipment or environmental factors |
| _BCKG_ | Background | Baseline/non-interesting events |
| _SEIZ_ | Seizure | Common seizure class which can include all types of seizure |
| _FNSZ_ | Focal Non-Specific Seizure | Focal seizures whose type cannot be specified |
| _GNSZ_ | Generalized Non-Specific Seizure | Generalized seizures which cannot be further classified into one of the groups below |
| _SPSZ_ | Simple Partial Seizure | Partial seizures during consciousness; Type specified by clinical signs only |
| _CPSZ_ | Complex Partial Seizure | Partial Seizures during unconsciousness; Type specified by clinical signs only |
| _ABSZ_ | Absence Seizure | Absence Discharges observed on EEG; patient loses consciousness for few seconds (Petit Mal) |
| _TNSZ_ | Tonic Seizure | Stiffening of body during seizure (EEG effects disappear) |
| _CNSZ_ | Clonic Seizure | Jerking/shivering of body during seizure |
| _TCSZ_ | Tonic Clonic Seizure | At first stiffening and then jerking of body (Grand Mal) |
| _ATSZ_ | Atonic Seizure | Sudden loss of muscle tone |
| _MYSZ_ | Myoclonic Seizure | Myoclonic jerks of limbs |
| _NESZ_ | Non-Epileptic Seizure | Any non-epileptic seizure observed. Contains no electrographic signs. |
| _INTR_ | Interesting Patterns | Any unusual or interesting patterns observed that don't fit into the above classes. |
| _SLOW_ | Slowing | A brief decrease in frequency |
| _EYEM_ | Eye Movement Artifact | A very common frontal/prefrontal artifact seen when the eyes move |
| _CHEW_ | Chewing Artifact | A specific artifact involving multiple channels that corresponds with patient chewing, “bursty” |
| _SHIV_ | Shivering Artifact | A specific, sustained sharp artifact that corresponds with patient shivering. |
| _MUSC_ | Muscle Artifact | A very common, high frequency, sharp artifact that corresponds with agitation/nervousness in a patient. |
| _ELPP_ | Electrode Pop Artifact | A short artifact characterized by channels using the same electrode “spiking” with perfect symmetry. |
| _ELST_ | Electrostatic Artifact | Artifact caused by movement or interference on the electrodes, variety of morphologies. |
| _CALB_ | Calibration Artifact | Artifact caused by calibration of the electrodes. Appears as a flattening of the signal in the beginning of files. |
| _HPHS_ | Hypnagogic Hypersynchrony | A brief period of high amplitude slow waves. |
| _TRIP_ | Triphasic Wave | Large, three-phase waves frequently caused by an underlying metabolic condition. |
| _ELEC_ | Electrode Artifact | Electrode pop, Electrostatic artifacts, Lead artifacts. |

## How to Get a Dataset

> **WARNING (1)**: Retrieving the TUH EEG Abnormal dataset requires at least 65 GB of free disk space.
> **WARNING (2)**: Retrieving the TUH EEG Abnormal dataset requires valid credentials; you can get your own at https://www.isip.piconepress.com/projects/tuh_eeg/html/request_access.php.
> **WARNING**: Retrieving the TUH EEG datasets requires valid credentials; you can get your own at: https://www.isip.piconepress.com/projects/tuh_eeg/html/request_access.php.

In the root directory of this project there is a Makefile; by typing:

1 change: 1 addition & 0 deletions examples/tensorboard/.gitignore
@@ -0,0 +1 @@
logs*
230 changes: 230 additions & 0 deletions examples/tensorboard/example_tensorboard.py
@@ -0,0 +1,230 @@
#!/usr/bin/env python

# Ignore MNE and TensorFlow warnings
import warnings
warnings.simplefilter(action='ignore')

# Import TensorFlow with GPU memory settings
import tensorflow as tf
gpus = tf.config.experimental.list_physical_devices('GPU')
try:
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)
except RuntimeError as e:
    print(e)

# Import TensorBoard params and metrics
from tensorboard.plugins.hparams import api as hp
from tensorflow.keras.metrics import CategoricalAccuracy, Precision, Recall

# Import Spektral for GraphAttention
import spektral as sp

# Other imports
import os
import pickle
import numpy as np
from random import shuffle
from itertools import product
from networkx import to_numpy_matrix
from sklearn.model_selection import train_test_split
from tensorflow.python.keras.utils.np_utils import to_categorical

# Relative import PyEEGLab
import sys
from os.path import abspath, dirname, join

sys.path.insert(0, abspath(join(dirname(__file__), '../..')))
from pyeeglab import *

def build_data(dataset):
    dataset.set_cache_manager(PickleCache('../../export'))

    preprocessing = Pipeline([
        CommonChannelSet(),
        LowestFrequency(),
        ToDataframe(),
        MinMaxCentralizedNormalization(),
        DynamicWindow(8),
        ForkedPreprocessor(
            inputs=[
                SpearmanCorrelation(),
                Mean(),
                Variance(),
                Skewness(),
                Kurtosis(),
                ZeroCrossing(),
                AbsoluteArea(),
                PeakToPeak(),
                Bandpower(['Delta', 'Theta', 'Alpha', 'Beta'])
            ],
            output=ToMergedDataframes()
        ),
        ToNumpy()
    ])

    return dataset.set_pipeline(preprocessing).load()

def adapt_data(data, test_size=0.1, shuffle=True):
    if isinstance(data, str):
        with open(data, 'rb') as f:
            data = pickle.load(f)
    samples, labels = data['data'], data['labels']
    x_train, x_test, y_train, y_test = train_test_split(samples, labels, test_size=test_size, shuffle=shuffle, stratify=labels)
    x_train, x_val, y_train, y_val = train_test_split(x_train, y_train, test_size=test_size, shuffle=shuffle, stratify=y_train)
    classes = np.sort(np.unique(labels))
    y_train = to_categorical(y_train, num_classes=len(classes))
    y_test = to_categorical(y_test, num_classes=len(classes))
    y_val = to_categorical(y_val, num_classes=len(classes))
    return x_train, y_train, x_val, y_val, x_test, y_test

def build_model(shape, classes, hparams):
    print(hparams)
    N = shape[2]
    F = shape[3] - N
    frames = shape[1]

    def get_feature_matrix(x, frame, N, F):
        x = tf.slice(x, [0, frame, 0, N], [-1, 1, N, F])
        x = tf.squeeze(x, axis=[1])
        return x

    def get_correlation_matrix(x, frame, N, F):
        x = tf.slice(x, [0, frame, 0, 0], [-1, 1, N, N])
        x = tf.squeeze(x, axis=[1])
        return x

    input_0 = tf.keras.Input((frames, N, F + N))

    gans = []
    for frame in range(frames):
        feature_matrix = tf.keras.layers.Lambda(
            get_feature_matrix,
            arguments={'frame': frame, 'N': N, 'F': F}
        )(input_0)

        correlation_matrix = tf.keras.layers.Lambda(
            get_correlation_matrix,
            arguments={'frame': frame, 'N': N, 'F': F}
        )(input_0)

        x = sp.layers.GraphAttention(hparams['output_shape'])([feature_matrix, correlation_matrix])
        x = tf.keras.layers.Flatten()(x)
        gans.append(x)

    combine = tf.keras.layers.Concatenate()(gans)
    reshape = tf.keras.layers.Reshape((frames, N * hparams['output_shape']))(combine)
    lstm = tf.keras.layers.LSTM(hparams['hidden_units'])(reshape)
    dropout = tf.keras.layers.Dropout(hparams['dropout'])(lstm)
    out = tf.keras.layers.Dense(classes, activation='softmax')(dropout)

    model = tf.keras.Model(inputs=[input_0], outputs=out)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=hparams['learning_rate']),
        loss='categorical_crossentropy',
        metrics=[
            'accuracy',
            Recall(class_id=0, name='recall'),
            Precision(class_id=0, name='precision'),
        ]
    )
    model.summary()
    return model

def run_trial(path, step, model, hparams, x_train, y_train, x_val, y_val, x_test, y_test, epochs):
    with tf.summary.create_file_writer(path).as_default():
        hp.hparams(hparams)
        model.fit(x_train, y_train, epochs=epochs, batch_size=32, shuffle=True, validation_data=(x_val, y_val))
        loss, accuracy, recall, precision = model.evaluate(x_test, y_test)
        tf.summary.scalar('accuracy', accuracy, step=step)
        tf.summary.scalar('recall', recall, step=step)
        tf.summary.scalar('precision', precision, step=step)

def hparams_combinations(hparams):
    hp.hparams_config(
        hparams=list(hparams.values()),
        metrics=[
            hp.Metric('accuracy', display_name='Accuracy'),
            hp.Metric('recall', display_name='Recall'),
            hp.Metric('precision', display_name='Precision'),
        ]
    )
    hparams_keys = list(hparams.keys())
    hparams_values = list(product(*[
        h.domain.values
        for h in hparams.values()
    ]))
    hparams = [
        dict(zip(hparams_keys, values))
        for values in hparams_values
    ]
    shuffle(hparams)
    return hparams

def tune_model(dataset_name, data):
    LOGS_DIR = join('./logs/generic', dataset_name)
    os.makedirs(LOGS_DIR, exist_ok=True)
    # Prepare the data
    x_train, y_train, x_val, y_val, x_test, y_test = adapt_data(data)
    # Set tuning session
    counter = 0
    # Parameters to be tuned
    hparams = {
        'learning_rate': [1e-4, 5e-4, 1e-3],
        'hidden_units': [8, 16, 32, 64],
        'output_shape': [8, 16, 32, 64],
        'dropout': [0.00, 0.05, 0.10, 0.15, 0.20],
    }
    hparams = {
        key: hp.HParam(key, hp.Discrete(value))
        for key, value in hparams.items()
    }
    hparams = hparams_combinations(hparams)
    for hparam in hparams:
        # Build the model
        model = build_model(data['data'].shape, len(data['labels_encoder']), hparam)
        # Run session
        run_name = f'run-{counter}'
        print(f'--- Starting trial: {run_name}')
        print(hparam)
        run_trial(
            join(LOGS_DIR, run_name),
            counter,
            model,
            hparam,
            x_train,
            y_train,
            x_val,
            y_val,
            x_test,
            y_test,
            epochs=50
        )
        counter += 1


if __name__ == '__main__':
    dataset = {}

    dataset['tuh_eeg_abnormal'] = TUHEEGAbnormalDataset('../../data/tuh_eeg_abnormal/v2.0.0/edf')

    dataset['tuh_eeg_artifact'] = TUHEEGArtifactDataset('../../data/tuh_eeg_artifact/v1.0.0/edf')
    dataset['tuh_eeg_artifact'].set_minimum_event_duration(4)

    dataset['tuh_eeg_seizure'] = TUHEEGSeizureDataset('../../data/tuh_eeg_seizure/v1.5.2/edf')
    dataset['tuh_eeg_seizure'].set_minimum_event_duration(4)

    # dataset['eegmmidb'] = EEGMMIDBDataset('../../data/physionet.org/files/eegmmidb/1.0.0')
    # dataset['eegmmidb'].set_minimum_event_duration(4)

    dataset['chbmit'] = CHBMITDataset('../../data/physionet.org/files/chbmit/1.0.0')
    dataset['chbmit'].set_minimum_event_duration(4)

"""
Note: You can just use paths as values in the dictionary
and comment-out the first line of the following for cycle ;)
"""

    for key, value in dataset.items():
        value = build_data(value)
        tune_model(key, value)