See the following publication for more details on the data collection and processing. Please cite this paper when using the dataset. Sample training scripts are provided to show how to use the dataset, along with the model architectures.
@inproceedings{xing2018enabling,
title={Enabling Edge Devices that Learn from Each Other: Cross Modal Training for Activity Recognition},
author={Xing, Tianwei and Sandha, Sandeep Singh and Balaji, Bharathan and Chakraborty, Supriyo and Srivastava, Mani},
booktitle={Proceedings of the 1st International Workshop on Edge Systems, Analytics and Networking},
pages={37--42},
year={2018},
organization={ACM}
}
The CMActivities dataset contains video, audio, and IMU modalities collected with two smartphones while users performed activities. We refer to the users performing the activities as performers hereafter. An observer held the first smartphone, which recorded and timestamped video (along with audio) of the performers; in this way the first smartphone acted as an ambient sensor capturing the performer. The second smartphone was kept in the performer's front trouser pocket and was used to timestamp the IMU data captured from sensors worn on the performer's left and right wrists. Both smartphones were synchronized using NTP.
The CMActivities dataset is collected from two performers doing seven activities (upstairs, downstairs, walk, run, jump, wash hand, and jumping jack). Each data collection session lasted roughly 10 seconds, during which the performer performed a single activity. The training split is generated from 624 training sessions and the test split from 71 test sessions; each split contains data from both performers. The validation split is generated from a portion of the training sessions.
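For orientation, here is a minimal sketch of how the seven activity classes could be encoded as integer labels when loading a split. The file name and `.npz` layout below are assumptions for illustration and do not describe the released file format.

```python
import numpy as np

# Hypothetical label mapping for the seven CMActivities classes.
# The encoding used in the released samples may differ.
ACTIVITIES = ["upstairs", "downstairs", "walk", "run", "jump", "wash hand", "jumping jack"]
LABEL_TO_ID = {name: i for i, name in enumerate(ACTIVITIES)}

def load_split(path):
    """Load one split; assumes a .npz archive with 'audio', 'imu', and 'labels' arrays."""
    data = np.load(path, allow_pickle=True)
    return data["audio"], data["imu"], data["labels"]

# Example (hypothetical file name):
# X_audio, X_imu, y = load_split("train_samples.npz")
```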
- Windowed audio and IMU samples are released. The audio samples are processed and 193 features are extracted; the IMU samples are provided in raw form (a hedged sketch of a matching feature-extraction recipe appears after this list).
- The training, validation, and test samples are available below.
- Training samples download
- Validation samples download
- Test samples download
- Other samples from CMActivities dataset: Transfer samples download, Limited train download, Personalization download
- More data processing details are in the publication.
- We plan to release the video samples using the intermediate representation soon.
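The exact 193-feature audio pipeline is described in the publication; as a point of reference, one common recipe that yields exactly 193 features per window is 40 MFCCs, 12 chroma bins, 128 mel-spectrogram bands, 7 spectral-contrast bands, and 6 tonnetz features, each averaged over time. The sketch below implements that recipe with librosa; it is an assumption about the processing, not a copy of the released pipeline.

```python
import librosa
import numpy as np

def extract_audio_features(y, sr):
    """Extract a 193-dimensional feature vector from one audio window.

    Assumes the common 40 MFCC + 12 chroma + 128 mel + 7 contrast + 6 tonnetz
    recipe; the released samples may have been produced differently.
    """
    stft = np.abs(librosa.stft(y))
    mfcc = np.mean(librosa.feature.mfcc(y=y, sr=sr, n_mfcc=40), axis=1)              # 40
    chroma = np.mean(librosa.feature.chroma_stft(S=stft, sr=sr), axis=1)              # 12
    mel = np.mean(librosa.feature.melspectrogram(y=y, sr=sr), axis=1)                 # 128
    contrast = np.mean(librosa.feature.spectral_contrast(S=stft, sr=sr), axis=1)      # 7
    tonnetz = np.mean(librosa.feature.tonnetz(y=librosa.effects.harmonic(y), sr=sr), axis=1)  # 6
    return np.hstack([mfcc, chroma, mel, contrast, tonnetz])  # shape: (193,)

# Example (hypothetical path):
# y, sr = librosa.load("session_clip.wav")
# features = extract_audio_features(y, sr)
```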
- Audio Model Training Notebook (illustrative model sketches for both notebooks follow the results below)
- Training accuracy: 99.9%
- Validation accuracy: 99.5%
- Test accuracy: 90.9%
- IMU Model Training Notebook
- Training accuracy: 99.8%
- Validation accuracy: 95%
- Test accuracy: 90.5%
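For orientation, here is a minimal Keras sketch of what the two classifiers in the notebooks above could look like: a dense network over the 193 audio features and a 1D-CNN over raw IMU windows. The layer sizes, IMU window shape (100 timesteps x 12 channels), and training settings are assumptions for illustration, not the architectures used in the notebooks.

```python
import tensorflow as tf

NUM_CLASSES = 7  # upstairs, downstairs, walk, run, jump, wash hand, jumping jack

# Illustrative dense classifier over the 193-dimensional audio feature vectors.
audio_model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(193,)),
    tf.keras.layers.Dense(256, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])

# Illustrative 1D-CNN over raw IMU windows. The window length (100 timesteps)
# and channel count (12, e.g. 3-axis accel + gyro for both wrists) are assumptions.
imu_model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(100, 12)),
    tf.keras.layers.Conv1D(64, kernel_size=5, activation="relu"),
    tf.keras.layers.MaxPooling1D(pool_size=2),
    tf.keras.layers.Conv1D(64, kernel_size=5, activation="relu"),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])

for model in (audio_model, imu_model):
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
# e.g. audio_model.fit(X_audio_train, y_train, validation_data=(X_audio_val, y_val))
```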
The fusion and time-shift augmentation notebooks below accompany the following publication.
@inproceedings{sandha2020time,
title={Time awareness in deep learning-based multimodal fusion across smartphone platforms},
author={Sandha, Sandeep Singh and Noor, Joseph and Anwar, Fatima M and Srivastava, Mani},
booktitle={2020 IEEE/ACM Fifth International Conference on Internet-of-Things Design and Implementation (IoTDI)},
pages={149--156},
year={2020},
organization={IEEE}
}
- Fusion_Training_Vanilla.ipynb: Vanilla fusion training code with the audio and IMU modalities time-synced (an illustrative fusion-model sketch appears after the note below).
- Augmentation_Data_generation.ipynb: Creates augmented training data by introducing artificial timing errors between the audio and IMU modalities (a minimal sketch of the time-shift idea also follows the note below).
- Fusion_Training_Augmented_1000ms.ipynb: Trains the fusion model with 1000 ms time-shift augmentation.
- Testing_Time_Shifting_1000ms.ipynb: Tests the vanilla and augmented models by introducing timing errors into the test data.
Note: Notebook-1 (Fusion_Training_Vanilla.ipynb) directly uses the data samples available for download. Notebook-2 (Augmentation_Data_generation.ipynb) creates the augmented samples used by Notebook-3 (Fusion_Training_Augmented_1000ms.ipynb). Notebook-4 (Testing_Time_Shifting_1000ms.ipynb) uses the models trained by Notebook-1 and Notebook-3, along with the available test data samples.
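For reference, below is a minimal sketch of the kind of late-fusion model that Fusion_Training_Vanilla.ipynb trains: separate audio and IMU branches whose representations are concatenated before a shared classifier. The input shapes and layer sizes are assumptions and do not copy the notebook.

```python
import tensorflow as tf

NUM_CLASSES = 7

# Illustrative fusion model: the 193-D audio feature vector and a raw IMU window
# (assumed 100 timesteps x 12 channels) are encoded separately, concatenated,
# and classified jointly.
audio_in = tf.keras.Input(shape=(193,), name="audio_features")
audio_branch = tf.keras.layers.Dense(128, activation="relu")(audio_in)

imu_in = tf.keras.Input(shape=(100, 12), name="imu_window")
x = tf.keras.layers.Conv1D(64, kernel_size=5, activation="relu")(imu_in)
x = tf.keras.layers.GlobalAveragePooling1D()(x)
imu_branch = tf.keras.layers.Dense(128, activation="relu")(x)

fused = tf.keras.layers.Concatenate()([audio_branch, imu_branch])
fused = tf.keras.layers.Dense(128, activation="relu")(fused)
out = tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(fused)

fusion_model = tf.keras.Model(inputs=[audio_in, imu_in], outputs=out)
fusion_model.compile(optimizer="adam",
                     loss="sparse_categorical_crossentropy",
                     metrics=["accuracy"])
# fusion_model.fit([X_audio_train, X_imu_train], y_train, ...)
```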
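And here is a minimal sketch of the time-shift idea behind Augmentation_Data_generation.ipynb and Testing_Time_Shifting_1000ms.ipynb: the IMU window is deliberately misaligned relative to the audio-aligned start by a random offset of up to +/- 1000 ms. The IMU sampling rate, window length, and function names are assumptions and do not mirror the notebook code.

```python
import numpy as np

IMU_RATE_HZ = 50   # assumed IMU sampling rate
WINDOW_LEN = 100   # assumed window length in IMU samples (2 s at 50 Hz)

def shifted_imu_window(imu_stream, start_idx, max_shift_ms=1000, rng=None):
    """Return an IMU window whose start is offset from the audio-aligned start
    by a random shift of up to +/- max_shift_ms, clipped to the stream bounds."""
    if rng is None:
        rng = np.random.default_rng()
    max_shift_samples = int(max_shift_ms / 1000.0 * IMU_RATE_HZ)
    shift = rng.integers(-max_shift_samples, max_shift_samples + 1)
    start = int(np.clip(start_idx + shift, 0, len(imu_stream) - WINDOW_LEN))
    return imu_stream[start:start + WINDOW_LEN]

# Example: pair each audio window with a deliberately misaligned IMU window,
# then train (or test) the fusion model on the resulting
# (audio_features, shifted_imu, label) triples.
```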