Adding Completeness Module #141
base: development
I know this has to be handled here, but we might need to give some thought to how time periods would be specified for multiple different subjects, i.e. can they only be specified via hours from recording start, etc.?
The time periods are specified by the user as times in numpy datetime format, or as 'daily', in which case it's recording start time plus 24 h, 48 h, etc. Since the pipeline works on each subject independently, any batch processing of multiple subjects would just be handled by the user looping through all subjects. Not sure if this is consistent with the rest of SKDH processing, but I think so?
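For what it's worth, a minimal sketch of how I read the 'daily' option (the start time and day count here are made up, not from the PR):

```python
import numpy as np

# Assumed recording start; in practice this would come from the data file.
start = np.datetime64("2023-01-02T08:30:00")
n_days = 3

# 'daily' boundaries: start, start + 24 h, start + 48 h, ...
bounds = start + np.arange(n_days + 1) * np.timedelta64(24, "h")

# pair consecutive boundaries into (period_start, period_end) windows
periods = list(zip(bounds[:-1], bounds[1:]))
print(periods[0])  # first window: start to start + 24 h
```

This is just to illustrate the windowing scheme, not the actual module code.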
Can this not be handled by a pandas.to_timestamp call?
It could be handled by pd.to_timedelta, but I'm not sure there's any benefit to using that over numpy, since it returns the same data format.
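To show what I mean about the two returning the same data format (a toy comparison, not code from the module):

```python
import numpy as np
import pandas as pd

# the same 24-hour offset built both ways
td_pd = pd.to_timedelta("24h")    # pandas Timedelta
td_np = np.timedelta64(24, "h")   # numpy timedelta64

# the pandas object converts losslessly to the numpy dtype and compares equal
print(td_pd.to_timedelta64() == td_np)  # True
```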
The benefit would be not requiring a function that has to be maintained by us when one already exists.
Sorry, I thought you meant the np.timedelta64 function. pd.DataFrame.to_timestamp does not do the same thing as this whole function.
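For anyone following along: pd.DataFrame.to_timestamp converts a PeriodIndex to a DatetimeIndex, which is a different operation from parsing time offsets. A toy example (data here is made up):

```python
import pandas as pd

# a frame indexed by monthly periods
df = pd.DataFrame(
    {"x": [1, 2]},
    index=pd.period_range("2023-01", periods=2, freq="M"),
)

# to_timestamp only converts the PeriodIndex to timestamps;
# it does not build or parse time deltas
out = df.to_timestamp()
print(type(out.index))  # DatetimeIndex
```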
These device-specific parsers should be wrapped into the CSV reader somehow; if that's not possible, we can discuss how to include them, or rely on the end user to generate the inputs for this.
These parsers are more like helper functions for me or other people working with these devices, to help get from the native format to the forced input format; they're not actually part of the pipeline. The native format changes a lot depending on software version, data capture platform, etc., so they should not be integrated directly into the pipeline, or they will fail every time a new data batch comes in. We could possibly move this to a private extension repo if we don't want to have it here.
I think private would be better. Or we could modify existing IO functions if there is a way to add the small functionality bits that we need. MultiReader might already handle some/most of this.
Removed the file from the repo. I believe we already have an internal private skdh extension repo that we can add it to?
see previous