Status of MLDataPattern porting #3

Closed
17 tasks done
CarloLucibello opened this issue Dec 27, 2021 · 7 comments · Fixed by #196

Comments

@CarloLucibello
Member

CarloLucibello commented Dec 27, 2021

A list of what is currently exported from MLDataPattern.jl.

TO PORT

NOT TO BE PORTED

  • BufferGetObs
  • RandomObs, RandomBatches
  • BalancedObs
  • FoldView
  • targets
  • eachtarget
@CarloLucibello CarloLucibello changed the title Status of MLDataPattern porting Status of MLDataPattern and MLLabelUtils porting Jan 30, 2022
@CarloLucibello CarloLucibello changed the title Status of MLDataPattern and MLLabelUtils porting Status of MLDataPattern porting Jan 30, 2022
@CarloLucibello
Member Author

We can consider this essentially done.

@rmkn85

rmkn85 commented Aug 4, 2022

Hi, what about stratifiedobs and slidingwindow?
Were they explicitly excluded on purpose?
Thanks

@CarloLucibello
Member Author

Not really; we just didn't port code that we weren't sure would be useful. I think stratifiedobs should go in. I'm less sure about slidingwindow, but I haven't looked much into it or into the alternatives in the ecosystem.

@CarloLucibello CarloLucibello reopened this Aug 5, 2022
@rmkn85

rmkn85 commented Aug 5, 2022

Just to clarify, I came here specifically for the missing stratifiedobs.
It is needed to replicate the behaviour of Python's sklearn.model_selection.train_test_split([...], stratify=y), where y holds the class labels.

I asked about slidingwindow in passing, since it was the only other unchecked item that is not in the list of explicitly "not to be ported", but I don't have a use case for it.
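
For readers unfamiliar with the term, a stratified split partitions the observations so that each class keeps roughly the same proportion in every part. The following is a minimal Python sketch of that idea, not the implementation of stratifiedobs or of scikit-learn; the function name `stratified_split` and its parameters are hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.25, seed=0):
    """Hypothetical helper: return (train_idx, test_idx) index lists.

    Indices are grouped by class label, and each group is split
    separately so both parts keep roughly the same class proportions.
    """
    rng = random.Random(seed)

    # Group observation indices by their class label.
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)

    train_idx, test_idx = [], []
    for idx in by_class.values():
        rng.shuffle(idx)  # shuffle within the class only
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


# 8 observations of class "a" and 4 of class "b" (a 2:1 ratio).
labels = ["a"] * 8 + ["b"] * 4
train_idx, test_idx = stratified_split(labels, test_fraction=0.25)
```

With these inputs the test part receives two "a" observations and one "b", so the 2:1 class ratio is preserved in both parts. Which particular indices land in each part depends on the seed.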

@kpa28-git
Copy link

I use slidingwindow often for time series data. I haven't looked hard for a replacement, but the closest I've found is IterTools.jl's partition. It has a similar interface but returns an iterator of tuples.

@kpa28-git

I also found DSP.Periodograms.arraysplit, which is similar to slidingwindow, except that you set the overlap instead of the stride. So far slidingwindow is the fastest of the three because it returns views.
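
The stride and overlap parameterisations mentioned above are related by stride = window size - overlap. The Python sketch below illustrates only that relationship; the function names `sliding_windows` and `windows_by_overlap` are hypothetical and do not reproduce the API of slidingwindow, partition, or arraysplit. Python list slices are copies, so this sketch does not reproduce the view-based behaviour credited above for slidingwindow's speed.

```python
def sliding_windows(seq, size, stride=1):
    """Hypothetical helper: full windows of length `size`, starting every `stride` items."""
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    # Only windows that fit entirely inside the sequence are produced.
    return [seq[start:start + size]
            for start in range(0, len(seq) - size + 1, stride)]


def windows_by_overlap(seq, size, overlap):
    """Same windows, specified by how many items adjacent windows share."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    return sliding_windows(seq, size, stride=size - overlap)


data = list(range(6))
by_stride = sliding_windows(data, size=3, stride=2)
by_overlap = windows_by_overlap(data, size=3, overlap=1)
```

Both calls yield the windows [0, 1, 2] and [2, 3, 4], since a window size of 3 with an overlap of 1 corresponds to a stride of 2.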

@CarloLucibello
Member Author

Stratified split has been implemented in #195.
