-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
improve paging control #183
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
@trangdata Thanks a lot - this looks good, but there is one shortcoming: I have to do it essentially manually. I have to get the number of pages (an example should show how) and then iterate through the pages. I still think, that an automatic saving (and I will come to the format in a second) would be the best. The following should work for an automatic workflow:
Regarding the format of saving: I was also thinking of To offer the flexibility of saving in different formats, one could introduce in oa_request() But these are many options, and I think that where and how these parameter / options are set should be discussed in the new discussion. One final point: I see the |
I agree. We should include an example to calculate the expected number of pages. Perhaps you could contribute a vignette on "Paging control" given your use case?
I think we want to stay consistent with the behavior of On your larger point of providing an option for saving the individual pages, I still want to keep any serialization/IO stuff outside of the package. Especially with the flexibility that we would potentially have to support: directory, file name/type, save functions, number of pages to save in each iteration, and other options like you said, I think it's best to leave this on the user's side of things. |
Resolves #166.
Hi @rkrug, could you try installing this branch and see if this works for your use case in #166?
For example, what you can now do is specifying the pages:
One thing I'd like to note is that concatenating these
rds
files later may still raise memory issues if you're trying to do this in R. I recommend checking outarrow::write_parquet
to save these outputs as parquet files which would likely make it easier to combine later, potentially outside of R.