Would it be possible to support not only pandas dataframe serialization/deserialization for the returned data, but also a dask dataframe for when the v3io result is much larger than memory?

I was thinking that if we are already reading chunks of these data sources into in-memory Arrow, we could serialize them to Parquet (I would prefer pure Arrow files, but dask doesn't support that right now) and then read them into a dask dataframe with read_parquet, as in the sketch below. I would love more thoughts on this, so we could support larger-than-memory dataframe operations.
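A minimal sketch of the chunked Arrow → Parquet → dask flow described above. `fetch_record_batches` is a hypothetical stand-in for whatever chunked reader the v3io/data source would expose; the rest uses only public pyarrow and dask APIs:

```python
# Sketch: spill Arrow record batches to Parquet, then open them lazily with dask.
import os
import tempfile
from typing import Iterable

import pyarrow as pa
import pyarrow.parquet as pq
import dask.dataframe as dd


def fetch_record_batches() -> Iterable[pa.RecordBatch]:
    """Hypothetical stand-in for a chunked data-source reader (not a real v3io API)."""
    for i in range(3):
        yield pa.RecordBatch.from_pydict(
            {"key": list(range(i * 100, (i + 1) * 100)),
             "value": [float(x) for x in range(100)]}
        )


def batches_to_dask(batches: Iterable[pa.RecordBatch], out_dir: str) -> dd.DataFrame:
    """Write each Arrow batch as its own Parquet file, then return a lazy dask dataframe."""
    for i, batch in enumerate(batches):
        table = pa.Table.from_batches([batch])
        pq.write_table(table, os.path.join(out_dir, f"part-{i:05d}.parquet"))
    # read_parquet only touches metadata here; partitions are loaded on demand,
    # so the full result never has to fit in memory at once.
    return dd.read_parquet(out_dir)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as out_dir:
        ddf = batches_to_dask(fetch_record_batches(), out_dir)
        print(ddf.npartitions, ddf["value"].sum().compute())
```

One Parquet file per Arrow batch maps naturally onto dask partitions, so chunk sizing on the read side directly controls partition sizing downstream.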