
Feature: support Dask Dataframes for larger than memory returns #569

Open
ihs-nick opened this issue Nov 24, 2020 · 0 comments
Would it be possible to support not only pandas dataframe serialization/deserialization for the return, but also a Dask dataframe for when the v3io return is much larger than memory?

I was thinking that perhaps, since we are already reading chunks of these data sources into in-memory Arrow, we could serialize those chunks to Parquet (I would prefer pure Arrow files, but Dask doesn't support that right now), and finally read them into a Dask dataframe with read_parquet. Would love more thoughts on this, so we could support larger-than-memory dataframe operations.
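A minimal sketch of what I have in mind, not tied to the current API. `iter_record_batches` is a hypothetical stand-in for however the client streams v3io chunks into Arrow today; everything else is standard pyarrow/dask:

```python
import os
import tempfile

import pyarrow as pa
import pyarrow.parquet as pq
import dask.dataframe as dd


def read_as_dask(iter_record_batches):
    """Spill a stream of pyarrow.RecordBatch objects to Parquet files,
    then return a lazy, larger-than-memory dask.dataframe over them."""
    spill_dir = tempfile.mkdtemp(prefix="v3io_spill_")
    for i, batch in enumerate(iter_record_batches):
        # One Parquet file per chunk keeps peak memory bounded to a single batch.
        table = pa.Table.from_batches([batch])
        pq.write_table(table, os.path.join(spill_dir, f"part-{i:05d}.parquet"))
    # Dask reads the directory lazily, one partition per file.
    return dd.read_parquet(spill_dir)
```

The nice part is that the per-chunk Arrow-to-Parquet step is the only thing that has to fit in memory; Dask only materializes partitions as they are needed.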
