Consider sparse data types #8

DanielFEvans · 2020-08-27T13:17:05Z

Pandas' sparse data structures are another handy-looking memory-saving trick that fits with the theme of dtype_diet. It'd be nice if the tool considered them as an option.

The simple case would be to try a sparse column with NaN as the "omitted" value (or perhaps zero for dtypes that lack NaNs).
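As a rough sketch of that simple case (the example data and variable names here are made up for illustration, and this is not part of dtype_diet), converting a mostly-empty column with `pd.SparseDtype` might look like this:

```python
import numpy as np
import pandas as pd

# Hypothetical example data: a float column that is almost entirely NaN.
dense = pd.Series([np.nan] * 995 + [1.0, 2.0, 3.0, 4.0, 5.0])

# Use NaN as the "omitted" value; only the 5 real values are stored.
sparse_nan = dense.astype(pd.SparseDtype("float64", np.nan))

# Integer columns cannot hold NaN, so zero is used as the omitted value.
dense_int = pd.Series([0] * 995 + [1, 2, 3, 4, 5], dtype="int64")
sparse_zero = dense_int.astype(pd.SparseDtype("int64", 0))

# Compare the memory used by the values themselves (index excluded).
dense_bytes = dense.memory_usage(index=False, deep=True)
sparse_bytes = sparse_nan.memory_usage(index=False, deep=True)
print(dense_bytes, sparse_bytes, sparse_nan.sparse.density)
```

The saving depends entirely on how much of the column equals the omitted value, so a tool would presumably want to check `.sparse.density` (or the memory figures directly) before recommending the conversion.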

To get a bit more complex, Pandas lets you choose any value as the fill value, and a slightly better trick might be to use the most common value in the column as the "omitted" value. However, that might result in some silly suggestions. For example, suggesting that a column with values [1, 2, 2, 3] be made sparse by omitting '2' isn't really a great suggestion if '2' is only the most common value in the particular piece of example data being analysed.
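One possible way to avoid the silly suggestions is to only propose a fill value when it clearly dominates the column. The sketch below is just an illustration of that idea: `suggest_fill_value` and the `min_share` threshold of 0.9 are hypothetical names and numbers, not anything dtype_diet or Pandas provides.

```python
import pandas as pd


def suggest_fill_value(column, min_share=0.9):
    """Return the most common value if it dominates the column, else None.

    min_share is an assumed cut-off chosen for illustration only.
    """
    # value_counts sorts by frequency, most common first; keep NaN as a candidate.
    counts = column.value_counts(dropna=False)
    if counts.empty:
        return None
    top_value = counts.index[0]
    share = counts.iloc[0] / len(column)
    return top_value if share >= min_share else None


# The example from above: '2' is most common, but only half of the data.
small = pd.Series([1, 2, 2, 3])
print(suggest_fill_value(small))  # no suggestion for this column

# Hypothetical skewed column where one value really does dominate.
skewed = pd.Series([7] * 95 + [1, 2, 3, 4, 5])
fill = suggest_fill_value(skewed)
if fill is not None:
    skewed_sparse = skewed.astype(pd.SparseDtype(skewed.dtype, fill))
    print(fill, skewed_sparse.sparse.density)
```

A threshold like this does not solve the sampling problem, since a small sample can still be unrepresentative, but it does rule out cases like [1, 2, 2, 3] where the "most common" value is barely more common than the rest.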
