Add support for PooledArray back into the package. #81
Comments
The thinking was that promotion between PooledArrays of different Ref types (UInt8 .. UInt64) was too expensive in terms of real CSV read performance (you just want to read the CSV first; if you're compiling the function 4 times for every string column, that's bad). In that context it only makes sense to convert string columns to PooledArrays after the column has been read as a StringVector, at which point you might as well do it outside TextParse (e.g. in JuliaDB).
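To illustrate the "pool after reading" approach described above, here is a minimal sketch in Python rather than Julia. It is not the PooledArrays.jl or TextParse API: `pool_column`, `smallest_ref_type`, and `REF_TYPES` are hypothetical names made up for this example. The point it shows is that once the whole column is in memory, the number of distinct values is known, so the reference width can be picked once instead of being promoted while parsing.

```python
from array import array

# Hypothetical table of unsigned reference widths, narrowest first:
# (label, array typecode, number of distinct values it can index).
REF_TYPES = [
    ("UInt8", "B", 2**8),
    ("UInt16", "H", 2**16),
    ("UInt32", "I", 2**32),
    ("UInt64", "Q", 2**64),
]


def smallest_ref_type(n_unique):
    """Return (label, typecode) of the narrowest ref type that fits n_unique values."""
    for label, typecode, capacity in REF_TYPES:
        if n_unique <= capacity:
            return label, typecode
    raise ValueError("too many distinct values to pool")


def pool_column(values):
    """Dictionary-encode an already-parsed string column.

    Returns (pool, refs, ref_label). Refs are 0-based indices into pool.
    """
    pool = []    # distinct values in first-seen order
    index = {}   # value -> position in pool
    codes = []   # one integer code per row
    for value in values:
        code = index.get(value)
        if code is None:
            code = len(pool)
            index[value] = code
            pool.append(value)
        codes.append(code)
    # The ref width is chosen once, after every value has been seen.
    label, typecode = smallest_ref_type(len(pool))
    return pool, array(typecode, codes), label


column = ["NY", "CA", "NY", "TX", "CA"]  # stands in for a parsed string column
pool, refs, ref_label = pool_column(column)
```

Here `pool` holds the three distinct strings, `refs` holds one small integer per row, and `ref_label` reports the width that was selected. Parsing directly into a pooled column would instead have to widen `refs` whenever the pool outgrows the current width, which is the promotion cost the comment refers to.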
Why not default to The advantage of parsing directly to
Another thing that would be neat is to be able to specify this separately for different columns. We are planning/thinking a lot about whole-query optimization in Query.jl right now, and I can pretty easily see a scenario where, for example, something like
This commit removed support for it entirely; it's not clear to me why. There are some very legitimate use cases where I think it would be great to get PooledArrays, so I would be in favor of just adding that feature back in. Perfectly fine to have it behind an optional switch.