[pytorch] Add encoders for X #1219

Open
ebezzi opened this issue Jul 1, 2024 · 1 comment
Labels
P0 Priority 0 - Critical, fix ASAP! pytorch tileDB work user request

Comments

Member

ebezzi commented Jul 1, 2024

Description

It would be useful to add custom encoders not only for obs, but also for the X matrix. This would, for example, let users choose between a sparse and a dense matrix without relying on a flag (currently return_sparse_X), support different sparse matrix formats, etc.
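
To make the idea concrete, here is a minimal sketch of what a pluggable X encoder could look like. Everything in it is an assumption for illustration: the names `XEncoder`, `DenseXEncoder`, `CSRXEncoder`, and the `encode` signature are hypothetical and not part of the existing loader API, and plain Python lists and dicts stand in for tensors to keep the example dependency-free.

```python
from typing import Protocol

# Hypothetical stand-in for a batch of X: each row is a dict mapping
# column (var) index -> nonzero value.
SparseRow = dict[int, float]


class XEncoder(Protocol):
    """Hypothetical interface: converts a batch of sparse rows to an output format."""

    def encode(self, rows: list[SparseRow], n_vars: int) -> object: ...


class DenseXEncoder:
    """Expands every sparse row into a full-length list of floats."""

    def encode(self, rows: list[SparseRow], n_vars: int) -> list[list[float]]:
        dense = []
        for row in rows:
            out = [0.0] * n_vars
            for col, value in row.items():
                out[col] = value
            dense.append(out)
        return dense


class CSRXEncoder:
    """Emits a CSR-style (data, indices, indptr) triple."""

    def encode(self, rows: list[SparseRow], n_vars: int):
        data, indices, indptr = [], [], [0]
        for row in rows:
            for col in sorted(row):
                indices.append(col)
                data.append(row[col])
            # indptr marks where each row ends in data/indices
            indptr.append(len(indices))
        return data, indices, indptr


batch = [{0: 1.0, 3: 2.0}, {}, {2: 5.0}]
dense_out = DenseXEncoder().encode(batch, n_vars=4)
csr_out = CSRXEncoder().encode(batch, n_vars=4)
```

With this shape, the choice between sparse and dense output becomes a matter of which encoder object the user passes in, rather than a boolean flag.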

Contributor

pablo-gar commented Jul 22, 2024

The encoder should be flexible enough to allow on-the-fly tokenization of data. From @AlejandroTL we heard the following:

Only thing I am missing [in the PyTorch loaders] a priori is something to handle different tokenization strategies during the dataloading. For instance, retrieving the counts, retrieving the indices of the genes given some previously stored dictionary, or binning the data on the fly, or creating a ranked list, etc.
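
The strategies mentioned in the quote could each be expressed as a small function applied to one cell's expression row. The following is only a sketch under stated assumptions: the function names, the `vocab` dictionary, and the `bin_edges` parameter are hypothetical, and plain lists stand in for tensors.

```python
from bisect import bisect_right


def tokenize_counts(counts):
    """Return the raw counts unchanged."""
    return list(counts)


def tokenize_gene_indices(counts, gene_ids, vocab):
    """Return vocabulary indices of expressed genes, using a previously stored dictionary."""
    return [vocab[g] for g, c in zip(gene_ids, counts) if c > 0 and g in vocab]


def tokenize_binned(counts, bin_edges):
    """Map each count to the index of the bin it falls into (edges must be sorted)."""
    return [bisect_right(bin_edges, c) for c in counts]


def tokenize_ranked(counts, gene_ids):
    """Return expressed genes ordered from highest to lowest count."""
    expressed = [(c, g) for c, g in zip(counts, gene_ids) if c > 0]
    # sort is stable, so ties keep their original gene order
    expressed.sort(key=lambda pair: -pair[0])
    return [g for _, g in expressed]


# Illustrative inputs only; gene "D" is deliberately missing from the vocabulary.
gene_ids = ["A", "B", "C", "D"]
counts = [0, 7, 2, 7]
vocab = {"A": 0, "B": 1, "C": 2}

idx_tokens = tokenize_gene_indices(counts, gene_ids, vocab)
bin_tokens = tokenize_binned(counts, bin_edges=[1, 5])
rank_tokens = tokenize_ranked(counts, gene_ids)
```

Because each strategy is just a callable over a row, an X encoder could accept any of them as a parameter and apply it during data loading.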

pablo-gar added the P0 Priority 0 - Critical, fix ASAP! label Aug 5, 2024