Read Andromeda object directly from Python #7

Closed
lhjohn opened this issue Jun 28, 2021 · 2 comments
Labels
enhancement New feature or request

Comments

@lhjohn
Collaborator

lhjohn commented Jun 28, 2021

Working on an implementation for reading Andromeda objects directly from Python here.

Once that is working, I will see if I can come up with an efficient implementation using PyTorch Datasets & DataLoaders. The PyTorch Dataset API seems to allow creating custom dataset classes here.
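A minimal sketch of what such a custom dataset class could look like, not the implementation from this issue. It assumes the covariates table is reachable through a SQLite connection and has the columns rowId, covariateId and covariateValue; the class name CovariateDataset is hypothetical. To stay dependency-free it is a plain Python class with the two methods of a map-style dataset (`__len__` and `__getitem__`); with PyTorch installed it would normally subclass `torch.utils.data.Dataset`.

```python
import sqlite3


class CovariateDataset:
    """Map-style dataset with one item per person (rowId).

    Hypothetical sketch: with PyTorch installed this would normally
    subclass torch.utils.data.Dataset, which expects the same two methods.
    """

    def __init__(self, connection):
        self.connection = connection
        # Fix the order of persons once so that integer indices are stable.
        rows = connection.execute(
            "SELECT DISTINCT rowId FROM covariates ORDER BY rowId"
        ).fetchall()
        self.row_ids = [row[0] for row in rows]

    def __len__(self):
        return len(self.row_ids)

    def __getitem__(self, index):
        row_id = self.row_ids[index]
        # Sparse representation: only the covariates recorded for this person.
        features = self.connection.execute(
            "SELECT covariateId, covariateValue FROM covariates "
            "WHERE rowId = ? ORDER BY covariateId",
            (row_id,),
        ).fetchall()
        return row_id, features


# Toy in-memory table standing in for a real Andromeda object (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE covariates "
    "(rowId INTEGER, covariateId INTEGER, covariateValue REAL)"
)
conn.executemany(
    "INSERT INTO covariates VALUES (?, ?, ?)",
    [(1, 1002, 1.0), (1, 2005, 0.5), (2, 1002, 1.0)],
)

dataset = CovariateDataset(conn)
first_item = dataset[0]
```

Querying per item keeps memory use low but is slow for large tables; a real implementation would probably batch or cache the reads.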

The current implementation can be used as follows. Next I will look at temporal Andromeda objects.

import read_write_plp_data as rwplp

# Instantiate class object for each Andromeda object
plpData1 = rwplp.ReadWritePlpData()
plpData2 = rwplp.ReadWritePlpData()

# Load data
plpData1.load_plp_data("/Path/to/PlpData_L1_T11111")

# Load population
plpData1.load_population("/Path/to/Population_L1_T11111.rds")

Once the data and population are loaded, a number of getter functions return the data:

# custom query of available tables: covariates, covariateRef, analysisRef
custom_covariates = plpData1.custom_query("SELECT * FROM covariates")

# Some standard getter functions
covariates = plpData1.get_covariates()
analysis_ref = plpData1.get_analysis_ref()
covariate_ref = plpData1.get_covariate_ref()
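For illustration, a query helper of this kind could be sketched as below. This is not the code behind custom_query; it assumes the tables are exposed through a SQLite connection, and the function name run_query, the covariateRef columns and the toy rows are all hypothetical.

```python
import sqlite3


def run_query(connection, sql, parameters=()):
    """Run a SQL query and return each row as a dict keyed by column name."""
    cursor = connection.execute(sql, parameters)
    column_names = [description[0] for description in cursor.description]
    return [dict(zip(column_names, row)) for row in cursor.fetchall()]


# Toy in-memory table standing in for a real Andromeda object (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE covariateRef (covariateId INTEGER, covariateName TEXT)"
)
conn.executemany(
    "INSERT INTO covariateRef VALUES (?, ?)",
    [(1002, "age group"), (2005, "prior visit")],
)

covariate_ref = run_query(
    conn, "SELECT * FROM covariateRef ORDER BY covariateId"
)
```

The standard getters could then be thin wrappers that call the same helper with a fixed SELECT statement.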

Probably the most interesting data are the covariates of the population. For this, both the population and the PLP data need to be loaded.

# Load population data
population_data = plpData1.get_population_data()
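One plausible reading of this step is that the covariates are restricted to the persons in the population. The sketch below shows that as a SQL join. It is an assumption, not the actual behaviour of get_population_data, and the population table with its rowId and outcomeCount columns is made up for the example.

```python
import sqlite3

# Toy in-memory tables standing in for the loaded data (assumed schema).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE covariates (rowId INTEGER, covariateId INTEGER, covariateValue REAL);
    CREATE TABLE population (rowId INTEGER, outcomeCount INTEGER);
    INSERT INTO covariates VALUES (1, 1002, 1.0), (2, 1002, 1.0), (3, 2005, 0.5);
    INSERT INTO population VALUES (1, 0), (3, 1);
    """
)

# Keep only covariates of persons in the population and attach their outcome.
population_covariates = conn.execute(
    "SELECT c.rowId, c.covariateId, c.covariateValue, p.outcomeCount "
    "FROM covariates AS c "
    "INNER JOIN population AS p ON p.rowId = c.rowId "
    "ORDER BY c.rowId, c.covariateId"
).fetchall()
```

Person 2 has covariates but is not in the population, so the inner join drops that row.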

Some additional metadata is also read from .RDS files. I do not think these will need getter functions, as they are class instance variables:

# Return additional meta data inside Andromeda object
meta_data = plpData1.meta_Data
outcomes = plpData1.outcomes
cohorts = plpData1.cohorts
time_ref = plpData1.time_ref
cov_rds = plpData1.cov_rds

@lhjohn
Collaborator Author

lhjohn commented Jun 29, 2021

There are a number of RDS files which, AFAIK, we cannot read from Python. I created an issue in the Andromeda project about natively adopting JSON versions of those RDS files (Issue link)
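If such JSON versions existed, reading them from Python would only need the standard library. A minimal sketch, in which the file name metaData.json, the helper load_metadata and the keys in the example are all hypothetical:

```python
import json
import tempfile
from pathlib import Path


def load_metadata(path):
    """Read a JSON metadata file into plain Python dicts and lists."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Write a toy file first so the example is self-contained (made-up content).
with tempfile.TemporaryDirectory() as folder:
    metadata_path = Path(folder) / "metaData.json"
    metadata_path.write_text(
        json.dumps({"cohortId": 1, "outcomeIds": [2, 3]}),
        encoding="utf-8",
    )
    meta_data = load_metadata(metadata_path)
```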

@lhjohn lhjohn added the enhancement New feature or request label Apr 12, 2022
@egillax
Collaborator

egillax commented Aug 29, 2022

I'm closing this. We are not using Python yet, and Andromeda may soon switch its backend to Arrow, which should be trivial to read from Python.

@egillax egillax closed this as completed Aug 29, 2022