NaNs in Search Results table #15

Closed
tylerapritchard opened this issue May 21, 2024 · 1 comment
Assignees
Labels
question Further information is requested

Comments

@tylerapritchard
Collaborator

tylerapritchard commented May 21, 2024

Just a general question: is there something we can or should be doing to make ourselves less brittle against passing NaNs from the search result table to other functions and operations? Or are we fine as is? NaNs do seem to be sensible missing-data values, but we want to be as maintainable and robust as we reasonably can.

In the process of working on #13, I discovered that the NaNs added to the pandas DataFrame search result table broke things in some unexpected ways when they got passed to astroquery, particularly when a NaN dataURI got passed to astroquery.mast.Observations.get_cloud_uris. These NaNs are introduced when we use pd.concat() to join tables together and some query columns are empty. We do this as an outer join to preserve as much information in the columns as possible.
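For anyone following along, here is a minimal sketch of how the NaNs appear. The two small DataFrames and their column values are made up for illustration and are not the real query results; only the pd.concat behaviour is the point.

```python
import pandas as pd

# Hypothetical stand-ins for two query results whose columns only partly overlap.
obs = pd.DataFrame(
    {"obs_id": ["a", "b"], "dataURI": ["mast:TESS/a.fits", "mast:TESS/b.fits"]}
)
cutouts = pd.DataFrame({"obs_id": ["c"], "sector": [14]})

# pd.concat defaults to join="outer": the union of the columns is kept, and
# any cell that has no value in its source table is filled with NaN.
joined = pd.concat([obs, cutouts], ignore_index=True)

# The cutout row has no dataURI, and the observation rows have no sector.
print(joined)
print(joined["dataURI"].isna())
```

Anything downstream that expects every dataURI to be a string will then receive a float NaN for the cutout row.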

My most recent example wasn't an issue in mainline lightkurve, as it comes from where we concatenate a DataFrame of TESSCut information to our main self.table. However, that main self.table also has lots of NaNs from a similar operation, where we concatenate the tables from astroquery.mast.Observations.query_criteria and astroquery.mast.Observations.get_product_list. So this may be representative of future concerns if we expand to run operations on more columns and make our tables more generic than in the current lightkurve.

@tylerapritchard tylerapritchard added the question Further information is requested label May 21, 2024
@tylerapritchard tylerapritchard changed the title from "Nan is Search Results table" to "NaNs in Search Results table" May 21, 2024
@d-giles
Collaborator

d-giles commented May 22, 2024

I think that the NaNs should be handled on a case-by-case basis. Where a NaN breaks functionality, it should either be filled appropriately for that specific application or, if there's no appropriate substitute, the affected rows should be dropped and the available results returned, with a warning about the data that didn't return results.
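A rough sketch of what the drop-and-warn path could look like for the dataURI case. The helper name `drop_missing_uris` and the sample table are hypothetical and not part of the current code base; the idea is just to filter the table before handing URIs to astroquery.

```python
import warnings

import pandas as pd


def drop_missing_uris(table: pd.DataFrame, column: str = "dataURI") -> pd.DataFrame:
    """Hypothetical helper: keep rows with a usable URI and warn about the rest."""
    missing = table[column].isna()
    if missing.any():
        warnings.warn(
            f"Dropping {int(missing.sum())} row(s) with no {column}; "
            "no cloud URI can be resolved for them.",
            UserWarning,
            stacklevel=2,
        )
    return table.loc[~missing].reset_index(drop=True)


# Made-up search result table with one missing dataURI.
table = pd.DataFrame(
    {
        "obs_id": ["a", "b", "c"],
        "dataURI": ["mast:TESS/a.fits", float("nan"), "mast:TESS/c.fits"],
    }
)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    clean = drop_missing_uris(table)

print(clean)
```

Only the rows in `clean` would then be passed on to the cloud URI lookup. For columns where a sensible default exists, `fillna` on that specific column would be the "fill" branch of the same idea.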


3 participants