Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bard14 Spectra #439

Merged
merged 14 commits into from
Dec 29, 2023
Merged

Bard14 Spectra #439

merged 14 commits into from
Dec 29, 2023

Conversation

LishaRamon
Copy link
Collaborator

@LishaRamon LishaRamon commented Dec 26, 2023

Will be ingesting converted spectra from bard14.

Info from issue:
"Spectra (17) converted in #365. Now we need to upload and ingest.
bard14.zip

Filenames have spaces in it. Not sure if that will be thing or not.

Spectra uploaded to: https://bdnyc.s3.amazonaws.com/SpeX/Prism/
AWS converted spaces to plus signs ('+') in the filenames."

Link to relevant issue: Closes #366

For data ingests:

  • includes script used for ingest
  • includes modified JSON files
  • Add new tests
  • Update the Versions table

@LishaRamon LishaRamon self-assigned this Dec 26, 2023
@LishaRamon LishaRamon requested a review from kelle December 26, 2023 19:52
@LishaRamon
Copy link
Collaborator Author

https://docs.astropy.org/en/stable/io/fits/index.html

Not sure how to read in a folder into data for python to read. Looking up ideas of converting fits and png files into csv data then ingesting them (via loop)

@kelle
Copy link
Collaborator

kelle commented Dec 27, 2023

You can use the pre-existing google sheet. Check out lines 8-13 in this script: scripts/spectra_convert/convert_bard14_spectra.py

@LishaRamon
Copy link
Collaborator Author

LishaRamon commented Dec 27, 2023

Is this the link to the sheet you mean? "https://bdnyc.s3.amazonaws.com/SpeX/Prism/" It's broken/non existent to me.
Update: it's in the convert spectra.py file, nevermind got it

Copy link
Collaborator

@kelle kelle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we discussed, use the google sheet.

Comment on lines 47 to 64
# Ingest SPECTRAL TYPES, loop through data
ingest_spectra(db,
sources = "Source",
spectrum= "Spectrum",
original_spectrum= "Original Spectrum",
regimes = "regime",
telescope= "telescope",
instrument= "instrument",
mode= "mode",
observation_date= "observation date",
spectrum_comments= "spectrum comments",
spectrum_reference= "spectrum reference",
ra= "ra",
dec= "dec",
aperture= "aperture",
raise_error=True,
search_db= True,
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this particular case, since we are updating the spectra entries and not adding entirely new ones, we can't use this function.

Suggested change
# Ingest SPECTRAL TYPES, loop through data
ingest_spectra(db,
sources = "Source",
spectrum= "Spectrum",
original_spectrum= "Original Spectrum",
regimes = "regime",
telescope= "telescope",
instrument= "instrument",
mode= "mode",
observation_date= "observation date",
spectrum_comments= "spectrum comments",
spectrum_reference= "spectrum reference",
ra= "ra",
dec= "dec",
aperture= "aperture",
raise_error=True,
search_db= True,
)

@kelle
Copy link
Collaborator

kelle commented Dec 28, 2023

Instead of using the ingest_spectra function, I think we want to do use the update method of astrodbkit2 to update the fields. https://astrodbkit2.readthedocs.io/en/latest/#updating-data

  • Update the spectrum field to be the new .fits file
  • Update the original_spectrum field to be the old .txt file.

Do this row-by-row inside the loop.

updating_spectra(db)

with db.engine.begin() as conn:
for entry in update_all_spectra:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an extra loop. You already have the loop starting on line 38.

Comment on lines 59 to 68
regime_value = entry["regime"]
telescope_value = entry["telescope"]
instrument_value = entry["instrument"]
mode_value = entry["mode"]
observation_date_value = entry["observation date"]
spectrum_comments_value = entry["spectrum comments"]
spectrum_reference_value = entry["spectrum reference"]
ra_value = entry["ra"]
dec_value = entry["dec"]
aperture_value = entry["aperture"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only fields that need updating are the spectrum and original_spectrum so you don't need any of these other fields.

Suggested change
regime_value = entry["regime"]
telescope_value = entry["telescope"]
instrument_value = entry["instrument"]
mode_value = entry["mode"]
observation_date_value = entry["observation date"]
spectrum_comments_value = entry["spectrum comments"]
spectrum_reference_value = entry["spectrum reference"]
ra_value = entry["ra"]
dec_value = entry["dec"]
aperture_value = entry["aperture"]

Comment on lines 84 to 122
conn.execute(db.Spectra_table.update()\
.where(db.Spectra_table.c.regime == regime_value)\
.values(regime = regime_value))

conn.execute(db.Spectra_table.update()\
.where(db.Spectra_table.c.telescope == telescope_value)\
.values(telescope = telescope_value))

conn.execute(db.Spectra_table.update()\
.where(db.Spectra_table.c.instrument == instrument_value)\
.values(instrument = instrument_value))

conn.execute(db.Spectra_table.update()\
.where(db.Spectra_table.c.mode == mode_value)\
.values(mode = mode_value))

conn.execute(db.Spectra_table.update()\
.where(db.Spectra_table.c.observation_date == observation_date_value)\
.values(observation_date = observation_date_value))

conn.execute(db.Spectra_table.update()\
.where(db.Spectra_table.c.comments == spectrum_comments_value)\
.values(comments = spectrum_comments_value))

conn.execute(db.Spectra_table.update()\
.where(db.Spectra_table.c.reference == spectrum_reference_value)\
.values(reference = spectrum_reference_value))

conn.execute(db.Spectra_table.update()\
.where(db.Spectra_table.c.ra == ra_value)\
.values(ra = ra_value))

conn.execute(db.Spectra_table.update()\
.where(db.Spectra_table.c.dec == dec_value)\
.values(dec = dec_value))

conn.execute(db.Spectra_table.update()\
.where(db.Spectra_table.c.aperture == aperture_value)\
.values(aperture = aperture_value))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These fields don't need updating.

Suggested change
conn.execute(db.Spectra_table.update()\
.where(db.Spectra_table.c.regime == regime_value)\
.values(regime = regime_value))
conn.execute(db.Spectra_table.update()\
.where(db.Spectra_table.c.telescope == telescope_value)\
.values(telescope = telescope_value))
conn.execute(db.Spectra_table.update()\
.where(db.Spectra_table.c.instrument == instrument_value)\
.values(instrument = instrument_value))
conn.execute(db.Spectra_table.update()\
.where(db.Spectra_table.c.mode == mode_value)\
.values(mode = mode_value))
conn.execute(db.Spectra_table.update()\
.where(db.Spectra_table.c.observation_date == observation_date_value)\
.values(observation_date = observation_date_value))
conn.execute(db.Spectra_table.update()\
.where(db.Spectra_table.c.comments == spectrum_comments_value)\
.values(comments = spectrum_comments_value))
conn.execute(db.Spectra_table.update()\
.where(db.Spectra_table.c.reference == spectrum_reference_value)\
.values(reference = spectrum_reference_value))
conn.execute(db.Spectra_table.update()\
.where(db.Spectra_table.c.ra == ra_value)\
.values(ra = ra_value))
conn.execute(db.Spectra_table.update()\
.where(db.Spectra_table.c.dec == dec_value)\
.values(dec = dec_value))
conn.execute(db.Spectra_table.update()\
.where(db.Spectra_table.c.aperture == aperture_value)\
.values(aperture = aperture_value))

Comment on lines 72 to 74
conn.execute(db.Spectra_table.update()\
.where(db.Spectra_table.c.source == source_value)\
.values(source = source_value))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
conn.execute(db.Spectra_table.update()\
.where(db.Spectra_table.c.source == source_value)\
.values(source = source_value))
conn.execute(db.Spectra_table.update()\
.where(db.Spectra_table.c.source == source_value)\
.values(spectrum = spectrum_value))

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the magic sauce. Study this command. You'll need another command like this to update the original_spectrum field.


#Loop through data and update spectra
#def updating_spectra(db):
# for row in bard14_table[0:19]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the right way to do the loop.

Suggested change
# for row in bard14_table[0:19]:
# for row in bard14_table[0:19]:

#updating_spectra(db)

with db.engine.begin() as conn:
for entry in update_all_spectra:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't work. update_all_spectra is the name of the function....it's not a list or table which you can loop over.

Suggested change
for entry in update_all_spectra:
for entry in update_all_spectra:

Comment on lines 77 to 79
conn.execute(db.Spectra_table.update()\
.where(db.Spectra_table.c.original_spectrum == source_value)\
.values(original_spectrum = original_spectrum_value))
Copy link
Collaborator

@kelle kelle Dec 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not quite right. It's trying to find spectra files that are set to the source name.

Suggested change
conn.execute(db.Spectra_table.update()\
.where(db.Spectra_table.c.original_spectrum == source_value)\
.values(original_spectrum = original_spectrum_value))
conn.execute(db.Spectra.update()\
.where(db.Spectra.c.source == source_value)\
.values(original_spectrum = original_spectrum_value))

Comment on lines 69 to 71
conn.execute(db.Spectra_table.update()\
.where(db.Spectra_table.c.source == source_value)\
.values(spectrum = spectrum_value))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why you added the _table bit. The text here needs to correspond exactly to the name of the table you're trying to modify.

Suggested change
conn.execute(db.Spectra_table.update()\
.where(db.Spectra_table.c.source == source_value)\
.values(spectrum = spectrum_value))
conn.execute(db.Spectra.update()\
.where(db.Spectra.c.source == source_value)\
.values(spectrum = spectrum_value))

Comment on lines 73 to 75
conn.execute(db.Spectra_table.update()\
.where(db.Spectra_table.spectrum == spectrum_value)\
.values(spectrum = spectrum_value))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need this. The code on 69-71 updates the spectrum field.

Suggested change
conn.execute(db.Spectra_table.update()\
.where(db.Spectra_table.spectrum == spectrum_value)\
.values(spectrum = spectrum_value))

@kelle
Copy link
Collaborator

kelle commented Dec 29, 2023

Once you get the loop working, I also want to check that the spectra links are valid. Here's some code to get you started.

import requests
request_response = requests.head(spectra_value)
            status_code = (
                request_response.status_code
            )  # The website is up if the status code is 200

@LishaRamon LishaRamon marked this pull request as ready for review December 29, 2023 20:27


# The website is up if the status code is 200 (checking validity of links)
request_response = requests.head(spectrum_value)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the line which actually checks if the URL is valid. It currently only checks the spectrum_value URL. It is not checking the original_spectrum.

msg = f"Link invalid:{spectrum_value}"
raise SimpleError(msg)

if status_code != 200:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This status_code corresponds to the spectrum_value link, not the original_spectrum_value link.

Comment on lines 54 to 60
status_code = (request_response1.status_code)
status_code = (request_response2.status_code)

if status_code != 200:
msg = f"Link invalid:{spectrum_value}"
msg2 = f"Link invalid:{original_spectrum_value}"
raise SimpleError(msg, msg2)
Copy link
Collaborator

@kelle kelle Dec 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now this code only checks the original spectrum and not the spectrum. line 55 overwrites the status_code variable in line 54. You need two different variable names. and you need to check them both.

@kelle kelle merged commit b8b81dc into SIMPLE-AstroDB:main Dec 29, 2023
1 check passed
This was referenced Dec 30, 2023
@LishaRamon LishaRamon deleted the bard14-spectra branch January 3, 2024 19:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Ingest converted Bard14 spectra
2 participants