refactor pipeline.new_sources.new_sources #732

Open
ddobie opened this issue Jul 18, 2024 · 0 comments · May be fixed by #714
Assignees
Labels
cleanup Clean up code documentation Improvements or additions to documentation low priority Issue is not of immediate concern.


ddobie commented Jul 18, 2024

In #730 I fixed a memory leak in pipeline.new_sources.new_sources, but the actual implementation was quite hacky.

As commented in the code, the function should be rewritten to reflect the new behaviour of pipeline.new_sources.parallel_get_rms_measurements. The section below should be incorporated into parallel_get_rms_measurements, which should then be renamed to something like parallel_get_new_high_sigma.

new_sources_df = parallel_get_rms_measurements(
    new_sources_df, edge_buffer=edge_buffer
)
# this removes those that are out of range
new_sources_df['img_diff_true_rms'] = (
    new_sources_df['img_diff_true_rms'].fillna(0.)
)
new_sources_df = new_sources_df[
    new_sources_df['img_diff_true_rms'] != 0
]
# calculate the true sigma
new_sources_df['true_sigma'] = (
    new_sources_df['flux_peak'].values
    / new_sources_df['img_diff_true_rms'].values
)
# We only care about the highest true sigma
# new_sources_df = new_sources_df.sort_values(
#     by=['source', 'true_sigma']
# )
# keep only the highest for each source, rename for the database
new_sources_df = (
    new_sources_df
    # .drop_duplicates('source')
    .set_index('source')
    .rename(columns={'true_sigma': 'new_high_sigma'})
)
# moving forward only the new_high_sigma column is needed, drop all
# others.
new_sources_df = new_sources_df[['new_high_sigma']]
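
For illustration only, here is a rough sketch of how the post-processing above might look once it is folded into a single helper. The function name `compute_new_high_sigma` and the example data are hypothetical and not part of the pipeline. The sketch assumes the input already carries the `source`, `flux_peak` and `img_diff_true_rms` columns used in the snippet. It mirrors only the active lines of that snippet, so it does not deduplicate per source.

```python
import pandas as pd


def compute_new_high_sigma(measurements: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: reduce RMS measurements to a 'new_high_sigma' column.

    Assumes 'source', 'flux_peak' and 'img_diff_true_rms' columns exist,
    as in the snippet above.
    """
    rms = measurements['img_diff_true_rms']
    # Drop rows with no usable RMS: NaN marks out-of-range measurements,
    # and a zero RMS would make the ratio meaningless.
    valid = measurements.loc[rms.notna() & (rms != 0)].copy()

    # Peak flux over the measured RMS, stored directly under its final name.
    valid['new_high_sigma'] = valid['flux_peak'] / valid['img_diff_true_rms']

    # Only the new_high_sigma column, indexed by source, is needed downstream.
    return valid.set_index('source')[['new_high_sigma']]


# Made-up example data: source 2 has no RMS measurement and is dropped.
example = pd.DataFrame({
    'source': [1, 2, 3],
    'flux_peak': [10.0, 6.0, 4.0],
    'img_diff_true_rms': [2.0, float('nan'), 0.5],
})
result = compute_new_high_sigma(example)
```

Filtering with a boolean mask avoids the intermediate `fillna(0.)` step and the temporary `true_sigma` column, but whether that suits the real function depends on what parallel_get_rms_measurements returns after #730.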

@ddobie ddobie added documentation Improvements or additions to documentation cleanup Clean up code low priority Issue is not of immediate concern. labels Jul 18, 2024
@ddobie ddobie self-assigned this Jul 18, 2024
@ddobie ddobie linked a pull request Jul 23, 2024 that will close this issue
Projects
Status: To do