-
Notifications
You must be signed in to change notification settings - Fork 108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Master regeneration bug associated with a changing pypeit file #1022
Comments
I can't tell if the Issue describe is a bug (i.e. the code is not performing as If the latter, I suggest a simple script that prints to the screen the files |
It is both a bug, and dissatisfaction with the naming convention that is the source of the bug. If you want to reproduce the bug do the following in the dev suite: Go to this PypeIt file: gemini_gnirs_32_sb_sxd.pypeit Pretend you are at the telescope reducing data as it is collected. Run the standard and the first ABBA block (i.e. calib = 0,1, comment out the second ABBA block). Then try to add the second ABBA block (calib=2,3) as if it just finished reading out. You will see that when you try to re-run the code with the new ABBA block, because of the new calib ids, the bits assigned to the master frames now change, and hence so does the master filename. Since the new masterfilename does not exist, it remakes masters. So it constantly remakes masters and saves new masters, wasting time and disk space, although this is exactly the same calibration file every time. It is a major bug that some script does not fix. |
Just a follow-up thought on this: The easiest solution that I can come up with is just to change
in
This would change the name of the master file to use the smallest calibration group number in the name of the file. That would get around this issue as long as you only ever increase the calibration group numbers as you add new data. But I don't know how disciplined we've been about using |
Hi @kbwestfall, I agree that this fixes the problem. In my opinion a solution that assigns a unique MasterKey based on the files used to create the master would be superior, since this identifier should only depend on the files used. Then given that identifier, we should have a script that will report what files were used for the creation of that master, and also list the calibration ids for which that master is used. Whether this script should ingest the pipet file, or simply sweep the Master directory is not so clear to me, but I sort of prefer the latter. |
What is the unique and terse MasterKey to be generated? Without that, this Issue can't proceed. |
Hi All, one thing I’ve noticed while observing at the telescope. On an observing run, we typically continuously add files to the pypeit files as the data is collected. I’m seeing that the MasterEdges and MasterFlat files are being recreated sometimes even when I run with run_pypeit -m. These different master files have different bit identifiers, i.e.
Masters/MasterEdges_A_1048575_01.fits.gz
Masters/MasterEdges_A_16383_01.fits.gz
Masters/MasterEdges_A_16777215_01.fits.gz
Masters/MasterEdges_A_262143_01.fits.gz
My understanding of what is going on here is that as I add data to PypeIt file, the bit arithmetic changes for the science frames that this master file corresponds to (i.e. specified by the calib field). As a result the name changes, and hence the Master is regenerated.
I think the bit arithmetic is great for determining which master to use for which science files, however it is an essentially useless naming convention. Not only can one not easily figure out which science frames this master applies to, but it also results in a time consuming regeneration of the master file if I simply add a new target to my pypeit file.
My proposal is that:
-- The Master file should have an identifier that is determined by the files that got used to generate it, and it should only be regenerated if those files changed. For example,
MasterEdges_s200907_0057-s200907_0062.fits
where s200907_0057.fits/s200907_0062.fits is the first/last raw file used to create the master.
We may need to think about whether there is a situation where this naming convention would not be unique.
-- The Master file names should be revisited so that users can easily associate them with actual frames. My suggestion is to use
the first file in the list of files used to generate the master. Alternatively, we could use the first and last file. It needs to be a file though, and not some arbitrary complicated number. If a user has a problem with a reduction caused by a problem in a master
file they need to be able to easily figure out which master frames to inspect based on QA plots or our utility scripts. This is essentially impossible with the bit arithmetic, without digging into the data container to figure out the matchup between bits and frames.
The text was updated successfully, but these errors were encountered: