Swap size_frac_low and up #21

Closed
kmexter opened this issue Jul 15, 2024 · 35 comments
Labels: bug (Something isn't working)

@kmexter
Contributor

kmexter commented Jul 15, 2024

Raising this issue here as it is to be done for all the observatory googlesheets.
See emo-bon/observatory-esc68n-crate#5 for background

The order of business is

  1. swap these column titles in all water and sediment googlesheets - @melinalou @melanthia to do and @kmexter to check and confirm
  2. change the description in the "definitions" tab for all logsheets - @melinalou @melanthia to do and @kmexter to check and confirm
  3. change the source_mat_id (final column) in all sampling tabs of all water googlesheets: in the equation there, col R should be changed to col Q (change the equation in the first row and drag it down to replace the values in all rows). Note that this is not part of the id for sediment, so nothing needs to change there - @melinalou @melanthia to do and @kmexter to check and confirm
  4. @laurianvm to change the definition for these terms in the ontology and the ttl template (the definitions should come from the checklists: for water, https://www.ebi.ac.uk/ena/browser/view/ERC000024; for sediment, https://www.ebi.ac.uk/ena/browser/view/ERC000021)
  5. @kmexter to change the logsheet_schema_extended to change the definition there
  6. @kmexter to check if the QC also needs to change (was a check done on the low being > the high? if so, needs to change)

Only after point 6 is this issue solved.

Can we start with points 1,2,3 please? When done, comment in this issue so the next person knows they have to start their part.
Note: Katrina will be away until Aug 19.

@kmexter
Contributor Author

kmexter commented Aug 22, 2024

@melinalou @melanthia (or possibly @cpavloud) can you tell me if you have done this swap in the logsheets - i.e. swapped the column titles around and changed the definition in the definition tab (which requires creating a new definitions tab as a copy of the old one, otherwise you cannot edit it). I do not want to change this in the governance data until it is changed in the logsheets.

@melinalou

Hi! The swap of size_frac_low and size_frac_up columns has been done in all logsheets and the definitions are:

size_frac_low size-fraction lower threshold Refers to the mesh/pore size used to pre-filter/pre-sort the sample. Materials larger than the size threshold are excluded from the sample

size_frac_up size-fraction upper threshold Refers to the mesh/pore size used to retain the sample. Materials smaller than the size threshold are excluded from the sample

Unfortunately I have permission to create a new copy of the definitions sheet but not to rename or delete the old one..
But do they need to change?

@kmexter
Contributor Author

kmexter commented Aug 26, 2024

OK, so the columns have been swapped in all Sampling tabs, but yes, you do need to swap the definitions also.
I would do that in a new definitions sheet and then delete the old definitions sheet.
BUT clearly we also need to change the way the source_mat_id is created in the final column of all the Sampling tabs - see my comments in emo-bon/observatory-profile#13. Otherwise all the IDs are malformed. Christina did say this in an email some time back, so I think we don't need to check with her - you can just go ahead and do it. For that, you need to copy-paste the Sampling tabs so you can modify that column, as we discussed during our meeting last week. Do you also need help changing the equation? Christina explained how to do that in the same email.

@cpavloud

@melinalou this is wrong, these are the old definitions.

size_frac_low is used to retain the sample
size_frac_up is used to pre-filter/pre-sort the sample

See also here for the ongoing discussion we have with the GSC to correct officially the definitions.

@melinalou

@kmexter Ok, yes, I will change the definitions. But as far as I can see, you (or someone with permission) will need to delete the old definitions sheets after I create the new corrected ones, because they are locked and I cannot delete them.

Regarding the source_mat_id, is this the equation that needs to be fixed?
emo-bon/observatory-hcmr-1-crate#13
To put _1 in all blanks and _2 whenever it is another one? (or does it have to do with size_frac_up?)
I think that change alone is not enough if we need unique source_mat_ids, because e.g. in https://docs.google.com/spreadsheets/d/11_Eu0W1-sDiuzKx1cIl6YuxjRHmWezN6u9v3Ly8JZ3A/edit?gid=124596284#gid=124596284 we have the same id for lines 6 and 16 even though there are no blanks in the M column.
Sorry if I didn't understand well.

@melinalou

@cpavloud thank you!

@kmexter
Contributor Author

kmexter commented Aug 26, 2024

No, the blanks bit is another issue - don't do that one yet, or we will get confused
What needs changing is
size_frac_low is in Q column
size_frac_up is in R column

The equation is
=CONCATENATE(observatory!$A$2,"_",H2,"_",R2,"um","_",M2) (in row 2; the other rows of course have 3, 4, 5, etc. instead of 2)
And the source_mat_id for the first sample is
EMOBON_ROSKOGO_Wa_210618_200um_1

But it should be
=CONCATENATE(observatory!$A$2,"_",H2,"_",Q2,"um","_",M2)
So that the source_mat_id for the first sample would be
EMOBON_ROSKOGO_Wa_210618_3um_1
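
To illustrate the effect of switching from col R to col Q, here is a minimal Python sketch of how the id is put together. It is only an illustration and makes several assumptions: that the parts are joined with underscores (as the example ids suggest), that observatory!$A$2 holds EMOBON_ROSKOGO_Wa and H2 holds 210618 (a guess from splitting the example id), and that M2 holds 1. The function name build_source_mat_id is hypothetical; this is not the spreadsheet's or the harvester's actual code.

```python
def build_source_mat_id(observatory_prefix, h_value, size_frac_um, m_value):
    """Join the parts with underscores and append 'um' to the size fraction.

    Mirrors the CONCATENATE equation under the assumptions stated above.
    """
    return f"{observatory_prefix}_{h_value}_{size_frac_um}um_{m_value}"


# Hypothetical first row: Q = size_frac_low, R = size_frac_up (after the swap).
prefix = "EMOBON_ROSKOGO_Wa"  # assumed content of observatory!$A$2
row = {"H": "210618", "M": 1, "Q": 3, "R": 200}

old_id = build_source_mat_id(prefix, row["H"], row["R"], row["M"])  # uses col R
new_id = build_source_mat_id(prefix, row["H"], row["Q"], row["M"])  # uses col Q

print(old_id)  # EMOBON_ROSKOGO_Wa_210618_200um_1
print(new_id)  # EMOBON_ROSKOGO_Wa_210618_3um_1
```

The only difference between the two ids is which column supplies the size fraction, which is why changing R to Q in the first row and dragging down is enough.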

@melinalou

ok, I will fix it and let you know! Thank you.

@melinalou

I made a copy of sampling where I've fixed the source_mat_id, and a copy of definitions where I've changed the size_frac_up/low and n_alkanes definitions. So now we need to delete the old ones and rename the new ones.

@kmexter
Contributor Author

kmexter commented Aug 26, 2024

For which googlesheet did you do this - so I can check? paste the URL here please

@melinalou

@kmexter
Contributor Author

kmexter commented Aug 26, 2024

OK, I checked 2 and they are nice. However, you will need to remove the original Sampling tab and rename the "Copy of sampling" to "sampling" ->otherwise those will not be harvested (as we harvest on the name of the tab). Perhaps rename "Copy of definitions" to "Updated definitions"
When you do remove the old sampling tab, make sure to copy over any comments that are still alive in there. Unfortunately comments get lost when you copy in the way I told you to, and the comments that are still alive are the ones the stations still need to act on.

Many of my comments in emo-bon/observatory-profile#13 will now be solved - those related to size_frac_up and low - so bear that in mind as you work your way thru that issue.

Great work - boring I know, but it needs to be done!

@melinalou

Yes but I can not delete the sampling and definitions. Unfortunately I do not have the permission..

@kmexter
Contributor Author

kmexter commented Aug 26, 2024

indeed - hence rename the "Copy of definitions" so it is clear that it is an update, not a copy
Hmm, so I could remove the "sampling" tab before, I am sure. Perhaps @melanthia has permissions if you do not?
But what you surely can do is copy over the comments that are still relevant? Most are raised by HQ and I cannot tell if they can be closed or not, so better you do it

@melinalou

Good. I will rename the "Copy of definitions" tab to "Updated definitions" and copy the comments. I will inform you here when it is done.

@melinalou

All done! https://github.com/emo-bon/governance-data/blob/main/logsheets.csv
I renamed the copy of sampling -> new sampling and the copy of definitions -> updated definitions.
Also I copied all the comments. Please check if the way I did it is helpful, because I could only copy and paste the whole "chat".

@cymon

cymon commented Aug 27, 2024 via email

@kmexter
Contributor Author

kmexter commented Aug 27, 2024

OK, so as far as I can see that is good
However, it will be necessary to ASAP remove the "sampling" tab and rename "sampling new" to "sampling", otherwise (1) the source_mat_id in the measured tab (col 1) will be wrong and (2) the stations will be confused over which tab to fill in. Since @melinalou apparently does not have permission to do this, does @cpavloud or @melanthia have the necessary permissions?

@cpavloud

I don't have permissions, no.
I have tried to use the EMBRC secretariat credentials (which should work) but they don't work.
And at this point, no one knows which is the correct password for this account.
I don't think we can do anything else rather than wait for @isanti to come back from her leave.

@kmexter
Contributor Author

kmexter commented Aug 28, 2024

You cannot edit the "Updated definitions" tab? That is there in all the water logsheets now.

@cpavloud

Ah, I thought you meant the original "definitions" tab (which is locked).
I can edit the "Updated definitions" tab, yes. But probably every one of us could do that...

@melinalou

melinalou commented Aug 28, 2024 via email

@kmexter
Contributor Author

kmexter commented Aug 28, 2024

Yes, EMOBON HQ should edit the Updated definitions
Since it appears that I do have permission to remove the old sampling tab and rename the new one, I will do that later today

@kmexter
Contributor Author

kmexter commented Aug 28, 2024

@melinalou So I do not have permission to remove the original sampling tab either. sigh
HOWEVER, I could do this, and you could do the same.
I edited this one: https://docs.google.com/spreadsheets/d/1hvLkBwiKTGTJDx19m_8e7qJ2lm9bwLLeVztMpxTLqnk/edit?gid=15718907#gid=15718907
I RENAMED the original sampling tab to "old sampling - ignore"
Then I RENAMED the new sampling tab to "sampling"
Then I EDITED the equation in column 1 of the measured tab. It takes the column AM from the sampling tab, but when I renamed that tab, it also changed the equation from "=sampling!AM2" to "=old sampling - ignore!AM2", so I had to change it BACK to what it was before
Then finally I MOVED the tabs so that observatory, sampling, and measured were tabs 2,3,4, Updated definitions was 5, and the tabs to be deleted are then pushed to the end

IS that clear? Can you do that for the other water logsheets?
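
The "change it BACK" step could be sketched in Python as below, for anyone who has the formula text exported and wants to check many cells at once. This is only a sketch under assumptions: the function name restore_tab_reference is hypothetical, it works on plain formula strings rather than on the googlesheet itself, and it accepts the renamed tab name both with and without surrounding single quotes, since tab names containing spaces are normally quoted in a formula.

```python
import re


def restore_tab_reference(formula, wrong_tab, right_tab="sampling"):
    """Point a formula back at right_tab if it references wrong_tab.

    Formulas that do not reference wrong_tab are returned unchanged.
    """
    # Optional single quotes around the tab name, followed by the "!" separator.
    pattern = re.compile(r"'?" + re.escape(wrong_tab) + r"'?!")
    return pattern.sub(lambda match: right_tab + "!", formula)


fixed = restore_tab_reference("=old sampling - ignore!AM2", "old sampling - ignore")
print(fixed)  # =sampling!AM2
```

Usage: apply it to each formula in column 1 of the measured tab, passing whatever name the old sampling tab was given (e.g. "old sampling - ignore" or "old-sampling").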

@melanthia

melanthia commented Aug 28, 2024 via email

@melinalou

@kmexter Yes, I will do it for the other logsheets.

@melinalou

@kmexter all done.
(I named the old sampling tab "old-sampling"). If you want, have a look at a random one, e.g. https://docs.google.com/spreadsheets/d/1AvQMYcS0tdNMw6Er8zUarQg1a_wrshhnkTS6RuI1FJQ/edit?gid=15718907#gid=15718907, to be sure that it is ok.

@kmexter
Contributor Author

kmexter commented Aug 29, 2024

Can someone please tell me if I have chosen the correct BODC terms for these two properties
size_frac_low http://vocab.nerc.ac.uk/collection/P01/current/PRSZSPLW/ (I am unsure in particular because this definition says "retained" while the ENA definition says "excluded")
size_frac_up https://vocab.nerc.ac.uk/collection/P06/current/UXMM/
I think @cpavloud understands this best....

@cpavloud

cpavloud commented Sep 2, 2024

The BODC term for size_frac_low is Pore size of sampling processor (lower filter)

The BODC term for size_frac_up is Pore size of sampling processor (upper filter)

@kmexter
Contributor Author

kmexter commented Sep 2, 2024

phew, so I got it right.
So, of the list at the beginning, can you all confirm that we have

  1. swapped the column titles in all logsheets (all observatories, water and sediment)
  2. changed the definitions in the logsheets -> I don't think the examples in the definition tab have been changed, as it says 200 as an example for size_frac_low!
  3. source_mat_id equation has been changed
  4. I will check with laurian that these have changed in the ontology
  5. logsheet schema has been updated - yes
  6. QC updated - not done yet

@melinalou

Good morning!
1 and 3 done! As for the second one I will take a look now and change the examples.
I will let you know.

@kmexter
Contributor Author

kmexter commented Sep 2, 2024

I checked - 4 is done. we used the ENA definition in the ontology.

@melinalou

2 checked and done.

@kmexter
Contributor Author

kmexter commented Sep 2, 2024

ok, only 6 is left, and that is for VLIZ to do
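
For reference, a QC rule for point 6 might look something like the Python sketch below. It is not the actual QC code. It assumes that after the swap size_frac_low (e.g. 3) is expected to be smaller than size_frac_up (e.g. 200), that rows with a blank value are skipped, and that equal values are not flagged; the function name check_size_fractions is hypothetical.

```python
def check_size_fractions(size_frac_low, size_frac_up):
    """Return an error message if low is larger than up, otherwise None.

    Blank cells (None) are skipped because there is nothing to compare.
    """
    if size_frac_low is None or size_frac_up is None:
        return None
    if size_frac_low > size_frac_up:
        return (
            f"size_frac_low ({size_frac_low}) is larger than "
            f"size_frac_up ({size_frac_up})"
        )
    return None


print(check_size_fractions(3, 200))   # None: the expected ordering after the swap
print(check_size_fractions(200, 3))   # flagged: looks like the pre-swap ordering
```

If the existing QC tested the opposite direction (low being > high), the comparison is the only thing that would need to flip.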

@kmexter
Contributor Author

kmexter commented Oct 23, 2024

@kmexter kmexter closed this as completed Oct 23, 2024