Duplicates at - URGENT #44

Open
GoogleCodeExporter opened this issue Mar 14, 2015 · 2 comments

Comments

@GoogleCodeExporter

Please describe your question as clearly as possible. Include links if
possible.

I now know of two herbaria (MT, TRTE) that record which other herbaria hold 
duplicates/replicates of a specimen. There are probably other herbaria doing 
the same. The field lists the herbarium acronyms, concatenated and separated 
by semicolons:

duplicatesAt: NFLD; UWO; DAO

Questions:

1. Is this useful information to share in Darwin Core? For regular users, 
FilteredPush users?

2. How can we share this in Darwin Core? No term seems appropriate.

The closest one might be 
http://rs.tdwg.org/dwc/terms/index.htm#associatedOccurrences, but that should 
list the IDs for the associated specimens, not the collections where they are 
deposited. Is there a way we can interpret this definition in a broader sense?

The other option is of course 
http://rs.tdwg.org/dwc/terms/index.htm#dynamicProperties. Since the key-value 
pairs in this term are generally separated by ";", just dumping duplicatesAt 
(which itself contains ";") will probably cause problems. Are any of these 
usable?

A. duplicatesAt=NFLD,UWO,DAO; otherDynamicProperty=...

B. duplicateAt=NFLD; duplicateAt=UWO; duplicateAt=DAO; otherDynamicProperty=...
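
To make the difference between A and B concrete, here is a minimal Python sketch of how a consumer might read either form. It assumes dynamicProperties is a plain "key=value; key=value" string; the helper name parse_dynamic_properties and the placeholder value "x" for otherDynamicProperty are hypothetical and only there for illustration.

```python
from collections import defaultdict


def parse_dynamic_properties(text):
    """Hypothetical helper: parse 'key=value; key=value' into {key: [values]}.

    Repeated keys (option B) are collected into a list instead of
    overwriting each other.
    """
    props = defaultdict(list)
    for pair in text.split(";"):
        pair = pair.strip()
        if not pair:
            continue
        key, _, value = pair.partition("=")
        props[key.strip()].append(value.strip())
    return dict(props)


# "x" is a placeholder value, not taken from any real record.
option_a = "duplicatesAt=NFLD,UWO,DAO; otherDynamicProperty=x"
option_b = "duplicateAt=NFLD; duplicateAt=UWO; duplicateAt=DAO; otherDynamicProperty=x"

# Option A: one key, so the value needs a second split on ",".
herbaria_a = parse_dynamic_properties(option_a)["duplicatesAt"][0].split(",")

# Option B: one key per herbarium, so the parser must tolerate repeated keys.
herbaria_b = parse_dynamic_properties(option_b)["duplicateAt"]

# A naive one-value-per-key parse keeps only the last repeated key in B.
naive_b = dict(
    pair.strip().split("=", 1) for pair in option_b.split(";") if pair.strip()
)
```

Both forms can be recovered, but B only survives a parser that expects repeated keys; a simple one-value-per-key reader would silently drop all but the last herbarium.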

Any advice would be welcome. Based on the feedback here, TRTE will either 
publish or not publish this information before Thursday, March 29th, after 
which it will be difficult to change it.

Original issue reported on code.google.com by [email protected] on 26 Mar 2012 at 1:16

@GoogleCodeExporter

Hi, sorry for a year of silence.

BRIT is using associatedOccurrences for this. If we were strictly following the 
rules I think we would represent it this way:
Duplicate:sheet:BRIT:BRIT24235; Duplicate:carpological:Duplicate:sheet:NY; 
Duplicate:sheet:MO.

I don't think dynamicProperties is the place to put this. I also don't think 
many herbaria bother to record this. When they do, a series of intended places 
for duplicate distribution often goes on the label, but that's no guarantee all 
those herbaria actually did get a duplicate. It might still be sitting in a box 
for future distribution!

I think the more likely future need for this will be as a later curatorial 
record of duplicates at other institutions that were finally located by FP.
A

Original comment by [email protected] on 28 Mar 2012 at 7:25

@GoogleCodeExporter

This raises an issue as to whether TDWG DarwinCore is complete enough for 
botanical purposes. associatedOccurrences really highlights the issue. The 
members of a duplicate set come from only one occurrence and should share the 
same occurrenceID - they aren't associated occurrences, they are the same 
occurrence (and much of the time, the same biological individual). However, the 
grouping of the duplicate sets is, to some varying extent, a matter of 
inference. One potential use of a duplicatesAt property would be to present 
data from a herbarium sheet about where it asserts duplicates should be found. 
Another potential use of a duplicatesAt property would be to assert a full list 
of places where duplicates are believed to be found, by all means of inference. 
We do capture some assertions about where duplicates are believed to be found 
at Harvard, but we currently aren't doing it as structured data.

Our normal guidance in other cases of concatenated lists of multiple values in 
a single property has been to use a pipe character as a separator.  

A duplicatesAt property containing a pipe-separated list of herbarium acronyms 
representing herbaria where duplicates are believed to have been sent seems a 
reasonable option that could be used in flat Darwin Core.
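
As a rough illustration of the pipe-separated option, the following Python sketch writes and reads one flat row with a duplicatesAt column. Note that duplicatesAt is the property proposed in this thread, not an existing Darwin Core term, and the helper names and the catalog number "EXAMPLE-0001" are made up for the example.

```python
import csv
import io

SEPARATOR = "|"  # pipe separator, per the guidance for multi-value fields


def encode_duplicates_at(acronyms):
    """Join herbarium acronyms into a single flat field value."""
    return SEPARATOR.join(a.strip() for a in acronyms)


def decode_duplicates_at(value):
    """Split a flat field value back into a list of herbarium acronyms."""
    return [a.strip() for a in value.split(SEPARATOR) if a.strip()]


# Write one flat record; "EXAMPLE-0001" is a made-up catalog number.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["catalogNumber", "duplicatesAt"])
writer.writeheader()
writer.writerow(
    {
        "catalogNumber": "EXAMPLE-0001",
        "duplicatesAt": encode_duplicates_at(["NFLD", "UWO", "DAO"]),
    }
)

# Read it back and recover the list of herbaria.
buffer.seek(0)
row = next(csv.DictReader(buffer))
herbaria = decode_duplicates_at(row["duplicatesAt"])
```

Because the list lives in one column with a separator that does not clash with ";" or ",", the record stays a single flat row.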

In FilteredPush, we will likely be making assertions about duplicate 
relationships between particular specimen records, so transport of known places 
where duplicates are expected to occur could be of value in helping to make 
those assertions.

I think we need to propose a collectionObjectID/specimenID/voucherID term to 
hold a GUID for the database record for a particular specimen as an addition to 
TDWG DarwinCore.

I don't think we should use associatedOccurrences in this case, as we are 
talking about multiple vouchers of the same occurrence, not associated other 
occurrences.

Either approach A or approach B looks like it should work, with A supporting 
flat Darwin Core, and B not, thus perhaps favoring A.

Original comment by [email protected] on 29 Mar 2012 at 1:34
