Improve documentation: What a mapping is... and is not. #324
Comments
Thank you @joeflack4 I think it is a good thing to add a statement about this to the documentation. The answer is not going to be easy. If you ask the average mapping expert (aka your favourite LLM) what a mapping is, they will tell you something like:
I think the answer to this question will come down to delineating the idea of "correspondence" from mere "association".
Correspondence: "Correspondence" refers to a specific relationship established between two or more entities across different datasets, schemas, or systems, indicating that they represent the same or equivalent information. The term often implies a more formal, direct, and explicit linkage between entities, suggesting a stronger degree of equivalence or similarity. For instance, in the process of ontology alignment or schema matching, when two concepts from different ontologies are determined to be equivalent, a correspondence is established between them. This correspondence ensures that data using either of these concepts can be seamlessly integrated or queried across both systems.
Association: "Association", on the other hand, implies a more general relationship between entities. It suggests that two or more entities are related in some manner but doesn't necessarily dictate the nature or strength of that relationship. An association can be broad, encompassing various types of relationships like correlation, dependency, or co-occurrence. In data mapping, while a correspondence might indicate that two fields in different databases represent the same piece of information (e.g., "First Name" in Database A corresponds to "Given Name" in Database B), an association might indicate that two fields are related but not equivalent (e.g., "Product ID" in Database A is associated with "Transaction ID" in Database B because transactions are related to products, but they don't represent the same concept).
So, correspondence implies a direct, often bidirectional, and stronger relationship suggesting equivalence or very close similarity. Association is a broader term indicating some form of relationship but without specifying its nature or strength.
I think any predicate that can be subsumed under the idea of "correspondence" indicates a true mapping relationship. Conversely, any association that is not a "correspondence" does not indicate a true mapping relationship. Let's call these "not-correspondences". It is my view, and correct me if I am wrong, that "not-correspondences" are not in scope for SSSOM.
Examples
Clear examples of correspondences
Basically everything in https://mapping-commons.github.io/sssom/mapping-predicates/#the-3-step-process-for-selecting-an-appropriate-mapping-predicate should be non-controversial.
Fuzzy examples of correspondence
Here, a bunch of people will already start saying "nope, that ain't a valid mapping predicate you Anarchist". But I think about half of the people would accept these.
What these have in common is that they are isomorphic (1:1) and highlight a different aspect of the thing.
This sounds like a clear mapping candidate and I am happy to classify it as such, but there is a bit of a case to be made that
is not a real mapping per se. Still I think it is just about in scope, as it is very analogous to skos:broadMatch which is clearly in scope.
Clear examples of not-correspondences
Fuzzy examples of not-correspondences
Alright, this is my first take. Feel free to weigh in; it's a complicated discussion, but ultimately useful to avoid people building ontologies in SSSOM format.
On that, I would like to state that, in my opinion, using For two reasons: First, it weakens the distinction between a “correspondence” and an “association”, or between the SSSOM format and an ontology serialisation format. It’s more difficult to argue that the SSSOM format should not be used to represent an ontology when the SSSOM format explicitly allows ontological relations to be used as mapping predicates. This very discussion is proof of that. Second, I believe that it is not up to the author of a mapping to decide what the OWL or RDF representation of that mapping should be, or, put differently, that a mapping should be distinct from its possible OWL or RDF representation(s). Using (Yes, it is still possible, when a mapping uses I am inclined to think that
@gouttegd thanks for this great analysis and detailed comment. Intellectually I agree with all that you say, but I suspect that other systems/representations will want to be reflected (thinking of OMOP) and these systems use hasChild / hasParent relationships between concepts from different terminologies (that look a lot like subClassOf), in addition to something like mapsTo. Practically it will be hard to draw a strong line. But I agree with the sentiment that we should amend the documentation to:
I think you guys have this covered. What I add is only a tangent. @matentzn I don't really agree w/ your "fuzzy correspondence" examples. Anyway you don't think they map/correspond to each other either. There's probably room for better language to describe these kinds of relationships/associations.
Late to the discussion here. These distinctions are clearly very difficult to make, especially with things like "gene encodes protein" (which fyi is not 1:1 @matentzn). It's not a formal equivalence at all but it's definitely a "mapping" in common usage. HOWEVER I don't think as SSSOM developers we need to lose too much sleep over this. You could dump an ontology into an SSSOM file, using it as a TSV-based triple serialization where all predicates are assumed to be "mappings", but then the SSSOM tooling gives you nothing over what existing RDF/OWL tools provide, so why would you? I'm sure there are lots of examples of tools I can use for the wrong job. Like I could use MS Word to make a crude spreadsheet with tables... or I could use Excel. The MS Word developers don't worry that I might possibly use it to build a spreadsheet. I guess what I'm trying to say is that whether or not some predicate is a "mapping" comes down to whether treating it as a mapping will allow you to extract further value from your data using the SSSOM toolkit. EDIT: though of course we will need to make these calls when publishing SSSOM datasets, and in the case of OXO, etc. -- I just don't think it's an issue with the core SSSOM spec itself. In my view a mapping is whatever you, for your use case, think a mapping is!
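To make the "TSV-based triple serialization" point concrete, here is a minimal Python sketch. It assumes only the usual SSSOM column names (subject_id, predicate_id, object_id, mapping_justification); the helper name triples_to_tsv and all the CURIEs in the rows are made up for illustration, and this is not how the SSSOM toolkit itself writes files.

```python
import csv
import io

# Column names follow the SSSOM TSV convention; everything else here is illustrative.
COLUMNS = ["subject_id", "predicate_id", "object_id", "mapping_justification"]


def triples_to_tsv(triples, justification="semapv:ManualMappingCuration"):
    """Serialize (subject, predicate, object) triples as a tab-separated table."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    for subject, predicate, obj in triples:
        writer.writerow([subject, predicate, obj, justification])
    return buffer.getvalue()


# Hypothetical identifiers: one cross-vocabulary mapping and one ontology axiom.
triples = [
    ("EX:0001", "skos:exactMatch", "OTHER:0001"),
    ("EX:0002", "rdfs:subClassOf", "EX:0001"),
]
print(triples_to_tsv(triples))
```

Both rows have the same shape, so nothing in the table layout alone separates a mapping from an ontology axiom; that judgement sits with whoever chooses the predicates.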
@udp thanks for your perspective. I agree we should not lose sleep. And I also agree we do not need to be normative here. But part of SSSOM is sociological, i.e. an effort that aims at making "things" better, and for that, we should at least have a rough idea of what "thing" we are talking about. So when people ask "what is a mapping", we should have at least some reasonable approximation of an answer to avoid it being misused. Also, in my experience it increases trust among stakeholders if we try to push the community towards less scope creep, so that eventually, the presence of an SSSOM file will be perceived as a work of true artisanship rather than a dump of arbitrary associations.
You are of course correct! Apple and Apple tree is a bad example. However,
Overview
I glanced through the SSSOM docs but I didn't see anything about what a mapping is and is not.
I'm guessing it has something to do with the concept of "degree of equivalence / shared properties". It would be good to distinguish between mapping predicates and some other example relationship predicates.
Could also include a short list of preds SSSOM considers by default (I'm guessing something like skos exact/broad/narrow/close/related & oboInOwl:hasDbXref).
Real world cases
OMOP has ~450 different relationships. It would be nice to be able to show which of those relationships are mapping relationships and why.
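As a rough sketch of what such a triage could look like in Python: the candidate list below is only the guess from this issue (the skos match predicates plus oboInOwl:hasDbXref), not an official SSSOM list, and the function name and the relationship rows are hypothetical.

```python
# Assumed starting list, taken from the guess in this issue; not an official SSSOM list.
CANDIDATE_MAPPING_PREDICATES = frozenset({
    "skos:exactMatch",
    "skos:broadMatch",
    "skos:narrowMatch",
    "skos:closeMatch",
    "skos:relatedMatch",
    "oboInOwl:hasDbXref",
})


def split_by_predicate(rows, mapping_predicates=CANDIDATE_MAPPING_PREDICATES):
    """Partition (subject, predicate, object) rows into mappings and everything else."""
    mappings, others = [], []
    for row in rows:
        predicate = row[1]
        (mappings if predicate in mapping_predicates else others).append(row)
    return mappings, others


# Hypothetical relationships; "EX:hasIngredient" stands in for a non-mapping relation.
rows = [
    ("A:1", "skos:exactMatch", "B:1"),
    ("A:2", "EX:hasIngredient", "B:2"),
    ("A:3", "skos:broadMatch", "B:3"),
]
mappings, others = split_by_predicate(rows)
print(len(mappings), "mapping rows;", len(others), "other rows")
```

Deciding which relationships from a source such as OMOP belong in the candidate set is the part that still needs the documentation this issue asks for.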
Additional details
Context: