Hello! I may have found an issue with the MIMIC OMOP CDM mapping for the note table's note_type_concept_id column. The existing ETL script uses a lookup table to populate this column, and at first glance the chosen OMOP concepts all seem to make sense: they are all from the "Note Type" vocabulary and seem semantically appropriate (e.g., OMOP4822279 for discharge summaries).
However, the CDM documentation specifies a set of valid concepts for the note_type_concept_id column, and the ones that the MIMIC ETL process uses are not in that set. Upon closer inspection, I noticed that the "Note Type" concepts that the existing ETL process uses are all marked as belonging to the "source concepts" (i.e., "non-standard") subset of the Athena vocabulary. If I'm reading the relevant section of the OHDSI standardized vocabulary documentation correctly, that means they aren't supposed to be used in fields like note_type_concept_id. So instead of using OMOP4822279 for discharge summaries, we maybe ought to be using OMOP4976897.
The situation is a bit confusing to me. My understanding was that "source concepts" were meant for external vocabularies that needed to be mapped into OMOP-land, but the "Note Type" vocabulary entries are all marked as being from an OMOP-authored vocabulary even though they are (apparently) "non-standard", so it's not like they came from some other vocabulary that got pulled in at some point. The concepts also don't seem to have any relationships or ancestors, so they are off by themselves in vocabulary-space. Without knowing more, my best guess is that these particular concepts were valid at some earlier point and that things were later reorganized?
Anyway, that's probably a question for the Athena folks, but bringing it back to MIMIC:
Is there a particular reason why the ETL process is using these particular concepts for note_type_concept_id instead of the ones the documentation is (or appears to be) suggesting?
If not, would a PR be welcome that updated the mapping table for this field? I'd be happy to take a stab at that, if so.
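To make the kind of check I have in mind concrete, here is a minimal sketch that flags mapped concepts not marked as standard. It assumes the OMOP `concept` table is reachable through a DB-API connection and relies only on its `concept_code` and `standard_concept` columns (`'S'` meaning standard). The function names and the toy rows are hypothetical; the two flags in the demo table simply mirror what is described above and are not taken from Athena.

```python
import sqlite3


def build_demo_db():
    """Build a toy in-memory stand-in for the OMOP concept table.

    Assumption: the rows below are illustrative only. They encode the
    situation described in this issue, not verified Athena content.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE concept (concept_code TEXT, standard_concept TEXT)"
    )
    conn.executemany(
        "INSERT INTO concept VALUES (?, ?)",
        [
            ("OMOP4822279", None),  # assumed non-standard (NULL flag)
            ("OMOP4976897", "S"),   # assumed standard
        ],
    )
    return conn


def find_nonstandard(conn, concept_codes):
    """Return the given codes whose standard_concept flag is not 'S'.

    Codes that are absent from the concept table are not reported.
    """
    codes = list(concept_codes)
    if not codes:
        return []
    placeholders = ",".join("?" for _ in codes)
    rows = conn.execute(
        "SELECT concept_code, standard_concept FROM concept "
        f"WHERE concept_code IN ({placeholders})",
        codes,
    ).fetchall()
    # A NULL flag comes back as None, which also counts as non-standard here.
    return sorted(code for code, flag in rows if flag != "S")


if __name__ == "__main__":
    demo = build_demo_db()
    print(find_nonstandard(demo, ["OMOP4822279", "OMOP4976897"]))
```

Running something like this over every value in the ETL lookup table would list the entries that a PR would need to remap.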
Apologies if this is ground that has already been covered somewhere else; I looked at issues and documentation and didn't see anything but might have missed it. Thanks for an amazing dataset and tool!