Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OpenMRS sync creates duplicate resources. #141

Closed
witash opened this issue Oct 10, 2024 · 8 comments
Closed

OpenMRS sync creates duplicate resources. #141

witash opened this issue Oct 10, 2024 · 8 comments
Assignees
Labels
Type: Technical issue Improve something that users won't notice

Comments

@witash
Copy link
Contributor

witash commented Oct 10, 2024

OpenMRS sync currently subtracts the 2 * SYNC_INTERVAL from the SYNC_PERIOD to decide how long ago to sync. So if it's syncing 1 hour ago every minute, it only syncs resources which were updated in the last 58 minutes.

This works around an issue where it was creating duplicate resources based on resources it had already created, but it does not really fix the core issues and is very confusing.

This workaround should be removed, and duplicate resources should be prevented by fixing the bug in the code that is supposed to be checking unique keys.

@witash witash added the Type: Technical issue Improve something that users won't notice label Oct 10, 2024
@witash witash self-assigned this Oct 10, 2024
@witash witash moved this from Todo to In Progress in Product Team Activities Oct 10, 2024
@witash witash moved this from In Progress to This Week's commitments in Product Team Activities Oct 10, 2024
@witash
Copy link
Contributor Author

witash commented Oct 10, 2024

The 2 * SYNC_INTERVAL from the SYNC_PERIOD is because there is a gap between when a resource is created in one system and when it is created in another system. So resources created near the boundary of SYNC_PERIOD may appear in the query for one system, but be absent from the other, causing the sync to try and create them again.

So the idea was to add a window where a resource in one system that is not in the other is not created. It's not a good solution because creation times for corresponding resources in both systems can be different by much more than 2 * SYNC_INTERVAL if the sync wasn't running or there was an error during the initial sync and it was resolved later.

@njuguna-n njuguna-n self-assigned this Oct 14, 2024
@witash
Copy link
Contributor Author

witash commented Oct 17, 2024

One fix for this could be to make sure that the FHIR server doesn't allow duplicate resources, based on id or any of the other identifiers: OpenMRS ID, CHT Document ID

So if a resource appeared in the comparison list for OpenMRS but not FHIR, but was actually in the FHIR server, the sync would still send a creation request. But it wouldn't matter, it would get a 40X error and not do anything.

@njuguna-n
Copy link

@witash While working on the openmrs-mediator branch the openmrs channel is not created. When checking through the logs I see this logged twice for the two mediators but the OpenMRS mediator does not have a default channel like the default mediator (see screenshots below)
Image
Image

I manually created the openmrs channel to move forward but then I got a 400 error when sending a patient to OpenMRS (see logs below). I have pushed some of my work to the 141-openmrs-duplicate-resources branch but I have left out a bunch of debugging logs I had added. The branch adds a defaultChannelConfig to the OpenMRS mediator, although it didn't work as mentioned above, and fixes some scripts in the startup shell script.

2024-10-21 07:40:59 2024-10-21 04:40:59 INFO: Comparing Patient 
2024-10-21 07:40:59 2024-10-21 04:40:59 INFO: Incoming:  
2024-10-21 07:40:59 2024-10-21 04:40:59 INFO: Outgoing: d1fdf74a-0349-4133-bdee-866b14ad7b79 
2024-10-21 07:40:59 2024-10-21 04:40:59 INFO: Sending Patient d1fdf74a-0349-4133-bdee-866b14ad7b79 to OpenMRS 
2024-10-21 07:40:59 2024-10-21 04:40:59 ERROR: Request failed with status code 400 AxiosError: Request failed with status code 400
2024-10-21 07:40:59     at settle (/app/node_modules/axios/dist/node/axios.cjs:2019:12)
2024-10-21 07:40:59     at BrotliDecompress.handleStreamEnd (/app/node_modules/axios/dist/node/axios.cjs:3135:11)
2024-10-21 07:40:59     at BrotliDecompress.emit (node:events:529:35)
2024-10-21 07:40:59     at endReadableNT (node:internal/streams/readable:1400:12)
2024-10-21 07:40:59     at process.processTicksAndRejections (node:internal/process/task_queues:82:21)
2024-10-21 07:40:59     at Axios.request (/app/node_modules/axios/dist/node/axios.cjs:4287:41)
2024-10-21 07:40:59     at process.processTicksAndRejections (node:internal/process/task_queues:95:5)

I will pick this up from the bug and fix the actual issue with the duplicates once I am back on Monday (28th of this month) and have the defaultChannelConfig automatic setup be a separate issue.

@njuguna-n njuguna-n moved this from In Progress to Next Week's Commitments in Product Team Activities Oct 21, 2024
@witash
Copy link
Contributor Author

witash commented Oct 21, 2024

setting up the defaultChannelConfig is in this branch #139
the 400 is probably because of #140 . for e2e-tests the workflow adds identifier types automatically now (in #139) but for manual testing, they have to be added manually (or start everything with the e2e pipeline and leave it running)

Hopefully by the time you're back we'll have these merged to openmrs_mediator

@njuguna-n
Copy link

@witash I added a duplicate updateFhirResource request in this function to make it easier to reproduce this bug and as you can see from the logs below there is only one patient created from OpenMRS as expected. Given how hard this is to reproduce I think we can close this and reopen it if and when we encounter it again. What do you think?

The two requests to FHIR
Image

GET results from FHIR

{
  "resourceType": "Bundle",
  "id": "cc315011-5d11-4f4d-899b-7a5468dba2c0",
  "meta": {
    "lastUpdated": "2024-10-31T09:28:36.594+00:00"
  },
  "type": "searchset",
  "total": 3,
  "link": [ {
    "relation": "self",
    "url": "http://hapi-fhir:8080/fhir/Patient"
  } ],
  "entry": [ {
    "fullUrl": "http://hapi-fhir:8080/fhir/Patient/6f81b5eb-f6f5-478a-b393-7e3108093a8f",
    "resource": {
      "resourceType": "Patient",
      "id": "6f81b5eb-f6f5-478a-b393-7e3108093a8f",
      "meta": {
        "versionId": "1",
        "lastUpdated": "2024-10-31T06:56:29.222+00:00",
        "source": "#GwgomKKSxGcdNxtk"
      },
      "text": {
        "status": "generated",
        "div": "<div xmlns=\"http://www.w3.org/1999/xhtml\"><div class=\"hapiHeaderText\">Mother <b>1 </b></div><table class=\"hapiPropertyTable\"><tbody><tr><td>Identifier</td><td>73325</td></tr><tr><td>Date of birth</td><td><span>13 March 1991</span></td></tr></tbody></table></div>"
      },
      "identifier": [ {
        "use": "official",
        "type": {
          "text": "CHT Patient ID"
        },
        "value": "73325"
      }, {
        "use": "secondary",
        "type": {
          "text": "CHT Document ID"
        },
        "value": "6f81b5eb-f6f5-478a-b393-7e3108093a8f"
      } ],
      "name": [ {
        "family": "1",
        "given": [ "Mother" ]
      } ],
      "gender": "female",
      "birthDate": "1991-03-13"
    },
    "search": {
      "mode": "match"
    }
  }, {
    "fullUrl": "http://hapi-fhir:8080/fhir/Patient/830c4cd7-e85b-445f-b5c4-373155696861",
    "resource": {
      "resourceType": "Patient",
      "id": "830c4cd7-e85b-445f-b5c4-373155696861",
      "meta": {
        "versionId": "1",
        "lastUpdated": "2024-10-31T07:02:54.594+00:00",
        "source": "#uY7zaYayJPicJNpT"
      },
      "text": {
        "status": "generated",
        "div": "<div xmlns=\"http://www.w3.org/1999/xhtml\"><div class=\"hapiHeaderText\">Daughter <b>1 </b></div><table class=\"hapiPropertyTable\"><tbody><tr><td>Identifier</td><td>32700</td></tr><tr><td>Date of birth</td><td><span>20 October 2020</span></td></tr></tbody></table></div>"
      },
      "identifier": [ {
        "use": "official",
        "type": {
          "text": "CHT Patient ID"
        },
        "value": "32700"
      }, {
        "use": "secondary",
        "type": {
          "text": "CHT Document ID"
        },
        "value": "830c4cd7-e85b-445f-b5c4-373155696861"
      } ],
      "name": [ {
        "family": "1",
        "given": [ "Daughter" ]
      } ],
      "gender": "female",
      "birthDate": "2020-10-20"
    },
    "search": {
      "mode": "match"
    }
  }, {
    "fullUrl": "http://hapi-fhir:8080/fhir/Patient/c7333fd6-35eb-4dda-a0b3-5b89472316e3",
    "resource": {
      "resourceType": "Patient",
      "id": "c7333fd6-35eb-4dda-a0b3-5b89472316e3",
      "meta": {
        "versionId": "1",
        "lastUpdated": "2024-10-31T09:23:16.153+00:00",
        "source": "openmrs#k5mmMhCGFsdh45ge"
      },
      "text": {
        "status": "generated",
        "div": "<div xmlns=\"http://www.w3.org/1999/xhtml\"><table class=\"hapiPropertyTable\"><tbody><tr><td>Id:</td><td>c7333fd6-35eb-4dda-a0b3-5b89472316e3</td></tr><tr><td>Identifier:</td><td><div>10000X</div></td></tr><tr><td>Active:</td><td>true</td></tr><tr><td>Name:</td><td> Test <b>OPENMRS1 </b></td></tr><tr><td>Gender:</td><td>FEMALE</td></tr><tr><td>Birth Date:</td><td>12/10/1990</td></tr><tr><td>Deceased:</td><td>false</td></tr><tr><td>Address:</td><td/></tr></tbody></table></div>"
      },
      "contained": [ {
        "resourceType": "Provenance",
        "id": "7cfa27d1-d42a-422a-9f87-704c0f86ec3a",
        "recorded": "2024-10-31T08:53:05.000+00:00",
        "activity": {
          "coding": [ {
            "system": "http://terminology.hl7.org/CodeSystemv3-DataOperation",
            "code": "CREATE",
            "display": "create"
          } ]
        },
        "agent": [ {
          "type": {
            "coding": [ {
              "system": "http://terminology.hl7.org/CodeSystemprovenance-participant-type",
              "code": "author",
              "display": "Author"
            } ]
          },
          "role": [ {
            "coding": [ {
              "system": "http://terminology.hl7.org/CodeSystemv3-ParticipationType",
              "code": "AUT",
              "display": "author"
            } ]
          } ],
          "who": {
            "reference": "Practitioner/82f18b44-6814-11e8-923f-e9a88dcb533f",
            "type": "Practitioner",
            "display": "Super User"
          }
        } ]
      } ],
      "identifier": [ {
        "id": "a7a59b4b-2d22-4cc4-879c-695cb03edfd2",
        "extension": [ {
          "url": "http://fhir.openmrs.org/ext/patient/identifier#location",
          "valueReference": {
            "reference": "Location/b1a8b05e-3542-4037-bbd3-998ee9c40574",
            "type": "Location",
            "display": "Inpatient Ward"
          }
        } ],
        "use": "official",
        "type": {
          "text": "OpenMRS ID"
        },
        "value": "10000X"
      }, {
        "use": "secondary",
        "type": {
          "text": "OpenMRS Patient UUID"
        },
        "value": "c7333fd6-35eb-4dda-a0b3-5b89472316e3"
      } ],
      "active": true,
      "name": [ {
        "id": "99bf9d6c-4558-4b44-8577-18573251ad4f",
        "family": "OpenMRS1",
        "given": [ "Test" ]
      } ],
      "gender": "female",
      "birthDate": "1990-10-12",
      "deceasedBoolean": false,
      "address": [ {
        "id": "36e90179-89ec-4202-ad6d-db4b5cc504b3",
        "extension": [ {
          "url": "http://fhir.openmrs.org/ext/address",
          "extension": [ {
            "url": "http://fhir.openmrs.org/ext/address#address1",
            "valueString": "1234"
          } ]
        } ],
        "use": "home"
      } ]
    },
    "search": {
      "mode": "match"
    }
  } ]
}

@witash
Copy link
Contributor Author

witash commented Nov 1, 2024

I was able to reproduce part of the issue, although not the whole thing.

  1. Sending requests with different ids but the same OpenMRS ID, CHT Patient ID or CHT Document ID creates duplicate resources. We may have to enforce uniqueness of these identifiers in the mediator or get HAPI to do it somehow.
  2. openmrs_sync can create duplicate form submissions in CHT. so if there is an OPENMRS_PATIENT form in CHT to receive patients from OpenMRS and openmrs_sync doesn't see an already created FHIR patient, it will create a new OPENMRS_PATIENT report in CHT, even if updateFhirResource doesn't create a duplicate patient on the FHIR server.

What I think may have happened in the gandaki dev instance instead is that then, duplicate CHT patients were created, which then created new patients on the FHIR Server.
But I wasn't able to reproduce that part yet, and just looking at the gandaki config it doesn't look like it should be possible; outbound push for patients created by openmrs should only update the CHT Patient ID and CHT Document ID of existing patients in FHIR

@njuguna-n
Copy link

Sending requests with different ids but the same OpenMRS ID, CHT Patient ID or CHT Document ID creates duplicate resources.

I have a PR that addresses this issue by adding a check by an identifier for patient records. I'll handle the same problem for encounters in this issue.

openmrs_sync can create duplicate form submissions in CHT.

The same PR changes when we create the patient in the CHT to only when a new resource is created in FHIR by checking for only the 201 status code.

@njuguna-n
Copy link

PR merged.

@github-project-automation github-project-automation bot moved this from In Progress to Done in Product Team Activities Nov 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Technical issue Improve something that users won't notice
Projects
No open projects
Status: Done
Development

No branches or pull requests

2 participants