Backend Processing #8
This will be implemented as a Google Cloud function.
To fulfill the client's requirements, we need to process the ASR JSON file to extract the necessary information and format it as specified. The work breaks down into three outputs:
- Word-for-word transcription
- Speaker likelihood score vector
- Clinical encounter summary
To start, we can create a Python script to process the JSON file and generate the required outputs. The script will include functions for parsing the JSON, generating the word-for-word transcription and speaker likelihood score vector, and summarizing the encounter. Here's a high-level overview of the tasks we need to perform:
- [ ] Parse the ASR JSON file to extract the details needed for transcription and speaker information.
- [ ] Generate a word-for-word transcription of the encounter and format it as a VA-standard-compliant JSON or HL7 file.
- [ ] Create a CSV file with the speaker likelihood score vector for each word in the transcription.
- [ ] Summarize the encounter based on the transcription and format it in a plain text JSON or HL7 file.
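The first two tasks could be sketched roughly as follows. The thread does not show the ASR JSON schema, so the layout used here (a top-level `words` list whose entries carry `word` and `speaker` keys) is purely an assumption, as are the function names `parse_asr_words` and `build_transcript`. The VA-compliant JSON/HL7 formatting is not attempted; this only shows the parsing and word-for-word assembly step.

```python
import json


def parse_asr_words(asr_json: str) -> list[dict]:
    """Return the word entries from an ASR payload.

    Assumes a hypothetical schema: {"words": [{"word": ..., "speaker": ...}, ...]}.
    """
    payload = json.loads(asr_json)
    return payload.get("words", [])


def build_transcript(words: list[dict]) -> str:
    """Join words in order, starting a new line whenever the speaker changes."""
    lines: list[str] = []
    current_speaker = None
    for entry in words:
        speaker = entry.get("speaker", "unknown")
        if speaker != current_speaker:
            # Speaker changed: open a new labelled line.
            lines.append(f"{speaker}: {entry['word']}")
            current_speaker = speaker
        else:
            # Same speaker: append the word to the current line.
            lines[-1] += " " + entry["word"]
    return "\n".join(lines)


# Made-up sample input, not real ASR output.
sample = json.dumps({"words": [
    {"word": "Hello", "speaker": "clinician"},
    {"word": "there", "speaker": "clinician"},
    {"word": "Hi", "speaker": "patient"},
]})
transcript = build_transcript(parse_asr_words(sample))
```

With the sample above, `transcript` holds two lines, one per speaker turn. The field names would need to be adjusted to whatever the real ASR pipeline emits.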
Opted to do the work on the client side.
Implement an endpoint or service that can take the raw output of the ASR pipeline and transform it into the CSV format required by the VA, with likelihood scores for speaker identification.
The development and testing of this transformation service could take several days, considering the need to accurately reflect the diarization data.
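A minimal sketch of the CSV transformation, under stated assumptions: each word entry is taken to carry a `speaker_scores` mapping from speaker label to likelihood, which is a hypothetical field since the real diarization output is not shown here. The column layout (`index`, `word`, then one column per speaker) is likewise a guess, not the VA-required format.

```python
import csv
import io


def scores_to_csv(words: list[dict], speakers: list[str]) -> str:
    """Write one CSV row per word with a likelihood column for each speaker.

    Assumes each entry looks like
    {"word": "Hello", "speaker_scores": {"clinician": 0.9, "patient": 0.1}}.
    Speakers missing from an entry's scores default to 0.0.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["index", "word", *speakers])
    for i, entry in enumerate(words):
        scores = entry.get("speaker_scores", {})
        writer.writerow([i, entry["word"], *(scores.get(s, 0.0) for s in speakers)])
    return buffer.getvalue()


# Made-up sample input, not real diarization data.
sample_words = [
    {"word": "Hello", "speaker_scores": {"clinician": 0.9, "patient": 0.1}},
    {"word": "Hi", "speaker_scores": {"clinician": 0.2, "patient": 0.8}},
]
csv_text = scores_to_csv(sample_words, ["clinician", "patient"])
```

Passing the speaker list explicitly keeps the column order stable across files, even when a given word has no score for one of the speakers.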