REVAI-4324: Multichannel transcript grouping #119

Merged
merged 18 commits into from
Nov 27, 2024

@dmtrrk dmtrrk commented Nov 26, 2024

Overview

Add new optional parameters to the get_transcript_xx functions. They apply only to multichannel media files:

  1. group_channels_by
    Specifies how to group multiple channels in the transcript. This parameter determines the atomic entity for breaking down the transcript into monologues. Only applicable when the submitted media has multiple channels (speaker_channels_count > 1):
  • speaker - groups by speakers
  • word - groups by individual words
  • sentence - groups by complete sentences
  2. group_channels_threshold_ms
    Threshold in milliseconds for handling speaker interruptions. When a speaker interrupts another speaker, this parameter determines how to group the segments. Only applicable when the submitted media has multiple channels (speaker_channels_count > 1):
  • If the interruption occurs within this threshold, preference is given to the most recent speaker
  • If the interruption occurs after this threshold, a new segment is created
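
As a rough illustration of the three grouping values listed above, here is a minimal sketch of a string-valued enum. `GroupChannelsSketch` and `to_query_value` are hypothetical names written for this example; this is not the library's `GroupChannelsType`, whose definition is not shown in this conversation. Overriding `__str__` is one way to make `'{}'.format(...)` produce the raw value.

```python
from enum import Enum


class GroupChannelsSketch(Enum):
    """Hypothetical stand-in for the grouping options described above."""
    SPEAKER = 'speaker'    # group by speakers
    WORD = 'word'          # group by individual words
    SENTENCE = 'sentence'  # group by complete sentences

    def __str__(self):
        # Return the raw value so str.format() yields e.g. 'word'
        # instead of 'GroupChannelsSketch.WORD'.
        return self.value


def to_query_value(option):
    """Format an enum member (or a plain string) for use in a query string."""
    return '{}'.format(option)
```

With this sketch, `to_query_value(GroupChannelsSketch.WORD)` and `to_query_value('word')` give the same string, so callers could pass either form.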

Usage

from rev_ai import apiclient, GroupChannelsType

client = apiclient.RevAiAPIClient(token)
job = client.submit_job_local_file(filePath, speaker_channels_count=2)

# Defaults (word, 1000 ms)
transcript = client.get_transcript_text(job.id)

# Group by word with an explicit threshold
transcript_word = client.get_transcript_text(job.id, group_channels_by=GroupChannelsType.WORD, group_channels_threshold_ms=1000)

# Group by sentence with an explicit threshold
transcript_sentence = client.get_transcript_text(job.id, group_channels_by=GroupChannelsType.SENTENCE, group_channels_threshold_ms=2000)

# Group by speaker
transcript_speaker = client.get_transcript_text(job.id, group_channels_by=GroupChannelsType.SPEAKER)

@dmtrrk dmtrrk changed the title from "REVAI-4324:" to "REVAI-4324: Multichannel transcript grouping" Nov 26, 2024
@dmtrrk dmtrrk marked this pull request as ready for review November 27, 2024 19:15
@dmtrrk dmtrrk requested a review from a team as a code owner November 27, 2024 19:15
@@ -337,95 +337,144 @@ def get_list_of_jobs(self, limit=None, starting_after=None):

return [Job.from_json(job) for job in response.json()]

-    def get_transcript_text(self, id_):
+    def get_transcript_text(self, id_, group_channels_by=None, group_channels_threshold_ms=None):

I would use type hints if possible.

Contributor Author

I don't see type hints used anywhere in this code. I assume that is for Python 2.x compatibility.
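
If Python 2.x compatibility is indeed the constraint, PEP 484 type comments are one possible middle ground, since they are ordinary comments at runtime. The sketch below uses a hypothetical standalone function, `get_transcript_text_sketch`, rather than the client method itself, and the parameter types are assumptions inferred from the overview.

```python
from typing import Optional  # on Python 2 this needs the `typing` backport


def get_transcript_text_sketch(id_, group_channels_by=None,
                               group_channels_threshold_ms=None):
    # type: (str, Optional[str], Optional[int]) -> str
    """Hypothetical stand-in: summarizes its arguments instead of calling the API."""
    return '{} by={} threshold_ms={}'.format(
        id_, group_channels_by, group_channels_threshold_ms)
```

A type checker reads the `# type:` comment as the signature, while the interpreter ignores it.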

+        if group_channels_by is not None:
+            params.append('group_channels_by={}'.format(group_channels_by))
+        if group_channels_threshold_ms is not None:
+            params.append('group_channels_threshold_ms={}'.format(group_channels_threshold_ms))

@dmtrrk I think you said you had doubts about this. I think this is right: we are handling these two parameters independently.
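
The independent handling discussed above can be sketched as a small standalone function. `build_transcript_query` is a hypothetical helper written for this example, not a function in the library; it mirrors only the two `is not None` checks from the diff.

```python
def build_transcript_query(group_channels_by=None,
                           group_channels_threshold_ms=None):
    """Hypothetical helper: build a query string from the optional parameters."""
    params = []
    # Each parameter is checked on its own, so either one may be
    # supplied without the other.
    if group_channels_by is not None:
        params.append('group_channels_by={}'.format(group_channels_by))
    if group_channels_threshold_ms is not None:
        params.append('group_channels_threshold_ms={}'.format(
            group_channels_threshold_ms))
    # Prefix with '?' only when at least one parameter was set.
    return ('?' + '&'.join(params)) if params else ''
```

Because the checks compare against `None` rather than testing truthiness, a threshold of `0` would still be included in the query.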

@alexsku alexsku left a comment

:shipit: looks good to me

@dmtrrk dmtrrk merged commit e36130e into develop Nov 27, 2024
8 checks passed
@dmtrrk dmtrrk deleted the feature/REVAI-4324 branch November 27, 2024 20:14