DAMSL tags don't match #1
Comments
@lzfelix You are right. When I processed this dataset, I did not find any "% -" tag either. I also could not work out how to handle the "+" tag: it is not among the 42 tags, yet it occurs more than 10,000 times in the original dataset. Should a "+" tag be replaced with the tag of the previous utterance from the same speaker?
@ruizheliUOA Same here, I don't know what to do with the "+" tag. I have checked many papers but found no answer. Replacing it with the tag of the previous utterance from the same speaker seems reasonable. Reference:
To my understanding, you can either do that or simply disregard these utterances, depending on your problem.
FYI, this paper mentions the "+" label (finally...)
Utterances marked as + are interrupted conversations |
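For anyone landing here, the two options discussed above (inherit the previous tag from the same speaker, or drop the utterance) could be sketched roughly as follows. This is only an illustration, not code from this repository: `resolve_plus_tags` and the `(speaker, tag)` tuple layout are hypothetical, and it assumes the utterances of a single conversation are given in order.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical layout: one (speaker, tag) pair per utterance, in dialogue order.
Utterance = Tuple[str, str]


def resolve_plus_tags(utterances: Iterable[Utterance], drop: bool = False) -> List[Utterance]:
    """Handle "+" (continuation) tags within one conversation.

    drop=False: replace "+" with the last non-"+" tag of the same speaker.
    drop=True:  discard "+" utterances entirely.
    A "+" with no earlier tag from that speaker is discarded in both modes.
    """
    last_tag: Dict[str, str] = {}  # speaker -> most recent non-"+" tag
    resolved: List[Utterance] = []
    for speaker, tag in utterances:
        if tag == "+":
            if drop or speaker not in last_tag:
                continue
            tag = last_tag[speaker]
        else:
            last_tag[speaker] = tag
        resolved.append((speaker, tag))
    return resolved


dialogue = [("A", "sd"), ("B", "b"), ("A", "+"), ("B", "+"), ("A", "qy")]
inherited = resolve_plus_tags(dialogue)
dropped = resolve_plus_tags(dialogue, drop=True)
```

Which mode is appropriate depends on the task, as noted above; inheriting keeps the utterance count unchanged, while dropping avoids guessing a label.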
First of all, thank you for making this code available.
In section 1c of the coder's manual we can see the table with the 42 clustered labels, although it has 43 rows, as you mention on your page. However, one of these classes is "% -", which cannot be found in the dataset (I scanned it and found 0 matches). If the classes "% -" and "%" are merged (since both have a similar meaning), we are back to 42 classes, as desired. This seems to have been done in the Stolcke et al. [1] paper, as shown in Table 2. I've also noticed that on your page "% -" has the same full count as "%".
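A minimal sketch of the merge I have in mind, assuming the clustered tags are already available as plain strings (`normalize_tag` and `count_tags` are names I made up for illustration, not functions from this repository):

```python
from collections import Counter
from typing import Iterable


def normalize_tag(tag: str) -> str:
    """Fold the "% -" class into "%"; leave every other tag unchanged."""
    tag = tag.strip()
    return "%" if tag in ("% -", "%-") else tag


def count_tags(tags: Iterable[str]) -> Counter:
    """Count clustered tags after merging "% -" into "%"."""
    return Counter(normalize_tag(t) for t in tags)


# Toy input, not real dataset counts.
counts = count_tags(["%", "% -", "sd", "%-", "b"])
```

With this merge applied, the number of distinct keys in the counter is the number of classes actually present, which is how the 43 rows of the table would collapse to 42.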
[1] Stolcke, Andreas, et al. "Dialogue act modeling for automatic tagging and recognition of conversational speech." Computational Linguistics 26.3 (2000): 339-373.
Thanks.