What would be the best way to approach a classification problem with similar labels? #1

Open
adithyaan-creator opened this issue Jan 16, 2021 · 0 comments

Comments

@adithyaan-creator

Hi @srivatsan88,
I am working on a text classification problem where the labels are quite similar, for example:

  • Bad reputation
  • Customer issues
  • Delays
  • Good reputation

There is major overlap between the first three labels: many examples share common words and could fall into more than one category.
For example, a text about delays could share words with bad reputation, and the same is true of customer issues and bad reputation.
Is there a good approach that would help achieve good metrics on this kind of problem?
Also, what would be an ideal number of data points? Currently there are only about 6,000.

Cheers.

@adithyaan-creator changed the title from "What would be the best way to approach a classification probelm with similar labels?" to "What would be the best way to approach a classification problem with similar labels?" on Jan 16, 2021