kwx 0.0.2.2
The minimum viable product of kwx:
-
Users are able to extract keywords using the following methods
- Most frequent words
- TFIDF words unique to one corpus when compared to others
- Latent Dirichlet Allocation
- Bidirectional Encoder Representations from Transformers
- An autoencoder application of LDA and BERT combined
-
Users are able to tell the model to remove certain words to fine tune results
-
Support is offered for a universal cleaning process in all major languages
-
Visualization techniques to display keywords and topics are included
-
Outputs can be cleanly organized in a directory or zip file
-
Runtimes for topic number comparisons are estimated using tqdm