Have you searched existing issues? 🔎
Describe the bug
Hi everyone, first of all I would like to thank @MaartenGr and all the contributors for this amazing project.
For my project, I need to calculate the entropy of each topic. Could you help me with how to calculate entropy in BERTopic? I tried to compute it from `probs`, but my code failed because `probs` was a one-dimensional array, while my calculation requires a two-dimensional (document × topic) array. Thank you very much!
Reproduction
import numpy as np
import pandas as pd

# `probs` must be the full 2-D (n_docs x n_topics) probability matrix;
# BERTopic only returns this shape when the model is created with
# calculate_probabilities=True. Otherwise `probs` is a 1-D array with one
# value per document, and the axis=1 normalization below fails.
doc_topic_matrix = np.array(probs)
normalized_doc_topic_matrix = doc_topic_matrix / doc_topic_matrix.sum(axis=1, keepdims=True)
topic_entropy = (-normalized_doc_topic_matrix * np.log2(normalized_doc_topic_matrix + 1e-9)).sum(axis=0)
entropy_df = pd.DataFrame({'Topic': range(len(topic_entropy)), 'Entropy': topic_entropy})
topic_freq['Entropy'] = entropy_df['Entropy'].values  # `topic_freq` is assumed to be defined earlier
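For reference, the calculation above does work once `probs` is a two-dimensional matrix. Below is a minimal, self-contained sketch of the same computation using a made-up probability matrix in place of BERTopic's output; the `probs` values here are purely illustrative, and in BERTopic a 2-D `probs` requires building the model with `calculate_probabilities=True`:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for BERTopic's `probs`: a 2-D (n_docs x n_topics)
# document-topic probability matrix (illustrative values only).
probs = np.array([
    [0.70, 0.20, 0.10],
    [0.10, 0.80, 0.10],
    [0.25, 0.25, 0.50],
    [0.40, 0.30, 0.30],
])

# Normalize each document's row into a probability distribution
# (a no-op here since the rows already sum to 1).
doc_topic_matrix = probs / probs.sum(axis=1, keepdims=True)

# Sum -p * log2(p) over documents (axis=0) to get one value per topic;
# the 1e-9 guards against log2(0).
topic_entropy = (-doc_topic_matrix * np.log2(doc_topic_matrix + 1e-9)).sum(axis=0)

entropy_df = pd.DataFrame({"Topic": range(len(topic_entropy)),
                           "Entropy": topic_entropy})
print(entropy_df)
```

With a 1-D `probs` (the default when `calculate_probabilities=False`), `probs.sum(axis=1, ...)` raises an axis error, which matches the failure described above.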
BERTopic Version
0.16.4