Problem when using a custom dictionary #1

AirFin · 2020-04-12T02:47:53Z

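Hello, I ran into some problems while doing sentiment analysis "with a custom dictionary" and would like to ask for your advice.

Problem 1: the dictionary is not recognized. Here is the code.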
`from cnsenti import Sentiment

senti = Sentiment(pos=r"D:\情感词典\负面词典.txt", # relative path to the positive dictionary txt file
neg=r"D:\情感词典\正面词典.txt", # relative path to the negative dictionary txt file
encoding='unicode_escape') # both txt files are UTF-8 encoded

test_text = '这家公司是行业的引领者，是中流砥柱。今年的业绩非常好'
result1 = senti.sentiment_count(test_text)
result2 = senti.sentiment_calculate(test_text)
print('sentiment_count',result1)
print('sentiment_calculate',result2)`

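Here is the output. The positive word count is reported as 0, even though the words "引领者" and "中流砥柱" in this text are both positive words in the dictionary.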
`D:\Anaconda3\python.exe D:/pythonproject/test1/金融文本情感-中文-cnsenti.py
Building prefix dict from the default dictionary ...
Loading model from cache C:\User\AppData\Local\Temp\jieba.cache
Loading model cost 0.675 seconds.
sentiment_count {'words': 15, 'sentences': 2, 'pos': 0, 'neg': 0}
Prefix dict has been built succesfully.
sentiment_calculate {'sentences': 2, 'words': 15, 'pos': 0, 'neg': 0}

Process finished with exit code 0
`

Problem 2: if the text being analyzed ends with a full stop, an error is raised. Here is the code.
`from cnsenti import Sentiment

senti = Sentiment(pos=r"D:\情感词典\负面词典.txt", # relative path to the positive dictionary txt file
neg=r"D:\情感词典\正面词典.txt", # relative path to the negative dictionary txt file
encoding='unicode_escape') # both txt files are UTF-8 encoded

test_text = '这家公司是行业的引领者，是中流砥柱。今年的业绩非常好。'
result1 = senti.sentiment_count(test_text)
result2 = senti.sentiment_calculate(test_text)
print('sentiment_count',result1)
print('sentiment_calculate',result2)`

Here is the output.
`D:\Anaconda3\python.exe D:/pythonproject/test1/金融文本情感-中文-cnsenti.py
Building prefix dict from the default dictionary ...
Loading model from cache C:\Users\AppData\Local\Temp\jieba.cache
Loading model cost 0.685 seconds.
Prefix dict has been built succesfully.
Traceback (most recent call last):
File "D:/pythonproject/test1/金融文本情感-中文-cnsenti.py", line 9, in <module>
result2 = senti.sentiment_calculate(test_text)
File "D:\Anaconda3\lib\site-packages\cnsenti\sentiment.py", line 218, in sentiment_calculate
pos = np.sum(score_array[:, 0])
IndexError: too many indices for array

Process finished with exit code 1
`

Finally, thank you very much for your work on this project. I hope you can help!
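
One possible explanation for the second error, offered only as a sketch: the thread does not show how cnsenti splits sentences, so the helper `split_sentences` below is hypothetical. If the text is split on sentence-ending punctuation, a trailing full stop leaves an empty final segment. An empty segment could yield an empty score row, and NumPy raises `IndexError: too many indices for array` when an array that is not 2-D is indexed as `score_array[:, 0]`.

```python
import re

def split_sentences(text):
    """Hypothetical helper: split on common Chinese sentence-ending punctuation."""
    return re.split(r"[。！？]", text)

text = "这家公司是行业的引领者，是中流砥柱。今年的业绩非常好。"

parts = split_sentences(text)
# A trailing full stop leaves an empty string as the last segment.
non_empty = [p for p in parts if p.strip()]

print(len(parts), len(non_empty))
```

Dropping empty segments before scoring, or stripping the trailing full stop from the input, would be a workaround under this assumption.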

hiDaDeng · 2020-04-12T03:00:32Z

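I just had a look at the code you wrote:
`senti = Sentiment(pos=r"D:\情感词典\负面词典.txt", # relative path to the positive dictionary txt file
neg=r"D:\情感词典\正面词典.txt", # relative path to the negative dictionary txt file
encoding='unicode_escape') # both txt files are UTF-8 encoded`

There are a few possible causes:
1. pos and neg stand for positive and negative respectively, and the arguments you passed are the wrong way round.
2. The encoding parameter takes the encoding of the custom dictionary files. If your txt files are UTF-8, encoding='utf-8' is all you need. I'm not familiar with unicode_escape and don't know whether it could also cause problems.
3. cnsenti uses jieba for word segmentation, so it's possible that jieba segments new sentiment words incorrectly. I haven't developed this part in cnsenti yet; I'll improve it later.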
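
On the encoding point, a minimal sketch of why `unicode_escape` is a poor fit for a UTF-8 file: that codec treats each non-escape byte as a Latin-1 character, so multi-byte UTF-8 Chinese words decode to unrelated characters and can no longer match the text. The word list here is illustrative, and whether cnsenti reads its dictionaries exactly this way is an assumption.

```python
# Two sample dictionary words, one per line, as they would be stored in a UTF-8 txt file.
words = ["引领者", "中流砥柱"]
raw = "\n".join(words).encode("utf-8")

# Decoding with the codec that matches the file recovers the words.
as_utf8 = set(raw.decode("utf-8").split("\n"))

# unicode_escape maps each byte to a Latin-1 character, giving mojibake.
as_escape = set(raw.decode("unicode_escape").split("\n"))

print("中流砥柱" in as_utf8)
print("中流砥柱" in as_escape)
```

With `unicode_escape` the dictionary still loads without an error, which may be why the problem shows up only as a zero count.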

AirFin · 2020-04-12T03:08:20Z

Thank you very much for your reply!
1. I had the positive and negative dictionaries swapped.
2. I changed the encoding to "utf-8", but the sentiment words are still not recognized.

Also, regarding the second problem, the error raised when the analyzed text contains a full stop: is there a way to fix it? The code is as follows.
`from cnsenti import Sentiment

senti = Sentiment(pos=r"D:\情感词典\正面词典.txt", # relative path to the positive dictionary txt file
neg=r"D:\情感词典\负面词典.txt", # relative path to the negative dictionary txt file
encoding='utf-8') # both txt files are UTF-8 encoded

test_text = '这家公司是行业的引领者，是中流砥柱。今年的业绩非常好。'
result1 = senti.sentiment_count(test_text)
result2 = senti.sentiment_calculate(test_text)
print('sentiment_count',result1)
print('sentiment_calculate',result2)`

The output is as follows.
`D:\Anaconda3\python.exe D:/pythonproject/test1/金融文本情感-中文-cnsenti.py
Building prefix dict from the default dictionary ...
Loading model from cache C:\Users\AppData\Local\Temp\jieba.cache
Loading model cost 0.739 seconds.
Prefix dict has been built succesfully.
Traceback (most recent call last):
File "D:/pythonproject/test1/金融文本情感-中文-cnsenti.py", line 9, in <module>
result2 = senti.sentiment_calculate(test_text)
File "D:\Anaconda3\lib\site-packages\cnsenti\sentiment.py", line 218, in sentiment_calculate
pos = np.sum(score_array[:, 0])
IndexError: too many indices for array

Process finished with exit code 1
`

Thank you very much for your help!

hiDaDeng · 2020-04-12T03:31:10Z

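On the first problem: I ran the code here and it recognized 中流砥柱, so I see no issue. The second problem you raised is the one I'm working on now.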

AirFin · 2020-04-12T03:32:27Z

OK. Thank you again!

hiDaDeng · 2020-04-12T04:13:31Z

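I've updated the code. You can uninstall cnsenti, wait half an hour, and then install the new cnsenti. Here is the cnsenti documentation; the multi-sentence problem is now fixed: https://github.com/thunderhit/cnsenti/blob/master/README.md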

AirFin · 2020-04-12T04:14:45Z

OK. Thank you very much! I wish you all the best with your work!