Sokhan is a fast, lightweight framework for Persian natural language processing (NLP). It gives developers and researchers advanced NLP capabilities with high accuracy and fast performance. With full support for the Persian language, it is suitable for personal, research, and commercial projects.
Video > Voice > Image > Last IT Text
Install Windows
pip install Sokhan
Install Linux/Mac
python3 -m pip install Sokhan
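Take it easy, just do it:
from sokhan.core.normalize import Sokhan
# Create a normalizer instance (lowercase name so the Sokhan class is not shadowed)
sokhan = Sokhan()
text = 'من اولین پیام شما هستم به دنیای سخن خوش آمدید !'
# Call the instance on the text to process it
sokhan(text)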
CLI
-------------
'من اولین پیام شما هستم به دنیای سخن خوش آمدید !'
Important
Fast, optimized, not free, and requires no GPU or heavy processing.
Languages:
- Supports Persian
- Supports English
Features:
- Supports CPU/GPU (no special requirements)
- Super fast processing / turbo fast processing
- Configuration with .json files
- C-based core (for speed)
- Web API
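As a rough illustration of the ".json" configuration idea, the sketch below merges user settings over a set of defaults. The key names (`language`, `remove_emoji`, `fix_spacing`) and the `load_config` helper are hypothetical and are not taken from Sokhan's actual schema or API.

```python
import json

# Hypothetical configuration keys, for illustration only (not Sokhan's real schema).
DEFAULT_CONFIG = {
    "language": "fa",
    "remove_emoji": False,
    "fix_spacing": True,
}


def load_config(json_text: str) -> dict:
    """Merge user settings from a JSON string over the defaults."""
    user_settings = json.loads(json_text)
    # Reject keys that the defaults do not define, so typos surface early.
    unknown = set(user_settings) - set(DEFAULT_CONFIG)
    if unknown:
        raise KeyError(f"unknown config keys: {sorted(unknown)}")
    return {**DEFAULT_CONFIG, **user_settings}


config = load_config('{"remove_emoji": true}')
print(config)
```

In practice the JSON text would come from a file on disk; a string is used here so the sketch stays self-contained.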
Framework Modules:
- Normalizer
- Informal Normalizer
- Summarizer
- Twitter Normalizer
- Instagram Normalizer
- YouTube Normalizer
- Telegram Normalizer
- WhatsApp Normalizer
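To give a feel for what a Persian normalizer typically does, here is a minimal standalone sketch. It does not use or reproduce Sokhan's implementation; the `normalize_persian` function and its two rules (mapping Arabic letter variants to their Persian forms and collapsing repeated whitespace) are assumptions chosen for illustration.

```python
import re

# Arabic code points that often appear in Persian text, mapped to Persian ones.
CHAR_MAP = str.maketrans({
    "\u064A": "\u06CC",  # Arabic yeh -> Persian yeh
    "\u0643": "\u06A9",  # Arabic kaf -> Persian keheh
})


def normalize_persian(text: str) -> str:
    """Illustrative normalizer: unify letter variants and collapse whitespace."""
    text = text.translate(CHAR_MAP)
    # The zero-width non-joiner (U+200C) is not whitespace, so it is preserved.
    return re.sub(r"\s+", " ", text).strip()


print(normalize_persian("\u0639\u0644\u064A  \u0643\u062A\u0627\u0628"))
```

A real normalizer usually covers many more cases (digits, diacritics, half-spaces, punctuation), which is where a dedicated library earns its keep.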
Sokhan vs AI
In this test, the AI models run on GPUs of at least H100 class, while the Sokhan library runs on minimal hardware resources.
Lang/Module | Sokhan* | GPTv4 | DeepSeek | Gemini | Gama | Grok v3 | Shiraz |
---|---|---|---|---|---|---|---|
Count Keyword | 0.00000001 | No | No | No | No | No | Yes |
Sokhan Language Benchmark
Lang/Module | Sokhan* | NLTK | Fast Text | SpaCy | Regex | Scikit-Learn | Hazm |
---|---|---|---|---|---|---|---|
Support Persian | Yes | No | No | No | No | No | Yes |
Support English | Yes | No | No | No | No | No | No |
Support Arabic | Coming... | No | No | No | No | No | No |
Support Russian | Coming... | No | No | No | No | No | No |
Support French | Coming... | No | No | No | No | No | No |
Support Italian | Coming... | No | No | No | No | No | No |
Support Spanish | Coming... | No | No | No | No | No | No |
Speed Benchmark:
- CPU: 2 cores | No GPU | RAM: 8 GB | Python 3.10
- 10 GB dataset of comments about Ronaldo.
Feature/Module | Sokhan* | NLTK | Fast Text | SpaCy | Regex | Scikit-Learn | Hazm |
---|---|---|---|---|---|---|---|
Normalize | 0.00000016123 | ------- | ------- | ------- | ------- | ------- | ------- |
Informal Normalize | 0.00000016123 | ------- | ------- | ------- | ------- | ------- | ------- |
Summarize | 0.00000016123 | ------- | ------- | ------- | ------- | ------- | ------- |
Clean Text | 0.00000016123 | ------- | ------- | ------- | ------- | ------- | ------- |
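The README does not state how the timings above were measured or in which unit. For readers who want to run a comparable measurement on their own machine, the sketch below shows one common approach with `time.perf_counter`. The helper name `average_seconds_per_call` and the stand-in function `str.strip` are illustrative assumptions; substitute whichever callable you want to time.

```python
import time
from typing import Callable, Iterable


def average_seconds_per_call(
    func: Callable[[str], object],
    texts: Iterable[str],
    repeats: int = 3,
) -> float:
    """Return the best-of-`repeats` average wall-clock time per call."""
    samples = list(texts)
    if not samples:
        raise ValueError("texts must not be empty")
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for text in samples:
            func(text)
        # Keep the fastest run to reduce noise from other processes.
        best = min(best, time.perf_counter() - start)
    return best / len(samples)


# str.strip is only a stand-in for the function being benchmarked.
per_call = average_seconds_per_call(str.strip, ["  sample comment  "] * 1000)
print(f"{per_call:.11f} seconds per call")
```

Taking the best of several runs is a design choice that favors repeatability; reporting the mean or median instead is equally defensible as long as the method is stated.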
Dataset Benchmark:
Dataset/Module | Sokhan* | NLTK | Fast Text | SpaCy | Regex | Scikit-Learn | Hazm |
---|---|---|---|---|---|---|---|
Twitter | Yes | ------- | ------- | ------- | ------- | ------- | ------- |
Instagram | Yes | ------- | ------- | ------- | ------- | ------- | ------- |
YouTube | Yes | ------- | ------- | ------- | ------- | ------- | ------- |
Telegram | Yes | ------- | ------- | ------- | ------- | ------- | ------- |
Bale.io | Yes | ------- | ------- | ------- | ------- | ------- | ------- |
Eitaa | Yes | ------- | ------- | ------- | ------- | ------- | ------- |
TikTok | Yes | ------- | ------- | ------- | ------- | ------- | ------- |
Rubika | Yes | ------- | ------- | ------- | ------- | ------- | ------- |
Hamshahri | Yes | ------- | ------- | ------- | ------- | ------- | ------- |
Forbes | Yes | ------- | ------- | ------- | ------- | ------- | ------- |
Requirements:
- Python 3.10
- Cython
Suitable for:
- Researchers
- Students
- Business products
- Startup Projects