LibIndic's chardetails module can be used to get the details of each character in a given Unicode string.
- Clone the repository
git clone https://github.com/libindic/chardetails.git
- Change to the cloned directory
cd chardetails
- Run setup.py to create an installable source distribution
python setup.py sdist
- Install using pip
pip install dist/chardetails*.tar.gz
Input: String of Unicode characters
Output: Dictionary containing details of each character
>>> from libindic.chardetails import getInstance
>>> tool = getInstance()
>>> tool.getdetails(u'ആന')
{'Characters': [u'\u0d06', u'\u0d28'],
u'\u0d06': {'AlphaNumeric': 'True',
'Alphabet': 'True',
'Canonical Decomposition': '',
'Code point': "u'\\u0d06'",
'Digit': 'False',
'HTML Entity': '3334',
'Name': 'MALAYALAM LETTER AA'},
u'\u0d28': {'AlphaNumeric': 'True',
'Alphabet': 'True',
'Canonical Decomposition': '',
'Code point': "u'\\u0d28'",
'Digit': 'False',
'HTML Entity': '3368',
'Name': 'MALAYALAM LETTER NA'}}
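To make the fields in the output above easier to interpret, here is a minimal sketch that derives similar values using only Python's standard `unicodedata` module. It is not the chardetails implementation: the function name `char_details` is hypothetical, the sketch assumes Python 3, and the `Code point` value is formatted as `U+XXXX` rather than as the `repr` string shown above.

```python
import unicodedata


def char_details(text):
    """Hypothetical helper (not part of chardetails): collect basic
    Unicode properties for every character in `text`."""
    details = {"Characters": list(text)}
    for ch in text:
        details[ch] = {
            # Official Unicode name; empty string if the character has none
            "Name": unicodedata.name(ch, ""),
            # Assumption: U+XXXX notation instead of the library's repr string
            "Code point": "U+%04X" % ord(ch),
            # Decimal code point, as used in an HTML numeric entity (&#NNNN;)
            "HTML Entity": str(ord(ch)),
            # Flags are stored as strings to mirror the output shown above
            "Alphabet": str(ch.isalpha()),
            "Digit": str(ch.isdigit()),
            "AlphaNumeric": str(ch.isalnum()),
            # Empty string when the character has no decomposition mapping
            "Canonical Decomposition": unicodedata.decomposition(ch),
        }
    return details


# "\u0d06\u0d28" is the same Malayalam string used in the example above
info = char_details("\u0d06\u0d28")
print(info["\u0d06"]["Name"])
print(info["\u0d28"]["HTML Entity"])
```

The `HTML Entity` value is simply the decimal code point, so `3334` corresponds to the entity `&#3334;` for U+0D06.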
For more details, read the docs.
To run the tests:
cd chardetails
python setup.py test