This repository contains links to parallel corpora for popular Indian languages.
The parallel corpora come from the following sources:
- IIT Bombay v_2.1 (original), 1.5 million sentences. The following groups from the corpus are used: Chats, Movie Dialogs, General, Hi-Eng Word-Linkage, Admin Dictionary, Admin Examples, Admin Definitions, TED Talks, Indic Multi-Parallel, Judicial I and II, Govt Websites I and II, Book Translations, Wikipedia.
Citation: Anoop Kunchukuttan, Pratik Mehta, Pushpak Bhattacharyya. The IIT Bombay English-Hindi Parallel Corpus. Language Resources and Evaluation Conference. 2018. http://www.cfilt.iitb.ac.in/iitb_parallel/
- Augmented data
- Law Commission of India
Prepared from the documents of the Law Commission of India using OCR.
- Indian Judiciary
Contains data scraped from Indian judiciary data sources and translated using Google.
- Names dictionary
Contains names of persons, geographical locations, etc.