A tool to detect whether numerals in financial texts are in-claim or out-of-claim. It was accepted at FinWeb@TheWebConf-2022 (formerly ACM-WWW; CORE rank: A*) (pre-print).
Use it directly from HuggingFace Spaces or Google Colab
The API is available here.
To re-train or re-use the tool locally, refer to requirements.txt for the versions of the Python libraries used while developing this tool.
Training
To train the model, execute the FiNCAT_training.ipynb notebook in the training folder. It needs fincat_utils.py from the main folder and the embeddings/labels stored in the training folder as .csv files. X_train_df.zip must be unzipped to obtain X_train_df.csv. You can obtain the raw data from here.
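The unzip step can be done by hand or from Python. The snippet below is a minimal sketch using only the standard library. The helper name `extract_training_csv` is hypothetical (it is not part of fincat_utils.py), and the sketch assumes the archive sits in the training folder and holds the .csv at its top level.

```python
import zipfile
from pathlib import Path


def extract_training_csv(zip_path, out_dir):
    """Extract every member of zip_path into out_dir and return the .csv paths.

    Hypothetical helper for illustration; not part of the FiNCAT code base.
    """
    zip_path = Path(zip_path)
    out_dir = Path(out_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(out_dir)  # creates out_dir if it does not exist
        names = archive.namelist()
    # Keep only the .csv members, in a stable order.
    return sorted(out_dir / name for name in names if name.endswith(".csv"))


if __name__ == "__main__":
    # Assumed layout: the archive lives in the training folder of the checkout.
    archive_path = Path("training") / "X_train_df.zip"
    if archive_path.exists():
        for csv_path in extract_training_csv(archive_path, archive_path.parent):
            print(f"Extracted {csv_path}")
    else:
        print(f"{archive_path} not found; adjust the path to your checkout")
```

Run it from the repository root before opening the notebook. If your checkout uses a different layout, change `archive_path` accordingly.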
Using the tool locally
To use the tool locally, you do not need to train it, because the model artifacts are already provided. Simply execute the FiNCAT_tool_enhanced_UI.ipynb notebook. More details are provided in the tools folder.
This tool has been built using Google Colab and Gradio. It has been hosted using 🤗 HuggingFace Spaces.
Tool citation:
@inproceedings{ghosh-fiNCAT,
title = "FiNCAT: Financial Numeral Claim Analysis Tool",
author = "Sohom Ghosh and Sudip Kumar Naskar",
year = "2022",
journal = "In Companion Proceedings of the Web Conference 2022 (WWW ’22 Companion)",
url = "https://arxiv.org/abs/2202.00631",
doi = "10.1145/3487553.3524635"
}
@article{fincat2,
title = {FiNCAT-2: An enhanced Financial Numeral Claim Analysis Tool},
journal = {Software Impacts},
volume = {},
pages = {},
year = {2022},
issn = {2665-9638},
doi = {10.1016/j.simpa.2022.100288},
url = {https://www.sciencedirect.com/science/article/pii/S2665963822000367},
author = {Sohom Ghosh and Sudip Kumar Naskar},
}
Dataset and shared task citation:
@inproceedings{finum3,
title={Overview of the NTCIR-16 FinNum-3 Task: Investor’s and Manager’s Fine-grained Claim Detection},
author={Chen, Chung-Chi and Huang, Hen-Hsen and Huang, Yu-Lieh and Takamura, Hiroya and Chen, Hsin-Hsi},
journal={Proceedings of the 16th NTCIR Conference on Evaluation of Information Access Technologies, Tokyo Japan},
year={2022}
}
@inbook{numclaim,
author = {Chen, Chung-Chi and Huang, Hen-Hsen and Chen, Hsin-Hsi},
title = {NumClaim: Investor's Fine-Grained Claim Detection},
year = {2020},
isbn = {9781450368599},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3340531.3412100},
booktitle = {Proceedings of the 29th ACM International Conference on Information & Knowledge Management},
pages = {1973–1976},
numpages = {4}
}
NOTE:
This tool is released under MIT license.
The embeddings and labels are released under Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.