GitHub - jleguina/entity-normalization: Entity normalization engine for Vector.ai

About The Project

This project is an entity normalisation engine developed for the Vector AI recruitment process. It supports entity normalisation for the following types of entities:

- Companies, businesses;
- Products, objects;
- Locations, cities, countries;
- Serial numbers;
- Street addresses.

The model takes as input a stream of strings in the classes above. There is no context provided for each entity.

The model normalises the first three types of entities to suitable Wikipedia articles. Because serial numbers and street addresses are unique, the latter two types are instead normalised by the linguistic similarity of the input entities to one another, measured with the Levenshtein distance.
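As an illustration of the similarity-based step, the following is a minimal sketch of Levenshtein-based grouping. It is not taken from `entity_norm.py`: the function names (`levenshtein`, `similarity`, `normalise`), the lower-casing step and the `0.7` threshold are all assumptions chosen for the example.

```python
from typing import List


def levenshtein(a: str, b: str) -> int:
    """Return the minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Scale the distance to [0, 1], where 1.0 means identical strings."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def normalise(entity: str, clusters: List[List[str]], threshold: float = 0.7) -> int:
    """Add entity to its most similar cluster, or open a new one.

    Returns the index of the cluster the entity was placed in.
    The threshold is a hypothetical value, not one used by the project.
    """
    key = entity.strip().lower()
    best_index, best_score = None, 0.0
    for index, members in enumerate(clusters):
        score = max(similarity(key, m.strip().lower()) for m in members)
        if score > best_score:
            best_index, best_score = index, score
    if best_index is not None and best_score >= threshold:
        clusters[best_index].append(entity)
        return best_index
    clusters.append([entity])
    return len(clusters) - 1
```

Feeding entities one at a time mirrors the stream-of-strings input described above: each new string either joins an existing group of near-duplicates or starts a group of its own.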

The model accepts entities in any language supported by the Google Translator API.

Getting Started

To set up this project:

Clone the GitHub repo:
```
git clone https://github.com/jleguina0/entity-normalization.git
```

Create a suitable virtual environment and install dependencies:
- With conda:
```
cd entity-normalization
conda env create -f environment.yml
conda activate entity-norm37
```
- Alternatively, create a virtual environment with Python 3.7 and run:
```
pip install -r requirements.txt
```
To run the normalization engine with some predefined examples in various languages:
```
python entity_norm.py
```

Contact

Javier Leguina Peral - [email protected]

Project Link: https://github.com/jleguina0/entity-normalization
