NLPG

Natural language processing extensions for the PostgreSQL database. It uses a number of pretrained language models to perform common tasks such as translation, classification, sentence embeddings, summarization, and question answering.

API

Translate text

select babel('Hallo','nl','en');
Hello

Sentence Embeddings

select sbert('...');
'[...]'

The output of sbert is compatible with pgvector and can be cast to its vector type.

select vector(sbert('...'));
[...]

The resulting vector supports the pgvector operators for L2 distance (<->), cosine distance (<=>), and negative inner product (<#>).
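To make the three pgvector operators concrete, here is a minimal Rust sketch of the quantities they return for two embeddings of equal length. The function names are hypothetical and chosen for illustration only; this is not code from NLPG or pgvector, and it assumes non-zero vectors for the cosine case.

```rust
/// Euclidean (L2) distance: the quantity pgvector's `<->` operator returns.
fn l2_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter()
        .zip(b)
        .map(|(x, y)| (x - y).powi(2))
        .sum::<f32>()
        .sqrt()
}

/// Plain inner (dot) product of two vectors.
fn inner_product(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// pgvector's `<#>` operator returns the NEGATIVE inner product,
/// so that smaller values mean "more similar", as with the other operators.
fn negative_inner_product(a: &[f32], b: &[f32]) -> f32 {
    -inner_product(a, b)
}

/// Cosine distance (1 - cosine similarity): the quantity `<=>` returns.
/// Assumes neither vector is all zeros.
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let norm = |v: &[f32]| inner_product(v, v).sqrt();
    1.0 - inner_product(a, b) / (norm(a) * norm(b))
}
```

All three operators sort ascending from most to least similar, which is why `<#>` negates the inner product.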

Summarize Text

select summary('...');
...

Ask a Question

select ask_question('Hallo','context');
Hello

Zero-shot Classification

Classify text with a pretrained language transformer.

select zero_shot('text',array['amsterdam','berlin','copenhagen']);
'berlin'

Installation

Make sure that both pgx and rust-bert are installed correctly, and configure your environment variables so that the libtorch shared library can be found.

Run example:

LD_LIBRARY_PATH=${HOME}/Code/libtorch/lib:$LD_LIBRARY_PATH cargo pgx run

Install package:

LD_LIBRARY_PATH=${HOME}/Code/libtorch/lib:$LD_LIBRARY_PATH sudo cargo pgx install

pgvector

This extension can be used together with the pgvector extension. pgvector must be compiled from source and copied into the pgx test installation of PostgreSQL.

cp vector.so ~/.pgx/15.2/pgx-install/lib/postgresql/x
cp vector.control ~/.pgx/15.2/pgx-install/share/postgresql/extension/
cp sql/vector*.sql ~/.pgx/15.2/pgx-install/share/postgresql/extension/

Config

  • RUSTBERT_CACHE: location of the language models; defaults to ~/.cache/.rustbert
  • PGX_IGNORE_RUST_VERSIONS
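
The lookup order described for RUSTBERT_CACHE can be sketched as follows. This is an illustration only, not rust-bert's actual implementation: the function names `resolve_cache_dir` and `cache_dir_from_env` are hypothetical, and the sketch assumes the home directory is taken from the `HOME` environment variable.

```rust
use std::env;
use std::path::PathBuf;

/// Pick the model cache directory: the explicit `RUSTBERT_CACHE` value
/// if one is given, otherwise `<home>/.cache/.rustbert`.
fn resolve_cache_dir(rustbert_cache: Option<&str>, home: &str) -> PathBuf {
    match rustbert_cache {
        Some(dir) => PathBuf::from(dir),
        None => PathBuf::from(home).join(".cache").join(".rustbert"),
    }
}

/// Same lookup, reading both values from the process environment.
/// Assumption: `HOME` is set; an empty string is used if it is not.
fn cache_dir_from_env() -> PathBuf {
    let home = env::var("HOME").unwrap_or_default();
    resolve_cache_dir(env::var("RUSTBERT_CACHE").ok().as_deref(), &home)
}
```

Keeping the pure function separate from the environment lookup makes the default easy to check without touching the real environment.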

Awesome links