huolter/microVector

Note: quantization untested - will it work? XD

MicroVector

MicroVector is a lightweight Python tool for managing a small-scale vector database. It stores vectors together with their associated documents and supports adding, removing, searching, quantizing, and persisting data to disk. Everything is held in memory and searched with a flat index.

MicroVector follows the SSII paradigm: Stupid Simple, Incredibly Inefficient. :-)
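
To make "flat index" concrete: a flat index has no auxiliary structure, so every query is scored against every stored vector. The sketch below is not MicroVector's code; the function name `flat_top_k` and the choice of cosine similarity are assumptions made for illustration, since the similarity metric is not stated here.

```python
import numpy as np

def flat_top_k(vectors, query, k):
    """Brute-force top-k search: score every stored vector against the query.

    Hypothetical helper, not part of MicroVector. Cosine similarity is an
    assumed metric.
    """
    vectors = np.asarray(vectors, dtype=float)
    query = np.asarray(query, dtype=float)
    # Cosine similarity; the small epsilon avoids division by zero.
    norms = np.linalg.norm(vectors, axis=1) * np.linalg.norm(query)
    scores = vectors @ query / (norms + 1e-12)
    # Sort all scores in descending order and keep the first k indices.
    order = np.argsort(-scores)[:k]
    return [(int(i), float(scores[i])) for i in order]

stored = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
results = flat_top_k(stored, np.array([1.0, 0.1]), k=2)
```

The cost grows linearly with the number of stored vectors, which is fine for small collections and is the "Incredibly Inefficient" half of the paradigm.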

Features

  • Add Node: Add a vector and its associated document to the database.
  • Binary Quantization: Apply n-bit binary quantization to a vector.
  • Remove Node: Remove a node from the database based on its index.
  • Search Top K: Retrieve the top K similar nodes based on a query vector.
  • Save to Disk: Persist the database to a file using pickle.
  • Read from Disk: Load the database from a previously saved file.
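
As a rough illustration of what n-bit quantization can mean, the sketch below maps each float component to one of 2^n integer levels (uniform scalar quantization). This is an assumed scheme, not necessarily what MicroVector does when `num_bits` is passed; the helper names `quantize` and `dequantize` are hypothetical.

```python
import numpy as np

def quantize(vector, num_bits=8):
    """Map each component to an integer code in [0, 2**num_bits - 1].

    Hypothetical helper using uniform scalar quantization (an assumption).
    Codes are stored as uint16, so num_bits must be 16 or less.
    """
    vector = np.asarray(vector, dtype=float)
    lo, hi = float(vector.min()), float(vector.max())
    levels = 2 ** num_bits - 1
    # A constant vector has no range; use a dummy scale to avoid dividing by zero.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = np.round((vector - lo) / scale).astype(np.uint16)
    return codes, lo, scale

def dequantize(codes, lo, scale):
    """Approximate the original floats from the integer codes."""
    return codes.astype(float) * scale + lo

original = np.array([0.0, 0.25, 0.5, 1.0])
codes, lo, scale = quantize(original, num_bits=8)
restored = dequantize(codes, lo, scale)
```

With this scheme the reconstruction error per component is at most half a quantization step, so fewer bits save memory at the price of coarser similarity scores.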

Usage

# Example Usage

from microvector import MicroVectorDB
import numpy as np

# Initialize MicroVectorDB with a specified vector dimension
vector_db = MicroVectorDB(dimension=50)

# Add nodes without quantization
vector_db.add_node(np.random.rand(50), "Document A")
vector_db.add_node(np.random.rand(50), "Document B")

# Add nodes with 8-bit quantization
vector_db.add_node(np.random.rand(50), "Document C", num_bits=8)
vector_db.add_node(np.random.rand(50), "Document D", num_bits=8)

# Remove a node by index
vector_db.remove_node(0)

# Search for the top 5 similar nodes based on a query vector
query_vector = np.random.rand(50)
top_results = vector_db.search_top_k(query_vector, k=5)

# Save the database to disk
vector_db.save_to_disk("vector_db.pkl")

# Read the database from disk
vector_db.read_from_disk("vector_db.pkl")
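
For readers unfamiliar with pickle-based persistence, the round trip behind a save and a load can be sketched with the standard library alone. The `nodes` list below is a hypothetical stand-in for the database state; MicroVector's actual internal layout is not shown here and may differ.

```python
import os
import pickle
import tempfile

import numpy as np

# Hypothetical in-memory state: a list of (vector, document) pairs.
nodes = [
    (np.array([0.1, 0.2]), "Document A"),
    (np.array([0.3, 0.4]), "Document B"),
]

# Write to a temporary directory so the example leaves the working tree alone.
path = os.path.join(tempfile.mkdtemp(), "vector_db.pkl")

# Serialize the whole structure in one call.
with open(path, "wb") as f:
    pickle.dump(nodes, f)

# Load it back; the result is an equal, independent copy.
with open(path, "rb") as f:
    restored = pickle.load(f)
```

Unpickling can execute arbitrary code, so only load database files from sources you trust.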

Next

  • Voronoi Cells
  • Hierarchical Navigable Small-World (HNSW)
  • Examples
  • Benchmarks on speed and memory usage
