Awesome LLMOps

An awesome & curated list of the best LLMOps tools for developers.

Note

Contributions are most welcome; please adhere to the contribution guidelines.

Model

Large Language Model

Project	Details	Repository
Alpaca	Code and documentation to train Stanford's Alpaca models, and generate the data.
BELLE	A 7B large language model fine-tuned on a 34B Chinese character corpus, based on LLaMA and Alpaca.
Bloom	BigScience Large Open-science Open-access Multilingual Language Model
dolly	Databricks’ Dolly, a large language model trained on the Databricks Machine Learning Platform
Falcon 40B	Falcon-40B-Instruct is a 40B-parameter causal decoder-only model built by TII, based on Falcon-40B and fine-tuned on a mixture of Baize. It is made available under the Apache 2.0 license.
FastChat (Vicuna)	An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and FastChat-T5.
Gemma	Gemma is a family of lightweight, open models built from the research and technology that Google used to create the Gemini models.
GLM-6B (ChatGLM)	An Open Bilingual Pre-Trained Model, quantization of ChatGLM-130B, can run on consumer-level GPUs.
ChatGLM2-6B	ChatGLM2-6B is the second-generation version of the open-source bilingual (Chinese-English) chat model ChatGLM-6B.
GLM-130B (ChatGLM)	An Open Bilingual Pre-Trained Model (ICLR 2023)
GPT-NeoX	An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
Luotuo	A Chinese LLM based on LLaMA and fine-tuned with Stanford Alpaca, Alpaca LoRA, and Japanese-Alpaca-LoRA.
Mixtral-8x7B-v0.1	The Mixtral-8x7B Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts.
StableLM	StableLM: Stability AI Language Models

⬆ back to ToC

CV Foundation Model

Project	Details	Repository
disco-diffusion	A frankensteinian amalgamation of notebooks, models and techniques for the generation of AI Art and Animations.
midjourney	Midjourney is an independent research lab exploring new mediums of thought and expanding the imaginative powers of the human species.
segment-anything (SAM)	Produces high-quality object masks from input prompts such as points or boxes, and can be used to generate masks for all objects in an image.
stable-diffusion	A latent text-to-image diffusion model
stable-diffusion v2	High-Resolution Image Synthesis with Latent Diffusion Models

⬆ back to ToC

Audio Foundation Model

Project	Details	Repository
bark	Bark is a transformer-based text-to-audio model created by Suno. Bark can generate highly realistic, multilingual speech as well as other audio - including music, background noise and simple sound effects.
whisper	Robust Speech Recognition via Large-Scale Weak Supervision

Serving

Large Model Serving

Project	Details	Repository
Alpaca-LoRA-Serve	Alpaca-LoRA as Chatbot service
CTranslate2	fast inference engine for Transformer models in C++
Clip-as-a-service	serving the OpenAI CLIP model
DeepSpeed-MII	MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
Faster Whisper	fast inference engine for whisper in C++ using CTranslate2.
FlexGen	Running large language models on a single GPU for throughput-oriented scenarios.
Flowise	Drag & drop UI to build your customized LLM flow using LangchainJS.
llama.cpp	Port of Facebook's LLaMA model in C/C++
Infinity	REST API server for serving text-embeddings
Modelz-LLM	OpenAI compatible API for LLMs and embeddings (LLaMA, Vicuna, ChatGLM and many others)
Ollama	Serve Llama 2 and other large language models locally from command line or through a browser interface.
TensorRT-LLM	Inference engine for TensorRT on NVIDIA GPUs
text-generation-inference	Large Language Model Text Generation Inference
text-embeddings-inference	Inference for text-embedding models
tokenizers	💥 Fast State-of-the-Art Tokenizers optimized for Research and Production
vllm	A high-throughput and memory-efficient inference and serving engine for LLMs.
whisper.cpp	Port of OpenAI's Whisper model in C/C++
x-stable-diffusion	Real-time inference for Stable Diffusion - 0.88s latency. Covers AITemplate, nvFuser, TensorRT, FlashAttention.

⬆ back to ToC

Frameworks/Servers for Serving

Project	Details	Repository
BentoML	The Unified Model Serving Framework
Jina	Build multimodal AI services via cloud native technologies · Model Serving · Generative AI · Neural Search · Cloud Native
Mosec	A machine learning model serving framework with dynamic batching and pipelined stages, provides an easy-to-use Python interface.
TFServing	A flexible, high-performance serving system for machine learning models.
Torchserve	Serve, optimize and scale PyTorch models in production
Triton Server (TRTIS)	The Triton Inference Server provides an optimized cloud and edge inferencing solution.
langchain-serve	Serverless LLM apps on Production with Jina AI Cloud
lanarky	FastAPI framework to build production-grade LLM applications
ray-llm	LLMs on Ray - RayLLM
Xinference	Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.

⬆ back to ToC

Security

Frameworks for LLM security

Project	Details	Repository
Plexiglass	A Python Machine Learning Pentesting Toolbox for Adversarial Attacks. Works with LLMs, DNNs, and other machine learning algorithms.

⬆ back to ToC

Observability

Project	Details	Repository
Azure OpenAI Logger	"Batteries included" logging solution for your Azure OpenAI instance.
Deepchecks	Tests for Continuous Validation of ML Models & Data. Deepchecks is a Python package for comprehensively validating your machine learning models and data with minimal effort.
Evidently	An open-source framework to evaluate, test and monitor ML and LLM-powered systems.
Fiddler AI	Evaluate, monitor, analyze, and improve machine learning and generative models from pre-production to production. Ship more ML and LLMs into production, and monitor ML and LLM metrics like hallucination, PII, and toxicity.
Giskard	Testing framework dedicated to ML models, from tabular to LLMs. Detect risks of biases, performance issues and errors in 4 lines of code.
Great Expectations	Always know what to expect from your data.
Helicone	Open source LLM observability platform. One line of code to monitor, evaluate, and experiment with features like prompt management, agent tracing, and evaluations.
Traceloop OpenLLMetry	OpenTelemetry-based observability and monitoring for LLM and agent workflows.
whylogs	The open standard for data logging

⬆ back to ToC

LLMOps

Project	Details	Repository
agenta	The LLMOps platform to build robust LLM apps. Easily experiment and evaluate different prompts, models, and workflows to build robust apps.
AgentMark	Type-Safe Markdown-based Agents
AI studio	A reliable open-source AI studio for building the core infrastructure stack of your LLM applications. It allows you to gain visibility, make your application reliable, and prepare it for production with features such as caching, rate limiting, exponential retry, model fallback, and more.
Arize-Phoenix	ML observability for LLMs, vision, language, and tabular models.
BudgetML	Deploy a ML inference service on a budget in less than 10 lines of code.
Cheshire Cat AI	Web framework to create vertical AI agents. FastAPI based, plugin system inspired by WordPress, admin panel, vector DB included.
Dataoorts	Enjoy unlimited API calls with Serverless AI Workers/LLMs for just $25 per month. No rate or concurrency limits.
deeplake	Stream large multimodal datasets to achieve near 100% GPU utilization. Query, visualize, and version control data. Access data without the need to recompute the embeddings for model fine-tuning.
Dify	An open-source framework that aims to enable developers (and even non-developers) to quickly build useful applications based on large language models, ensuring they are visual, operable, and improvable.
Dstack	Cost-effective LLM development in any cloud (AWS, GCP, Azure, Lambda, etc).
Embedchain	Framework to create ChatGPT like bots over your dataset.
Epsilla	An all-in-one platform to create vertical AI agents powered by your private data and knowledge.
Evidently	An open-source framework to evaluate, test and monitor ML and LLM-powered systems.
Fiddler AI	Evaluate, monitor, analyze, and improve MLOps and LLMOps from pre-production to production.
Glide	Cloud-Native LLM Routing Engine. Improve LLM app resilience and speed.
gotoHuman	Bring a human into the loop in your LLM-based and agentic workflows. Prompt users to approve actions, select next steps, or review and validate generated results.
GPTCache	Creating semantic cache to store responses from LLM queries.
GPUStack	An open-source GPU cluster manager for running and managing LLMs
Haystack	Quickly compose applications with LLM Agents, semantic search, question-answering and more.
Helicone	Open-source LLM observability platform for logging, monitoring, and debugging AI applications. Simple 1-line integration to get started.
Humanloop	The LLM evals platform for enterprises, providing tools to develop, evaluate, and observe AI systems.
Izlo	Prompt management tools for teams. Store, improve, test, and deploy your prompts in one unified workspace.
Keywords AI	A unified DevOps platform for AI software. Keywords AI makes it easy for developers to build LLM applications.
MLflow	An open-source framework for the end-to-end machine learning lifecycle, helping developers track experiments, evaluate models/prompts, deploy models, and add observability with tracing.
Laminar	Open-source all-in-one platform for engineering AI products. Traces, Evals, Datasets, Labels.
langchain	Building applications with LLMs through composability
LangFlow	An effortless way to experiment and prototype LangChain flows with drag-and-drop components and a chat interface.
Langfuse	Open Source LLM Engineering Platform: Traces, evals, prompt management and metrics to debug and improve your LLM application.
LangKit	Out-of-the-box LLM telemetry collection library that extracts features and profiles prompts, responses and metadata about how your LLM is performing over time to find problems at scale.
LangWatch	LLM Ops platform with Analytics, Monitoring, Evaluations and an LLM Optimization Studio powered by DSPy
LiteLLM 🚅	A simple & light 100-line package to standardize LLM API calls across OpenAI, Azure, Cohere, Anthropic, and Replicate API endpoints.
Literal AI	Multi-modal LLM observability and evaluation platform. Create prompt templates, deploy prompts versions, debug LLM runs, create datasets, run evaluations, monitor LLM metrics and collect human feedback.
LlamaIndex	Provides a central interface to connect your LLMs with external data.
LLMApp	LLM App is a Python library that helps you build real-time LLM-enabled data pipelines with few lines of code.
LLMFlows	LLMFlows is a framework for building simple, explicit, and transparent LLM applications such as chatbots, question-answering systems, and agents.
Lunary	Observability and prompt management for LLM chatbots and agents. Debug agents with powerful tracing and logging. View usage analytics and dive deep into the history of your requests. Developer-friendly modules with plug-and-play integration into LangChain.
magentic	Seamlessly integrate LLMs as Python functions. Use type annotations to specify structured output. Mix LLM queries and function calling with regular Python code to create complex LLM-powered functionality.
Manag.ai	Your all-in-one prompt management and observability platform. Craft, track, and perfect your LLM prompts with ease.
Mirascope	Intuitive convenience tooling for lightning-fast, efficient development and ensuring quality in LLM-based applications
OpenLIT	OpenLIT is an OpenTelemetry-native GenAI and LLM application observability tool that provides OpenTelemetry auto-instrumentation for monitoring LLMs, vector DBs, and frameworks. It provides valuable insights into token and cost usage, user interaction, and performance-related metrics.
Opik	Confidently evaluate, test, and ship LLM applications with a suite of observability tools to calibrate language model outputs across your dev and production lifecycle.
Parea AI	Platform and SDK for AI Engineers providing tools for LLM evaluation, observability, and a version-controlled enhanced prompt playground.
Pezzo 🕹️	Pezzo is the open-source LLMOps platform built for developers and teams. In just two lines of code, you can seamlessly troubleshoot your AI operations, collaborate and manage your prompts in one place, and instantly deploy changes to any environment.
PromptDX	A declarative, extensible, and composable approach for developing LLM prompts using Markdown and JSX.
PromptHub	Full stack prompt management tool designed to be usable by technical and non-technical team members. Test, version, collaborate, deploy, and monitor, all from one place.
promptfoo	Open-source tool for testing & evaluating prompt quality. Create test cases, automatically check output quality and catch regressions, and reduce evaluation cost.
PromptFoundry	The simple prompt engineering and evaluation tool designed for developers building AI applications.
PromptLayer 🍰	Prompt Engineering platform. Collaborate, test, evaluate, and monitor your LLM applications
PromptMage	Open-source tool to simplify the process of creating and managing LLM workflows and prompts as a self-hosted solution.
PromptSite	A lightweight Python library for prompt lifecycle management that helps you version control, track, experiment with, and debug your LLM prompts with ease. Minimal setup; no servers, databases, or API keys required. It works directly with your local filesystem, making it easy for data scientists and engineers to integrate into existing LLM workflows.
Prompteams	Prompt management system. Version, test, collaborate, and retrieve prompts through real-time APIs. Offers a GitHub-style workflow with repos, branches, and commits (and commit history).
prompttools	Open-source tools for testing and experimenting with prompts. The core idea is to enable developers to evaluate prompts using familiar interfaces like code and notebooks. In just a few lines of code, you can test your prompts and parameters across different models (whether you are using OpenAI, Anthropic, or LLaMA models). You can even evaluate the retrieval accuracy of vector databases.
Puzzlet AI	The Git-Based LLM Engineering Platform. Achieve more from GenAI: Manage, evaluate, and improve your full-stack LLM application - with version control, type-safety, and local development built-in.
systemprompt.io	Systemprompt.io is a REST API with quality tooling to enable the creation, use and observability of prompts in any AI system. Control every detail of your prompt for a SOTA prompt management experience.
TreeScale	All-in-one dev platform for LLM apps. Deploy LLM-enhanced APIs seamlessly using tools for prompt optimization, semantic querying, version management, statistical evaluation, and performance tracking. As part of its developer-friendly API implementation, TreeScale offers the Elastic LLM product, which provides a unified API endpoint for all major LLM providers and open-source models.
TrueFoundry	Deploy LLMOps tools such as vector DBs and embedding servers on your own Kubernetes infrastructure (EKS, AKS, GKE, on-prem), including deploying, fine-tuning, tracking prompts, and serving open-source LLMs with full data security and optimal GPU management. Train and launch your LLM application at production scale with best software engineering practices.
ReliableGPT 💪	Handle OpenAI Errors (overloaded OpenAI servers, rotated keys, or context window errors) for your production LLM Applications.
Portkey	Control Panel with an observability suite & an AI gateway — to ship fast, reliable, and cost-efficient apps.
Vellum	An AI product development platform to experiment with, evaluate, and deploy advanced LLM apps.
Weights & Biases (Prompts)	A suite of LLMOps tools within the developer-first W&B MLOps platform. Utilize W&B Prompts for visualizing and inspecting LLM execution flow, tracking inputs and outputs, viewing intermediate results, securely managing prompts and LLM chain configurations.
Wordware	A web-hosted IDE where non-technical domain experts work with AI Engineers to build task-specific AI agents. It approaches prompting as a new programming language rather than low/no-code blocks.
xTuring	Build and control your personal LLMs with fast and efficient fine-tuning.
ZenML	Open-source framework for orchestrating, experimenting and deploying production-grade ML solutions, with built-in `langchain` & `llama_index` integrations.

⬆ back to ToC

Search

Vector search

Project	Details	Repository
AquilaDB	An easy to use Neural Search Engine. Index latent vectors along with JSON metadata and do efficient k-NN search.
Awadb	AI Native database for embedding vectors
Chroma	the open source embedding database
Epsilla	A 10x faster, cheaper, and better vector database
Infinity	The AI-native database built for LLM applications, providing incredibly fast vector and full-text search
Lancedb	Developer-friendly, serverless vector database for AI applications. Easily add long-term memory to your LLM apps!
Marqo	Tensor search for humans.
Milvus	Vector database for scalable similarity search and AI applications.
Pinecone	The Pinecone vector database makes it easy to build high-performance vector search applications. Developer-friendly, fully managed, and easily scalable without infrastructure hassles.
pgvector	Open-source vector similarity search for Postgres.
pgvecto.rs	Vector database plugin for Postgres, written in Rust, specifically designed for LLM.
Qdrant	Vector Search Engine and Database for the next generation of AI applications. Also available in the cloud
txtai	Build AI-powered semantic search applications
Vald	A Highly Scalable Distributed Vector Search Engine
Vearch	A distributed system for embedding-based vector retrieval
VectorDB	A Python vector database you just need - no more, no less.
Vellum	A managed service for ingesting documents and performing hybrid semantic/keyword search across them. Comes with out-of-box support for OCR, text chunking, embedding model experimentation, metadata filtering, and production-grade APIs.
Weaviate	Weaviate is an open source vector search engine that stores both objects and vectors, allowing for combining vector search with structured filtering with the fault-tolerance and scalability of a cloud-native database, all accessible through GraphQL, REST, and various language clients.

⬆ back to ToC

Code AI

Project	Details	Repository
CodeGeeX	CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
CodeGen	CodeGen is an open-source model for program synthesis. Trained on TPU-v4. Competitive with OpenAI Codex.
CodeT5	Open Code LLMs for Code Understanding and Generation.
Continue	⏩ the open-source autopilot for software development—bring the power of ChatGPT to VS Code
fauxpilot	An open-source alternative to GitHub Copilot server
tabby	Self-hosted AI coding assistant. An open-source / on-prem alternative to GitHub Copilot.

Training

IDEs and Workspaces

Project	Details	Repository
code server	Run VS Code on any machine anywhere and access it in the browser.
conda	OS-agnostic, system-level binary package manager and ecosystem.
Docker	Moby is an open-source project created by Docker to enable and accelerate software containerization.
envd	🏕️ Reproducible development environment for AI/ML.
Jupyter Notebooks	The Jupyter notebook is a web-based notebook environment for interactive computing.
Kurtosis	A build, packaging, and run system for ephemeral multi-container environments.
Wordware	A web-hosted IDE where non-technical domain experts work with AI Engineers to build task-specific AI agents. It approaches prompting as a new programming language rather than low/no-code blocks.

⬆ back to ToC

Foundation Model Fine Tuning

Project	Details	Repository
alpaca-lora	Instruct-tune LLaMA on consumer hardware
finetuning-scheduler	A PyTorch Lightning extension that accelerates and enhances foundation model experimentation with flexible fine-tuning schedules.
Flyflow	Open source, high performance fine tuning as a service for GPT4 quality models with 5x lower latency and 3x lower cost
LMFlow	An Extensible Toolkit for Finetuning and Inference of Large Foundation Models
Lora	Using Low-rank adaptation to quickly fine-tune diffusion models.
peft	State-of-the-art Parameter-Efficient Fine-Tuning.
p-tuning-v2	An optimized prompt tuning strategy achieving comparable performance to fine-tuning on small/medium-sized models and sequence tagging challenges. (ACL 2022)
QLoRA	Efficient finetuning approach that reduces memory usage enough to finetune a 65B parameter model on a single 48GB GPU while preserving full 16-bit finetuning task performance.
TRL	Train transformer language models with reinforcement learning.

⬆ back to ToC

Frameworks for Training

Project	Details	Repository
Accelerate	🚀 A simple way to train and use PyTorch models with multi-GPU, TPU, mixed-precision.
Apache MXNet	Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler.
axolotl	A tool designed to streamline the fine-tuning of various AI models, offering support for multiple configurations and architectures.
Caffe	A fast open framework for deep learning.
Candle	Minimalist ML framework for Rust.
ColossalAI	An integrated large-scale model training system with efficient parallelization techniques.
DeepSpeed	DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Horovod	Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
Jax	Autograd and XLA for high-performance machine learning research.
Kedro	Kedro is an open-source Python framework for creating reproducible, maintainable and modular data science code.
Keras	Keras is a deep learning API written in Python, running on top of the machine learning platform TensorFlow.
LightGBM	A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
MegEngine	MegEngine is a fast, scalable and easy-to-use deep learning framework, with auto-differentiation.
metric-learn	Metric Learning Algorithms in Python.
MindSpore	MindSpore is a new open source deep learning training/inference framework that could be used for mobile, edge and cloud scenarios.
Oneflow	OneFlow is a performance-centered and open-source deep learning framework.
PaddlePaddle	Machine Learning Framework from Industrial Practice.
PyTorch	Tensors and Dynamic neural networks in Python with strong GPU acceleration.
PyTorch Lightning	Deep learning framework to train, deploy, and ship AI products Lightning fast.
XGBoost	Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library.
scikit-learn	Machine Learning in Python.
TensorFlow	An Open Source Machine Learning Framework for Everyone.
VectorFlow	A minimalist neural network library optimized for sparse data and single machine environments.

⬆ back to ToC

Experiment Tracking

Project	Details	Repository
Aim	an easy-to-use and performant open-source experiment tracker.
ClearML	Auto-Magical CI/CD to streamline your ML workflow. Experiment Manager, MLOps and Data-Management
Comet	Comet is an MLOps platform that offers experiment tracking, model production management, a model registry, and full data lineage from training straight through to production. Comet plays nicely with all your favorite tools, so you don't have to change your existing workflow. Use Comet Opik to confidently evaluate, test, and ship LLM applications with a suite of observability tools to calibrate language model outputs across your dev and production lifecycle!
Guild AI	Experiment tracking, ML developer tools.
MLRun	Machine Learning automation and tracking.
Kedro-Viz	Kedro-Viz is an interactive development tool for building data science pipelines with Kedro. Kedro-Viz also allows users to view and compare different runs in the Kedro project.
LabNotebook	LabNotebook is a tool that allows you to flexibly monitor, record, save, and query all your machine learning experiments.
Sacred	Sacred is a tool to help you configure, organize, log and reproduce experiments.
Weights & Biases	A developer-first, lightweight, user-friendly experiment tracking and visualization tool for machine learning projects, streamlining collaboration and simplifying MLOps. W&B excels at tracking LLM-powered applications, featuring W&B Prompts for LLM execution flow visualization, input and output monitoring, and secure management of prompts and LLM chain configurations.

⬆ back to ToC

Visualization

Project	Details	Repository
Fiddler AI	Rich dashboards, reports, and UMAP to perform root cause analysis, pinpoint problem areas, like correctness, safety, and privacy issues, and improve LLM outcomes.
LangWatch	Visualize LLM evaluation experiments and DSPy pipeline optimizations
Manifold	A model-agnostic visual debugging tool for machine learning.
netron	Visualizer for neural network, deep learning, and machine learning models.
OpenOps	Bring multiple data streams into one dashboard.
TensorBoard	TensorFlow's Visualization Toolkit.
TensorSpace	Neural network 3D visualization framework for building interactive and intuitive models in browsers, with support for pre-trained deep learning models from TensorFlow, Keras, and TensorFlow.js.
dtreeviz	A python library for decision tree visualization and model interpretation.
Zetane Viewer	ML models and internal tensors 3D visualizer.
Zeno	AI evaluation platform for interactively exploring data and model outputs.

Model Editing

Project	Details	Repository
FastEdit	FastEdit aims to assist developers with injecting fresh and customized knowledge into large language models efficiently using one single command.

⬆ back to ToC

Data

Data Management

Project	Details	Repository
ArtiVC	A version control system to manage large files. Lake is a dataset format with a simple API for creating, storing, and collaborating on AI datasets of any size.
Dolt	Git for Data.
DVC	Data Version Control - Git for Data & Models - ML Experiments Management.
Delta-Lake	Storage layer that brings scalable, ACID transactions to Apache Spark and other engines.
Pachyderm	Pachyderm is a version control system for data.
Quilt	A self-organizing data hub for S3.

⬆ back to ToC

Data Storage

Project	Details	Repository
JuiceFS	A distributed POSIX file system built on top of Redis and S3.
LakeFS	Git-like capabilities for your object storage.
Lance	Modern columnar data format for ML implemented in Rust.

⬆ back to ToC

Data Tracking

Project	Details	Repository
Piperider	A CLI tool that allows you to build data profiles and write assertion tests for easily evaluating and tracking your data's reliability over time.
LUX	A Python library that facilitates fast and easy data exploration by automating the visualization and data analysis process.

⬆ back to ToC

Feature Engineering

Project	Details	Repository
Featureform	The Virtual Feature Store. Turn your existing data infrastructure into a feature store.
FeatureTools	An open source python framework for automated feature engineering

⬆ back to ToC

Data/Feature enrichment

Project	Details	Repository
Upgini	Free automated data and feature enrichment library for machine learning: automatically searches through thousands of ready-to-use features from public and community-shared data sources and enriches your training dataset with only the accuracy-improving features.
Feast	An open source feature store for machine learning.
distilabel	⚗️ distilabel is a framework for synthetic data and AI feedback for AI engineers that require high-quality outputs, full data ownership, and overall efficiency.

⬆ back to ToC

Large Scale Deployment

ML Platforms

Project	Details	Repository
Comet	Comet is an MLOps platform that offers experiment tracking, model production management, a model registry, and full data lineage from training straight through to production. Comet plays nicely with all your favorite tools, so you don't have to change your existing workflow. Use Comet Opik to confidently evaluate, test, and ship LLM applications with a suite of observability tools to calibrate language model outputs across your dev and production lifecycle!
ClearML	Auto-Magical CI/CD to streamline your ML workflow. Experiment Manager, MLOps and Data-Management.
Hopsworks	Hopsworks is an MLOps platform for training and operating large and small ML systems, including fine-tuning and serving LLMs. Hopsworks includes both a feature store and vector database for RAG.
OpenLLM	An open platform for operating large language models (LLMs) in production. Fine-tune, serve, deploy, and monitor any LLMs with ease.
MLflow	Open source platform for the machine learning lifecycle.
MLRun	An open MLOps platform for quickly building and managing continuous ML applications across their lifecycle.
ModelFox	ModelFox is a platform for managing and deploying machine learning models.
Kserve	Standardized Serverless ML Inference Platform on Kubernetes
Kubeflow	Machine Learning Toolkit for Kubernetes.
PAI	Resource scheduling and cluster management for AI.
Polyaxon	Machine Learning Management & Orchestration Platform.
Primehub	An effortless infrastructure for machine learning built on top of Kubernetes.
OpenModelZ	One-click machine learning deployment (LLM, text-to-image and so on) at scale on any cluster (GCP, AWS, Lambda labs, your home lab, or even a single machine).
Seldon-core	An MLOps framework to package, deploy, monitor and manage thousands of production machine learning models
Starwhale	An MLOps/LLMOps platform for model building, evaluation, and fine-tuning.
TrueFoundry	A PaaS to deploy, Fine-tune and serve LLM Models on a company’s own Infrastructure with Data Security and Optimal GPU and Cost Management. Launch your LLM Application at Production scale with best DevSecOps practices.
Weights & Biases	A lightweight and flexible platform for machine learning experiment tracking, dataset versioning, and model management, enhancing collaboration and streamlining MLOps workflows. W&B excels at tracking LLM-powered applications, featuring W&B Prompts for LLM execution flow visualization, input and output monitoring, and secure management of prompts and LLM chain configurations.

⬆ back to ToC

Workflow

Project	Details	Repository
Airflow	A platform to programmatically author, schedule and monitor workflows.
aqueduct	An Open-Source Platform for Production Data Science
Argo Workflows	Workflow engine for Kubernetes.
Flyte	Kubernetes-native workflow automation platform for complex, mission-critical data and ML processes at scale.
Hamilton	A lightweight framework to represent ML/language model pipelines as a series of python functions.
Kubeflow Pipelines	Machine Learning Pipelines for Kubeflow.
LangFlow	An effortless way to experiment and prototype LangChain flows with drag-and-drop components and a chat interface.
Metaflow	Build and manage real-life data science projects with ease!
Ploomber	The fastest way to build data pipelines. Develop iteratively, deploy anywhere.
Prefect	The easiest way to automate your data.
VDP	An open-source unstructured data ETL tool to streamline the end-to-end unstructured data processing pipeline.
ZenML	MLOps framework to create reproducible pipelines.

⬆ back to ToC

Scheduling

Project	Details	Repository
Kueue	Kubernetes-native Job Queueing.
PAI	Resource scheduling and cluster management for AI (Open-sourced by Microsoft).
Slurm	A Highly Scalable Workload Manager.
Volcano	A Cloud Native Batch System (Project under CNCF).
Yunikorn	Light-weight, universal resource scheduler for container orchestrator systems.

⬆ back to ToC

Model Management

Project	Details	Repository
Comet	Comet is an MLOps platform that offers Model Production Management, a Model Registry, and full model lineage from training straight through to production. Use Comet for model reproducibility, model debugging, model versioning, model visibility, model auditing, model governance, and model monitoring.
dvc	ML Experiments Management - Data Version Control - Git for Data & Models
ModelDB	Open Source ML Model Versioning, Metadata, and Experiment Management
MLEM	A tool to package, serve, and deploy any ML model on any platform.
ormb	Docker for Your ML/DL Models Based on OCI Artifacts

⬆ back to ToC

Performance

ML Compiler

Project	Details	Repository
ONNX-MLIR	Compiler technology to transform a valid Open Neural Network Exchange (ONNX) graph into code that implements the graph with minimum runtime support.
bitsandbytes	Accessible large language models via k-bit quantization for PyTorch.
TVM	Open deep learning compiler stack for CPU, GPU and specialized accelerators

⬆ back to ToC

Profiling

Project	Details	Repository
octoml-profile	octoml-profile is a python library and cloud service designed to provide the simplest experience for assessing and optimizing the performance of PyTorch models on cloud hardware with state-of-the-art ML acceleration technology.
scalene	a high-performance, high-precision CPU, GPU, and memory profiler for Python

⬆ back to ToC

AutoML

Project	Details	Repository
Archai	a platform for Neural Architecture Search (NAS) that allows you to generate efficient deep networks for your applications.
autoai	A framework to find the best performing AI/ML model for any AI problem.
AutoGL	An autoML framework & toolkit for machine learning on graphs
AutoGluon	AutoML for Image, Text, and Tabular Data.
automl-gs	Provide an input CSV and a target field to predict, generate a model + code to run it.
AutoRAG	AutoML tool for RAG - Boost your LLM app performance with your own data
autokeras	AutoML library for deep learning.
Auto-PyTorch	Automatic architecture search and hyperparameter optimization for PyTorch.
auto-sklearn	an automated machine learning toolkit and a drop-in replacement for a scikit-learn estimator.
Dragonfly	An open source python library for scalable Bayesian optimisation.
Determined	scalable deep learning training platform with integrated hyperparameter tuning support; includes Hyperband, PBT, and other search methods.
DEvol (DeepEvolution)	a basic proof of concept for genetic architecture search in Keras.
EvalML	An open source python library for AutoML.
FEDOT	AutoML framework for the design of composite pipelines.
FLAML	Fast and lightweight AutoML (paper).
Goptuna	A hyperparameter optimization framework, inspired by Optuna.
HpBandSter	a framework for distributed hyperparameter optimization.
HPOlib2	a library for hyperparameter optimization and black box optimization benchmarks.
Hyperband	open source code for tuning hyperparams with Hyperband.
Hypernets	A General Automated Machine Learning Framework.
Hyperopt	Distributed Asynchronous Hyperparameter Optimization in Python.
hyperunity	A toolset for black-box hyperparameter optimisation.
Intelli	A framework to connect a flow of ML models by applying graph theory.
Katib	Katib is a Kubernetes-native project for automated machine learning (AutoML).
Keras Tuner	Hyperparameter tuning for humans.
learn2learn	PyTorch Meta-learning Framework for Researchers.
Ludwig	a toolbox built on top of TensorFlow that allows users to train and test deep learning models without the need to write code.
MOE	a global, black box optimization engine for real world metric optimization by Yelp.
Model Search	a framework that implements AutoML algorithms for model architecture search at scale.
NASGym	a proof-of-concept OpenAI Gym environment for Neural Architecture Search (NAS).
NNI	An open source AutoML toolkit to automate the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
Optuna	A hyperparameter optimization framework.
Pycaret	An open-source, low-code machine learning library in Python that automates machine learning workflows.
Ray Tune	Scalable Hyperparameter Tuning.
REMBO	Bayesian optimization in high-dimensions via random embedding.
RoBO	a Robust Bayesian Optimization framework.
scikit-optimize(skopt)	Sequential model-based optimization with a `scipy.optimize` interface.
Spearmint	a software package to perform Bayesian optimization.
TPOT	one of the very first AutoML methods and open-source software packages.
Torchmeta	A Meta-Learning library for PyTorch.
Vegas	an AutoML algorithm tool chain by Huawei Noah's Ark Lab.

⬆ back to ToC

Optimizations

Project	Details	Repository
FeatherCNN	FeatherCNN is a high performance inference engine for convolutional neural networks.
Forward	A library for high performance deep learning inference on NVIDIA GPUs.
LangWatch	LangWatch Optimization Studio is your laboratory to create, evaluate, and optimize your LLM workflows using DSPy optimizers
NCNN	ncnn is a high-performance neural network inference framework optimized for the mobile platform.
PocketFlow	use AutoML to do model compression.
TensorFlow Model Optimization	A suite of tools that users, both novice and advanced, can use to optimize machine learning models for deployment and execution.
TNN	A uniform deep learning inference framework for mobile, desktop and server.
optimum-tpu	Google TPU optimizations for transformers models

⬆ back to ToC

Federated ML

Project	Details	Repository
EasyFL	An Easy-to-use Federated Learning Platform
FATE	An Industrial Grade Federated Learning Framework
FedML	The federated learning and analytics library enabling secure and collaborative machine learning on decentralized data anywhere at any scale. Supporting large-scale cross-silo federated learning, cross-device federated learning on smartphones/IoTs, and research simulation.
Flower	A Friendly Federated Learning Framework
Harmonia	Harmonia is an open-source project aiming at developing systems/infrastructures and libraries to ease the adoption of federated learning (abbreviated to FL) for research and production use.
TensorFlow Federated	A framework for implementing federated learning

⬆ back to ToC

Awesome Lists

Project	Details	Repository
Awesome Argo	A curated list of awesome projects and resources related to Argo
Awesome AutoDL	Automated Deep Learning: Neural Architecture Search Is Not the End (a curated list of AutoDL resources and an in-depth analysis)
Awesome AutoML	Curating a list of AutoML-related research, tools, projects and other resources
Awesome AutoML Papers	A curated list of automated machine learning papers, articles, tutorials, slides and projects
Awesome-Code-LLM	👨‍💻 An awesome and curated list of the best code LLMs for research.
Awesome Federated Learning Systems	A curated list of Federated Learning Systems related academic papers, articles, tutorials, slides and projects.
Awesome Federated Learning	A curated list of federated learning publications, re-organized from Arxiv (mostly)
awesome-federated-learningacc	All materials you need for Federated Learning: blogs, videos, papers, software, etc.
Awesome Open MLOps	This is the Fuzzy Labs guide to the universe of free and open source MLOps tools.
Awesome Production Machine Learning	A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
Awesome Tensor Compilers	A list of awesome compiler projects and papers for tensor computation and deep learning.
kelvins/awesome-mlops	A curated list of awesome MLOps tools.
visenger/awesome-mlops	Machine Learning Operations - An awesome list of references for MLOps
currentslab/awesome-vector-search	A curated list of awesome vector search frameworks/engines, libraries, cloud services, and research papers on vector similarity search.
pleisto/flappy	Production-Ready LLM Agent SDK for Every Developer

⬆ back to ToC

Table of Contents

Model

Large Language Model

CV Foundation Model

Audio Foundation Model

Serving

Large Model Serving

Frameworks/Servers for Serving

Security

Frameworks for LLM security

Observability

LLMOps

Search

Vector search

Code AI

Training

IDEs and Workspaces

Foundation Model Fine Tuning

Frameworks for Training

Experiment Tracking

Visualization

Model Editing

Data

Data Management

Data Storage

Data Tracking

Feature Engineering

Data/Feature enrichment

Large Scale Deployment

ML Platforms

Workflow

Scheduling

Model Management

Performance

ML Compiler

Profiling

AutoML

Optimizations

Federated ML

Awesome Lists