Python Library | Overview | Link |
---|---|---|
Transformers (by Hugging Face) | A widely used library for training, fine-tuning, and using transformer-based models (like GPT, BERT, T5). | Transformers |
LangChain | A framework for developing applications powered by LLMs, integrating with external data sources and APIs. | LangChain |
GPT-Neo | Open-source implementation of GPT-3-like models by EleutherAI, designed for large-scale NLP tasks. | GPT-Neo |
Megatron-LM | A library for training large-scale transformer models efficiently, optimized for multi-GPU setups. | Megatron-LM |
DeepSpeed | A deep learning optimization library for training massive models, including efficient LLM training. | DeepSpeed |
FairScale | A PyTorch extension for large-scale training, including model parallelism and optimized LLM training. | FairScale |
OpenAI GPT-3 | OpenAI's official GPT-3 API client, offering an easy interface to integrate GPT-3 into Python applications. | OpenAI GPT-3 |
T5 (Text-to-Text Transfer Transformer) | A library for using Google's T5 model for various NLP tasks by treating them as text-to-text problems. | T5 |
LlamaIndex (GPT Index) | A data framework for building LLM applications, handling indexing, retrieval, and querying over external data. | LlamaIndex |
BLOOM | A collection of multilingual large-scale transformer models that are open-sourced for research purposes. | BLOOM |
DialoGPT | A chatbot-oriented version of GPT-2 fine-tuned for conversational AI and dialogue generation. | DialoGPT |
PEFT (Parameter-Efficient Fine-Tuning) | A library designed to simplify parameter-efficient fine-tuning for LLMs like GPT, BERT, and others. | PEFT |
Haystack | A framework for building production-ready NLP pipelines, integrating LLMs for question answering and more. | Haystack |
txtai | A semantic search library built on transformers, optimized for searching and generating text with LLMs. | txtai |
PaddlePaddle | Deep learning framework by Baidu with an easy-to-use interface for large-scale language model training. | PaddlePaddle |
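
The T5 row above describes treating every NLP task as a text-to-text problem. As a rough sketch of what that means in practice, each input gets a short task prefix so that one model can handle translation, summarization, and classification alike. The prefixes follow the convention used by the public T5 checkpoints; the helper name `to_text_to_text` and the task keys are hypothetical, and no model is loaded here.

```python
# Sketch of T5-style input formatting. The helper name and the dictionary
# keys are hypothetical; only the prefix strings follow the T5 convention.
TASK_PREFIXES = {
    "translate_en_de": "translate English to German: ",
    "summarize": "summarize: ",
    "cola": "cola sentence: ",
}


def to_text_to_text(task: str, text: str) -> str:
    """Prepend the task prefix so every task becomes plain text in, text out."""
    try:
        prefix = TASK_PREFIXES[task]
    except KeyError:
        raise ValueError(f"unknown task: {task!r}") from None
    return prefix + text.strip()


if __name__ == "__main__":
    print(to_text_to_text("translate_en_de", "That is good."))
    print(to_text_to_text("summarize", "A long article would go here."))
```

The resulting string is what would be tokenized and fed to the model; the model's answer comes back as text too, whatever the task.
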
LLM : Cohere, Falcon 2, Llama 3, GPT-4, Gemini - vid, Llama 2, PaLM 2 (Bison-001), Falcon LLM, GPT-4, Beluga 2, Stanford Alpaca, Llama, AutoGPT, JARVIS, Mini-GPT4, langchain, LlamaIndex, Alpaca Lora, 🌋 LLaVA, llm, privateGPT, Claude, haystack, dolly, bloom, opt, baby agi, agentGPT, StarCoder, @awesome-LLM, awesome-LLMOps. [ big LLM list ], ToolLLM, MetaGPT, Adobe Stardust, Stable Video AI Watched 600,000,000 Videos!, Stable Video Diffusion, [LLM Visualization], visualize matrices, shutterstock ai image generator.
Here's Hugging Face's LLM benchmark leaderboard - models, GLUE benchmarks, Building LLM applications for production : article.
Generative AI Service and Open Source Python libraries. Github repo search readme → Github. [ paperswithcode ]
Here is a list of all Python libraries with more than 2k stars, updated on 25/09/2022.
PYTHON LIBRARIES:
Best of Machine Learning with Python : github, Awesome Machine Learning : github, Awesome Python Data Science : github.
Computer Vision / Visualization Libraries: @github/awesome-computer-vision, @github/awesome-deep-vision
Natural Language Processing (NLP) libraries: @github/awesome-nlp
Library | Overview | Link |
---|---|---|
Matplotlib | A widely-used library for creating static, animated, and interactive plots in Python. | Matplotlib |
Seaborn | Built on top of Matplotlib, Seaborn simplifies creating informative and attractive statistical graphics. | Seaborn |
Plotly | An interactive graphing library for creating web-based data visualizations with support for dashboards. | Plotly |
Bokeh | A Python interactive visualization library for creating real-time and live-streaming visualizations. | Bokeh |
Altair | A declarative statistical visualization library for Python, built on Vega and Vega-Lite. | Altair |
ggplot | A Python version of the popular R ggplot2 library, for creating complex plots with a simple syntax. | ggplot |
Pyplot3D | A 3D plotting library built on Matplotlib for creating three-dimensional visualizations in Python. | Pyplot3D |
t-SNE | A dimensionality-reduction technique (with Python implementations) for visualizing high-dimensional data in a low-dimensional space. | t-SNE |
Visdom | A flexible tool for creating and organizing visualizations of models during training in real-time. | Visdom |
TensorBoard | A tool for visualizing TensorFlow model training, including loss metrics, graphs, and embeddings. | TensorBoard |
LIME (Local Interpretable Model-Agnostic Explanations) | A tool for interpreting machine learning models by visualizing the impact of features on predictions. | LIME |
SHAP (SHapley Additive exPlanations) | A library for interpreting the outputs of machine learning models using Shapley values and visualizations. | SHAP |
Yellowbrick | A machine learning visualization library that provides visual analysis of model performance and diagnostics. | Yellowbrick |
Eli5 | A library for explaining machine learning models and visualizing their decision-making process. | Eli5 |
Plotly Dash | A web framework for creating interactive, dashboard-style visualizations using Plotly charts. | Dash |
Tidyverse | A collection of R packages for data visualization, data manipulation, and statistical analysis, ported to Python. | Tidyverse |
Deep Visualization Toolbox | A real-time interactive visualization tool to visualize deep learning models and their activations. | Deep Visualization Toolbox |
Neural Network Visualization (NNV) | A tool for visualizing neural networks and their layers, activations, and weights in a detailed way. | NNV |
Plotly Express | A simplified version of Plotly that allows users to create interactive plots with minimal code. | Plotly Express |
PCA (Principal Component Analysis) | A Python library for performing PCA on datasets and visualizing the results in 2D or 3D. | PCA |
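
The SHAP row above mentions Shapley values without saying what they are. As a small illustration, here is an exact Shapley computation for a toy three-feature model. This is not the SHAP library's API (SHAP relies on faster approximations); the function name `shapley_values` and the toy model are made up for this sketch, and "absent" features are simply set to 0.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley values for a set function value(frozenset) -> float."""
    n = len(features)
    result = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of f when it joins coalition s.
                total += weight * (value(s | {f}) - value(s))
        result[f] = total
    return result


def toy_model(present):
    """Hypothetical model: 3*a + b + 2*a*c, with absent features set to 0."""
    a, b, c = ("a" in present), ("b" in present), ("c" in present)
    return 3 * a + 1 * b + 2 * (a and c)


if __name__ == "__main__":
    print(shapley_values(["a", "b", "c"], toy_model))
```

The interaction term `2*a*c` is split evenly between `a` and `c`, and the three values add up to the difference between the full prediction and the empty baseline.
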
@github/awesome-dataviz
These lists (list1, list2) cover most of the data viz libraries, but I'll list a few that are relevant for our use cases.
✤ data visualization:
- deck.gl : deck.gl is designed to simplify high-performance, WebGL-based visualization of large data sets. Check out these use cases.
- streetscape.gl : streetscape.gl is a toolkit for visualizing autonomous and robotics data in the XVIZ protocol. It is built on top of React and Uber’s WebGL-powered visualization frameworks. Check this demo.
- kepler.gl : github/Kepler.gl is a powerful web-based geospatial data analysis tool, built on a high-performance rendering engine and designed for large-scale data sets. kepler.gl is built on top of deck.gl.
- sanddance : github/SandDance helps you find insights about your data with unit visualizations and smooth animated transitions. It uses deck.gl to render chart layouts described with Vega.
- flowmapblue : FlowmapBlue is a free tool for representing aggregated numbers of movements between geographic locations as flow maps. It is used to visualize urban mobility, commuting behavior, bus, subway and air travels, bicycle sharing, human and bird migration, refugee flows, freight transportation, trade, supply chains, scientific collaboration, epidemiological and historical data and many other topics.
- cartodb : With CARTO, you can upload your geospatial data (Shapefiles, GeoJSON, etc) using a web form and then make it public or private. After it is uploaded, you can visualize it in a dataset or on a map, search it using SQL, and apply map styles using CartoCSS.
- cesiumJS: An open-source JavaScript library for world-class 3D globes and maps.
- Alibaba L7 & L7plot : Large-scale WebGL-powered Geospatial Data Visualization analysis engine. L7plot is a large-scale geospatial visualization chart library.
- datamaps: Customizable SVG map visualizations for the web in a single Javascript file using D3.js
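
The FlowmapBlue entry above talks about "aggregated numbers of movements between geographic locations". A minimal sketch of that aggregation step, using only the standard library: individual trips are counted per origin-destination pair. The function name `aggregate_flows` and the `origin` / `dest` / `count` field names are assumptions for this example, so check the tool's own docs for the exact input format it expects.

```python
from collections import Counter


def aggregate_flows(trips):
    """Count trips per (origin, destination) pair.

    `trips` is an iterable of (origin_id, dest_id) tuples, one per movement.
    Returns one row per pair, sorted for stable output.
    """
    counts = Counter((origin, dest) for origin, dest in trips)
    return [
        {"origin": o, "dest": d, "count": c}
        for (o, d), c in sorted(counts.items())
    ]


if __name__ == "__main__":
    trips = [("A", "B"), ("A", "B"), ("B", "A"), ("A", "C")]
    for row in aggregate_flows(trips):
        print(row)
```
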
✤ GIS python libraries:
- geopandas : Python tools for geographic data.
- RSGISlib : The Remote Sensing and GIS software library (RSGISLib) is a collection of tools for processing remote sensing and GIS datasets.
- ipyleaflet : If you want to create interactive maps, ipyleaflet is a fusion of Jupyter notebook and Leaflet. You can control an assortment of customizations like loading basemaps, geojson, and widgets.
- geemap : A Python package for interactive mapping with Google Earth Engine, ipyleaflet, and ipywidgets.
- lidar : lidar is a Python package for delineating the nested hierarchy of surface depressions in digital elevation models (DEMs). It is particularly useful for analyzing high-resolution topographic data, such as DEMs derived from Light Detection and Ranging (LiDAR) data.
- FOLIUM : Folium builds on the data wrangling strengths of the Python ecosystem and the mapping strengths of the Leaflet.js library. Manipulate your data in Python, then visualize it in a Leaflet map via folium.
- geoplot : geoplot is a high-level Python geospatial plotting library. It’s an extension to cartopy and matplotlib which makes mapping easy: like seaborn for geospatial.
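
Several of the tools above take GeoJSON as input. As a sketch of the "manipulate your data in Python, then visualize it" step, here is how plain rows could be turned into a GeoJSON FeatureCollection with the standard library alone. The function name `points_to_geojson` and the sample rows are made up for this example. One easy mistake to avoid: GeoJSON stores coordinates as `[longitude, latitude]`, not the other way round.

```python
import json


def points_to_geojson(rows):
    """Convert (name, lat, lon) rows into a GeoJSON FeatureCollection dict."""
    features = []
    for name, lat, lon in rows:
        features.append({
            "type": "Feature",
            # GeoJSON coordinate order is [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"name": name},
        })
    return {"type": "FeatureCollection", "features": features}


if __name__ == "__main__":
    collection = points_to_geojson([
        ("Berlin", 52.52, 13.405),
        ("Paris", 48.8566, 2.3522),
    ])
    print(json.dumps(collection, indent=2))
```

The resulting dict (or its JSON string) is the kind of object that GeoJSON-aware mapping libraries can load as a layer.
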
✤ ml model visualization:
- NETRON : Netron is a viewer for neural network, deep learning and machine learning models.
- CNN model structure summary with this code. Tools to Design or Visualize Architecture of Neural Network : this repo is a good list, but I'll list only a few essentials.
- Tensorboard : TensorFlow visualization toolkit.
- PyTorchViz : PyTorch visualization kit.
- visualkeras : Visualkeras is a Python package to help visualize Keras (either standalone or included in TensorFlow) neural network architectures. The keras.utils.vis_utils module provides utility functions to plot a Keras model using Graphviz.
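
To give a feel for what a "model structure summary" reports, here is a framework-free sketch that counts trainable parameters per layer for a small, made-up CNN. The layer sizes and the helper names (`conv2d_params`, `dense_params`, `summarize`) are hypothetical. Real tools also report output shapes and much more, so treat this as the arithmetic behind one column of such a summary.

```python
def conv2d_params(in_channels, out_channels, kernel_size, bias=True):
    """Weights of a 2D convolution: one kh*kw*in filter (+ bias) per output channel."""
    kh, kw = kernel_size
    return (kh * kw * in_channels + (1 if bias else 0)) * out_channels


def dense_params(in_features, out_features, bias=True):
    """Weights of a fully connected layer: in_features (+ bias) per output unit."""
    return (in_features + (1 if bias else 0)) * out_features


def summarize(layers):
    """Format (name, parameter_count) pairs as a text table with a total row."""
    lines = [f"{name:<10}{count:>10,}" for name, count in layers]
    total = sum(count for _, count in layers)
    lines.append(f"{'total':<10}{total:>10,}")
    return "\n".join(lines), total


# Hypothetical network: two 3x3 convolutions, then a classifier over a
# 32-channel 8x8 feature map flattened to 2048 features.
LAYERS = [
    ("conv1", conv2d_params(3, 16, (3, 3))),
    ("conv2", conv2d_params(16, 32, (3, 3))),
    ("fc", dense_params(32 * 8 * 8, 10)),
]

if __name__ == "__main__":
    table, _ = summarize(LAYERS)
    print(table)
```
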
Library Name | Description | Link |
---|---|---|
detectron2 | Facebook's state-of-the-art vision algorithm platform for object detection and segmentation. | Link |
yolov5 | YOLOv5 - a fast, scalable, and easy-to-use object detection model for real-time applications. | Link |
kornia | Differentiable computer vision library for PyTorch. Contains a set of routines for solving computer vision tasks. | Link |
mmdetection | OpenMMLab's object detection toolbox and benchmark. | Link |
timm | PyTorch Image Models library with a large collection of pre-trained models for various image classification tasks. | Link |
opencv-python | OpenCV Python bindings for computer vision tasks including image processing and video analysis. | Link |
albumentations | Fast and flexible image augmentation library for deep learning tasks. | Link |
torchvision | Vision-related utilities for PyTorch including datasets, transforms, and pretrained models. | Link |
fastai | High-level API built on PyTorch that simplifies training neural networks for computer vision tasks. | Link |
simplecv | An easy-to-use framework for building computer vision applications using Python. | Link |
d2go | A streamlined version of Detectron2 optimized for mobile and edge devices. | Link |
imageai | Python library for performing object detection, image prediction, and custom image analysis using deep learning. | Link |
pytorch-lightning-bolts | PyTorch Lightning library with popular, well-structured vision models and pre-trained weights. | Link |
faster-rcnn | A Faster R-CNN implementation for object detection and segmentation with pre-trained models. | Link |
scikit-image | Image processing library for Python that complements SciPy and NumPy. | Link |
pix2pix | Image-to-image translation with deep learning using PyTorch. | Link |
segmentation-models.pytorch | PyTorch library for semantic segmentation models with pre-trained backbones. | Link |
openpose | Real-time multi-person human pose detection in images and videos. | Link |
yolo-v4 | YOLOv4: Real-time object detection with state-of-the-art performance. | Link |
CycleGAN | A PyTorch implementation of CycleGAN for unpaired image-to-image translation. | Link |
Library Name | Description | Link |
---|---|---|
Stable Baselines3 | A set of reliable, optimized, and easy-to-use reinforcement learning algorithms built on PyTorch. | Link |
Gym | A toolkit for developing and comparing reinforcement learning algorithms. | Link |
Ray RLlib | A scalable reinforcement learning library built on top of Ray, supporting multi-agent environments and large-scale training. | Link |
TF-Agents | TensorFlow-based library for reinforcement learning with an easy-to-use interface and many pre-built agents. | Link |
Acme | A reinforcement learning framework developed by DeepMind for research purposes with modular design. | Link |
OpenAI Baselines | A collection of high-quality implementations of reinforcement learning algorithms. | Link |
Horizon | Facebook’s open-source end-to-end platform for applying reinforcement learning in production environments. | Link |
PettingZoo | A multi-agent reinforcement learning (MARL) environment library, built on top of Gym. | Link |
DQN | A PyTorch implementation of the Deep Q-Network (DQN) algorithm, one of the most famous RL algorithms. | Link |
Mario-RL | Reinforcement learning applied to training AI agents to play Super Mario using deep Q-learning. | Link |
Spinning Up | OpenAI’s educational resource for RL, providing foundational knowledge and simple RL code. | Link |
Dopamine | A research framework for fast prototyping of reinforcement learning algorithms by Google. | Link |
RLlib | RLlib is a scalable reinforcement learning library built on Ray, supporting both single-agent and multi-agent setups. | Link |
deepmind-control | A suite of control tasks designed to support RL research, developed by DeepMind. | Link |
RoboSchool | A collection of robot simulation environments for reinforcement learning research. | Link |
Baselines-RL | Keras-based RL library with implementations for popular algorithms like DQN, DDPG, A3C, etc. | Link |
Lightweight RL | A small and lightweight reinforcement learning library for easy experimentation and research. | Link |
gym-retro | A library for reinforcement learning that allows training agents to play classic video games using OpenAI Gym environments. | Link |
iLQR | A library implementing the iLQR algorithm for optimal control tasks in reinforcement learning. | Link |
Shifu | Uber’s platform for deep reinforcement learning that offers distributed training and scalable architecture. | Link |
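
Several rows above (DQN, Mario-RL) are built around Q-learning. As a minimal sketch of the underlying update rule, here is tabular Q-learning on a tiny, made-up chain environment, standard library only. DQN swaps the table for a neural network and adds things like replay buffers, none of which is shown here. The environment, the hyperparameters, and all names are assumptions for illustration.

```python
import random

N_STATES = 5      # states 0..4; state 4 is the terminal goal
ACTIONS = (0, 1)  # 0 = left, 1 = right


def step(state, action):
    """Deterministic chain: move left or right, reward 1.0 on reaching the goal."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning with a purely random behavior policy (off-policy)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            # Q-learning target: reward plus discounted best next-state value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def greedy_policy(q):
    """Best action per non-terminal state according to the learned table."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]


if __name__ == "__main__":
    q_table = train()
    print(greedy_policy(q_table))
```

Because Q-learning is off-policy, the agent can explore completely at random and still learn values for the greedy policy, which here means heading right toward the goal.
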
Repository | Description |
---|---|
manim | An animation engine for creating mathematical animations. |
clearml | A platform for managing and automating machine learning experiments. |
3b1b videos | Code for the YouTube channel 3Blue1Brown, focused on explaining mathematical concepts. |
scientific visualization book | A book on scientific visualization with Python. |
bottles | A platform for managing and running Windows applications on Linux using Wine. |
dash | A framework for building analytical web applications in Python. |
ggpy | A Python port of R's ggplot2 library for data visualization. |
Lenia | A project for creating generative life simulations. |
cartography | A tool for building knowledge graphs from cloud environments. |
Pywonderland | A library for generating 2D and 3D visualization of dynamical systems. |
altair | Declarative statistical visualization library for Python. |
yellowbrick | A suite of visual analysis and diagnostic tools for machine learning. |
shapely | A Python library for manipulation and analysis of planar geometric objects. |
orange3 | A data mining software suite with a focus on visual programming. |
science plots | A collection of scientific plotting styles for Matplotlib. |
flask_jsondash | A Flask extension for creating interactive dashboards. |
BLOOM | A large language model project, focusing on building the largest open-access model. |
whisper | OpenAI's speech-to-text model for transcription and translation. |
github/whisper | Open-source implementation of Whisper, OpenAI's automatic speech recognition system. |
ray | A unified computing framework for building scalable machine learning applications. |
elevenlabs | AI-driven platform for realistic speech synthesis. |
runwayml | A creative toolkit for using machine learning models in artistic and design work. |
caliban | A tool for launching and tracking numerical experiments in reproducible environments. |
kornia | A differentiable computer vision library for PyTorch. |
intel analytics-zoo | A big data AI platform for scaling AI workflows. |
mljar-supervised | An automated machine learning package for tabular data. |
deepdetect | AI platform supporting deep learning and traditional machine learning algorithms. |
dopamine | A framework for deep reinforcement learning research. |
deepmind lab | A 3D environment for deep reinforcement learning research. |
predictionio | A machine learning server for building predictive engines. |
detectron2 | Facebook's state-of-the-art platform for computer vision. |
tflearn | High-level API for TensorFlow, aimed at simplifying machine learning workflows. |
faceswap | Open-source library for creating deepfake images. |
waveglow | NVIDIA's WaveGlow for audio and speech synthesis. |
neural-enhance | Super-resolution technique for improving image quality. |
real-time voice cloning | Real-time voice cloning and manipulation tool. |
fasttext | Facebook's library for word representation learning and text classification. |
deOldify | A project for colorizing and restoring old photos and videos. |
NeuralTalk2 | A deep learning project for generating captions for images and videos. |
face-recognition | Real-time face recognition using deep learning. |
U-GAT-IT | Image-to-anime style transfer using a GAN-based model. |
srez | Image super-resolution with deep learning. |
TecoGAN | Super-resolution technique for video frames. |
CMU open-pose | Real-time multi-person keypoint detection (body, face, hands, and feet). |
spaCy | Industrial-grade NLP library for Python. |
server (Triton Inference Server) | NVIDIA's optimized inference solution for cloud and edge deployment. |
background matting v2 | Real-time background removal. |
skyAR | Sky replacement using CycleGAN-based techniques. |
txtai | AI-powered semantic search applications. |
ONNX | An open ecosystem for AI model interoperability. |
open-cog | AGI-focused repository integrating various AI algorithms. |
prophet | Facebook's tool for high-quality time series forecasting with multiple seasonalities. |
Apache SystemDS | A system for end-to-end data science with distributed machine learning. |
AIF360 | Fairness metrics and algorithms to mitigate bias in datasets and models. |
tpot | Automated machine learning library using genetic programming. |
featuretools | A tool for automatically creating features from temporal and relational data. |
auto-sklearn | An automated machine learning toolkit for scikit-learn. |
skorch | A scikit-learn-compatible neural network library that wraps PyTorch. |
streamlit | Framework for building interactive data apps in Python. |
optuna | An automatic hyperparameter optimization framework for machine learning. |
shap | A game-theoretic approach to explaining machine learning models. |
pandas-profiling | Generates profiling reports for pandas DataFrames. |