aitools.MD

AI tools and Open Source libraries:

Here is a custom list of Python libraries for Machine Learning (AI):

Large Language Models (LLMs) :

| Python Library | Overview | Link |
|---|---|---|
| Transformers (by Hugging Face) | A widely used library for training, fine-tuning, and using transformer-based models (like GPT, BERT, T5). | Transformers |
| LangChain | A framework for developing applications powered by LLMs, integrating with external data sources and APIs. | LangChain |
| GPT-Neo | Open-source implementation of GPT-3-like models by EleutherAI, designed for large-scale NLP tasks. | GPT-Neo |
| Megatron-LM | A library for training large-scale transformer models efficiently, optimized for multi-GPU setups. | Megatron-LM |
| DeepSpeed | A deep learning optimization library for training massive models, including efficient LLM training. | DeepSpeed |
| FairScale | A PyTorch extension for large-scale training, including model parallelism and optimized LLM training. | FairScale |
| OpenAI GPT-3 | OpenAI's official GPT-3 API client, offering an easy interface to integrate GPT-3 into Python applications. | OpenAI GPT-3 |
| T5 (Text-to-Text Transfer Transformer) | A library for using Google's T5 model for various NLP tasks by treating them as text-to-text problems. | T5 |
| LlamaIndex (GPT Index) | A framework to build applications with LLMs, enabling large language model retrieval and query handling. | LlamaIndex |
| BLOOM | A collection of multilingual large-scale transformer models that are open-sourced for research purposes. | BLOOM |
| DialoGPT | A chatbot-oriented version of GPT-2 fine-tuned for conversational AI and dialogue generation. | DialoGPT |
| Peft (Parameter Efficient Fine-Tuning) | A library designed to simplify parameter-efficient fine-tuning for LLMs like GPT, BERT, and others. | Peft |
| Haystack | A framework for building production-ready NLP pipelines, integrating LLMs for question answering and more. | Haystack |
| txtai | A semantic search library built on transformers, optimized for searching and generating text with LLMs. | txtai |
| PaddlePaddle | Deep learning framework by Baidu with an easy-to-use interface for large-scale language model training. | PaddlePaddle |
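
Several entries above (txtai, Haystack, LlamaIndex) revolve around retrieving the most relevant text for a query. As a rough illustration of that ranking step only, here is a minimal sketch that uses the standard library. It swaps in a bag-of-words count vector where those libraries would use transformer embeddings, so it matches keywords rather than meaning. The names `embed`, `cosine`, and `search` are hypothetical and are not part of any listed library.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': bag-of-words term counts (stand-in for a transformer encoder)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def search(query, documents, top_k=1):
    """Return the top_k documents ranked by similarity to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:top_k]


docs = [
    "DeepSpeed optimizes training of massive models",
    "LangChain builds applications powered by LLMs",
    "Haystack builds question answering pipelines",
]
best = search("question answering pipelines", docs)
```

Replacing `embed` with a real sentence encoder would keep the ranking logic unchanged while making the match semantic rather than lexical.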

LLM : Cohere, Falcon 2, Llama 3, GPT-4, Gemini - vid, Llama 2, PaLM 2 (Bison-001), Falcon LLM, GPT-4, Beluga 2, Stanford Alpaca, Llama, AutoGPT, JARVIS, Mini-GPT4, langchain, LlamaIndex, Alpaca Lora, 🌋 LLaVA, llm, privateGPT, Claude, haystack, dolly, bloom, opt, baby agi, agentGPT, StarCoder, @awesome-LLM, awesome-LLMOps. [ big LLM list ], ToolLLM, MetaGPT, Adobe Stardust, Stable Video AI Watched 600,000,000 Videos!, Stable Video Diffusion, [LLM Visualization], visualize matrices, shutterstock ai image generator.

Here's HuggingFace's LLM benchmark leaderboard - models, GLUE benchmarks, Building LLM applications for production : article.

Generative AI Service and Open Source Python libraries. Github repo search readmeGithub. [ paperswithcode ]

Here is a list of all Python libraries with more than 2k stars, updated on 25/09/2022

PYTHON LIBRARIES:

Heading

Libraries

Python
pylibs
essentials libraries
ml extra
tf / (PyTorch )
ML framework:
Reinforcement Learning
MLOPs / Distributed ML libraries

Best of Machine Learning with Python : github, Awesome Machine Learning : github, Awesome Python Data Science : github.

Computer Vision / Visualization Libraries: @github/awesome-computer-vision, @github/awesome-deep-vision

CV Toolbox
Object Detection
Text-to-Image
Multilingual OCR & QR
Diffusion Models
Vision Transformers
Image Augmentation
Upscale
Pose Detection
Annotation tool
Deep Fakes
Segmentation / Inpainting / Matting
Face Detection
WebML
3D data processing & reconstruction
fun plugins
Big Data visualization
Extra

Natural Language Processing (NLP) libraries: @github/awesome-nlp

Heading Libraries
NLP toolkit
audio analysis
chatbot
text2speech
time series
Conversational AI
neural machine translation
voice assistance
annotation
CTR
extra

✤ data visualization:

| Library | Overview | Link |
|---|---|---|
| Matplotlib | A widely-used library for creating static, animated, and interactive plots in Python. | Matplotlib |
| Seaborn | Built on top of Matplotlib, Seaborn simplifies creating informative and attractive statistical graphics. | Seaborn |
| Plotly | An interactive graphing library for creating web-based data visualizations with support for dashboards. | Plotly |
| Bokeh | A Python interactive visualization library for creating real-time and live-streaming visualizations. | Bokeh |
| Altair | A declarative statistical visualization library for Python, built on Vega and Vega-Lite. | Altair |
| ggplot | A Python version of the popular R ggplot2 library, for creating complex plots with a simple syntax. | ggplot |
| Pyplot3D | A 3D plotting library built on Matplotlib for creating three-dimensional visualizations in Python. | Pyplot3D |
| T-SNE | A Python library for visualizing high-dimensional data in a low-dimensional space using t-SNE techniques. | T-SNE |
| Visdom | A flexible tool for creating and organizing visualizations of models during training in real-time. | Visdom |
| TensorBoard | A tool for visualizing TensorFlow model training, including loss metrics, graphs, and embeddings. | TensorBoard |
| LIME (Local Interpretable Model-Agnostic Explanations) | A tool for interpreting machine learning models by visualizing the impact of features on predictions. | LIME |
| SHAP (SHapley Additive exPlanations) | A library for interpreting the outputs of machine learning models using Shapley values and visualizations. | SHAP |
| Yellowbrick | A machine learning visualization library that provides visual analysis of model performance and diagnostics. | Yellowbrick |
| Eli5 | A library for explaining machine learning models and visualizing their decision-making process. | Eli5 |
| Plotly Dash | A web framework for creating interactive, dashboard-style visualizations using Plotly charts. | Dash |
| Tidyverse | A collection of R packages for data visualization, data manipulation, and statistical analysis, ported to Python. | Tidyverse |
| Deep Visualization Toolbox | A real-time interactive visualization tool to visualize deep learning models and their activations. | Deep Visualization Toolbox |
| Neural Network Visualization (NNV) | A tool for visualizing neural networks and their layers, activations, and weights in a detailed way. | NNV |
| Plotly Express | A simplified version of Plotly that allows users to create interactive plots with minimal code. | Plotly Express |
| PCA (Principal Component Analysis) | A Python library for performing PCA on datasets and visualizing the results in 2D or 3D. | PCA |
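
The SHAP entry above refers to Shapley values. To make the idea concrete, here is a minimal sketch that computes exact Shapley values for a tiny toy model with the standard library. It does not use the `shap` package, which relies on faster approximations for real models. The function `shapley_values`, the `toy_model`, and the feature names are all hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values of a set function `value_fn` over `features`."""
    n = len(features)
    result = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight shared by every coalition of size k that excludes f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f when added to coalition s.
                total += weight * (value_fn(s | {f}) - value_fn(s))
        result[f] = total
    return result


def toy_model(present):
    """Hypothetical model: additive effects plus an age/income interaction."""
    score = 0.0
    if "age" in present:
        score += 2.0
    if "income" in present:
        score += 3.0
    if "age" in present and "income" in present:
        score += 1.0  # interaction term, split evenly between the two features
    if "city" in present:
        score += 0.5
    return score


phi = shapley_values(toy_model, ["age", "income", "city"])
```

The attributions sum to the difference between the model output with all features present and with none, which is the additivity property that gives SHAP its name. The exact computation enumerates every coalition, so it is only practical for a handful of features.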

@github/awesome-dataviz

These lists (list1, list2) cover most of the data viz libraries, but I'll list a few that are relevant for our use cases.

✤ data visualization:

  • deck.gl : deck.gl is designed to simplify high-performance, WebGL-based visualization of large data sets. Check out these use cases.
  • streetscape.gl : streetscape.gl is a toolkit for visualizing autonomous and robotics data in the XVIZ protocol. It is built on top of React and Uber’s WebGL-powered visualization frameworks. Check this demo.
  • kepler.gl : github/Kepler.gl is a powerful web-based geospatial data analysis tool, built on a high-performance rendering engine and designed for large-scale data sets. kepler.gl is built on top of deck.gl.
  • sanddance : github/SandDance helps you find insights about your data with unit visualizations and smooth animated transitions. It uses deck.gl to render chart layouts described with Vega.
  • flowmapblue : FlowmapBlue is a free tool for representing aggregated numbers of movements between geographic locations as flow maps. It is used to visualize urban mobility, commuting behavior, bus, subway and air travels, bicycle sharing, human and bird migration, refugee flows, freight transportation, trade, supply chains, scientific collaboration, epidemiological and historical data and many other topics.
  • cartodb : With CARTO, you can upload your geospatial data (Shapefiles, GeoJSON, etc) using a web form and then make it public or private. After it is uploaded, you can visualize it in a dataset or on a map, search it using SQL, and apply map styles using CartoCSS.
  • cesiumJS: An open-source JavaScript library for world-class 3D globes and maps.
  • Alibaba L7 & L7plot : Large-scale WebGL-powered Geospatial Data Visualization analysis engine. L7plot is a large-scale geospatial visualization chart library.
  • datamaps: Customizable SVG map visualizations for the web in a single JavaScript file using D3.js.
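
Flow-map tools such as FlowmapBlue draw aggregated numbers of movements between locations. The sketch below shows one way individual trips might be reduced to that kind of origin-destination table before being handed to a visualization tool. It assumes trips arrive as `(origin, destination)` pairs. The function name `aggregate_flows` and the field names `origin`, `dest`, and `count` are illustrative choices, not a confirmed input schema for any tool listed above.

```python
from collections import Counter


def aggregate_flows(trips):
    """Count trips per (origin, destination) pair and return sorted rows."""
    counts = Counter(trips)
    return [
        {"origin": origin, "dest": dest, "count": count}
        for (origin, dest), count in sorted(counts.items())
    ]


# Hypothetical trip records between three locations.
trips = [
    ("A", "B"),
    ("A", "B"),
    ("B", "A"),
    ("A", "C"),
    ("C", "C"),
    ("A", "B"),
]
flows = aggregate_flows(trips)
```

Direction is preserved, so `A → B` and `B → A` remain separate rows, which is what lets a flow map show asymmetric movement.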

✤ GIS python libraries:

  • geopandas : Python tools for geographic data.
  • RSGISlib : The Remote Sensing and GIS software library (RSGISLib) is a collection of tools for processing remote sensing and GIS datasets.
  • ipyleaflet : If you want to create interactive maps, ipyleaflet is a fusion of Jupyter notebook and Leaflet. You can control an assortment of customizations like loading basemaps, geojson, and widgets.
  • geemap : A Python package for interactive mapping with Google Earth Engine, ipyleaflet, and ipywidgets.
  • lidar : lidar is a Python package for delineating the nested hierarchy of surface depressions in digital elevation models (DEMs). It is particularly useful for analyzing high-resolution topographic data, such as DEMs derived from Light Detection and Ranging (LiDAR) data.
  • FOLIUM : Folium builds on the data wrangling strengths of the Python ecosystem and the mapping strengths of the Leaflet.js library. Manipulate your data in Python, then visualize it in a Leaflet map via folium.
  • geoplot : geoplot is a high-level Python geospatial plotting library. It’s an extension to cartopy and matplotlib which makes mapping easy: like seaborn for geospatial.

✤ ml model visualization:

  • NETRON : Netron is a viewer for neural network, deep learning and machine learning models.
  • CNN model structure summary with this code. The repo Tools to Design or Visualize Architecture of Neural Network is a good list, but I'll list only a few essentials.
  • Tensorboard : TensorFlow visualization toolkit.
  • PyTorchViz : PyTorch visualization kit.
  • visualkeras : Visualkeras is a Python package to help visualize Keras (either standalone or included in TensorFlow) neural network architectures. In addition, keras.utils.vis_utils provides utility functions to plot a Keras model using Graphviz.

Essential libraries:

Heading Libraries
toolbox
web crawling / parse
data
data pipeline / data science
databases
Streaming & stream processing
API
notification
csv - excel - json
recommenders
Documentation/ present
home automation & IOT / terminal
Tracking & monitoring
networking
automation
testing
GUI
search
pdf
debug
robotics
video conf
security
extra

Computer Vision :

| Library Name | Description | Link |
|---|---|---|
| detectron2 | Facebook's state-of-the-art vision algorithm platform for object detection and segmentation. | Link |
| yolov5 | YOLOv5 - a fast, scalable, and easy-to-use object detection model for real-time applications. | Link |
| kornia | Differentiable computer vision library for PyTorch. Contains a set of routines for solving computer vision tasks. | Link |
| mmdetection | OpenMMLab's object detection toolbox and benchmark. | Link |
| timm | PyTorch Image Models library with a large collection of pre-trained models for various image classification tasks. | Link |
| opencv-python | OpenCV Python bindings for computer vision tasks including image processing and video analysis. | Link |
| albumentations | Fast and flexible image augmentation library for deep learning tasks. | Link |
| torchvision | Vision-related utilities for PyTorch including datasets, transforms, and pretrained models. | Link |
| fastai | High-level API built on PyTorch that simplifies training neural networks for computer vision tasks. | Link |
| simplecv | An easy-to-use framework for building computer vision applications using Python. | Link |
| d2go | A streamlined version of Detectron2 optimized for mobile and edge devices. | Link |
| imageai | Python library for performing object detection, image prediction, and custom image analysis using deep learning. | Link |
| pytorch-lightning-bolts | PyTorch Lightning library with popular, well-structured vision models and pre-trained weights. | Link |
| faster-rcnn | A Faster R-CNN implementation for object detection and segmentation with pre-trained models. | Link |
| scikit-image | Image processing library for Python that complements SciPy and NumPy. | Link |
| pix2pix | Image-to-image translation with deep learning using PyTorch. | Link |
| segmentation-models.pytorch | PyTorch library for semantic segmentation models with pre-trained backbones. | Link |
| openpose | Real-time multi-person human pose detection in images and videos. | Link |
| yolo-v4 | YOLOv4: Real-time object detection with state-of-the-art performance. | Link |
| CycleGAN | A PyTorch implementation of CycleGAN for unpaired image-to-image translation. | Link |

Reinforcement Learning :

| Library Name | Description | Link |
|---|---|---|
| Stable Baselines3 | A set of reliable, optimized, and easy-to-use reinforcement learning algorithms built on PyTorch. | Link |
| Gym | A toolkit for developing and comparing reinforcement learning algorithms. | Link |
| Ray RLlib | A scalable reinforcement learning library built on top of Ray, supporting multi-agent environments and large-scale training. | Link |
| TF-Agents | TensorFlow-based library for reinforcement learning with an easy-to-use interface and many pre-built agents. | Link |
| Acme | A reinforcement learning framework developed by DeepMind for research purposes with modular design. | Link |
| OpenAI Baselines | A collection of high-quality implementations of reinforcement learning algorithms. | Link |
| Horizon | Facebook’s open-source end-to-end platform for applying reinforcement learning in production environments. | Link |
| PettingZoo | A multi-agent reinforcement learning (MARL) environment library, built on top of Gym. | Link |
| DQN | A PyTorch implementation of the Deep Q-Network (DQN) algorithm, one of the most famous RL algorithms. | Link |
| Mario-RL | Reinforcement learning applied to training AI agents to play Super Mario using deep Q-learning. | Link |
| Spinning Up | OpenAI’s educational resource for RL, providing foundational knowledge and simple RL code. | Link |
| Dopamine | A research framework for fast prototyping of reinforcement learning algorithms by Google. | Link |
| RLLib | RLlib is a scalable reinforcement learning library built on Ray, supporting both single-agent and multi-agent setups. | Link |
| deepmind-control | A suite of control tasks designed to support RL research, developed by DeepMind. | Link |
| RoboSchool | A collection of robot simulation environments for reinforcement learning research. | Link |
| Baselines-RL | Keras-based RL library with implementations for popular algorithms like DQN, DDPG, A3C, etc. | Link |
| Lightweight RL | A small and lightweight reinforcement learning library for easy experimentation and research. | Link |
| gym-retro | A library for reinforcement learning that allows training agents to play classic video games using OpenAI Gym environments. | Link |
| iLQR | A library implementing the iLQR algorithm for optimal control tasks in reinforcement learning. | Link |
| Shifu | Uber’s platform for deep reinforcement learning that offers distributed training and scalable architecture. | Link |
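
Several of the libraries above (DQN, Mario-RL, Stable Baselines3) build on Q-learning. As a rough illustration of the underlying update rule, here is a minimal tabular Q-learning sketch that uses the standard library. It is not taken from any listed library. The five-state corridor environment, the `step` and `train` functions, and the hyperparameter values are all hypothetical. DQN replaces the table used here with a neural network.

```python
import random

N_STATES = 5        # states 0..4; reaching state 4 ends the episode
ACTIONS = (-1, +1)  # index 0 moves left, index 1 moves right


def step(state, action):
    """Move along the corridor; reward 1.0 only when the goal is reached."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, or when both actions are tied.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            nxt, reward, done = step(state, ACTIONS[a])
            # Q-learning target: reward plus discounted best next-state value.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
    return q


q_table = train()
# Greedy action for every non-terminal state.
greedy_policy = [ACTIONS[0 if row[0] > row[1] else 1] for row in q_table[:-1]]
```

After training, the greedy policy moves right in every non-terminal state, and the learned values decay by the discount factor with each step of distance from the goal.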

Extra libraries: 🌸

| Repository | Description |
|---|---|
| manim | An animation engine for creating mathematical animations. |
| clearml | A platform for managing and automating machine learning experiments. |
| 3b1b videos | Code for the YouTube channel 3Blue1Brown, focused on explaining mathematical concepts. |
| scientific visualization book | A book on scientific visualization with Python. |
| bottles | A platform for managing and running Windows applications on Linux using Wine. |
| dash | A framework for building analytical web applications in Python. |
| ggpy | A Python interface to the R library ggplot2 for data visualization. |
| Lenia | A project for creating generative life simulations. |
| cartography | A tool for building knowledge graphs from cloud environments. |
| Pywonderland | A library for generating 2D and 3D visualization of dynamical systems. |
| altair | Declarative statistical visualization library for Python. |
| yellowbrick | A suite of visual analysis and diagnostic tools for machine learning. |
| shapely | A Python library for manipulation and analysis of planar geometric objects. |
| orange3 | A data mining software suite with a focus on visual programming. |
| science plots | A collection of scientific plotting styles for Matplotlib. |
| flask_jsondash | A Flask extension for creating interactive dashboards. |
| BLOOM | A large language model project, focusing on building the largest open-access model. |
| whisper | OpenAI's speech-to-text model for transcription and translation. |
| github/whisper | Open-source implementation of Whisper, OpenAI's automatic speech recognition system. |
| ray | A unified computing framework for building scalable machine learning applications. |
| elevenlabs | AI-driven platform for realistic speech synthesis. |
| runwayml | A creative toolkit for using machine learning models in artistic and design work. |
| caliban | A tool for launching and tracking numerical experiments in reproducible environments. |
| kornia | A differentiable computer vision library for PyTorch. |
| intel analytics-zoo | A big data AI platform for scaling AI workflows. |
| mljar-supervised | An automated machine learning package for tabular data. |
| deepdetect | AI platform supporting deep learning and traditional machine learning algorithms. |
| dopamine | A framework for deep reinforcement learning research. |
| deepmind lab | A 3D environment for deep reinforcement learning research. |
| predictionio | A machine learning server for building predictive engines. |
| detectron2 | Facebook's state-of-the-art platform for computer vision. |
| tflearn | High-level API for TensorFlow, aimed at simplifying machine learning workflows. |
| faceswap | Open-source library for creating deepfake images. |
| waveglow | NVIDIA's WaveGlow for audio and speech synthesis. |
| neural-enhance | Super-resolution technique for improving image quality. |
| real-time voice cloning | Real-time voice cloning and manipulation tool. |
| fasttext | Facebook's library for word representation learning and text classification. |
| deOldify | A project for colorizing and restoring old photos and videos. |
| NeuralTalk2 | A deep learning project for generating captions for images and videos. |
| face-recognition | Real-time face recognition using deep learning. |
| U GATIT | Image-to-anime style transfer using a GAN-based model. |
| srez | Image super-resolution with deep learning. |
| TecoGAN | Super-resolution technique for video frames. |
| CMU open-pose | Real-time multi-person keypoint detection (body, face, hands, and feet). |
| spaCy | Industrial-grade NLP library for Python. |
| server | Optimized inference solution for cloud and edge deployment. |
| background matting v2 | Real-time background removal. |
| skyAR | Sky replacement using CycleGAN-based techniques. |
| txtai | AI-powered semantic search applications. |
| ONNX | An open ecosystem for AI model interoperability. |
| open-cog | AGI-focused repository integrating various AI algorithms. |
| prophet | Facebook's tool for high-quality time series forecasting with multiple seasonalities. |
| Apache SystemDS | A system for end-to-end data science with distributed machine learning. |
| AIF360 | Fairness metrics and algorithms to mitigate bias in datasets and models. |
| tpot | Automated machine learning library using genetic programming. |
| feature-tool | A tool for automatically creating features from temporal and relational data. |
| auto-sklearn | An automated machine learning toolkit for scikit-learn. |
| skorch | A scikit-learn-compatible wrapper for PyTorch neural networks. |
| streamlit | Framework for building interactive data apps in Python. |
| optuna | An automatic hyperparameter optimization framework for machine learning. |
| shap | A game-theoretic approach to explaining machine learning models. |
| pandas-profiling | Generates profiling reports for pandas DataFrames. |