    Repositories list

    • rocBLAS

      Public
      Next generation BLAS implementation for ROCm platform
      C++
      Other
      16935151 Updated Dec 24, 2024
    • ROCm

      Public
      AMD ROCm™ Software - GitHub Home
      Shell
      MIT License
      3934.8k10816 Updated Dec 24, 2024
    • Fast and memory-efficient exact attention
      Python
      BSD 3-Clause "New" or "Revised" License
      1.4k1482412 Updated Dec 24, 2024
    • ONNX Runtime: cross-platform, high performance scoring engine for ML models
      C++
      MIT License
      3k606 Updated Dec 24, 2024
    • This is the AMD-maintained fork of the LLVM git repository. This repository accepts pull requests and issues related to AMD fork-specific topics (amd/*). For all other issues/PRs, please submit upstream at https://github.com/llvm/llvm-project.
      LLVM
      Other
      12k1253317 Updated Dec 24, 2024
    • Composable Kernel: Performance Portable Programming Model for Machine Learning Tensor Operators
      C++
      Other
      1353282447 Updated Dec 24, 2024
    • xla

      Public
      A machine learning compiler for GPUs, CPUs, and ML accelerators
      C++
      Apache License 2.0
      4553018 Updated Dec 24, 2024
    • vllm

      Public
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Apache License 2.0
      4.9k51113 Updated Dec 24, 2024
    • rocFFT

      Public
      Next generation FFT implementation for ROCm
      C++
      Other
      8518214 Updated Dec 24, 2024
    • C++
      MIT License
      101786 Updated Dec 24, 2024
    • rocSHMEM

      Public
      rocSHMEM intra-kernel networking runtime for AMD dGPUs on the ROCm platform.
      C++
      MIT License
      114381 Updated Dec 24, 2024
    • hipBLASLt

      Public
      hipBLASLt is a library that provides general matrix-matrix operations with a flexible API and extends functionalities beyond a traditional BLAS library
      Assembly
      MIT License
      9568672 Updated Dec 24, 2024
    • pytorch

      Public
      Tensors and Dynamic neural networks in Python with strong GPU acceleration
      Python
      Other
      23k2197737 Updated Dec 24, 2024
    • clr

      Public
      C++
      MIT License
      521101314 Updated Dec 24, 2024
    • hip-tests

      Public
      C++
      MIT License
      3033124 Updated Dec 24, 2024
    • HIP

      Public
      HIP: C++ Heterogeneous-Compute Interface for Portability
      C++
      MIT License
      5403.8k2239 Updated Dec 24, 2024
    • HIPIFY

      Public
      HIPIFY: Convert CUDA to Portable C++ Code
      C++
      MIT License
      75533200 Updated Dec 24, 2024
    • A system validation and diagnostics tool for monitoring, stress testing, detecting, and troubleshooting issues impacting AMD GPUs in high-performance computing environments
      C++
      MIT License
      3966010 Updated Dec 24, 2024
    • aotriton

      Public
      Ahead of Time (AOT) Triton Math Library
      Python
      MIT License
      1644101 Updated Dec 24, 2024
    • MIOpen

      Public
      AMD's Machine Intelligence Library
      Assembly
      Other
      2321.1k25062 Updated Dec 24, 2024
    • flang

      Public
      Mirror of flang repo: The source repo is https://github.com/flang-compiler/flang . Once a day the master branch is updated from the upstream source repo and then locked. AOMP or ROCm developers may commit or create PRs on branch aomp-dev.
      C++
      Other
      86011 Updated Dec 24, 2024
    • Shell
      Apache License 2.0
      71640 Updated Dec 24, 2024
    • omnitrace

      Public
      Omnitrace: Application Profiling, Tracing, and Analysis
      C++
      MIT License
      27304157 Updated Dec 24, 2024
    • AMD's graph optimization engine.
      C++
      MIT License
      8819035048 Updated Dec 24, 2024
    • rocAL

      Public
      The AMD rocAL is designed to efficiently decode and process images and videos from a variety of storage formats and modify them through a processing graph programmable by the user.
      C++
      MIT License
      141294 Updated Dec 24, 2024
    • rccl

      Public
      ROCm Communication Collectives Library (RCCL)
      C++
      Other
      1262841522 Updated Dec 24, 2024
    • triton

      Public
      Development repository for the Triton language and compiler
      C++
      MIT License
      1.7k101942 Updated Dec 23, 2024
    • Device Metrics Exporter exports metrics from AMD devices (GPUs) to collectors like Prometheus.
      Shell
      Apache License 2.0
      7201 Updated Dec 23, 2024
    • Cluster networking documentation for AMD Instinct accelerators
      MIT License
      3301 Updated Dec 23, 2024
    • rocPyDecode is a set of Python bindings to rocDecode C++ library which provides full HW acceleration for video decoding on AMD GPUs.
      C++
      MIT License
      7308 Updated Dec 23, 2024