
# 👩‍💻 Getting Started

## Introduction

This chapter provides detailed technical information on Confidential AI Inference, which is designed to ensure the confidentiality, integrity, and verifiability of AI inference tasks. We use the TEE technologies provided by NVIDIA GPU TEE and Intel TDX to secure AI workloads, allowing developers to easily deploy their LLMs in a secure environment.

## Overview

Confidential inference addresses critical concerns such as data privacy, secure execution, and computation verifiability, making it indispensable for sensitive applications. As illustrated in the diagram below, users today cannot fully trust the responses returned by LLM services such as OpenAI's or Meta's, because there is no cryptographic way to verify them. By running the LLM inside a TEE, we can attach a verification primitive, known as a Remote Attestation (RA) report, to each response. This allows users to verify the AI generation results locally without relying on any third party.

## Getting Started

We provide public API endpoints that let you fetch the TEE attestation report and chat with the private AI.

### Get TEE Attestation Report

Send a GET request to `https://inference-api.phala.network/v1/attestation/report` to get the TEE attestation report.
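
For example, here is a minimal sketch using Python's requests library, assuming the endpoint requires no authentication:

```python
# Minimal sketch: fetch the TEE attestation report from the public endpoint.
import requests

resp = requests.get("https://inference-api.phala.network/v1/attestation/report")
resp.raise_for_status()
report = resp.json()

print(report["signing_address"])  # account address generated inside the TEE
```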

The response looks like this:

```json
{
  "signing_address": "...",
  "nvidia_payload": "...",
  "intel_quote": "..."
}
```

The `signing_address` is the account address generated inside the TEE that is used to sign the chat response. To verify a signature, go to https://etherscan.io/verifiedSignatures, click Verify Signature, and paste the `signing_address` together with the message and signature from the chat response.
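
If you prefer to verify offline rather than through Etherscan, a sketch like the following recovers the signer with the eth_account library. It assumes the response text is signed as a standard EIP-191 personal message; the exact message format is an assumption, so check the API reference.

```python
# Hedged sketch: recover the signer of a chat response locally.
# Assumes an EIP-191 "personal sign" signature over the response text;
# the exact message format is an assumption, consult the API docs.
from eth_account import Account
from eth_account.messages import encode_defunct

def verify_response(message_text: str, signature_hex: str, signing_address: str) -> bool:
    message = encode_defunct(text=message_text)
    recovered = Account.recover_message(message, signature=signature_hex)
    return recovered.lower() == signing_address.lower()
```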

`nvidia_payload` and `intel_quote` are the attestation reports from the NVIDIA TEE and the Intel TEE, respectively. You can use them to verify the integrity of the TEE. See Verify the Attestation for more details.
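
As a sketch of the GPU half of that verification, the `nvidia_payload` can be forwarded to NVIDIA's Remote Attestation Service (NRAS). The NRAS URL and the assumption that the payload can be posted as-is are not confirmed on this page, so treat the Verify the Attestation guide as authoritative.

```python
# Hedged sketch: ask NVIDIA's Remote Attestation Service (NRAS) to verify
# the GPU evidence. Endpoint and pass-through payload shape are assumptions.
import requests

report = requests.get(
    "https://inference-api.phala.network/v1/attestation/report"
).json()

nras = requests.post(
    "https://nras.attestation.nvidia.com/v3/attest/gpu",
    headers={"Content-Type": "application/json"},
    data=report["nvidia_payload"],  # assumed to already be JSON-encoded
)
print(nras.status_code, nras.text)
```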

### Chat With Private AI

We provide an OpenAI-compatible API (see https://platform.openai.com/docs/api-reference/chat) for sending chat requests to the LLM running inside the TEE; you only need to point your client at the `https://inference-api.phala.network/v1` endpoint instead of OpenAI's.
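
For example, here is a minimal sketch using the official openai Python SDK; the model identifier and the API-key requirement are assumptions, so check confidential-AI-API.md for the supported values:

```python
# Hedged sketch: chat with the confidential LLM through the
# OpenAI-compatible API. Model id and API key are assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="https://inference-api.phala.network/v1",
    api_key="YOUR_API_KEY",  # hypothetical; see confidential-AI-API.md
)

completion = client.chat.completions.create(
    model="meta-llama/meta-llama-3.1-8b-instruct",  # hypothetical model id
    messages=[{"role": "user", "content": "Hello, private AI!"}],
)
print(completion.choices[0].message.content)
```

Because the API is wire-compatible with OpenAI's, existing tooling that accepts a custom base URL should work unchanged.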

Check `confidential-AI-API.md` for more information.

Check `host-LLM-in-TEE.md` for how to host your own private LLM in a TEE.

Check `implementation.md` for the technical details of Confidential AI Inference.
