This chapter provides detailed technical information on the Confidential AI Inference, designed to ensure confidentiality, integrity, and verifiability of AI inference tasks. We use the the TEE technologies provided by NVIDIA GPU TEE and Intel TDX to secure AI workloads, allowing developers to easily deploy their LLMs in a secure environment.
Confidential inference addresses critical concerns such as data privacy, secure execution, and computation verifiability, making it indispensable for sensitive applications. As illustrated in the diagram below, people currently cannot fully trust the responses returned by LLMs from services like OpenAI or Meta, due to the lack of cryptographic verification. By running the LLM inside a TEE, we can add verification primitives alongside the returned response, known as a Remote Attestation (RA) Report. This allows users to verify the AI generation results locally without relying on any third parties.
We provide a public API endpoint for you to get the TEE attestation report and chat with the private AI.
Send a GET request to https://inference-api.phala.network/v1/attestation/report to get the TEE attestation report.
The response will be like:
{
"signing_address": "...",
"nvidia_payload": "...",
"intel_quote": "..."
}
The signing_address
is the account address generated inside TEE that will be used to sign the chat response. You can go to https://etherscan.io/verifiedSignatures, click Verify Signature, and paste the signing_address
and message response to verify it.
nvidia_payload
and intel_quote
are the attestation report from NVIDIA TEE and Intel TEE respectively. You can use them to verify the integrity of the TEE. See Verify the Attestation for more details.
We provide OpenAI-compatible API for you to send chat request to the LLM running inside TEE, where you just need to replace the API endpoint to https://platform.openai.com/docs/api-reference/chat
.
Check the confidential-AI-API.md for more information.
Check the host-LLM-in-TEE.md for how to host your own private LLM in TEE.
Check the implementation.md for the technical details of the Confidential AI Inference.