These tests check how well the API handles different levels of load. They simulate a specified number of concurrent users hitting the endpoints, which together generate some number of requests per second.
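As a rough guide to how the number of users relates to requests per second, the sketch below applies a simple closed-loop estimate. It assumes each simulated user sends one request, waits for the response, pauses, and then repeats. The function name `estimate_rps` and the sample timings are illustrative only; the actual wait time used in loadtest.py is not stated here.

```python
def estimate_rps(users, avg_response_s, avg_wait_s):
    """Approximate steady-state requests per second for a closed-loop test.

    Each user completes one request every (response time + wait time)
    seconds, so the total rate is users divided by that cycle time.
    """
    cycle_s = avg_response_s + avg_wait_s
    if users < 0 or cycle_s <= 0:
        raise ValueError("users must be >= 0 and the cycle time must be > 0")
    return users / cycle_s


# Hypothetical numbers: 50 users, 0.5 s responses, 1.5 s pause between requests.
print(estimate_rps(users=50, avg_response_s=0.5, avg_wait_s=1.5))  # 25.0
```

Real throughput will drift from this estimate as response times grow under load, which is the effect these tests are meant to expose.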
Before running the tests, ensure that your API URL and API key are properly configured in your environment variables. Follow these steps:
- Set the API URL:
  export API_URL="https://leapfrogai-api.uds.dev"
- Set the API token:
  export BEARER_TOKEN="<your-api-key-here>"
  Note: See the API documentation to create an API key.
- (Optional) Set the model backend; this will default to vllm if unset:
  export DEFAULT_MODEL="llama-cpp-python"
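To catch a missing or placeholder value before starting a run, the variables above can be checked with a short script. This is a minimal sketch, not necessarily how loadtest.py reads its settings; `LoadTestConfig` and `load_config` are hypothetical names introduced for illustration.

```python
import os
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class LoadTestConfig:
    api_url: str
    bearer_token: str
    default_model: str


def load_config(env=None):
    """Read the load-test settings from a mapping (os.environ by default)."""
    env = os.environ if env is None else env

    # Drop a trailing slash so paths can be appended consistently.
    api_url = env.get("API_URL", "").rstrip("/")
    parsed = urlparse(api_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("API_URL must be set to a full http(s) URL")

    token = env.get("BEARER_TOKEN", "")
    if not token or token == "<your-api-key-here>":
        raise ValueError("BEARER_TOKEN must be set to a real API key")

    # Fall back to "vllm", matching the documented default.
    model = env.get("DEFAULT_MODEL") or "vllm"
    return LoadTestConfig(api_url, token, model)
```

Calling `load_config()` with no arguments reads the current environment and raises `ValueError` if a required variable is missing or still holds the placeholder.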
To start the Locust web interface and run the tests:
- Install dependencies from the project root:
  pip install ".[dev]"
- Navigate to the directory containing loadtest.py.
- Execute the following command:
  locust -f loadtest.py --web-port 8089
- Open your web browser and go to http://0.0.0.0:8089.
- Use the Locust web interface to configure and run your tests:
  - Set the number of users to simulate
  - Set the spawn rate (users per second)
  - Choose the host to test against (should match your API_URL)
  - Start the test and monitor results in real-time
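The same settings can also be supplied without the web interface, since Locust accepts them as command-line options (`--headless`, `--users`, `--spawn-rate`, `--run-time`, `--host`). The sketch below only builds the command; `build_locust_command` is a hypothetical helper, and it assumes loadtest.py runs unchanged in headless mode.

```python
import os
import shlex


def build_locust_command(users, spawn_rate, run_time, host=None, locustfile="loadtest.py"):
    """Return the argument list for a headless Locust run."""
    if users < 1 or spawn_rate <= 0:
        raise ValueError("users must be >= 1 and spawn_rate must be > 0")

    # Default to API_URL so the host matches the configured environment.
    host = host or os.environ.get("API_URL")
    if not host:
        raise ValueError("pass host or set API_URL")

    return [
        "locust",
        "-f", locustfile,
        "--headless",                     # run without the web interface
        "--users", str(users),            # number of users to simulate
        "--spawn-rate", str(spawn_rate),  # users started per second
        "--run-time", run_time,           # e.g. "30s" or "1m"
        "--host", host,
    ]


# Example values only; adjust them to the load you want to apply.
cmd = build_locust_command(
    users=10, spawn_rate=2, run_time="1m", host="https://leapfrogai-api.uds.dev"
)
print(shlex.join(cmd))
```

The printed command can be pasted into a shell, or the list can be passed to `subprocess.run` to launch the test from a script.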