fix: gateway payment tracking for AI pipelines #3358

Draft: wants to merge 3 commits into master

ad-astra-video
Collaborator

What does this pull request do? Explain your changes. (required)

DRAFT: This is still in progress and needs refinement. Debug log lines are still included for visibility.

Updates the sessions used for AI pipelines to keep balances for longer. By default, go-livepeer cleans up balances every 10 minutes and starts a new session when ticket params expire after 10 blocks. For AI jobs this window is very short: for some pipelines a single ticket can be worth many requests, so discarding the balance early causes Gateway operators to overpay.

Gateway

  • Creates tickets and tracks the balance spent with each Orchestrator, keyed by a session id derived from the ticket params the Orchestrator sends to the Gateway.
  • Ticket params expire in 10 blocks, or about 2 minutes. When they expire, a new key is used to track the balance of payments sent.
  • The balance is "reserved" while a request is in flight, and it must hold at least the Orchestrator's ticket EV for the request to go out without a new ticket. This means that once 2 tickets have been sent, no more tickets should be needed until the reserve drops below the EV again.
  • Balances are cleaned up every 10 minutes if not in use. Each balance check updates a last-update time field, which resets the TTL.

Orchestrator

  • Tracks payment balances by Gateway ETH address and pipeline/model id.
  • Uses the same 10-minute timeout to clean up balances.

This PR updates the Gateway to use the pipeline/model id as the key for tracking balances with an Orchestrator.
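
To illustrate why the choice of key matters, here is a small Go sketch. Everything in it is an assumption made for illustration: `TicketParams` is a trimmed-down stand-in rather than the real struct, and the key formats produced by `sessionKey` and `modelKey` are invented, not the ones go-livepeer uses.

```go
package main

import "fmt"

// TicketParams is a hypothetical stand-in for the ticket parameters an
// Orchestrator advertises; only the fields needed for the example are shown.
type TicketParams struct {
	RecipientRandHash string
	ExpirationBlock   int64
}

// sessionKey mimics keying by a session id derived from the ticket params.
// It changes every time the Orchestrator issues fresh params.
func sessionKey(orchAddr string, p TicketParams) string {
	return fmt.Sprintf("%s/%s/%d", orchAddr, p.RecipientRandHash, p.ExpirationBlock)
}

// modelKey mimics keying by pipeline/model id. It stays the same across
// params refreshes.
func modelKey(orchAddr, pipeline, modelID string) string {
	return fmt.Sprintf("%s/%s/%s", orchAddr, pipeline, modelID)
}

func main() {
	const orch = "0xorch"
	old := TicketParams{RecipientRandHash: "0xaaa", ExpirationBlock: 100}
	fresh := TicketParams{RecipientRandHash: "0xbbb", ExpirationBlock: 110}

	// Keyed by session: credit recorded under the old params is not found
	// once the params rotate, so a new ticket has to be sent.
	bySession := map[string]int64{sessionKey(orch, old): 500}
	fmt.Println("session key credit after refresh:", bySession[sessionKey(orch, fresh)])

	// Keyed by pipeline/model id: the same credit is still found.
	byModel := map[string]int64{modelKey(orch, "text-to-image", "model-x"): 500}
	fmt.Println("model key credit after refresh:", byModel[modelKey(orch, "text-to-image", "model-x")])
}
```

With the session-derived key, the lookup after a params refresh returns zero even though credit remains under the old key; with the pipeline/model key the credit carries over.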

Specific updates (required)

  • Gateway updated to use the pipeline/model id string to track payments
  • Accounting updated to not clear pipeline/model id balances on the cleanup interval

How did you test each of these updates (required)

Built a Docker image and sent requests.

Does this pull request close any open issues?

None

Checklist:

@github-actions github-actions bot added go Pull requests that update Go code AI Issues and PR related to the AI-video branch. labels Jan 18, 2025
@ad-astra-video
Collaborator Author

The sender nonce limit of 150 came up in the water cooler chat. I took a look and found that nonce tracking resets with the params expiration block (every 10 blocks). I think this will generally save us from having to increase the limit.

See updateSenderNonce in recipient.go.

Will edit this comment when I have a chance to test more.

@leszko
Contributor

leszko commented Feb 3, 2025

I think the main problem we need to solve is the reason the 10-minute cleanup was created in the first place. I believe the reason is that we want the same balance sheet on both G and O.

Now, if we do frequent cleanup (or keep the balance sheet "per session"), all is simple. However, if we stop cleaning up, then when O restarts it will have a different balance sheet than G. This in turn will cause "insufficient balance" errors, because G will think it has already paid, while O will think that G hasn't paid.
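
The divergence can be sketched in a few lines of Go. This is only an illustration under assumptions: `Ledger`, `Debit`, and `ErrInsufficientBalance` are hypothetical names, and both sides are modelled as plain in-memory maps keyed by an invented pipeline/model id.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrInsufficientBalance is a hypothetical error for this sketch.
var ErrInsufficientBalance = errors.New("insufficient balance")

// Ledger is a hypothetical in-memory balance sheet keyed by pipeline/model id.
type Ledger map[string]int64

// Debit charges fee against key and fails when the recorded balance is too low.
func (l Ledger) Debit(key string, fee int64) error {
	if l[key] < fee {
		return ErrInsufficientBalance
	}
	l[key] -= fee
	return nil
}

func main() {
	const key = "text-to-image/model-x"
	g, o := Ledger{}, Ledger{}

	// G pays 1000 and both sides record it, then one request costing 100 is served.
	g[key] += 1000
	o[key] += 1000
	_ = g.Debit(key, 100)
	_ = o.Debit(key, 100)

	// O restarts and loses its in-memory ledger; G keeps its own.
	o = Ledger{}

	// G still sees credit, so it sends no ticket; O sees none and rejects.
	fmt.Println("G balance:", g[key])
	fmt.Println("O debit:", o.Debit(key, 100))
}
```

In this sketch G still records a balance of 900 after the restart while O rejects the next request, which is the mismatch described above.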

@ad-astra-video
Collaborator Author

I think 10 minutes was set because, with transcoding, sessions are created at each stream start and discarded at stream end or after 1 minute of idle time.
