WIP: feat: Configurable timeout and retry for custom endpoints #5568

Draft · wants to merge 1 commit into main

Conversation

jameslamine (Contributor)

Summary

This PR adds two new configuration options for custom endpoints:

  • timeout: Sets request timeout in milliseconds (default: 10 minutes)
  • maxRetries: Sets maximum retry attempts for failed requests (default: 2)

These settings map directly to the OpenAI SDK's client options; they keep requests from hanging indefinitely and improve resilience to transient failures through automatic retries.

Fixes #5567

Example Usage:

```yaml
endpoints:
  custom:
    - name: "Example"
      timeout: 5000      # 5 second timeout
      maxRetries: 3      # Retry failed requests up to 3 times
```
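
For reference, the two options correspond one-to-one to the OpenAI Node SDK's client options. A minimal sketch of the pass-through, where `endpointConfig`, the env var, and the base URL are illustrative stand-ins rather than LibreChat's actual wiring:

```typescript
import OpenAI from 'openai';

// Illustrative stand-in for the parsed librechat.yaml entry above.
const endpointConfig = { timeout: 5000, maxRetries: 3 };

const client = new OpenAI({
  apiKey: process.env.CUSTOM_API_KEY, // hypothetical env var
  baseURL: 'https://example.com/v1',
  // Per-request timeout in milliseconds (SDK default: 10 minutes).
  timeout: endpointConfig.timeout,
  // Automatic retries on connection errors and retryable responses
  // (SDK default: 2).
  maxRetries: endpointConfig.maxRetries,
});
```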

Change Type

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

Testing

  1. Configure a custom endpoint with timeout and maxRetries:

```yaml
endpoints:
  custom:
    - name: "Test Endpoint"
      timeout: 5000
      maxRetries: 3
```

  2. Test scenarios (a sketch of a hanging test endpoint follows this list):
  • Normal API calls work as expected
  • Set a low timeout (1 ms) and confirmed the request fails with a timeout error, as expected
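
To reproduce the timeout case deterministically, a throwaway server that accepts requests but never responds works well; a minimal sketch (the port is arbitrary, point the endpoint's baseURL at it):

```typescript
import { createServer } from 'node:http';

// Accepts the request and then stays silent, so any client-side
// timeout pointed at http://localhost:8080 will fire.
const server = createServer((_req, _res) => {
  // Intentionally never respond.
});

server.listen(8080, () => {
  console.log('Hanging test endpoint on http://localhost:8080');
});
```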

Checklist

  • I have performed a self-review of my own code
  • My changes do not introduce new warnings
  • Local unit tests pass with my changes

@jameslamine jameslamine changed the title feat: Configurable timeout and retry for custom endpoints WIP: feat: Configurable timeout and retry for custom endpoints Jan 31, 2025
jameslamine (Contributor Author)

After reading the OpenAI SDK, it looks like this timeout is applied to the entire streaming request. Ideally this would be a timeout between chunks instead.

I'm going to re-think this. It might be better to set the underlying socket timeout rather than the OpenAI client's higher-level timeout and retry settings.
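
Roughly what I mean by a socket-level timeout, sketched against Node's built-in client rather than LibreChat's actual code path:

```typescript
import { request } from 'node:https';

// The `timeout` option arms a socket inactivity timer: it fires when
// the connection goes idle (including a hanging connect/handshake),
// not after a fixed deadline for the whole streamed response.
const req = request('https://example.com/v1/chat/completions', {
  method: 'POST',
  timeout: 5000,
});

req.on('timeout', () => {
  // Node only emits the event; the request must be destroyed manually.
  req.destroy(new Error('socket idle for 5s, aborting'));
});

req.on('error', (err) => console.error(err.message));
req.end();
```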

@danny-avila danny-avila marked this pull request as draft January 31, 2025 16:22
danny-avila (Owner) commented Jan 31, 2025

> timeout between chunks instead

This can be simulated, at least on the receiving end, with streamRate.

jameslamine (Contributor Author) commented Jan 31, 2025

> timeout between chunks instead
>
> This can be simulated at least on the receiving end with streamRate

Thanks! Can you provide more details on what that setting does? The issue I'm trying to solve is that we want to abort TCP connections that are hanging or failing to establish. From a user-experience perspective, if the connection to the LLM endpoint hangs, or the initial TCP handshake stalls, the request should fail fast and show an error so the user can retry.

It seems like streamRate controls how quickly tokens are streamed from the LibreChat server to the frontend browser; I don't think it would help with hanging TCP connections. Is my understanding correct?

danny-avila (Owner)

Got it, I misunderstood what you meant. AFAIK there is no timeout-between-chunks configuration; something like this would have to be custom-built.
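
A custom-built version could reset a watchdog timer on every chunk and abort when the gap grows too long. A sketch, where `withChunkTimeout` is a hypothetical helper rather than anything in LibreChat or the OpenAI SDK:

```typescript
// Hypothetical helper: re-yields chunks from `stream`, throwing if the
// gap between consecutive chunks exceeds `idleMs`.
async function* withChunkTimeout<T>(
  stream: AsyncIterable<T>,
  idleMs: number,
): AsyncGenerator<T> {
  const iterator = stream[Symbol.asyncIterator]();
  while (true) {
    let timer!: NodeJS.Timeout;
    const watchdog = new Promise<never>((_, reject) => {
      timer = setTimeout(
        () => reject(new Error(`no chunk received for ${idleMs}ms`)),
        idleMs,
      );
    });
    try {
      const result = await Promise.race([iterator.next(), watchdog]);
      if (result.done) return;
      yield result.value;
    } catch (err) {
      await iterator.return?.(); // close the underlying stream
      throw err;
    } finally {
      clearTimeout(timer);
    }
  }
}
```

A caller would then wrap the SDK's stream, e.g. `for await (const chunk of withChunkTimeout(stream, 10_000)) { ... }`, and treat the thrown error like any other aborted request.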
