16397 llm proxy custom providers #7

Open
wants to merge 58 commits into
base: main
58e8b20
increase postgresql read timeout and add dd metric
spikelu2016 Aug 10, 2024
9a28470
update CHANGELOG
spikelu2016 Aug 10, 2024
c342d01
update cache ttl
spikelu2016 Aug 10, 2024
c7db373
change db data type
spikelu2016 Aug 11, 2024
29a800c
update CHANGELOG
spikelu2016 Aug 11, 2024
bd64d6e
fix
spikelu2016 Aug 31, 2024
c7ae18b
fix
spikelu2016 Aug 31, 2024
66a3a0a
fix
spikelu2016 Aug 31, 2024
e34416f
remove
spikelu2016 Aug 31, 2024
876fa3c
add amazon bedrock integrations for claude
spikelu2016 Sep 9, 2024
9f42b41
update doc
spikelu2016 Sep 9, 2024
5dc1e68
fix provider selection issue
spikelu2016 Sep 12, 2024
8682f5e
update CHANGELOG
spikelu2016 Sep 12, 2024
0750ef8
fixed compatibility issues
spikelu2016 Sep 13, 2024
aa6321c
add support for openai o1
spikelu2016 Sep 16, 2024
805ef73
update CHANGElOG
spikelu2016 Sep 16, 2024
0976748
remove redundant gpt-4o entries from supported models
phlego Oct 10, 2024
7aedf1c
add `gpt-4o-mini` to supported models
phlego Oct 10, 2024
b0be28e
remove redundant attribute
phlego Oct 10, 2024
d3467c6
Merge pull request #85 from phlego/feature/support-gpt-4o-mini
spikelu2016 Oct 15, 2024
193b362
update CHANGELOG
spikelu2016 Oct 15, 2024
6773314
chore: update gpt-4o models' prices
Adibov Oct 16, 2024
4fee383
Merge pull request #86 from Adibov/patch-1
spikelu2016 Oct 16, 2024
8b3c614
update CHANGELOG
spikelu2016 Oct 16, 2024
0c74013
update CHANGELOG
spikelu2016 Oct 16, 2024
d792ed8
add request level timeout
spikelu2016 Oct 24, 2024
ca59ca6
update doc
spikelu2016 Oct 24, 2024
768ab1d
add support for AWS elastic cache.
lei-lei-shanda Oct 30, 2024
3009774
run `go mod tidy`.
lei-lei-shanda Oct 30, 2024
8e07ef4
add gpt-4o latest model.
lei-lei-shanda Oct 30, 2024
2ace998
upgrade dependency of goopenai to support structured output.
lei-lei-shanda Oct 30, 2024
a631038
Merge pull request #87 from galileilei/dev-ll-more-redis-config
spikelu2016 Oct 31, 2024
a4baa74
revert changes to gpt-4o cost.
lei-lei-shanda Oct 31, 2024
a0f7edb
fix model version number.
lei-lei-shanda Nov 1, 2024
5a6ab5a
Merge pull request #88 from galileilei/dev-ll-structured-output-support
spikelu2016 Nov 4, 2024
4796a61
go:1.23.2 setup-go@5 checkout@v4 docker/login-action@v3 docker/setup-…
andrewrothstein Nov 4, 2024
162c424
Merge pull request #89 from andrewrothstein/feature/freshen-dependencies
spikelu2016 Nov 7, 2024
b70ed33
add fix
lei-lei-shanda Nov 7, 2024
02e5328
Merge pull request #90 from galileilei/dev-ll-fix-openai-unmarshall-e…
spikelu2016 Nov 10, 2024
0643668
add new env variables for enabling redis tls
spikelu2016 Nov 10, 2024
9afbfb3
update cost
spikelu2016 Nov 10, 2024
78d1f24
update CHANGELOG
spikelu2016 Nov 10, 2024
d3e7cc8
add encryption
spikelu2016 Nov 16, 2024
c755d64
add debug log
spikelu2016 Nov 16, 2024
f2da1bb
add auth integration
spikelu2016 Nov 17, 2024
25627cc
fix bug
spikelu2016 Nov 18, 2024
8902184
first swing. broken.
andrewrothstein Nov 5, 2024
ae55b11
gussy up the services with named ports and named ingresses
andrewrothstein Nov 6, 2024
6807c1b
thats one service with multiple ports tho multiple ingresses
andrewrothstein Nov 6, 2024
0bbcd74
default values
andrewrothstein Nov 27, 2024
c101ce8
Merge pull request #92 from andrewrothstein/feature/add-k8s-helm-chart
spikelu2016 Dec 6, 2024
d88f94b
add support for amazon bedrock model
spikelu2016 Dec 27, 2024
6215788
add cost tracking for o1
spikelu2016 Dec 27, 2024
2a8a1df
add pushing to aws
spikelu2016 Jan 2, 2025
34ebf34
update workflow
spikelu2016 Jan 2, 2025
5fb4f93
fix bug
spikelu2016 Jan 5, 2025
c7d80a6
update encryptor initialization logic
spikelu2016 Jan 5, 2025
807c705
merge parent
sergei-bronnikov Jan 28, 2025
3 changes: 2 additions & 1 deletion .gitignore
@@ -1,4 +1,5 @@
 release_notes.md
 target
 .DS_STORE
-.vscode/launch.json
+.vscode/launch.json
+.env
57 changes: 57 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,60 @@
+## 1.39.0 - 2024-11-15
+### Added
+- Added encryption integration
+
+### Changed
+- Removed support for Redis TLS config
+
+
+## 1.38.0 - 2024-11-09
+### Added
+- Added support for `claude-3-5-haiku`
+- Added support for Redis TLS config
+- Added support for `gpt-4o-2024-08-06`
+
+## 1.37.0 - 2024-10-23
+### Added
+- Added request level timeout with HTTP header `x-request-timeout`
+
+## 1.36.5 - 2024-10-16
+### Changed
+- Updated `gpt-4o` pricing according to OpenAI updates
+
+## 1.36.4 - 2024-10-15
+### Added
+- Added support for `gpt-4o-mini` in routes
+
+## 1.36.3 - 2024-09-16
+### Added
+- Added support for OpenAI o1 models
+
+## 1.36.2 - 2024-09-13
+### Fixed
+- Fixed compatibility issues between Anthropic SDK and AWS Bedrock
+
+## 1.36.1 - 2024-09-10
+### Fixed
+- Fixed provider selection issue when a key is associated with multiple providers
+
+## 1.36.0 - 2024-09-09
+### Added
+- Added Amazon Bedrock integration for Claude models
+
+## 1.35.2 - 2024-08-10
+### Changed
+- Changed aggregated table column data types from `INT` to `BIGINT`
+
+## 1.35.1 - 2024-08-10
+### Changed
+- Changed cache TTL from `1h` to `24h` for keys and provider settings
+
+## 1.35.0 - 2024-08-10
+### Added
+- Added cost tracking for `gpt-4o-2024-08-06`
+
+### Changed
+- Changed default read time out for PostgreSQL
+
 ## 1.34.0 - 2024-07-29
 ### Added
 - Added cost tracking for `gpt-4o-mini`
4 changes: 2 additions & 2 deletions Dockerfile.dev
@@ -1,12 +1,12 @@
-FROM golang:1.22.1 AS build
+FROM golang:1.23.2 AS build
 ENV CGO_ENABLED=0
 ENV GOOS=linux

 WORKDIR /go/src/github.com/bricks-cloud/bricksllm/
 COPY . /go/src/github.com/bricks-cloud/bricksllm/
 RUN go build -ldflags="-s -w" -o ./bin/bricksllm ./cmd/bricksllm/main.go

-FROM alpine:3.17
+FROM alpine:3.20
 RUN apk --no-cache add ca-certificates
 WORKDIR /usr/bin
 COPY --from=build /go/src/github.com/bricks-cloud/bricksllm/bin /go/bin
4 changes: 2 additions & 2 deletions Dockerfile.prod
@@ -1,12 +1,12 @@
-FROM golang:1.22.1 AS build
+FROM golang:1.23.2 AS build
 ENV CGO_ENABLED=0
 ENV GOOS=linux

 WORKDIR /go/src/github.com/bricks-cloud/bricksllm/
 COPY . /go/src/github.com/bricks-cloud/bricksllm/
 RUN go build -ldflags="-s -w" -o ./bin/bricksllm ./cmd/bricksllm/main.go

-FROM alpine:3.17
+FROM alpine:3.20
 RUN apk --no-cache add ca-certificates
 WORKDIR /usr/bin
 COPY --from=build /go/src/github.com/bricks-cloud/bricksllm/bin /go/bin
87 changes: 30 additions & 57 deletions cmd/bricksllm/main.go
@@ -12,6 +12,7 @@ import (
 auth "github.com/bricks-cloud/bricksllm/internal/authenticator"
 "github.com/bricks-cloud/bricksllm/internal/cache"
 "github.com/bricks-cloud/bricksllm/internal/config"
+"github.com/bricks-cloud/bricksllm/internal/encryptor"
 "github.com/bricks-cloud/bricksllm/internal/logger/zap"
 "github.com/bricks-cloud/bricksllm/internal/manager"
 "github.com/bricks-cloud/bricksllm/internal/message"
@@ -173,130 +174,97 @@ func main() {
 }
 rMemStore.Listen()

-rateLimitRedisCache := redis.NewClient(&redis.Options{
-Addr: fmt.Sprintf("%s:%s", cfg.RedisHosts, cfg.RedisPort),
-Password: cfg.RedisPassword,
-DB: 0,
-})
+defaultRedisOption := func(cfg *config.Config, dbIndex int) *redis.Options {
+
+options := &redis.Options{
+Addr: fmt.Sprintf("%s:%s", cfg.RedisHosts, cfg.RedisPort),
+Password: cfg.RedisPassword,
+DB: cfg.RedisDBStartIndex + dbIndex,
+}
+
+return options
+}
+
+rateLimitRedisCache := redis.NewClient(defaultRedisOption(cfg, 0))
 ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
 defer cancel()
 if err := rateLimitRedisCache.Ping(ctx).Err(); err != nil {
 log.Sugar().Fatalf("error connecting to rate limit redis cache: %v", err)
 }

-costLimitRedisCache := redis.NewClient(&redis.Options{
-Addr: fmt.Sprintf("%s:%s", cfg.RedisHosts, cfg.RedisPort),
-Password: cfg.RedisPassword,
-DB: 1,
-})
+costLimitRedisCache := redis.NewClient(defaultRedisOption(cfg, 1))

 ctx, cancel = context.WithTimeout(context.Background(), 2*time.Second)
 defer cancel()
 if err := costLimitRedisCache.Ping(ctx).Err(); err != nil {
 log.Sugar().Fatalf("error connecting to cost limit redis cache: %v", err)
 }

-costRedisStorage := redis.NewClient(&redis.Options{
-Addr: fmt.Sprintf("%s:%s", cfg.RedisHosts, cfg.RedisPort),
-Password: cfg.RedisPassword,
-DB: 2,
-})
+costRedisStorage := redis.NewClient(defaultRedisOption(cfg, 2))

 ctx, cancel = context.WithTimeout(context.Background(), 2*time.Second)
 defer cancel()
 if err := costRedisStorage.Ping(ctx).Err(); err != nil {
 log.Sugar().Fatalf("error connecting to cost limit redis storage: %v", err)
 }

-apiRedisCache := redis.NewClient(&redis.Options{
-Addr: fmt.Sprintf("%s:%s", cfg.RedisHosts, cfg.RedisPort),
-Password: cfg.RedisPassword,
-DB: 3,
-})
+apiRedisCache := redis.NewClient(defaultRedisOption(cfg, 3))

 ctx, cancel = context.WithTimeout(context.Background(), 2*time.Second)
 defer cancel()
 if err := apiRedisCache.Ping(ctx).Err(); err != nil {
 log.Sugar().Fatalf("error connecting to api redis cache: %v", err)
 }

-accessRedisCache := redis.NewClient(&redis.Options{
-Addr: fmt.Sprintf("%s:%s", cfg.RedisHosts, cfg.RedisPort),
-Password: cfg.RedisPassword,
-DB: 4,
-})
+accessRedisCache := redis.NewClient(defaultRedisOption(cfg, 4))

 ctx, cancel = context.WithTimeout(context.Background(), 2*time.Second)
 defer cancel()
 if err := accessRedisCache.Ping(ctx).Err(); err != nil {
 log.Sugar().Fatalf("error connecting to api redis cache: %v", err)
 }

-userRateLimitRedisCache := redis.NewClient(&redis.Options{
-Addr: fmt.Sprintf("%s:%s", cfg.RedisHosts, cfg.RedisPort),
-Password: cfg.RedisPassword,
-DB: 5,
-})
+userRateLimitRedisCache := redis.NewClient(defaultRedisOption(cfg, 5))

 ctx, cancel = context.WithTimeout(context.Background(), 2*time.Second)
 defer cancel()
 if err := userRateLimitRedisCache.Ping(ctx).Err(); err != nil {
 log.Sugar().Fatalf("error connecting to user rate limit redis cache: %v", err)
 }

-userCostLimitRedisCache := redis.NewClient(&redis.Options{
-Addr: fmt.Sprintf("%s:%s", cfg.RedisHosts, cfg.RedisPort),
-Password: cfg.RedisPassword,
-DB: 6,
-})
+userCostLimitRedisCache := redis.NewClient(defaultRedisOption(cfg, 6))

 ctx, cancel = context.WithTimeout(context.Background(), 2*time.Second)
 defer cancel()
 if err := userCostLimitRedisCache.Ping(ctx).Err(); err != nil {
 log.Sugar().Fatalf("error connecting to user cost limit redis cache: %v", err)
 }

-userCostRedisStorage := redis.NewClient(&redis.Options{
-Addr: fmt.Sprintf("%s:%s", cfg.RedisHosts, cfg.RedisPort),
-Password: cfg.RedisPassword,
-DB: 7,
-})
+userCostRedisStorage := redis.NewClient(defaultRedisOption(cfg, 7))

 ctx, cancel = context.WithTimeout(context.Background(), 2*time.Second)
 defer cancel()
 if err := userCostRedisStorage.Ping(ctx).Err(); err != nil {
 log.Sugar().Fatalf("error connecting to user cost redis cache: %v", err)
 }

-userAccessRedisCache := redis.NewClient(&redis.Options{
-Addr: fmt.Sprintf("%s:%s", cfg.RedisHosts, cfg.RedisPort),
-Password: cfg.RedisPassword,
-DB: 8,
-})
+userAccessRedisCache := redis.NewClient(defaultRedisOption(cfg, 8))

 ctx, cancel = context.WithTimeout(context.Background(), 2*time.Second)
 defer cancel()
 if err := userAccessRedisCache.Ping(ctx).Err(); err != nil {
 log.Sugar().Fatalf("error connecting to user access redis storage: %v", err)
 }

-providerSettingsRedisCache := redis.NewClient(&redis.Options{
-Addr: fmt.Sprintf("%s:%s", cfg.RedisHosts, cfg.RedisPort),
-Password: cfg.RedisPassword,
-DB: 9,
-})
+providerSettingsRedisCache := redis.NewClient(defaultRedisOption(cfg, 9))

 ctx, cancel = context.WithTimeout(context.Background(), 2*time.Second)
 defer cancel()
 if err := providerSettingsRedisCache.Ping(ctx).Err(); err != nil {
 log.Sugar().Fatalf("error connecting to provider settings redis storage: %v", err)
 }

-keysRedisCache := redis.NewClient(&redis.Options{
-Addr: fmt.Sprintf("%s:%s", cfg.RedisHosts, cfg.RedisPort),
-Password: cfg.RedisPassword,
-DB: 10,
-})
+keysRedisCache := redis.NewClient(defaultRedisOption(cfg, 10))

 ctx, cancel = context.WithTimeout(context.Background(), 2*time.Second)
 defer cancel()
@@ -318,11 +286,16 @@ func main() {
 psCache := redisStorage.NewProviderSettingsCache(providerSettingsRedisCache, cfg.RedisWriteTimeout, cfg.RedisReadTimeout)
 keysCache := redisStorage.NewKeysCache(keysRedisCache, cfg.RedisWriteTimeout, cfg.RedisReadTimeout)

+encryptor, err := encryptor.NewEncryptor(cfg.DecryptionEndpoint, cfg.EncryptionEndpoint, cfg.EnableEncrytion, cfg.EncryptionTimeout, cfg.Audience)
+if cfg.EnableEncrytion && err != nil {
+log.Sugar().Fatalf("error creating encryption client: %v", err)
+}
 v := validator.NewValidator(costLimitCache, rateLimitCache, costStorage)

+
 m := manager.NewManager(store, costLimitCache, rateLimitCache, accessCache, keysCache)
 krm := manager.NewReportingManager(costStorage, store, store, v)
-psm := manager.NewProviderSettingsManager(store, psCache)
+psm := manager.NewProviderSettingsManager(store, psCache, encryptor)
 cpm := manager.NewCustomProvidersManager(store, cpMemStore)
 rm := manager.NewRouteManager(store, store, rMemStore, psm)
 pm := manager.NewPolicyManager(store, rMemStore)
@@ -359,7 +332,7 @@ func main() {

 rec := recorder.NewRecorder(costStorage, userCostStorage, costLimitCache, userCostLimitCache, ce, store)
 rlm := manager.NewRateLimitManager(rateLimitCache, userRateLimitCache)
-a := auth.NewAuthenticator(psm, m, rm, store)
+a := auth.NewAuthenticator(psm, m, rm, store, encryptor)

 c := cache.NewCache(apiCache)

9 changes: 4 additions & 5 deletions docker-compose.yml
@@ -1,12 +1,11 @@
-version: '3.8'
 services:
 redis:
 image: redis:6.2-alpine
 restart: always
 ports:
 - '6379:6379'
 command: redis-server --save 20 1 --loglevel warning --requirepass eYVX7EwVmmxKPCDmwMtyKVge8oLd2t81
-volumes:
+volumes:
 - redis:/data
 postgresql:
 image: postgres:14.1-alpine
@@ -16,10 +15,10 @@ services:
 - POSTGRES_PASSWORD=postgres
 ports:
 - '5432:5432'
-volumes:
+volumes:
 - postgresql:/var/lib/postgresql/data
 # bricksllm:
-# depends_on:
+# depends_on:
 # - redis
 # - postgresql
 # image: luyuanxin1995/bricksllm
@@ -38,4 +37,4 @@ volumes:
 redis:
 driver: local
 postgresql:
-driver: local
+driver: local
14 changes: 12 additions & 2 deletions docs/admin.yaml
@@ -1423,8 +1423,6 @@ components:
 type: object
 description: API Credentials associated with different providers.
 example: { "apikey": "MY_OPENAI_API_KEY" }
-required:
-- apikey
 properties:
 apikey:
 type: string
@@ -1438,6 +1436,18 @@ components:
 type: string
 example: MY_AZURE_OPENAI_RESOURCE_NAME
 description: Required for Azure OpenAI integrations.
+awsAccessKeyId:
+type: string
+example: MY_AWS_ACCESS_KEY_ID
+description: Required for Bedrock Anthropic integrations.
+awsSecretAccessKey:
+type: string
+example: MY_AWS_SECRET_ACCESS_KEY
+description: Required for Bedrock Anthropic integrations.
+awsRegion:
+type: string
+example: MY_AWS_REGION
+description: Required for Bedrock Anthropic integrations.

 ReportingEventsRequest:
 type: object