
Performance: High latency in async Python #3284

Closed
suxb201 opened this issue Feb 27, 2025 · 8 comments
Labels
bug Something isn't working Optimization Optimization matter such as cleaner code, performance etc. python Python wrapper User issue Issue opened by users Users Pain An issue known to cause users pain, generally opened by the user.

Comments

@suxb201

suxb201 commented Feb 27, 2025

Describe the bug

I have noticed high latency when using valkey-glide in an asyncio environment. On my MacBook, the latency is about 1ms, while with redis-cli it is 0.3ms and with redis-py it is also 0.3ms.

Expected Behavior

Low latency, comparable to redis-cli and redis-py (~0.3 ms).

Current Behavior

Roughly 3x the latency compared to redis-py and redis-cli.

Reproduction Steps

glide code:

import asyncio
import gc
import random
import time

import glide


async def print_time():
    config = glide.GlideClientConfiguration(
        addresses=[glide.NodeAddress("127.0.0.1", 6379)],
        request_timeout=10,
        reconnect_strategy=glide.BackoffStrategy(0, 0, 0),
        protocol=glide.ProtocolVersion.RESP2,
    )
    r = await glide.GlideClient.create(config)
    while True:
        gc.disable()
        time_start = time.time()
        await r.set("test", "test")
        print(f"{time.strftime('%Y-%m-%d %H:%M:%S', time.localtime())} {(time.time() - time_start) * 1000:.3f}")
        gc.enable()
        await asyncio.sleep(random.random())
    await r.close()  # note: unreachable, the loop above never exits


async def main():
    tasks = []
    for i in range(1):
        tasks.append(asyncio.create_task(print_time()))
    await asyncio.gather(*tasks)


asyncio.run(main())

output:

Image

redis-cli --latency output:

Image

Client version used

1.3.0

Engine type and version

Redis 7.2.7

OS

Darwin Kernel Version 24.2.0

Language

Python

Language Version

3.12

@suxb201 suxb201 added the bug Something isn't working label Feb 27, 2025
@suxb201
Author

suxb201 commented Feb 27, 2025

I suspect this latency is due to the scheduling of asyncio, but it is still higher than redis-py. In a scenario with 100 tasks:

  • redis.asyncio.Redis: 0.2ms to 0.9ms
  • glide: initially 5ms, then stabilizing between 0.3ms to 1.4ms, but occasionally reporting a timeout error.
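The asyncio-scheduling suspicion can be checked in isolation, with no Redis client involved. The sketch below (a hypothetical illustration, not from this issue) measures how long a task waits to be rescheduled after yielding while 100 other tasks compete for the same event loop; that wait is a floor under any per-request latency measured inside a busy loop:

```python
import asyncio
import time

# Pure-asyncio sketch: the delay between yielding and resuming is pure
# event-loop scheduling overhead, with no network I/O involved.

async def measure(samples=200):
    worst = 0.0
    for _ in range(samples):
        t0 = time.perf_counter()
        await asyncio.sleep(0)  # yield; time until resume is scheduling delay
        worst = max(worst, time.perf_counter() - t0)
    return worst * 1000  # milliseconds

async def busy(n=100):
    # background tasks competing for the loop, like the 100 client tasks above
    async def spin():
        while True:
            await asyncio.sleep(0)

    tasks = [asyncio.create_task(spin()) for _ in range(n)]
    worst_ms = await measure()
    for t in tasks:
        t.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return worst_ms

worst = asyncio.run(busy())
print(f"worst scheduling delay with 100 busy tasks: {worst:.3f} ms")
```

Any latency number collected inside such a loop includes this scheduling component, for glide and redis-py alike.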

Here’s the glide code:

import asyncio
import gc
import random
import time

import glide

async def print_time():
    config = glide.GlideClientConfiguration(
        addresses=[glide.NodeAddress("127.0.0.1", 6379)],
        request_timeout=10,
        reconnect_strategy=glide.BackoffStrategy(0, 0, 0),
        protocol=glide.ProtocolVersion.RESP2,
    )
    r = await glide.GlideClient.create(config)
    while True:
        gc.disable()
        time_start = time.time()
        await r.set("test", "test")
        print(f"{time.strftime('%Y-%m-%d %H:%M:%S', time.localtime())} {(time.time() - time_start) * 1000:.3f}")
        gc.enable()
        await asyncio.sleep(random.random())
    await r.close()  # note: unreachable, the loop above never exits

async def main():
    tasks = []
    for i in range(100):
        tasks.append(asyncio.create_task(print_time()))
    await asyncio.gather(*tasks)

asyncio.run(main())

And here’s the code for redis-py:

import asyncio
import gc
import random
import time

import redis.asyncio as redis


async def print_time():
    r = redis.Redis()
    while True:
        gc.disable()
        time_start = time.time()
        await r.set("test", "test")
        print(f"{time.strftime('%Y-%m-%d %H:%M:%S', time.localtime())} {(time.time() - time_start) * 1000:.3f}")
        gc.enable()
        await asyncio.sleep(random.random())
    await r.close()  # note: unreachable, the loop above never exits


async def main():
    tasks = []
    for i in range(100):
        tasks.append(asyncio.create_task(print_time()))
    await asyncio.gather(*tasks)


asyncio.run(main())

@BoazBD
Collaborator

BoazBD commented Feb 27, 2025

Hi @suxb201 ! :)
Thanks for letting us know about this issue.
We'll work on reproducing the problem and investigating the potential causes and bottlenecks, and we'll keep you updated as soon as we have more information.

@BoazBD BoazBD added python Python wrapper Users Pain An issue known to cause users pain, generally opened by the user. User issue Issue opened by users Optimization Optimization matter such as cleaner code, performance etc. labels Feb 27, 2025
@suxb201
Author

suxb201 commented Feb 27, 2025

@BoazBD I am currently testing the latency variations of different Redis databases using Python, so I'm very sensitive to any delays.

@avifenesh
Member

avifenesh commented Feb 27, 2025

@suxb201 While @BoazBD is checking it: what data size are you testing with, and what load and data size do you plan for production?
I just want to make sure we test what is relevant for you, not a sample that doesn't reflect the case you really care about.

@BoazBD
Collaborator

BoazBD commented Feb 27, 2025

@suxb201

Thank you for your patience!

Glide's API is asynchronous and uses a multiplexed connection to interact with Valkey. This means that all requests are sent through a single connection, leveraging Valkey's pipelining capabilities. Pipelining enhances performance by sending multiple commands at once without waiting for individual responses, and using a single connection is the recommended method to optimize performance.
With this in mind, here are a few suggestions to optimize GLIDE's performance:

  1. Create a single client and use it across all tasks, instead of creating a new client for each task.
  2. Running the server locally may cause the client and server to interfere with each other, leading to inaccurate results. Connecting to a remote server would better simulate a real-world scenario.
  3. GLIDE is designed primarily for high throughput with real-time optimizations. The wait you introduced obstructs GLIDE's multiplexing optimizations. Removing the wait should reveal significant performance improvements.
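Suggestion 1 can be sketched without a live server. In the snippet below, `MockClient` is a hypothetical stand-in for `glide.GlideClient` (a real client would need a running server); only the sharing pattern matters: one client instance created once and reused by every task, so all requests flow through the single multiplexed connection.

```python
import asyncio

# Mock sketch of suggestion 1: create ONE client and share it across tasks,
# instead of creating a client per task as in the reproduction scripts above.

class MockClient:
    """Pretends to be a multiplexed client: every request goes through one
    connection, so a single instance can serve the whole process."""
    def __init__(self):
        self.requests = 0

    async def set(self, key, value):
        self.requests += 1
        await asyncio.sleep(0)  # yield to the event loop, like real I/O would

async def worker(client, n):
    for _ in range(n):
        await client.set("test", "test")

async def demo():
    client = MockClient()  # created once ...
    tasks = [asyncio.create_task(worker(client, 10)) for _ in range(100)]
    await asyncio.gather(*tasks)  # ... and shared by all 100 tasks
    return client.requests

print(asyncio.run(demo()))  # prints 1000
```

With a real glide client, `MockClient()` would be replaced by a single `await glide.GlideClient.create(config)` performed in `main()` and passed to each task.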

To get more accurate results, could you please share your use case and setup:

  1. What latency and throughput do you require?
  2. Are you using a cluster-mode or standalone setup?
  3. Do you have TLS enabled?

You can also run our Python benchmark, which compares redis-py and glide, with the following commands:

git clone https://github.com/valkey-io/valkey-glide.git
cd valkey-glide/benchmarks
# if TLS isn't enabled, add -no-tls flag
# if you're testing a cluster-mode setup, add -is-cluster flag
./install_and_test.sh -python -host "example.cluster.use1.cache.amazonaws.com" -concurrentTasks 100 -data 100

@suxb201
Author

suxb201 commented Feb 28, 2025

@BoazBD @avifenesh Thank you very much for your detailed explanation. I believe this statement addresses the issue:

GLIDE is designed primarily for high throughput with real-time optimizations. The wait you introduced obstructs GLIDE's multiplexing optimizations. Removing the wait should reveal significant performance improvements.

To answer your questions:

  • What is the required latency and throughput you are planning on using? I expect the latency to be around 0.15 ms.
  • Are you using a cluster mode or standalone setup? We are using a standalone setup.
  • Are you using TLS enabled? No, TLS is not enabled.

Upon further testing, I found that my previous tests were not rigorous. I have conducted some more tests, and the results show that GLIDE's latency data has significantly improved. In comparison with asyncio redis-py, GLIDE has slightly higher latency at low QPS, but as QPS increases, the latencies from both approaches become comparable.

For reference, single-threaded sync redis-py achieves about 7000 QPS.

low qps asyncio:
Image

Image

high qps asyncio:

Image

test code:

import asyncio
import random
import sys
import time

import glide
import redis.asyncio as redis


async def glide_client():
    config = glide.GlideClientConfiguration(
        addresses=[glide.NodeAddress("r-2zesvk4kudzic5yicy.redis.rds.aliyuncs.com", 6379)],
        request_timeout=10,
        reconnect_strategy=glide.BackoffStrategy(0, 0, 0),
        protocol=glide.ProtocolVersion.RESP2,
    )
    r = await glide.GlideClient.create(config)
    return r


def redis_client():
    return redis.Redis(host="r-2zesvk4kudzic5yicy.redis.rds.aliyuncs.com", port=6379, db=0)


total_count = 0
max_latency = 0
last_log_time = time.time()


async def print_time():
    global total_count, max_latency, last_log_time
    if sys.argv[1].lower() == "redis":
        print("Using asyncio Redis client")
        r = redis_client()
    else:
        assert sys.argv[1].lower() == "glide"
        print("Using Glide client")
        r = await glide_client()

    while True:
        time_start = time.time()
        await r.set("test", "test")
        latency = (time.time() - time_start) * 1000
        total_count += 1
        if latency > max_latency:
            max_latency = latency
        if time.time() - last_log_time > 1:
            print(f"{time.strftime('%Y-%m-%d %H:%M:%S', time.localtime())} max_latency: {max_latency:.3f}ms, qps: {total_count / (time.time() - last_log_time):.0f}")
            last_log_time = time.time()
            max_latency = 0
            total_count = 0
        # await asyncio.sleep(random.random())  # disabled for this run

async def main():
    tasks = []
    for i in range(1):  # 1 coroutine
        tasks.append(asyncio.create_task(print_time()))
    await asyncio.gather(*tasks)


asyncio.run(main())

That said, while the latency numbers for both GLIDE and asyncio redis-py are quite good, I am still concerned about the higher uncertainty introduced by asyncio scheduling delays, as well as letting the asyncio coding style spread through my project. Therefore, I will continue to consider using sync redis-py in my project with Tair Pulse.

Thank you all for your support and assistance!

@avifenesh
Member

@suxb201 Understandable!
Just so we can better understand users' needs: are you looking for good performance while staying with a sync style?

@avifenesh
Member

avifenesh commented Feb 28, 2025

@suxb201 I'm closing the issue.
Just pointing out, in case you're interested in the future: #3285.
A sync API is under development, and the results so far show far better performance than other sync options. This seems to be where Python GLIDE really pulls ahead, from the lowest to the highest throughput.

I believe it won't take long, but I can't promise that.

Anyway, it's worth keeping an eye on if a sync API and low latency are your goals.

3 participants