
Optimize performance for h2c protocol #1400

Merged

Conversation

wangchengming666
Collaborator

@wangchengming666 wangchengming666 commented Feb 29, 2024

Optimization results

  • Environment:

MacOS 13.3.1 (a) (22E772610a)
6-core Intel Core i7, 16 GB RAM
mac OS 10.15.7
JDK 1.8.0_291
VM version: JDK 1.8.0_291, Java HotSpot(TM) 64-Bit Server VM, 25.291-b10
VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_291.jdk/Contents/Home/jre/bin/java
VM options: -Xmx1g -Xms1g -XX:MaxDirectMemorySize=4g -XX:+UseG1GC -Djmh.ignoreLock=true -Dserver.host=localhost -Dserver.port=12200 -Dbenchmark.output=
Blackhole mode: full + dont-inline hint
Warmup: 1 iterations, 10 s each
Measurement: 1 iterations, 300 s each
Timeout: 10 min per iteration
Threads: 1000 threads, will synchronize iterations
Benchmark mode: Throughput, ops/time
Benchmark: com.alipay.sofa.benchmark.Client.existUser

  • Before optimization
Benchmark          Mode  Cnt      Score   Error  Units
Client.existUser  thrpt       12858.680          ops/s
  • After optimization
Benchmark          Mode  Cnt      Score   Error  Units
Client.existUser  thrpt       15031.383          ops/s

That is roughly a 16.9% throughput improvement.

Approach

The current code of com.alipay.sofa.rpc.transport.netty.NettyChannel#writeAndFlush looks like this:

```java
@Override
public void writeAndFlush(final Object obj) {
    // Call channel.writeAndFlush directly
    Future future = channel.writeAndFlush(obj);
    future.addListener(new FutureListener() {
        @Override
        public void operationComplete(Future future1) throws Exception {
            if (!future1.isSuccess()) {
                Throwable throwable = future1.cause();
                LOGGER.error("Failed to send to "
                    + NetUtils.channelToString(localAddress(), remoteAddress())
                    + " for msg : " + obj
                    + ", Cause by:", throwable);
            }
        }
    });
}
```

Reading the Netty 4+ source, we can see that when channel.writeAndFlush is called, Netty checks whether the calling thread is the EventLoop thread bound to that channel. If it is not the EventLoop thread, Netty constructs a write task (WriteTask) and submits it to the EventLoop to be executed later.

```java
private void write(Object msg, boolean flush, ChannelPromise promise) {
    // ... omitted ...
    final AbstractChannelHandlerContext next = findContextOutbound(flush ?
            (MASK_WRITE | MASK_FLUSH) : MASK_WRITE);
    final Object m = pipeline.touch(msg, next);
    EventExecutor executor = next.executor();
    // Check whether the current thread is the EventLoop bound to this channel
    if (executor.inEventLoop()) {
        if (flush) {
            next.invokeWriteAndFlush(m, promise);
        } else {
            next.invokeWrite(m, promise);
        }
    } else {
        final WriteTask task = WriteTask.newInstance(next, m, promise, flush);
        // Submit the write task to the EventLoop for later execution
        if (!safeExecute(executor, task, promise, m, !flush)) {
            task.cancel();
        }
    }
}
```
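The dispatch rule above can be mimicked with plain JDK concurrency primitives. The sketch below is a hypothetical, Netty-free illustration (the class and field names are invented): a single-threaded executor stands in for the channel's bound EventLoop, and a write runs inline when already on that thread, otherwise it is handed off as one task per message.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical JDK-only sketch (not Netty code) of the inEventLoop dispatch rule:
// run the write inline when already on the "loop" thread, otherwise submit a task.
public class InlineOrSubmit {
    private final ExecutorService eventLoop = Executors.newSingleThreadExecutor();
    private volatile Thread eventLoopThread;
    final AtomicInteger submittedTasks = new AtomicInteger();

    public void write(Runnable writeAction) {
        if (Thread.currentThread() == eventLoopThread) {
            writeAction.run();                // already on the loop: no hand-off
        } else {
            submittedTasks.incrementAndGet(); // one hand-off per message, like WriteTask
            eventLoop.submit(() -> {
                eventLoopThread = Thread.currentThread(); // remember the loop thread
                writeAction.run();
            });
        }
    }

    public void shutdownAndWait() {
        eventLoop.shutdown();
        try {
            eventLoop.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

A write issued from inside a running write action takes the inline branch, exactly as a write issued from an EventLoop thread does in Netty.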

From the code above we can see that when a message is written from outside the EventLoop, Netty 4 always hands the work off to the EventLoop thread, and each scheduled WriteTask writes exactly one message: the hand-offs are one-to-one with messages.

We can therefore first push all messages onto a WriteQueue. Internally, the queue obtains the EventLoop once and submits a single task, which then keeps taking messages off the queue and calling Netty 4's write.

Part of com.alipay.sofa.rpc.common.BatchExecutorQueue#run:

```java
private void run(Executor executor) {
    try {
        Queue<T> snapshot = new LinkedList<>();
        T item;
        while ((item = queue.poll()) != null) {
            snapshot.add(item);
        }
        int i = 0;
        boolean flushedOnce = false;
        while ((item = snapshot.poll()) != null) {
            if (snapshot.size() == 0) {
                flushedOnce = false;
                break;
            }
            if (i == chunkSize) {
                i = 0;
                flush(item);
                flushedOnce = true;
            } else {
                prepare(item);
                i++;
            }
        }
        if ((i != 0 || !flushedOnce) && item != null) {
            flush(item);
        }
    } finally {
        scheduled.set(false);
        if (!queue.isEmpty()) {
            scheduleFlush(executor);
        }
    }
}
```
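The snippet above shows only the drain side. A minimal JDK-only sketch of the producer/scheduling side might look like the following; the class and method names (MiniBatchQueue, enqueue, process) are illustrative and are not claimed to match the actual BatchExecutorQueue beyond the `scheduled`-flag idea:

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of the enqueue/scheduling side that pairs with a run()
// method like the one above.
public class MiniBatchQueue<T> {
    private final Queue<T> queue = new ConcurrentLinkedQueue<>();
    private final AtomicBoolean scheduled = new AtomicBoolean(false);

    public void enqueue(T item, Executor executor) {
        queue.add(item);
        scheduleFlush(executor);
    }

    // Ensure at most one drain task is pending on the executor at a time,
    // so many producer threads cause only a single EventLoop hand-off.
    protected void scheduleFlush(Executor executor) {
        if (scheduled.compareAndSet(false, true)) {
            executor.execute(() -> run(executor));
        }
    }

    private void run(Executor executor) {
        try {
            T item;
            while ((item = queue.poll()) != null) {
                process(item); // stand-in for the prepare()/flush() batching logic
            }
        } finally {
            scheduled.set(false);
            if (!queue.isEmpty()) {
                scheduleFlush(executor); // items raced in after the drain finished
            }
        }
    }

    protected void process(T item) {
    }
}
```

The compareAndSet on `scheduled` is the key design choice: producers that lose the race simply enqueue and return, so a burst of writes costs one task submission rather than one per message.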

The flush logic runs on the EventLoop thread, and as the Netty source above shows, a write issued from the EventLoop thread executes immediately with no thread hand-off. Compared with calling writeAndFlush directly from the user thread every time, this greatly reduces switches between user threads and the EventLoop thread, and lets a single WriteTask write out far more messages, achieving batched sends.
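The effect on hand-off counts can be demonstrated with a small JDK-only experiment (a toy model with invented names, not the SOFA RPC benchmark): a counting executor records how many tasks reach the "event loop" when messages are submitted one by one versus drained in a batch.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;

// Toy comparison of per-message hand-offs vs. one batched drain task.
public class HandoffDemo {
    // An executor that queues tasks instead of running them, so we can count
    // how many hand-offs the producer side generated before the loop runs.
    static class CountingLoop implements Executor {
        final Queue<Runnable> tasks = new ArrayDeque<>();
        int handoffs = 0;
        @Override
        public void execute(Runnable r) { handoffs++; tasks.add(r); }
        void drain() { Runnable r; while ((r = tasks.poll()) != null) r.run(); }
    }

    // One WriteTask per message: N messages cost N hand-offs.
    public static int perMessageHandoffs(int messages) {
        CountingLoop loop = new CountingLoop();
        for (int i = 0; i < messages; i++) {
            loop.execute(() -> { /* write one message */ });
        }
        loop.drain();
        return loop.handoffs;
    }

    // Batched: messages pile up in a queue and a single drain task is scheduled.
    public static int batchedHandoffs(int messages) {
        CountingLoop loop = new CountingLoop();
        Queue<Integer> queue = new ArrayDeque<>();
        AtomicBoolean scheduled = new AtomicBoolean(false);
        for (int i = 0; i < messages; i++) {
            queue.add(i);
            if (scheduled.compareAndSet(false, true)) {
                loop.execute(() -> { while (queue.poll() != null) { /* write */ } });
            }
        }
        loop.drain();
        return loop.handoffs;
    }
}
```

For any burst enqueued before the drain runs, the batched path submits a single task while the per-message path submits one task per message.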

A schematic diagram of the flow: (image attached in the original PR)


codecov bot commented Feb 29, 2024

Codecov Report

Attention: Patch coverage is 84.84848%, with 10 lines in your changes missing coverage. Please review.

Project coverage is 72.13%. Comparing base (357fdf0) to head (742387c).
Report is 14 commits behind head on master.

Files Patch % Lines
...com/alipay/sofa/rpc/common/BatchExecutorQueue.java 76.92% 5 Missing and 4 partials ⚠️
.../alipay/sofa/rpc/transport/netty/NettyChannel.java 88.88% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master    #1400      +/-   ##
============================================
+ Coverage     72.04%   72.13%   +0.08%     
- Complexity      795      805      +10     
============================================
  Files           422      424       +2     
  Lines         17815    17873      +58     
  Branches       2770     2774       +4     
============================================
+ Hits          12835    12892      +57     
+ Misses         3570     3566       -4     
- Partials       1410     1415       +5     


@sofastack-cla sofastack-cla bot added cla:yes CLA is ok size/L labels Mar 1, 2024
@Lo1nt
Collaborator

Lo1nt commented Mar 11, 2024

Please open a separate issue describing the detailed approach and design, and reference the issue from the PR. If it lives only in the PR, I worry it will be hard to trace back later.


stale bot commented May 10, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix This will not be worked on label May 10, 2024
@EvenLjj EvenLjj added later This will be worked on in later version and removed wontfix This will not be worked on labels May 14, 2024
Collaborator

@EvenLjj EvenLjj left a comment


LGTM

@EvenLjj EvenLjj merged commit 21acf28 into sofastack:master May 14, 2024
5 checks passed
@wangchengming666 wangchengming666 deleted the opt-performance-for-h2c-protocol branch May 14, 2024 05:15