We have some flows where we read from a `kafka` input, run the messages through an `avro` processor, and then write to an S3 bucket. We seem to be hitting a flat maximum throughput which we're reasonably sure isn't the S3 request rate limit. Our pipeline is as follows:
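Roughly the following, with the broker, topic, bucket, schema, and switch check shown here as placeholders rather than our exact values:

```yaml
input:
  kafka:
    addresses: [ "kafka:9092" ]   # placeholder broker
    topics: [ "events" ]          # placeholder topic
    batching:
      count: 10000                # doubling this to 20000 doubles our throughput

pipeline:
  processors:
    - avro:
        operator: to_json                     # decodes the binary Avro so the switch check can run
        schema_path: "file://./schema.avsc"   # placeholder schema

output:
  switch:
    cases:
      - check: this.keep == true  # placeholder Bloblang condition
        output:
          aws_s3:
            bucket: my-bucket     # placeholder bucket
            path: '${! uuid_v4() }.json'
      - output:
          drop: {}
```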
With this config we end up hitting an absolute maximum of 222 messages per second. If we double the input batching to 20000, we get a max of 444, which makes me think the limiting factor is the output. If the condition isn't met and the message goes to the drop output instead, it processes extremely quickly. We have also tried tuning the … We are running the latest Docker image that was published under … Is there anything obviously wrong?
Hey @dbason, I think having both input and output batching makes it difficult to reason about what's going on. Please try removing the input batching and moving the `avro` processor …
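In other words, something shaped like this, collapsing the switch for brevity (all values illustrative, not a recommendation of specific numbers):

```yaml
input:
  kafka:
    addresses: [ "kafka:9092" ]
    topics: [ "events" ]
    # no input-level batching; let the output decide batch boundaries

output:
  aws_s3:
    bucket: my-bucket
    max_in_flight: 64     # illustrative
    batching:
      count: 10000
      period: 1s          # flush partial batches rather than waiting indefinitely
```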
I've done some experimentation with this now. We couldn't move the `avro` processor, as we need it earlier in the pipeline to convert the messages from binary so we can run the Bloblang query on them in the switch. The key was the `checkpoint_limit` field: increasing it to `batch size * max_in_flight` increased the performance by several orders of magnitude. Thank you so much for the advice!
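Concretely, the change looked something like this (the batch size matches our config above; the output `max_in_flight` of 64 is an assumed value for illustration):

```yaml
input:
  kafka:
    addresses: [ "kafka:9092" ]
    topics: [ "events" ]
    # checkpoint_limit caps how many messages of the same partition can be
    # in flight (unacknowledged) before the input applies back pressure.
    # With a batch size of 10000 and an output max_in_flight of 64, it needs
    # to be at least 10000 * 64 for the output to stay saturated.
    checkpoint_limit: 640000
    batching:
      count: 10000
```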