kafka-data-gen is a simple message generation tool that creates events of a defined size and sends them to Kafka at a set rate (events per second, EPS).
kafka-data-gen is a Gradle project and can be built using the gradle install command.
The build creates a jar file, kafka-data-gen.jar, which is used for data generation.
With EPS throttling:
java -jar build/libs/kafka-data-gen.jar -message-count 20000000 -message-size 256 -topic kafka-demo -bootstrap.servers "localhost:9092" -acks all -kafka-retries 0 -kafka-batch-size 60000 -kafka-linger 1 -kafka-buffer-memory 33554432 -eps 10000
Unlimited EPS:
java -jar build/libs/kafka-data-gen.jar -message-count 20000000 -message-size 256 -topic kafka-demo -bootstrap.servers "localhost:9092" -acks all -kafka-retries 0 -kafka-batch-size 60000 -kafka-linger 1 -kafka-buffer-memory 33554432 -eps 0
With header support:
java -jar build/libs/kafka-data-gen.jar -message-count 10 -message-size 256 -topic kafka-demo -bootstrap.servers "localhost:9092" -acks all -kafka-retries 0 -kafka-batch-size 60000 -kafka-linger 1 -kafka-buffer-memory 33554432 -eps 0 -generate-kafka-headers true -header-gen-profile 0
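
For the throttled example above, the expected run time and payload volume follow directly from the flag values. The sketch below does that arithmetic in shell. It assumes the tool sustains exactly the requested EPS and that message-size is given in bytes; neither is confirmed here, so treat the results as rough estimates.

```shell
#!/bin/sh
# Values taken from the throttled example command.
message_count=20000000
message_size=256   # assumption: size is in bytes per message
eps=10000

# Expected duration if exactly $eps events are sent every second.
duration_s=$((message_count / eps))

# Total payload, ignoring Kafka protocol and header overhead.
total_bytes=$((message_count * message_size))

echo "estimated duration: ${duration_s} s"
echo "estimated payload:  ${total_bytes} bytes"
```

With -eps 0 there is no throttle, so the duration depends on producer and broker throughput and cannot be estimated this way.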
message-count
: Sets the number of messages to create.

message-size
: Sets the size of messages to create.

eps
: Number of events per second to send to Kafka. Set to 0 for unthrottled event creation.

topic
: Kafka topic to send messages to.

bootstrap.servers
: Kafka servers to send messages to.

acks
: Acknowledgement scheme (all, 1, 0).

kafka-retries
: Number of Kafka retries.

kafka-batch-size
: Kafka batch size.

kafka-linger
: Kafka linger setting (ms).

kafka-buffer-memory
: Kafka buffer memory size.

worker-thread-count
: Default is 4.

eventFormat
: Default is JSON. Other options include kinesis.

output-stdout
: Default is false. When true, writes records to standard out.

generate-kafka-headers
: Default is false. When true, generates Kafka records with populated headers. Used in conjunction with -header-gen-profile.

header-gen-profile
: Default is -1. Populates Kafka record headers according to a profile: 0 is all the same headers, 1 is completely random headers, 2 alternates one header value every 100 records.
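
Since -generate-kafka-headers and -header-gen-profile are used together, it can help to compare the three profiles side by side. The sketch below only prints one command per profile so the output can be reviewed before anything is run. The helper name build_cmd and the message count of 1000 are illustrative choices, not part of the tool; the remaining flags are copied from the header example.

```shell
#!/bin/sh
# build_cmd is a hypothetical helper: it prints the generator command line
# for one header profile, using only flags documented in this README.
build_cmd() {
    profile="$1"
    echo "java -jar build/libs/kafka-data-gen.jar" \
        "-message-count 1000 -message-size 256" \
        "-topic kafka-demo -bootstrap.servers localhost:9092" \
        "-acks all -kafka-retries 0 -kafka-batch-size 60000" \
        "-kafka-linger 1 -kafka-buffer-memory 33554432 -eps 0" \
        "-generate-kafka-headers true -header-gen-profile ${profile}"
}

# Profiles: 0 = all the same headers, 1 = completely random headers,
# 2 = one header value alternating every 100 records.
for profile in 0 1 2; do
    build_cmd "$profile"
done
```

To run a single profile instead of printing it, the output can be piped to a shell, for example: build_cmd 0 | sh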