Chef repository to install, configure, and run the following servers:
- Apache Kafka
- Apache Spark
- Spark Jobserver
- Apache Cassandra
This repository contains the following cookbooks (some are just wrappers):
## java-wrapper
Wrapper of the java cookbook to install Java 8.

## scala-wrapper
Wrapper of the scala cookbook to install Scala 2.11.7.

## zookeeper-wrapper
Wrapper of the zookeeper-cluster cookbook to install Zookeeper 3.5.0.

## kafka
Simplified version of the apache-kafka cookbook to install Kafka 0.8.2.1.

## spark
Cookbook to install Spark 1.4.1.

## spark-jobserver
Cookbook to install Spark Jobserver version 0.5.2.

## cassandra
Cookbook to install Cassandra from the DataStax stable release.

The zookeeper databag contains the IP addresses of all the Zookeeper nodes. We assume they are also Kafka brokers.
The spark databag contains the IP addresses of the Spark master and the workers.
The cassandra databag contains the Cassandra seed servers.
Base role for all the cluster nodes.
- java_wrapper
/nodes/<server-ip>.json
{
  "run_list": [
    "role[basic-server]"
  ],
  "automatic": {
    "ipaddress": "<server-ip>"
  }
}
Role for all the Zookeeper nodes.
- java_wrapper
- zookeeper_wrapper
/nodes/<server-ip>.json
{
  "run_list": [
    "role[zookeeper-cluster]"
  ],
  "automatic": {
    "ipaddress": "<server-ip>"
  }
}
/data_bags/zookeeper.json
{
  "id": "zookeeper",
  "nodes": [
    "<server-ip>"
  ]
}
Note: You can change the databag name. To do so, override the attribute default[zookeeper-cluster][databag] with the new name.
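
As a rough sketch of the override described in the note, a role file written in Chef's Ruby role DSL could set the attribute. The file name `roles/zookeeper-cluster.rb`, the run list entries, and the databag name `my-zookeeper` are assumptions for illustration; the role definition actually shipped in this repository may look different.

```ruby
# roles/zookeeper-cluster.rb (hypothetical file name)
name 'zookeeper-cluster'
description 'Role for all the Zookeeper nodes'

# Assumed run list, based on the entries listed for this role above.
run_list 'recipe[java_wrapper]', 'recipe[zookeeper_wrapper]'

# Point the cookbook at a databag called "my-zookeeper" (illustrative name)
# instead of the default one.
override_attributes(
  'zookeeper-cluster' => {
    'databag' => 'my-zookeeper'
  }
)
```

With an override like this, the databag file and its `id` field would need to use the new name as well.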
Role for all the Kafka brokers.
- java_wrapper
- zookeeper_wrapper
- kafka
/nodes/<server-ip>.json
{
  "run_list": [
    "role[kafka-cluster]"
  ],
  "automatic": {
    "ipaddress": "<server-ip>"
  },
  "apache_kafka": {
    "broker.id": 0
  }
}
Note:
- The broker.id must be different for each broker in the cluster.
- For now we assume that the Kafka brokers run on the same machines as the Zookeeper nodes, so the brokers use the zookeeper databag.
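
One way to keep the broker ids distinct is to generate the node files from a single list of broker addresses. The following is a minimal sketch, not part of this repository: the helper names `kafka_node` and `kafka_nodes` and the sample addresses are hypothetical, and it only prints the JSON instead of writing files.

```ruby
require 'json'

# Build the node definition for a single Kafka broker.
def kafka_node(ip, broker_id)
  {
    'run_list' => ['role[kafka-cluster]'],
    'automatic' => { 'ipaddress' => ip },
    'apache_kafka' => { 'broker.id' => broker_id }
  }
end

# Assign broker ids 0, 1, 2, ... in list order so no two brokers share one.
def kafka_nodes(ips)
  raise ArgumentError, 'duplicate IP in broker list' unless ips.uniq.size == ips.size

  ips.each_with_index.map { |ip, id| [ip, kafka_node(ip, id)] }.to_h
end

# Placeholder addresses for illustration only.
broker_ips = ['192.0.2.11', '192.0.2.12', '192.0.2.13']

kafka_nodes(broker_ips).each do |ip, node|
  puts "# nodes/#{ip}.json"
  puts JSON.pretty_generate(node)
end
```

Because each id comes from the position in the list, reordering the list changes which broker gets which id, so the list order should stay stable once the cluster is running.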
Role for the Spark master.
- java_wrapper
- scala_wrapper
- spark
/nodes/<server-ip>.json
{
  "run_list": [
    "role[spark-master]"
  ],
  "automatic": {
    "ipaddress": "<server-ip>"
  }
}
/data_bags/spark.json
{
  "id": "spark",
  "master": "<server-ip>"
}
Role for all the Spark workers.
- java_wrapper
- scala_wrapper
- spark
/nodes/<server-ip>.json
{
  "run_list": [
    "role[spark-worker]"
  ],
  "automatic": {
    "ipaddress": "<server-ip>"
  }
}
/data_bags/spark.json
{
  "id": "spark",
  "master": "<server-ip>"
}
Role for the Spark Jobserver.
- java_wrapper
- scala_wrapper
- spark_jobserver
/nodes/<server-ip>.json
{
  "run_list": [
    "role[spark-jobserver]"
  ],
  "automatic": {
    "ipaddress": "<server-ip>"
  }
}
Role for all the Cassandra nodes.
- java_wrapper
- monit_wrapper
- cassandra
/nodes/<server-ip>.json
{
  "run_list": [
    "role[cassandra-cluster]"
  ],
  "automatic": {
    "ipaddress": "<server-ip>"
  }
}
/data_bags/cassandra.json
{
  "id": "cassandra",
  "seeds": [
    "<server-ip>"
  ]
}
Note: The seeds attribute must contain all the Cassandra seed servers.
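
A small sanity check can catch a malformed seeds list before a Chef run. The sketch below is an illustration rather than part of the cookbooks: the helper name `seed_errors` and the sample address are hypothetical, and it assumes seeds are given as IP addresses, as in the example above. It can only verify the shape of the item, not that every seed server is actually listed.

```ruby
require 'json'
require 'ipaddr'

# Return a list of problems found in a parsed cassandra databag item.
# An empty list means the item has the expected shape.
def seed_errors(item)
  errors = []
  errors << 'id must be "cassandra"' unless item['id'] == 'cassandra'

  seeds = item['seeds']
  unless seeds.is_a?(Array) && !seeds.empty?
    errors << 'seeds must be a non-empty array'
    return errors
  end

  seeds.each do |seed|
    begin
      raise ArgumentError unless seed.is_a?(String)
      IPAddr.new(seed) # raises an ArgumentError subclass on malformed input
    rescue ArgumentError
      errors << "not a valid IP address: #{seed.inspect}"
    end
  end
  errors
end

# Sample item with one placeholder left unfilled (address is illustrative).
sample = JSON.parse(<<~ITEM)
  {
    "id": "cassandra",
    "seeds": ["192.0.2.21", "<server-ip>"]
  }
ITEM

puts seed_errors(sample)
```

In practice the item would be read from the databag file instead of an inline string.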