-
Notifications
You must be signed in to change notification settings - Fork 7
design_notes meta_discovery
@temujin9 has proposed, and it's a good propose, that there should exist such a thing as an 'integration cookbook'.
The hadoop_cluster cookbook should describe the hadoop_cluster, the ganglia cookbook ganglia, and the zookeeper cookbook zookeeper. Each should provide hooks that are neighborly but not exhibitionist, but should mind its own business.
The job of tying those components together should belong to a separate concern. It should know how and when to copy hbase jars into the pig home dir, or what cluster service_provide'r a redis client should reference.
-
I'm going to revert out the
node[:zookeeper][:cluster_name]
attributes -- services should always announce under their cluster. -
Until we figure out how and when to separate integrations, I'm going to isolate entanglements into their own recipes within cookbooks: so, the ganglia part of hadoop will become
ganglia_integration
or somesuch.
Pig needs jars from hbase and zookeeper. They should announce they have jars; pig should announce its home directory; the integration should decide how and where to copy the jars.
Right now in several places we have attributes like node[:zookeeper][:cluster_name]
, used to specify the cluster that provides_service zookeeper.
-
Server recipes should never use
node[:zookeeper][:cluster_name]
-- they should always announce under their native cluster. (I'd kinda like to giveprovides_service
some sugar to assume the cluster name, but need to find something backwards-compatible to use) -
Need to take a better survey of usage among clients to determine how to do this.
-
cases:
- hbase cookbook refs: hadoop, zookeeper, ganglia
- flume cookbook refs: zookeeper, ganglia.
- flume agents may reference several different flume provides_service'rs
- API using two different elasticsearch clusters
- tell flume you have logs to pick up
- tell ganglia to monitor you
Let everything with a dashboard say so, and then let one resource create a page that links to each.
These are still forming, ideas welcome.
Sometimes we want
-
if there are volumes marked for 'mysql-table_data', use that; otherwise, the 'persistent' datastore, if any; else the 'bulk' datastore, if any; else the 'fallback' datastore (which is guaranteed to exist).
-
IP addresses (or hostnames):
[:private_ip, :public_ip]
[:private_ip]
:primary_ip
-
.