This document presents the set-up and deployment of the MOSIP reference reporting framework. The reporting framework uses the below tools for real-time data streaming and visualization.
* Logstash is part of the Elastic Stack and is used to crawl data from the database, transform it, and index it into Elasticsearch (batch processing). Logstash is not required if Debezium, Kafka and Spark are used (real-time processing).
* Kibana is used as the visualization tool to create and view dashboards and reports. Reference dashboards and reports are provided as part of this deployment.
$sudo yum install java
$cd /etc/yum.repos.d/
$vi elasticsearch.repo
[elasticsearch-7.x]
name=Elasticsearch repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
$sudo yum install elasticsearch
$sudo chkconfig --add elasticsearch
network.host: xxx.xx.xx.xx <Internal_ip>
http.port: 9200
discovery.seed_hosts: ["xxx.xx.xxx.xx", "*.*.*.*", "host1", "host2"]
discovery.type: single-node
$sudo -i service elasticsearch start
Example: $curl -X GET "xxx.xx.x.x:9200/?pretty"
Example: http://xxx.xx.xxx.xx:9200/
$cd /etc/yum.repos.d/
$vi kibana.repo
[kibana-7.x]
name=Kibana repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
$sudo yum install kibana
$sudo chkconfig --add kibana
server.port: 5601
server.host: xxx.xx.xx.xx
elasticsearch.hosts: ["http://xxx.xx.xx.xx:9200"]
$sudo -i service kibana start
d. Logstash Installation and Set-up for Batch Data Processing (Optional: not required for real-time processing).
$cd /etc/yum.repos.d/
$vi logstash.repo
[logstash-7.x]
name=Elastic repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
$sudo yum install logstash
/usr/share/logstash/logstash-core/lib/jars/
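The directory above is where Logstash loads JDBC driver jars from. Assuming the reference Logstash pipelines use the JDBC input plugin against PostgreSQL, the PostgreSQL JDBC driver can be placed there; a minimal sketch, assuming driver version 42.2.14 downloaded from Maven Central (adjust the version as needed):
$cd /usr/share/logstash/logstash-core/lib/jars/
$sudo wget https://repo1.maven.org/maven2/org/postgresql/postgresql/42.2.14/postgresql-42.2.14.jar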
X-Pack is installed by default with the latest versions of Elasticsearch. Enable security by adding the below settings to elasticsearch.yml.
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
$sudo -i service elasticsearch stop
$sudo -i service elasticsearch start
$./bin/elasticsearch-setup-passwords interactive
$curl -u <username>:<password> -X GET "xxx.xx.xx.xx:9200/?pretty"
Example:
$curl -u elastic:elastic -X GET "xxx.xx.xx.xx:9200/?pretty"
$curl -u elastic:elastic "xxx.xx.xx.xx:9200/_cat/indices?v"
$curl -u elastic:elastic -XDELETE xxx.xx.xx.xx:9200/index-name
elasticsearch.username: "elastic"
elasticsearch.password: "elastic"
$sudo -i service kibana stop
$sudo -i service kibana start
http://xxx.xx.xx.xx:5601/
$cd /home/madmin/zookeeper
$wget http://apachemirror.wuchna.com/zookeeper/zookeeper-3.6.1/apache-zookeeper-3.6.1-bin.tar.gz
$tar -zxvf apache-zookeeper-3.6.1-bin.tar.gz
$cd apache-zookeeper-3.6.1-bin
$vi conf/zoo.cfg
tickTime = 2000
dataDir = /home/madmin/zookeeper/apache-zookeeper-3.6.1-bin/data
clientPort = 2181
initLimit = 5
syncLimit = 2
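If you want to run this standalone ZooKeeper instead of the one bundled with Kafka (the Kafka steps below start the bundled one), it can be started with its own server script, which reads conf/zoo.cfg by default:
$bin/zkServer.sh start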
$cd /home/madmin/kafka
$wget http://apachemirror.wuchna.com/kafka/2.5.0/kafka_2.12-2.5.0.tgz
$tar -zxvf kafka_2.12-2.5.0.tgz
$cd kafka_2.12-2.5.0
$vi config/zookeeper.properties
dataDir=/home/madmin/zookeeper/apache-zookeeper-3.6.1-bin/data
clientPort=2181
$bin/zookeeper-server-start.sh config/zookeeper.properties &
$bin/kafka-server-start.sh config/server.properties &
$bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --replication-factor 1 --partitions 1 --topic test-topic
$bin/kafka-topics.sh --list --bootstrap-server localhost:9092
$bin/kafka-console-producer.sh --bootstrap-server localhost:9092 --topic test-topic
$bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic test-topic --from-beginning
$bin/kafka-topics.sh --zookeeper localhost:2181 --delete --topic test-topic
Logical decoding, which the Debezium connector relies on, is available by default in PostgreSQL version 10 and above.
$vi /var/lib/pgsql/10/data/postgresql.conf
wal_level = logical
Restart the PostgreSQL server (example below).
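A hedged example, assuming PostgreSQL 10 was installed from the PGDG packages (the service name may differ on your installation):
$sudo systemctl restart postgresql-10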
$cd /home/madmin/Debezium
$wget https://repo1.maven.org/maven2/io/debezium/debezium-connector-postgres/1.2.0.Final/debezium-connector-postgres-1.2.0.Final-plugin.tar.gz
$tar -zxvf debezium-connector-postgres-1.2.0.Final-plugin.tar.gz
$cd debezium-connector-postgres
$cd /home/madmin/kafka/kafka_2.12-2.5.0/config
$vi connect-standalone.properties
plugin.path=/home/madmin/Debezium
key.converter.schemas.enable=false
value.converter.schemas.enable=false
$cd /home/madmin/kafka/kafka_2.12-2.5.0
$mkdir connector
$cd /home/madmin
$mkdir spark
$cd spark
$wget http://apachemirror.wuchna.com/spark/spark-2.4.6/spark-2.4.6-bin-hadoop2.7.tgz
$tar -xvf spark-2.4.6-bin-hadoop2.7.tgz
$vi ~/.bashrc
export SPARK_HOME=/home/madmin/spark/spark-2.4.6-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin
$source ~/.bashrc
$cd /home/madmin/spark/spark-2.4.6-bin-hadoop2.7
$./sbin/start-master.sh
http://xxx.xx.xx.xx:8080/
$./sbin/start-slave.sh <master-spark-URL>
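The master URL can be read from the Spark master web UI shown above; by default it is spark://<master-host>:7077. For example:
$./sbin/start-slave.sh spark://xxx.xx.xx.xx:7077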
$jps
$yum install python2
$whereis python2
$sudo ln -s /usr/bin/python2 /usr/bin/python
$wget https://bootstrap.pypa.io/get-pip.py
$python get-pip.py
$python -m pip install elasticsearch
$python -m pip install jproperties
$python -m pip install kafka-python
$python -m pip install pyspark
$python -m pip install configparser
$./bin/pyspark
$./bin/spark-submit example.py
$cd /home/madmin/kafka/kafka_2.12-2.5.0
$bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --topic REPORTING-SERVER.ida.auth_transaction
$bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --topic REPORTING-SERVER.prereg.applicant_demographic
$bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --topic REPORTING-SERVER.regprc.registration_list
$bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --topic REPORTING-SERVER.regprc.registration
$bin/kafka-topics.sh --create --bootstrap-server localhost:9092 --topic REPORTING-SERVER.audit.app_audit_log
$bin/kafka-topics.sh --list --bootstrap-server localhost:9092
1. Copy the Spark python jobs from the git repository (reporting/reporting-framework/data-streaming/jobs/*) to the below directory:
/home/madmin/spark/python-jobs/
2. Copy the Spark job properties from the git repository (reporting/reporting-framework/data-streaming/properties/*) to the below directory:
/home/madmin/spark/spark-2.4.6-bin-hadoop2.7/
3. Update the properties file (appconfig.properties) with valid Elasticsearch host, port, user credentials and index name details (a sample is sketched below).
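A minimal sketch of the kind of values appconfig.properties needs. The Elasticsearch connection key names shown here are illustrative placeholders (use the keys already present in the file); the index names match those referenced later in this document:
elasticsearch-host = xxx.xx.xx.xx
elasticsearch-port = 9200
elasticsearch-user = elastic
elasticsearch-password = elastic
prereg-indexname = idx-prereg
regclient-indexname = idx-regclient
reg-indexname = idx-reg
ida-indexname = idx-ida
audit-indexname = idx-audit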
$cd /home/madmin/spark/spark-2.4.6-bin-hadoop2.7/
$./bin/spark-submit --packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.4.6 /home/madmin/spark/python-jobs/mosip-db-streaming-audit.py localhost:9092 REPORTING-SERVER.audit.app_audit_log &
$./bin/spark-submit --packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.4.6 /home/madmin/spark/python-jobs/mosip-db-streaming-ida.py localhost:9092 REPORTING-SERVER.ida.auth_transaction &
$./bin/spark-submit --packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.4.6 /home/madmin/spark/python-jobs/mosip-db-streaming-prereg.py localhost:9092 REPORTING-SERVER.prereg.applicant_demographic &
$./bin/spark-submit --packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.4.6 /home/madmin/spark/python-jobs/mosip-db-streaming-regclient.py localhost:9092 REPORTING-SERVER.regprc.registration_list &
$./bin/spark-submit --packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.4.6 /home/madmin/spark/python-jobs/mosip-db-streaming-reg.py localhost:9092 REPORTING-SERVER.regprc.registration &
1. Copy the connector properties from the git repository (reporting/reporting-framework/kafka-connect/properties/*) to the below directory:
/home/madmin/kafka/kafka_2.12-2.5.0/connector
2. Update the connector properties (*.properties) files with the PostgreSQL details such as host, port, user credentials, database, schema and table details (a sample is sketched below).
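A minimal sketch of one Debezium PostgreSQL connector properties file (for example connector-audit.properties). The connector name, database name and credentials shown here are placeholders; the property names follow Debezium 1.2 (schema.whitelist/table.whitelist), and database.server.name must match the REPORTING-SERVER topic prefix used above:
name=mosip-audit-connector
connector.class=io.debezium.connector.postgresql.PostgresConnector
database.hostname=xxx.xx.xx.xx
database.port=5432
database.user=<postgres-user>
database.password=<postgres-password>
database.dbname=mosip_audit
database.server.name=REPORTING-SERVER
schema.whitelist=audit
table.whitelist=audit.app_audit_log
plugin.name=pgoutput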
$cd /home/madmin/kafka/kafka_2.12-2.5.0
$bin/connect-standalone.sh config/connect-standalone.properties connector/connector-audit.properties connector/connector-ida.properties connector/connector-prereg.properties connector/connector-regclient.properties connector/connector-reg.properties &
1. Copy the reference dashboard and report files from the git repository (reporting/reporting-framework/mosip-ref-dashboard/mosip-dashboards.ndjson, mosip-reports.ndjson) to the local machine from which you access Kibana in a browser.
1. Login to Kibana
2. Go to the top-left menu and select Stack Management
3. Select Saved Objects under the Kibana section
4. Click Import at the top right
5. Select mosip-dashboards.ndjson and click Import
6. Select mosip-reports.ndjson and click Import
1. Copy the Logstash pipeline configs from the repo (reporting/reporting-framework/logstah-config/*) to the below directory on the server where Logstash is running:
/usr/share/logstash/config-directory -- create the directory if it does not exist
2. Make the required changes in the config files, such as the DB connection URL, username, password, Elasticsearch host, port, etc. (a minimal sample pipeline is sketched below).
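A minimal sketch of one pipeline config, assuming the reference pipelines use the JDBC input plugin against PostgreSQL and write to Elasticsearch; the database name, query, schedule and index shown here are placeholders that illustrate the fields needing updates:
input {
  jdbc {
    jdbc_connection_string => "jdbc:postgresql://<db-host>:5432/<db-name>"
    jdbc_user => "<db-user>"
    jdbc_password => "<db-password>"
    jdbc_driver_class => "org.postgresql.Driver"
    statement => "SELECT * FROM audit.app_audit_log"
    schedule => "*/5 * * * *"
  }
}
output {
  elasticsearch {
    hosts => ["http://xxx.xx.xx.xx:9200"]
    user => "elastic"
    password => "elastic"
    index => "idx-audit"
  }
}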
$cd /usr/share/logstash
$bin/logstash -f config-directory &
b. Check that data is streaming through the Kafka connector and Kafka broker by consuming messages using the below command
$bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 --topic <topic-name> --from-beginning
$curl -u elastic:elastic -X GET "xxx.xx.xx.xx:9200/?pretty"
prereg-indexname = idx-prereg
regclient-indexname = idx-regclient
reg-indexname = idx-reg
ida-indexname = idx-ida
audit-indexname = idx-audit
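To confirm that the above indices have been created and are receiving data, a quick check with the Elasticsearch _cat API (assuming the idx-* naming shown above):
$curl -u elastic:elastic "xxx.xx.xx.xx:9200/_cat/indices/idx-*?v"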