Squall Cluster Configs
We will explain the contents of the config file squall-$VERSION/test/squall/confs/cluster/1G_hyracks:
DIP_DISTRIBUTED true
DIP_QUERY_NAME hyracks
DIP_TOPOLOGY_NAME_PREFIX username
DIP_DATA_ROOT /export/home/squalldata/tpchdb/
DIP_SQL_ROOT ../test/squall/sql_queries/
DIP_SCHEMA_PATH ../test/squall/schemas/tpch.txt
# DIP_DB_SIZE is in GBs
DIP_DB_SIZE 1
########################################
#DIP_OPTIMIZER_TYPE INDEX_SIMPLE
#DIP_MAX_SRC_PAR 1
#DIP_OPTIMIZER_TYPE INDEX_RULE_BUSHY
#DIP_MAX_SRC_PAR 1
#DIP_OPTIMIZER_TYPE NAME_MANUAL_PAR_LEFTY
#DIP_PLAN CUSTOMER:2,ORDERS:3:4
#DIP_OPTIMIZER_TYPE NAME_MANUAL_COST_LEFTY
#DIP_PLAN CUSTOMER,ORDERS
#DIP_TOTAL_SRC_PAR 20
#DIP_OPTIMIZER_TYPE NAME_RULE_LEFTY
#DIP_TOTAL_SRC_PAR 20
DIP_OPTIMIZER_TYPE NAME_COST_LEFTY
DIP_TOTAL_SRC_PAR 20
########################################
#below are unlikely to change
DIP_EXTENSION .tbl
DIP_READ_SPLIT_DELIMITER \|
DIP_GLOBAL_ADD_DELIMITER |
DIP_GLOBAL_SPLIT_DELIMITER \|
DIP_ACK_EVERY_TUPLE true
DIP_KILL_AT_THE_END true
# Storage manager parameters
# Storage directory for local runs
STORAGE_LOCAL_DIR /tmp/ramdisk
# Storage directory for cluster runs
STORAGE_DIP_DIR /export/home/squalldata/storage
STORAGE_COLD_START true
MEMORY_SIZE_MB 4096
Config file 1G_hyracks is the same as 0_01G_hyracks_ncl, except:
- DIP_DISTRIBUTED is set to true.
- DIP_TOPOLOGY_NAME_PREFIX is an optional parameter. It is used for distinguishing different users possibly running the same query at the same time on the cluster.
- DIP_DATA_ROOT refers to a location on the cluster.
- There is no DIP_RESULT_ROOT, because in Cluster Mode the results are not automatically merged and compared against a file.
Thus, to change the database size, only DIP_DB_SIZE has to be changed, and to change the query, only DIP_QUERY_NAME has to be modified. You can find more examples of config files in squall-$VERSION/test/squall/confs, or you can write new ones from scratch.
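The two-parameter change described above can be scripted. The following is a minimal sketch, not part of Squall: derive_config is a hypothetical helper that copies an existing config and rewrites only the DIP_DB_SIZE and DIP_QUERY_NAME lines, leaving comments and all other parameters untouched. It assumes the "KEY value" line format shown in the config above.

```shell
#!/bin/sh
# derive_config SRC DST SIZE_GB QUERY
# Hypothetical helper (not shipped with Squall): copy config SRC to DST,
# replacing the values of DIP_DB_SIZE and DIP_QUERY_NAME.
derive_config() {
    src=$1; dst=$2; size=$3; query=$4
    awk -v size="$size" -v query="$query" '
        # Match on the first field so commented-out lines ("#...") are kept as-is.
        $1 == "DIP_DB_SIZE"    { print "DIP_DB_SIZE " size; next }
        $1 == "DIP_QUERY_NAME" { print "DIP_QUERY_NAME " query; next }
        { print }
    ' "$src" > "$dst"
}
```

For example, derive_config 1G_hyracks 10G_hyracks 10 hyracks would write a copy of the config with DIP_DB_SIZE set to 10. The target file name 10G_hyracks is only an illustration.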
Keep in mind that in each config file you need to set DIP_DATA_ROOT. In addition, DIP_QUERY_NAME must correspond to a query from squall-$VERSION/test/squall/sql_queries/.
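Both requirements can be checked before submitting a run. The sketch below is an illustration only, not a Squall tool: conf_get and check_config are hypothetical helpers, and the check assumes that each query is stored as a file named after the query plus some extension, which the text above does not confirm.

```shell
#!/bin/sh
# conf_get FILE KEY: print the value of KEY from a "KEY value" config file.
# Lines starting with "#" have a different first field and are skipped.
conf_get() {
    awk -v key="$2" '$1 == key { print $2; exit }' "$1"
}

# check_config FILE QUERIES_DIR
# Hypothetical pre-flight check (not part of Squall). Returns non-zero if
# DIP_DATA_ROOT is missing or DIP_QUERY_NAME has no matching query file.
check_config() {
    conf=$1; queries_dir=$2
    data_root=$(conf_get "$conf" DIP_DATA_ROOT)
    query=$(conf_get "$conf" DIP_QUERY_NAME)
    if [ -z "$data_root" ]; then
        echo "DIP_DATA_ROOT is not set in $conf" >&2
        return 1
    fi
    # Assumption: query files are named <query>.<extension>.
    set -- "$queries_dir/$query".*
    if [ -z "$query" ] || [ ! -e "$1" ]; then
        echo "no query named '$query' in $queries_dir" >&2
        return 1
    fi
}
```

Run from squall-$VERSION/bin, a call such as check_config "$CONFIG_FILE_PATH" ../test/squall/sql_queries would report a problem on standard error and return a non-zero status, so it can guard the actual submission.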
You can run Squall with a desired config file as follows:
cd squall-$VERSION/bin
./squall_cluster.sh $CONFIG_FILE_PATH
where $CONFIG_FILE_PATH is a relative or full path to a config file.
Because main memory is limited, you cannot run an arbitrarily large database with small component parallelism. For information on detecting this behavior, please consult Squall query plans vs Storm topologies, section How to know we run out of memory?.