Creating forests
This page covers the following topics:
- The two approaches for creating forests - either via properties or via payloads
- Previewing the forests that will be created
- Customizing the naming of forests when using the property-driven approach
ml-gradle provides two primary ways for creating forests and replicas:
- A "payload-driven" way, where you create the exact payloads for all of the primary forests and replica forests you want for a particular database. Tedious, but effective.
- A "property-driven" way, where several properties are available to configure the number and properties of forests and replicas. ml-gradle uses these properties to build forest payloads dynamically.
The payload-driven way means any forest configuration is possible, but it's also much more tedious than simply setting a handful of properties. The two approaches are described below.
The payload-driven approach is simple and is shown in this sample project. For each database whose forests you want to define explicitly, create a file at src/main/ml-config/forests/(name of database)/(any-filename-you-want).json containing the forest payloads. As shown in the sample project, you can put many forest payloads into one file.
For example, to create custom forests for a database named my-database, add a JSON file to src/main/ml-config/forests/my-database containing the forests you want. The sample project demonstrates how this is done.
The property-driven approach is the preferred one; the payload-driven approach exists only for cases where the available properties can't meet your needs (if you run into such a case, please file an issue describing it).
The Database and forest section of the Property Reference page covers all of the properties, with version 3.2.0 adding a number of them. The bullets below provide a little more detail about which properties you'll want to use and when:
- To set how many forests are created on each host, use mlForestsPerHost. New in 3.7.0 - you can specify multiple data directories per host for a database; if you do this, then mlForestsPerHost will really specify the number of forests per data directory per host (this is true for mlContentForestsPerHost as well). So if you have a value of 2 for a database and 3 data directories for a database, you'll end up with 6 forests on a host.
- To specify that forests should only be created on one host for certain databases, use mlDatabasesWithForestsOnOneHost.
- To specify which hosts forests should be created on for certain databases, use mlDatabaseHosts.
- New in 3.3.0 - to specify which groups' hosts forests should be created on for certain databases, use mlDatabaseGroups. This takes precedence over mlDatabaseHosts, in the event that a database has an entry in both properties.
- To set default data/fast/large directories for all forests, regardless of the database, use mlForestDataDirectory, mlForestFastDataDirectory, and mlForestLargeDataDirectory.
- To set data/fast/large directories for specific databases (thus overriding the above properties), use mlDatabaseDataDirectories, mlDatabaseFastDataDirectories, and mlDatabaseLargeDataDirectories. New in 3.7.0 - you can specify multiple data directories per database.
- To create forest replicas for specific databases, use mlDatabaseNamesAndReplicaCounts.
- To set default data/fast/large directories for all replica forests, regardless of the database, use mlReplicaForestDataDirectory, mlReplicaForestFastDataDirectory, and mlReplicaForestLargeDataDirectory.
- To set data/fast/large directories for replicas for specific databases (thus overriding the above properties), use mlDatabaseReplicaDataDirectories, mlDatabaseReplicaFastDataDirectories, and mlDatabaseReplicaLargeDataDirectories.
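As a rough illustration of how several of these properties might be combined in gradle.properties, here is a minimal sketch. All database names, host names, and paths are hypothetical placeholders. The value formats for mlDatabaseDataDirectories and mlDatabaseNamesAndReplicaCounts follow the mlPrintForestPlan examples on this page; the formats shown for mlForestsPerHost and mlDatabaseHosts (comma-delimited database name and value pairs, with pipes separating multiple hosts) are assumptions that should be checked against the Property Reference page.

```properties
# All names and paths below are placeholders, not values from a real project.

# Assumed format: database name, forest count
mlForestsPerHost=my-database,2

# Three data directories for my-database. Combined with the setting above:
# 2 forests per data directory x 3 data directories = 6 forests on each host.
mlDatabaseDataDirectories=my-database,/data1|/data2|/data3

# Only create forests on a single host for this database
mlDatabasesWithForestsOnOneHost=my-small-database

# Assumed format: database name, pipe-delimited host names
mlDatabaseHosts=other-database,host1|host2

# Default data directory for forests with no database-specific entry
mlForestDataDirectory=/var/data/forests

# Two replicas per primary forest of my-database, plus a default replica directory
mlDatabaseNamesAndReplicaCounts=my-database,2
mlReplicaForestDataDirectory=/var/data/replicas
```

Running mlPrintForestPlan against a configuration like this is a reasonable way to confirm that the properties are being interpreted as intended before anything is created.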
In addition to the properties above that control replica forest creation, the underlying ml-app-deployer library also has a ReplicaBuilderStrategy interface that defines how replicas are constructed. As of the 3.12.0 release of ml-gradle and ml-app-deployer, the default strategy properly distributes replicas across hosts.
If you run into any issues with how this works and would like to define your own implementation of ReplicaBuilderStrategy, you can do so by setting a different implementation on the AppConfig object:
ext {
  def myStrategy = new org.example.MyStrategy() // must be on buildscript classpath
  mlAppConfig.setReplicaBuilderStrategy(myStrategy)
}
An existing alternative is the implementation used prior to the 3.12.0 release:
ext {
  def myStrategy = new com.marklogic.appdeployer.command.forests.GroupedReplicaBuilderStrategy()
  mlAppConfig.setReplicaBuilderStrategy(myStrategy)
}
New in 3.7.0 - you can use the mlPrintForestPlan task to see what forests and replicas will be created for a database before the database is created (there's not yet support for seeing what replicas will be created for a database that already exists - if that's of interest, please file an issue!):
gradle -Pdatabase=my-database mlPrintForestPlan
This task uses all of the above configuration properties to determine what forests and replicas will be created when you run "mlDeploy" (or a combination of "mlDeployDatabases" and "mlConfigureForestReplicas").
As an example, let's say you're connecting to a 3-host cluster (with host names of host1, host2, and host3), and you want to preview forests for a not-yet-created content database named "example-content". Since mlContentForestsPerHost defaults to 3, running the above task will print out 9 forests, each looking like this:
{
  "forest-name" : "example-content-1",
  "host" : "host1",
  "database" : "example-content"
}
Now let's add some replicas to the database - we'll add the following in gradle.properties:
mlDatabaseNamesAndReplicaCounts=example-content,2
Running the task again will still print out 9 primary forests, and each will now have 2 replicas:
{
  "forest-name" : "example-content-1",
  "host" : "host1",
  "database" : "example-content",
  "forest-replica" : [ {
    "host" : "host2",
    "replica-name" : "example-content-1-replica-1"
  }, {
    "host" : "host3",
    "replica-name" : "example-content-1-replica-2"
  } ]
}
Version 3.7.0 lets us specify multiple data directories per host - let's try that out by adding this to gradle.properties:
mlDatabaseDataDirectories=example-content,/path1|/path2
Running mlPrintForestPlan now returns 18 forests (3 hosts, each with 3 forests in each of the 2 data directories), and ml-gradle will try to balance replicas across the different data directories as well:
{
  "forest-name" : "example-content-1",
  "host" : "host1",
  "database" : "example-content",
  "data-directory" : "/path1",
  "forest-replica" : [ {
    "host" : "host2",
    "replica-name" : "example-content-1-replica-1",
    "data-directory" : "/path2"
  }, {
    "host" : "host3",
    "replica-name" : "example-content-1-replica-2",
    "data-directory" : "/path1"
  } ]
}
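For reference, the gradle.properties entries accumulated over this walkthrough would look roughly like the sketch below. The mlContentForestsPerHost line is included only to make the default explicit; it is an assumption that setting it to 3 behaves the same as leaving it unset.

```properties
# Default value, shown explicitly for clarity (forests per data directory per host)
mlContentForestsPerHost=3

# Each primary forest of example-content gets 2 replicas
mlDatabaseNamesAndReplicaCounts=example-content,2

# Two data directories per host for example-content
mlDatabaseDataDirectories=example-content,/path1|/path2
```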
If you're interested in reusing the code for calculating forests, just check out the source of the PrintForestPlanTask.
Starting in version 3.7.0, you can customize how forests are named when using the property-driven approach. Forests are created using the ml-app-deployer library, and that library uses an instance of the ForestNamingStrategy interface to name primary forests and replica forests.
To use your own instance of ForestNamingStrategy, you'll need to add a script like what's below to your build.gradle file (you can of course reference an implementation of ForestNamingStrategy that's in an external jar). The important part is to associate an implementation of ForestNamingStrategy with a database name, as shown in the "ext" block in the script:
import com.marklogic.appdeployer.AppConfig;
import com.marklogic.appdeployer.command.forests.ForestNamingStrategy;

class MyNamingStrategy implements ForestNamingStrategy {
  String getForestName(String databaseName, int forestNumber, AppConfig appConfig) {
    return "my-forest-" + databaseName + "-" + forestNumber
  }
  String getReplicaName(String databaseName, String forestName, int forestReplicaNumber, AppConfig appConfig) {
    return "my-replica-" + forestName + "-" + forestReplicaNumber
  }
}

ext {
  mlAppConfig.forestNamingStrategies.put("example-content", new MyNamingStrategy())
}