Merge pull request #975 from hpcc-systems/yadhap/JM-docs
Added detailed JM Docs
Showing 6 changed files with 201 additions and 116 deletions.
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
--- | ||
sidebar_position: 3 | ||
label: "Cluster Monitoring" | ||
title: "Cluster Monitoring" | ||
--- | ||
|
||
Tombolo currently offers one kind of cluster monitoring: it checks whether the engine(s) specified in the cluster exceed a usage percentage target. | ||
|
||
1. Engine - Specify the engine(s) inside of the cluster that you'd like to monitor. | ||
2. Cron - The schedule on which Tombolo runs the monitoring to check the user-provided parameters. | ||
3. Notify When - A list of the available monitoring parameters. A notification is sent if a parameter is set and its condition is detected. | ||
|
||
a. Exceeded cluster usage % - A usage percentage threshold. A notification is sent when usage exceeds it.<br/> | ||
|
||
4. Notification Channel - Lets users provide a set of MS Teams webhooks and/or email addresses to notify when the user-provided conditions are met. |
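
The usage check described above can be sketched as a simple threshold comparison. This is a minimal illustration, not Tombolo's implementation: the `EngineUsage` type, the `engines_over_threshold` function, and the engine names are hypothetical, and the sketch assumes "exceeded" means strictly greater than the target.

```python
from dataclasses import dataclass


@dataclass
class EngineUsage:
    # Hypothetical record: one monitored engine and its current usage.
    name: str
    used_percent: float


def engines_over_threshold(usages, threshold_percent):
    """Return the names of engines whose usage is strictly above the threshold."""
    return [u.name for u in usages if u.used_percent > threshold_percent]


# Example data (made up for illustration).
usages = [EngineUsage("thor_1", 82.5), EngineUsage("roxie_1", 40.0)]
over = engines_over_threshold(usages, 75.0)
print(over)
```

Each engine returned by such a check would correspond to one "Exceeded cluster usage %" notification on the configured channel.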
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,43 @@ | ||
--- | ||
sidebar_position: 2 | ||
label: "Directory Monitoring" | ||
title: "Directory Monitoring" | ||
--- | ||
|
||
Directory monitoring in Tombolo enables tracking of files within the HPCC cluster, covering both [Logical Files](#logical-files) and [Landing Zone Files](#landing-zone-files). For logical files, you can monitor attributes like size, type, compression status, deletion, and protection status. For landing zone files, you can track file movement, appearance, and sizes, giving you visibility into your data. | ||
|
||
1. Notification Channel - Lets users provide a set of MS Teams webhooks and/or email addresses to notify when the user-provided conditions are met. | ||
2. Cron - The schedule on which Tombolo runs the monitoring to check the user-provided parameters. | ||
3. File - This can be a singular file name or use any eligible [wildcards](/docs/User-Guides/Wildcards) to select anything matching a pattern. | ||
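
As a rough illustration of pattern-based file selection, the sketch below uses Python's standard `fnmatch` module. It assumes shell-style wildcards (`*` and `?`), which may differ from what Tombolo accepts; the Wildcards guide linked above is the authority. The file names and the `select_files` helper are hypothetical.

```python
from fnmatch import fnmatchcase


def select_files(names, pattern):
    """Return the names that match a shell-style wildcard pattern (case-sensitive)."""
    return [n for n in names if fnmatchcase(n, pattern)]


# Made-up logical file names for illustration.
names = [
    "thor::data::file_20240101",
    "thor::data::file_20240102",
    "thor::other::report",
]
matched = select_files(names, "thor::data::file_*")
print(matched)
```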
|
||
--- | ||
|
||
### Logical Files | ||
|
||
Logical Files are files stored inside of the HPCC cluster after a spray job has been run on them. For logical files, Tombolo offers the following monitoring parameters: | ||
|
||
1. Notify When - A list of the available monitoring parameters. A notification is sent if a parameter is set and its condition is detected. | ||
|
||
a. File Size - When the file size exceeds the specified size.<br/> | ||
b. File Size not in range - If the file size exceeds the maximum or falls below the minimum.<br/> | ||
c. Owner - If the owner of the file changes.<br/> | ||
d. File Type - If the file type changes.<br/> | ||
e. Compressed - If the file has been compressed.<br/> | ||
f. Protected - If the file has a new protection set for it.<br/> | ||
g. File Deleted - If the file is deleted.<br/> | ||
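
The conditions above amount to a range check on size plus a comparison of file attributes between two monitoring runs. The sketch below shows one way to express that. The attribute keys, function names, and snapshot dictionaries are assumptions made for illustration and do not reflect Tombolo's internal data model; for simplicity it treats any change in a watched attribute as notifiable.

```python
# Hypothetical attribute keys compared between two monitoring runs.
WATCHED_ATTRIBUTES = ("owner", "file_type", "compressed", "protected")


def size_not_in_range(size, minimum, maximum):
    """True when the size falls below the minimum or exceeds the maximum."""
    return size < minimum or size > maximum


def changed_attributes(previous, current):
    """Return the watched attributes whose values differ between two snapshots."""
    return [k for k in WATCHED_ATTRIBUTES if previous.get(k) != current.get(k)]


# Made-up snapshots of the same logical file from two consecutive runs.
previous = {"owner": "alice", "file_type": "flat", "compressed": False, "protected": False}
current = {"owner": "bob", "file_type": "flat", "compressed": True, "protected": False}
changes = changed_attributes(previous, current)
print(changes)
```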
|
||
--- | ||
|
||
### Landing Zone Files | ||
|
||
Landing Zone files are files on an external landing zone: either files waiting to be sprayed into the HPCC cluster or files that have been desprayed from it. | ||
|
||
1. Landing Zone - The landing zone machine name and IP address to monitor. | ||
|
||
2. Directory - The directory containing the file or pattern that you would like to monitor. | ||
|
||
3. Notify When - A list of the available monitoring parameters. A notification is sent if a parameter is set and its condition is detected. | ||
|
||
a. File is not moving - Notification is sent out if the file is not moved within the expected file move time from the first time that Tombolo detects the file.<br/> | ||
b. File is detected - Anytime Tombolo matches a file name or pattern within the specified directory and landing zone.<br/> | ||
c. Incorrect File Size - Anytime a file is detected outside of the specified size range.<br/> |
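
The "File is not moving" condition is a time comparison against the moment the file was first detected. A minimal sketch, assuming the expected move time is a duration and that the notification fires once that duration has been exceeded; the `is_file_stuck` name and the timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def is_file_stuck(first_seen, now, expected_move_time):
    """True when the file has stayed longer than the expected move time since first detection."""
    return now - first_seen > expected_move_time


# Made-up timestamps: the file was first detected at 08:00 and must move within 2 hours.
first_seen = datetime(2024, 1, 1, 8, 0)
expected = timedelta(hours=2)
print(is_file_stuck(first_seen, datetime(2024, 1, 1, 11, 0), expected))
```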
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
--- | ||
sidebar_position: 4 | ||
label: "Superfiles Monitoring" | ||
title: "Superfiles Monitoring" | ||
--- | ||
|
||
Superfiles are collections of files inside of an HPCC cluster, and can be thought of as similar to a folder in an operating system. Tombolo currently provides several monitoring parameters for these collections of files. | ||
|
||
1. Search File - This can be a singular superfile name or use any eligible [wildcards](/docs/User-Guides/Wildcards) to select anything matching a pattern. | ||
2. Cron - The schedule on which Tombolo runs the monitoring to check the user-provided parameters. | ||
3. Notify When - The conditions that trigger a notification: | ||
|
||
a. File Size Changes - When the size of the superfile, which is the sum of the size of all of the logical files inside of it, changes from the last detected value.<br/> | ||
b. Total size not in range - When the size of the superfile is not in the specified range.<br/> | ||
c. Subfile count changes - When the number of files within the superfile changes.<br/> | ||
d. Subfile count not in range - When the number of files within the superfile is not in the specified range.<br/> | ||
e. Update interval not followed - When the file doesn't receive an update within the specified interval from its last detected update.<br/> | ||
f. File deleted - When the superfile is deleted.<br/> | ||
|
||
4. Notification Channel - Lets users provide a set of MS Teams webhooks and/or email addresses to notify when the user-provided conditions are met. |
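
Because the superfile size is the sum of its subfile sizes, the size and count conditions above can be derived from two lists of subfile sizes taken on consecutive runs. The sketch below is illustrative only: the `superfile_alerts` function, its parameters, and the sample sizes are assumptions, and ranges are treated as inclusive.

```python
def superfile_alerts(prev_sizes, curr_sizes, size_range=None, count_range=None):
    """Compare two runs of subfile sizes and return the triggered condition names."""
    alerts = []
    prev_total, curr_total = sum(prev_sizes), sum(curr_sizes)

    if curr_total != prev_total:
        alerts.append("File Size Changes")
    if size_range is not None and not (size_range[0] <= curr_total <= size_range[1]):
        alerts.append("Total size not in range")
    if len(curr_sizes) != len(prev_sizes):
        alerts.append("Subfile count changes")
    if count_range is not None and not (count_range[0] <= len(curr_sizes) <= count_range[1]):
        alerts.append("Subfile count not in range")
    return alerts


# Made-up sizes: a third subfile was added since the previous run.
alerts = superfile_alerts([100, 200], [100, 200, 50], size_range=(0, 300), count_range=(1, 5))
print(alerts)
```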
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,8 @@ | ||
{ | ||
"label": "Monitoring", | ||
"position": 2, | ||
"link": { | ||
"type": "generated-index", | ||
"description": "Tombolo allows you to monitor four types of assets: Files, Clusters, Jobs, and Superfiles. For each, you can set up notification conditions and choose notification recipients. Click on a card below to learn more about each monitoring option and how to configure it." | ||
} | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,115 @@ | ||
--- | ||
sidebar_position: 1 | ||
label: "Job Monitoring" | ||
title: "Job Monitoring" | ||
--- | ||
|
||
Job monitoring in Tombolo allows you to track **Work Units (WU)** running on HPCC clusters. Tombolo checks the clusters every 30 minutes and monitors work units based on a specified name or name pattern. It can send notifications if a work unit enters an undesired state, deviates from the expected state, or violates punctuality rules, such as failing to start or complete within the expected time. | ||
|
||
Tombolo allows you to monitor various states of a work unit to ensure jobs run as expected and to alert you if issues arise. Below are the states that can be monitored: | ||
|
||
**Job Did Not Start on Expected Time** If a job is scheduled to run at a specific time but fails to start, Tombolo will alert you to ensure that the delay is addressed promptly. | ||
|
||
**Job Did Not Complete on Time** Tombolo monitors jobs to ensure they complete within the expected timeframe. If a job exceeds the expected completion time, a notification will be sent to inform you of the delay. | ||
|
||
**Failed** If a work unit encounters an error or fails for any reason during execution, Tombolo will notify you so you can take appropriate action to resolve the issue. | ||
|
||
**Aborted** If a job is manually or automatically aborted, Tombolo will send a notification, allowing you to investigate why the job was stopped prematurely. | ||
|
||
**Unknown** When a work unit enters an unknown state, Tombolo will notify you to investigate further. | ||
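
The states above combine two kinds of checks: whether the work unit is in an undesired state, and whether it has missed its expected start or completion time. The following sketch shows how such an evaluation could look. The state strings, the `job_alerts` function, and its parameters are hypothetical and are not taken from Tombolo or the HPCC API.

```python
from datetime import datetime

# Hypothetical state labels for the undesired work unit states.
UNDESIRED_STATES = {"failed", "aborted", "unknown"}


def job_alerts(state, now, expected_start, expected_completion, started, completed):
    """Return the list of conditions a work unit currently violates."""
    alerts = []
    if state in UNDESIRED_STATES:
        alerts.append(state)
    if not started and now > expected_start:
        alerts.append("not started on time")
    if not completed and now > expected_completion:
        alerts.append("not completed on time")
    return alerts


# Made-up example: the job started but is still running 30 minutes past its deadline.
start = datetime(2024, 1, 1, 9, 0)
deadline = datetime(2024, 1, 1, 10, 0)
late = job_alerts("running", datetime(2024, 1, 1, 10, 30), start, deadline, True, False)
print(late)
```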
|
||
--- | ||
|
||
<div class="custom_details_component"> | ||
<details class="env_config-details"> | ||
<summary> | ||
## Setting Up Job Monitoring | ||
</summary> | ||
|
||
To set up job monitoring in Tombolo, follow these steps. After creating the monitoring, it must be approved and activated for the monitoring to take effect. | ||
|
||
1. Access the **Monitoring** menu from the left vertical bar and select **Job**. | ||
2. This will navigate you to the Job Monitoring page. | ||
3. On the top-right corner, click the **Action** button and select **Add Job Monitoring**. | ||
4. A modal with three tabs will appear. Complete the required details in each tab. Alternatively, if you have an existing monitoring setup, you can quickly create a new one by duplicating it and modifying the necessary details. To duplicate, click on 'More' under the Actions menu in the Job Monitoring table. | ||
|
||
### **Tab 1: Basic Information** | ||
|
||
This tab collects essential details about the job monitoring setup: | ||
|
||
1. **Monitoring Name** Provide a unique name for this job monitoring configuration. The name must be distinct across all job monitorings. | ||
|
||
2. **Description** Enter a brief description to make it easier to understand the purpose of this monitoring. | ||
|
||
3. **Monitoring Scope** Choose one of the two options: | ||
Select 'Specific Job' if you know the job name and it is always the same. Select 'Monitoring by Job Pattern' if the job name changes but follows a consistent pattern. | ||
4. **Cluster** | ||
Select a predefined cluster where the job is expected to run. | ||
|
||
5. **Job Name / Job Name Pattern** | ||
If the job name is constant, enter the exact name. If not, provide a pattern for the job name. Click the information icon next to this field for details on acceptable patterns. | ||
|
||
--- | ||
|
||
### **Tab 2: Scheduling Information** | ||
|
||
This tab is used to define the job schedule and timing requirements. | ||
|
||
1. **Schedule** Use the schedule picker at the top to define when the monitoring should occur. | ||
|
||
2. **Require Complete** Toggle this to **Yes** if the job is expected to complete at a specific time. | ||
|
||
3. **Expected Start Time** Specify the time when the job is expected to start. If there is no specific expected start time, enter the same value as the expected completion time. | ||
|
||
4. **Expected Completion Time** Specify the time by which the job is expected to complete. | ||
|
||
Depending on enabled integrations, additional fields may appear in this tab. | ||
|
||
--- | ||
|
||
### **Tab 3: Notifications** | ||
|
||
This tab configures how and when notifications are sent: | ||
|
||
1. **Notify When** Choose the conditions under which a notification should be sent (e.g., job state or timing violations). | ||
|
||
2. **Primary Contact** Specify the primary recipient for this job monitoring configuration. | ||
|
||
_Depending on enabled integrations, additional fields may appear in this tab_ | ||
|
||
</details> | ||
</div> | ||
|
||
<div class="custom_details_component"> | ||
<details class="env_config-details"> | ||
<summary> | ||
## Updating Job Monitoring | ||
</summary> | ||
|
||
To update existing job monitoring, you can use the **Edit** or **Bulk Edit** options: | ||
|
||
1. **Edit Option** | ||
For individual Job Monitoring, the **Edit** option is available under **Actions** in the Job Monitoring table. This allows you to update all the fields for a specific job monitoring. | ||
|
||
2. **Bulk Edit Option** | ||
The **Bulk Edit** option, accessible via the **Action** button in the top-right corner, enables you to update multiple job monitorings simultaneously. This is useful for making changes to several monitorings at once; however, note that only a limited set of fields can be updated using this option. | ||
</details> | ||
</div> | ||
|
||
<div class="custom_details_component"> | ||
<details class="env_config-details"> | ||
<summary> | ||
## Pausing and Starting Job Monitoring | ||
</summary> | ||
Job monitoring can be started or paused as needed. To do so, use the **Start/Pause** icon under the **Actions** menu in the Job Monitoring table. | ||
</details> | ||
</div> | ||
|
||
<div class="custom_details_component"> | ||
<details class="env_config-details"> | ||
<summary> | ||
## Notifications | ||
</summary> | ||
Job Monitoring sends notifications when the specified conditions are met. These notifications are saved by Tombolo and can be accessed from the Notifications page. To view the notifications, expand the dashboard and click on **Notifications** in the left navigation menu. | ||
</details> | ||
</div> |