Maintenance Schedule Guide
This guide serves as a detailed roadmap for the bi-weekly maintenance of forest-mainnet, forest-calibnet, lotus-mainnet, Daily-Snapshot, and Sync-check. The steps below are intended to uphold the reliability, security, and performance of these critical services.
A bi-weekly maintenance routine balances keeping our services current against minimizing potential disruptions. It lets us identify and address issues quickly, reducing extended downtime and unexpected complications.
- Verify the status of all nodes to confirm they are operational. The New Relic dashboard gives an overview of all active nodes; log into Digital Ocean for further confirmation. Document any inconsistencies or anomalies you observe, or open an issue on the GitHub repository.
- Check the snapshot buckets for Mainnet and Calibnet. Confirm that snapshots update as expected and adhere to their hourly schedules.
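One way to make the hourly-schedule check concrete is to compare a snapshot's last-modified time against the current time. The sketch below assumes you have already obtained that timestamp as Unix epoch seconds (for example, from the bucket listing). The function name snapshot_is_fresh and the 3600-second default are illustrative assumptions, not part of the existing tooling.

```shell
# snapshot_is_fresh LAST_MODIFIED_EPOCH [MAX_AGE_SECONDS] [NOW_EPOCH]
# Succeeds (exit 0) when the snapshot is no older than MAX_AGE_SECONDS.
# The function name and the defaults are assumptions made for this sketch.
snapshot_is_fresh() {
    sf_last_modified=$1
    sf_max_age=${2:-3600}            # hourly schedule, per the guide
    sf_now=${3:-$(date +%s)}         # current time unless one is supplied
    sf_age=$((sf_now - sf_last_modified))
    # A negative age means the timestamp is in the future; treat that as a failure too.
    [ "$sf_age" -ge 0 ] && [ "$sf_age" -le "$sf_max_age" ]
}
```

For example, `snapshot_is_fresh "$last_modified" || echo "snapshot looks stale"` would flag a bucket whose newest snapshot is more than an hour old. You may want a larger maximum age to allow for upload time.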
For insight into how the Filecoin nodes are running, follow the steps below.
- Dashboards provide a summary, but the devil is in the details. What do key metrics like Epoch Count, Healthy Peers, and Process Wall Time tell us about forest-mainnet and forest-calibnet?
- Log Delve: Logs narrate the node's story. Scour the logs for forest-mainnet, forest-calibnet, and lotus-mainnet. Look beyond errors: unusual patterns, even if not errors, could indicate looming issues.
For the lotus-mainnet node, you might need to log into the node to gain more insight and confirm its synchronization status with the wider network.
- Begin by logging into the node.
- Once inside, execute the following command to enter the running container:
docker exec -it lotus-mainnet bash
- To verify the synchronization status, use:
lotus sync status
- Obtain a general node overview:
lotus info
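If you prefer not to open an interactive shell, the same two commands can be run from the host in one pass. The sketch below assumes the container is named lotus-mainnet, as in the steps above; the function name lotus_report and the LOTUS_CONTAINER override are hypothetical conveniences.

```shell
# lotus_report: run the guide's two lotus commands inside the container
# without an interactive session. LOTUS_CONTAINER is a hypothetical
# override for setups where the container has a different name.
lotus_report() {
    lr_container=${LOTUS_CONTAINER:-lotus-mainnet}
    echo "== lotus sync status =="
    docker exec "$lr_container" lotus sync status || return 1
    echo "== lotus info =="
    docker exec "$lr_container" lotus info || return 1
}
```

Calling `lotus_report` prints both outputs under labeled headers and returns a non-zero status if either command fails, which makes it easy to paste the result into the maintenance notes.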
- Verify that the snapshot generation process is free of errors. You can check the forest-notification channel for the status of recent uploads, and the logs for any errors.
- Inspect the latest snapshots for data consistency and completeness.
- Ensure there's sufficient storage space for upcoming snapshots.
- Execute the synchronization check script/tool.
- Confirm the synchronization status as reported in the forest-notification channel and document any discrepancies.
- Review the service logs on New Relic for any indications of unusual behavior or potential unauthorized access.
- Ensure that the firewall configurations on Digital Ocean are strictly set up to permit only essential traffic.
- Review health metrics and performance indicators across all nodes and services.
- Ensure services are operational, accessible, and delivering expected performance levels.
- Thoroughly document any discrepancies found during maintenance and flag necessary issues on the GitHub repository.
- Reflect upon the maintenance process. Identify any bottlenecks or challenges encountered.
- Explore avenues to refine, automate, or enhance the maintenance procedure.
- Periodically review and revise this guide to incorporate changes, augmentations, or new methodologies.
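The storage check in the list above is a natural candidate for automation. As a minimal sketch, assuming snapshots are written to a locally mounted path, the function below compares the available space reported by df against a threshold you choose; the name has_free_space and its arguments are illustrative, not existing tooling.

```shell
# has_free_space PATH MIN_FREE_KB
# Succeeds (exit 0) when the filesystem holding PATH has at least
# MIN_FREE_KB kilobytes available. The function name is an assumption.
has_free_space() {
    # df -P forces the portable single-line format; -k reports 1024-byte blocks.
    # Column 4 of the second line is the available space.
    hfs_avail_kb=$(df -Pk "$1" 2>/dev/null | awk 'NR == 2 { print $4 }')
    # An empty value means df failed (for example, the path does not exist).
    [ -n "$hfs_avail_kb" ] && [ "$hfs_avail_kb" -ge "$2" ]
}
```

For example, `has_free_space /path/to/snapshots 104857600 || echo "less than 100 GiB free"` would warn before the next snapshot runs out of room; both the path and the threshold here are placeholders to adapt.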
By regularly adhering to this guide, we cement our commitment to maintaining a stable, secure, and high-performance environment for all our services.