Update documentation around migrating data #6214
Changes from 2 commits
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,47 +1,63 @@ | ||
How To Rebuild a CommCare HQ environment | ||
======================================== | ||
|
||
This step deletes all of the CommCare data from your environment and resets to as if it's a new environment. | ||
In practice, you will likely need this only to delete test environments and not production data. Please understand fully | ||
before you proceed to perform this as it will permanently delete all of your data. | ||
These steps delete *all* CommCare data in your environment. | ||
|
||
In practice, you will likely *only* need this to delete test environments. We strongly discourage using any of | ||
these steps on production data. Please fully understand this before proceeding as this will permanently | ||
delete all of your data. | ||
|
||
How To Wipe Persistent Data | ||
--------------------------- | ||
Prior to Wiping Data | ||
-------------------- | ||
|
||
#. Ensure CommCare services are in a healthy state. If you observe any issues, see the Troubleshooting section below. | ||
|
||
.. code-block:: | ||
|
||
This step deletes all of the persistent data in BlobDB, Postgres, Couch and Elasticsearch. Note that this works only | ||
in the sequence given below, so you shouldn't proceed to next steps until the prior steps are successful. | ||
$ cchq <env_name> django-manage check_services | ||
|
||
|
||
#. Wipe BlobDB, ES, Couch using management commands. | ||
#. Deploy CommCare from a specific revision | ||
|
||
.. code-block:: | ||
|
||
$ cchq <env_name> django-manage wipe_blobdb --commit | ||
$ cchq <env_name> django-manage wipe_es --commit | ||
$ cchq <env_name> django-manage delete_couch_dbs --commit | ||
$ cchq <env_name> deploy commcare --commcare-rev=<commit-hash> | ||
|
||
#. Add "wipe_environment_enabled: True" to `public.yml` file. | ||
.. note:: | ||
This is especially important if you are performing a migration of your data to a new instance. You should have | ||
Isn't this only needed if performing a migration.
Yeah this is true. I'll make that distinction 👍🏻
||
been given a commit hash that matches the revision of CommCare used to generate your exported data, and it is | ||
critical that this same CommCare revision is used to rebuild the new environment, and load data in. | ||
|
||
#. Stop CommCare | ||
#. Stop CommCare services to prevent background processes from writing to databases. | ||
👍
||
|
||
.. code-block:: | ||
|
||
$ cchq <env_name> service commcare stop | ||
$ cchq <env_name> downtime start | ||
(What's the difference between
Good question. This may not actually be necessary to change. Both commands operate on COMMCARE_INVENTORY_GROUPS, and stop those services using supervisorctl. The
the
Yeah on second thought, I like downtime for that reason in particular, and updated the instructions to select the kill option when prompted in b7b46e6
||
# Choose option to kill any running processes when prompted | ||
|
||
How To Wipe Persistent Data | ||
--------------------------- | ||
|
||
These steps are intended to be run in the sequence given below, so you shouldn't proceed to the next step until | ||
the prior step is completed. | ||
|
||
|
||
#. Add "wipe_environment_enabled: True" to `public.yml` file. | ||
|
||
#. Reset PostgreSQL and PgBouncer | ||
#. Wipe BlobDB, Elasticsearch, and Couch using management commands. | ||
Good to check no services are running before doing this. This could happen if
|
||
|
||
.. code-block:: | ||
|
||
$ cchq <env_name> ap deploy_postgres.yml | ||
$ cchq <env_name> django-manage wipe_blobdb --commit | ||
$ cchq <env_name> django-manage wipe_es --commit | ||
$ cchq <env_name> django-manage delete_couch_dbs --commit | ||
|
||
#. Wipe PostgreSQL data | ||
|
||
Check status. Once status is "OK", wipe PostgreSQL data | ||
#. Wipe PostgreSQL data (restart first to kill any existing connections) | ||
|
||
.. code-block:: | ||
|
||
$ cchq <env_name> service postgresql status | ||
$ cchq <env_name> service postgresql restart | ||
$ cchq <env_name> ap wipe_postgres.yml | ||
|
||
#. Clear the Redis cache data | ||
|
@@ -56,41 +72,37 @@ in the sequence given below, so you shouldn't proceed to next steps until the pr | |
|
||
$ cchq <env_name> ap wipe_kafka.yml | ||
|
||
#. Remove the "wipe_environment_enabled: True" line in your `public.yml` file. | ||
is this a new option that has been added? "wipe_environment_enabled"?
No it is used in ansible playbooks that wipe data as an extra safety feature to prevent people from accidentally running those playbooks. I just moved it to "wrap" all of the steps in the wiping data section even though it doesn't apply to django management commands because I thought that made more logical sense.
This option was added when we added the Ansible tasks for nuking environments' data, to ensure that the people with their fingers on the button know what they're doing. Edit: I should have refreshed my browser before responding. I see gherceg replied hours ago. ... I'll leave this here though, cos the meme is funny. 😏
||
|
||
You can check they have been removed by confirming that the following shows | ||
no output: | ||
|
||
**Note**\ : Use below command when the ``kafka version is < 3.x``. The ``--zookeeper`` argument is removed from 3.x. | ||
|
||
.. code-block:: | ||
|
||
$ kafka-topics.sh --zookeeper localhost:2181 --list | ||
|
||
**Note**\ : Use below command when the ``kafka version is >= 3.x``. | ||
|
||
.. code-block:: | ||
|
||
$ kafka-topics.sh --bootstrap-server localhost:9092 --list | ||
we don't need this anymore?
My interpretation was that we never needed this but it was just a nice additional check. However given that we don't do any similar checks for the other steps that are run, I see no reason why we should treat kafka any differently and had no issues when running
||
|
||
Rebuilding environment | ||
---------------------- | ||
|
||
|
||
#. Remove the "wipe_environment_enabled: True" line in your `public.yml` file. | ||
|
||
#. Run Ansible playbook to recreate databases. | ||
#. Recreate all databases | ||
|
||
.. code-block:: | ||
|
||
$ cchq <env_name> ap deploy_db.yml --skip-check | ||
|
||
Run initial migration | ||
#. Run migrations for fresh install | ||
|
||
.. code-block:: | ||
|
||
$ cchq <env_name> ap migrate_on_fresh_install.yml -e CCHQ_IS_FRESH_INSTALL=1 | ||
|
||
#. Run a code deploy to create Kafka topics and Elasticsearch indices. | ||
#. Create kafka topics | ||
|
||
.. code-block:: | ||
|
||
$ cchq <env_name> django-manage create_kafka_topics | ||
|
||
.. note:: | ||
|
||
If you are migrating a project to a new environment, you can return to the steps outlined in | ||
`Import the data to the new environment <installation/migration/1-migrating-project.html#import-the-data-to-the-new-environment>`_. | ||
Otherwise, you can continue with the following steps. | ||
|
||
#. Run a code deploy to start CommCare back up. | ||
|
||
.. code-block:: | ||
|
||
|
@@ -104,3 +116,17 @@ Rebuilding environment | |
.. code-block:: | ||
|
||
$ cchq <env_name> django-manage make_superuser [email protected] | ||
|
||
Troubleshooting | ||
--------------- | ||
|
||
Issues with check_services | ||
~~~~~~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
* Kafka: No Brokers Available - Try resetting Zookeeper by performing the following steps: | ||
|
||
.. code-block:: | ||
|
||
$ cchq monolith service kafka stop | ||
$ rm -rf /var/lib/zookeeper/* | ||
$ cchq monolith service kafka restart |
Maybe tell the reader that Celery needs to be stopped before running `load_domain_data`.
Totally up for continuing to make it clear that celery needs to be stopped, but it's tricky since it isn't as simple as stopping celery prior to running `load_domain_data`. If celery was running at any point after resetting postgres, it is likely the case that the db has been "corrupted" and needs to be reset again.
That is tricky. So a deploy will restart Celery, but we need to run a deploy after the environment is reset. Hmmm.
Well I think the framing should be we need to run a deploy before resetting the environment (I think I made that change here too), since the "current" release also impacts what state the database is migrated to when rebuilding the environment.
This is a critical point, so may be good to rephrase to share why:
Ensure you are running the following steps from the same version of CommCareHQ as used to create the data dump being used to import data into your environment. Request the CommCareHQ version/commit hash, if not shared.
Additionally, I think we need to do this during the initial setup. CommCareHQ is already deployed by this point and by default we deploy the latest code. So, at this point we are actually asking them to revert to an older version, which isn't feasible due to migrations. Should we add a note about this when the setup is happening?
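The ordering constraint in the "How To Wipe Persistent Data" section (do not proceed until the prior step succeeds) could be sketched as a small shell wrapper. This is only an illustration, not part of the PR: the `run_step` and `wipe_persistent_data` helpers and the `DRY_RUN` variable are hypothetical names, and the Redis step is left out because its command is not shown in this diff. The `cchq` invocations are copied from the diff as-is. The sketch assumes `wipe_environment_enabled: True` is already set in `public.yml` and that CommCare services (including Celery) are stopped.

```shell
#!/bin/sh
# Sketch only: run the wipe commands from the diff in order and stop at the
# first failure. Helper names and DRY_RUN are assumptions for illustration.

# run_step: print the command, then execute it unless DRY_RUN is 1.
# DRY_RUN defaults to 1 so nothing destructive runs by accident.
run_step() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        return 0
    fi
    "$@"
}

# wipe_persistent_data: chain the steps with && so a failing step
# prevents every later step from running.
wipe_persistent_data() {
    env_name="$1"
    run_step cchq "$env_name" django-manage wipe_blobdb --commit &&
    run_step cchq "$env_name" django-manage wipe_es --commit &&
    run_step cchq "$env_name" django-manage delete_couch_dbs --commit &&
    run_step cchq "$env_name" service postgresql restart &&
    run_step cchq "$env_name" ap wipe_postgres.yml &&
    run_step cchq "$env_name" ap wipe_kafka.yml
}
```

With the default `DRY_RUN=1`, calling `wipe_persistent_data <env_name>` only prints the commands, which makes it possible to review the sequence before setting `DRY_RUN=0`.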