Inactive users script #1019

Merged
merged 4 commits into from
Feb 18, 2025


catarial
Contributor

@catarial catarial commented Feb 7, 2025

Extension of #1010

A command-line Python script that identifies and optionally removes inactive users.

The amount of time that defines an active user can be specified with the -t flag.

The threshold time is compared to the timestamps of API calls and locations in the database.
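A minimal sketch of the comparison described above, not the code from this PR. It assumes each user has a last API call timestamp and a last location timestamp in epoch seconds, and that a user counts as inactive only when both are older than the threshold. The function name `find_inactive_users` and the dictionary layout are hypothetical.

```python
import time


def find_inactive_users(last_seen, threshold_secs, now=None):
    """Return the ids of users whose most recent activity is older than the threshold.

    last_seen maps a user id to a (last_api_call_ts, last_location_ts) tuple
    of epoch seconds; either entry may be None when no such record exists.
    This layout is an assumption made for illustration.
    """
    now = time.time() if now is None else now
    cutoff = now - threshold_secs
    inactive = []
    for user_id, timestamps in last_seen.items():
        known = [ts for ts in timestamps if ts is not None]
        # Assumption: a user with no recorded activity at all is inactive.
        if not known or max(known) < cutoff:
            inactive.append(user_id)
    return inactive
```

Passing `now` explicitly makes the check reproducible against an old database dump, where every timestamp is far in the past relative to the wall clock.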

I decided to pull out the part where it gets run on all deployments in favor of more configurability.

@shankari
Contributor

shankari commented Feb 8, 2025

I decided to pull out the part where it gets run on all deployments in favor of more configurability.

This needs to be restored. I am not going to sit and edit the DB_HOST variable 40 times and manually run the script 40 times.
I don't see why it is an either-or for configurability versus running on all deployments.

@shankari
Contributor

shankari commented Feb 8, 2025

Also, how did you test this? Can you please update "Testing done"?
Please also put this into the correct column in the project (PR for review by Shankari) once you are done

@catarial
Contributor Author

I decided to pull out the part where it gets run on all deployments in favor of more configurability.

This needs to be restored. I am not going to sit and edit the DB_HOST variable 40 times and manually run the script 40 times. I don't see why it is an either-or for configurability versus running on all deployments.

I misunderstood the _common.py file and didn't realize you could pass the arguments of the function to execute through with the * operator. I have changed it back now.
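For readers unfamiliar with the `*` operator mentioned here, the following is a rough sketch of the forwarding pattern. It is not the contents of `_common.py`: `run_on_each` and `describe` are hypothetical names, and the real helper presumably also switches database connections between deployments, which is omitted.

```python
def run_on_each(deployments, fn_to_run, *args):
    """Call fn_to_run(*args) once per deployment and collect the results."""
    results = {}
    for name in deployments:
        # *args in the signature collects the extra positional arguments into
        # a tuple; *args in the call unpacks them again, so fn_to_run receives
        # them exactly as the caller passed them.
        results[name] = fn_to_run(*args)
    return results


def describe(threshold_secs, purge):
    """Stand-in for the per-deployment work; just reports its arguments."""
    return f"threshold={threshold_secs}, purge={purge}"


if __name__ == "__main__":
    print(run_on_each(["stage"], describe, 86400, False))
```

With this shape the runner never needs to know how many parameters the wrapped function takes, so configurability and running on every deployment do not conflict.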

@catarial
Contributor Author

Testing was done with the private database dump "openpath-stage-snapshot-investigate-demographics".

Since this dump is over a year old, running the script with a threshold of one day marks all the users as inactive:

./e-mission-py.bash bin/historical/migrations/inactive.py -t 1                                 
Config file not found, returning a copy of the environment variables instead...
Retrieved config: {'DB_HOST': None, 'DB_RESULT_LIMIT': None}
URL not formatted, defaulting to "Stage_database"
Connecting to database URL localhost
PROD_LIST: ['stage']
About to run start_inactive(86400, False) on 1 deployments. Proceed? [y/n]
...
Of 76 users, found 76 inactive users:

Setting the threshold to 500 days gives us some active users:

./e-mission-py.bash bin/historical/migrations/inactive.py -t 500
...
Of 76 users, found 54 inactive users:

Let's remove these users:

./e-mission-py.bash bin/historical/migrations/inactive.py -t 500 -p
...
About to remove 54 users. Proceed? [y/n]
y

Now there are no more inactive users with a threshold of 500 days:

./e-mission-py.bash bin/historical/migrations/inactive.py -t 500
...
Of 22 users, found 0 inactive users:
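The log above shows `-t 1` turning into `start_inactive(86400, False)`, which suggests the threshold is given in days and converted to seconds, with `-p` switching on removal. A minimal sketch of argument parsing that behaves this way follows. The short flags come from the commands above; the long option names, the help text, and making `-t` required are assumptions, not the script's actual interface.

```python
import argparse

SECONDS_PER_DAY = 24 * 60 * 60


def parse_args(argv=None):
    """Return (threshold_secs, purge) parsed from the command line."""
    parser = argparse.ArgumentParser(
        description="Identify and optionally remove inactive users"
    )
    # Long option names below are hypothetical.
    parser.add_argument(
        "-t", "--threshold", type=int, required=True,
        help="days without activity after which a user counts as inactive",
    )
    parser.add_argument(
        "-p", "--purge", action="store_true",
        help="remove the inactive users instead of only listing them",
    )
    args = parser.parse_args(argv)
    return args.threshold * SECONDS_PER_DAY, args.purge
```

Under these assumptions, `parse_args(["-t", "1"])` yields the same `(86400, False)` pair that appears in the confirmation prompt.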

Contributor

@shankari shankari left a comment


Since we need to make the changes to read from the profile anyway, I think it would also be helpful to have the script output the results as a csv, one line per deployment. That would allow us to save the results and generate some pretty pictures as needed later.
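One possible shape for the requested output, sketched with the standard `csv` module. The column names and the `write_results_csv` helper are hypothetical; the sample row reuses the stage numbers from the testing log (76 users, 54 inactive at a 500-day threshold).

```python
import csv
import io

# Hypothetical column layout: one row per deployment.
FIELDNAMES = ["deployment", "threshold_days", "total_users", "inactive_users"]


def write_results_csv(rows, out):
    """Write one CSV line per deployment to the file-like object `out`."""
    writer = csv.DictWriter(out, fieldnames=FIELDNAMES)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


if __name__ == "__main__":
    buf = io.StringIO()
    write_results_csv(
        [{"deployment": "stage", "threshold_days": 500,
          "total_users": 76, "inactive_users": 54}],
        buf,
    )
    print(buf.getvalue())
```

Writing to a file-like object keeps the helper usable with both `open(path, "w", newline="")` and an in-memory buffer.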

@@ -26,17 +26,21 @@
 ]
 print(f"PROD_LIST: {PROD_LIST}")

-def run_on_all_deployments(fn_to_run):
+def run_on_all_deployments(fn_to_run, *args):
Contributor


nice! @JGreenlee for visibility

@shankari
Contributor

shankari commented Feb 18, 2025

@catarial I am merging this right now, but could you submit a separate PR that generates a CSV of the results for better downstream analysis and tracking?

@shankari shankari merged commit b636af1 into e-mission:master Feb 18, 2025
5 checks passed
Status: Tasks completed