Adding comment on papermill_launcher.ipynb
aderrien7 committed Aug 30, 2024
1 parent 43b7a0a commit 28c978b
Showing 1 changed file with 21 additions and 7 deletions.
28 changes: 21 additions & 7 deletions docs/papermill_launcher.ipynb
@@ -186,10 +186,24 @@
"source": [
"___\n",
"### Explanation of the code below\n",
"- nbs is a list of the notebooks that have been processed, whether they failed or not.\n",
"- The code loops over the tag ids in tag_list and calculates the time difference between the tagging events.\n",
"- If the fish has been observed for more than 2 days and has not been processed yet, a parametrized notebook is run.\n",
"- If it succeeds, the generated notebook is placed in papermill_output/done; otherwise, it goes to papermill_output/failed"
"- nbs is a list of the notebooks that have been processed, whether they failed or not. This list is used to keep track of the tags that have already been generated.\n",
"The two conditions (the tag is not yet in nbs, and the observation period is longer than two days) are used in the following way.\n",
"```\n",
"observation_length = (recapture_date - release_date) / np.timedelta64(1, \"D\")\n",
"\n",
"if (\n",
"    (tag_name not in nbs) and observation_length > 2\n",
"):  # Use this statement to resume an interrupted computation whose existing results are still valid\n",
"    # if observation_length > 2:  # Use this if you want every tag in tag_list to be processed\n",
"```\n",
"First, it means that if the fish, based on the tagging events, has an observation period of two days or less, its trajectory will not be computed.\n",
"Second, it means that you can choose to regenerate every tag you have already generated, for example if you noticed an issue in the results.\n",
"Alternatively, if the generation was interrupted but the existing results are valid, you can resume the computation where it stopped.\n",
"\n",
"- The code loops over the tag ids in tag_list and calculates the time difference between the tagging events.\n",
"- If it succeeds, the generated notebook is placed in papermill_output/done; otherwise, it goes to papermill_output/failed\n",
" "
]
},
{
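The filter described in the markdown cell above can be sketched in isolation. This is a minimal, stdlib-only illustration, not the notebook's code: the helper names `observation_length_days` and `should_process` and the `resume` flag are hypothetical, and the launcher itself works with `np.datetime64` / `np.timedelta64` rather than `datetime`. Only the timestamp format string is taken from the launcher cell.

```python
from datetime import datetime, timedelta

# Same format string as the one passed to datetime.strptime in the launcher cell.
TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def observation_length_days(release: str, recapture: str) -> float:
    """Return the time between the two tagging events, in fractional days."""
    release_date = datetime.strptime(release, TIME_FORMAT)
    recapture_date = datetime.strptime(recapture, TIME_FORMAT)
    # Dividing two timedeltas yields a float, like dividing by np.timedelta64(1, "D").
    return (recapture_date - release_date) / timedelta(days=1)


def should_process(tag_name, observation_length, nbs, resume=True):
    """Hypothetical helper mirroring the two variants of the `if` statement.

    resume=True  -> skip tags already listed in nbs (resume an interrupted run)
    resume=False -> process every tag with a long enough observation period
    """
    if observation_length <= 2:
        return False
    return (tag_name not in nbs) if resume else True


days = observation_length_days("2024-01-01T00:00:00Z", "2024-01-04T12:00:00Z")
print(days, should_process("A2", days, nbs=["A1"]))
```

Switching `resume` plays the role of swapping the active `if` line for the commented-out one in the launcher.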
@@ -213,16 +227,16 @@
"for tag_name in tqdm(tag_list, desc=\"Processing tags\"):\n",
"    try:\n",
"        te = pd.read_csv(s3.open(f\"{cloud_root}/cleaned/{tag_name}/tagging_events.csv\"))\n",
"        np_datetime1 = np.datetime64(\n",
"        release_date = np.datetime64(\n",
"            datetime.strptime(te[\"time\"][0], \"%Y-%m-%dT%H:%M:%SZ\")\n",
"        )\n",
"        np_datetime2 = np.datetime64(\n",
"        recapture_date = np.datetime64(\n",
"            datetime.strptime(te[\"time\"][1], \"%Y-%m-%dT%H:%M:%SZ\")\n",
"        )\n",
"        time_difference = (np_datetime2 - np_datetime1) / np.timedelta64(1, \"D\")\n",
"        observation_length = (recapture_date - release_date) / np.timedelta64(1, \"D\")\n",
"\n",
"        if (\n",
"            (tag_name not in nbs) and time_difference > 2\n",
"            (tag_name not in nbs) and observation_length > 2\n",
"        ):  # Use this statement to resume an interrupted computation whose existing results are still valid\n",
"            # if observation_length > 2:  # Use this if you want every tag in tag_list to be processed\n",
"\n",

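The last bullet of the explanation (successful notebooks go to papermill_output/done, failed ones to papermill_output/failed) is handled in a part of the cell that this diff does not show. The sketch below is only one way such routing could look, under stated assumptions: `run_and_sort` is a hypothetical helper, the `{tag_name}.ipynb` file name is assumed, and `run_notebook` stands in for whatever call actually executes the parametrized notebook.

```python
import shutil
from pathlib import Path


def run_and_sort(tag_name, run_notebook, output_root="papermill_output"):
    """Run one notebook and file its output under done/ or failed/.

    run_notebook(tag_name, output_path) is an injected callable (an assumption);
    it is expected to write the executed notebook to output_path and to raise
    an exception if the execution fails.
    """
    root = Path(output_root)
    done, failed = root / "done", root / "failed"
    done.mkdir(parents=True, exist_ok=True)
    failed.mkdir(parents=True, exist_ok=True)

    work_path = root / f"{tag_name}.ipynb"  # assumed naming scheme
    try:
        run_notebook(tag_name, work_path)
    except Exception:
        target = failed  # keep the partial notebook for inspection
    else:
        target = done

    # Move whatever was written, if anything, into the chosen folder.
    if work_path.exists():
        shutil.move(str(work_path), str(target / work_path.name))
    return target.name
```

Keeping failed outputs in their own folder means a list such as nbs can include them, so a resumed run does not retry them unless asked to.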