-
Notifications
You must be signed in to change notification settings - Fork 1
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
1 parent
3b6baa8
commit 6940737
Showing
1 changed file
with
245 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,245 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "p9WsWJ7Y1eVV" | ||
}, | ||
"source": [ | ||
"# Plastic Classifier: Getting started\n", | ||
"\n", | ||
"[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/surfriderfoundationeurope/IA_Pau/blob/master/Hackaton_Surfrider_Getting_Started.ipynb)\n", | ||
"\n", | ||
"\n", | ||
"This helper notebook is designed for surfrider Hackaton. The goal is to build a plastic classifier, as the core detector / tracker is already built (but only works for generic plastic). This notebook is designed to help you quickstart, but you may as well follow instructions directly from the [main github](https://github.com/surfriderfoundationeurope/surfnet/tree/further_research).\n", | ||
"\n", | ||
"If you want fast training, make sure you have a good GPU: check using the command `!nvidia-smi`" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"source": [ | ||
"!git clone https://github.com/surfriderfoundationeurope/surfnet.git -b further_research\n", | ||
"%cd surfnet" | ||
], | ||
"metadata": { | ||
"id": "FzgreR54ouH9" | ||
}, | ||
"execution_count": null, | ||
"outputs": [] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "9UZZrhzb3Rwp" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"pip install -r requirements.txt" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "iV_nmtVZ1y4f" | ||
}, | ||
"source": [ | ||
"### Getting the data\n", | ||
"\n", | ||
"To get the images, `azcopy` and the right token are needed. The following cells enable you to do so (Downloads 5Go of data). The token here enables to access the data until Monday 31st of January 2022. Then annotations in json format are also downloaded.\n", | ||
"\n", | ||
"It may be useful to mount a Drive if you plan to stick to Colab." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "y1yBlicl09l7" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"!wget https://aka.ms/downloadazcopy-v10-linux\n", | ||
"!tar -xvf downloadazcopy-v10-linux" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"source": [ | ||
"!azcopy_linux_amd64_10.13.0/azcopy copy --recursive 'https://dataplasticoprod.blob.core.windows.net/images2label?sp=rl&st=2022-01-24T10:34:35Z&se=2022-01-31T18:34:35Z&spr=https&sv=2020-08-04&sr=c&sig=%2FHn2D3IvAECUJ0QqPpf0Jewo7GuNaIVYf23BjVjAd3Q%3D' './'" | ||
], | ||
"metadata": { | ||
"id": "e00U5YJnyTRp" | ||
}, | ||
"execution_count": null, | ||
"outputs": [] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "7C7H_XkMJAhO" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"!mkdir -p data/images\n", | ||
"!mv images2label data/images/images" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "r0G2Uop_JCdc" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"!mkdir -p data/images/annotations\n", | ||
"!wget https://github.com/surfriderfoundationeurope/surfnet/releases/download/v01.2022/instances_train.json -P data/images/annotations/\n", | ||
"!wget https://github.com/surfriderfoundationeurope/surfnet/releases/download/v01.2022/instances_val.json -P data/images/annotations/" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": { | ||
"id": "Tn0mpK_sNVLj" | ||
}, | ||
"source": [ | ||
"## Analyse the dataset\n", | ||
"\n", | ||
"the next following cells enable you to get a bit of information about the dataset" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "1je1TAgm5eks" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"import sys\n", | ||
"sys.path.insert(0,'/content/surfnet/src/')" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "hjrF1oan9oVv" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"import json \n", | ||
"from pycocotools.coco import COCO\n", | ||
"import matplotlib.pyplot as plt\n", | ||
"\n", | ||
"\n", | ||
"coco = COCO(annotation_file = './data/images/annotations/instances_train.json')\n", | ||
"\n", | ||
"coco_categories = coco.dataset['categories'][1:]\n", | ||
"\n", | ||
"nb_anns_per_cat = {cat['name']: len(coco.getAnnIds(catIds=[cat['id']])) for cat in coco_categories}\n", | ||
"nb_anns_per_cat = {k:v for k,v in sorted(nb_anns_per_cat.items(), key=lambda x: x[1], reverse=True)}\n", | ||
"cat_names = list(nb_anns_per_cat.keys())\n", | ||
"nb_images = list(nb_anns_per_cat.values())\n", | ||
"\n", | ||
"plt.bar(x = cat_names, height = nb_images)\n", | ||
"plt.xticks(range(len(cat_names)), cat_names, rotation='vertical')\n", | ||
"plt.ylabel('Number of annotations')\n", | ||
"plt.tight_layout()\n", | ||
"plt.autoscale(True)\n", | ||
"plt.savefig('dataset_analysis')" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"source": [ | ||
"import os\n", | ||
"from detection.coco_utils import CocoDetectionWithExif, ConvertCocoPolysToBboxes\n", | ||
"\n", | ||
"def get_dataset(root, image_set):\n", | ||
" PATHS = {\n", | ||
" \"train\": (\"images\", os.path.join(\"annotations\", \"instances_train.json\")),\n", | ||
" \"val\": (\"images\", os.path.join(\"annotations\", \"instances_val.json\")),\n", | ||
" }\n", | ||
"\n", | ||
" img_folder, ann_file = PATHS[image_set]\n", | ||
" img_folder = os.path.join(root, img_folder)\n", | ||
" ann_file = os.path.join(root, ann_file)\n", | ||
"\n", | ||
" dataset = CocoDetectionWithExif(img_folder, ann_file, transforms=ConvertCocoPolysToBboxes())\n", | ||
"\n", | ||
" return dataset" | ||
], | ||
"metadata": { | ||
"id": "ndzdCs7C2rLP" | ||
}, | ||
"execution_count": null, | ||
"outputs": [] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": { | ||
"id": "FzWI1i4L-YMD" | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"from detection.coco_utils import get_surfrider\n", | ||
"from detection import transforms\n", | ||
"\n", | ||
"base_size = 540\n", | ||
"crop_size = (544, 960)\n", | ||
"downsampling_factor = 4\n", | ||
"num_classes = 10\n", | ||
"path = '/content/surfnet/data/images/'\n", | ||
"\n", | ||
"# Building a train & test dataset\n", | ||
"train_dataset = get_dataset(path, \"train\")\n", | ||
"val_dataset = get_dataset(path, \"val\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"source": [ | ||
"Let us display a full size picture, and corresponding bounding box" | ||
], | ||
"metadata": { | ||
"id": "MKaeAs534eH5" | ||
} | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"source": [ | ||
"import matplotlib.pyplot as plt\n", | ||
"x, y = next(iter(train_dataset))\n", | ||
"print(x.shape, y)\n", | ||
"plt.imshow(x)" | ||
], | ||
"metadata": { | ||
"id": "zQP9rfoB3zdh" | ||
}, | ||
"execution_count": null, | ||
"outputs": [] | ||
} | ||
], | ||
"metadata": { | ||
"accelerator": "GPU", | ||
"colab": { | ||
"collapsed_sections": [], | ||
"name": "Hackaton Surfrider Getting Started", | ||
"provenance": [] | ||
}, | ||
"kernelspec": { | ||
"display_name": "Python 3", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"name": "python" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 0 | ||
} |