There are many websites that allow us to monitor and track vessels online. Most of them are commercial products, but they provide very useful information for free, so tracking vessels online is accessible to everyone. Monitoring can be time-consuming, though, especially during long-term surveillance of several ships...
For example, at OpenFacto we published a report on embargo violations in Libya in June 2020, and we tracked several vessels night and day for that purpose...
We think Python can be very useful for journalists who want to collect data online automatically, even for beginners in programming.
This repository contains two Jupyter notebooks that explain how to scrape data from a website: first by writing a simple script that does the job for one vessel (Vessel1), then by turning that script into a function so it can be called in a simple loop (a minimal sketch of this pattern follows the library list below).
We use the following extra libraries for Python 3:
- Pandas
- BeautifulSoup4
- Requests
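As an illustration of the pattern described above, here is a minimal sketch, not the actual notebook code: a single-vessel scraping function built with Requests and BeautifulSoup4, then called in a simple loop and collected into a Pandas DataFrame. The URL, the CSS selectors, and the IMO numbers are hypothetical placeholders to adapt to the site you are actually scraping.

```python
# Minimal sketch of the one-vessel-script -> function -> loop pattern.
# The URL, CSS selectors and IMO numbers below are placeholders, not real values.
import requests
import pandas as pd
from bs4 import BeautifulSoup


def scrape_vessel(imo_number):
    """Fetch one vessel page and return its position data as a dict."""
    url = f"https://example-vessel-tracker.com/vessel/{imo_number}"  # placeholder URL
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")
    # Hypothetical selectors: inspect the real page to find the right ones.
    return {
        "imo": imo_number,
        "latitude": soup.select_one(".latitude").get_text(strip=True),
        "longitude": soup.select_one(".longitude").get_text(strip=True),
    }


# The same function in a simple loop over several vessels.
vessels = ["1234567", "7654321"]  # placeholder IMO numbers
records = [scrape_vessel(imo) for imo in vessels]
df = pd.DataFrame(records)
print(df)
```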
As we want the code to remain simple, yet efficient and user-friendly, we also use a Google Spreadsheet, but any other online spreadsheet with CSV output can be used.
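For reference, here is a minimal sketch of reading such a spreadsheet from Python with Pandas. It assumes the sheet has been published to the web with CSV output; the sheet ID in the URL is a placeholder, and any other online spreadsheet exposing a CSV export works the same way.

```python
# Minimal sketch: load a Google Spreadsheet published to the web as CSV.
# The sheet ID in the URL is a placeholder; replace it with your own.
import pandas as pd

CSV_URL = "https://docs.google.com/spreadsheets/d/e/YOUR_SHEET_ID/pub?output=csv"
vessels_df = pd.read_csv(CSV_URL)
print(vessels_df.head())
```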
Of course, Jupyter Notebook must be installed on your laptop.
That said, if you want to start coding efficiently, we recommend using a real IDE such as VSCodium.
We want to thank all the crew at DataHarvest for their commitment, their help, and their trust. Special thanks to Adriana Homolova, our moderator on this session!
This session and the code are inspired by BearHunt's brilliant work here.
Huge THANK YOU to @Benjnr for all his good vibes, patience, fantastic skills in Python, and for mentoring this tool! Peace man.