This is a web crawler that downloads files from https://file.wikileaks.org/file/.
Set the keywords you want to search for, and the program will download the files whose names include those keywords.
If you cannot connect to the website, you can also set a proxy using the SOCKS or HTTP protocol.
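The keyword matching described above can be sketched roughly as follows. This is only an illustration, not the code in Spider.py: the function names `matches_keywords` and `select_files` are hypothetical, and it assumes that keywords are separated by whitespace, that a file is selected when its name contains at least one keyword, and that matching ignores case.

```python
def matches_keywords(filename, keywords):
    """Return True if the file name contains at least one keyword.

    Assumption: matching is case-insensitive and any single keyword is enough.
    """
    name = filename.lower()
    return any(word.lower() in name for word in keywords)


def select_files(filenames, key_string):
    """Keep only the file names that match a keyword string such as "a b c"."""
    # Split the keyword string on whitespace into separate keywords.
    keywords = key_string.split()
    return [name for name in filenames if matches_keywords(name, keywords)]
```

If the spider should instead require every keyword to be present, replacing `any` with `all` would give that behaviour.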
- Python 3.5+
- Beautiful Soup
- Requests
pip install -r requirements.txt
usage: Spider.py [-h] [-K KEY] [-D DIR] [-P PROXY]
Wikileaks file spider
optional arguments:
-h, --help show this help message and exit
-K KEY, --key KEY keywords for spider
-D DIR, --dir DIR directory to save files
-P PROXY, --proxy PROXY
set the spider proxy
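For reference, a command-line interface with the options listed above could be declared with `argparse` along these lines. This is a minimal sketch reconstructed from the help text only; the function name `build_parser` is hypothetical, and the real Spider.py may set defaults or types that are not shown here.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the options shown in the usage text."""
    parser = argparse.ArgumentParser(
        prog="Spider.py", description="Wikileaks file spider"
    )
    parser.add_argument("-K", "--key", help="keywords for spider")
    parser.add_argument("-D", "--dir", help="directory to save files")
    parser.add_argument("-P", "--proxy", help="set the spider proxy")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.key, args.dir, args.proxy)
```

Options that are not given on the command line come back as `None` in this sketch, so the caller has to decide what to do when no keywords or directory are supplied.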
- Example
python Spider.py -K "a b c" -D ./files -P socks5h://
- If you want to use a proxy with the SOCKS5 protocol, please use a URL like this:
!!! Otherwise this may cause an SSL error.
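As a rough illustration of how a proxy URL reaches Requests, the sketch below builds the mapping that `requests.get` accepts through its `proxies` argument. The helper name `build_proxies` and the address `127.0.0.1:1080` are assumptions made for the example, not values taken from this project. In Requests, the `socks5://` scheme resolves host names on the client, while `socks5h://` leaves DNS resolution to the proxy server.

```python
def build_proxies(proxy_url):
    """Return a proxies mapping for Requests, or None when no proxy is set."""
    if not proxy_url:
        return None
    # Route both plain HTTP and HTTPS traffic through the same proxy URL.
    return {"http": proxy_url, "https": proxy_url}


# Usage sketch (SOCKS support needs `pip install requests[socks]`;
# the proxy address below is a placeholder):
#
#   import requests
#   proxies = build_proxies("socks5h://127.0.0.1:1080")
#   response = requests.get("https://file.wikileaks.org/file/", proxies=proxies)
```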