Spyder Library (undergoing restructure; will be modular)
The Spyder Library is a portable, lightweight network crawler and parser. Spyder searches each page for specific HTML tags and records the URLs it finds for later processing. It can run as a service with the built-in Debug menu, or you can drop it into an app of your choice. It is flexible: it can respect the robots rules file (robots.txt), and it employs a throttling mechanism to keep crawling polite. Methods are optimized for multi-threaded speed and safety.
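To make the "polite crawling" idea concrete, here is a minimal sketch in Python of a robots check combined with a thread-safe throttle. It is an illustration only and does not use Spyder's own API: the class name `PoliteGate`, the `example-bot` user agent, and the `min_delay` parameter are all hypothetical. The robots rules are parsed with the standard library's `urllib.robotparser`.

```python
import threading
import time
from urllib.robotparser import RobotFileParser


class PoliteGate:
    """Hypothetical helper: checks robots rules and spaces out requests."""

    def __init__(self, robots_lines, user_agent="example-bot", min_delay=1.0,
                 clock=time.monotonic, sleep=time.sleep):
        # Parse robots rules that were already downloaded as a list of lines.
        self._parser = RobotFileParser()
        self._parser.parse(robots_lines)
        self._user_agent = user_agent
        self._min_delay = min_delay      # minimum seconds between requests
        self._clock = clock              # injectable so the delay can be tested
        self._sleep = sleep
        self._lock = threading.Lock()    # one thread at a time updates the timer
        self._last_request = None

    def allowed(self, url):
        """Return True if the robots rules permit fetching this URL."""
        return self._parser.can_fetch(self._user_agent, url)

    def wait_turn(self):
        """Block until at least min_delay has passed since the last request."""
        with self._lock:
            now = self._clock()
            if self._last_request is not None:
                remaining = self._min_delay - (now - self._last_request)
                if remaining > 0:
                    self._sleep(remaining)
                    now = self._clock()
            self._last_request = now
```

A crawler thread would call `allowed(url)` before queuing a URL and `wait_turn()` immediately before each request. Holding the lock while sleeping is a deliberate simplification: it serializes all threads behind one global delay, whereas a per-host delay would need one timer per host.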
- Setup
- Logging
- Control
- Set depth level of crawler
- Filter out URLs based on exclusion patterns
- Output captured URLs to a file
- Process URLs from an input file
- Search for HTML tags in pages
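The depth limit, exclusion patterns, and tag search listed above can be sketched in a few lines of Python. This is a hedged illustration of the general technique, not Spyder's implementation: `TagCollector`, `crawl`, and the `fetch` callable are hypothetical names, and the exclusion patterns are assumed to be regular expressions. Pages are supplied by `fetch`, so the sketch runs without network access.

```python
import re
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class TagCollector(HTMLParser):
    """Collects one attribute's value from every occurrence of a given tag."""

    def __init__(self, tag, attr):
        super().__init__()
        self.tag, self.attr = tag, attr
        self.values = []

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            for name, value in attrs:
                if name == self.attr and value:
                    self.values.append(value)


def crawl(start_url, fetch, max_depth, exclude_patterns=()):
    """Breadth-first crawl that stops following links at max_depth.

    fetch is any callable that takes a URL and returns the page HTML.
    """
    excluded = [re.compile(p) for p in exclude_patterns]
    seen = {start_url}
    queue = deque([(start_url, 0)])
    captured = []
    while queue:
        url, depth = queue.popleft()
        captured.append(url)
        if depth >= max_depth:
            continue  # record the URL but do not expand its links
        collector = TagCollector("a", "href")
        collector.feed(fetch(url))
        for href in collector.values:
            link = urljoin(url, href)  # resolve relative links
            if link in seen or any(p.search(link) for p in excluded):
                continue
            seen.add(link)
            queue.append((link, depth + 1))
    return captured
```

The returned list can be written to a file one URL per line, and the same file can later be read back to seed another run. Searching for a different tag only requires a different `TagCollector`, for example `TagCollector("img", "src")`.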
Built-in extension methods are included to aid rapid deployment.
Spyder contains Debug and text file loggers. Each logger has a rich set of options. The loggers support custom formatters that make log entries easier to find. The text file logger can write to a single log file, or each module can produce its own log file.
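As a rough illustration of the single-file versus per-module choice, the sketch below uses Python's standard `logging` module. It is not Spyder's logger API: `build_logger`, the `spyder_example` logger namespace, the file names, and the format string are assumptions made for the example.

```python
import logging
import os
import tempfile


def build_logger(module_name, log_dir, single_file=True):
    """Return a logger that writes to a shared file or to a per-module file."""
    file_name = "spyder.log" if single_file else f"{module_name}.log"
    logger = logging.getLogger(f"spyder_example.{module_name}")
    logger.setLevel(logging.DEBUG)
    logger.propagate = False  # keep entries out of the root logger
    if not logger.handlers:   # avoid duplicate handlers on repeated calls
        handler = logging.FileHandler(
            os.path.join(log_dir, file_name), encoding="utf-8")
        # Custom format: the level and module name make entries easy to find.
        handler.setFormatter(logging.Formatter(
            "%(asctime)s [%(levelname)s] %(name)s: %(message)s"))
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log_dir = tempfile.mkdtemp()
    crawler_log = build_logger("crawler", log_dir, single_file=False)
    crawler_log.info("fetched %s", "https://example.com/")
    print("log written to", os.path.join(log_dir, "crawler.log"))
```

With `single_file=True` every module appends to the same file and the `%(name)s` field tells the entries apart; with `single_file=False` each module gets its own file, which is easier to read when modules log heavily.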