Gatherer is a simple crawler tool. It collects resource links and APIs from various kinds of content and then visits them.
Gatherer v0.1.0
Usage of ./gatherer:
-H value
HTTP request headers (eg. -H 'Header1:value' -H 'Header2:value')
-ch
Run Javascript in headless Chrome
-debug
Debug mode
-dep int
Maximum path depth (default 1)
-ef string
Filter by extensions (separated by commas)
-igq
Ignore the query portion on the URL from a[href]
-json
Log as JSON format
-lf string
Filter by response length (separated by commas)
-limit int
Maximum number of concurrent requests (default 100)
-nr
Disallow auto redirect
-proxy string
Proxy URL
-rod string
Set the default value of options used by rod.
-sf string
Filter by status codes (separated by commas)
-sub
Allow to visit sub-domains
-t int
Request timeout (second) (default 10)
-tt int
Total timeout (second)
-u string
Target URL
-ua
Use random User-Agent
-w string
Wordlist file path
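The `-H` flag above can be given more than once. As a minimal sketch of how a repeatable flag of this kind can be built with Go's standard `flag` package, the example below defines a type that satisfies `flag.Value`. The names `headerFlag` and `parseHeaders` are hypothetical and are not taken from Gatherer's source; the real implementation may differ.

```go
package main

import (
	"flag"
	"fmt"
	"net/http"
	"strings"
)

// headerFlag is a hypothetical type (not from Gatherer's source) that
// implements flag.Value so the same flag can be repeated on the command line.
type headerFlag struct {
	h http.Header
}

// String is required by flag.Value; it must tolerate a zero value.
func (f *headerFlag) String() string {
	if f == nil || f.h == nil {
		return ""
	}
	return fmt.Sprint(f.h)
}

// Set parses one "Name:value" pair and adds it to the header set.
// The split happens at the first colon, so values may contain colons.
func (f *headerFlag) Set(s string) error {
	name, value, ok := strings.Cut(s, ":")
	if !ok || strings.TrimSpace(name) == "" {
		return fmt.Errorf("invalid header %q, want Name:value", s)
	}
	if f.h == nil {
		f.h = http.Header{}
	}
	f.h.Add(strings.TrimSpace(name), strings.TrimSpace(value))
	return nil
}

// parseHeaders parses args with its own FlagSet and returns the headers
// collected from every -H occurrence.
func parseHeaders(args []string) (http.Header, error) {
	fs := flag.NewFlagSet("gatherer", flag.ContinueOnError)
	var hf headerFlag
	fs.Var(&hf, "H", "HTTP request headers")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	return hf.h, nil
}

func main() {
	h, err := parseHeaders([]string{"-H", "Header1:value", "-H", "Header2:value"})
	if err != nil {
		panic(err)
	}
	fmt.Println(h.Get("Header1"), h.Get("Header2"))
}
```

The `flag` package calls `Set` once per occurrence of the flag, which is why each `-H` adds to the set instead of replacing the previous value.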
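- Collect resource links from JS code
- Collect dynamically generated JS resource links from Webpack-bundled code
- Parse the full path, method, and parameters of each API from Swagger documents
- Collect resource links from robots.txt
- Collect resource links from XML sitemaps
- Execute JS to finish page rendering, for example for SPAs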
- colly
- hakrawler
- LinkFinder
- Packer-Fuzzer
- more...