This repository has been archived by the owner on Apr 13, 2021. It is now read-only.
WebMole/crawler-benchmark


Crawler Benchmark


A Reference Framework for the Automated Exploration of Web Applications. Provides some general web features to let you test crawlers in a well defined environment.

Usage

  1. Clone the repository

  2. Change into the repository directory

  3. Install Docker

  4. Install docker-compose

  5. Build and run the Docker image with docker-compose:

    cd crawler-benchmark
    cp .env.example .env # then edit with desired credentials
    docker-compose up -d

Once the containers are up, the app is available at http://localhost:8080
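Because `docker-compose up -d` returns before the app necessarily accepts connections, a script that starts a crawler right afterwards may want to wait for the port first. The following is a minimal sketch of such a readiness check, not part of the project: the function name `wait_for_app` and its `timeout` and `interval` parameters are hypothetical, and only the address `localhost:8080` comes from the instructions above.

```python
import http.client
import time
import urllib.error
import urllib.request


def wait_for_app(url="http://localhost:8080", timeout=30.0, interval=1.0):
    """Poll `url` until it answers over HTTP or `timeout` seconds elapse.

    Hypothetical helper: returns True as soon as any HTTP response
    arrives, False if nothing answered before the deadline.
    """
    # The target is a local container, so ignore any configured proxies.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    deadline = time.monotonic() + timeout
    while True:
        try:
            with opener.open(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server replied, even if with an error status: it is up.
            return True
        except (OSError, http.client.HTTPException):
            # Connection refused, reset, or timed out: not ready yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling `wait_for_app()` with no arguments polls the default address for up to 30 seconds; a crawler run could be gated on the returned boolean.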

Development

Run tests locally

    docker-compose run --rm website bash -c 'pytest --cov --cov-report term:skip-covered'

CSS editing

We use Grunt to compile SCSS files into CSS automatically, and more tasks may be added in the future. npm dependencies are specified in package.json.

Install Sass from the command line (you may need sudo privileges), then install the npm dependencies and run Grunt:

    gem install sass
    npm install
    npm run grunt

Todos

  • Build the frontend with webpack and load pure.scss from node_modules
  • Publish the Docker image so the world can spin it up
  • Add Node.js Docker support
  • Add link to home page (from title)
  • Add new features!
    • Robots.txt validation
    • Visited URLs
    • Provide an API
  • Website navigation generation from model
  • Improve settings
    • Import
    • Export
    • JSON? YAML?
  • Spread the word, make the application known by crawler authors
  • Put online
    • Get crawled by general-purpose crawlers such as Googlebot
    • Share results to the public
