born2crawl-web

born2crawl-web provides a set of concrete implementations of the InputProcessor that can be used for web crawling.

Add the dependency, replacing x.y.z with the latest version:

dependencies {
    implementation("com.arthurivanets:born2crawl-web:x.y.z")
}

born2crawl-web relies on the following external libraries:

OkHttp - HTTP client for the JVM, Android, and GraalVM.
jsoup - the Java HTML parser, built for HTML editing, cleaning, scraping, and XSS safety.

WebPageCrawler - a concrete implementation of the InputProcessor that crawls web pages.
FileDownloader - a concrete implementation of the InputProcessor that downloads files by URL.
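
To illustrate the kind of work a page-crawling step involves, here is a minimal sketch of link extraction: given a page URL and its HTML, it collects the absolute http(s) links a crawler could visit next. This is not the WebPageCrawler API and does not use born2crawl classes. The class name LinkExtractionSketch and the method extractLinks are hypothetical, and the sketch uses only the JDK with a naive regular expression, whereas the library itself depends on jsoup for HTML parsing.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of born2crawl-web.
final class LinkExtractionSketch {

    // Naive matcher for the href of <a> tags. It captures the value up to any
    // '#' fragment. A real HTML parser (such as jsoup) copes with malformed
    // markup far better than a regular expression.
    private static final Pattern HREF = Pattern.compile(
        "<a\\s[^>]*?href\\s*=\\s*[\"']([^\"'#]+)[^\"']*[\"']",
        Pattern.CASE_INSENSITIVE
    );

    // Returns the distinct absolute http(s) links found in the HTML,
    // in order of first appearance, resolved against the page URL.
    static List<String> extractLinks(String pageUrl, String html) {
        URI base = URI.create(pageUrl);
        Set<String> links = new LinkedHashSet<>();
        Matcher matcher = HREF.matcher(html);

        while (matcher.find()) {
            String href = matcher.group(1).trim();
            try {
                URI resolved = base.resolve(href);
                String scheme = resolved.getScheme();
                // Keep only web links; skip mailto:, javascript:, and similar.
                if ("http".equals(scheme) || "https".equals(scheme)) {
                    links.add(resolved.toString());
                }
            } catch (IllegalArgumentException ignored) {
                // The href is not a syntactically valid URI; skip it.
            }
        }
        return new ArrayList<>(links);
    }

    public static void main(String[] args) {
        String html = "<a href=\"/docs/intro.html\">Intro</a>"
            + "<a href=\"next.html#top\">Next</a>"
            + "<a href=\"mailto:someone@example.com\">Mail</a>";

        for (String link : extractLinks("https://example.com/blog/post.html", html)) {
            System.out.println(link);
        }
    }
}
```

In a crawler, each returned link would typically become a new input for the next crawling round, while links that point to files could be handed to a download step instead.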