docs: add details to README, LICENSE
lazd committed Mar 14, 2020
1 parent c222718 commit bf88d4f
Showing 2 changed files with 52 additions and 3 deletions.
9 changes: 9 additions & 0 deletions LICENSE
@@ -0,0 +1,9 @@
Copyright (c) 2020, Lawrence Davis
All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
46 changes: 43 additions & 3 deletions README.md
@@ -1,10 +1,50 @@
# corona-scraper
# coronadatascraper
> A scraper that pulls coronavirus case data from verified sources.
Scrape case data from goverment websites.
## Running the scraper

## Usage
Before following these instructions, install [yarn](https://classic.yarnpkg.com/en/docs/install/).

```
yarn install
yarn start
```

## Contributing

Contributions for any place in the world are welcome. Write clean and clear code, and please follow the criteria for sources below.

Send a pull request with your scraper. Before you do, run the scraper using the instructions above to make sure the data is valid.

It's a tough challenge to write scrapers that will work when websites are inevitably updated. Here are some tips:

* Write your scraper so it handles aggregate data with a single scraper entry (e.g. find a table, then process the table)
* Try not to hardcode county or city names; instead, let the data on the page populate them
* Try to make your scraper less brittle by not relying on generated class names (e.g. those produced by CSS modules)
* When targeting elements, don't assume the order will stay the same (e.g. if there are multiple `.count` elements, don't assume the second one is deaths; verify it by parsing the label)
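
As a rough illustration of these tips, here is a minimal sketch in plain JavaScript. It assumes the table has already been extracted from the page as arrays of cell text with the header row first. The function name `parseTable`, the label patterns, and the shape of the returned objects are hypothetical and are not part of this project's API.

```javascript
// Hypothetical helper: the name, patterns, and output shape are illustrative only.
// `rows` is a table already pulled from the page: an array of rows,
// each row an array of cell strings, with the header row first.
function parseTable(rows) {
  const [header, ...body] = rows;
  const labels = header.map(cell => cell.trim().toLowerCase());

  // Find columns by their label rather than by a fixed position,
  // so a reordered table does not silently swap cases and deaths.
  const findColumn = matches => labels.findIndex(matches);
  const nameIndex = findColumn(label => /county|city/.test(label));
  const casesIndex = findColumn(
    label => /cases/.test(label) && !/presumptive/.test(label)
  );
  const deathsIndex = findColumn(label => /deaths/.test(label));

  // Fail loudly when the page layout changes instead of emitting bad data.
  if (nameIndex === -1 || casesIndex === -1) {
    throw new Error(`Unexpected table header: ${header.join(', ')}`);
  }

  // Strip thousands separators and other non-digits; yields NaN for empty cells.
  const toNumber = text => parseInt(text.replace(/[^\d]/g, ''), 10);

  // Place names come from the page itself; nothing is hardcoded.
  return body.map(cells => {
    const entry = {
      name: cells[nameIndex].trim(),
      cases: toNumber(cells[casesIndex]),
    };
    if (deathsIndex !== -1) {
      entry.deaths = toNumber(cells[deathsIndex]);
    }
    return entry;
  });
}
```

Because columns are looked up by label, the same function keeps working if the source reorders its columns, and it throws as soon as an expected label disappears.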

## Criteria for sources

Any source added to the scraper must meet the following criteria:

### 1. Sources must be government or health organizations

No news articles, no aggregated sources.

### 2. Sources must provide the number of cases at a bare minimum

Additional data is welcome.

### 3. Presumptive cases are not considered confirmed

As of now, presumptive cases should not be considered.

## License

This project is licensed under the permissive [BSD 2-clause license](LICENSE).

The data produced by this project is public domain.

## Attribution

Please cite this project if you use it in your visualization or reporting.
