The process
- Business Understanding
- Find data
- Understand data
- Understand all columns
- Run tests involving aggregate data:
  - Neighborhoods with most accidents
  - Day of the week with most accidents
  - Time of the day with most accidents
  - Type of car involved in most accidents
- Define business rules (criteria)
- Criteria for dangerous routes
- Define a scale to measure risk
- Define the influence radius of an accident
- Prepare data/matrix
- Choose the columns/criteria that make the most sense (dimensionality reduction)
- Fill in missing data by combining other data
- Develop logic/workflow
- Develop code
- Test
- Implement solution
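
As a rough illustration of the aggregate tests and the business rules listed above, here is a minimal Python sketch. Everything in it is an assumption made for the example: the sample records, the field names (`neighborhood`, `when`, `lat`, `lon`, `severity`), the 200 m default influence radius, and the choice of summing severity as the risk scale. The real dataset and criteria would come out of the "Understand data" and "Define business rules" steps.

```python
from collections import Counter
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

# Hypothetical sample records; field names and values are assumptions,
# not the columns of any real accident dataset.
ACCIDENTS = [
    {"neighborhood": "Centro", "when": "2015-03-02 08:15",
     "lat": -30.0300, "lon": -51.2300, "severity": 3},
    {"neighborhood": "Centro", "when": "2015-03-09 18:40",
     "lat": -30.0305, "lon": -51.2302, "severity": 1},
    {"neighborhood": "Norte", "when": "2015-03-06 18:05",
     "lat": -30.0400, "lon": -51.1900, "severity": 2},
]

WEEKDAYS = ("Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday")


def parse_when(record):
    """Parse the record's timestamp string into a datetime."""
    return datetime.strptime(record["when"], "%Y-%m-%d %H:%M")


def top_counts(records, key_func):
    """Count records per group, ordered from most to least frequent."""
    return Counter(key_func(r) for r in records).most_common()


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    earth_radius_m = 6_371_000
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * earth_radius_m * asin(sqrt(a))


def risk_at(lat, lon, records, radius_m=200.0):
    """Assumed risk scale: sum the severity of every accident whose
    influence radius (radius_m) covers the given point."""
    return sum(
        r["severity"]
        for r in records
        if haversine_m(lat, lon, r["lat"], r["lon"]) <= radius_m
    )


by_neighborhood = top_counts(ACCIDENTS, lambda r: r["neighborhood"])
by_weekday = top_counts(ACCIDENTS, lambda r: WEEKDAYS[parse_when(r).weekday()])
by_hour = top_counts(ACCIDENTS, lambda r: parse_when(r).hour)
risk_near_first = risk_at(-30.0300, -51.2300, ACCIDENTS)
```

A route could then be scored by evaluating `risk_at` at points sampled along it and comparing the total against whatever threshold the "criteria for dangerous routes" step settles on. Summing severity is only one possible scale; a distance-weighted score would be an equally plausible choice.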