The main objective of this project is to equip students with the skills needed to complete a data science project: data collection and processing, data exploration and visualization, identifying and formulating problems, developing algorithms and models, designing experimental evaluations and discussing results, scientific writing, and working in teams.
/data
data folder, included in .gitignore
/generate
code used to generate the Python captcha dataset
/captcha
a local version of the captcha library, modified so bounding boxes are extractable
/tex
source files for the report (compile using pdfLaTeX and BibTeX)
- Dataset, Collection, Pre-processing
  - What are the characteristics of the dataset being used?
  - How is the dataset being collected, combined, augmented, etc.?
  - What are the various steps in pre-processing the dataset?
- Problem Definition
  - Is the problem clearly defined?
  - Why is the selected problem important, hard, etc.?
- Proposed Algorithm/Approach
  - Is the proposed algorithm/approach defined in sufficient detail?
  - Why is this an appropriate algorithm/approach for the problem?
- Evaluation Methodology
  - How do you go about evaluating your proposed algorithm/approach?
  - What are the evaluation metrics and why are they appropriate?
- Results and Discussion
  - What are your main findings from this study?
  - Is there a fair discussion of possible issues and limitations of this study?
- Overall Presentation
  - How well structured and organized is the report?
  - Are appropriate visualizations (graphs, charts, tables, etc.) being used?
- Originality/Creativity
  - How original and/or creative is the proposed study in terms of the above-mentioned points?