ishmalkhalid/CSV-Data-Cleaning

Homework 02

Questions

Question 1:

Is the number of suicides greater among males or females between 1979 and 2016?

  • Who (population): Populations from many different countries around the globe
  • What (subject, discipline): The number of suicides
  • Where (location): All around the globe
  • When (snapshot, longitudinal): 1979-2016
  • How much data do you need to do the analysis/work: The number of males and females who committed suicide from 1979 to 2016

Question 2:

What is the maximum number of suicides committed?

  • Who (population): In any country around the globe
  • What (subject, discipline): Maximum number of suicides
  • Where (location): All around the globe
  • When (snapshot, longitudinal): 1979-2016
  • How much data do you need to do the analysis/work: All suicide counts across the various divisions

Question 3:

What is the average number of suicides committed?

  • Who (population): In any country around the globe
  • What (subject, discipline): Average number of suicides
  • Where (location): All around the globe
  • When (snapshot, longitudinal): 1979-2016
  • How much data do you need to do the analysis/work: All suicide counts across the various divisions

Question 4:

What is the variation in the number of suicides over the years?

  • Who (population): In any country around the globe
  • What (subject, discipline): Variation in the number of suicides over the years
  • Where (location): All around the globe
  • When (snapshot, longitudinal): 1979-2016
  • How much data do you need to do the analysis/work: All suicide counts across the various divisions

Who Might Collect Relevant Data / What Articles or Publications Cite a Relevant Data Set?

Government agencies, NGOs, academic researchers, scholarly articles

About the Data

  1. Name / Title: WHO Suicide Statistics

  2. Link to Data: https://www.kaggle.com/szamil/who-suicide-statistics

  3. Source / Origin:

    • Author or Creator: Szamil
    • Publication Date: 2018-08-30
    • Publisher: Szamil
    • Version or Date Accessed: Version 1
  4. License: CC BY-NC-SA 3.0 IGO (Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Intergovernmental Organization)

  5. Can You Use this Data Set for Your Intended Use Case? Yes. The license states: "Extracts of the information in the web site may be reviewed, reproduced or translated for research or private study but not for sale or for use in conjunction with commercial purposes."

Format

Overview

  • Format: Comma Separated Values (CSV)
  • Size: 1,809 KB
  • Number of Records: 43776

Sample of Data

Albania,2015,male,35-54 years,,374464
Albania,2015,male,5-14 years,,184114
Albania,2015,male,55-74 years,,287770
Albania,2015,male,75+ years,,64200
Anguilla,1983,female,15-24 years,0,
Anguilla,1983,female,25-34 years,0,
Anguilla,1983,female,35-54 years,0,
Anguilla,1983,female,5-14 years,0,

Fields or Column Headers

  • country: String
  • year: Integer
  • sex: String
  • age: String
  • suicides_no: Integer
  • population: Integer
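The sample rows show that both `suicides_no` and `population` can be empty, so a loader has to decide how to represent missing values. The following is a minimal sketch using only Python's standard library, not necessarily how the original analysis was done. The header line is assumed from the field list above, and `to_int` and `load_records` are hypothetical helper names.

```python
import csv
import io

# Header assumed from the field list; rows copied from the sample of data.
SAMPLE = """country,year,sex,age,suicides_no,population
Albania,2015,male,35-54 years,,374464
Albania,2015,male,5-14 years,,184114
Anguilla,1983,female,15-24 years,0,
Anguilla,1983,female,25-34 years,0,
"""


def to_int(text):
    """Return int(text), or None when the field is empty."""
    text = text.strip()
    return int(text) if text else None


def load_records(handle):
    """Read CSV rows into dicts with typed fields; empty numbers become None."""
    records = []
    for row in csv.DictReader(handle):
        records.append({
            "country": row["country"],
            "year": int(row["year"]),
            "sex": row["sex"],
            "age": row["age"],
            "suicides_no": to_int(row["suicides_no"]),
            "population": to_int(row["population"]),
        })
    return records


records = load_records(io.StringIO(SAMPLE))
print(records[0])
```

Keeping missing values as `None` rather than `0` matters for the statistics that follow: a missing count treated as zero would pull the mean down.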

Analysis

Central Tendency

  • Mean suicides: 194
  • Mean population: 1666203
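As a rough illustration of how such a mean could be computed when some fields are empty, the sketch below averages only the values that are present. Whether the original analysis skipped missing values in the same way is an assumption. `mean_present` is a hypothetical helper, and the two lists are the `population` and `suicides_no` columns of the eight sample rows shown earlier.

```python
from statistics import mean


def mean_present(values):
    """Mean of the non-missing values, or None if every value is missing."""
    present = [v for v in values if v is not None]
    return mean(present) if present else None


# Columns of the eight sample rows (None marks an empty field).
populations = [374464, 184114, 287770, 64200, None, None, None, None]
suicides = [None, None, None, None, 0, 0, 0, 0]

print(mean_present(populations))
print(mean_present(suicides))
```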

Dispersion

  • Range of suicides: 22338
  • Range of population: 43805201
  • Variance of suicides: 664487.06813276
  • Variance of population: 13393385961749
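The range is the maximum minus the minimum, which can be checked against the extremes reported under Outliers (43805214 - 13 = 43805201 for population). The sketch below shows that check alongside the two common variance definitions. Which definition the original analysis used (population or sample variance) is not stated, so both are shown as an assumption. `value_range` is a hypothetical helper.

```python
from statistics import pvariance, variance


def value_range(values):
    """Largest minus smallest non-missing value."""
    present = [v for v in values if v is not None]
    return max(present) - min(present)


# Population extremes reported in this document.
population_range = value_range([13, 43805214])
print(population_range)

# Populations from the sample rows (None marks an empty field).
populations = [374464, 184114, 287770, 64200, None, None, None, None]
present = [v for v in populations if v is not None]

# pvariance divides by n, variance divides by n - 1.
print(pvariance(present))
print(variance(present))
```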

Outliers

  • Maximum number of suicides: 22338
  • Minimum number of suicides: 0
  • Maximum population: 43805214
  • Minimum population: 13

Other

  • Number of males who committed suicide: 6124183
  • Number of females who committed suicide: 1946961
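Totals like these come from summing `suicides_no` per value of `sex`. The following sketch shows one way to do that while skipping missing counts. The records are made up for illustration and are not taken from the data set, and `total_by_sex` is a hypothetical helper. The final lines compute the male-to-female ratio from the two totals reported above.

```python
from collections import defaultdict


def total_by_sex(records):
    """Sum suicides_no per sex, ignoring records with a missing count."""
    totals = defaultdict(int)
    for rec in records:
        if rec["suicides_no"] is not None:
            totals[rec["sex"]] += rec["suicides_no"]
    return dict(totals)


# Hypothetical records, made up for illustration only.
records = [
    {"sex": "male", "suicides_no": 12},
    {"sex": "female", "suicides_no": 4},
    {"sex": "male", "suicides_no": None},
    {"sex": "female", "suicides_no": 1},
]
totals = total_by_sex(records)
print(totals)

# Ratio computed from the totals reported in this document.
reported = {"male": 6124183, "female": 1946961}
ratio = reported["male"] / reported["female"]
print(round(ratio, 2))
```

With the reported totals, the ratio comes out a little above 3, which is the basis for answering Question 1.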

Conclusion

Yes, the calculations do answer my questions. However, I believe the analysis would be more substantive if it were broken into parts, with separate analyses across years, across genders, and across countries. Furthermore, the population mean holds no real importance: it is averaged over every country across 37 years and all age groups and genders, i.e. it counts the same people again each year.
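The breakdown suggested here could be sketched as a grouped sum over one or more fields. This is an illustration only: the records are made up, the country names "A" and "B" are placeholders, and `total_by` is a hypothetical helper rather than part of the original analysis.

```python
from collections import defaultdict


def total_by(records, *keys):
    """Sum suicides_no per combination of the given fields, skipping missing counts."""
    totals = defaultdict(int)
    for rec in records:
        if rec["suicides_no"] is None:
            continue
        totals[tuple(rec[k] for k in keys)] += rec["suicides_no"]
    return dict(totals)


# Hypothetical records, made up for illustration only.
records = [
    {"country": "A", "year": 2000, "sex": "male", "suicides_no": 10},
    {"country": "A", "year": 2000, "sex": "female", "suicides_no": 3},
    {"country": "B", "year": 2000, "sex": "male", "suicides_no": None},
    {"country": "B", "year": 2001, "sex": "male", "suicides_no": 7},
]

by_year = total_by(records, "year")
by_year_sex = total_by(records, "year", "sex")
print(by_year)
print(by_year_sex)
```

The same helper covers per-country or per-age-group totals by passing different field names, so each breakdown does not need its own loop.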
