Data simulation of individuals with various characteristics

SimonGrouard/intro_simulation

Already created: first names, last names and gender (gender is consistent with the first name).
We downloaded a large CSV file, published by a company, that lists the names and genders of many people.
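
The idea of keeping gender consistent with the first name can be sketched by drawing both fields from the same row of the table. This is a minimal sketch, not the repository's code: the inline CSV, its column names (`first_name`, `last_name`, `gender`) and the helper names are hypothetical, and drawing the last name independently from a second row is an assumption.

```python
import csv
import io
import random

# Hypothetical stand-in for the downloaded CSV file. The real file and its
# column names are not described in this README.
SAMPLE_CSV = """first_name,last_name,gender
Alice,Martin,F
Bob,Durand,M
Chloe,Bernard,F
David,Petit,M
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def sample_people(rows, n, rng):
    """Draw n simulated people from the rows of the name table."""
    people = []
    for _ in range(n):
        first = rng.choice(rows)  # first name and gender come from the same row
        last = rng.choice(rows)   # assumption: last name is drawn independently
        people.append({
            "first_name": first["first_name"],
            "gender": first["gender"],
            "last_name": last["last_name"],
        })
    return people


rows = load_rows(SAMPLE_CSV)
people = sample_people(rows, 5, random.Random(0))
```

Because the first name and the gender are read from one row, a simulated person can never get a gender that the source table does not associate with that first name.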

Height: look at the height distributions for boys and girls, take the mean and variance of each, and sample from a normal distribution per gender.
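
A minimal sketch of per-gender normal sampling, assuming placeholder parameters: the means and variances below are illustrative only and are not the values used in the repository. Note that `random.gauss` expects a standard deviation, so the variance is square-rooted first.

```python
import math
import random

# Placeholder parameters (mean in cm, variance in cm^2). These numbers are
# illustrative assumptions, NOT values taken from the repository.
HEIGHT_PARAMS = {
    "M": {"mean": 175.0, "variance": 49.0},
    "F": {"mean": 162.0, "variance": 42.25},
}


def sample_height(gender, rng):
    """Draw one height from the normal distribution for the given gender."""
    params = HEIGHT_PARAMS[gender]
    std_dev = math.sqrt(params["variance"])  # gauss() takes sigma, not variance
    return rng.gauss(params["mean"], std_dev)


rng = random.Random(42)
heights = [sample_height("F", rng) for _ in range(1000)]
```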

BMI and height: basically the same approach, that is, sample from a normal distribution per gender.

Country and city: same approach as for first name and gender; we found a similar file.

Case-control status: sample from a binomial distribution with a single trial (a Bernoulli draw) and probability 0.5.

Education level: 5 different levels, sampled from a uniform distribution.
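
The two draws above (case-control status and education level) can be sketched as follows. The 0/1 coding of the status and the integer labels for the education levels are assumptions; the README only gives the probability 0.5 and the number of levels.

```python
import random

# Assumption: the 5 education levels are labelled 1 to 5.
EDUCATION_LEVELS = [1, 2, 3, 4, 5]


def sample_case_control(rng, p=0.5):
    """Bernoulli draw: 1 (case) with probability p, else 0 (control)."""
    return 1 if rng.random() < p else 0


def sample_education(rng):
    """Pick one education level, each level being equally likely."""
    return rng.choice(EDUCATION_LEVELS)


rng = random.Random(1)
statuses = [sample_case_control(rng) for _ in range(1000)]
education = [sample_education(rng) for _ in range(1000)]
```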

Gene expression and SNP values: we have 10 gene expression features and 5 SNP features. Each gene expression value is a real number between -1 and 1. Each SNP takes one of the values 0, 1 or 2. As before, each feature is sampled from a uniform distribution over its range.
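
A minimal sketch of the uniform sampling for one individual, assuming the features are drawn independently of each other (the README does not say otherwise). The function and constant names are hypothetical.

```python
import random

N_GENES = 10  # number of gene expression features
N_SNPS = 5    # number of SNP features


def sample_genomics(rng):
    """Return (gene expression values, SNP values) for one individual."""
    # Continuous uniform draw on the interval from -1 to 1 for each gene.
    genes = [rng.uniform(-1.0, 1.0) for _ in range(N_GENES)]
    # Discrete uniform draw over the three SNP values.
    snps = [rng.choice((0, 1, 2)) for _ in range(N_SNPS)]
    return genes, snps


genes, snps = sample_genomics(random.Random(7))
```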
