This script generates large, flexible databases for practicing the concepts taught in Biostatistics (and Statistics) classes. It comprises a wide range of variables of different types, with underlying relations between them, designed to allow exploration. All values are initially randomized according to rules described on !!!ADD DOCUMENTATION!!!, many of them inspired by publicly available real data. Some Easter eggs may be present: if the name of one of my friends is found in the table, that record is forced to match their data.

Background

This project was inspired by my experience as a teaching assistant for the Med School Biostatistics class at Universidade Federal do Triângulo Mineiro, in the 2022/1 semester. The assignment was to create a fictitious scientific paper, step by step, inspired by a real paper chosen by each student. However, many students had trouble reusing the paper's existing variables and creating their own in a way that made the best use of their syllabus. They also struggled with populating their databases by hand and with choosing the distribution of their data. While helping one student automate the creation of her data, we started applying real or reality-inspired conditions to the random generators. I later decided to expand this idea into a large database flexible enough to remove the need for a paper while still allowing for variety across the entire class.

The resulting database was originally intended to be used as a population from which samples can be taken, as a set of samples created for each student, or as a mix of both. I suggest using a large n to create a population database, from which each student devises their own analysis plan. A first sampling step then stands in for data available from the literature or from a small study, and is used to calculate the sample size for the proper analysis. A second sampling step provides the data for the actual analysis project. The application is designed so that students may analyze the data as-is or adjust their population's attributes as desired.
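
The suggested workflow (population, pilot sample, sample size calculation, analysis sample) can be sketched in a few lines. This is only an illustrative sketch, not the project's own code: the population here is a hypothetical list of normally distributed hemoglobin-like values rather than output from the generator, and the function names, the margin of error, and the sample size formula for estimating a mean (n = (z·s/E)²) are assumptions chosen for the example.

```python
import math
import random
import statistics


def draw_sample(population, n, seed):
    """Draw n records without replacement, reproducibly."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def sample_size_for_mean(sd, margin, z=1.96):
    """Sample size needed to estimate a mean within +/- margin.

    Uses n = (z * sd / margin)^2, rounded up. z = 1.96 corresponds
    to a 95% confidence level under a normal approximation.
    """
    return math.ceil((z * sd / margin) ** 2)


# Hypothetical stand-in for the generated population database:
# 100,000 hemoglobin-like values (assumed mean 14.0, SD 1.5).
population_rng = random.Random(2022)
population = [population_rng.gauss(14.0, 1.5) for _ in range(100_000)]

# Step 1: a small pilot sample plays the role of "data from the
# literature or a small study".
pilot = draw_sample(population, 30, seed=1)
pilot_sd = statistics.stdev(pilot)

# Step 2: use the pilot estimate to size the real study
# (assumed margin of error: 0.25 units).
n_required = sample_size_for_mean(pilot_sd, margin=0.25)

# Step 3: draw the analysis sample from the same population.
study = draw_sample(population, n_required, seed=2)

print(f"pilot SD: {pilot_sd:.2f}")
print(f"required n: {n_required}")
print(f"study mean: {statistics.mean(study):.2f}")
```

Giving each student a different seed would yield a different pilot and analysis sample from the same population, which is one way to get variety across a class.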


paulk2jonas/UFTM-BioStat-DataBase
