data-analysis-hack

Exercise in data analysis for CosmoLab

Goals for this exercise:

Practice programming & organizing code
Practice collaborating using git & github
Learn about (or gain more understanding of) probabilistic modeling

Each group should work together to do the following, for one (or both) of the two datasets. Some of these steps have specific associated functions that should be implemented, as indicated below; other steps are more open-ended.

Load & visualize the data. (Figure it out!)
Come up with a parameterized physical analytic model that could describe the data in the absence of uncertainty.
Plot a few different versions of this physical model (for a few different fixed values of the parameters) on top of the data.
Write down a probabilistic generative model to describe the data.
Implement a likelihood function according to this generative model.
Define the prior probability distribution function for your parameter space.
Overplot your data with N versions of your physical model evaluated with N samples from your prior.
Implement a posterior probability function.
Find the values of your model parameters that maximize the posterior probability, and make a model/data plot using those parameters.
Generate samples from your posterior using an MCMC package of your choice.
Visualize the distribution and correlations of your posterior samples.
Overplot your data with N versions of your physical model evaluated with N samples from your posterior.
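
As a rough illustration of the likelihood, prior, and posterior steps above, here is a minimal sketch. Everything in it is an assumption made for the example: the straight-line `model`, the Gaussian noise with known per-point `sigma`, the flat prior box, and the toy numbers, which are made up and are not taken from either dataset. It uses only the Python standard library so that it runs anywhere; in practice you would probably write this with NumPy.

```python
import math


def model(x, m, b):
    """Hypothetical physical model: a straight line y = m*x + b."""
    return m * x + b


def log_likelihood(params, x, y, sigma):
    """Gaussian log-likelihood: each y[i] ~ Normal(model(x[i]), sigma[i])."""
    m, b = params
    total = 0.0
    for xi, yi, si in zip(x, y, sigma):
        resid = yi - model(xi, m, b)
        total += (-0.5 * (resid / si) ** 2
                  - math.log(si)
                  - 0.5 * math.log(2.0 * math.pi))
    return total


def log_prior(params, m_range=(-10.0, 10.0), b_range=(-10.0, 10.0)):
    """Flat prior inside an assumed box; -inf (zero probability) outside."""
    m, b = params
    if m_range[0] <= m <= m_range[1] and b_range[0] <= b <= b_range[1]:
        area = (m_range[1] - m_range[0]) * (b_range[1] - b_range[0])
        return -math.log(area)
    return -math.inf


def log_posterior(params, x, y, sigma):
    """Unnormalized log-posterior = log-prior + log-likelihood."""
    lp = log_prior(params)
    if not math.isfinite(lp):
        return -math.inf
    return lp + log_likelihood(params, x, y, sigma)


# Made-up toy data, for illustration only.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.1, 2.9, 5.2, 6.8]
sigma = [0.2, 0.2, 0.2, 0.2]

good = log_posterior((2.0, 1.0), x, y, sigma)      # close to the toy data
bad = log_posterior((0.0, 0.0), x, y, sigma)       # far from the toy data
outside = log_posterior((50.0, 0.0), x, y, sigma)  # outside the prior box
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, and most MCMC packages expect a log-probability function of this general shape.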

Some of the above steps should be encapsulated in the functions listed below. Implement them in a submodule named after your group and submit it to this repository via a pull request, so that anyone can execute the following code from the top level of the repo:

import analysis_group_X as analysis

analysis.plot_data()

analysis.plot_model_exploration()

analysis.plot_model_prior_samples(N=25)

analysis.plot_model_max_posterior()

samples = analysis.sample_posterior()

analysis.plot_posterior_samples(samples)

analysis.plot_model_posterior_samples(samples, N=25)

The master.ipynb notebook contains such cells, and can be used to test and experiment. However, please only submit the code in your group's directory in your pull request, not any notebooks.
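
To give a feel for what `sample_posterior()` does internally, here is a sketch of a hand-written random-walk Metropolis sampler. This is a stand-in for illustration, not the MCMC package the exercise asks you to choose, and the names `metropolis` and `log_standard_normal`, the step size, the burn-in length, and the one-dimensional toy target are all assumptions. The return format (a list of parameter tuples) is also just one possible choice.

```python
import math
import random


def metropolis(log_prob, start, n_steps=5000, step_size=0.5, seed=42):
    """Random-walk Metropolis; returns a list of parameter tuples."""
    rng = random.Random(seed)
    current = tuple(start)
    current_lp = log_prob(current)
    samples = []
    for _ in range(n_steps):
        # Propose a Gaussian step around the current position.
        proposal = tuple(p + rng.gauss(0.0, step_size) for p in current)
        proposal_lp = log_prob(proposal)
        # Accept with probability min(1, p(proposal) / p(current)).
        # 1 - random() lies in (0, 1], so the log is always defined.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


def log_standard_normal(params):
    """Toy target: unnormalized log-density of a 1-D standard normal."""
    (theta,) = params
    return -0.5 * theta * theta


samples = metropolis(log_standard_normal, start=(0.0,),
                     n_steps=20000, step_size=1.0)
values = [s[0] for s in samples[2000:]]  # discard an assumed burn-in
mean = sum(values) / len(values)
var = sum((v - mean) ** 2 for v in values) / len(values)
```

For the toy target, `mean` should come out near 0 and `var` near 1. A dedicated MCMC package will generally be more efficient and provides convergence diagnostics, so treat this only as a way to understand the moving parts.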
