differences for PR #176
actions-user committed Nov 12, 2024
1 parent 90d05f2 commit 1f38aae
Showing 7 changed files with 103 additions and 21 deletions.
4 changes: 2 additions & 2 deletions clt.md
Original file line number Diff line number Diff line change
@@ -45,7 +45,7 @@ mean(random_numbers)
```

``` output
[1] 0.5192774
[1] 0.4994345
```
The important point of the Central Limit Theorem is that if we take a large
number of random samples and calculate the mean of each of these samples,
@@ -59,7 +59,7 @@ mean(runif(100))
```

``` output
[1] 0.4992043
[1] 0.5301663
```
And we can use the `replicate()` function to repeat that calculation several times, in this case 1000 times:
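
A minimal sketch of that repetition, assuming base R only. The seed value and the name `sample_means` are illustrative choices that do not appear in the lesson; fixing the seed is one way to keep printed numbers identical from one build to the next.

```r
# Fix the random seed so repeated runs give identical numbers
# (the value 42 is an arbitrary, illustrative choice).
set.seed(42)

# Draw 100 uniform random numbers and take their mean, 1000 times over.
sample_means <- replicate(1000, mean(runif(100)))

# One mean per repetition, centred close to the expected value 0.5.
length(sample_means)
mean(sample_means)
```

Without a call to `set.seed()`, every run draws different numbers, so outputs such as `mean(runif(100))` change each time the lesson is rebuilt.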

86 changes: 84 additions & 2 deletions data.md
@@ -1021,6 +1021,14 @@ _Dimensions:_ Rows: 631 Columns: 5

### SEXRAT

Frequencies of the different birth orders by sex for the first 5 children born in families.

Does the probability of a male birth differ from 50%?

Are the sexes of successive offspring independent? I.e., does the
sex of the first-born child affect the probability that the second child is male?


_Dimensions:_ Rows: 60 Columns: 8

[source](data.md#rosner_1)^1^
@@ -1043,8 +1051,6 @@ _Dimensions:_ Rows: 60 Columns: 8
| sexchldn* | Sex of all children |
| num_fam** | Number of families |

::::

+ For families with 5+ children, the sexes of the first 5 children are listed.
The number of children is given as 5 for such families.

@@ -1054,6 +1060,78 @@ such families.

** Number of families with a specific combination of children's sexes

For example, there are:

* 4400 families with 2 children where both children are male,
* 4270 families with 2 children where the first child is male and the second is female, and
* 4633 families with 2 children where the first child is female and the second is male.

::::
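
As a small worked illustration of how these counts can be read, the sketch below uses only the two-child figures quoted above; the variable names are hypothetical. It estimates, for two-child families alone, how often a male first child is followed by a male second child.

```r
# Counts of two-child families quoted in the example above.
n_MM <- 4400  # first child male, second child male
n_MF <- 4270  # first child male, second child female

# Share of male-first two-child families whose second child is also male.
p_M2_given_M1 <- n_MM / (n_MM + n_MF)
p_M2_given_M1
```

This covers two-child families only, so it is not the same estimate as one based on all family sizes in the data set.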


:::: spoiler

## Example

Compare P(child 2 is male | child 1 is female) with P(child 2 is male | child 1 is male).

The first of these is the probability that child 2 is male given that child 1 is female.

```r
library(tidyverse)  # provides read_csv(), %>%, filter() and summarise()

sexrat <- read_csv("https://raw.githubusercontent.com/KUBDatalab/R-toolbox/main/episodes/data/SEXRAT.csv")

# Number of families with a female first child:
sexrat %>%
  filter(sx_1 == "F") %>%
  summarise(nF1 = sum(num_fam))
#> # A tibble: 1 x 1
#>     nF1
#>   <dbl>
#> 1 25719

# Number of those families with a male second child:
sexrat %>%
  filter(sx_1 == "F",
         sx_2 == "M") %>%
  summarise(nF1M2 = sum(num_fam))
#> # A tibble: 1 x 1
#>   nF1M2
#>   <dbl>
#> 1 12882

# Point estimate of P(child 2 is male | child 1 is female):
pF1M2 <- 12882/25719
pF1M2
#> [1] 0.5008748

# Standard error of the estimated proportion:
SEM_F1M2 <- sqrt(pF1M2*(1-pF1M2)/25719)
SEM_F1M2
#> [1] 0.003117757

# Approximate 95% confidence interval for P(child 2 is male | child 1 is female):
pF1M2 + c(-1,1)*1.96*SEM_F1M2
#> [1] 0.4947640 0.5069856

# The same calculation for P(child 2 is male | child 1 is male)
# gives an interval of (rounded):
#> [1] 0.512 0.524

# The two intervals do not overlap, which would indicate that having a male
# child first increases the probability of the second child being male.
```

::::
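
The interval calculation in the example can be wrapped in a small function so that it is easy to repeat for other conditions. This is only a sketch: `prop_ci()` is a hypothetical helper, not part of the lesson or of any package, and it uses the normal approximation with `qnorm()` in place of the rounded constant 1.96.

```r
# Hypothetical helper (not part of the lesson): confidence interval
# for a proportion, using the normal approximation.
prop_ci <- function(successes, n, level = 0.95) {
  p  <- successes / n
  se <- sqrt(p * (1 - p) / n)
  z  <- qnorm(1 - (1 - level) / 2)  # about 1.96 for a 95% interval
  c(lower = p - z * se, upper = p + z * se)
}

# Families with a female first child (n) and, among them,
# those with a male second child (successes); counts from the example.
ci_F1 <- prop_ci(successes = 12882, n = 25719)
ci_F1
```

The result agrees with the interval in the example to about five decimal places. That interval contains 0.5, so the female-first families on their own are consistent with an even chance of a male second child.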


### SMOKE

@@ -1507,6 +1585,10 @@ Den oprindelige kilde til det datasæt: https://www.who.int/teams/global-tubercu

<a id="rosner_1">1</a>: Rosner, Bernard A. Fundamentals of Biostatistics, 7/e, International Edition, 2011 ISBN: 9780538735896. https://www.cengage.com/cgi-wadsworth/course_products_wp.pl?fid=M20b&product_isbn_issn=9780538733496&token

There is also good material here: https://www.doc88.com/p-5925003681540.html

https://statanaly.com/wp-content/uploads/2023/04/Fundamentals-of-Biostatistics-7th-Edition.pdf

<a id="hopper_2">2</a>: Hopper, J.H. & Seeman, E (1994). The bone density
of female twins discordant for tobacco use. New England Journal of Medicine, 330, 387-392.

Binary file modified fig/clt-rendered-random-histogram-1.png
Binary file modified fig/clt-rendered-repeated-means-histogram-1.png
30 changes: 15 additions & 15 deletions kmeans.md
@@ -144,31 +144,31 @@ clustering
```

``` output
K-means clustering with 3 clusters of sizes 69, 47, 62
K-means clustering with 3 clusters of sizes 62, 47, 69
Cluster means:
Alcohol Malicacid Ash Alcalinityofash Magnesium Totalphenols Flavanoids
1 12.51667 2.494203 2.288551 20.82319 92.34783 2.070725 1.758406
1 12.92984 2.504032 2.408065 19.89032 103.59677 2.111129 1.584032
2 13.80447 1.883404 2.426170 17.02340 105.51064 2.867234 3.014255
3 12.92984 2.504032 2.408065 19.89032 103.59677 2.111129 1.584032
3 12.51667 2.494203 2.288551 20.82319 92.34783 2.070725 1.758406
Nonflavanoidphenols Proanthocyanins Colorintensity Hue
1 0.3901449 1.451884 4.086957 0.9411594
1 0.3883871 1.503387 5.650323 0.8839677
2 0.2853191 1.910426 5.702553 1.0782979
3 0.3883871 1.503387 5.650323 0.8839677
3 0.3901449 1.451884 4.086957 0.9411594
OD280OD315ofdilutedwines Proline
1 2.490725 458.2319
1 2.365484 728.3387
2 3.114043 1195.1489
3 2.365484 728.3387
3 2.490725 458.2319
Clustering vector:
[1] 2 2 2 2 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 3 3 2 2 3 3 2 2 3 2 2 2 2 2 2 3 3
[38] 2 2 3 3 2 2 3 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 3 1 3 1 1 3 1 1 3 3 3 1 1 2
[75] 3 1 1 1 3 1 1 3 3 1 1 1 1 1 3 3 1 1 1 1 1 3 3 1 3 1 3 1 1 1 3 1 1 1 1 3 1
[112] 1 3 1 1 1 1 1 1 1 3 1 1 1 1 1 1 1 1 1 3 1 1 3 3 3 3 1 1 1 3 3 1 1 3 3 1 3
[149] 3 1 1 1 1 3 3 3 1 3 3 3 1 3 1 3 3 1 3 3 3 3 1 1 3 3 3 3 3 1
[1] 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 1 2 2 1 1 2 2 1 2 2 2 2 2 2 1 1
[38] 2 2 1 1 2 2 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 3 1 3 1 3 3 1 3 3 1 1 1 3 3 2
[75] 1 3 3 3 1 3 3 1 1 3 3 3 3 3 1 1 3 3 3 3 3 1 1 3 1 3 1 3 3 3 1 3 3 3 3 1 3
[112] 3 1 3 3 3 3 3 3 3 1 3 3 3 3 3 3 3 3 3 1 3 3 1 1 1 1 3 3 3 1 1 3 3 1 1 3 1
[149] 1 3 3 3 3 1 1 1 3 1 1 1 3 1 3 1 1 3 1 1 1 1 3 3 1 1 1 1 1 3
Within cluster sum of squares by cluster:
[1] 443166.7 1360950.5 566572.5
[1] 566572.5 1360950.5 443166.7
(between_SS / total_SS = 86.5 %)
Available components:
@@ -188,9 +188,9 @@ table()
``` output
true
quess 1 2 3
1 0 50 19
1 13 20 29
2 46 1 0
3 13 20 29
3 0 50 19
```
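
The removed and added rows of the table above differ only in which label each cluster received. A minimal sketch that checks this, with both cross-tabulations typed in by hand from the diff (the names `old_run`, `new_run` and `relabelled` are illustrative):

```r
# Cross-tabulations copied from the removed and added rows above
# (rows: cluster label, columns: true class).
old_run <- matrix(c( 0, 50, 19,
                    46,  1,  0,
                    13, 20, 29),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(cluster = 1:3, true = 1:3))
new_run <- matrix(c(13, 20, 29,
                    46,  1,  0,
                     0, 50, 19),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(cluster = 1:3, true = 1:3))

# Swap cluster labels 1 and 3 in the old run and compare cell by cell.
relabelled <- old_run[c(3, 2, 1), ]
all(relabelled == new_run)
```

The comparison returns `TRUE`: the two runs found the same three groups, and only the arbitrary numbering of clusters 1 and 3 was exchanged.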

The algorithm has no idea about the numbering; the three groups are numbered
2 changes: 1 addition & 1 deletion md5sum.txt
@@ -30,7 +30,7 @@
"instructors/instructor-notes.md" "5cf113fd22defb29d17b64597f3c9bc0" "site/built/instructor-notes.md" "2024-11-12"
"learners/CLT-dk.md" "a1852fcb44235823d23cd4a4af6d3d49" "site/built/CLT-dk.md" "2024-11-12"
"learners/CLT-en.md" "8ae8f14f05472820ef155acf980ae06f" "site/built/CLT-en.md" "2024-11-12"
"learners/data.md" "862bdded52e397453ca7ca4c9ecb6d0e" "site/built/data.md" "2024-11-12"
"learners/data.md" "c0c18402a14da78b51c444e502637ea8" "site/built/data.md" "2024-11-12"
"learners/reference.md" "527a12e217602daae51c5fd9ef8958df" "site/built/reference.md" "2024-11-12"
"learners/setup.md" "9b1b924cf88e06b154562a92250fcb76" "site/built/setup.md" "2024-11-12"
"poster/poster_dk.Rmd" "721a2b68eeb0b61308158bce41c6ae21" "site/built/poster_dk.md" "2024-11-12"
2 changes: 1 addition & 1 deletion normal-distribution.md
@@ -196,7 +196,7 @@ rnorm(5, mean = 0, sd = 1 )
```

``` output
[1] -1.8124108 -0.9414055 -0.1999997 1.6756810 0.8498545
[1] -1.18196702 -0.21613964 0.23521930 -0.81630292 0.04482079
```
It returns (here) five random values from a normal distribution with (here)
mean 0 and standard deviation 1.