diff --git a/bootstrap.qmd b/bootstrap.qmd
index 53c9ea9..e42b7f1 100644
--- a/bootstrap.qmd
+++ b/bootstrap.qmd
@@ -22,8 +22,8 @@ The procedure is to simulate `n` response vectors from `m` using the estimated p
 and refit `m` to these responses in turn, accumulating the statistics of interest
 at each iteration.
 The parameters of a `LinearMixedModel` object are the fixed-effects
-parameters, `β`, the standard deviation, `σ`, of the per-observation noise, and the covariance
-parameter, `θ`, that defines the variance-covariance matrices of the random effects.
+parameters (`β`), the standard deviation (`σ`) of the per-observation noise, and the covariance
+parameter (`θ`) that defines the variance-covariance matrices of the random effects.
 A technical description of the covariance parameter can be found in the [MixedModels.jl docs](https://juliastats.org/MixedModels.jl/v4.7.1/optimization/).
 Lisa Schwetlick and Daniel Backhaus have provided a more beginner-friendly description of the covariance parameter in the [documentation for MixedModelsSim.jl](https://repsychling.github.io/MixedModelsSim.jl/v0.2.6/simulation_tutorial/).
 For today's purposes -- looking at the uncertainty in the estimates from a fitted model -- we can simply use values from the fitted model, but we will revisit the parametric bootstrap as a convenient way to simulate new data, potentially with different parameter values, for power analysis.
@@ -73,11 +73,6 @@ contrasts = Dict(:spkr => EffectsCoding(),
 ```

 The `EffectsCoding` contrast is used with these to create a ±1 encoding.
-Furthermore, `Grouping` contrasts are assigned to the `subj` and `item` factors.
-This is not a contrast per-se but an indication that these factors will be used as grouping factors for random effects and, therefore, there is no need to create a contrast matrix.
-For large numbers of levels in a grouping factor, an attempt to create a contrast matrix may cause memory overflow.
-
-It is not important in these cases but a good practice in any case.

 We can look at an initial fit of moderate complexity:

@@ -145,11 +140,11 @@ draw(plt; figure=(;supertitle="Parametric bootstrap estimates of variance compon

 The bootstrap sample can be used to generate intervals that cover a certain percentage of the bootstrapped values.
 We refer to these as "coverage intervals", similar to a confidence interval.
-The shortest such intervals, obtained with the `shortestcovint` extractor, correspond to a highest posterior density interval in Bayesian inference.
+The shortest such intervals, obtained with the `confint` extractor, correspond to a highest posterior density interval in Bayesian inference.
 We generate these for all random and fixed effects:

 ```{julia}
-confint(samp)
+confint(samp; method=:shortest)
 ```

 ```{julia}
diff --git a/live_coding_sessions/bootstrap.jl b/live_coding_sessions/bootstrap.jl
new file mode 100644
index 0000000..a421b20
--- /dev/null
+++ b/live_coding_sessions/bootstrap.jl
@@ -0,0 +1,4 @@
+form = @formula(rt_trunc ~ 1 + spkr * prec * load +
+                (1 + spkr * prec * load | subj) +
+                (1 + spkr * prec * load | item))
+yolo = fit(MixedModel, form, kb07; contrasts, progress)
diff --git a/live_coding_sessions/intro_to_julia.jl b/live_coding_sessions/intro_to_julia.jl
new file mode 100644
index 0000000..0c61456
--- /dev/null
+++ b/live_coding_sessions/intro_to_julia.jl
@@ -0,0 +1,48 @@
+using Statistics
+
+v = [1, 2, 3]
+for i in v
+    println(i)
+end
+
+using DataFrames
+df = DataFrame(;a=[1,2],b=["a","b"])
+describe(df)
+
+select(df, :a)
+select(df, "a")
+
+select(df, a)  # errors: `a` is not yet defined; bare names are variables, not columns
+a = "b"
+# select(df, a)
+transform(df, :a => ByRow(abs2) => :c)
+transform(df, :a => ByRow(abs2) => :a)
+transform(df, :a => ByRow(abs2); renamecols=false)
+
+combine(groupby(df, :b), :a => mean; renamecols=false)
+
+# column access
+df.a
+
+df[2, :]
+
+df[:, :a]
+
+# mutating variants
+transform(df, :a => ByRow(abs2) => :c)
+# the mutating (!) version adds the column to the original frame
+transform!(df, :a => ByRow(abs2) => :c)
+
+function square(x)
+    return x^2
+end
+
+square(x) = x^2 # like lambda in python
+
+function factor_name(x::AbstractString)
+    return x
+end
+
+function factor_name(x)
+    return string(x)
+end
diff --git a/shrinkageplot.qmd b/shrinkageplot.qmd
index a154b17..a21c9c9 100644
--- a/shrinkageplot.qmd
+++ b/shrinkageplot.qmd
@@ -21,7 +21,7 @@ using MixedModelsMakie
 using Random
 using ProgressMeter

-const progress=false
+const progress = false
 ```

 Load the kb07 data set (don't tell Reinhold that I used these data).
@@ -32,8 +32,6 @@ kb07 = MixedModels.dataset(:kb07)

 ```{julia}
 contrasts = Dict(
-    :subj => Grouping(),
-    :item => Grouping(),
     :spkr => HelmertCoding(),
     :prec => HelmertCoding(),
     :load => HelmertCoding(),
@@ -116,7 +114,7 @@ X1 * X1'

 ## How to interpret a shrinkage plot

-  - Extreme shrinkage (shrunk to a line or to a point) is easy to interpret - the term is not providing benefit and can be removed.
+  - Extreme shrinkage (shrunk to a line or to a point) is easy to interpret -- the term is not providing benefit and can be removed.
   - When the range of the blue dots (shrunk values) is comparable to those of the red dots (unshrunk) it indicates that the term after shrinkage is about as strong as without shrinkage.
     - By itself, this doesn't mean that the term is important. In some ways you need to get a feeling for the absolute magnitude of the random effects in addition to the relative magnitude.
   - Small magnitude and small relative magnitude indicate you can drop that term
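
The bootstrap.qmd hunks above describe the simulate-and-refit procedure and switch the shortest coverage intervals over to the `confint` extractor. As a point of reference, here is a minimal sketch of that workflow on the kb07 data; the reduced random-intercepts formula, the bootstrap size of 1000, and the RNG seed are illustrative choices, not taken from the patch.

```julia
using MixedModels, Random

kb07 = MixedModels.dataset(:kb07)
contrasts = Dict(:spkr => EffectsCoding(),
                 :prec => EffectsCoding(),
                 :load => EffectsCoding())
# a deliberately simple model with random intercepts only, purely for illustration
m = fit(MixedModel,
        @formula(rt_trunc ~ 1 + spkr * prec * load + (1 | subj) + (1 | item)),
        kb07; contrasts, progress=false)
# simulate 1000 response vectors from the estimated parameters and refit each time
samp = parametricbootstrap(MersenneTwister(42), 1000, m)
# shortest coverage intervals, analogous to highest posterior density intervals
confint(samp; method=:shortest)
```

The coverage level defaults to 0.95 and can be adjusted with the `level` keyword argument.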
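
The shrinkageplot.qmd hunks drop the explicit `Grouping()` entries from the contrasts dictionary and touch up the interpretation notes. For readers following along outside the qmd file, a sketch along these lines produces a shrinkage plot for one grouping factor; the model formula and the choice of `:subj` as the displayed factor are assumptions for illustration, and a Makie backend such as CairoMakie is assumed to be available.

```julia
using CairoMakie              # plotting backend, assumed available
using MixedModels, MixedModelsMakie

kb07 = MixedModels.dataset(:kb07)
contrasts = Dict(:spkr => HelmertCoding(),
                 :prec => HelmertCoding(),
                 :load => HelmertCoding())
# vector-valued random effects so that shrinkage is visible as a 2-D scatter
m = fit(MixedModel,
        @formula(rt_trunc ~ 1 + spkr + prec + load +
                            (1 + prec | subj) + (1 + prec | item)),
        kb07; contrasts, progress=false)
# red dots: unshrunk within-group estimates; blue dots: shrunk conditional modes
shrinkageplot(m, :subj)
```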