Calculations of scores - Example 2 - conf.detect() #20

Open
melindahiggins2000 opened this issue Jun 22, 2023 · 2 comments

Comments

@melindahiggins2000

I am working with a polytomous dataset with a 5-level Likert-scaled response (1=strongly disagree to 5=strongly agree), and I'm trying to follow the logic of Example 2 for the conf.detect() function. I am interested in running the polyDETECT analysis. However, I don't follow how the scores were calculated:

score <- stats::qnorm( ( rowMeans( dat )+.5 )  / ( 30 + 1 ) )

I'm not familiar with this equation. Why is 0.5 added to the row means? And why is this sum then divided by 30+1? I'm guessing the 30 comes from the 30 columns of the dat dataset?

Please add some additional details for this example in the sirt::conf.detect() help pages and/or a reference on this equation for calculating scores using stats::qnorm() and how to adjust this equation for other datasets.

Thank you for your time and clarification!

====================================
Full example code:

## Not run: 
#############################################################################
# EXAMPLE 2: Big 5 data set (polytomous data)
#############################################################################

# attach Big5 Dataset
data(data.big5)

# select 6 items of each dimension
dat <- data.big5
dat <- dat[, 1:30]

# estimate person score by simply using a transformed sum score
score <- stats::qnorm( ( rowMeans( dat )+.5 )  / ( 30 + 1 ) )

# extract item cluster (Big 5 dimensions)
itemcluster <- substring( colnames(dat), 1, 1 )

# DETECT Item cluster
detect1 <- sirt::conf.detect( data=dat, score=score, itemcluster=itemcluster )
  ##        unweighted weighted
  ## DETECT      1.256    1.256
  ## ASSI        0.384    0.384
  ## RATIO       0.597    0.597

# Exploratory DETECT
detect5 <- sirt::expl.detect( data=dat, score=score,
                     nclusters=9, N.est=nrow(dat)  )
  ## DETECT (unweighted)
  ## Optimal Cluster Size is  6  (Maximum of DETECT Index)
  ##   N.Cluster N.items N.est N.val      size.cluster DETECT.est ASSI.est RATIO.est
  ## 1         2      30   500     0              6-24      1.073    0.246     0.510
  ## 2         3      30   500     0           6-10-14      1.578    0.457     0.750
  ## 3         4      30   500     0         6-10-11-3      1.532    0.444     0.729
  ## 4         5      30   500     0        6-8-11-2-3      1.591    0.462     0.757
  ## 5         6      30   500     0       6-8-6-2-5-3      1.610    0.499     0.766
  ## 6         7      30   500     0     6-3-6-2-5-5-3      1.557    0.476     0.740
  ## 7         8      30   500     0   6-3-3-2-3-5-5-3      1.540    0.462     0.732
  ## 8         9      30   500     0 6-3-3-2-3-5-3-3-2      1.522    0.444     0.724

# Plot Cluster solution
pl <- graphics::plot( detect5$clusterfit, main="Cluster solution" )
stats::rect.hclust(detect5$clusterfit, k=6, border="red")
@alexanderrobitzsch
Owner

alexanderrobitzsch commented Jun 22, 2023

Yes, 30 refers to the number of items times the maximum number of categories (this, of course, only works without missing data). I will adapt this in the manual. The transformation of the mean score is made in order to define z scores for extreme cases (a raw score of 0 or 30).
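
As a rough illustration of the point about extreme cases, here is a minimal sketch using hypothetical raw scores on a 0 to 30 scale. The names raw and max_score are assumptions made for this sketch and are not objects from the sirt example. A plain proportion sends the extremes to 0 and 1, where stats::qnorm() is infinite, while (raw + 0.5) / (max_score + 1) stays strictly inside (0, 1) and gives finite z scores.

```r
# Hypothetical raw scores on a 0..30 scale (illustration only, not data.big5)
max_score <- 30
raw <- c(0, 15, 30)

# Naive proportion: the extremes map to exactly 0 and 1,
# so qnorm() returns -Inf and Inf for them
p_naive <- raw / max_score
z_naive <- stats::qnorm(p_naive)

# Offset proportion: (raw + 0.5) / (max + 1) lies strictly inside (0, 1),
# so every raw score, including 0 and 30, gets a finite z score
p_adj <- (raw + 0.5) / (max_score + 1)
z_adj <- stats::qnorm(p_adj)

print(rbind(p_naive, z_naive, p_adj, z_adj))
```

The midpoint raw score of 15 maps to a probability of 0.5 and therefore to a z score of 0, and the two extremes map to z scores of equal size and opposite sign.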

@melindahiggins2000
Author

Thank you for the quick reply. The dataset dat in this example has 30 items with responses ranging from 0 to 2 (which I think aligns with 0=neutral, agree, or strongly agree in the original big5 dataset, 1=disagree, and 2=strongly disagree?). So, the rowMeans will range from 0 to 2.

How does adding 0.5 to the rowMeans and then dividing by the number of items + 1 compute a probability?

I see how taking the probability (or area under the normal curve with mean=0, sd=1) = p, and then running stats::qnorm(p) results in a z-score.

I'm just not following how the equation for the "transformed sum score" computes an area under the Normal curve?

Thank you again for your time and help!!
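
To make the question concrete, here is a small sketch on simulated data, assuming 500 persons and 30 items coded 0 to 2 with no missing values. The object dat_sim and the sum-score rescaling by n_items * max_cat are assumptions for illustration only. They are not the data.big5 responses, and the rescaling is not a formula confirmed in this thread. With row means between 0 and 2, the help-page expression can only produce values between 0.5/31 and 2.5/31. These are valid inputs to stats::qnorm(), but they cover only a narrow part of (0, 1). The assumed alternative divides the sum score by its maximum possible value plus one.

```r
# Simulated stand-in for the example data: 500 persons, 30 items coded 0..2.
# (Illustration only; these are NOT the data.big5 responses.)
set.seed(1)
n_items <- 30
max_cat <- 2
dat_sim <- matrix(sample(0:max_cat, 500 * n_items, replace = TRUE),
                  ncol = n_items)

# Expression from the help-page example: row means lie in [0, 2], so the
# argument of qnorm() is confined to [0.5/31, 2.5/31]
p_example <- (rowMeans(dat_sim) + 0.5) / (n_items + 1)
score_example <- stats::qnorm(p_example)

# Assumed adaptation: rescale the sum score by its maximum possible value
# (items x highest category); requires complete data, since rowSums()
# returns NA for any row containing a missing response
max_sum <- n_items * max_cat
p_sum <- (rowSums(dat_sim) + 0.5) / (max_sum + 1)
score_sum <- stats::qnorm(p_sum)

print(range(p_example))
print(range(p_sum))

# Both scores are increasing functions of the sum score, so they rank
# persons identically
print(stats::cor(score_example, score_sum, method = "spearman"))
```

Because both versions are monotone in the sum score, they order persons the same way and differ only in how the scores are spaced. Whether that spacing matters for the DETECT statistics is not settled in this thread.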