Package 'IRTest' reference manual

Title:	Parameter Estimation of Item Response Theory with Estimation of Latent Distribution
Description:	Item response theory (IRT) parameter estimation using marginal maximum likelihood and the expectation-maximization algorithm (Bock & Aitkin, 1981 <doi:10.1007/BF02293801>). Within the parameter estimation algorithm, several methods for latent distribution estimation are available. By reflecting features of the true latent distribution, these methods can improve estimation accuracy and relax the normality assumption on the latent distribution.
Authors:	Seewoo Li [aut, cre, cph]
Maintainer:	Seewoo Li <[email protected]>
License:	GPL (>= 3)
Version:	2.1.0
Built:	2025-02-22 22:50:38 UTC
Source:	https://github.com/seewooli/irtest

Ability parameter estimation with fixed item parameters

Description

Ability parameter estimation when item responses and item parameters are given. This function can be useful for ability parameter estimation in adaptive testing.

Usage

adaptive_test(
  response,
  item,
  model = "dich",
  ability_method = "EAP",
  quad = NULL,
  prior = NULL
)

Arguments

`response`	A matrix of item responses. For a mixed-format test, a list of item responses where dichotomous item responses are the first element and polytomous item responses are the second element.
`item`	A matrix of item parameters. For a mixed-format test, a list of item parameters where dichotomous item parameters are the first element and polytomous item parameters are the second element.
`model`	`dich` for dichotomous items, `cont` for continuous items, and a specific item response model (e.g., `PCM`, `GPCM`, `GRM`) for polytomous items and a mixed-format test. The default is `dich`.
`ability_method`	The ability parameter estimation method. The available options are Expected a posteriori (`EAP`), Maximum Likelihood Estimates (`MLE`), and weighted likelihood estimates (`WLE`). The default is `EAP`.
`quad`	A vector of quadrature points for `EAP` calculation. If `NULL` is passed, it is set as `seq(-6,6,length.out=121)`. The default is `NULL`.
`prior`	A vector of the prior distribution for `EAP` calculation. The length of it should be the same as `quad`. If `NULL` is passed, the standard normal distribution is used. The default is `NULL`.

Value

`theta`	The estimated ability parameter values. If `ability_method = "MLE"` and an examinee receives a maximum or minimum score for all items, the function returns $\pm$ `Inf`.
`theta_se`	The standard errors of ability parameter estimates. It returns standard deviations of posteriors for `EAP`s and asymptotic standard errors (i.e., square root of inverse Fisher information) for `MLE`. If an examinee receives a maximum or minimum score for all items, the function returns `NA` for `MLE`.
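
The `EAP` option can be illustrated with a minimal base-R sketch, assuming a 3PL response function, the default quadrature `seq(-6, 6, length.out = 121)`, and a standard normal prior. The helpers `p3pl` and `eap_sketch` are hypothetical names introduced here for illustration; they are not part of the package and may differ from its internal computation.

```r
# Hypothetical helper: 3PL probability of a correct response.
p3pl <- function(theta, a, b, c) c + (1 - c) * plogis(a * (theta - b))

# Hypothetical helper: EAP estimate and posterior SD for one examinee.
eap_sketch <- function(response, item,
                       quad = seq(-6, 6, length.out = 121), prior = NULL) {
  if (is.null(prior)) prior <- dnorm(quad)  # standard normal prior
  prior <- prior / sum(prior)               # normalize over the grid
  # Likelihood of the response pattern at each quadrature point
  lik <- vapply(quad, function(th) {
    p <- p3pl(th, item[, 1], item[, 2], item[, 3])
    prod(p^response * (1 - p)^(1 - response))
  }, numeric(1))
  post <- lik * prior
  post <- post / sum(post)                  # discrete posterior
  theta <- sum(quad * post)                 # posterior mean (EAP)
  list(theta = theta,
       theta_se = sqrt(sum((quad - theta)^2 * post)))  # posterior SD
}

item <- matrix(c(1, -0.5, 0,
                 1.5, -1, 0,
                 1.2, 0, 0.2), nrow = 3, byrow = TRUE)
res <- eap_sketch(c(1, 1, 0), item)
```

Because the estimate is a posterior mean, it stays finite even for all-correct or all-incorrect response patterns, unlike the `MLE`.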

Author(s)

Seewoo Li [email protected]

Examples



# dichotomous

response <- c(1,1,0)
item <- matrix(
  c(
      1, -0.5,   0,
    1.5,   -1,   0,
    1.2,    0, 0.2
  ), nrow = 3, byrow = TRUE
)
adaptive_test(response, item, model = "dich", ability_method = "WLE")


# polytomous

response <- c(1,2,0)
item <- matrix(
    c(
      1, -0.5, 0.5,
    1.5,   -1,   0,
    1.2,    0, 0.4
    ), nrow = 3, byrow = TRUE
  )
adaptive_test(response, item, model="GPCM", ability_method = "WLE")


# mixed-format test

response <- list(c(0,0,0),c(2,2,1))
item <- list(
  matrix(
    c(
        1, -0.5, 0,
      1.5,   -1, 0,
      1.2,    0, 0
    ), nrow = 3, byrow = TRUE
  ),
  matrix(
    c(
        1, -0.5, 0.5,
      1.5,   -1,   0,
      1.2,    0, 0.4
    ), nrow = 3, byrow = TRUE
  )
)
adaptive_test(response, item, model = "GPCM", ability_method = "WLE")


# continuous response

response <- c(0.88, 0.68, 0.21)
item <- matrix(
  c(
    1, -0.5, 10,
    1.5,   -1,  8,
    1.2,    0, 11
  ), nrow = 3, byrow = TRUE
)
adaptive_test(response, item, model = "cont", ability_method = "WLE")


Model comparison

Description

Model comparison

Usage

## S3 method for class 'IRTest'
anova(...)

Arguments

`...`	Objects of "IRTest"-class to be compared.

Value

Model-fit indices and results of likelihood ratio test (LRT).

Author(s)

Seewoo Li [email protected]

Selecting the best model

Description

Selecting the best model

Usage

best_model(..., criterion = "HQ")

Arguments

`...`	Candidate models
`criterion`	The criterion to be used. The default is `HQ`.

Value

The best model and model-fit indices.
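
As a rough illustration of criterion-based selection, the sketch below computes common information criteria from a deviance (-2logL). It assumes that `HQ` refers to the Hannan-Quinn criterion in its usual form; `fit_indices` is a hypothetical helper, and the deviances, parameter counts, and sample size are made-up numbers rather than package output.

```r
# Hypothetical helper: information criteria from a deviance (-2 log-likelihood).
fit_indices <- function(deviance, n_par, n_obs) {
  c(AIC = deviance + 2 * n_par,
    BIC = deviance + n_par * log(n_obs),
    HQ  = deviance + 2 * n_par * log(log(n_obs)))  # Hannan-Quinn (assumed)
}

# Two made-up candidate models fitted to the same 500 examinees
m1 <- fit_indices(deviance = 10250, n_par = 20, n_obs = 500)
m2 <- fit_indices(deviance = 10235, n_par = 23, n_obs = 500)

# The candidate with the smallest criterion value is preferred
best <- which.min(c(m1 = m1[["HQ"]], m2 = m2[["HQ"]]))
names(best)
```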

Author(s)

Seewoo Li [email protected]

A recommendation for category collapsing of items based on item parameters

Description

In a polytomous item, one or more score categories may not have the highest probability among the categories anywhere in an acceptable $\theta$ range. In this case, the category may be regarded as redundant from a psychometric point of view and can be collapsed into another score category. This function returns a recommendation for a recategorization scheme based on item parameters.

Usage

cat_clps(item.matrix, range = c(-4, 4), increment = 0.005)

Arguments

`item.matrix`	A matrix of item parameters.
`range`	A range of $\theta$ to be evaluated. The default is `c(-4, 4)`.
`increment`	A width of the grid scheme. The default is `0.005`.

Value

A list of recommended recategorization for each item.
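
The idea of a category that is never the most probable can be sketched in base R. The sketch assumes the common GPCM parameterization, in which the numerator for category $k$ is $\exp\left(\sum_{v \le k} a(\theta - b_v)\right)$; `gpcm_prob` and `modal_categories` are hypothetical helpers for illustration, not the package's implementation of `cat_clps`.

```r
# Hypothetical helper: GPCM category probabilities at a single theta.
gpcm_prob <- function(theta, a, b) {
  z <- c(0, cumsum(a * (theta - b)))  # category 0 has numerator exp(0)
  ez <- exp(z - max(z))               # subtract the max for numerical stability
  ez / sum(ez)
}

# Hypothetical helper: categories that are modal somewhere on the theta grid.
modal_categories <- function(a, b, range = c(-4, 4), increment = 0.005) {
  grid <- seq(range[1], range[2], by = increment)
  modal <- vapply(grid,
                  function(th) which.max(gpcm_prob(th, a, b)) - 1L,
                  integer(1))
  sort(unique(modal))
}

ordered_b  <- modal_categories(a = 1, b = c(-1, 1))  # ordered thresholds
reversed_b <- modal_categories(a = 1, b = c(1, -1))  # reversed thresholds
```

With reversed thresholds the middle category never has the highest probability, which is the situation in which collapsing it into a neighboring category might be considered.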

Author(s)

Seewoo Li [email protected]

Extract Standard Errors of Model Coefficients

Description

Standard errors of model coefficients calculated by using Fisher information functions.

Usage

coef_se(object, complete = TRUE)

Arguments

`object`	An object for which the extraction of standard errors is meaningful.
`complete`	A logical value indicating if the full standard-error vector should be returned.

Value

Standard errors extracted from the model (object).

Extract Model Coefficients

Description

A generic function which extracts model coefficients from objects returned by modeling functions.

Usage

## S3 method for class 'IRTest'
coef(object, complete = TRUE, ...)

Arguments

`object`	An object for which the extraction of model coefficients is meaningful.
`complete`	A logical value indicating if the full coefficient vector should be returned.
`...`	Other arguments.

Value

Coefficients extracted from the model (object).

Generating an artificial item response dataset

Description

This function generates an artificial item response dataset allowing various options.

Usage

DataGeneration(
  seed = 1,
  N = 2000,
  nitem_D = 0,
  nitem_P = 0,
  nitem_C = 0,
  model_D = "2PL",
  model_P = "GPCM",
  latent_dist = "Normal",
  item_D = NULL,
  item_P = NULL,
  item_C = NULL,
  theta = NULL,
  prob = 0.5,
  d = 1.7,
  sd_ratio = 1,
  m = 0,
  s = 1,
  a_l = 0.8,
  a_u = 2.5,
  b_m = NULL,
  b_sd = NULL,
  c_l = 0,
  c_u = 0.2,
  categ = 5,
  possible_ans = c(0.1, 0.3, 0.5, 0.7, 0.9)
)

Arguments

`seed`	A numeric value that is used as the seed for random sampling. Fixing the seed makes the generated dataset reproducible.
`N`	A numeric value of the number of examinees.
`nitem_D`	A numeric value of the number of dichotomous items.
`nitem_P`	A numeric value of the number of polytomous items.
`nitem_C`	A numeric value of the number of continuous response items.
`model_D`	A vector or a character string that represents the probability model for the dichotomous items.
`model_P`	A character string that represents the probability model for the polytomous items.
`latent_dist`	A character string that determines the type of latent distribution. Currently available options are `"beta"` (four-parameter beta distribution; `betafunctions::rBeta.4P`), `"chi"` ( $\chi^2$ distribution; `rchisq`), `"normal"`, `"Normal"`, or `"N"` (standard normal distribution; `rnorm`), and `"Mixture"` or `"2NM"` (two-component Gaussian mixture distribution; see Li (2021) for details.)
`item_D`	An item parameter matrix for using fixed parameter values. The number of columns should be 3: `a` parameter for the first, `b` parameter for the second, and `c` parameter for the third column. Default is `NULL`.
`item_P`	An item parameter matrix for using fixed parameter values. The number of columns should be 7: `a` parameter for the first, and `b` parameters for the rest of the columns. Default is `NULL`.
`item_C`	An item parameter matrix for using fixed parameter values. The number of columns should be 3: `a` parameter for the first, `b` parameter for the second, and `nu` parameter for the third column. Default is `NULL`.
`theta`	An ability parameter vector for using fixed parameter values. Default is `NULL`.
`prob`	A numeric value for using `latent_dist = "2NM"`. It is the $\pi = \frac{n_1}{N}$ parameter of two-component Gaussian mixture distribution, where $n_1$ is the estimated number of examinees belonging to the first Gaussian component and $N$ is the total number of examinees (Li, 2021).
`d`	A numeric value for using `latent_dist = "2NM"`. It is the $\delta = \frac{\mu_2 - \mu_1}{\bar{\sigma}}$ parameter of two-component Gaussian mixture distribution, where $\mu_1$ and $\mu_2$ are the estimated means of the first and second Gaussian components, respectively, and $\bar{\sigma}$ is the overall standard deviation of the latent distribution (Li, 2021). Without loss of generality, $\mu_2 \ge \mu_1$ is assumed, thus $\delta \ge 0$ .
`sd_ratio`	A numeric value for using `latent_dist = "2NM"`. It is the $\zeta = \frac{\sigma_2}{\sigma_1}$ parameter of two-component Gaussian mixture distribution, where $\sigma_1$ and $\sigma_2$ are the estimated standard deviations of the first and second Gaussian components, respectively (Li, 2021).
`m`	A numeric value of the overall mean of the latent distribution. The default is 0.
`s`	A numeric value of the overall standard deviation of the latent distribution. The default is 1.
`a_l`	A numeric value. The lower bound of item discrimination parameters (a).
`a_u`	A numeric value. The upper bound of item discrimination parameters (a).
`b_m`	A numeric value. The mean of item difficulty parameters (b). If unspecified, `m` is passed on to the value.
`b_sd`	A numeric value. The standard deviation of item difficulty parameters (b). If unspecified, `s` is passed on to the value.
`c_l`	A numeric value. The lower bound of item guessing parameters (c).
`c_u`	A numeric value. The upper bound of item guessing parameters (c).
`categ`	A scalar or a numeric vector of length `nitem_P`. The default is 5. If `length(categ)>1`, the ith element equals the number of categories of the ith polytomous item.
`possible_ans`	Possible options for continuous items (e.g., 0.1, 0.3, 0.5, 0.7, 0.9)

Value

This function returns a list of several objects:

`theta`	A vector of ability parameters ( $\theta$ ).
`item_D`	A matrix of dichotomous item parameters.
`initialitem_D`	A matrix that contains initial item parameter values for dichotomous items.
`data_D`	A matrix of dichotomous item responses where rows indicate examinees and columns indicate items.
`item_P`	A matrix of polytomous item parameters.
`initialitem_P`	A matrix that contains initial item parameter values for polytomous items.
`data_P`	A matrix of polytomous item responses where rows indicate examinees and columns indicate items.
`item_C`	A matrix of continuous response item parameters.
`initialitem_C`	A matrix that contains initial item parameter values for continuous response items.
`data_C`	A matrix of continuous response item responses where rows indicate examinees and columns indicate items.

Author(s)

Seewoo Li [email protected]

References

Li, S. (2021). Using a two-component normal mixture distribution as a latent distribution in estimating parameters of item response models. Journal of Educational Evaluation, 34(4), 759-789.

Examples

# Dichotomous item responses

Alldata <- DataGeneration(N = 500,
                          nitem_D = 10)


# Polytomous item responses

Alldata <- DataGeneration(N = 1000,
                          nitem_P = 10)


# Mixed-format items

Alldata <- DataGeneration(N = 1000,
                          nitem_D = 20,
                          nitem_P = 10)

# Continuous items

Alldata <- DataGeneration(N = 1000,
                          nitem_C = 10)

# Dataset from non-normal latent density using two-component Gaussian mixture distribution

Alldata <- DataGeneration(N=1000,
                          nitem_P = 10,
                          latent_dist = "2NM",
                          d = 1.664,
                          sd_ratio = 2,
                          prob = 0.3)


Re-parameterized two-component normal mixture distribution

Description

Probability density for the re-parameterized two-component normal mixture distribution.

Usage

dist2(x, prob = 0.5, d = 0, sd_ratio = 1, overallmean = 0, overallsd = 1)

Arguments

`x`	A numeric vector. The location to evaluate the density function.
`prob`	A numeric value of $\pi = \frac{n_1}{N}$ parameter of two-component Gaussian mixture distribution, where $n_1$ is the estimated number of examinees belonging to the first Gaussian component and $N$ is the total number of examinees (Li, 2021).
`d`	A numeric value of $\delta = \frac{\mu_2 - \mu_1}{\bar{\sigma}}$ parameter of two-component Gaussian mixture distribution, where $\mu_1$ and $\mu_2$ are the estimated means of the first and second Gaussian components, respectively, and $\bar{\sigma}$ is the overall standard deviation of the latent distribution (Li, 2021). Without loss of generality, $\mu_2 \ge \mu_1$ is assumed, thus $\delta \ge 0$ .
`sd_ratio`	A numeric value of $\zeta = \frac{\sigma_2}{\sigma_1}$ parameter of two-component Gaussian mixture distribution, where $\sigma_1$ and $\sigma_2$ are the estimated standard deviations of the first and second Gaussian components, respectively (Li, 2021).
`overallmean`	A numeric value of $\bar{\mu}$ that determines the overall mean of two-component Gaussian mixture distribution.
`overallsd`	A numeric value of $\bar{\sigma}$ that determines the overall standard deviation of two-component Gaussian mixture distribution.

Details

The overall mean and overall standard deviation can be obtained from the original parameters as follows:

1) Overall mean ( $\bar{\mu}$ )

$\bar{\mu}=\pi\mu_1 + (1-\pi)\mu_2$

2) Overall standard deviation ( $\bar{\sigma}$ )

$\bar{\sigma}=\sqrt{\pi\sigma_{1}^{2}+(1-\pi)\sigma_{2}^{2}+\pi(1-\pi)(\mu_2-\mu_1)^2}$
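
Solving the two identities above for the component parameters gives $\mu_1 = \bar{\mu} - (1-\pi)\delta\bar{\sigma}$, $\mu_2 = \bar{\mu} + \pi\delta\bar{\sigma}$, $\sigma_1^2 = \bar{\sigma}^2\frac{1-\pi(1-\pi)\delta^2}{\pi+(1-\pi)\zeta^2}$, and $\sigma_2 = \zeta\sigma_1$. The following base-R sketch applies that algebra and checks it numerically. `mix_density` is a hypothetical helper, not the package's `dist2`, and it assumes $\pi(1-\pi)\delta^2 < 1$ so that the component variances are positive.

```r
# Hypothetical helper: density of the re-parameterized two-component mixture.
mix_density <- function(x, prob = 0.5, d = 0, sd_ratio = 1,
                        overallmean = 0, overallsd = 1) {
  # Component means from the overall mean and the standardized distance d
  mu1 <- overallmean - (1 - prob) * d * overallsd
  mu2 <- overallmean + prob * d * overallsd
  # Component SDs from the overall SD and the SD ratio
  s1 <- overallsd * sqrt((1 - prob * (1 - prob) * d^2) /
                           (prob + (1 - prob) * sd_ratio^2))
  s2 <- sd_ratio * s1
  prob * dnorm(x, mu1, s1) + (1 - prob) * dnorm(x, mu2, s2)
}

# Numerical check: total mass, overall mean, and overall variance
f <- function(x) mix_density(x, prob = 0.3, d = 1, sd_ratio = 0.5)
mass <- integrate(f, -Inf, Inf)$value
m <- integrate(function(x) x * f(x), -Inf, Inf)$value
v <- integrate(function(x) (x - m)^2 * f(x), -Inf, Inf)$value
```

With the default `overallmean = 0` and `overallsd = 1`, the integrals should return approximately 1, 0, and 1, whatever the shape parameters are.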

Value

The evaluated probability density value(s).

Author(s)

Seewoo Li [email protected]

References

Li, S. (2021). Using a two-component normal mixture distribution as a latent distribution in estimating parameters of item response models. Journal of Educational Evaluation, 34(4), 759-789.

Examples

# Evaluated density
dnst <- dist2(seq(-6,6,.1), prob = 0.3, d = 1, sd_ratio=0.5)

# Plot of the density
plot(seq(-6,6,.1), dnst)


Estimated factor scores

Description

Factor scores of examinees.

Usage

factor_score(x, ability_method = "EAP", quad = NULL, prior = NULL)

Arguments

`x`	A model fit object from either `IRTest_Dich`, `IRTest_Poly`, `IRTest_Cont`, or `IRTest_Mix`.
`ability_method`	The ability parameter estimation method. The available options are Expected a posteriori (`EAP`), Maximum Likelihood Estimates (`MLE`), and weighted likelihood estimates (`WLE`). The default is `EAP`.
`quad`	A vector of quadrature points for `EAP` calculation.
`prior`	A vector of the prior distribution for `EAP` calculation. The length of it should be the same as `quad`.

Value

`theta`	The estimated ability parameter values. If `ability_method = "MLE"` and an examinee receives a maximum or minimum score for all items, the function returns $\pm$ `Inf`.
`theta_se`	The standard errors of ability parameter estimates. It returns standard deviations of posteriors for `EAP`s and asymptotic standard errors (i.e., square root of inverse Fisher information) for `MLE`. If an examinee receives a maximum or minimum score for all items, the function returns `NA` for `MLE`.

Author(s)

Seewoo Li [email protected]

Examples


# A preparation of dichotomous item response data

data <- DataGeneration(N=500, nitem_D = 10)$data_D

# Analysis

M1 <- IRTest_Dich(data)

# Factor scores (ability estimates)

factor_score(M1, ability_method = "MLE")


Item information function

Description

Item information function

Usage

inform_f_item(x, test, item = 1, type = "d")

Arguments

`x`	A vector of $\theta$ value(s).
`test`	An object returned from an estimation function.
`item`	A natural number indicating the $n$ th item.
`type`	A character value for a mixed format test which determines the item type: `"d"` and `"p"` stand for a dichotomous and polytomous item, respectively.

Value

A vector of the evaluated item information values.

Author(s)

Seewoo Li [email protected]

Test information function

Description

Test information function

Usage

inform_f_test(x, test)

Arguments

`x`	A vector of $\theta$ value(s).
`test`	An object returned from an estimation function.

Value

A vector of test information values of the same length as x.
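
For intuition, item and test information can be sketched for the 2PL model, where the item information is $a^2 P(\theta)\{1-P(\theta)\}$ and the test information is the sum over items. `item_info_2pl` and `test_info_2pl` are hypothetical helpers and the item parameters are made up; the package functions work on fitted model objects and also cover other models.

```r
# Hypothetical helper: 2PL item information at theta.
item_info_2pl <- function(theta, a, b) {
  p <- plogis(a * (theta - b))
  a^2 * p * (1 - p)
}

# Hypothetical helper: test information as the sum of item information.
# `item` is a matrix with columns a (discrimination) and b (difficulty).
test_info_2pl <- function(theta, item) {
  info <- numeric(length(theta))
  for (j in seq_len(nrow(item))) {
    info <- info + item_info_2pl(theta, item[j, 1], item[j, 2])
  }
  info
}

item <- matrix(c(1, -0.5,
                 1.5, -1,
                 1.2, 0), nrow = 3, byrow = TRUE)
theta <- seq(-3, 3, by = 0.5)
info <- test_info_2pl(theta, item)
se <- 1 / sqrt(info)  # asymptotic standard error of the MLE at each theta
```

The last line ties the information function to the asymptotic standard error of the `MLE`, which is the square root of the inverse Fisher information.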

Author(s)

Seewoo Li [email protected]

Item and ability parameters estimation for continuous response items

Description

This function estimates IRT item and ability parameters when all items are scored continuously. Based on Bock & Aitkin's (1981) marginal maximum likelihood and EM algorithm (EM-MML), this function provides several latent distribution estimation algorithms which could free the normality assumption on the latent variable. If the normality assumption is violated, application of these latent distribution estimation methods could reflect non-normal characteristics of the unknown true latent distribution, thereby providing more accurate parameter estimates (Li, 2021; Woods & Lin, 2009; Woods & Thissen, 2006).

Usage

IRTest_Cont(
  data,
  model = 2,
  range = c(-6, 6),
  q = 121,
  initialitem = NULL,
  ability_method = "EAP",
  latent_dist = "Normal",
  max_iter = 200,
  threshold = 1e-04,
  bandwidth = "SJ-ste",
  h = NULL
)

Arguments

`data`	A matrix or data frame of item responses where responses are continuous values strictly between 0 and 1. Rows and columns indicate examinees and items, respectively.
`model`	A scalar or vector that represents types of item characteristic functions: `1`, `"1PL"`, `"Rasch"`, or `"RASCH"` for one-parameter logistic model, and `2`, `"2PL"` for two-parameter logistic model.
`range`	Range of the latent variable to be considered in the quadrature scheme. The default is from `-6` to `6`: `c(-6, 6)`.
`q`	A numeric value that represents the number of quadrature points. The default value is 121.
`initialitem`	A matrix of initial item parameter values for starting the estimation algorithm. The default value is `NULL`.
`ability_method`	The ability parameter estimation method. The available options are Expected a posteriori (`EAP`), Maximum Likelihood Estimates (`MLE`), and weighted likelihood estimates (`WLE`). The default is `EAP`.
`latent_dist`	A character string that determines latent distribution estimation method. Insert `"Normal"`, `"normal"`, or `"N"` for the normality assumption on the latent distribution, `"EHM"` for empirical histogram method (Mislevy, 1984; Mislevy & Bock, 1985), `"2NM"` or `"Mixture"` for using two-component Gaussian mixture distribution (Li, 2021; Mislevy, 1984), `"DC"` or `"Davidian"` for Davidian-curve method (Woods & Lin, 2009), `"KDE"` for kernel density estimation method (Li, 2022), and `"LLS"` for log-linear smoothing method (Casabianca & Lewis, 2015). The default value is set to `"Normal"` to follow the convention.
`max_iter`	A numeric value that determines the maximum number of iterations in the EM-MML. The default value is 200.
`threshold`	A numeric value that determines the threshold of EM-MML convergence. A maximum item parameter change is monitored and compared with the threshold. The default value is 0.0001.
`bandwidth`	A character value that can be used if `latent_dist = "KDE"`. This argument determines the bandwidth estimation method for `"KDE"`. The default value is `"SJ-ste"`. See `density` for available options.
`h`	A natural number less than or equal to 10 if `latent_dist = "DC" or "LLS"`. This argument determines the complexity of the distribution.

Details

The probability density of a response $u=x$ , where $0<x<1$ , is

$P(u=x | \theta, a, b, \nu) = \frac{1}{B(\mu\nu, \,\nu(1-\mu))} x^{\mu\nu-1} (1-x)^{\nu(1-\mu)-1}$

where $\mu = \frac{e^{a(\theta -b)}}{1+e^{a(\theta -b)}}$ .
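
The response model above is a beta density with shape parameters $\mu\nu$ and $\nu(1-\mu)$, so it can be evaluated with `dbeta` in base R. The following is a minimal sketch; `cont_density` is a hypothetical helper and the parameter values are chosen only for illustration.

```r
# Hypothetical helper: density of a continuous response u in (0, 1).
cont_density <- function(u, theta, a, b, nu) {
  mu <- plogis(a * (theta - b))  # expected response at theta
  dbeta(u, shape1 = mu * nu, shape2 = nu * (1 - mu))
}

# Density of a few responses for an examinee with theta = 0
dens <- cont_density(c(0.21, 0.68, 0.88), theta = 0, a = 1, b = -0.5, nu = 10)

# Numerical check: the density integrates to 1 and its mean equals mu
total <- integrate(function(u) cont_density(u, 0, 1, -0.5, 10), 0, 1)$value
mean_u <- integrate(function(u) u * cont_density(u, 0, 1, -0.5, 10), 0, 1)$value
```

Larger values of $\nu$ concentrate the responses more tightly around $\mu$, since the variance of this beta distribution is $\mu(1-\mu)/(\nu+1)$.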

Latent distribution estimation methods

1) Empirical histogram method

$P(\theta=X_k)=A(X_k)$

where $k=1, 2, ..., q$ , $X_k$ is the location of the $k$ th quadrature point, and $A(X_k)$ is a value of probability mass function evaluated at $X_k$ . Empirical histogram method thus has $q-1$ parameters.

2) Two-component Gaussian mixture distribution

$P(\theta=X)=\pi \phi(X; \mu_1, \sigma_1)+(1-\pi) \phi(X; \mu_2, \sigma_2)$

where $\phi(X; \mu, \sigma)$ is the value of a Gaussian component with mean $\mu$ and standard deviation $\sigma$ evaluated at $X$ .

3) Davidian curve method

$P(\theta=X)=\left\{\sum_{\lambda=0}^{h}{{m}_{\lambda}{X}^{\lambda}}\right\}^{2}\phi(X; 0, 1)$

where $h$ corresponds to the argument `h` and determines the degree of the polynomial.

4) Kernel density estimation method

$P(\theta=X)=\frac{1}{Nh}\sum_{j=1}^{N}{K\left(\frac{X-\theta_j}{h}\right)}$

where $N$ is the number of examinees, $\theta_j$ is the $j$ th examinee's ability parameter, $h$ is the bandwidth which corresponds to the argument `bandwidth`, and $K( \cdot )$ is a kernel function. The Gaussian kernel is used in this function.
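
The kernel density formula can be sketched in base R as follows. The sketch assumes simulated ability values in place of real estimates and uses `bw.SJ(method = "ste")` from the stats package for the bandwidth; how the package applies the formula inside the EM cycles may differ.

```r
set.seed(1)
theta_hat <- rnorm(500)               # stand-in for examinees' ability values
quad <- seq(-6, 6, length.out = 121)  # quadrature points

# Sheather-Jones bandwidth ("solve-the-equation" variant)
h <- bw.SJ(theta_hat, method = "ste")

# Gaussian-kernel density evaluated at each quadrature point
kde <- vapply(quad,
              function(X) mean(dnorm((X - theta_hat) / h)) / h,
              numeric(1))

# Discretized latent distribution on the quadrature grid
Ak <- kde / sum(kde)
```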

5) Log-linear smoothing method

$P(\theta=X_{q})=\exp{\left(\beta_{0}+\sum_{m=1}^{h}{\beta_{m}X_{q}^{m}}\right)}$

where $h$ is the hyperparameter which determines the smoothness of the density, and $\theta$ can take a total of $Q$ finite values ( $X_1, \dots ,X_q, \dots, X_Q$ ).
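
A small base-R sketch of the log-linear form is given below. It assumes the coefficients $\beta_1, \dots, \beta_h$ are already known and lets normalization over the quadrature points play the role of $\beta_0$; `lls_density` is a hypothetical helper, and estimating the coefficients is not shown.

```r
# Hypothetical helper: log-linear density on a quadrature grid.
lls_density <- function(quad, beta) {
  h <- length(beta)
  X <- outer(quad, seq_len(h), `^`)    # columns X, X^2, ..., X^h
  log_unnorm <- as.vector(X %*% beta)
  w <- exp(log_unnorm - max(log_unnorm))
  w / sum(w)                           # normalization absorbs beta_0
}

quad <- seq(-6, 6, length.out = 121)

# With h = 2 and beta = (0, -0.5) the result is a discretized standard normal
Ak <- lls_density(quad, beta = c(0, -0.5))
```

Higher values of $h$ add higher-order polynomial terms and therefore allow skewed or multimodal shapes.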

Value

This function returns a list of several objects:

`par_est`	The item parameter estimates.
`se`	The asymptotic standard errors for item parameter estimates.
`fk`	The estimated frequencies of examinees at quadrature points.
`iter`	The number of EM-MML iterations elapsed for the convergence.
`quad`	The location of quadrature points.
`diff`	The final value of the monitored maximum item parameter change.
`Ak`	The estimated discrete latent distribution. It is discrete (i.e., probability mass function) by the quadrature scheme.
`Pk`	The posterior probabilities of examinees at quadrature points.
`theta`	The estimated ability parameter values. If `ability_method = "MLE"`, the function returns $\pm$ `Inf` for examinees who answer all or none of the items correctly.
`theta_se`	The standard errors of ability estimates: the asymptotic standard errors for `ability_method = "MLE"` (the function returns `NA` for examinees who answer all or none of the items correctly) and the standard deviations of the posterior distributions for `ability_method = "EAP"`.
`logL`	The deviance (i.e., -2logL).
`density_par`	The estimated density parameters.
`Options`	A replication of input arguments and other information.

Author(s)

Seewoo Li [email protected]

References

Bock, R. D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46(4), 443-459.

Casabianca, J. M., & Lewis, C. (2015). IRT item parameter recovery with marginal maximum likelihood estimation using loglinear smoothing models. Journal of Educational and Behavioral Statistics, 40(6), 547-578.

Li, S. (2021). Using a two-component normal mixture distribution as a latent distribution in estimating parameters of item response models. Journal of Educational Evaluation, 34(4), 759-789.

Li, S. (2022). The effect of estimating latent distribution using kernel density estimation method on the accuracy and efficiency of parameter estimation of item response models [Master's thesis, Yonsei University, Seoul]. Yonsei University Library.

Mislevy, R. J. (1984). Estimating latent distributions. Psychometrika, 49(3), 359-381.

Mislevy, R. J., & Bock, R. D. (1985). Implementation of the EM algorithm in the estimation of item parameters: The BILOG computer program. In D. J. Weiss (Ed.). Proceedings of the 1982 item response theory and computerized adaptive testing conference (pp. 189-202). University of Minnesota, Department of Psychology, Computerized Adaptive Testing Conference.

Woods, C. M., & Lin, N. (2009). Item response theory with estimation of the latent density using Davidian curves. Applied Psychological Measurement, 33(2), 102-117.

Woods, C. M., & Thissen, D. (2006). Item response theory with estimation of the latent population distribution using spline-based densities. Psychometrika, 71(2), 281-301.

Examples


# Generating a continuous item response data
data <- DataGeneration(N = 1000, nitem_C = 10)$data_C

# Analysis
M1 <- IRTest_Cont(data, max_iter = 3) # increase `max_iter` in real analyses.



Item and ability parameters estimation for dichotomous items

Description

This function estimates IRT item and ability parameters when all items are scored dichotomously. Based on Bock & Aitkin's (1981) marginal maximum likelihood and EM algorithm (EM-MML), this function provides several latent distribution estimation algorithms which could free the normality assumption on the latent variable. If the normality assumption is violated, application of these latent distribution estimation methods could reflect non-normal characteristics of the unknown true latent distribution, and, thus, could provide more accurate parameter estimates (Li, 2021; Woods & Lin, 2009; Woods & Thissen, 2006).

Usage

IRTest_Dich(
  data,
  model = "2PL",
  range = c(-6, 6),
  q = 121,
  initialitem = NULL,
  ability_method = "EAP",
  latent_dist = "Normal",
  max_iter = 200,
  threshold = 1e-04,
  bandwidth = "SJ-ste",
  h = NULL
)

Arguments

`data`	A matrix or data frame of item responses where responses are coded as 0 or 1. Rows and columns indicate examinees and items, respectively.
`model`	A scalar or vector that represents types of item characteristic functions. Insert `1`, `"1PL"`, `"Rasch"`, or `"RASCH"` for one-parameter logistic model, `2`, `"2PL"` for two-parameter logistic model, and `3`, `"3PL"` for three-parameter logistic model. The default is `"2PL"`.
`range`	Range of the latent variable to be considered in the quadrature scheme. The default is from `-6` to `6`: `c(-6, 6)`.
`q`	A numeric value that represents the number of quadrature points. The default value is 121.
`initialitem`	A matrix of initial item parameter values for starting the estimation algorithm. The default value is `NULL`.
`ability_method`	The ability parameter estimation method. The available options are Expected a posteriori (`EAP`), Maximum Likelihood Estimates (`MLE`), and weighted likelihood estimates (`WLE`). The default is `EAP`.
`latent_dist`	A character string that determines latent distribution estimation method. Insert `"Normal"`, `"normal"`, or `"N"` for the normality assumption on the latent distribution, `"EHM"` for empirical histogram method (Mislevy, 1984; Mislevy & Bock, 1985), `"2NM"` or `"Mixture"` for using two-component Gaussian mixture distribution (Li, 2021; Mislevy, 1984), `"DC"` or `"Davidian"` for Davidian-curve method (Woods & Lin, 2009), `"KDE"` for kernel density estimation method (Li, 2022), and `"LLS"` for log-linear smoothing method (Casabianca & Lewis, 2015). The default value is set to `"Normal"` to follow the convention.
`max_iter`	A numeric value that determines the maximum number of iterations in the EM-MML. The default value is 200.
`threshold`	A numeric value that determines the threshold of EM-MML convergence. A maximum item parameter change is monitored and compared with the threshold. The default value is 0.0001.
`bandwidth`	A character value that can be used if `latent_dist = "KDE"`. This argument determines the bandwidth estimation method for `"KDE"`. The default value is `"SJ-ste"`. See `density` for available options.
`h`	A natural number less than or equal to 10 if `latent_dist = "DC" or "LLS"`. This argument determines the complexity of the distribution.

Details

The probabilities for a correct response ( $u=1$ )

1) One-parameter logistic (1PL) model

$P(u=1|\theta, b)=\frac{\exp{(\theta-b)}}{1+\exp{(\theta-b)}}$

2) Two-parameter logistic (2PL) model

$P(u=1|\theta, a, b)=\frac{\exp{(a(\theta-b))}}{1+\exp{(a(\theta-b))}}$

3) Three-parameter logistic (3PL) model

$P(u=1|\theta, a, b, c)=c + (1-c)\frac{\exp{(a(\theta-b))}}{1+\exp{(a(\theta-b))}}$
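
The three models are nested: the 2PL is the 3PL with $c=0$, and the 1PL is the 2PL with $a=1$. A minimal base-R sketch of the response function follows; `irf` is a hypothetical helper, not a package function.

```r
# Hypothetical helper: probability of a correct response under the 1PL, 2PL,
# or 3PL model, depending on which parameters are left at their defaults.
irf <- function(theta, a = 1, b = 0, c = 0) {
  c + (1 - c) * plogis(a * (theta - b))
}

theta <- c(-1, 0, 1)
p1 <- irf(theta, b = 0)                     # 1PL
p2 <- irf(theta, a = 1.2, b = 0)            # 2PL
p3 <- irf(theta, a = 1.2, b = 0, c = 0.2)   # 3PL
```

At $\theta = b$ the 1PL and 2PL give a probability of 0.5, while the 3PL gives $c + (1-c)/2$.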

Latent distribution estimation methods

1) Empirical histogram method

$P(\theta=X_k)=A(X_k)$

2) Two-component Gaussian mixture distribution

$P(\theta=X)=\pi \phi(X; \mu_1, \sigma_1)+(1-\pi) \phi(X; \mu_2, \sigma_2)$

where $\phi(X; \mu, \sigma)$ is the value of a Gaussian component with mean $\mu$ and standard deviation $\sigma$ evaluated at $X$ .

3) Davidian curve method

$P(\theta=X)=\left\{\sum_{\lambda=0}^{h}{{m}_{\lambda}{X}^{\lambda}}\right\}^{2}\phi(X; 0, 1)$

where $h$ corresponds to the argument `h` and determines the degree of the polynomial.

4) Kernel density estimation method

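$P(\theta=X)=\frac{1}{Nh}\sum_{j=1}^{N}{K\left(\frac{X-\theta_j}{h}\right)}$

where $N$ is the number of examinees, $\theta_j$ is the $j$ th examinee's ability parameter, $h$ is the bandwidth, which is selected by the method given in the argument `bandwidth` (it is not the argument `h`), and $K( \bullet )$ is a kernel function. The Gaussian kernel is used in this function.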

5) Log-linear smoothing method

$P(\theta=X_{q})=\exp{\left(\beta_{0}+\sum_{m=1}^{h}{\beta_{m}X_{q}^{m}}\right)}$

where $h$ is the hyperparameter that determines the smoothness of the density, and $\theta$ can take a total of $Q$ finite values ( $X_1, \dots ,X_q, \dots, X_Q$ ).
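
As an illustration of how a continuous density becomes a discrete distribution on the quadrature grid, the sketch below evaluates the two-component Gaussian mixture from 2) at equally spaced points and normalizes the result. The grid (121 points on $[-6, 6]$) and the mixture parameter values are assumptions chosen for illustration; they are not estimates, and this is not the package's internal code.

```r
# Quadrature grid (assumed: 121 equally spaced points on [-6, 6])
quad <- seq(-6, 6, length.out = 121)

# Illustrative mixture parameters (arbitrary values, not estimates)
p  <- 0.3            # mixing proportion pi
mu <- c(-1.2, 0.5)   # component means
s  <- c(0.6, 0.9)    # component standard deviations

# Mixture density evaluated at each quadrature point
dens <- p * dnorm(quad, mu[1], s[1]) + (1 - p) * dnorm(quad, mu[2], s[2])

# Normalize to obtain a probability mass at each quadrature point
Ak <- dens / sum(dens)

total_mass <- sum(Ak)          # equals 1 by construction
grid_mean  <- sum(quad * Ak)   # mean of the discretized distribution
```

The discretized mean stays close to the mixture mean $\pi\mu_1+(1-\pi)\mu_2$ because the grid is fine relative to the component standard deviations.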

Value

This function returns a list of several objects:

`par_est`	The item parameter estimates.
`se`	The asymptotic standard errors for item parameter estimates.
`fk`	The estimated frequencies of examinees at quadrature points.
`iter`	The number of EM-MML iterations elapsed for the convergence.
`quad`	The location of quadrature points.
`diff`	The final value of the monitored maximum item parameter change.
`Ak`	The estimated discrete latent distribution. It is discrete (i.e., probability mass function) by the quadrature scheme.
`Pk`	The posterior probabilities of examinees at quadrature points.
`theta`	The estimated ability parameter values. If `ability_method = "MLE"`, the function returns $\pm$ `Inf` for examinees with all-correct or all-incorrect responses.
`theta_se`	The standard errors of the ability estimates: the asymptotic standard errors for `ability_method = "MLE"` (the function returns `NA` for all-correct or all-incorrect responses) and the standard deviations of the posterior distributions for `ability_method = "EAP"`.
`logL`	The deviance (i.e., -2logL).
`density_par`	The estimated density parameters.
`Options`	A replication of input arguments and other information.

Author(s)

Seewoo Li

References

Bock, R. D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46(4), 443-459.

Li, S. (2021). Using a two-component normal mixture distribution as a latent distribution in estimating parameters of item response models. Journal of Educational Evaluation, 34(4), 759-789.

Mislevy, R. J. (1984). Estimating latent distributions. Psychometrika, 49(3), 359-381.

Woods, C. M., & Lin, N. (2009). Item response theory with estimation of the latent density using Davidian curves. Applied Psychological Measurement, 33(2), 102-117.

Woods, C. M., & Thissen, D. (2006). Item response theory with estimation of the latent population distribution using spline-based densities. Psychometrika, 71(2), 281-301.

Examples


# A preparation of dichotomous item response data

data <- DataGeneration(N=500,
                       nitem_D = 10)$data_D

# Analysis

M1 <- IRTest_Dich(data)

Item and ability parameters estimation for a mixed-format item response data

Description

This function estimates IRT item and ability parameters when a test consists of mixed-format items (i.e., a combination of dichotomous and polytomous items). In educational contexts, combining the two item formats is advantageous: the dichotomous format expedites scoring and is conducive to covering a broad domain, while the polytomous format (e.g., free-response items) encourages students to exercise complex cognitive skills (Lee et al., 2020). Based on Bock & Aitkin's (1981) marginal maximum likelihood and EM algorithm (EM-MML), this function incorporates several latent distribution estimation algorithms that can relax the normality assumption on the latent variable. If the normality assumption is violated, applying these latent distribution estimation methods could reflect some features of the unknown true latent distribution and thus could provide more accurate parameter estimates (Li, 2021; Woods & Lin, 2009; Woods & Thissen, 2006).

Usage

IRTest_Mix(
  data_D,
  data_P,
  model_D = "2PL",
  model_P = "GPCM",
  range = c(-6, 6),
  q = 121,
  initialitem_D = NULL,
  initialitem_P = NULL,
  ability_method = "EAP",
  latent_dist = "Normal",
  max_iter = 200,
  threshold = 1e-04,
  bandwidth = "SJ-ste",
  h = NULL
)

Arguments

`data_D`	A matrix or data frame of item responses where responses are coded as 0 or 1. Rows and columns indicate examinees and items, respectively.
`data_P`	A matrix or data frame of item responses coded as `0, 1, ..., m` for the `m+1` category item. Rows and columns indicate examinees and items, respectively.
`model_D`	A scalar or vector that represents types of item characteristic functions. Insert `1`, `"1PL"`, `"Rasch"`, or `"RASCH"` for one-parameter logistic model, `2`, `"2PL"` for two-parameter logistic model, and `3`, `"3PL"` for three-parameter logistic model. The default is `"2PL"`.
`model_P`	A character value for an IRT model to be applied. Currently, `PCM`, `GPCM`, and `GRM` are available. The default is `"GPCM"`.
`range`	Range of the latent variable to be considered in the quadrature scheme. The default is from `-6` to `6`: `c(-6, 6)`.
`q`	A numeric value that represents the number of quadrature points. The default value is 121.
`initialitem_D`	A matrix of initial item parameter values for starting the estimation algorithm. The default value is `NULL`.
`initialitem_P`	A matrix of initial item parameter values for starting the estimation algorithm. The default value is `NULL`.
`ability_method`	The ability parameter estimation method. The available options are Expected a posteriori (`EAP`), Maximum Likelihood Estimates (`MLE`), and weighted likelihood estimates (`WLE`). The default is `EAP`.
`latent_dist`	A character string that determines the latent distribution estimation method. Insert `"Normal"`, `"normal"`, or `"N"` for the normality assumption on the latent distribution, `"EHM"` for the empirical histogram method (Mislevy, 1984; Mislevy & Bock, 1985), `"2NM"` or `"Mixture"` for the two-component Gaussian mixture distribution (Li, 2021; Mislevy, 1984), `"DC"` or `"Davidian"` for the Davidian-curve method (Woods & Lin, 2009), `"KDE"` for the kernel density estimation method (Li, 2022), and `"LLS"` for the log-linear smoothing method (Casabianca & Lewis, 2015). The default value is `"Normal"`, following convention.
`max_iter`	A numeric value that determines the maximum number of iterations in the EM-MML. The default value is 200.
`threshold`	A numeric value that determines the threshold of EM-MML convergence. A maximum item parameter change is monitored and compared with the threshold. The default value is 0.0001.
`bandwidth`	A character value that can be used if `latent_dist = "KDE"`. This argument determines the bandwidth estimation method for `"KDE"`. The default value is `"SJ-ste"`. See `density` for available options.
`h`	A natural number less than or equal to 10, used if `latent_dist` is `"DC"` or `"LLS"`. This argument determines the complexity of the distribution.

Details

Dichotomous: the probabilities for a correct response ( $u=1$ )

1) One-parameter logistic (1PL) model

$P(u=1|\theta, b)=\frac{\exp{(\theta-b)}}{1+\exp{(\theta-b)}}$

2) Two-parameter logistic (2PL) model

$P(u=1|\theta, a, b)=\frac{\exp{(a(\theta-b))}}{1+\exp{(a(\theta-b))}}$

3) Three-parameter logistic (3PL) model

$P(u=1|\theta, a, b, c)=c + (1-c)\frac{\exp{(a(\theta-b))}}{1+\exp{(a(\theta-b))}}$

Polytomous: the probability for scoring $u=k$ (i.e., $k=0, 1, ..., m; m \ge 2$ )

1) Partial credit model (PCM)

$P(u=0|\theta, b_1, ..., b_{m})=\frac{1}{1+\sum_{c=1}^{m}{\exp{\left[\sum_{v=1}^{c}{(\theta-b_v)}\right]}}}$

$P(u=1|\theta, b_1, ..., b_{m})=\frac{\exp{(\theta-b_1)}}{1+\sum_{c=1}^{m}{\exp{\left[\sum_{v=1}^{c}{(\theta-b_v)}\right]}}}$

$\vdots$

$P(u=m|\theta, b_1, ..., b_{m})=\frac{\exp{\left[\sum_{v=1}^{m}{(\theta-b_v)}\right]}}{1+\sum_{c=1}^{m}{\exp{\left[\sum_{v=1}^{c}{(\theta-b_v)}\right]}}}$

2) Generalized partial credit model (GPCM)

$P(u=0|\theta, a, b_1, ..., b_{m})=\frac{1}{1+\sum_{c=1}^{m}{\exp{\left[\sum_{v=1}^{c}{a(\theta-b_v)}\right]}}}$

$P(u=1|\theta, a, b_1, ..., b_{m})=\frac{\exp{(a(\theta-b_1))}}{1+\sum_{c=1}^{m}{\exp{\left[\sum_{v=1}^{c}{a(\theta-b_v)}\right]}}}$

$\vdots$

$P(u=m|\theta, a, b_1, ..., b_{m})=\frac{\exp{\left[\sum_{v=1}^{m}{a(\theta-b_v)}\right]}}{1+\sum_{c=1}^{m}{\exp{\left[\sum_{v=1}^{c}{a(\theta-b_v)}\right]}}}$

3) Graded response model (GRM)

$P(u=0|\theta, a, b_1, ..., b_{m})=1-\frac{1}{1+\exp{\left[-a(\theta-b_1)\right]}}$

$P(u=1|\theta, a, b_1, ..., b_{m})=\frac{1}{1+\exp{\left[-a(\theta-b_1)\right]}}-\frac{1}{1+\exp{\left[-a(\theta-b_2)\right]}}$

$\vdots$

$P(u=m|\theta, a, b_1, ..., b_{m})=\frac{1}{1+\exp{\left[-a(\theta-b_m)\right]}}-0$
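
The PCM and GPCM category probabilities above share one structure: each numerator is the exponential of a cumulative sum, and the denominator is the sum of all numerators. A minimal base-R sketch for a single item follows; the name `gpcm_prob` and the parameter values are assumptions made for illustration, not the package's internal code.

```r
# Category probabilities under the GPCM for one item.
# `gpcm_prob` is a hypothetical helper; setting a = 1 gives the PCM.
gpcm_prob <- function(theta, a = 1, b) {
  # Exponents: 0 for u = 0, then cumulative sums of a * (theta - b_v)
  z <- c(0, cumsum(a * (theta - b)))
  # Subtracting max(z) avoids overflow and leaves the ratios unchanged
  num <- exp(z - max(z))
  num / sum(num)
}

# Four-category item (m = 3) with illustrative parameters
p_gpcm <- gpcm_prob(theta = 0.5, a = 1.2, b = c(-1, 0, 1))
p_pcm  <- gpcm_prob(theta = 0.5, b = c(-1, 0, 1))
```

The returned vector is ordered as $u=0, 1, \dots, m$, and its elements sum to one.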

Latent distribution estimation methods

1) Empirical histogram method

$P(\theta=X_k)=A(X_k)$

2) Two-component Gaussian mixture distribution

$P(\theta=X)=\pi \phi(X; \mu_1, \sigma_1)+(1-\pi) \phi(X; \mu_2, \sigma_2)$

where $\phi(X; \mu, \sigma)$ is the value of a Gaussian component with mean $\mu$ and standard deviation $\sigma$ evaluated at $X$ .

3) Davidian curve method

$P(\theta=X)=\left\{\sum_{\lambda=0}^{h}{{m}_{\lambda}{X}^{\lambda}}\right\}^{2}\phi(X; 0, 1)$

where $h$ corresponds to the argument `h` and determines the degree of the polynomial.

4) Kernel density estimation method

$P(\theta=X)=\frac{1}{Nh}\sum_{j=1}^{N}{K\left(\frac{X-\theta_j}{h}\right)}$

where $N$ is the number of examinees, $\theta_j$ is the $j$ th examinee's ability parameter, $h$ is the bandwidth, which is selected by the method given in the argument `bandwidth` (it is not the argument `h`), and $K( \bullet )$ is a kernel function. The Gaussian kernel is used in this function.

5) Log-linear smoothing method

$P(\theta=X_{q})=\exp{\left(\beta_{0}+\sum_{m=1}^{h}{\beta_{m}X_{q}^{m}}\right)}$

where $h$ is the hyperparameter that determines the smoothness of the density, and $\theta$ can take a total of $Q$ finite values ( $X_1, \dots ,X_q, \dots, X_Q$ ).
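
The kernel density formula in 4) can be written out directly in base R. The sketch below is illustrative only: `kde_gauss` is a hypothetical helper, the ability values are simulated stand-ins, and how the package handles $\theta_j$ internally is not shown here. The bandwidth is computed with `stats::bw.SJ(method = "ste")`, which corresponds to the `"SJ-ste"` option of `density`.

```r
# Gaussian-kernel density estimate evaluated at the points in X.
# `kde_gauss` is a hypothetical helper; `theta_j` stands in for ability values.
kde_gauss <- function(X, theta_j, h) {
  # (1 / (N h)) * sum_j K((X - theta_j) / h) with K the standard normal density
  sapply(X, function(x) mean(dnorm((x - theta_j) / h)) / h)
}

set.seed(1)
theta_j <- rnorm(500)                        # simulated abilities (illustration only)
h <- stats::bw.SJ(theta_j, method = "ste")   # Sheather-Jones bandwidth
quad <- seq(-6, 6, length.out = 121)         # assumed quadrature grid
dens <- kde_gauss(quad, theta_j, h)

# Riemann-sum check that the estimate integrates to approximately 1
area <- sum(dens) * (quad[2] - quad[1])
```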

Value

This function returns a list of several objects:

`par_est`	The list of item parameter estimates. The first and second objects are the matrices of dichotomous and polytomous item parameter estimates, respectively.
`se`	The list of standard errors of the item parameter estimates. The first and second objects are the matrices of standard errors of dichotomous and polytomous item parameter estimates, respectively.
`fk`	The estimated frequencies of examinees at quadrature points.
`iter`	The number of EM-MML iterations elapsed for the convergence.
`quad`	The location of quadrature points.
`diff`	The final value of the monitored maximum item parameter change.
`Ak`	The estimated discrete latent distribution. It is discrete (i.e., probability mass function) by the quadrature scheme.
`Pk`	The posterior probabilities of examinees at quadrature points.
`theta`	The estimated ability parameter values. If `ability_method = "MLE"` and an examinee receives the maximum or minimum score on every item, the function returns $\pm$ `Inf`.
`theta_se`	The standard errors of the ability estimates: the asymptotic standard errors for `ability_method = "MLE"` (the function returns `NA` for examinees with the maximum or minimum score on every item) and the standard deviations of the posterior distributions for `ability_method = "EAP"`.
`logL`	The deviance (i.e., -2logL).
`density_par`	The estimated density parameters.
`Options`	A replication of input arguments and other information.

Author(s)

Seewoo Li

References

Bock, R. D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46(4), 443-459.

Lee, W. C., Kim, S. Y., Choi, J., & Kang, Y. (2020). IRT Approaches to Modeling Scores on Mixed-Format Tests. Journal of Educational Measurement, 57(2), 230-254.

Li, S. (2021). Using a two-component normal mixture distribution as a latent distribution in estimating parameters of item response models. Journal of Educational Evaluation, 34(4), 759-789.

Mislevy, R. J. (1984). Estimating latent distributions. Psychometrika, 49(3), 359-381.

Woods, C. M., & Lin, N. (2009). Item response theory with estimation of the latent density using Davidian curves. Applied Psychological Measurement, 33(2), 102-117.

Woods, C. M., & Thissen, D. (2006). Item response theory with estimation of the latent population distribution using spline-based densities. Psychometrika, 71(2), 281-301.

Examples


# A preparation of mixed-format item response data

Alldata <- DataGeneration(N=1000,
                          nitem_D = 5,
                          nitem_P = 3)

DataD <- Alldata$data_D   # item response data for the dichotomous items
DataP <- Alldata$data_P   # item response data for the polytomous items

# Analysis

M1 <- IRTest_Mix(DataD, DataP)

Item and ability parameters estimation for polytomous items

Description

This function estimates IRT item and ability parameters when all items are scored polytomously. Based on Bock & Aitkin's (1981) marginal maximum likelihood and EM algorithm (EM-MML), this function provides several latent distribution estimation algorithms that can relax the normality assumption on the latent variable. If the normality assumption is violated, applying these latent distribution estimation methods could reflect non-normal characteristics of the unknown true latent distribution and thus could provide more accurate parameter estimates (Li, 2021; Woods & Lin, 2009; Woods & Thissen, 2006).

Usage

IRTest_Poly(
  data,
  model = "GPCM",
  range = c(-6, 6),
  q = 121,
  initialitem = NULL,
  ability_method = "EAP",
  latent_dist = "Normal",
  max_iter = 200,
  threshold = 1e-04,
  bandwidth = "SJ-ste",
  h = NULL
)

Arguments

`data`	A matrix or data frame of item responses coded as `0, 1, ..., m` for the `m+1` category item. Rows and columns indicate examinees and items, respectively.
`model`	A character value for an IRT model to be applied. Currently, `PCM`, `GPCM`, and `GRM` are available. The default is `"GPCM"`.
`range`	Range of the latent variable to be considered in the quadrature scheme. The default is from `-6` to `6`: `c(-6, 6)`.
`q`	A numeric value that represents the number of quadrature points. The default value is 121.
`initialitem`	A matrix of initial item parameter values for starting the estimation algorithm. The default value is `NULL`.
`ability_method`	The ability parameter estimation method. The available options are Expected a posteriori (`EAP`), Maximum Likelihood Estimates (`MLE`), and weighted likelihood estimates (`WLE`). The default is `EAP`.
`latent_dist`	A character string that determines the latent distribution estimation method. Insert `"Normal"`, `"normal"`, or `"N"` for the normality assumption on the latent distribution, `"EHM"` for the empirical histogram method (Mislevy, 1984; Mislevy & Bock, 1985), `"2NM"` or `"Mixture"` for the two-component Gaussian mixture distribution (Li, 2021; Mislevy, 1984), `"DC"` or `"Davidian"` for the Davidian-curve method (Woods & Lin, 2009), `"KDE"` for the kernel density estimation method (Li, 2022), and `"LLS"` for the log-linear smoothing method (Casabianca & Lewis, 2015). The default value is `"Normal"`, following convention.
`max_iter`	A numeric value that determines the maximum number of iterations in the EM-MML. The default value is 200.
`threshold`	A numeric value that determines the threshold of EM-MML convergence. A maximum item parameter change is monitored and compared with the threshold. The default value is 0.0001.
`bandwidth`	A character value that can be used if `latent_dist = "KDE"`. This argument determines the bandwidth estimation method for `"KDE"`. The default value is `"SJ-ste"`. See `density` for available options.
`h`	A natural number less than or equal to 10, used if `latent_dist` is `"DC"` or `"LLS"`. This argument determines the complexity of the distribution.

Details

The probability for scoring $u=k$ (i.e., $k=0, 1, ..., m; m \ge 2$ )

1) Partial credit model (PCM)

$P(u=0|\theta, b_1, ..., b_{m})=\frac{1}{1+\sum_{c=1}^{m}{\exp{\left[\sum_{v=1}^{c}{(\theta-b_v)}\right]}}}$

$P(u=1|\theta, b_1, ..., b_{m})=\frac{\exp{(\theta-b_1)}}{1+\sum_{c=1}^{m}{\exp{\left[\sum_{v=1}^{c}{(\theta-b_v)}\right]}}}$

$\vdots$

$P(u=m|\theta, b_1, ..., b_{m})=\frac{\exp{\left[\sum_{v=1}^{m}{(\theta-b_v)}\right]}}{1+\sum_{c=1}^{m}{\exp{\left[\sum_{v=1}^{c}{(\theta-b_v)}\right]}}}$

2) Generalized partial credit model (GPCM)

$P(u=0|\theta, a, b_1, ..., b_{m})=\frac{1}{1+\sum_{c=1}^{m}{\exp{\left[\sum_{v=1}^{c}{a(\theta-b_v)}\right]}}}$

$P(u=1|\theta, a, b_1, ..., b_{m})=\frac{\exp{(a(\theta-b_1))}}{1+\sum_{c=1}^{m}{\exp{\left[\sum_{v=1}^{c}{a(\theta-b_v)}\right]}}}$

$\vdots$

$P(u=m|\theta, a, b_1, ..., b_{m})=\frac{\exp{\left[\sum_{v=1}^{m}{a(\theta-b_v)}\right]}}{1+\sum_{c=1}^{m}{\exp{\left[\sum_{v=1}^{c}{a(\theta-b_v)}\right]}}}$

3) Graded response model (GRM)

$P(u=0|\theta, a, b_1, ..., b_{m})=1-\frac{1}{1+\exp{\left[-a(\theta-b_1)\right]}}$

$P(u=1|\theta, a, b_1, ..., b_{m})=\frac{1}{1+\exp{\left[-a(\theta-b_1)\right]}}-\frac{1}{1+\exp{\left[-a(\theta-b_2)\right]}}$

$\vdots$

$P(u=m|\theta, a, b_1, ..., b_{m})=\frac{1}{1+\exp{\left[-a(\theta-b_m)\right]}}-0$
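
The GRM probabilities above are differences of adjacent boundary curves, which makes them short to compute. The following base-R sketch handles one item; `grm_prob` and the parameter values are assumptions made for illustration, not the package's internal code.

```r
# Category probabilities under the GRM for one item.
# `grm_prob` is a hypothetical helper, not a function exported by IRTest.
grm_prob <- function(theta, a, b) {
  # Boundary curves 1 / (1 + exp(-a * (theta - b_k))) for k = 1, ..., m,
  # padded with 1 on the left and 0 on the right
  p_star <- c(1, stats::plogis(a * (theta - b)), 0)
  # Adjacent differences give P(u = 0), ..., P(u = m)
  -diff(p_star)
}

# Four-category item (m = 3) with ordered thresholds
p_grm <- grm_prob(theta = 0, a = 1.5, b = c(-1, 0, 1))
```

With ordered thresholds $b_1 \le \dots \le b_m$, every difference is non-negative, so the result is a valid probability vector.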

Latent distribution estimation methods

1) Empirical histogram method

$P(\theta=X_k)=A(X_k)$

2) Two-component Gaussian mixture distribution

$P(\theta=X)=\pi \phi(X; \mu_1, \sigma_1)+(1-\pi) \phi(X; \mu_2, \sigma_2)$

where $\phi(X; \mu, \sigma)$ is the value of a Gaussian component with mean $\mu$ and standard deviation $\sigma$ evaluated at $X$ .

3) Davidian curve method

$P(\theta=X)=\left\{\sum_{\lambda=0}^{h}{{m}_{\lambda}{X}^{\lambda}}\right\}^{2}\phi(X; 0, 1)$

where $h$ corresponds to the argument `h` and determines the degree of the polynomial.

4) Kernel density estimation method

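$P(\theta=X)=\frac{1}{Nh}\sum_{j=1}^{N}{K\left(\frac{X-\theta_j}{h}\right)}$

where $N$ is the number of examinees, $\theta_j$ is the $j$ th examinee's ability parameter, $h$ is the bandwidth, which is selected by the method given in the argument `bandwidth` (it is not the argument `h`), and $K( \bullet )$ is a kernel function. The Gaussian kernel is used in this function.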

5) Log-linear smoothing method

$P(\theta=X_{q})=\exp{\left(\beta_{0}+\sum_{m=1}^{h}{\beta_{m}X_{q}^{m}}\right)}$

where $h$ is the hyperparameter that determines the smoothness of the density, and $\theta$ can take a total of $Q$ finite values ( $X_1, \dots ,X_q, \dots, X_Q$ ).
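
To make the log-linear smoothing expression in 5) concrete, the sketch below evaluates it on a quadrature grid. The grid, the value of $h$, and the coefficients $\beta_1, \dots, \beta_h$ are arbitrary illustrative assumptions, and $\beta_0$ is assumed to be fixed by the requirement that the masses sum to one. This is not the package's estimation code.

```r
# Quadrature grid (assumed: 121 equally spaced points on [-6, 6])
quad <- seq(-6, 6, length.out = 121)

h    <- 4                          # number of polynomial terms
beta <- c(0, -0.5, 0.1, -0.02)     # beta_1, ..., beta_h (arbitrary values)

# Polynomial part: sum over m of beta_m * X_q^m for every grid point
lin <- as.vector(outer(quad, seq_len(h), `^`) %*% beta)

# beta_0 taken as the normalizing constant so that the masses sum to 1
beta0 <- -log(sum(exp(lin)))
Ak <- exp(beta0 + lin)

total_mass <- sum(Ak)
```

A larger $h$ adds higher-order terms and allows more flexible shapes, which is how the hyperparameter controls smoothness.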

Value

This function returns a list of several objects:

`par_est`	The item parameter estimates.
`se`	The asymptotic standard errors for item parameter estimates.
`fk`	The estimated frequencies of examinees at quadrature points.
`iter`	The number of EM-MML iterations elapsed for the convergence.
`quad`	The location of quadrature points.
`diff`	The final value of the monitored maximum item parameter change.
`Ak`	The estimated discrete latent distribution. It is discrete (i.e., probability mass function) by the quadrature scheme.
`Pk`	The posterior probabilities of examinees at quadrature points.
`theta`	The estimated ability parameter values. If `ability_method = "MLE"` and an examinee receives the maximum or minimum score on every item, the function returns $\pm$ `Inf`.
`theta_se`	The standard errors of the ability estimates: the asymptotic standard errors for `ability_method = "MLE"` (the function returns `NA` for examinees with the maximum or minimum score on every item) and the standard deviations of the posterior distributions for `ability_method = "EAP"`.
`logL`	The deviance (i.e., -2logL).
`density_par`	The estimated density parameters.
`Options`	A replication of input arguments and other information.

Author(s)

Seewoo Li

References

Bock, R. D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46(4), 443-459.

Li, S. (2021). Using a two-component normal mixture distribution as a latent distribution in estimating parameters of item response models. Journal of Educational Evaluation, 34(4), 759-789.

Mislevy, R. J. (1984). Estimating latent distributions. Psychometrika, 49(3), 359-381.

Woods, C. M., & Lin, N. (2009). Item response theory with estimation of the latent density using Davidian curves. Applied Psychological Measurement, 33(2), 102-117.

Woods, C. M., & Thissen, D. (2006). Item response theory with estimation of the latent population distribution using spline-based densities. Psychometrika, 71(2), 281-301.

Examples


# Preparation of polytomous item response data

data <- DataGeneration(N=1000,
                       nitem_P = 8)$data_P

# Analysis

M1 <- IRTest_Poly(data)

Item fit diagnostics

Description

This function analyzes and reports item-fit test results.

Usage

item_fit(x, bins = 10, bin.center = "mean")

Arguments

`x`	A model fit object from either `IRTest_Dich`, `IRTest_Poly`, or `IRTest_Mix`.
`bins`	The number of bins to be used for calculating the statistics. Following Yen's $Q_{1}$ (1981), the default is 10.
`bin.center`	A method for calculating the center of each bin. Following Yen's $Q_{1}$ (1981), the default is `"mean"`. Use `"median"` for Bock's $\chi^{2}$ (1960).

Details

Bock's $\chi^{2}$ (1960) or Yen's $Q_{1}$ (1981) is currently available.
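
For orientation, the commonly cited form of Yen's $Q_{1}$ for one dichotomous item is $\sum_{k} N_k (O_k - E_k)^2 / [E_k(1-E_k)]$, where examinees are sorted by ability into bins, $O_k$ is the observed proportion correct in bin $k$, and $E_k$ is the mean model-implied probability in that bin. The sketch below follows that textbook form under stated assumptions: `yen_q1` is a hypothetical helper, the data are simulated, and the package's own computation (for example, its use of estimated abilities and degrees of freedom) may differ. For Bock's $\chi^{2}$, $E_k$ would instead be evaluated at the median ability of each bin.

```r
# Textbook-style Yen's Q1 for a single dichotomous item.
# `yen_q1` is a hypothetical helper, not the package's item_fit().
yen_q1 <- function(u, theta, prob_fun, bins = 10) {
  # Assign examinees to bins of roughly equal size by ability rank
  bin <- cut(rank(theta, ties.method = "first"), breaks = bins, labels = FALSE)
  N_k <- tapply(u, bin, length)               # bin sizes
  O_k <- tapply(u, bin, mean)                 # observed proportion correct
  E_k <- tapply(prob_fun(theta), bin, mean)   # mean model-implied probability
  sum(N_k * (O_k - E_k)^2 / (E_k * (1 - E_k)))
}

set.seed(1)
theta <- rnorm(1000)                                   # simulated abilities
p2pl  <- function(t) stats::plogis(1.2 * (t - 0.3))    # illustrative 2PL item
u     <- rbinom(1000, 1, p2pl(theta))                  # simulated responses
q1    <- yen_q1(u, theta, p2pl)
```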

Value

This function returns a matrix of item-fit test results.

Author(s)

Seewoo Li

References

Bock, R. D. (1960). Methods and applications of optimal scaling. Chapel Hill, NC: L.L. Thurstone Psychometric Laboratory.

Yen, W. M. (1981). Using simulation results to choose a latent trait model. Applied Psychological Measurement, 5(2), 245-262.

Examples


# A preparation of dichotomous item response data

data <- DataGeneration(N=500,
                       nitem_D = 10)$data_D

# Analysis

M1 <- IRTest_Dich(data)

# Item fit statistics

item_fit(M1)

Latent density function

Description

Density function of the estimated latent distribution with mean and standard deviation equal to 0 and 1, respectively.

Usage

latent_distribution(x, model.fit)

Arguments

`x`	A numeric vector. Value(s) on the $\theta$ scale for evaluating the PDF.
`model.fit`	An object returned from an estimation function.

Value

The values of the PDF evaluated at `x`. The returned vector has the same length as `x`.

Examples


# Data generation and model fitting
data <- DataGeneration(N=1000,
                       nitem_D = 15,
                       latent_dist = "2NM",
                       d = 1.664,
                       sd_ratio = 2,
                       prob = 0.3)$data_D

M1 <- IRTest_Dich(data = data, latent_dist = "KDE")

# Plotting the latent distribution
ggplot2::ggplot()+
  ggplot2::stat_function(fun=latent_distribution, args=list(M1))+
  ggplot2::lims(x=c(-6,6), y=c(0,0.5))

Extract Log-Likelihood

Description

Extract Log-Likelihood

Usage

## S3 method for class 'IRTest'
logLik(object, ...)

Arguments

`object`	An `IRTest`-class object from which a log-likelihood value is extracted.
`...`	Other arguments.

Value

Extracted log-likelihood.

Recovering original parameters of two-component Gaussian mixture distribution from re-parameterized values

Description

Recovering original parameters of two-component Gaussian mixture distribution from re-parameterized values

Usage

original_par_2GM(
  prob = 0.5,
  d = 0,
  sd_ratio = 1,
  overallmean = 0,
  overallsd = 1
)

Arguments

`prob`	The $\pi = \frac{n_1}{N}$ parameter of two-component Gaussian mixture distribution, where $n_1$ is the estimated number of examinees belonging to the first Gaussian component and $N$ is the total number of examinees (Li, 2021).
`d`	The $\delta = \frac{\mu_2 - \mu_1}{\bar{\sigma}}$ parameter of two-component Gaussian mixture distribution, where $\mu_1$ and $\mu_2$ are the estimated means of the first and second Gaussian components, respectively, and $\bar{\sigma}$ is the overall standard deviation of the latent distribution (Li, 2021). Without loss of generality, $\mu_2 \ge \mu_1$ is assumed, thus $\delta \ge 0$.
`sd_ratio`	A numeric value of $\zeta = \frac{\sigma_2}{\sigma_1}$ parameter of two-component Gaussian mixture distribution, where $\sigma_1$ and $\sigma_2$ are the estimated standard deviations of the first and second Gaussian components, respectively (Li, 2021).
`overallmean`	A numeric value of $\bar{\mu}$ that determines the overall mean of two-component Gaussian mixture distribution.
`overallsd`	A numeric value of $\bar{\sigma}$ that determines the overall standard deviation of two-component Gaussian mixture distribution.

Details

Original two-component Gaussian mixture distribution

$f(x)=\pi\times \phi(x | \mu_1, \sigma_1)+(1-\pi)\times \phi(x | \mu_2, \sigma_2)$

where $\phi$ is a Gaussian component.

Re-parameterized two-component Gaussian mixture distribution

$f(x)=2GM(x|\pi, \delta, \zeta, \bar{\mu}, \bar{\sigma})$

where $\bar{\mu}$ is the overall mean and $\bar{\sigma}$ is the overall standard deviation of the distribution.

The original parameters retrieved from re-parameterized values

1) Mean of the first Gaussian component (m1).

$\mu_1=-(1-\pi)\delta\bar{\sigma}+\bar{\mu}$

2) Mean of the second Gaussian component (m2).

$\mu_2=\pi\delta\bar{\sigma}+\bar{\mu}$

3) Standard deviation of the first Gaussian component (s1).

$\sigma_1^2=\bar{\sigma}^2\left(\frac{1-\pi(1-\pi)\delta^2}{\pi+(1-\pi)\zeta^2}\right)$

4) Standard deviation of the second Gaussian component (s2).

$\sigma_2^2=\bar{\sigma}^2\left(\frac{1-\pi(1-\pi)\delta^2}{\frac{1}{\zeta^2}\pi+(1-\pi)}\right)=\zeta^2\sigma_1^2$
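
The four formulas above can be checked numerically: plugging the recovered parameters back into the mixture should reproduce the overall mean and standard deviation. The base-R sketch below transcribes the formulas under the hypothetical name `recover_2gm`; it is an illustration, not the source of `original_par_2GM`. The input values are the ones used elsewhere in this manual's examples.

```r
# Direct transcription of the four formulas above.
# `recover_2gm` is a hypothetical name used for illustration only.
recover_2gm <- function(prob = 0.5, d = 0, sd_ratio = 1,
                        overallmean = 0, overallsd = 1) {
  m1 <- -(1 - prob) * d * overallsd + overallmean
  m2 <- prob * d * overallsd + overallmean
  s1 <- sqrt(overallsd^2 * (1 - prob * (1 - prob) * d^2) /
               (prob + (1 - prob) * sd_ratio^2))
  s2 <- sd_ratio * s1
  c(m1 = m1, m2 = m2, s1 = s1, s2 = s2)
}

par <- recover_2gm(prob = 0.3, d = 1.664, sd_ratio = 2)

# Moments of the mixture implied by the recovered parameters
mix_mean <- 0.3 * par[["m1"]] + 0.7 * par[["m2"]]
mix_var  <- 0.3 * (par[["s1"]]^2 + par[["m1"]]^2) +
  0.7 * (par[["s2"]]^2 + par[["m2"]]^2) - mix_mean^2
```

Here `mix_mean` and `mix_var` come out as the requested overall mean (0) and variance (1) up to floating-point error. The formulas require $1-\pi(1-\pi)\delta^2 > 0$; otherwise the component variances would not be positive.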

Value

This function returns a vector of length 4: `c(m1, m2, s1, s2)`.

`m1`	The location parameter (mean) of the first Gaussian component.
`m2`	The location parameter (mean) of the second Gaussian component.
`s1`	The scale parameter (standard deviation) of the first Gaussian component.
`s2`	The scale parameter (standard deviation) of the second Gaussian component.

Author(s)

Seewoo Li

References

Li, S. (2021). Using a two-component normal mixture distribution as a latent distribution in estimating parameters of item response models. Journal of Educational Evaluation, 34(4), 759-789.

Plot of item response functions

Description

This function draws item response functions of an item of the fitted model.

Usage

plot_item(x, item.number = 1, type = NULL)

Arguments

`x`	A model fit object from either `IRTest_Dich`, `IRTest_Poly`, `IRTest_Cont`, or `IRTest_Mix`.
`item.number`	A numeric value indicating the item number.
`type`	A character string required if `inherits(x, c("mix")) == TRUE`. It should be either `"d"` (dichotomous item) or `"p"` (polytomous item); `item.number=1, type="d"` indicates the first dichotomous item.

Value

This function returns a plot of item response functions.

Author(s)

Seewoo Li

Examples


# A preparation of dichotomous item response data

data <- DataGeneration(N=500, nitem_D = 10)$data_D

# Analysis

M1 <- IRTest_Dich(data)

# Plotting item response function

plot_item(M1, item.number = 1)

Plot of the estimated latent distribution

Description

This function draws a plot of the estimated latent distribution (the population distribution of the latent variable).

Usage

## S3 method for class 'IRTest'
plot(x, ...)

Arguments

`x`	An object of `"IRTest"`-class obtained from either `IRTest_Dich`, `IRTest_Poly`, `IRTest_Cont`, or `IRTest_Mix`.
`...`	Other aesthetic argument(s) for drawing the plot. Arguments are passed on to `ggplot2::stat_function`, if the distribution estimation method is 2NM, KDE, or DC. Otherwise, they are passed on to `ggplot2::geom_line`.

Value

A plot of estimated latent distribution.

Author(s)

Seewoo Li

Examples


# Data generation and model fitting

data <- DataGeneration(N=1000,
                       nitem_D = 15,
                       latent_dist = "2NM",
                       d = 1.664,
                       sd_ratio = 2,
                       prob = 0.3)$data_D

M1 <- IRTest_Dich(data = data, latent_dist = "KDE")

# Plotting the latent distribution

plot(x = M1, linewidth = 1, color = 'red') +
  ggplot2::lims(x = c(-6, 6), y = c(0, .5))

Printing the result

Description

This function prints the summarized information.

Usage

## S3 method for class 'IRTest'
print(x, ...)

Arguments

`x`	An object of `"IRTest"`-class obtained from either `IRTest_Dich`, `IRTest_Poly`, or `IRTest_Mix`.
`...`	Additional arguments (currently non-functioning).

Value

Text printed to the console that recommends using the `summary` function and accessing the details directly with the `$` operator.

Author(s)

Seewoo Li

Examples


data <- DataGeneration(N=1000, nitem_P = 8)$data_P

M1 <- IRTest_Poly(data = data, latent_dist = "KDE")

M1

Printing the summary

Description

This function prints the summarized information.

Usage

## S3 method for class 'IRTest_summary'
print(x, ...)

Arguments

`x`	An object returned from `summary.IRTest`.
`...`	Additional arguments (currently non-functioning).

Value

Summarized texts on the console.

Author(s)

Seewoo Li

Examples


data <- DataGeneration(N=1000, nitem_P = 8)$data_P

M1 <- IRTest_Poly(data = data,
                  latent_dist = "2NM")

summary(M1)

Recategorization of data using a new categorization scheme

Description

Given a recategorization scheme, this function recategorizes the input data accordingly.

Usage

recategorize(data, new_cat)

Arguments

`data`	An item response matrix.
`new_cat`	A list specifying the new categorization scheme.

Value

Recategorized data
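
The idea of recategorization can be illustrated with a minimal base-R sketch. This is not the package's implementation: `recategorize_sketch` is a hypothetical helper, and the layout of the scheme (one vector per item, whose k-th entry gives the new category for original category k - 1) is an assumption made only for illustration.

```r
# Hypothetical helper for illustration only; not the package's recategorize()
recategorize_sketch <- function(data, new_cat) {
  out <- data
  for (j in seq_len(ncol(data))) {
    # Original categories are assumed to be coded 0, 1, 2, ...
    # so category k is looked up at position k + 1 of the scheme vector
    out[, j] <- new_cat[[j]][data[, j] + 1]
  }
  out
}

# Toy response matrix: 4 respondents, 2 items, categories 0 to 3
toy <- matrix(c(0, 1, 2, 3,
                3, 2, 1, 0), ncol = 2)

# Assumed scheme: item 1 merges categories 1 and 2; item 2 is left unchanged
scheme <- list(c(0, 1, 1, 2), c(0, 1, 2, 3))

result <- recategorize_sketch(toy, scheme)
result
```

In this toy case the first item ends up with three categories instead of four, while the second item is untouched.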

Author(s)

Seewoo Li [email protected]

Examples


# Preparation of polytomous item response data

data <- DataGeneration(N=1000,
                       nitem_P = 8)$data_P

# Analysis

M1 <- IRTest_Poly(data)

# Recommendation of category collapsing

new_cat <- cat_clps(M1$par_est)

# Recategorization of data

recategorize(data, new_cat)

Marginal reliability coefficient of IRT

Description

Marginal reliability coefficient of IRT

Usage

reliability(x)

Arguments

`x`	A model fit object from `IRTest_Dich`, `IRTest_Poly`, `IRTest_Cont`, or `IRTest_Mix`.

Details

Reliability coefficient on summed-score scale

In accordance with the concept of reliability in classical test theory (CTT), this function calculates the IRT reliability coefficients.

The basic concept and formula of the reliability coefficient can be expressed as follows (Kim & Feldt, 2010):

An observed score of Item $i$, $X_i$, is decomposed as the sum of a true score $T_i$ and an error $e_i$. Then, with the assumption of $\sigma_{T_{i}e_{j}}=\sigma_{e_{i}e_{j}}=0$, the reliability coefficient of a test is defined as:

$\rho_{TX}^{2}=\rho_{XX^{'}}=\frac{\sigma_{T}^{2}}{\sigma_{X}^{2}}=\frac{\sigma_{T}^{2}}{\sigma_{T}^{2}+\sigma_{e}^{2}}=1-\frac{\sigma_{e}^{2}}{\sigma_{X}^{2}}$

See May and Nicewander (1994) for the specific formula used in this function.
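
The variance-ratio definition above can be checked numerically. The following is a minimal base-R simulation, not the formula used by `reliability`: the variances ($\sigma_{T}^{2}=4$, $\sigma_{e}^{2}=1$) and the sample size are assumptions chosen only for illustration.

```r
# Illustrative simulation with assumed variances; not part of IRTest
set.seed(1)
n <- 100000
true_score <- rnorm(n, mean = 0, sd = 2)   # sigma_T^2 = 4 (assumed)
x1 <- true_score + rnorm(n, sd = 1)        # form X, sigma_e^2 = 1 (assumed)
x2 <- true_score + rnorm(n, sd = 1)        # parallel form X' with independent errors

rho_theory   <- 4 / (4 + 1)                # sigma_T^2 / (sigma_T^2 + sigma_e^2)
rho_ratio    <- var(true_score) / var(x1)  # sample version of sigma_T^2 / sigma_X^2
rho_parallel <- cor(x1, x2)                # sample version of rho_XX'

round(c(theory = rho_theory, ratio = rho_ratio, parallel = rho_parallel), 3)
```

With a sample this large, both sample quantities should fall close to the theoretical value, which shows that the variance ratio and the parallel-forms correlation describe the same coefficient.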

Reliability coefficient on $\theta$ scale

For the coefficient on the $\theta$ scale, this function calculates the parallel-forms reliability (Green et al., 1984; Kim, 2012):

$\rho_{\hat{\theta} \hat{\theta}^{'}} =\frac{\sigma_{E\left(\hat{\theta}\mid \theta \right )}^{2}}{\sigma_{E\left(\hat{\theta}\mid \theta \right )}^{2}+E\left( \sigma_{\hat{\theta}|\theta}^{2} \right)} =\frac{1}{1+E\left(I\left(\hat{\theta}\right)^{-1}\right)}$

This assumes that $\sigma_{E\left(\hat{\theta}\mid \theta \right )}^{2}=\sigma_{\theta}^{2}=1$. Although the formula is often employed in IRT studies and applications, this underlying assumption may not hold.
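
As a rough illustration of the last expression, the sketch below approximates $E\left(I^{-1}\right)$ by quadrature over a standard normal latent distribution. It is written in base R under stated assumptions and is not the package's implementation: the items follow a two-parameter logistic model, the slopes `a` and difficulties `b` are made-up values, and information is evaluated on a grid of $\theta$ values as a stand-in for $I\left(\hat{\theta}\right)$.

```r
# Illustrative sketch in base R; item parameters are made up, not IRTest output
a <- c(1.2, 0.8, 1.5, 1.0, 0.9, 1.3, 1.1, 0.7, 1.4, 1.0)  # hypothetical slopes
b <- seq(-2, 2, length.out = 10)                          # hypothetical difficulties

# Test information of a 2PL test at ability theta: sum_i a_i^2 * P_i * (1 - P_i)
test_info <- function(theta) {
  p <- plogis(a * (theta - b))
  sum(a^2 * p * (1 - p))
}

# Quadrature approximation of the standard normal latent distribution
theta_grid <- seq(-6, 6, length.out = 121)
weights <- dnorm(theta_grid)
weights <- weights / sum(weights)

# Expected error variance E(I^{-1}) and the resulting reliability 1 / (1 + E(I^{-1}))
info <- vapply(theta_grid, test_info, numeric(1))
expected_error_var <- sum(weights / info)
rel_theta <- 1 / (1 + expected_error_var)
rel_theta
```

Because the expected error variance is positive, the coefficient always lies strictly between 0 and 1, and it increases as the test becomes more informative over the region where the latent distribution has most of its mass.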

Value

Estimated marginal reliability coefficients.

Author(s)

Seewoo Li [email protected]

References

Green, B.F., Bock, R.D., Humphreys, L.G., Linn, R.L., & Reckase, M.D. (1984). Technical guidelines for assessing computerized adaptive tests. Journal of Educational Measurement, 21(4), 347–360.

Kim, S. (2012). A note on the reliability coefficients for item response model-based ability estimates. Psychometrika, 77(1), 153–162.

Kim, S., & Feldt, L.S. (2010). The estimation of the IRT reliability coefficient and its lower and upper bounds, with comparisons to CTT reliability statistics. Asia Pacific Education Review, 11, 179–188.

May, K., & Nicewander, A.W. (1994). Reliability and information functions for percentile ranks. Journal of Educational Measurement, 31(4), 313–325.

Examples


data <- DataGeneration(N=500, nitem_D = 10)$data_D

# Analysis

M1 <- IRTest_Dich(data)


# Reliability coefficients
reliability(M1)

Summary of the results

Description

This function summarizes the output (e.g., convergence of the estimation algorithm, number of parameters, model-fit, and estimated latent distribution).

Usage

## S3 method for class 'IRTest'
summary(object, ...)

Arguments

`object`	An object of class `"IRTest"` obtained from `IRTest_Dich`, `IRTest_Poly`, or `IRTest_Mix`.
`...`	Other argument(s).

Value

Summarized information.

Examples


data <- DataGeneration(N=1000, nitem_P = 8)$data_P

M1 <- IRTest_Poly(data = data, latent_dist = "KDE")

summary(M1)
