Complete Q‐matrices in conjunctive models on general attribute structures

In cognitive diagnostic assessment, a property of the Q-matrix, usually referred to as completeness, guarantees that the cognitive attributes underlying the observed behaviour can be uniquely assessed. Characterizations of completeness were first derived under the assumption of independent attributes, and are currently under investigation for interdependent attributes. The dominant approach considers so-called attribute hierarchies, which are conceptualized through a partial order on the set of attributes. The present paper extends previously published results on this issue, which were obtained for conjunctive attribute hierarchy models. Drawing upon results from knowledge structure theory, it provides novel necessary and sufficient conditions for completeness of the Q-matrix, not only for conjunctive models on attribute hierarchies, but also for such models on more general attribute structures.
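
As a point of reference, completeness under a conjunctive (DINA-type) ideal response rule can be checked by brute force for small Q-matrices: the Q-matrix is complete for a given set of admissible attribute profiles exactly when distinct profiles produce distinct ideal response vectors. The sketch below assumes independent attributes (all 2^K profiles admissible) and only illustrates this definition; it is not the paper's characterization for general attribute structures.

```python
import itertools
import numpy as np

def ideal_response(q_matrix, alpha):
    """Conjunctive (DINA-type) ideal response: item j is answered correctly
    only if every attribute required by row j of Q is possessed."""
    return np.all(alpha >= q_matrix, axis=1).astype(int)

def is_complete(q_matrix, attribute_patterns):
    """Q is complete (for the given admissible attribute patterns) iff
    distinct patterns yield distinct ideal response vectors."""
    responses = {tuple(ideal_response(q_matrix, np.array(a))) for a in attribute_patterns}
    return len(responses) == len(attribute_patterns)

# Example with K = 3 independent attributes: all 2^K profiles are admissible.
K = 3
all_profiles = list(itertools.product([0, 1], repeat=K))
Q_identity = np.eye(K, dtype=int)             # contains each single-attribute item
print(is_complete(Q_identity, all_profiles))   # True: identity submatrix => complete
Q_defective = np.array([[1, 1, 0], [0, 1, 1], [1, 0, 1]])
print(is_complete(Q_defective, all_profiles))  # False: some profiles are indistinguishable
```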

Distance‐based logistic model for cross‐classified categorical data

Logistic regression models are a powerful research tool for the analysis of cross-classified data in which a categorical response variable is involved. In a logistic model, the effect of a covariate refers to odds, and the simple relationship between the coefficients and the odds ratio often makes these the parameters of interest because of their easy interpretation. In this article we present a distance-based logistic model that allows a simple graphical interpretation of the association coefficients using the odds ratio in a contingency table. Two configurations are estimated, one for the rows and one for the columns, as the categories of a polytomous predictor and a nominal response variable respectively, such that the local odds ratios and the distances between the predictor and response categories are inversely related. The associations in terms of the odds ratios, or the ratios of the odds to their geometric means, are interpreted through distances for the most common coding schemes of the predictor variable, and the relationship between the distances associated with different codings is investigated in full dimensionality. The performance of the estimation procedure is analysed with a Monte Carlo experiment. The interpretation of the model and its performance, as well as a comparison with a two-step procedure involving first a logistic regression and then unfolding, are illustrated using real data sets.
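
To make the stated inverse relation concrete, the following sketch assumes a hypothetical squared-distance parameterization of the conditional response probabilities, P(Y = j | X = i) proportional to exp(-||x_i - y_j||²); under that assumption the local log odds ratio of adjacent categories equals a signed combination of squared row-column distances. This illustrates the kind of relationship the abstract describes and is not the paper's estimation procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 2-D coordinates for 4 predictor (row) and 3 response (column) categories.
rows = rng.normal(size=(4, 2))
cols = rng.normal(size=(3, 2))

# Assumed parameterization for illustration only:
# P(Y = j | X = i) proportional to exp(-||x_i - y_j||^2).
d2 = ((rows[:, None, :] - cols[None, :, :]) ** 2).sum(axis=2)
P = np.exp(-d2)
P /= P.sum(axis=1, keepdims=True)

def local_log_or(P, i, j):
    """Local log odds ratio for adjacent rows i, i+1 and adjacent columns j, j+1."""
    return np.log(P[i, j] * P[i + 1, j + 1] / (P[i, j + 1] * P[i + 1, j]))

# Under the assumed model this equals d2[i,j+1] + d2[i+1,j] - d2[i,j] - d2[i+1,j+1]:
# larger matched distances mean smaller odds ratios, i.e. an inverse relation.
for i in range(3):
    for j in range(2):
        via_probs = local_log_or(P, i, j)
        via_dists = d2[i, j + 1] + d2[i + 1, j] - d2[i, j] - d2[i + 1, j + 1]
        assert np.isclose(via_probs, via_dists)
```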

A flexible approach to modelling over‐, under‐ and equidispersed count data in IRT: The Two‐Parameter Conway–Maxwell–Poisson Model

Several psychometric tests and self-reports generate count data (e.g., divergent thinking tasks). The most prominent count data item response theory model, the Rasch Poisson Counts Model (RPCM), is limited in applicability by two restrictive assumptions: equal item discriminations and equidispersion (conditional mean equal to conditional variance). Violations of these assumptions lead to impaired reliability and standard error estimates. Previous work generalized the RPCM but maintained some limitations. The two-parameter Poisson counts model allows for varying discriminations but retains the equidispersion assumption. The Conway–Maxwell–Poisson Counts Model allows for modelling over- and underdispersion (conditional mean less than and greater than conditional variance, respectively) but still assumes constant discriminations. The present work introduces the Two-Parameter Conway–Maxwell–Poisson (2PCMP) model which generalizes these three models to allow for varying discriminations and dispersions within one model, helping to better accommodate data from count data tests and self-reports. A marginal maximum likelihood method based on the EM algorithm is derived. An implementation of the 2PCMP model in R and C++ is provided. Two simulation studies examine the model's statistical properties and compare the 2PCMP model to established models. Data from divergent thinking tasks are reanalysed with the 2PCMP model to illustrate the model's flexibility and ability to test assumptions of special cases.
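
For readers unfamiliar with the Conway-Maxwell-Poisson building block, the sketch below evaluates its probability mass function, P(Y = y) proportional to λ^y / (y!)^ν, by truncating the normalizing series, and shows how the dispersion parameter ν moves the conditional variance above (ν < 1), at (ν = 1), or below (ν > 1) the conditional mean. It illustrates the distribution only; the 2PCMP model's item discriminations and estimation routine are not reproduced here.

```python
import numpy as np

def cmp_pmf(y_max, lam, nu):
    """Conway-Maxwell-Poisson pmf P(Y = y) proportional to lam**y / (y!)**nu,
    with the normalizing constant approximated by truncation at y_max."""
    y = np.arange(y_max + 1)
    log_fact = np.array([np.sum(np.log(np.arange(1, k + 1))) for k in y])
    log_w = y * np.log(lam) - nu * log_fact
    w = np.exp(log_w - log_w.max())
    return y, w / w.sum()

def mean_var(y_max, lam, nu):
    y, p = cmp_pmf(y_max, lam, nu)
    m = np.sum(y * p)
    return m, np.sum((y - m) ** 2 * p)

# nu = 1 recovers the Poisson (equidispersion); nu < 1 gives overdispersion,
# nu > 1 gives underdispersion relative to the conditional mean.
for nu in (0.5, 1.0, 2.0):
    print(nu, mean_var(200, lam=4.0, nu=nu))
```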

Score‐based measurement invariance checks for Bayesian maximum‐a‐posteriori estimates in item response theory

A family of score-based tests has been proposed in recent years for assessing the invariance of model parameters in several models of item response theory (IRT). These tests were originally developed in a maximum likelihood framework. This study discusses analogous tests for Bayesian maximum-a-posteriori estimates and multiple-group IRT models. We propose two families of statistical tests: one based on an approximation using a pooled variance method, the other on a simulation approach that relies on asymptotic results. The resulting tests were evaluated in a simulation study, which investigated their sensitivity to differential item functioning with respect to a categorical or continuous person covariate in the two- and three-parameter logistic models. Whereas the method based on pooled variance was found to be useful in practice with maximum likelihood as well as maximum-a-posteriori estimates, the simulation-based approach was found to require large sample sizes to lead to satisfactory results.
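
As background, score-based invariance tests cumulate per-person score contributions (gradients of the log-likelihood, or log-posterior, evaluated at the estimate) after ordering persons by the covariate, and then summarize the resulting process, for example with a double-maximum statistic. The sketch below shows this generic construction with artificial scores; the pooled-variance approximation and the simulation-based critical values studied in the paper are not implemented here.

```python
import numpy as np

def double_max_statistic(scores, covariate):
    """Score-based test sketch: order per-person score contributions by a
    covariate, cumulate, decorrelate, and take the double-maximum statistic.
    Under parameter invariance the cumulative process should stay near zero."""
    n, _ = scores.shape
    s = scores[np.argsort(covariate)]
    s = s - s.mean(axis=0)                         # scores sum to ~0 at the estimate
    info_root = np.linalg.cholesky(np.cov(s, rowvar=False))
    cumsum = np.cumsum(s, axis=0) / np.sqrt(n)
    process = np.linalg.solve(info_root, cumsum.T).T   # decorrelated process
    return np.abs(process).max()

# Toy usage with artificial score contributions (no IRT model is fitted here).
rng = np.random.default_rng(0)
stat = double_max_statistic(rng.normal(size=(500, 3)), rng.uniform(size=500))
print(stat)
```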

A new person‐fit method based on machine learning in CDM in education

Cognitive diagnosis models have become popular in educational assessment and are used to provide more individualized feedback about a student's specific strengths and weaknesses than traditional total scores. However, if the testing data are contaminated by certain biases or aberrant response patterns, such predictions may not be accurate. The current research objective is to develop a new person-fit method that is based on machine learning and improves on the functionality of existing person-fit methods. Various simulations were designed under three aberrant conditions: cheating, sleeping and random guessing. Simulation results showed that the new method was more powerful and effective than previous methods, especially for short tests.
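
A generic illustration of the classifier-on-simulated-aberrance idea follows: Rasch-type response data are generated, random guessing is injected into a subset of examinees, and a random forest is trained to flag aberrant patterns. The data generator, the aberrance mechanism, and the classifier are placeholder choices for illustration; the paper's method is built on cognitive diagnosis models.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, J = 2000, 20
theta = rng.normal(size=n)
b = np.linspace(-2, 2, J)
p = 1 / (1 + np.exp(-(theta[:, None] - b[None, :])))      # Rasch-type responses
X = (rng.uniform(size=(n, J)) < p).astype(int)

# Inject "random guessing" aberrance into 10% of examinees (one of the three
# aberrant conditions named in the abstract; cheating and sleeping are analogous).
y = np.zeros(n, dtype=int)
aberrant = rng.choice(n, size=n // 10, replace=False)
y[aberrant] = 1
X[aberrant] = (rng.uniform(size=(len(aberrant), J)) < 0.25).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("hold-out accuracy:", clf.score(X_te, y_te))
```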

Latent variable selection in multidimensional item response theory models using the expectation model selection algorithm

The aim of latent variable selection in multidimensional item response theory (MIRT) models is to identify the latent traits probed by the items of a multidimensional test. In this paper the expectation model selection (EMS) algorithm proposed by Jiang et al. (2015) is applied to minimize the Bayesian information criterion (BIC) for latent variable selection in MIRT models with a known number of latent traits. Under mild assumptions, we prove the numerical convergence of the EMS algorithm for model selection by minimizing the BIC of observed data in the presence of missing data. For the identification of MIRT models, we assume that the variances of all latent traits are unity and that each latent trait has an item that is related only to it. Under this identifiability assumption, the convergence of the EMS algorithm for latent variable selection in the multidimensional two-parameter logistic (M2PL) models can be verified. We give an efficient implementation of the EMS algorithm for the M2PL models. Simulation studies show that the EMS outperforms the EM-based L1 regularization in terms of correctly selected latent variables and computation time. The EMS algorithm is applied to a real data set related to the Eysenck Personality Questionnaire.
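
For orientation, the objective that the EMS search targets can be sketched directly: the BIC of the observed responses under an M2PL model, where the penalty counts the nonzero loadings of a candidate structure plus the item intercepts. The code below evaluates this objective by Gauss-Hermite quadrature, assuming (for simplicity) independent standard normal latent traits; the EMS search over loading structures itself is not reproduced.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss
from scipy.special import logsumexp

def m2pl_marginal_loglik(Y, A, d, n_quad=15):
    """Marginal log-likelihood of an M2PL model, P(y_ij = 1 | theta_i) =
    logistic(a_j' theta_i + d_j), with independent standard normal latent
    traits, evaluated by tensor-product Gauss-Hermite quadrature."""
    K = A.shape[1]
    x, w = hermgauss(n_quad)
    nodes_1d = np.sqrt(2.0) * x               # quadrature nodes for N(0, 1)
    weights_1d = w / np.sqrt(np.pi)
    grids = np.meshgrid(*([nodes_1d] * K), indexing="ij")
    theta = np.stack([g.ravel() for g in grids], axis=1)                   # (Q, K)
    wq = np.prod(np.meshgrid(*([weights_1d] * K), indexing="ij"), axis=0).ravel()
    p = 1.0 / (1.0 + np.exp(-(theta @ A.T + d)))                           # (Q, J)
    log_pq = Y @ np.log(p).T + (1 - Y) @ np.log(1 - p).T                   # (N, Q)
    return logsumexp(log_pq + np.log(wq), axis=1).sum()

def bic(Y, A, d):
    """BIC of a candidate loading structure: the penalty counts the nonzero
    loadings (the quantity being selected) plus the item intercepts."""
    n_params = np.count_nonzero(A) + d.size
    return -2.0 * m2pl_marginal_loglik(Y, A, d) + n_params * np.log(Y.shape[0])

# Toy usage: 3 items, 2 latent traits, and one candidate sparse loading matrix.
rng = np.random.default_rng(0)
Y = rng.integers(0, 2, size=(100, 3))
A = np.array([[1.2, 0.0], [0.0, 0.9], [0.8, 0.7]])
d = np.zeros(3)
print(bic(Y, A, d))
```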

Computing the real solutions of Fleishman’s equations for simulating non‐normal data

Fleishman's power method is frequently used to simulate non-normal data with a desired skewness and kurtosis. Fleishman's method requires solving a system of nonlinear equations to find the third-order polynomial weights that transform a standard normal variable into a non-normal variable with the desired moments. Most users of the power method seem unaware that Fleishman's equations have multiple solutions for typical combinations of skewness and kurtosis. Furthermore, researchers lack a simple method for exploring the multiple solutions of Fleishman's equations, so most applications only consider a single solution. In this paper, we propose novel methods for finding all real-valued solutions of Fleishman's equations. Additionally, we characterize the solutions in terms of differences in higher-order moments. Our theoretical analysis of the power method reveals that there typically exist two solutions of Fleishman's equations that have noteworthy differences in higher-order moments. Using simulated examples, we demonstrate that these differences can have remarkable effects on the shape of the non-normal distribution, as well as on the sampling distributions of statistics calculated from the data. Some considerations for choosing a solution are discussed, and some recommendations for improved reporting standards are provided.
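
The moment equations themselves are short: for Y = a + bZ + cZ² + dZ³ with Z standard normal and a = -c, Fleishman's (1978) system fixes the variance at 1 and matches the target skewness and excess kurtosis. The multistart root search below is a simple heuristic for exposing multiple real solutions, not the paper's method; solutions are canonicalized so that the sign reflection (b, c, d) to (-b, c, -d), which yields the same distribution, is not counted twice.

```python
import numpy as np
from scipy.optimize import fsolve

def fleishman_equations(coef, skew, ex_kurt):
    """Fleishman (1978) moment equations for Y = a + b*Z + c*Z^2 + d*Z^3,
    with Z standard normal, a = -c, Var(Y) = 1, and target skewness and
    excess kurtosis."""
    b, c, d = coef
    eq1 = b**2 + 6*b*d + 2*c**2 + 15*d**2 - 1
    eq2 = 2*c*(b**2 + 24*b*d + 105*d**2 + 2) - skew
    eq3 = 24*(b*d + c**2*(1 + b**2 + 28*b*d)
              + d**2*(12 + 48*b*d + 141*c**2 + 225*d**2)) - ex_kurt
    return [eq1, eq2, eq3]

def real_solutions(skew, ex_kurt, n_starts=200, seed=0, tol=1e-9):
    """Search for distinct real roots from many random starting points
    (a simple multistart heuristic)."""
    rng = np.random.default_rng(seed)
    found = []
    for start in rng.uniform(-2, 2, size=(n_starts, 3)):
        sol, info, ier, _ = fsolve(fleishman_equations, start,
                                   args=(skew, ex_kurt), full_output=True)
        if ier == 1 and np.max(np.abs(info["fvec"])) < tol:
            if sol[0] < 0:                              # (b,c,d) and (-b,c,-d) give the
                sol = sol * np.array([-1.0, 1.0, -1.0])  # same distribution (Z -> -Z)
            if not any(np.allclose(sol, s, atol=1e-6) for s in found):
                found.append(sol)
    return found

for sol in real_solutions(skew=1.0, ex_kurt=1.5):
    b, c, d = sol
    print(f"a={-c:+.4f} b={b:+.4f} c={c:+.4f} d={d:+.4f}")
```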

Empirical underidentification in estimating random utility models: The role of choice sets and standardizations

A standard approach to distinguishing people's risk preferences is to estimate a random utility model that uses a power utility function to characterize the preferences and a logit function to capture choice consistency. We demonstrate that, with often-used choice situations, this model suffers from empirical underidentification, meaning that parameters cannot be estimated precisely. With simulations of estimation accuracy and Kullback–Leibler divergence measures, we examined factors that potentially mitigate this problem. First, using a choice set that guarantees a switch in the utility order between two risky gambles in the range of plausible values leads to higher estimation accuracy than randomly created choice sets or the purpose-built choice sets common in the literature. Second, parameter estimates are often correlated, which contributes to empirical underidentification. Examining standardizations of the utility scale, we show that they mitigate this correlation and additionally improve the estimation accuracy for choice consistency. Yet they can have detrimental effects on the estimation accuracy of risk preference. Finally, we also show how repeated versus distinct choice sets and an increase in the number of observations affect estimation accuracy. Together, these results should help researchers make informed design choices to estimate parameters in the random utility model more precisely.
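
The model in question can be written compactly: expected utility of a gamble under a power utility u(x) = x^α, and a logit choice rule with consistency parameter φ applied to the utility difference. The toy sketch below simulates choices from hypothetical gambles and recovers the parameters by maximum likelihood; the choice sets and true values are made up for illustration and do not correspond to the designs examined in the paper.

```python
import numpy as np
from scipy.optimize import minimize

def choice_prob(alpha, phi, gamble_a, gamble_b):
    """P(choose A over B) under power utility u(x) = x**alpha and a logit
    choice rule with consistency (inverse temperature) phi."""
    eu = lambda g: sum(p * x**alpha for p, x in g)
    return 1.0 / (1.0 + np.exp(-phi * (eu(gamble_a) - eu(gamble_b))))

def neg_loglik(params, choices, pairs):
    alpha, phi = params
    ll = 0.0
    for y, (a, b) in zip(choices, pairs):
        p = np.clip(choice_prob(alpha, phi, a, b), 1e-12, 1 - 1e-12)
        ll += y * np.log(p) + (1 - y) * np.log(1 - p)
    return -ll

# A hypothetical choice set; gambles are lists of (probability, outcome) pairs.
pairs = [([(0.5, 100), (0.5, 0)], [(1.0, 45)]),
         ([(0.8, 40), (0.2, 0)], [(1.0, 30)]),
         ([(0.3, 90), (0.7, 10)], [(1.0, 35)])] * 50
rng = np.random.default_rng(2)
true_alpha, true_phi = 0.7, 0.2
choices = [int(rng.uniform() < choice_prob(true_alpha, true_phi, a, b))
           for a, b in pairs]

fit = minimize(neg_loglik, x0=[1.0, 0.1], args=(choices, pairs),
               bounds=[(0.05, 2.0), (0.001, 5.0)], method="L-BFGS-B")
print(fit.x)   # estimates of (alpha, phi); flat likelihood regions here signal
               # the kind of empirical underidentification the abstract discusses
```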

Approximately counting and sampling knowledge states

Approximately counting and sampling knowledge states from a knowledge space is a problem that is of interest for both applied and theoretical reasons. However, many knowledge spaces used in practice are far too large for standard statistical counting and estimation techniques to be useful. Thus, in this work we use an alternative technique for counting and sampling knowledge states from a knowledge space. This technique is based on a procedure variously known as subset simulation, the Holmes–Diaconis–Ross method, or multilevel splitting. We make extensive use of Markov chain Monte Carlo methods and, in particular, Gibbs sampling, and we analyse and test the accuracy of our results in numerical experiments.
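
For very small domains the counting problem can be solved exactly, which makes the object of interest concrete: if the space is quasi-ordinal and induced by a precedence (surmise) relation, the knowledge states are exactly the subsets closed under that relation. The brute-force enumeration below (over a hypothetical four-item example) is precisely what becomes infeasible for realistic spaces and what the subset-simulation approach is designed to approximate.

```python
import itertools
import numpy as np

def states_from_precedence(prec):
    """Enumerate the knowledge states of the quasi-ordinal knowledge space
    induced by a precedence relation: prec[p][q] == 1 means mastery of q
    surmises mastery of p, so any state containing q must also contain p.
    Brute force over all 2^|Q| subsets; only feasible for small domains."""
    prec = np.asarray(prec)
    n = prec.shape[0]
    states = []
    for bits in itertools.product([0, 1], repeat=n):
        k = np.array(bits)
        # closed downward: if q is in K, every p with prec[p][q] = 1 is in K
        if all(k[p] for p in range(n) for q in range(n) if k[q] and prec[p][q]):
            states.append(bits)
    return states

# Items 0 -> 1 -> 2 form a chain, item 3 is independent (a hypothetical example).
prec = [[1, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 1]]
print(len(states_from_precedence(prec)))   # 4 chain states x 2 = 8 states
```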

The Fisher information function and scoring in binary ideal point item response models: a cautionary tale

This article examines the Fisher information function and explores implications for scoring in binary ideal point item response models. These models typically appear to have information functions that are bimodal and identically equal to 0 at the ideal point. The article shows that this is an inherent property of ideal point IRT models, which either have this property or are indeterminate and thus violate the likelihood regularity conditions. For some models the indeterminacy can be resolved, generating an effectively unimodal information function, albeit with violated regularity conditions. In other cases the information function diverges. All reasonable ideal point IRT models exhibit one of these behaviours. Users should therefore exercise caution when relying on asymptotics, particularly for shorter assessments. The use of simulated plausible values or prediction from a fully Bayesian estimation is recommended for scoring.
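
A minimal numerical illustration of the bimodal, zero-at-the-ideal-point behaviour: take a hypothetical binary ideal point item with P(y = 1 | θ) = logistic(τ - (θ - δ)²) and apply the standard binary-item identity I(θ) = [P'(θ)]² / [P(θ)(1 - P(θ))]. This response function is chosen only for illustration and is not one of the models analysed in the article.

```python
import numpy as np

def item_information(theta, delta, tau):
    """Fisher information of a single binary ideal point item with the
    hypothetical response function P(y=1 | theta) = logistic(tau - (theta - delta)^2).
    For a binary item, I(theta) = P'(theta)^2 / (P * (1 - P))."""
    u = theta - delta
    p = 1.0 / (1.0 + np.exp(u**2 - tau))
    dp = -2.0 * u * p * (1.0 - p)
    return dp**2 / (p * (1.0 - p))        # simplifies to 4 * u^2 * p * (1 - p)

theta = np.linspace(-4, 4, 9)
info = item_information(theta, delta=0.0, tau=1.0)
print(np.round(info, 3))   # exactly zero at the ideal point, with peaks on both sides
```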