Rasch Trees: A New Method for Detecting Differential Item Functioning in the Rasch Model

Strobl, Carolin; Kopf, Julia; Zeileis, Achim

doi:10.1007/s11336-013-9388-3

Published: 19 December 2013

Psychometrika, Volume 80, pages 289–316 (2015)

Abstract

A variety of statistical methods have been suggested for detecting differential item functioning (DIF) in the Rasch model. Most of these methods are designed for the comparison of pre-specified focal and reference groups, such as males and females. Latent class approaches, on the other hand, allow the detection of previously unknown groups exhibiting DIF. However, this approach provides no straightforward interpretation of the groups with respect to person characteristics. Here, we propose a new method for DIF detection based on model-based recursive partitioning that can be considered as a compromise between those two extremes. With this approach it is possible to detect groups of subjects exhibiting DIF, which are not pre-specified, but result from combinations of observed covariates. These groups are directly interpretable and can thus help generate hypotheses about the psychological sources of DIF. The statistical background and construction of the new method are introduced by means of an instructive example, and extensive simulation studies are presented to support and illustrate the statistical properties of the method, which is then applied to empirical data from a general knowledge quiz. A software implementation of the method is freely available in the R system for statistical computing.
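
As a minimal illustration of what DIF means in the Rasch model (this is not the recursive partitioning procedure proposed in the article), the following Python sketch evaluates the standard Rasch response probability for a single item whose difficulty differs between two groups. The group labels and all parameter values are hypothetical and chosen only for illustration.

```python
import math


def rasch_prob(theta, beta):
    """Probability of a correct response under the Rasch model,
    for person ability theta and item difficulty beta."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))


# Hypothetical difficulties of the same item in two groups
# (assumed values, not taken from the article).
beta_by_group = {"group_a": 0.0, "group_b": 1.0}

# Two persons with identical ability, one from each group.
theta = 0.5

probs = {group: rasch_prob(theta, beta) for group, beta in beta_by_group.items()}

# Under DIF, equally able persons have different success probabilities.
dif_gap = probs["group_a"] - probs["group_b"]
print(probs, dif_gap)
```

Without DIF the two difficulties would coincide and the gap would be zero. The method described in the abstract searches for covariate-defined groups in which the item parameters differ in this way.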

Notes

At the time the quiz was conducted, Croatia was not yet a member of the EU.

References

Andersen, E. (1973). A goodness of fit test for the Rasch model. Psychometrika, 38, 123–140.
Andrews, D.W.K. (1993). Tests for parameter instability and structural change with unknown change point. Econometrica, 61, 821–856.
Ben-Shakhar, G., & Sinai, Y. (1991). Gender differences in multiple-choice tests: the role of differential guessing tendencies. Journal of Educational Measurement, 28(1), 23–35.
Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee’s ability. In F. Lord & M. Novick (Eds.), Statistical theories of mental test scores, Reading: Addison-Wesley.
Boulesteix, A.L. (2006). Maximally selected chi-square statistics and binary splits of nominal variables. Biometrical Journal, 48(5), 838–848.
Breiman, L., Friedman, J., Olshen, R., & Stone, C. (1984). Classification and regression trees. New York: Chapman and Hall.
Cohen, A., & Bolt, D. (2005). A mixture model analysis of differential item functioning. Journal of Educational Measurement, 42(3), 133–148.
Dobra, A., & Gehrke, J. (2001). Bias correction in classification tree construction. In C.E. Brodley & A.P. Danyluk (Eds.), Proceedings of the seventeenth international conference on machine learning (ICML 2001), Williams College, Williamstown, MA, USA (pp. 90–97). San Mateo: Morgan Kaufmann.
Fischer, G. & Molenaar, I. (Eds.) (1995). Rasch models: foundations, recent developments and applications. New York: Springer.
Fraley, C., & Raftery, A. (2002). Model-based clustering, discriminant analysis and density estimation. Journal of the American Statistical Association, 97(458), 611–631.
Fraley, C., & Raftery, A. (2012). mclust: Model-based clustering/Normal mixture modeling. R package version 3.4.11. http://CRAN.R-project.org/package=mclust.
Gelin, M., Carleton, B., Smith, M., & Zumbo, B. (2004). The dimensionality and gender differential item functioning of the mini asthma quality of life questionnaire (MiniAQLQ). Social Indicators Research, 68, 91–105.
Gustafsson, J. (1980). Testing and obtaining fit of data in the Rasch model. British Journal of Mathematical & Statistical Psychology, 33(2), 205–233.
Hancock, G. & Samuelsen, K. (Eds.) (2007). Advances in latent variable mixture models. Charlotte: Information Age.
Hochberg, Y. & Tamhane, A. (Eds.) (1987). Multiple comparison procedures. New York: Wiley.
Hothorn, T., Hornik, K., & Zeileis, A. (2006). Unbiased recursive partitioning: a conditional inference framework. Journal of Computational and Graphical Statistics, 15(3), 651–674.
Hothorn, T., & Lausen, B. (2003). On the exact distribution of maximally selected rank statistics. Computational Statistics & Data Analysis, 43(2), 121–137.
Hothorn, T., & Zeileis, A. (2008). Generalized maximally selected statistics. Biometrics, 64(4), 1263–1269.
Hubert, L., & Arabie, P. (1985). Comparing partitions. Journal of Classification, 2(1), 193–218.
Kelderman, H., & MacReady, G. (1990). The use of loglinear models for assessing differential item functioning across manifest and latent examinee groups. Journal of Educational Measurement, 27(4), 307–327.
Koziol, J. (1991). On maximally selected chi-square statistics. Biometrics, 47(4), 1557–1561.
Liou, M. (1994). More on the computation of higher-order derivatives on the elementary symmetric functions in the Rasch model. Applied Psychological Measurement, 18(1), 53–62.
Maij-de Meij, A., Kelderman, H., & Van der Flier, H. (2008). Fitting a mixture item response theory model to personality questionnaire data: characterizing latent classes and investigating possibilities for improving prediction. Applied Psychological Measurement, 32(8), 611–631.
Mair, P., & Hatzinger, R. (2007). Extended Rasch modeling: the eRm package for the application of IRT models in R. Journal of Statistical Software, 20, 9. http://www.jstatsoft.org/v20/i09/.
Mair, P., Hatzinger, R., & Maier, M. (2012). eRm: extended Rasch modeling. R package version 0.15-0. http://CRAN.R-project.org/package=eRm.
Marcus, R., Peritz, E., & Gabriel, K. (1976). Closed testing procedures with special reference to ordered analysis of variance. Biometrika, 63(3), 655–660.
Masters, G. (1982). A Rasch model for partial credit scoring. Psychometrika, 47(2), 149–174.
Merkle, E.C., Fan, J., & Zeileis, A. (2013). Testing for measurement invariance with respect to an ordinal variable. Psychometrika, forthcoming.
Merkle, E.C., & Zeileis, A. (2013). Tests of measurement invariance without subgroups: a generalization of classical methods. Psychometrika, 78(1), 59–82.
Miller, R., & Siegmund, D. (1982). Maximally selected chi square statistics. Biometrics, 38(4), 1011–1016.
Milligan, G., & Cooper, M. (1986). A study of the comparability of external criteria for hierarchical cluster analysis. Multivariate Behavioral Research, 21(4), 441–458.
Mislevy, R., & Verhelst, N. (1990). Modeling item responses when different subjects employ different solution strategies. Psychometrika, 55(2), 195–215.
Pedraza, O., Graff-Radford, N., Smith, G., Ivnik, R., Willis, F., Petersen, R., & Lucas, J. (2009). Differential item functioning of the Boston Naming Test in cognitively normal African American and Caucasian older adults. Journal of the International Neuropsychological Society, 15(05), 758–768.
Penfield, D. (2007). Assessing differential step functioning in polytomous items using a common odds ratio estimator. Journal of Educational Measurement, 44(3), 187–210.
Penfield, D., Alvarez, K., & Lee, O. (2009). Using a taxonomy of differential step functioning to improve the interpretation of DIF in polytomous items: an illustration. Applied Measurement in Education, 22(1), 61–78.
Perkins, A., Stump, T., Monahan, P., & McHorney, C. (2006). Assessment of differential item functioning for demographic comparisons in the MOS SF-36 health survey. Quality of Life Research, 15, 331–348.
R Development Core Team (2012). R: a language and environment for statistical computing. Vienna: R Foundation for Statistical Computing. ISBN 3-900051-07-0. http://www.R-project.org/.
Rijmen, F., Tuerlinckx, F., De Boeck, P., & Kuppens, P. (2003). A nonlinear mixed model framework for item response theory. Psychological Methods, 8(2), 185–205.
Rizopoulos, D. (2006). ltm: an R package for latent variable modeling and item response analysis. Journal of Statistical Software, 17, 5. http://www.jstatsoft.org/v17/i05/.
Rizopoulos, D. (2012). ltm: latent trait models under IRT. R package version 0.9-9. http://CRAN.R-project.org/package=ltm.
Rost, J. (1990). Rasch models in latent classes: an integration of two approaches to item analysis. Applied Psychological Measurement, 14(3), 271–282.
Shih, Y.S. (2004). A note on split selection bias in classification trees. Computational Statistics & Data Analysis, 45(3), 457–466.
Smit, J., Kelderman, H., & Van der Flier, H. (2000). The mixed Birnbaum model: estimation using collateral information. Methods of Psychological Research Online, 5, 1–13.
Strobl, C., Boulesteix, A.L., & Augustin, T. (2007). Unbiased split selection for classification trees based on the Gini index. Computational Statistics & Data Analysis, 52(1), 483–501.
Strobl, C., Malley, J., & Tutz, G. (2009). An introduction to recursive partitioning: rationale, application and characteristics of classification and regression trees, bagging and random forests. Psychological Methods, 14(4), 323–348.
Strobl, C., Wickelmaier, F., & Zeileis, A. (2011). Accounting for individual differences in Bradley–Terry models by means of recursive partitioning. Journal of Educational and Behavioral Statistics, 36(2), 135–153.
Trepte, S. & Verbeet, M. (Eds.) (2010). Allgemeinbildung in Deutschland—Erkenntnisse aus dem SPIEGEL Studentenpisa-Test. Wiesbaden: VS Verlag.
Van den Noortgate, W., & De Boeck, P. (2005). Assessing and explaining differential item functioning using logistic mixed models. Journal of Educational and Behavioral Statistics, 30(4), 443–464.
Westers, P., & Kelderman, H. (1992). Examining differential item functioning due to item difficulty and alternative attractiveness. Psychometrika, 57(1), 107–118.
Woods, C., Oltmanns, T., & Turkheimer, E. (2009). Illustration of MIMIC-model DIF testing with the schedule for nonadaptive and adaptive personality. Journal of Psychopathology and Behavioral Assessment, 31, 320–330.
Zeileis, A., & Hornik, K. (2007). Generalized m-fluctuation tests for parameter instability. Statistica Neerlandica, 61(4), 488–508.
Zeileis, A., Hothorn, T., & Hornik, K. (2008). Model-based recursive partitioning. Journal of Computational and Graphical Statistics, 17(2), 492–514.
Zeileis, A., Strobl, C., Wickelmaier, F., & Kopf, J. (2012). psychotree: recursive partitioning based on psychometric models. R package version 0.12-2. http://CRAN.R-project.org/package=psychotree.

Acknowledgements

Julia Kopf is supported by the German Federal Ministry of Education and Research (BMBF) within the project “Heterogeneity in IRT-Models” (grant ID 01JG1060).

The authors would like to thank three anonymous reviewers for their very helpful and constructive feedback.

Special thanks go to Reinhold Hatzinger (1953–2012), who stimulated important insights for this and other projects through many conversations and his extensive work on the R package eRm (Mair & Hatzinger, 2007; Mair et al., 2012). We miss him as a researcher and friend.

Author information

Authors and Affiliations

Department of Psychology, Universität Zürich, Binzmühlestr. 14, 8050, Zürich, Switzerland
Carolin Strobl
Department of Statistics, Ludwig-Maximilians-Universität München, Ludwigstr. 33, 80539, München, Germany
Julia Kopf
Department of Statistics, Universität Innsbruck, Universitätsstr. 15, 6020, Innsbruck, Austria
Achim Zeileis

Corresponding author

Correspondence to Carolin Strobl.

About this article

Cite this article

Strobl, C., Kopf, J. & Zeileis, A. Rasch Trees: A New Method for Detecting Differential Item Functioning in the Rasch Model. Psychometrika 80, 289–316 (2015). https://doi.org/10.1007/s11336-013-9388-3

Received: 29 November 2010
Published: 19 December 2013
Issue Date: June 2015
DOI: https://doi.org/10.1007/s11336-013-9388-3
