The aim of the study is to examine differential item functioning (DIF). PDF: An introduction to differential item functioning. Windows software that generates IRT parameters and item responses. Classical Item Analyzer computes test scale reliability analyses.
Detection of DIF is one step in the process of gathering score validity evidence. Differential item functioning shareware, freeware, demos. The program developed for DIF analysis in CAT was called computer adaptive test simultaneous item bias (CATSIB; Roussos, 1996) that. Since the effect of missing data on differential item functioning (DIF) assessment has been invest. We present an ordinal logistic regression model for identi. Improving the assessment of differential item functioning in. Dec 05, 2015. DIF (differential item functioning): in larger testing programs, it is possible to look at how, within a given overall ability level, members of different groups e.
Research, open access: Detecting differential item functioning. Modeling multiple response processes in judgment and choice. By design, large-scale educational testing programs often have a large proportion of missing data. Windows software that generates IRT parameters and item responses. Analysis of differential item functioning in the depression. Jun 22, 2012. The jMetrik software includes psychometric analyses such as CTT, IRT, differential item functioning (DIF), and confirmatory factor analysis (CFA). A variety of statistical procedures have been developed to assess DIF in tests of dichotomous Hills, 1989.
The purpose of the present analysis is to use differential item functioning (DIF) to identify differences in the performance of native and immigrant students in PISA 2009 that can be directly related to their responses to particular items. Current methods include classical item analysis, differential item functioning (DIF) analysis, confirmatory factor analysis, item response theory, IRT equating, and nonparametric item response theory. Assessing differential item functioning in performance assessment. Gibbons, PhD, Lance Jolley, MS, and Gerald van Belle, PhD. Introduction. ERIC EJ973374: Comparison of three software programs for. There's a body of research on what is called differential item functioning in. Differential item functioning (DIF) has been widely used in healthcare, business management, and educational measurement.
This simulation study examines item-level differential item functioning (DIF) in the context of international large-scale assessment (ILSA) using a generalized logistic regression approach. If you have any comments or questions about any software on this page, contact the author of that specific package. Batch files can also be generated for handling multiple calibrations in a queue. Detecting differential item functioning using generalized. Apr 12, 20. Differential item functioning (DIF) is when a test item favors or hinders a characteristic exhibited by group members of a test-taking population. Detecting differential item functioning using Wald and likelihood. With the rising concerns over the fairness of language tests, differential item functioning (DIF) has been increasingly applied in bias analysis. Measuring differential item and test functioning across. Because of insufficient numbers of students for other demographic characteristics, this was the only comparison made.
WinGen provides a dialog input to introduce differential item functioning (DIF) or item parameter drift in the simulated data. Flexible application to many types of selected-response items. Since the effect of missing data on differential item. Analysis of differential item functioning in the depression item bank from the Patient-Reported Outcomes Measurement Information System (PROMIS). A computer program for detecting uniform and nonuniform differential item functioning with the Mantel-Haenszel procedure. PDF: Comparison of three software programs for evaluating. The purpose of the proposed research is to create multilevel differential item functioning (DIF) methods and software to increase the accuracy of the detection of DIF. In this article, I show how item response models can be used to capture multiple response processes in psychological applications. X fits an item response model when x are item scores e. If DIF is found for many items on the test, the final test scores do not represent the same. IRT differential item functioning tool, Assess computerized. ERM software, School of Education, UNCG (SOE, UNC Greensboro). The fairness of an item depends directly on the purpose for which a test is being used.
Several methods have been proposed in recent decades for identifying items that function differently between two or more groups of examinees. Some of these procedures, such as the Mantel-Haenszel chi. It includes functions to use the Monte Carlo item parameter replication (IPR) approach for obtaining the cutoff points of the associated statistical significance tests. The student edition runs all example command files. Intuitive and analytical responses, agree/disagree answers, response refusals, socially desirable responding, differential item functioning, and choices among multiple options are considered. Differential item functioning analysis with ordinal logistic regression techniques: DIFdetect and difwithpar, Paul K. Software: ERM faculty members have made the following software packages available free for download. NAEP analysis and scaling: differential item functioning. Recommendations for conducting differential item functioning. Starting from a framework for classifying DIF detection methods and from a comparative overview of the most traditional methods, an R. This article provides an applied example using SIBTEST statistical software to detect DIF in U.
The analysis of differential item functioning (DIF) examines whether item responses. Comparison of three software programs for evaluating DIF by means. Differential item functioning software free downloads. Relatively few studies have examined an item-level approach to measurement equivalence, particularly in settings where a large number of groups is included. A set of functions to perform differential item and test functioning analyses is implemented in the DFIT package. Using DIFAS (Penfield, 2005), differential item functioning (DIF) analysis was performed comparing males with females using data from sets 1 and 2, which were administered to all examinees. Differential item functioning analysis with ordinal logistic. The item analysis includes proportion, point-biserial, and biserial statistics for all response options.
Judicious application of this methodology by the researchers, however, requires an. Differential item functioning, Columbia University Mailman. Package difR, May, 2020. Type: Package. Title: Collection of methods to detect dichotomous differential item functioning (DIF). Version 5. However, for rare-events data, maximum likelihood estimation may be biased and the asymptotic distributions may not be reliable.
Software for analyzing differential item functioning using the. This analysis can be performed by calculating various statistics, one of the most important being the Mantel-Haenszel statistic, which can be computed with software programs. A comparative study of the bias correction methods for. A Windows-based item response theory data generator with an equating and differential item functioning simulation guide.
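As a rough illustration of the Mantel-Haenszel statistic mentioned above, the sketch below computes the common odds ratio for one dichotomous item, matching examinees on total test score. The function names, the record layout, and the group labels "ref" and "focal" are assumptions made for this example and do not come from any of the packages listed on this page. The rescaling to the ETS delta metric (multiplying the log odds ratio by -2.35) is a widely used convention.

```python
from collections import defaultdict
import math


def mantel_haenszel_odds_ratio(records):
    """Common odds ratio for one item across matched score levels.

    records: iterable of (group, total_score, item_correct), where group is
    "ref" or "focal" (hypothetical labels) and item_correct is 0 or 1.
    """
    # One 2x2 table per total score:
    # [ref right, ref wrong, focal right, focal wrong]
    strata = defaultdict(lambda: [0, 0, 0, 0])
    for group, score, correct in records:
        idx = (0 if group == "ref" else 2) + (0 if correct else 1)
        strata[score][idx] += 1

    num = den = 0.0
    for a, b, c, d in strata.values():
        n = a + b + c + d
        num += a * d / n  # reference right, focal wrong
        den += b * c / n  # reference wrong, focal right
    if den == 0:
        raise ValueError("odds ratio undefined: denominator is zero")
    return num / den


def mh_delta(alpha):
    """Rescale the odds ratio to the ETS delta metric (MH D-DIF)."""
    return -2.35 * math.log(alpha)


if __name__ == "__main__":
    data = (
        [("ref", 10, 1)] * 3 + [("ref", 10, 0)]
        + [("focal", 10, 1)] + [("focal", 10, 0)] * 3
    )
    alpha = mantel_haenszel_odds_ratio(data)
    print(alpha, mh_delta(alpha))
```

An odds ratio near 1 (delta near 0) suggests little DIF for the item. In this toy data the ratio is well above 1, meaning reference-group examinees at the same total score have higher odds of answering correctly. Real analyses would also compute the associated chi-square test, which this sketch omits.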
Current problems and future directions. Hossein Karami, University of Tehran, Iran; Mohammad Ali Salmani Nodoushan, IECF, Iran. Performance differences at the measure level are described here as differential item functioning (DIF). A program to generate item response vectors (unpublished manuscript). Avoiding bad discrimination in licensing and certification. Software for analyzing differential item functioning. An overview of differential item functioning in multistage computer. Below are common statistical programs capable of performing the procedures discussed herein. Differential item functioning (DIF) is the preferred psychometric term for what is otherwise known as item bias. The analysis of differential item functioning (DIF) examines whether item responses differ according to characteristics such as language and ethnicity, that is, whether people with matching ability levels respond differently to the items. Differential item functioning (DIF) has been increasingly applied in fairness studies in psychometric circles. Graphing tool is a simple spreadsheet to visualize differential item functioning with item.
Differential item functioning between ethnic groups in the epidemiological assessment of depression. Free differential item functioning to download at shareware. In this study, the performance of regular maximum likelihood (ML) estimation is compared with two bias. A thesis submitted to the Graduate School of Natural and Applied Sciences of Middle East Technical University. Average item scores for subgroups having the same overall score on the test are compared to determine whether the item is measuring in essentially the. An item displays DIF when test takers possessing the same amount of an ability or trait, but belonging to different subgroups, do not share the same likelihood of correctly answering the item. Assessment developers design and construct questionnaires or tests including sets of items that measure, for example, cognition, personality traits, or political views. Improving the assessment of differential item functioning.
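The comparison of average item scores for subgroups with the same overall test score, described above, can be sketched as a standardization-style index: at each total score, take the difference in proportion correct between the two groups, then average those differences weighted by the focal group's counts. This is a minimal sketch under assumptions. The function name, the record layout, and the "ref" and "focal" labels are hypothetical, and it is not the procedure of any specific program named on this page.

```python
from collections import defaultdict


def std_p_dif(records):
    """Focal-weighted mean difference in proportion correct at matched scores.

    records: iterable of (group, total_score, item_correct), where group is
    "ref" or "focal" (hypothetical labels) and item_correct is 0 or 1.
    Negative values mean the focal group does worse than matched
    reference-group examinees on this item.
    """
    # Per total score: [ref count, ref right, focal count, focal right]
    cells = defaultdict(lambda: [0, 0, 0, 0])
    for group, score, correct in records:
        base = 0 if group == "ref" else 2
        cells[score][base] += 1
        cells[score][base + 1] += correct

    weighted_diff = 0.0
    focal_total = 0
    for ref_n, ref_right, foc_n, foc_right in cells.values():
        if ref_n == 0 or foc_n == 0:
            continue  # no matched comparison is possible at this score
        weighted_diff += foc_n * (foc_right / foc_n - ref_right / ref_n)
        focal_total += foc_n
    if focal_total == 0:
        raise ValueError("no score level contains both groups")
    return weighted_diff / focal_total


if __name__ == "__main__":
    data = (
        [("ref", 5, 1)] * 3 + [("ref", 5, 0)]
        + [("focal", 5, 1), ("focal", 5, 0)]
        + [("ref", 8, 1)] * 2
        + [("focal", 8, 1), ("focal", 8, 0)]
    )
    print(std_p_dif(data))
```

Weighting by the focal group's counts keeps the index focused on the score range where focal-group examinees actually fall. Score levels containing only one group are skipped here, which is a simplification.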
A general framework and an R package for the detection of. A computer program for detecting uniform and nonuniform differential. A new method for estimating differential item functioning (DIF) for multiple groups and polytomous items. DIF analyses are statistical procedures used to determine to what extent the content of an item affects the item endorsement of subgroups of test takers. Current methods include classical item analysis, differential item functioning (DIF) analysis, item response theory, IRT equating, and nonparametric item response theory. Differential item functioning (DIF) is a statistical characteristic of an item that shows the extent to which the item might be measuring different abilities for members of separate subgroups. For example, a science item that is differentially difficult for women may be judged to be fair in a test designed for certification of science teachers because the item measures a topic that every entry-level science teacher should know. With the multiple-file read-in option in WinGen, a user can have multiple groups of examinees and multiple sets of items/tests. We analyzed 95 cognitive reading items, administered to students in 29 European countries.
Performances based on ability estimation of the methods of. Differential item functioning (DIF) refers to group differences in performance on a test item that cannot be explained by group differences in the construct targeted. Differential item functioning, WikiMili, the free encyclopedia. Measurement invariance and differential item functioning. Assessing differential item functioning among multiple groups. All of these analyses are useful in evaluating the psychometric quality of an assessment. Teresi, Katja Ocepek-Welikson, Marjorie Kleinman, Joseph P. In brief, differential item functioning (DIF) occurs when groups (such as those defined by gender, ethnicity, age, or education) have different probabilities of endorsing a given item on a multi-item scale after controlling for overall scale scores.
Search funded research grants and contracts details. Analysis of differential item functioning on some TIMSS 2011. University of Massachusetts, Center for Educational Assessment. I would like to thank the authors of these programs for allowing free access to these packages. The differential item functioning analysis software (Penfield, 2005) and the EASY-DIF software, González et al. Programs for differential item functioning (LinkDIF and EZDIF) as described in Applied Psychological Measurement. Factor analysis.
University of Wisconsin, Laboratory of Experimental Design. They are helpful to those of us who wish to investigate. Differential item functioning (DIF) is an important issue in psychometrics and educational measurement. Paper 2900-2015: Multiple ways to detect differential item. As a result, the Differential Item Functioning Analysis System (DIFAS) was developed to provide a cost-effective and easy-to-use program for conducting many of the common nonparametric DIF detection procedures, as well as several new DIF detection procedures that are not available in other statistical packages. Statistical software for differential item functioning analysis.
The logistic regression (LR) model for assessing differential item functioning (DIF) is highly dependent on the asymptotic sampling distributions. Analyze DIF with specialized software like DFIT or PARSCALE. The ConQuest software provided the analysis model to understand the performance differences between groups i. A new method for estimating differential item functioning (DIF) for. The difNLR package uses nonlinear regression to estimate DIF.
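For reference, the logistic regression approach to DIF mentioned above is commonly written in the following form. The symbols are assumptions chosen for this sketch and do not come from any one package on this page: Y is the 0/1 response of examinee i to the studied item, X is the matching variable (for example, total test score), and G is a 0/1 group indicator.

```latex
\[
\operatorname{logit} P(Y_i = 1 \mid X_i, G_i)
  = \beta_0 + \beta_1 X_i + \beta_2 G_i + \beta_3 X_i G_i
\]
```

Under this parameterization, a nonzero \(\beta_2\) with \(\beta_3 = 0\) corresponds to uniform DIF (a constant group shift at every ability level), and a nonzero \(\beta_3\) corresponds to nonuniform DIF (a group difference that changes with ability). The tests of these coefficients are usually likelihood-ratio or Wald tests, which is where the dependence on asymptotic sampling distributions comes in.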