Non-uniform DIF detection in DIFdetect

DIFdetect fits ordinal logistic regression models to the data. The first step for non-uniform DIF detection is the creation of an interaction term between group assignment and ability level. DIFdetect will do this automatically. Once this interaction term is created, the following model is fit to the data:

ologit itemresponse abilitylevel groupassignment abilitylevel*groupassignment (1).

The first variable after the command name (ologit) is the dependent (left-hand side) variable; the command is equivalent to the equation:

f(itemresponse) = cut + b1*abilitylevel + b2*groupassignment + b3*abilitylevel*groupassignment (2)

where the function f on the left is the cumulative logit function and cut stands for the cutpoints (thresholds) between adjacent response categories. For details, refer to McCullagh and Nelder (1989).

Assessing non-uniform DIF therefore amounts to assessing the statistical significance of the interaction term between group assignment and ability level. The null hypothesis tested is H0: b3 = 0. If the estimate of b3 differs significantly from 0, then H0 is rejected, the interaction term is taken to be a statistically significant predictor of the item responses, and non-uniform DIF is found for the item in question.
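The role of b3 can be illustrated with a small Python sketch. It assumes that groupassignment is coded 0/1 (an assumption made here for illustration, not something stated above), and the coefficient values are hypothetical. Under equation (2), the difference between the two groups at a given ability level is b2 + b3*abilitylevel, so the group effect changes with ability only when b3 is not 0.

```python
def group_effect(b2, b3, ability):
    """Difference in the right-hand side of equation (2) between
    groupassignment = 1 and groupassignment = 0 at one ability level.
    Assumes 0/1 coding of groupassignment (illustrative assumption)."""
    return b2 + b3 * ability


# Hypothetical coefficient values, for illustration only.
b2, b3 = 0.40, -0.25

for ability in (-2.0, 0.0, 2.0):
    # With b3 != 0 the group difference varies across ability levels,
    # which is what "non-uniform" DIF refers to.
    print(ability, group_effect(b2, b3, ability))
```

With b3 set to 0 the function returns b2 at every ability level, so the group difference is constant across ability.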

DIFdetect offers two options for the detection of non-uniform DIF, related to different ways of assessing the statistical significance of the interaction term. We recommend that both techniques be used, and in our own data analysis we have found that the two techniques rarely provide different results.

The first technique is to examine the p-value associated with the b3 coefficient, which reflects the size of the b3 coefficient relative to its standard error (a Wald test). We recommend an alpha level of 0.05; that is, a p-value below 0.05 should be accepted as evidence of statistically significant non-uniform DIF.
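As a rough illustration of the first technique, the following Python sketch computes a two-sided p-value from a coefficient and its standard error using the standard normal distribution. The function name and the numeric inputs are hypothetical; this is not DIFdetect's own code, only a sketch of the usual Wald calculation.

```python
import math


def wald_p_value(coef, std_err):
    """Two-sided p-value for H0: coefficient = 0, based on
    z = coef / std_err and the standard normal distribution."""
    z = coef / std_err
    # 2 * (1 - Phi(|z|)) is equal to erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical estimate of b3 and its standard error.
p = wald_p_value(0.30, 0.12)
print(p, p < 0.05)
```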

The second technique is to compare the negative 2 log likelihood of model (1) above with the negative 2 log likelihood of the model without the interaction term, that is to say, the model:

ologit itemresponse abilitylevel groupassignment (3).

The difference between the two negative 2 log likelihood values is then compared with a chi-squared distribution with 1 degree of freedom (a likelihood ratio test) to test whether the interaction term explains additional variation in the itemresponse variable. Again, we recommend an alpha level of 0.05 for this test.
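The second technique can be sketched in Python as follows. The sketch assumes that the log likelihoods of the model with the interaction term and of the model without it have already been obtained from the fitting software; the function name and the example values are hypothetical. For 1 degree of freedom, the chi-squared tail probability can be written with the complementary error function, so no external library is needed.

```python
import math


def lr_test_p_value(ll_with_interaction, ll_without_interaction):
    """Likelihood ratio test with 1 degree of freedom.
    Inputs are log likelihoods (not yet multiplied by -2)."""
    # Difference of the two negative 2 log likelihood values.
    stat = (-2.0 * ll_without_interaction) - (-2.0 * ll_with_interaction)
    stat = max(stat, 0.0)  # guard against tiny negative rounding error
    # For 1 df: P(chi2 > stat) = erfc(sqrt(stat / 2)).
    return stat, math.erfc(math.sqrt(stat / 2.0))


# Hypothetical log likelihoods for models (1) and (3).
stat, p = lr_test_p_value(-512.3, -515.1)
print(stat, p, p < 0.05)
```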

If the two techniques for measuring the significance of the interaction term provide disparate results, we would lean toward the second technique. In most situations, however, we suspect that the two techniques will provide very similar answers.

