Thurstonian models for preference judgements

Ipsative (or relative-to-self) questionnaires ask respondents to compare sets of two or more stimuli from the same domain, such as behaviours, values or interests. For example, to measure occupational interests, respondents may be asked to indicate their preference among activities such as (A) planting roses, (B) receiving telephone calls, or (C) building bridges. Preferences can be expressed as rank orders (e.g. A>C>B), graded in terms of strength (e.g. ‘prefer a little’ A to C, or ‘prefer a lot’ C to B), or given as percentages of a total (e.g. 50-40-10). These forced-choice, graded preference and compositional judgements, respectively, readily reflect intra-individual differences (comparisons of attributes within the same person), but until recently they were of severely limited use for measuring inter-individual differences (comparisons of attributes between different people).
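
To make the three formats concrete, here is a minimal sketch of how the same block of three activities might be recorded under each format (illustrative Python data structures only; the labels and encodings are mine, not taken from any published instrument):

```python
# One block of three activities: A = planting roses,
# B = receiving telephone calls, C = building bridges.

# Forced-choice ranking: A > C > B, stored as ranks (1 = most preferred)
ranking = {"A": 1, "C": 2, "B": 3}

# Graded preferences: strength of preference within each pair
graded = {
    ("A", "C"): "prefer a little",   # A preferred a little over C
    ("C", "B"): "prefer a lot",      # C preferred a lot over B
    ("A", "B"): "prefer a lot",
}

# Compositional: percentages of a fixed total (must sum to 100)
compositional = {"A": 50, "C": 40, "B": 10}
```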

I have developed and published a unified methodology that enables proper scaling of preference data and allows ipsative questionnaires to be used for the measurement of individual differences (i.e. making inter-individual comparisons, for example in personnel assessment). The underpinning research has been documented in academic publications in peer-reviewed journals and handbooks (see the reference list below).

Specifically, I developed a Thurstonian Item Response Theory (TIRT) model [4], which enabled proper scaling of multiple personal attributes from forced-choice questionnaires consisting of ranking blocks of any size (including full or partial ranking). The development of this model was carried out as part of my doctoral research under the supervision of Dr Alberto Maydeu-Olivares at the University of Barcelona. For this innovation, I received the Psychometric Society's prestigious Best Dissertation award for 2010. Special cases of the TIRT model for analysing paired comparison and ranking data that measure a single personal attribute (the unidimensional TIRT model) were also developed and illustrated with applications [2].
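
In simplified form (my notation; see [4] for the full model covering ranking blocks of any size), the TIRT model assumes that each item $i$ elicits a latent utility that is a linear function of the attribute it measures,

$$ t_i = \mu_i + \lambda_i \eta_a + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \psi_i^2), $$

and that a respondent prefers item $i$ to item $k$ whenever $t_i \ge t_k$. Coding each pairwise outcome as binary ($y_{ik} = 1$ if $i$ is preferred to $k$) yields a normal-ogive IRT model for the comparison,

$$ P(y_{ik} = 1 \mid \boldsymbol{\eta}) = \Phi\!\left( \frac{-\gamma_{ik} + \lambda_i \eta_a - \lambda_k \eta_b}{\sqrt{\psi_i^2 + \psi_k^2}} \right), $$

where $\gamma_{ik} = \mu_k - \mu_i$ is the comparison threshold, $\lambda_i$ and $\lambda_k$ are factor loadings, $\eta_a$ and $\eta_b$ are the attributes measured by items $i$ and $k$, and $\psi^2$ are the utility uniquenesses.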

Based on this methodology, two psychometric tests were initially developed: the OPQ32r published by SHL [1] and the free-to-use Forced-choice Five Factor markers [5]. The methodology was made available to the scientific community by publishing guidance on estimating model and person parameters under the TIRT model using the software Mplus [7], and by providing macros to aid Mplus syntax creation [6]. To further disseminate the methodology, an application of the TIRT model to personnel assessment was illustrated in [8].
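
Whatever software is used for estimation, the key data-preparation step is the same: each ranking block is recoded into binary outcomes of all pairwise comparisons among its items. A minimal sketch of this step (the function name is mine, not part of the published macros):

```python
from itertools import combinations

def rank_block_to_binary(ranks):
    """Recode one ranking block into binary pairwise outcomes.

    ranks: dict mapping item label -> rank (1 = most preferred).
    Returns a dict mapping (i, k) -> 1 if item i is preferred to
    item k, else 0. A block of n items yields n*(n-1)/2 outcomes.
    """
    items = sorted(ranks)
    return {(i, k): int(ranks[i] < ranks[k])
            for i, k in combinations(items, 2)}

# Example: the ranking A > C > B becomes three binary outcomes
print(rank_block_to_binary({"A": 1, "B": 3, "C": 2}))
# {('A', 'B'): 1, ('A', 'C'): 1, ('B', 'C'): 0}
```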

To integrate the TIRT approach into a wider family of models for forced-choice data, I developed a common framework [12], which enabled classification of all models for choice behaviour, and established the fundamental rules for identifying the scale origin (the basis of inter-personal comparability) from forced-choice data. These rules apply not only to the TIRT models, which assume a “dominance” process for relationships between items and test scores, but also to models that assume the so-called “ideal-point” process. In addition to guidelines for creating forced-choice assessments that enable inter-individual comparisons, I provided guidelines for writing informative and valid items for both the “dominance” [2, 4] and “ideal-point” models [2] within these assessments. These test creation and item writing guidelines, as well as practical guides to test scoring using item response theory, were further developed and summarised in encyclopaedia and edited-book chapters [10, 11, 16].
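
In simplified unidimensional form (my notation, not reproduced verbatim from [12]), the two processes differ in how an item's latent utility relates to the attribute $\eta$:

$$ \text{dominance:} \quad t_i = \mu_i + \lambda_i \eta + \varepsilon_i, \qquad \text{ideal point:} \quad t_i = \mu_i - (\eta - \delta_i)^2 + \varepsilon_i. $$

Under a dominance process, the utility of an item increases (or decreases) monotonically with the attribute; under an ideal-point process it is single-peaked, highest for respondents whose attribute level is closest to the item's “ideal point” $\delta_i$.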

Further, I developed a latent class approach to analysing forced-choice data, in which categories or classes, rather than continuous personal attributes, are assumed to underlie choices between alternatives [9]. This approach enables personality assessment according to the “type” rather than the “trait” paradigm.
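
In outline (a generic statement of the latent class idea; [9] gives the specific model), the probability of an observed response pattern $\mathbf{y}$ is a mixture over $C$ classes,

$$ P(\mathbf{y}) = \sum_{c=1}^{C} \pi_c \prod_{l} P(y_l \mid c), $$

where $\pi_c$ is the proportion of respondents in class $c$ and the choices $y_l$ are assumed conditionally independent given class membership.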

Later, I developed a unified approach to analysing any type of ipsative data. Two important instances of this approach were addressed by developing methods for scaling compositional formats [13] and for scaling graded preferences [17]. Within the latter development, I provided calculations of test information and reliability that apply to graded preferences, and to binary (forced-choice) preferences as their special case [17].
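
In standard IRT terms (a generic statement; the expressions specific to graded preferences are derived in [17]), the precision of a latent attribute estimate follows from the test information function $I(\eta)$, and a marginal (empirical) reliability can be computed from the resulting standard errors:

$$ SE(\hat{\eta}) = \frac{1}{\sqrt{I(\hat{\eta})}}, \qquad \bar{\rho} = 1 - \frac{\overline{SE^2(\hat{\eta})}}{\operatorname{Var}(\eta)}, $$

where the average is taken over the sample and $\operatorname{Var}(\eta) = 1$ in the standardised metric.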

My further applied research has informed the use of comparative judgements to increase the validity of computer-adaptive personality testing [14] and of organisational appraisals [15].

This research has made a significant economic and societal impact. Today, the Thurstonian methodology is widely used in questionnaire development to reduce response biases, socially desirable responding and faking. An ever-increasing number of Thurstonian-based assessments assess several million candidates per year, in at least 37 languages and over 40 countries. Work in this area formed the basis for a REF 2021 impact case study. For my continued engagement with non-academic partners to bring this cutting-edge methodology to industry and practice, I was awarded the University of Kent Knowledge Exchange Collaboration Prize 2022.

References to research, inventions, software and products

  1. Brown, A. & Bartram, D. (2009-2011). OPQ32r Technical Manual. Surrey, UK: SHL Group.
  2. Maydeu-Olivares, A. & Brown, A. (2010). Item response modeling of paired comparison and ranking data. Multivariate Behavioral Research, 45, 935-974. DOI: 10.1080/00273171.2010.531231
  3. Brown, A. & Maydeu-Olivares, A. (2010). Issues not to be overlooked in the dominance vs. ideal point controversy. Industrial and Organizational Psychology, 3, 489-493. DOI: 10.1111/j.1754-9434.2010.01277.x
  4. Brown, A. & Maydeu-Olivares, A. (2011). Item response modeling of forced-choice questionnaires. Educational and Psychological Measurement, 71(3), 460-502. DOI: 10.1177/0013164410375112
  5. Brown, A. & Maydeu-Olivares, A. (2011). Forced-choice Five Factor markers. Retrieved from PsycTESTS. DOI: 10.1037/t05430-000
  6. Brown, A. (2012). Mplus syntax builder for testing forced-choice data with the Thurstonian IRT model. Software and User’s Guide. Retrieved from http://annabrown.name/software
  7. Brown, A. & Maydeu-Olivares, A. (2012). Fitting a Thurstonian IRT model to forced-choice data using Mplus. Behavior Research Methods, 44, 1135-1147. DOI: 10.3758/s13428-012-0217-x
  8. Brown, A. & Maydeu-Olivares, A. (2013). How IRT can solve problems of ipsative data in forced-choice questionnaires. Psychological Methods, 18(1), 36-52. DOI: 10.1037/a0030641
  9. Van Dam, N. T., Brown, A., Mole, T. B., Davis, J. H., Britton, W. B., & Brewer, J. A. (2015). Development and validation of the Behavioral Tendencies Questionnaire. PLOS ONE, 10(11), e0140867. DOI: 10.1371/journal.pone.0140867
  10. Brown, A. (2015). Personality assessment, forced-choice. In Wright, J. D. (Ed.), International Encyclopedia of the Social and Behavioral Sciences (2nd ed.). Elsevier. DOI: 10.1016/B978-0-08-097086-8.25084-8
  11. Brown, A. & Croudace, T. (2015). Scoring and estimating score precision using multidimensional IRT. In Reise, S. P. & Revicki, D. A. (Eds.), Handbook of Item Response Theory Modeling: Applications to Typical Performance Assessment (Multivariate Applications Series, pp. 307-333). New York: Routledge/Taylor & Francis Group.
  12. Brown, A. (2016). Item response models for forced-choice questionnaires: A common framework. Psychometrika, 81(1), 135-160. DOI: 10.1007/s11336-014-9434-9
  13. Brown, A. (2016). Thurstonian scaling of compositional questionnaire data. Multivariate Behavioral Research, 51(2-3), 345-356. DOI: 10.1080/00273171.2016.1150152
  14. Lin, Y. & Brown, A. (2017). Influence of context on item parameters in forced-choice personality assessments. Educational and Psychological Measurement, 77(3), 389-414. DOI: 10.1177/0013164416646162
  15. Brown, A., Inceoglu, I., & Lin, Y. (2017). Preventing rater biases in 360-degree feedback by forcing choice. Organizational Research Methods, 20(1), 121-148. DOI: 10.1177/1094428116668036
  16. Brown, A. & Maydeu-Olivares, A. (2018). Modeling forced-choice response formats. In Irwing, P., Booth, T. & Hughes, D. (Eds.), The Wiley Handbook of Psychometric Testing: A Multidisciplinary Reference on Survey, Scale and Test Development (pp. 523-570). London: John Wiley & Sons.
  17. Brown, A. & Maydeu-Olivares, A. (2018). Ordinal factor analysis of graded-preference questionnaire data. Structural Equation Modeling: A Multidisciplinary Journal, 25(4), 516-529. DOI: 10.1080/10705511.2017.1392247
