Track № 2: Constructing measures with item response modeling, UC Berkeley and ETS
Part 1
Instructors:
Mark Wilson & Karen Draney (University of California, Berkeley)
Abstract
The first part of the workshop will cover the fundamental tasks of measurement: the construction of meaningful latent variables, the creation of items and coding schemes to tap into those variables, the collection and analysis of responses to those items, and the analysis of evidence about the validity and reliability of the instruments used to measure those variables. In the second part, using the Rasch model, we will explain how item response models can help address these fundamental tasks, and how to use the ConQuest program to analyze the data and provide useful results.
The second part will conclude with a survey of extensions of the basic models to accommodate: polytomous items, measurement facets, Differential Item Functioning (DIF), latent regression, multidimensional models, and also, how to create new models using design matrices.
The lectures will be illustrated using examples of modeling and estimation from real data sets. The accompanying hands-on sessions will be coordinated to follow up on the topics of the lectures by illustrating (a) the practices of (i) scoring using a construct map and (ii) assessment moderation, and (b) how to use output from ConQuest to address issues of quality control.
Learning Objectives
- Participants will discuss the following topics and practise them during hands-on sessions:
1. Using the BEAR Assessment System to help make measurement meaningful:
a) Construct Maps;
b) Items Design;
c) Outcome space; and
d) Measurement model.
2. Phenomenography and Assessment Moderation.
3. The simple Rasch model as an example of a standard IRT model:
a) formal equations (including both dichotomous and polytomous forms);
b) expressing uncertainty using standard errors;
c) Wright Maps; and
d) evaluating fit.
4. Quality control: validity and reliability evidence using the Wright Map.
5. Software.
6. Extensions of the simple Rasch model.
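The simple Rasch model named in item 3 can be written in standard notation as follows (a sketch in common textbook form, not necessarily the instructors' exact formulation):

```latex
% Dichotomous Rasch model: probability that person p answers item i correctly,
% where \theta_p is person ability and \delta_i is item difficulty.
P(X_{pi} = 1 \mid \theta_p, \delta_i)
  = \frac{\exp(\theta_p - \delta_i)}{1 + \exp(\theta_p - \delta_i)}
```

The polytomous (partial credit) form replaces the single difficulty \(\delta_i\) with step parameters \(\delta_{ik}\), one per score category boundary.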
Learning outcomes
Prerequisites
Part 2
Instructor:
Hongwen Guo, PhD, Educational Testing Service
Abstract
The workshop will introduce item response theory (IRT), which encompasses a group of probabilistic measurement models widely used in standardized testing programs. We will discuss the foundations and assumptions underlying IRT, comparisons of various IRT models, applications of IRT to practical testing situations, and implementation of IRT using the statistical computing language R and some real data. Some recent trends in educational measurement, such as process data analysis, adaptive testing, and diagnostic modeling, will also be considered.
Learning Objectives
a) Understand the principles of item response theory (IRT) as a modern and comprehensive psychometric framework
b) Understand the fundamental estimation methods used in IRT
c) Become familiar with some of the most common IRT models
d) Recognize various practical applications of IRT
e) Appreciate the relationship between theory and practice in the testing industry
f) Gain exposure to some recent trends in educational measurement
g) Experience hands-on modeling of educational data using the open source R language and R packages
Learning outcomes
a) Know the fundamental estimation methods of IRT
b) Be able to apply IRT knowledge in practice
c) Be able to work in the R programming environment
Prerequisites
The workshop will utilize the R language to analyze and model educational data. The workshop will be taught in a mixed format: lectures & computing labs. Prerequisites include basic knowledge of the R language, statistics and probability, or the instructor’s permission.
Schedule for Track № 2
Day 1.
1. Brief history of measurement
2. Overview of our method with some examples and software (BASS)
3. Construct Maps
4. Items design & Outcome space
Day 2.
5. Measurement model
6. Reliability & Validity evidence
7. Some extensions of the measurement model (e.g. latent regression, DIF)
8. Multidimensionality (between-item and within-item)
Day 3.
9. Brief overview of ConQuest
10. Design matrices
11. Learning progressions
12. Foundations of IRT Models & R programming
Day 4.
13. IRT Models (1PL: Estimation & Applications with R)
14. Comparison of IRT Models (1/2/3PL & Applications with R)
15. IRT Models (Polytomous data: GPCM, NRM & Applications with R)
16. Test Analysis (model checking and applications in practice)
Day 5.
17. DIF & timing data (with real data applications & R packages)
18. Test Motivation (for low-stakes assessment & applications with R)
19. Computerized Adaptive Testing (basics & R packages)
20. MIRT & CDM (basics & R packages)
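The 1PL/2PL/3PL comparison on Day 4 concerns the standard logistic IRT family, which in common notation (a sketch in textbook form, not taken from the workshop materials) can be written as:

```latex
% 3PL model: probability of a correct response to item i at ability \theta,
% with discrimination a_i, difficulty b_i, and lower asymptote (guessing) c_i.
P_i(\theta) = c_i + \frac{1 - c_i}{1 + \exp\!\bigl(-a_i(\theta - b_i)\bigr)}
```

Setting \(c_i = 0\) gives the 2PL model, and additionally fixing \(a_i = 1\) gives the 1PL (Rasch) model, which ties this half of the track back to Part 1.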