In this post, we introduce a polytomous scoring approach based on the optimal use of item response time. This approach provides an easy and practical way to deal with not-reached items in low-stakes assessments. First, we describe how the polytomous scoring approach works and then demonstrate how to implement this approach using R.
Low-stakes assessments (e.g., formative assessments and progress monitoring measures in K-12) usually have no direct consequences for students. Therefore, some students may not show effortful response behavior when attempting the items on such assessments and may leave some items unanswered. These items are typically referred to as not-reached items. For example, some students may try to answer all of the items rapidly and complete the assessment in an unrealistically short amount of time. Conversely, some students may spend an unrealistically long time on each item and thus fail to answer all of the items within the allotted time. Furthermore, students may leave items unanswered due to test speededness, that is, the situation where the allotted time does not allow a large number of students to fully consider all items on the assessment (Lu & Sireci, 2007).
In practice, not-reached items are often treated as either incorrect or not-administered (i.e., NA) when estimating item and person parameters. However, when the proportion of not-reached items is high, these approaches may yield biased parameter estimates and thereby threaten the validity of assessment results. To date, researchers have proposed various model-based approaches to deal with not-reached items, such as modeling valid responses and not-reached items jointly in a tree-based item response theory (IRT) model (Debeer et al., 2017) or modeling proficiency and the tendency to omit items as distinct latent traits (Pohl et al., 2014). However, these models are typically complex and not easy to use in operational settings.
The response time spent on each item in an assessment is often considered a strong proxy for students’ engagement with the items (Kuhfeld & Soland, 2020; Pohl et al., 2019; Rios et al., 2017). Several researchers have demonstrated the utility of response times in reducing the effects of non-effortful response behavior such as rapid guessing (e.g., Kuhfeld & Soland, 2020; Pohl et al., 2019; Wise & Kong, 2005). By identifying and removing responses where rapid guessing occurred, the accuracy of item and person parameter estimates can be improved without having to apply a complex model-based approach.
In this post, we will demonstrate an alternative scoring method that considers not only students who guess rapidly but also students who spend too much time on each item and thereby leave many items unanswered. In the following sections, we will briefly describe how our scoring approach works and then demonstrate the approach using R.
In our recent study (Gorgun & Bulut, 2021)¹, we proposed a new scoring approach that uses item response time to transform dichotomous responses into polytomous responses. With our scoring approach, students can receive partial credit on their responses depending on the optimality of their response behavior in terms of response time. This approach combines speed and accuracy in the scoring process to alleviate the negative impact of not-reached items on the estimation of item and person parameters.
To conceptualize our scoring approach, we introduce the term optimal time, which refers to spending a reasonable amount of time when responding to an item. Optimal time allows us to distinguish between students who spend optimal time but miss the item and students who spend too much time on the item and yet answer it incorrectly. By using item response time, we group students into three categories: optimal time users, rapid guessers, and slow respondents.
If an assessment is timed, students are expected to adjust their speed to attempt as many items as possible within the allotted time. Therefore, spending too little time (rapid guessers) or too much time (slow respondents) on a given item can be considered an outcome of disengaged response behavior. Our scoring approach assigns partial credit to optimal time users who answer an item incorrectly. These students use their time optimally so that they can answer most (or all) of the items.
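As a rough illustration of the three categories, the sketch below labels the response times for a single item given two cut-off values. The helper name `classify_time`, the made-up response times, and the cut-offs of 10 and 70 seconds are all hypothetical and chosen only for illustration. Treating the cut-offs themselves as part of the optimal interval is also an assumption.

```r
# Hypothetical helper: label each response time as rapid, optimal, or slow.
# `lower` and `upper` are assumed cut-off values (in seconds) for one item;
# times exactly equal to a cut-off are counted as optimal (an assumption).
classify_time <- function(rt, lower, upper) {
  ifelse(is.na(rt), NA_character_,
         ifelse(rt < lower, "rapid guesser",
                ifelse(rt > upper, "slow respondent", "optimal time user")))
}

# Made-up response times (in seconds) for one item; NA marks a not-reached item
rt <- c(3, 25, 40, 95, NA)
classify_time(rt, lower = 10, upper = 70)
```

With these illustrative values, the first student would be flagged as a rapid guesser, the fourth as a slow respondent, and the not-reached response stays missing.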
The polytomous scoring approach can be implemented using the following steps:
1. We separate response times for correct and incorrect responses and then find two median response times for each item: one for correct responses and another for incorrect responses. We use the median rather than the mean because it is less sensitive to outliers in the response time distribution.
2. We use the normative threshold (NT) approach (Wise & Kong, 2005) to find two cut-off values that divide the response time distribution into three regions: rapid guessers, optimal time users, and slow respondents. For example, we can use 25% and 175% of the median response times as the lower and upper bounds of the optimal time interval².
3. After finding the cut-off values for the response time distributions for each item, we select a scoring range of 0 to 3 points or 0 to 4 points.
4. We determine how to deal with not-reached items. We can choose to treat not-reached items as either not-administered (i.e., NA) or incorrect.
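Before turning to the real data, the steps above can be sketched for a single item using simulated responses. Everything in this sketch is illustrative: the data are made up, the object names are hypothetical, and the 0 to 3 mapping shown in the comments is one plausible way of combining accuracy and optimal time, assumed here rather than taken from the study.

```r
set.seed(123)

# Simulated (made-up) data for one item: 200 students,
# dichotomous responses and response times in seconds
n <- 200
resp <- rbinom(n, size = 1, prob = 0.6)
rt <- round(rlnorm(n, meanlog = 3.5, sdlog = 0.6), 1)

# Mark the last 20 students as not having reached the item
resp[181:200] <- NA
rt[181:200] <- NA

# Step 1: median response time, separately for correct and incorrect responses
med_correct <- median(rt[resp == 1], na.rm = TRUE)
med_incorrect <- median(rt[resp == 0], na.rm = TRUE)

# Step 2: normative thresholds at 25% and 175% of the relevant median
med <- ifelse(resp == 1, med_correct, med_incorrect)
optimal <- rt >= 0.25 * med & rt <= 1.75 * med

# Step 3 (assumed mapping, 0 to 3 points):
#   3 = correct, optimal time;   2 = correct, non-optimal time;
#   1 = incorrect, optimal time; 0 = incorrect, non-optimal time
score <- 2 * resp + as.integer(optimal)

# Step 4: treat not-reached items either as NA (kept as is) or as incorrect (0)
score_na <- score
score_incorrect <- ifelse(is.na(score), 0, score)

table(score_na, useNA = "ifany")
```

Keeping both `score_na` and `score_incorrect` makes it easy to compare how the two treatments of not-reached items affect later parameter estimates.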
Now, let’s see how the polytomous scoring approach works in R.
To illustrate the polytomous scoring approach, we use response data from a sample of 5000 students who participated in a hypothetical assessment with 40 items. In the response data,
The data also include students’ response times (in seconds) for each item. The data are available as a comma-separated values file (dichotomous_data.csv) here.
Now let’s import the data into R and view its content.