Institute of Education

The Best Method for Making Language Testing Faster and More Accurate

Researchers have found that language proficiency tests can be shortened significantly if ability is estimated from responses with a more effective statistical method. An international team of scholars, including a researcher from the Institute of Education, demonstrated this using data from around 3,000 students who took English listening tests.

Elena Kardanova, the Scientific Supervisor at the IOE Centre for Psychometrics and Measurement in Education, is one of the co-authors of the article ‘A comparison of the stability of ability parameter estimation based on maximum likelihood and Bayesian estimation: A case study of dichotomous scoring test results’. The researchers compared two common approaches to estimating participants’ abilities on a test measuring English listening ability.

Language assessment usually relies on the maximum likelihood method, which estimates a student’s ability as the level at which the observed pattern of responses is most likely. While this approach works reasonably well for long tests, it becomes less reliable when tests are shortened: students with similar language skills may receive different scores because of random factors in their responses.

In contrast, the researchers demonstrated that a Bayesian method—which considers not only an individual’s responses, but also the distribution of ability levels within a larger population—yields more consistent results. This approach reduces the influence of random guessing and remains stable even when the number of test items is reduced. 
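The difference between the two estimators can be illustrated with a minimal sketch. It assumes a Rasch (one-parameter logistic) model for right/wrong answers, a standard normal distribution of ability in the population, and the posterior mean (EAP) as the Bayesian estimate. The article does not say which model, prior or Bayesian estimator the researchers used, so all function names, item difficulties and the ability grid below are hypothetical.

```python
import math

# Candidate ability values from -6.0 to 6.0 (hypothetical grid).
GRID = [i / 10 for i in range(-60, 61)]

def p_correct(theta, b):
    """Rasch model: probability of a correct answer given ability and difficulty."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def log_likelihood(theta, responses, difficulties):
    """Log-probability of the observed right/wrong pattern at ability theta."""
    total = 0.0
    for x, b in zip(responses, difficulties):
        p = p_correct(theta, b)
        total += math.log(p) if x == 1 else math.log(1.0 - p)
    return total

def ml_estimate(responses, difficulties):
    """Maximum likelihood: the grid value that makes the responses most likely."""
    return max(GRID, key=lambda t: log_likelihood(t, responses, difficulties))

def eap_estimate(responses, difficulties):
    """Bayesian estimate: posterior mean under an assumed N(0, 1) population prior."""
    weights = [
        math.exp(log_likelihood(t, responses, difficulties) - 0.5 * t * t)
        for t in GRID
    ]
    return sum(t * w for t, w in zip(GRID, weights)) / sum(weights)

# A short test answered perfectly: the likelihood keeps growing with ability,
# so the ML estimate runs to the edge of the grid, while the prior keeps the
# Bayesian estimate at a finite, moderate value.
difficulties = [-1.0, -0.5, 0.0, 0.5, 1.0]
responses = [1, 1, 1, 1, 1]
print("ML :", ml_estimate(responses, difficulties))
print("EAP:", round(eap_estimate(responses, difficulties), 2))
```

The perfect-score case is chosen because it shows the instability most clearly: with few items, extreme response patterns are common, and the likelihood alone gives no finite answer for them.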

The team used Monte Carlo simulations to compare the precision of the two methods and found that the Bayesian method delivered notably lower estimation errors and more stable estimates across different test lengths, including short tests.
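A Monte Carlo comparison of this kind can be sketched in a few lines. This is a toy setup, not the study's design: it assumes a Rasch model, abilities drawn from a standard normal distribution, evenly spaced item difficulties, and the posterior mean as the Bayesian estimate. The function name `simulate_rmse` and all parameter values are hypothetical, and the numbers it prints are not expected to match the published results.

```python
import math
import random

GRID = [i / 10 for i in range(-40, 41)]  # candidate abilities from -4.0 to 4.0

def simulate_rmse(n_items, n_examinees=2000, seed=0):
    """Return (ML RMSE, Bayesian EAP RMSE) for a simulated test of n_items."""
    rng = random.Random(seed)
    # Hypothetical item difficulties, evenly spaced between -2 and 2.
    difficulties = [-2 + 4 * i / (n_items - 1) for i in range(n_items)]
    # Under the Rasch model the log-likelihood depends on the responses only
    # through the raw score r: log L(theta) = theta * r - C(theta) + const,
    # where C(theta) = sum over items of log(1 + exp(theta - b)).
    c = [sum(math.log1p(math.exp(t - b)) for b in difficulties) for t in GRID]
    log_prior = [-0.5 * t * t for t in GRID]  # assumed N(0, 1) prior

    ml_sq_err = 0.0
    eap_sq_err = 0.0
    for _ in range(n_examinees):
        theta = rng.gauss(0.0, 1.0)  # simulated true ability
        raw_score = sum(
            rng.random() < 1.0 / (1.0 + math.exp(-(theta - b)))
            for b in difficulties
        )
        log_lik = [t * raw_score - ct for t, ct in zip(GRID, c)]
        # ML: grid point with the highest likelihood (all-correct and
        # all-wrong patterns end up at the edge of the grid).
        ml = GRID[max(range(len(GRID)), key=log_lik.__getitem__)]
        # EAP: mean of the posterior (likelihood times prior) over the grid.
        log_post = [ll + lp for ll, lp in zip(log_lik, log_prior)]
        top = max(log_post)
        weights = [math.exp(v - top) for v in log_post]
        eap = sum(t * w for t, w in zip(GRID, weights)) / sum(weights)
        ml_sq_err += (ml - theta) ** 2
        eap_sq_err += (eap - theta) ** 2
    return (math.sqrt(ml_sq_err / n_examinees),
            math.sqrt(eap_sq_err / n_examinees))

for n in (20, 40):
    ml_rmse, eap_rmse = simulate_rmse(n)
    print(f"{n} items: ML RMSE = {ml_rmse:.3f}, EAP RMSE = {eap_rmse:.3f}")
```

Because the simulated abilities are drawn from the same distribution that the Bayesian estimator assumes, this sketch favours the Bayesian method by construction; a study with real test-takers has to check how well that assumption holds.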

‘With 20 questions, the Bayesian method is almost 30 times more reliable than the maximum likelihood method,’ explains Elena Kardanova. ‘However, when the test is increased to 40 questions, both approaches produce comparable accuracy results. In other words, the advantage of the Bayesian method disappears in long tests. Such tests are rarely used in practice because they are too long for students and too costly for organisers.’

This approach saves language centres a significant amount of time. For example, if a centre tests around 300 students per month and reduces the test duration from 40 minutes to 20 minutes, it frees up approximately 100 hours of testing time each month. This could mean either accommodating more participants with the same resources, or reducing costs for classrooms and staff. In the context of mass testing, even a few minutes saved per test can quickly add up to tens of hours.
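The arithmetic behind the example is simple enough to check for any centre. The helper below is a hypothetical illustration that counts total test-taking time only, which is an assumption about how the saving is measured; if students are tested in groups, the room and staff time saved would be lower.

```python
def hours_saved_per_month(students_per_month, old_minutes, new_minutes):
    """Total test-taking time freed up each month, in hours."""
    minutes_saved = students_per_month * (old_minutes - new_minutes)
    return minutes_saved / 60

# 300 students, each saving 20 minutes: 6,000 minutes, or 100 hours.
print(hours_saved_per_month(300, 40, 20))
```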
