Investigating the treatment of missing data in an Olympiad-type test – the case of the selection validity in the South African Mathematics Olympiad

Authors: Caroline Long, Johann Engelbrecht, Vanessa Scherman and Tim Dunne

The objectives of the South African Mathematics Olympiad (SAMO) are to generate enthusiasm and interest in mathematics; to enrich the study of mathematics; to promote mathematical problem-solving proficiency; to equip contestants for university level mathematical thinking; and to identify and inform selection of the finest young mathematical minds for international competition. However, the validity of this type of test has been questioned due to the following aspects:

The selection process is not natural, as the competitive environment pressurizes learners. Learners will not perform as well as they would in a relaxed classroom environment.
Do the psychometric properties truly measure mathematical excellence?
Are the design features, administration, scoring and analysis procedures really supportive of finding exceptional mathematical minds?
Educators are provided with the topics that are covered in the test. However, these topics may not be covered by the curriculum in time for the test, or at the level of the test. Hence, there could be a “lack of exposure to the particular problem-solving strategy required for a specific problem or an element of surprise not obvious at the time of setting the paper”.
The level of language used in the test might be difficult for learners to understand.

Analysis of missing data can assist in checking and ensuring the validity of South African Mathematics Olympiad testing. Missing data is common in research and refers to planned and desired information that is not available for examination and analysis. There are several reasons why data could be missing. These reasons may relate to the participants, the study design, or the interaction between the participants and the study design. A contestant could have missed an item, saved the item for later and then run out of time, or felt reluctant to answer the question. Three types of missing data were identified:

Missing completely at random (MCAR): for example, when a contestant accidentally skips an item.
Missing at random (MAR): for example, when a learner's general language proficiency, or the time available, is insufficient and items are left unanswered as a result.
Missing not at random (MNAR): here the omission stems from low contestant proficiency or the difficulty of the test. Because of negative marking, learners who are unsure of an answer may opt not to respond.
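
The difference between the three types can be illustrated with a small simulation. The sketch below is not taken from the study: the function names, the 0-to-1 language score and every omission probability are assumptions chosen only to make the mechanisms visible. Under MCAR the chance of skipping an item is the same for everyone; under MAR it depends on something that is observed (here, language proficiency); under MNAR it depends on something that is not observed once the item is skipped (here, whether the answer would have been correct).

```python
import random


def simulate_missingness(n=2000, seed=0):
    """Simulate one item for n hypothetical contestants.

    All probabilities are illustrative assumptions, not SAMO estimates.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # Observed covariate: language proficiency on a 0-to-1 scale.
        language = rng.random()
        # Unobserved once the item is skipped: would the answer be right?
        would_be_correct = rng.random() < 0.5
        rows.append({
            "language": language,
            "would_be_correct": would_be_correct,
            # MCAR: the same omission probability for every contestant.
            "mcar": rng.random() < 0.10,
            # MAR: omission depends only on the observed covariate.
            "mar": rng.random() < (0.40 if language < 0.5 else 0.05),
            # MNAR: omission depends on the unobserved outcome itself,
            # e.g. an unsure contestant avoiding a negative mark.
            "mnar": rng.random() < (0.05 if would_be_correct else 0.60),
        })
    return rows


def omit_rate(rows, mechanism, keep):
    """Share of omitted responses among the rows selected by keep()."""
    selected = [r for r in rows if keep(r)]
    return sum(r[mechanism] for r in selected) / len(selected)


rows = simulate_missingness()


def low_language(r):
    return r["language"] < 0.5


def high_language(r):
    return r["language"] >= 0.5


for mechanism in ("mcar", "mar", "mnar"):
    print(mechanism,
          round(omit_rate(rows, mechanism, low_language), 2),
          round(omit_rate(rows, mechanism, high_language), 2))
```

Comparing omission rates between the low and high language groups separates MCAR from MAR, because language is observed. MNAR cannot be detected this way from real data, since the quantity driving the omission is exactly what is missing; the simulation can show it only because it keeps the hidden value.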

The research found the test to be adequate. However, the following recommendations were made:

No a priori weightings of an item should contribute to the final score.
Reconsider the weighting of the answers.
Introduce a support programme to address problem-solving strategies.
Further research is needed on whether the positive benefits of participating outweigh possible demoralisation.
Rather than only the top 100, consider all contestants who ended up in the top 150 or 200.

