Reflections and comments on ACER’s Next steps: measuring reading progress
Strong governments and strong global institutions are important for defining, monitoring and addressing inequality in education. These three policy activities are linked by policy narratives that need to be strong, coherent and consistent to garner global legitimacy. The efforts of ACER’s Centre for Global Education Monitoring are to be commended on this front.
Work on the United Nations’ Sustainable Development Goals (SDGs) goes back to 1972, and its 2030 agenda focusing on Poverty, Food, Health, Education, Gender and Water is laudable and an agenda to which we could all agree. However, agreement on implementation requires the legitimation of a stronger narrative, and there are elements of the ACER approach that I would like to explore in this blog.
There are many things to like about ACER’s approach: the use of Item Response Theory (IRT) to develop a commonly agreed scale is one of them. IRT is a proven methodology for system and national evaluation, even though the methodology becomes suspect at the school, class and student levels. The other welcome element is the use of pairwise comparisons in the development of content. The recent increase in the use of teacher-based pairwise comparison is welcome because it re-engages the teaching profession with scale formation, an engagement that has atrophied over recent decades due to the use of IRT scaling methodologies. However, in the ACER proposal, teacher engagement seems limited to pairwise comparisons in preliminary item selection, and does not seem to extend to international agreement on content.
Where the proposal is likely to encounter legitimation issues is in the hypothesis that educational skills are universal across the target countries and able to be described on a common scale. Sure, technically this can be done: I’ve rarely seen any test data that doesn’t scale, and where some items don’t scale properly these can be removed for ‘mysterious item reasons’. However, there is bound to be concern around the legitimacy of claims about the universality of scales developed in this manner.
As I have argued elsewhere [on a unifying principle], the notion of being able to universally ‘identify where a student is’ is problematic. There are many ways of describing this issue. One way is to say that it’s too Kantian and ignores the work of Hegel in showing that knowledge is historically and socially located, and the work of Marxists that shows that formulations of knowledge can reinforce disadvantage. Another way is to describe the approach as too metaphysical, presupposing a universal Cartesian space in which students can be located. Realism is yet another word that comes to mind: an approach that assumes that what IRT measures actually exists in reality. Again, as I have argued elsewhere [constellation and continuum], the continuum metaphor is only one way to describe learning progress. So the observation that ‘progression occurs in a somewhat lumpy way’ is more than likely a reflection of the IRT model or metaphor, and not a phenomenon from the underlying reality of learning. This is not to discredit the validity of the IRT model or results derived from it for the purpose of international evaluation; it simply questions the universality of any claims made.
An alternative to presupposing universal realism across nations and cultures on matters such as reading and mathematics is to develop a procedure for SDG countries to agree on what is common to all with respect to these content areas, to create a common scale around that agreed content, and then to report explicitly to that effect. That is, report that the scales represent what has been agreed to be common, and not what is considered universal and enduring. The claims to universality, along with the described content methodology, could be characterised as cultural appropriation followed by cultural imperialism. Such an approach is likely to meet with resistance from teachers and the like at some point. People are social and cultural beings who use language to express themselves socially and culturally. Of course reading progress is important to these expressions and for prosperity, but these expressions are also specific to each cultural context and not a universal function of language.
French President Charles de Gaulle’s famous 1962 observation, “How can you govern a country which has two hundred and forty-six varieties of cheese?”, provides a good example of where language equivalence does not mean cultural equivalence. Cheese (Australian), kaas (Dutch) and fromage (French) are language equivalents, but Australians have Cheddar and Tasty, the Dutch have Edam and Gouda, and the French have a much broader variety. Claims to social and cultural equivalence based on the simple language equivalence of ‘cheese’ are therefore likely to meet resistance. Reporting with claims to universality based on assessments that are only linguistically equivalent could therefore be perceived through a hegemonic narrative instead of the emancipatory one that is being sought by the UN.
It is difficult to know the status of the paper on which I’m commenting (research or marketing). It describes a comprehensive and worthwhile exercise, but it will require comprehensive consultation and discourse among target countries to develop legitimate measures that are acceptable to all.