The possibility of a unifying principle for assessment

[Thank you to all those supporting me to date, in much greater numbers than I had expected. It’s a bit difficult to stick with my longish blogs. Today I’m sharing my pre-confirmation PhD thinking, so apologies for the dryness and density, but I feel that we need to go here to engage the neoliberal agenda. Lighter material to come later.]

Thought piece on Geoff N. Masters – Reforming Educational Assessment: Imperatives, principles and challenges


which metaphor for an assessment principle – constellation or continuum?

It is easy to agree with Geoff Masters (2013, p. 1) when he describes educational assessment as a field divided and in disarray. Educational assessment began by providing simple, reliable indicators to parents as well as to students for currency in the job and education markets. Assessment has now grown to encompass school and system evaluation as well as scientific research, with elements of quality management and market research creeping in. Data collection is moving from research as event to embedded and ongoing research through ubiquitous and unobtrusive data collection (Behrens & DiCerbo, 2014). This transition is blurring the demarcation between educational assessment and other forms of data collection.

While it’s easy to agree on the disarray, Masters’ unifying principle to address the chaos is problematic. Masters proposes that the fundamental purpose of assessment is to establish where learners are in their learning at the time of assessment (2013, p. 5), but this principle seems too attached to the objective measurement school and its philosophical stance.

The problem that Masters is sensibly trying to address is the division of contemporary assessment practice into competing approaches and paradigms: quantitative, qualitative, formative, summative and the like. Masters addresses this problem by suggesting a universal transcendent principle to underwrite all assessment practices. But for his principle to be unifying, universal and useful, it must be better than competing alternative ways of formulating a principle. So is Masters’ principle something we could all agree to over other contenders for universal principles? While I do not propose to proffer an alternative at this stage, let’s explore Masters’ principle a little further.

Masters’ unifying principle presupposes a learning space, either unidimensional (continuum) or multidimensional (continua), in which a student can be located at a particular point in time. The language of the principle is about mathematical space and location, and by incorporating this metaphor into a principle he seeks to subsume all assessment practices. His principle assumes that there is a true location at which each learner can be located at a point in time, and that once that location is determined the information can be used to fulfil all possible educational information purposes. So there are two issues: is the location metaphor the best way to describe contemporary assessment practices, and would a location, if it could be determined, be sufficient to meet all educational information needs?

The foundation for Masters’ principle appears to be the objective school of measurement with its Rasch-based and IRT-based models (e.g. see Embretson & Reise, 2000; Masters, 1982; Rasch, 1980). It is this school of measurement, with its concerns for true score and measurement error, that lends itself to the ‘where is the student’ metaphor. However, there are increasing calls for the use of other measurement models for which the ‘who is the student’ metaphor is probably more appropriate. Notable examples of this work include that of Mislevy as well as that of Leighton and Gierl (Almond, Mislevy, Steinberg, Yan, & Williamson, 2015; Leighton, Gierl, & Hunka, 2004; Leighton & Gierl, 2007, 2011). These alternative models, by moving away from the singular location metaphor, challenge the usefulness of Masters’ unifying principle.

There are several ways of describing and locating Masters’ unifying principle. One that comes to mind is that Masters takes a Kantian approach, with its focus on objective transcendence, presupposing learning as moving from location to location. From the objective measurement school this is couched as ‘the idea of the variable must transcend any particular set of observations and the measure on the variable must transcend the observed responses on which it is based’ (Wright & Stone, 1979, p. 141). Here, what counts as learning is seen as an a priori concept measured by the subject through empirical observation, with appropriate allowance for measurement error. Casting Masters’ approach as Kantian allows us to quickly sketch out a landscape of alternative foundations for a unifying principle.

Unlike Kant, Hegel took history into account. Where Kant thought he could say on purely philosophical grounds what human nature is and always must be, Hegel accepted that the human condition could change from one historical era to another (Singer, 2001, p. 13). The Hegelian notion of a dynamic history challenges the stability of Masters’ notion of ‘establish where learners are’, because this location is dependent on historical context. It’s then a fairly short leap to a Marxist critique of the principle: that any measure used to implement the principle could be biased against certain groups, although this could of course be mitigated by techniques such as differential item functioning (DIF) analysis. It is at this point that I find we can discard the claim that Masters’ principle is universal; it is at best a useful heuristic. This brief analysis points to the danger of basing principles on an instrumental technique, in this case the Rasch model. A principle should probably come before selecting a technical implementation.

When considering assessment from a Marxist perspective, and within the context of Lyotard’s (1984) analysis of knowledge, three further approaches become apparent. The first one is neoliberalism and its concern for performativity (Ball, 2003), which Lyotard (1984, p. 54) describes as being defined by an input/output ratio. Masters’ Rasch model provides a particular advantage here over other models such as Bayesian networks. As Masters has earlier stated, in order to enable quantitative comparisons, or make ratios, we need a linear scale that makes differences between persons the same whether measured through hard or easy items (Wright & Masters, 1982, p. 8). That is, the Rasch model’s ability to create linear scales dovetails neatly into neoliberalism’s need for ratios. Masters may therefore be inadvertently buttressing a neoliberal agenda with his unifying principle.
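For readers who want to see why the Rasch model yields this kind of linear scale, here is a minimal sketch using the standard dichotomous form of the model. The notation is my own assumption for illustration and is not quoted from Wright and Masters: \theta_n stands for the ability of person n and \delta_i for the difficulty of item i.

```latex
% Dichotomous Rasch model: probability that person n succeeds on item i
P(X_{ni} = 1) = \frac{\exp(\theta_n - \delta_i)}{1 + \exp(\theta_n - \delta_i)}

% Taking log-odds (logits) gives a scale that is linear in ability and difficulty
\ln \frac{P(X_{ni} = 1)}{1 - P(X_{ni} = 1)} = \theta_n - \delta_i

% Comparing persons 1 and 2 on the same item i: the item difficulty cancels
(\theta_1 - \delta_i) - (\theta_2 - \delta_i) = \theta_1 - \theta_2
```

Because \delta_i drops out of the last line, the difference between two persons on the logit scale is the same whether the item is hard or easy, which is the property that makes the scale so convenient for comparisons and ratios.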

Returning to Lyotard (1984), the Marxist agenda bifurcated around the time his book was published into what I characterise as post-structuralists and neo-modernists. On assessment, post-structuralists, due to their incredulity towards grand narratives (in particular those that involve numbers), continue to take a suspicious stance towards systems and system assessment. This stance has continued to grow since the early days of the Frankfurt School, in particular Marcuse and his notion of the Great Refusal (Marcuse, 1974, 2012). Post-structuralists therefore can find it difficult to engage with system assessment in a positive sense, but they have a lot to say about the lives of individuals within the lifeworld, which continues to be valuable for system assessment. Neo-modernists on the other hand, in the tradition of Habermas (1985, 1987), are simpatico with the petit narratives of the post-structuralists but engage more constructively with systems. Neo-modernists consider the system to have emancipatory potential while having a tendency to colonise the lifeworld of communities, a tendency that needs to be watched and mitigated through transparency and deliberate democratic processes. From a neo-modernist perspective, a principle should be based around what is sought to be achieved, what needs to be understood, or what needs to be coordinated across the system. A neo-modernist will nevertheless continue to embrace the objective measurement school strongly, because objective measurement has a strong ability to determine DIF, bias, and fairness. But objective measurement would not itself be the basis for a universal principle of assessment.

This author will continue to work in a modernist tradition towards one or more universal principles for assessment, to provide an alternative to Masters’ principle, which I consider too close to the neoliberal agenda.

Almond, R. G., Mislevy, R. J., Steinberg, L., Yan, D., & Williamson, D. (2015). Bayesian Networks in Educational Assessment. Tallahassee: Springer.

Ball, S. J. (2003). The teacher’s soul and the terrors of performativity. Journal of Education Policy, 18(2), 215–228.

Behrens, J. T., & DiCerbo, K. E. (2014). Harnessing the Currents of the Digital Ocean. In J. A. Larusson & B. White (Eds.), Learning Analytics: From Research to Practice (pp. 39–60). New York: Springer.

Embretson, S. E., & Reise, S. P. (2000). Item Response Theory for Psychologists. L. Erlbaum Associates.

Habermas, J. (1985). The Theory of Communicative Action: Reason and the rationalization of society. (T. McCarthy, Trans.). Boston: Beacon Press.

Habermas, J. (1987). Lifeworld and system: a critique of functionalist reason. (T. McCarthy, Trans.). Boston: Beacon Press.

Leighton, J. P., & Gierl, M. J. (2007). Cognitive Diagnostic Assessment for Education: Theory and Applications. New York: Cambridge University Press.

Leighton, J. P., & Gierl, M. J. (2011). The Learning Sciences in Educational Assessment: The Role of Cognitive Models. Cambridge University Press.

Leighton, J. P., Gierl, M. J., & Hunka, S. M. (2004). The Attribute Hierarchy Method for Cognitive Assessment: A Variation on Tatsuoka’s Rule-Space Approach. Journal of Educational Measurement, 41(3), 205–237. doi:10.1111/j.1745-3984.2004.tb01163.x

Lyotard, J.-F. (1984). The Postmodern Condition: A Report on Knowledge. Minneapolis: University of Minnesota Press.

Marcuse, H. (1974). Eros and Civilization: A Philosophical Inquiry Into Freud. Beacon.

Marcuse, H. (2012). One-Dimensional Man: Studies in the Ideology of Advanced Industrial Society (Vol. 8). Beacon Press.

Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47(2), 149–174. doi:10.1007/BF02296272

Masters, G. N. (2013). Reforming Educational Assessment: Imperatives, principles and challenges. Australian Education Review. Retrieved from

Rasch, G. (1980). Probabilistic Models for Some Intelligence and Attainment Tests. Chicago: MESA Press.

Singer, P. (2001). Hegel: A Very Short Introduction. Oxford: OUP Oxford.

Wright, B. D., & Masters, G. N. (1982). Rating Scale Analysis. Chicago: MESA Press.

Wright, B. D., & Stone, M. H. (1979). Best Test Design. Chicago: MESA Press.

Time for the emeriti to stop taking pot shots at the establishment

A rejoinder to Richard Teese – How our elite unis ‘game’ the VCE and ATAR

Richard Teese’s article on universities, the ATAR and VCE shows that Herbert Marcuse’s Great Refusal is alive and well among Australia’s elite ‘progressive’ thinkers. The Great Refusal continues to provide the vacuum that allows neoliberalism to flourish. Teese’s article, with its Great Refusal tinged with a bit of ‘dead-white-male’ transcendence, provides a caricature of authority that leaps established institutions at a single bound, swiping hard-working teachers along the way. Teese confounds and conflates many distinct interests, institutions and dynamics in his overarching spray at public institutions; in particular, the role of universities, the ATAR and the VCE.

Teese’s authority on universities must be respected, however. When he characterises their behaviour as bellicose, vain and market driven, we need to accept that these characterisations are true, even if somewhat localised and self-reflexive by way of Teese’s own university career. Yet these characteristics may not be attributable to other universities with their autonomous missions and values. Saying that universities ‘hide behind their students’, instead of taking ‘pride in their students’, could reflect Teese’s personal university experience, or may perhaps reflect a personal negative disposition. Nevertheless, each university has its own autonomous body politic consisting of its own academics, so sweeping generalisations about their collective motivations are likely to be erroneous.

As to the ATAR, it is what it is. It provides a common meeting point for individuals and institutions to coordinate their actions. The power of enfranchisement that the ATAR can bring is demonstrated by the celebration of Casimira Tipiloura’s achievement in being the first from the Tiwi Islands to attain an ATAR. The ATAR provides students with a personal indicator that allows them to open up conversations with a large range of institutions on future educational options. Of course, how individual universities take up that conversation is up to them. Some are indeed odious in priding themselves on the basis of QS rankings, research rankings, and world university rankings. While the ATAR is a good facilitator of conversations between institutions and students, it remains a crude instrument, and necessarily so given its academic and geographic scope as well as its focus on inclusiveness and fairness. Furthermore, it is an annual ranking that is not criterion referenced, so the substantive meaning of a rank is not stable from year to year. The ATAR therefore has no predictive validity and is not designed to be a predictive measure. As Teese identifies, the progress of students at the tertiary level is dependent on the preparedness of both students and academics, as well as the university’s balance of priorities between teaching and research. Commentary on the ATAR’s predictive validity is therefore ill informed.

The most pernicious aspect of Teese’s commentary relates to the VCE. The VCE remains a world leader as a broad-based credential. It has over one hundred subjects including community languages, has a wide range of assessment types, has a mix of internal and central assessment, has substantial teacher involvement in implementation at the school and central levels, and uses effective statistical techniques to articulate student VCE achievement with national and international institutions. The VCE involves detailed processes conducted within tight timeframes on a shoestring budget. Criticism in the vein of Teese’s can only damage these highly effective yet fragile processes operating within the public sphere. Teese’s criticism can only provide succour to neoliberal forces who would seek to apply proprietary uni-dimensional models to tertiary selection.

There are of course significant broader equity issues in education, and these have been well identified by studies such as the OECD’s PISA. It is interesting to note that the rearticulation and regurgitation of these same issues continues to be of interest to think tanks, research centres and academics alike. However, callous criticism of public and civil institutions such as the VCE and ATAR can only undermine the universal enfranchisement that these institutions seek to provide. It is a type of criticism that serves to dissolve publicly justified processes and have them replaced by opaque market-based mechanisms.

The tradition of the Great Refusal is therefore alive and well, and in some part explains why we have reached this current situation. Victoria is a state with a university ranked among the best in the world for education, a university whose academics wantonly slag off fellow state-based institutions, and a state with declining educational performance. I will not be as careless as Richard Teese with my inferences, however.

What is clear is that as societies become more complex, knowledge becomes increasingly differentiated; concurrent with this are demands for broader enfranchisement for students of all backgrounds. This requires enhanced technical and problem-solving skills across the educational industry to develop ever more comprehensive systems that are justifiably fair in ensuring enfranchisement at the national clearing house that is currently the ATAR. Unfortunately the current academic focus seems to be on cheap pot shots at the establishment, a stance that should have died in Paris in 1968.