Being at the Chartered College of Teaching’s Third Space event yesterday in Bristol (we were at the School of Education) was a great privilege. There were not a whole load of headteachers there, and it was very special for once to be among teachers, talking about assessment because tomorrow it matters. If policy-makers could have joined us, I think their eyes would have been opened in a good way. First of all, you don’t go to Bristol on your weekend from places as far afield as Cumbria, Cornwall and Holland unless you really believe in this stuff. And to see the determination of the Chartered College to reach people who really want to learn together, and its willingness to give up weekends to host such good, focused events, thoughtful and beautifully planned, is in its way a treasure and a gift to the profession.
It really feels like that. We had come from all over the place to learn about assessment and how we could make it more effective, but what we actually got was one another: listening hard, contributing sensitively, kindly led. This was as far as it could be from the manic world of Michael Barber, or the systems thinking that sometimes infects the EEF.
The morning began with an introduction from Alison Peacock, the college’s CEO, in which she expressed the hope that we would disrupt the landscape and be the grit in the oyster as we allowed research and reflection to take up more space in the classroom, so that we would have the confidence to say “this is what I am doing, this is why, this is the impact it is having and this is how I know.”
This is a virtually unanswerable argument to offer those “interested third parties who come from outside to see us”, and it enables us to move away from a highly defensive sandbagging approach in which we defend ourselves with reams of paper. The central question remains: what is the quality of the assessment that informs our teaching and enables children to flourish and do well?
The first presentation was from Stuart Kime at Durham, talking about whether assessment policies were worth the effort. Starting with Jerome Bruner’s contention that children should “experience success and failure not as reward and punishment, but as information” (On Knowing: Essays for the Left Hand, 1962), he posited that gaining information about what learning is going on is the purpose of assessment.
We are generally in an assessment world where extrinsic motivation is the norm. We have tests, and they come with high stakes attached, or with more negative meaning than children really need. The work of Ryan and Deci (2000) on intrinsic and extrinsic motivation gives a framework in which we can see children move from no motivation at all, through extrinsic motivation, to a much greater degree of self-regulation:
Gains in self-regulation, and movement toward intrinsic motivation, also appear among the aspects of teaching and learning that have higher effect sizes in the EEF Toolkit. Is it possible, Kime asked, to remove the reward-and-punishment aspect of assessment so that children’s interest, enjoyment or inherent satisfaction came from, and was stimulated by, the task at hand? In this world, assessment yields not “data” but information.
Assessment, contends Dylan Wiliam, is a “bridge between teaching and learning.” Bridges, by their nature, are strong, and made to carry a load. They are also there to ensure access, and designed to fulfil a particular purpose. Bridges are not all alike, but to be successful bridges they all need the same characteristics: strength, reliability, dependability, access. They give, in this metaphor, confidence and access to learning. Assessment, likewise, must give access to new learning from the teaching that has taken place. It must be hole-proof, and robust in every aspect of its construction. The policies or guides to an assessment system must also be fully established and clear, designed to help us cross the bridge easily. They need to be time-costed, so that we do not spend hopelessly long on assessing to gain a small amount of information, and put to the right purposes.
Campbell’s law states:
The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.
Achievement tests may well be valuable indicators of general school achievement under conditions of normal teaching aimed at general competence. But when test scores become the goal of the teaching process, they both lose their value as indicators of educational status and distort the educational process in undesirable ways.
The implication of this is that we need to use exquisite care in devising policies and guides to assessment, lest we end up with Paul Dressel’s quote (see the previous post) and find ourselves standing with a graded paper in our hand that means nothing. In this case, as I have often contended, a sheet of descriptors of what a child actually does, knows or understands is the accurate record. A numerically-malleable statistic is the approximation, not the reality.
Thus Stuart Kime concluded that actually, no, assessment policies are not worth the candle unless:
- They depict reality
- They are based on sound theory and practice, anchored in knowledge of assessment research
- They lead to better outcomes than would be caused by their absence
- They are manageable by normal humans under normal time pressures
- They are, and can be, understood by those who need to understand them.
- They are implemented as directed. Implementation is a problem not often commented on in education. We tend to assume, because of the scale of our organisations, that if an SLT member tells people, then they will do it. It doesn’t work like that, and the implementation gap can get in the way (see Peter Gollwitzer from NY on this). We need to ensure in our assessment policies that policy has a way of affecting the phase leadership, and that they have a plan for supporting assessment in the classroom.
How much time do we spend on assessment? If it is too long, then it had better be worth it! If it is not worth it, then stop and do something else. There are plenty of other approaches that are quicker and more reliable (such as comparative judgments).
As humans, we are not good at making absolute judgments. Our decisions are better when communal and comparative, so there is more accuracy when we are comparing two pieces of work than when judging a single one. Is a piece of work “original” or “highly original”? Does it show evidence of “mastery” or “excellence”, and which is better? When assessing, what is it we want? Assessment that is:
- effective in enhancing learning
- efficient for all concerned and which gives a good return on the investment made in energy, time, enthusiasm, patience, etc.
- able to carry the load we want of it – it must be valid and reliable
- able to produce high quality, useful, comprehensible information to all who need it, and on which we can depend.
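The comparative judgement approach mentioned above can be sketched in code. A minimal illustration, using the Bradley–Terry model fitted with a simple iterative update, shows how a rank order of pieces of work can be recovered purely from pairwise “which is better?” decisions. The script names and judgement data here are invented for the example; they are not from the presentation.

```python
from collections import defaultdict

# Each tuple records one paired judgement: (winner, loser).
# These judgements are illustrative, not real moderation data.
judgements = [
    ("script_A", "script_B"), ("script_A", "script_C"),
    ("script_B", "script_C"), ("script_A", "script_B"),
    ("script_C", "script_B"), ("script_A", "script_C"),
]

scripts = sorted({s for pair in judgements for s in pair})
wins = defaultdict(int)        # total wins per script
pair_count = defaultdict(int)  # number of comparisons per unordered pair
for winner, loser in judgements:
    wins[winner] += 1
    pair_count[frozenset((winner, loser))] += 1

# Bradley-Terry strengths via the classic iterative (MM) update:
#   p_i <- W_i / sum_j( n_ij / (p_i + p_j) )
strength = {s: 1.0 for s in scripts}
for _ in range(200):
    new = {}
    for i in scripts:
        denom = sum(
            pair_count[frozenset((i, j))] / (strength[i] + strength[j])
            for j in scripts
            if j != i and pair_count[frozenset((i, j))]
        )
        new[i] = wins[i] / denom if denom else strength[i]
    total = sum(new.values())
    strength = {s: v / total for s, v in new.items()}  # normalise to sum 1

# The rank order emerges from the fitted strengths.
ranking = sorted(scripts, key=strength.get, reverse=True)
print(ranking)
```

In practice, tools built for comparative judgement handle the model fitting; the point of the sketch is that many quick, relative decisions aggregate into a dependable scale without anyone ever assigning an absolute grade.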
If an assessment policy works well, it will be able to produce all of the above. To check whether a policy does these things, we simply reverse-engineer it: identify the practices that are effective and efficient and build on these; identify those that are ineffective and inefficient, stop doing them straight away, and find other practices that work. Make sure that teachers can answer the four assessment questions posited by Alison at the beginning of the day: What are you doing? Why? What effect will it have? How do you know?
Exemplars of work, from well-moderated judgments, are really helpful in any assessment policy, and perhaps the best contribution of time that we can make to the process.
Stuart’s presentation is linked to this image.