If we knew precisely which things make the most difference to school improvement – which inputs secure the specific outputs we want – then everyone would be laughing. We would simply do more of those things and the outputs would improve. Voila! Bob’s your uncle!
However, of course, we don’t actually have a precise understanding of the inputs-to-outputs process, so we have to make educated guesses. That’s all we can do given the state of evidence in our profession. We make educated guesses about the optimal allocation of time to subjects, the nature of our assessments, the nature and frequency of interventions, the use of rewards and sanctions, the use of quality assurance and appraisal processes – all the elements of school life.
It’s only a problem when we make false assumptions such as:
- that an input change that ‘worked’ in improving outputs for some students or in one context must therefore work for other students or in a different context;
- that an input change that seemed to generate an improved output in one year can be repeated or sustained, generating similar improvements;
- that improvements in one particular output have a causal relationship to all the inputs that changed in that year.
It’s also a problem if some inputs give rise to multiple but competing outputs, such as teaching narrowly to a test for short-term recall and performance at the expense of exploring a wider curriculum for longer-term engagement with a subject at a higher level. Or assuming that, at system level, all students and schools can be winners, when it’s actually closer to a zero-sum game.
There are lots of measurable outputs, including attendance and examination results, that are determined by multiple inputs – some of which are in a teacher’s or school’s control and some of which are not. Some outputs, such as a student’s personal attributes, are much harder (if not impossible) to measure; you can only gauge them in subjective terms. Similarly, whereas some inputs are more tangible – things you do in practice, resources you provide and plans that you can write down – others are less easy to define and control – the spirit in which things are done, the quality of interactions and the level of intensity brought to bear.
In my experience, the difficulties arise when things go well or badly, because there’s a tendency to isolate the most tangible, measurable outputs and link them to the most tangible, measurable inputs, even if the links are tenuous or tangential. This then leads to a build-up of received wisdom, school-wide or system-wide, about what works or where the issues lie that isn’t actually well evidenced. For example, it could be that one school’s success is heavily driven by the ethos in their particular context – such that a change in students’ motivation and consequent diligence in relation to study and practice is highly effective at a particular point in time. However, the school’s use of data targets and PRP (performance-related pay) for teachers, or their Y11 intervention programme, might be credited with the success because they are more tangible. Conversely, all the opposites could be true: it could be deficits in ethos and motivation that lead to weaker outcomes, not deficits in data targets, PRP and intervention programmes.
One area that interests me is the intensity of in-house quality assurance processes – book scrutinies, learning walks and the like – and the impact these things have on outcomes. I’ve encountered a very wide range of these systems. Some are hyper-intense, with impressively systematic processes and recording systems to match; others are much more organic. To my mind, there isn’t a neat correlation between outcomes and QA intensity or the level of systematisation – but all kinds of (false) assumptions are made about this.
Another is the central area of teaching for exam success. Leaving aside the zero-sum bell-curve issue, it’s sometimes very difficult to pin down the precise input source for success or disappointment in the outcomes. Teachers who seem very ordinary (i.e. they don’t appear to do anything special) can get great results; others with a reputation for dazzling in the classroom can be disappointed in August. How often do we get into the subject-specific dirt here – the precise issues of pace and depth of topic coverage, concept sequencing, the frequency of practice across the range of topics and question types, the persistence of misconceptions, students’ capacity to perform under timed conditions – all of that?
These nitty-gritty issues, and the way they vary between teachers and teaching groups, are probably where cause-effect connections can be made, but it’s not clear that our scrutiny processes actually reach into that territory, especially for non-specialists. Leaders might focus on compliance with behaviour protocols, records of how many PP students attended the Saturday revision sessions and various generic presentational issues in book scrutiny exercises, because these are things they can actually do. These might only serve as loose proxy indicators for the things that really matter; even though there is some plausible cause-effect link to be made, they might all be red herrings. Even if you get good attendance from a certain sub-group at your interventions, this might not be the reason they do well. Sure, it can’t harm them, but it could be a mistake to put more resources into interventions if they’re not the source of the success.
My main issue with this is that, despite the obvious complexity, all too often people just want easy answers. We end up with systems in schools that take time and energy for very little gain; they serve the accountability masters with tangible evidence of plausible cause-effect school improvement activity; anything you can stick in a table or a graph satisfies the itch for control. But they don’t necessarily make a damned bit of difference. We should at least be able to raise the questions: Why are we doing this? What difference does it make? How do we know?