Battles with AQA over English exams have shaken confidence in the concept of standards

The story of the last week has been about English GCSE and the mess that has been made of the grading. ‘Only’ 1.5% fewer grades at A*-C were awarded compared to 2011, but that still represents thousands of young people and hundreds of schools affected. The analysis by Daniel Stucke is superb, showing how the grade boundaries have changed and how the various components of the exam give the whole process of arriving at one final grade a high degree of complexity. Further excellent commentary is included in this Guardian piece and in the various posts by Geoff Barton. The campaigning work of ASCL and numerous Heads and commentators may yet pay off; let’s hope.

The perspective that I bring to this issue comes from spending significant amounts of time over the last year fighting a battle with AQA to get fair grades for our English students at A Level and GCSE. We’ve had numerous successful grade appeals and requests for full-cohort re-marks in the last three years, but in 2011 we hit a sizeable brick wall after we received some catastrophic and bizarrely anomalous results. Our English Literature GCSE results were the worst on record by miles; at A Level, sizeable numbers of students who’d gained top A grades in all previous components and tests were suddenly given Ds, Es and Us for their AS or A2 modules. This was with the same teaching team, highly experienced and totally up-to-speed with every aspect of the specification. We had to give up on the GCSE case, but when the A Level post-results services gave us no joy, we went through all the appeal stages. We put together evidence with scripts from 2010 and 2011 showing very clearly that essays of the same standard had been given different marks. It was clearly an issue with how the mark scheme was interpreted: out of the blue, a newly rigid interpretation was being applied, causing our candidates’ marks to ‘fall off the cliff’ if they deviated from a very narrow path through the criteria.

However, as we discovered, the appeal system was stacked against us and we failed to get any movement at all. This year, the same thing has happened: our A Level results are riddled with bizarre anomalies once again. More happily, our English Literature GCSE results are the best ever: a swing from 10% A* in 2011 to a staggering 70% A* in 2012, with similar students and the same teachers! Frankly, we’d be happy with a little consistency. How is it possible for such a huge swing to happen? We are ‘winners’ this year, but we have no reason to believe we won’t be ‘losers’ next time around.

What did the appeal process tell us?

1) AQA is empowered to conduct its own internal appeals process. This is the first-stage appeal. Here you meet face to face with the Chief Examiner. If he says the marking is accurate, it is impossible to argue. The panel, though independent, does not really have the capacity to challenge the word of the key person who determines the core standards in the exam. His word against ours? No contest.

2) There is no obligation for an exam board to compare work from one year to another; some limited national sampling is done, but no centre-specific comparisons are made. There is no system that allows large swings in one centre to trigger concern within AQA; the board is blind to these issues and will always point to macro, national-level data to prove that it is consistent.

3) The Appeals Panel does not have to give a reason for its decision. It simply says ‘upheld’ or ‘not upheld’ – which is quite appalling when faced with the latter!

4) The markers in the marking wing of AQA have nothing to do with the statisticians who convert marks into grades. The markers virtually disown the final grades and distance themselves totally from any responsibility for that part of the process. Conversely, the people setting the grades are almost totally divorced from any contact with actual work in the subject: they use statistics to generate grades from the examiners’ raw marks, and it is quite possible for them to lose sight of how the actual work may vary, especially close to the boundaries.

5) The second-level appeal, via the Examinations Appeals Board, is almost farcical in its Kafkaesque circularity. (My favourite joke: it is ‘literally Kafkaesque’.) They insist that they are independent but strictly non-specialist. This means they cannot make judgements on the standards of scripts. If your case depends on an analysis of scripts, they cannot take a view. Who else is at the hearing? The Chief Examiner! Here, once again, he simply has to re-state the view that the marking was not at fault and that the scripts presented are not of the required standard. Even when shown numerous comparable scripts from different years that were awarded different marks, the Chief Examiner is not required to answer – because, again, the Ofqual Code does not require this level of scrutiny.

Where does this leave us at KEGS? 

Sadly, we have developed the deep conviction that the quality of marking and the subsequent grading are the greatest variables in the process of students getting grades; it is not the students or the teachers. We soldier on, doing our best to develop a love of learning in English, and hope for the best. However, at A Level, we have abandoned AQA. In fact, from this September we will start the Cambridge Pre-U course in English instead of A Level English. There are other reasons (it looks like a wonderful course), but our total lack of confidence in AQA was the trigger.

How might this relate to the national situation?

The whole debacle shows how fragile our grading and standard-setting system is. It is a house of cards, and it wouldn’t take much to bring the whole thing down. Is the quality of a C-candidate’s work in 2011, January 2012 or June 2012 the same – or at least broadly the same within a range? There is a high degree of uncertainty; no-one appears to line up the actual work from a largish sample to determine whether some objective standards have been met. The whole system is overly dominated by de facto norm referencing, which is clearly incompatible with the idea that students have certain standards to meet. Messing around with this part-way through a two-year cycle is ridiculous; it is akin to stretching the tape measure at the long jump so that the measured distance changes, regardless of the actual distance travelled.

The outpourings of glee from the ‘Frothing Right’, as epitomised by Max Hastings in the Mail on Sunday lionising Michael Gove for taking a stand against grade inflation, simply reveal a total ignorance of how exams work. The fact is that it is perfectly possible – actually, close to a certainty – that D-grade students this year will have produced work that is of a higher standard in the detail of their actual writing than many C-grade students in previous years. Gerrymandering the bell curve does not equate to setting standards; to believe otherwise is simply ignorant in the literal sense. However, I’m certain the current issue is as much about AQA’s shortcomings as about Government interference. Let’s not be under any illusion that AQA has tools sufficiently sophisticated to determine with absolute accuracy that a C represents a consistent standard; it doesn’t.

Depressing? Well, frankly, yes. We need a massive re-think of this system and then a more honest debate about whether we want exams to be based on a set of absolute standards (thus allowing grades to go up as schools improve) OR to acknowledge openly that the system is essentially a giant ranking process within a norm-referenced competition for a fixed proportion of grades. It has to be one or the other.

(For totally different approaches, we need a wider debate, as I have suggested here.)
