OfQual Insights: More thoughts on exams.

Exams are more complicated than we want to believe.

This week I met Glenys Stacey, the CEO of OfQual. She’d invited me to share views about aspects of the examination system. I had previously met Amanda Spielman, the Chair of OfQual, who offered advice on our Baccalaureate model. OfQual leaders are keen to engage in dialogue with people in the profession, not least so that they have opportunities to explain their work. (Some of my ideas about exams are summarised and compiled in this post: Exam Reform: A blog manifesto.)

Some of the things I’ve had a chance to discuss include:

  • The issues surrounding marking and grading; error margins, variation across subjects, cliff-edges, the merits of points or grades systems; the variation in grade widths within and across subjects
  • The statistical processes used to ensure comparability between different Boards, across subjects and between years
  • The mechanics of grade inflation and possible solutions to tackle it whilst still allowing for system improvement.
  • The nature of the appeals system, blind re-marking, research on the quality of marking, the inherent variation in judgement-based marking; tech-innovations in marking systems.
  • Recent specific issues including successful changes to Science GCSEs, the growth and diverse nature of IGCSEs, the problems with GCSE English in 2012, recent difficulties around Speaking and Listening, multiple entry, reform to GCSEs, tiered papers.
  • The evidence for discrepancies between schools in the way they approach teacher assessed components; the impact this has on grade awards and the extent to which this can be controlled
  • The role of OfQual in working with Exam Boards and the DfE, maintaining independence, having a technical role but also a public profile, issues around communication and public understanding of examination systems.
  • The tension between envisaging an ideal system and establishing one, given the need to maintain public confidence during the transition; the political realities surrounding reform and the limits this places on making radical changes.

It’s all quite geeky but hugely fascinating.  There is so much more to know about how the system works and how it could improve. Given its critical importance to schools and students, it strikes me that not nearly enough people truly understand it. We’re often shouting at the injustice issued by the Black Box when, if we knew what was inside, we’d take a different view.

Thoughts about OfQual:

  • Regulating the exam system is a massive job: trying to ensure that A*s, Cs and Es have parity across every subject and exam board, referencing different specifications, modes of assessment and subject matter is an almighty task.
  • OfQual people are motivated to ensure fairness and rigour in the system and deal with this in a highly technical way. OfQual works closely with the four main exam boards and has links to the DfE, but I believe that it is more independent than people give it credit for. No-one tells OfQual to depress grades or halt grade inflation. They arrive at these conclusions based on technical advice from assessment experts. There’s no-one on the grassy knoll, so to speak.
  • OfQual has access to a mountain of data. They have information about the characteristics of every exam in every subject set by every Board.  They act on this information if their analysis suggests that standards are not being maintained or if significant anomalies emerge – which does happen.
  • When problems are detected at a national level, they seem to prefer to act quickly so that technical discrepancies are resolved –  in OfQual’s view so that unfairness does not persist – rather than set longer time-frames for changes to come into force, allowing existing qualifications to run their course. Political implications are not their concern and communications issues are not their highest priority or strength.  I’d suggest it would be better for all concerned if major mid-course changes were communicated such that Heads and teachers could engage with the rationale – but perceiving the changes as an injustice is wide of the mark.
  • I’m pretty sure that if we had access to the same information about, say, English GCSE grades overall or the scores for Speaking and Listening, we’d all be able to reach the same technical conclusions as OfQual.  I believe Glenys when she says that there are major inconsistencies in the way different centres award marks for Speaking and Listening. If she says it is so serious that S&L marks can’t be allowed to form part of the overall grade awards, we have no basis to argue: she’s saying that because of the analysis, whether we want to hear it or not.  Her rationale:  it is simply unfair for some students to gain more credit than others to the extent that is shown in the data, for performance that does not warrant it.  We can argue about process, the timing, the value of speaking and listening in the English curriculum but  I don’t think we would argue with the evaluation of the data.
  • Ofqual has too many issues to contend with at a national level to focus very much on resolving local issues with centres and exam boards.  Their main interest is in issues that have system-wide consequences; it is unrealistic to expect them to get involved where there are consequences for individuals, case by case.  They can’t do this with the resources they have and, to a large degree, rely on Boards to do their jobs well at this level. The appeals system is on their agenda as is the quality of marking.

Further thoughts about exams and exam reform.

We tend to treat exam results as absolutes to far too high a degree. Here is a bombshell: in papers with extended writing, there is no one correct mark. Marking is judgement-based; it is widely accepted that two or three markers could give a script different marks. None of them is necessarily wrong. This has conceptual implications for many of us; we are looking for absolute standards that are not present.

Grade boundaries should shift for any given exam from year to year; that is how we get stability in the system and some sense of consistent standards.  When people say ‘well, the grade boundaries shifted’ as if to indicate some form of injustice or interference, they’ve misunderstood the process. Year to year, grade boundaries have always shifted as a means of linking standards between papers. It is a result of applying routine standardisation processes based on a statistical distribution. We must not tell students that mark X will get you grade Y – that’s not how it works until after all the marks are in.

Although there is a checking mechanism by which subject experts seek to ensure that grades broadly agree with some absolute standards, grading is primarily a process of dividing the cohort into attainment bands. Grades say more about relative performance in a cohort than performance against fixed standards, even though the subject specialists have input into decisions around where exactly a grade boundary should be set. Conceptually, we need to be more explicit about that. Exhortations to narrow gaps or secure continuous improvement or meet floor targets indicate a fundamental lack of understanding about exams and grading. Norm referencing isn’t an evil thing; it is how our exam grading works at a core level…we may have lost sight of that.
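To make the idea of shifting boundaries concrete, here is a deliberately simplified sketch in Python of pure norm referencing. It is an illustration only, not the procedure OfQual or the Boards actually use, which also involves examiner judgement. The target percentages, the marks and the function names `set_boundaries` and `award` are all hypothetical.

```python
def set_boundaries(marks, cumulative_shares):
    """Return the minimum mark for each grade so that roughly the target
    share of the cohort reaches that grade or better.

    cumulative_shares: ordered (grade, percent) pairs, best grade first,
    e.g. ("B", 50) means 50% of candidates should get a B or above.
    """
    ranked = sorted(marks, reverse=True)
    boundaries = {}
    for grade, percent in cumulative_shares:
        # Position of the last candidate who falls inside this share.
        position = max(1, round(percent * len(ranked) / 100))
        boundaries[grade] = ranked[position - 1]
    return boundaries


def award(mark, boundaries):
    """Give the best grade whose boundary the mark reaches, else 'U'."""
    for grade, minimum in boundaries.items():  # insertion order: best first
        if mark >= minimum:
            return grade
    return "U"


# Hypothetical targets: 20% at A, 50% at B or above, 80% at C or above.
SHARES = [("A", 20), ("B", 50), ("C", 80)]

year_1 = [82, 75, 71, 68, 64, 60, 57, 52, 45, 38]
year_2 = [mark - 8 for mark in year_1]  # same cohort ability, harder paper

bounds_1 = set_boundaries(year_1, SHARES)
bounds_2 = set_boundaries(year_2, SHARES)

print(bounds_1)  # {'A': 75, 'B': 64, 'C': 52}
print(bounds_2)  # {'A': 67, 'B': 56, 'C': 44}
print(award(60, bounds_1), award(60, bounds_2))  # C B
```

In this toy example the same raw mark of 60 earns a C in one year and a B in the next, purely because the second paper was harder and the boundaries moved to keep the proportions stable. Candidates tied on a boundary mark all receive the grade, so real shares can drift slightly from the targets.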

Genuine, deep-level improvement in learning as a result of improved teaching and changes to a school’s ethos, leadership and management, is slow and steady. At a national level, this is likely to be very gradual if it is happening – ie if English children are getting better educated over time. Rapid change at school level or national level should be treated with scepticism and subjected to scrutiny. It suggests that either things were extremely bad in the past, that the profile of the learners has significantly changed on intake, or that the scale of improvement is an illusion and we’re simply seeing schools do better at training students to pass exams. However, grade inflation is a structural consequence of aggregating ‘benefit of the doubt’ on margins of error, year after year; it’s not because of ‘cheating’ or ‘dumbing down’.

In order to establish a system where improvement in real standards could be shown at national or school level, we need a much more stable baseline of measures – ie free from perverse incentives, gaming and multiple entries. We also need a radical change in the culture that labels every low grade as a failure.  Given that not all learners can gain the highest grades, we must develop a system that gives much greater weight and value to non-examined elements of a young person’s education – cue our English Bacc model or something like it.

I recognise that in the current political climate many of my exam reform suggestions are pie in the sky. They suggest too much change at a time when we’ve had enough already. However, we should be able to set out a 10-year plan for reform that cuts across the election cycle… so we aim in the right direction.

I’m certain that accountability reform needs to precede fundamental exam reform.  If ministers continue to insist on using blunt data instruments to hold us to account based on exams that are not designed for that purpose, we’ll never get the level of intelligent behaviour and integrity in the system that we need.

12 comments

  1. Interesting as ever, Tom. I do feel the time is approaching when we seriously consider ending formal external examinations at 16. The amount of time and money spent on re-marks etc. is a huge waste of public money. Why is it that when state schools use IGCSEs they are gaming the system, yet with the publication of GCSE Independent School results today they are lauded for increasing the number of entries for IGCSE?


    • I know what you mean. Glenys herself suggested it is an error to lump all IGCSEs together. Those with high levels of coursework or those used as double-entry devices are the ones being criticised. There isn’t enough regulation of IGCSEs. I’d like to see a reduced market for qualifications but then they’d need to meet the needs of all learners. There are penalties: eg in 2014 and 2015 our AQA science IGCSEs (technically called ‘Certificates’) will be zero-rated for performance tables. But we like the courses so we’ll see it through.


  2. Thanks for this, Tom. It’s good to get a head’s perspective.
    I have been working in the exams system for 35 years, initially as a teacher marking papers, more recently as a chief examiner and now as a reviser and occasional Ofqual consultant, and it is easy to forget that it is a black box to most folk.
    I am coming to the end of an MA (by research) in education, on the topic of marking the 6 mark questions. For that I have recently been reading the literature on reliability of marking; Ofqual and two of the Awarding Bodies have done a lot of research on this over the years. It is interesting stuff in a geeky kind of way.


  3. Tom, I’ve found this really useful to read. My concern is the direction of travel towards exam-only assessment. You highlight, and Ofqual seem to accept, that it is a myth that exam marking is accurate and to be trusted without question. This year we already have examples of massive changes to marks – D to a B grade at AS comes to mind. This is obviously exceptional but not unique amongst re-marks – do exams have any less error than other components? With inherent issues in all forms of assessment, Dylan Wiliam, at one of the SSAT Redesigning Schools symposiums, suggested keeping summative assessment distributed, extensive and synoptic (blogged about it here http://leadinglearner.me/2013/04/12/redesigning-classrooms-is-your-summative-assessment-trustworthy/). Why are Ofqual moving in a different direction if there is no political influence?
    I hope you will keep engaging in the debate and blogging about it. As professionals we must understand the system better if we wish to influence it. Hope you have a great start to the year. Regards, Stephen


  4. Apparently, Ofqual have now detected that there are major difficulties with the awarding of marks for reading and writing across the exam boards for the English GCSE, so they are removing these from the assessment scheme. Students will now have to go in and jump up and down on a trampoline and the resulting qualification will be called “English” (with air quotes). Unfortunately, Ofqual had no choice.


    • I sympathise with the frustration. However, there is clearly more to it and it would be worth considering that, just maybe, they’ve acted in a way that was entirely necessary, given the information they have about S&L assessments across the country. I think the problem is that this is not transparent and the timing precluded any discussion prior to the announcements. But it doesn’t necessarily make it a bad decision if their role is to ensure that a grade means something when it is awarded. The real problems arise if nothing is done when Exam Board practices are not sufficiently robust.


  5. Ofqual have just announced that from 2015 GCSE Maths students will have to remember equations BEFORE entering the exam:

    Thought you and colleagues might be interested in Phil Chambers and my new ebook, for which I attach a copy of the flyer below.

    Book is previewable via the link and your relevant Maths and Science teachers and pupils may of course be interested. Your library service may of course include a copy on its network.

    And any feedback or suggestions on our product (perhaps you would like to refrain from purchasing a copy until 2015 is closer) are of course appreciated.

    Thank you for your time!

    James Smith

    How long would it take for you to remember pi to 15 decimal places: 3.14159265358979? Wouldn’t it be easier to remember the phrase, “How I need a drink, alcoholic of course, after the heavy chapters involving quantum mechanics”? Just count the number of letters in each word. You don’t just have to use this for really long numbers. “I wish I knew (the root of two)” yields 1.414. If you need to remember the date 1852 then “A mnemonic helps me”.

    This is a very simple way to remember numbers. It’s just one of many more sophisticated techniques to help you remember equations, numbers and study effectively found in a new e-book …

    http://bit.ly/EquationsEM

    At last! The book that all maths, physics and economics students have been waiting for.

    HOW TO REMEMBER EQUATIONS AND FORMULAE

    Don’t take our word for it:

    “This is an outstanding and comprehensive book that delivers on every promise! All memory strategies including mind mapping and the journey system are here for you to depend on and you’ll quickly realize this is your most treasured memory resource”
    – Pat Wyman, founder HowToLearn.com and editor, Amazing Grades

    “If you need to remember formulae of any length, for study or work, and you’d like your hand held while you master this skill effortlessly in a fun way, you should buy this book today.”
    – Amanda Ollier, author of The Self Help Bible and The Mindset Shift

    Click on the following link for more information and to read our blog with free tips:

    http://bit.ly/EquationsEM

    Phil Chambers and James Smith

    phil@ecpcbooks.com
    Making difficult Things Easy Peasy
    jamessmith106@hotmail.com
    Tweet: JamesSmith106


  6. […] Having written about exams and assessment on my blog, I was delighted to be contacted by Amanda Spielman and then Glenys Stacey from OfQual who both separately invited me to meet them.  It was fascinating to meet the real people behind the organisation and to hear their views about the recent changes to exams; the issues of grade inflation, marking accuracy and some of the technical aspects of the exam system that are rarely understood or even discussed.  It’s clear, having met them, that they act on the basis of their technical analysis rather than because of political pressure.  We may not like to hear it but some of the changes were needed because things had simply gone awry.  My main conclusion is that not enough people really understand the mechanics  and inherent limitations of the exam system; we need to work on that.  I wrote about some of this here. […]

