Are Exams Too Easy?

My son has just completed his GCSEs, so I was paying a bit more attention this year to all the hue and cry about exam results – but I’ve not been immune to the fact that, for as long as I can recall, we’ve been saying that exams are getting easier.

The other way of looking at it is that our children are getting brighter. This would be nice.

In fact, I don’t think either of those is the case; I do think exam results are being invalidated.

Here are the 2011 GCSE results:


You can see right away that the results are skewed heavily to the left – that is, most (nearly 70%) of entrants received a C or above. In fact, nearly half get a B or above, and nearly a quarter get an A or above.

This cannot be right. For GCSE results to have any meaning at all, then surely most entrants should be getting an average mark (C or D); in 2011 more entrants got A* – B than got C – D.

The orange line on the graph shows you the trend (and clearly shows you a ‘pull’ to the left). The red line shows what the trend should be, if the results were a normal distribution.

Normality

This is a statistics term, but it’s actually pretty straightforward (as long as I avoid the formulae!).

Normal distribution
Many large groups of data tend towards the same distribution – this is called the Normal Distribution. Because it’s what normally happens.

The data items cluster around the average, spreading out a bit as you move into the extreme results.

It might seem a bit weird to apply this to people, but remember we’re talking about averages here – so most people are average. That’s a key point to appreciate. It’s a point you might not like, but that’s what the word ‘average’ means.

For example – you may initially be shocked when told that 50% of people have an IQ of less than 100 – but of course they do; 100 is the average IQ. So 50% of all people will have an IQ lower than the average, and 50% of all people have an IQ above the average.

The same concept applies to everything – and the larger the group of data, the more closely it resembles the Normal Distribution – the basis of which is that 68% of all results are very close to the average result (within one ‘standard deviation’ of it, 34% to either side) – which you’ll notice doesn’t leave a lot of room for extreme results.
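If you'd like to check the 68% figure without touching the formulae, here is a minimal sketch using Python's standard library. The names `curve` and `share_within` are just ones I've picked for illustration; the only assumption is that the results follow a perfect Normal curve.

```python
from statistics import NormalDist

# A standard Normal curve: average 0, standard deviation 1.
curve = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of results lying within k standard deviations of the average."""
    return curve.cdf(k) - curve.cdf(-k)

within_one = share_within(1)        # roughly 68%
each_side = within_one / 2          # roughly 34% either side of the average
beyond_two = 1 - share_within(2)    # the 'extreme' results, both tails together

print(f"Within one standard deviation: {within_one:.1%}")
print(f"On each side of the average:   {each_side:.1%}")
print(f"More than two deviations away: {beyond_two:.1%}")
```

The last figure is the "not a lot of room" part: fewer than 1 in 20 results sit more than two standard deviations from the average.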

Exam results and Normality

Nationalised exam results are exactly the sort of thing that should trend towards normality – the point of a GCSE examination is not really to see how intelligent a child is. It’s to see how each child compares with the other children sitting the exam that year. That’s why the required exam marks for each grade aren’t (and cannot be) set in stone.

This is also why it doesn’t matter if the exam is ‘too easy’ or ‘too hard’ – you just move the goalposts. In an ‘easy’ exam, the average mark will be very high – so that’s your midpoint. In a ‘hard’ exam, the average mark will be very low, so that is your midpoint. Once you have every participant’s mark, you can work out the average and the standard deviation, and then set down grade boundaries.
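As a rough sketch of that goalpost-moving (not how any exam board actually does it), the snippet below sets boundaries relative to the cohort. The scheme is my own assumption for illustration: a C starts at the average, each grade above or below is one standard deviation wide, and anything under an E is an F. The marks are made up.

```python
from statistics import mean, pstdev

# Assumed scheme: minimum mark for each grade, as an offset from the
# average measured in standard deviations. Highest grade first.
OFFSETS = {"A*": 3, "A": 2, "B": 1, "C": 0, "D": -1, "E": -2}

def grade_boundaries(marks):
    """Return the minimum mark needed for each grade in this cohort."""
    average = mean(marks)
    deviation = pstdev(marks)  # standard deviation of the whole cohort
    return {grade: average + k * deviation for grade, k in OFFSETS.items()}

def grade_for(mark, boundaries):
    """Return the highest grade whose minimum mark has been reached."""
    for grade, minimum in boundaries.items():
        if mark >= minimum:
            return grade
    return "F"  # below the E boundary

# Made-up cohorts: the same spread of ability, sitting an easy and a hard paper.
easy_paper = [70, 75, 80, 85, 90]
hard_paper = [20, 25, 30, 35, 40]

easy_bounds = grade_boundaries(easy_paper)
hard_bounds = grade_boundaries(hard_paper)

print(grade_for(80, easy_bounds))  # the average mark on the easy paper
print(grade_for(30, hard_bounds))  # the average mark on the hard paper
```

Both of those average entrants come out with the same grade, even though one scored 80 and the other 30.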

Easy, yeah?

So why are our exam results so fucked up?

Well in 2011, nearly 70% of exam results were within the A* – C range, which might mean they’ve set the ‘average’ point somewhere around a B. So if an entrant performs about as well as average, he’ll get a high pass.

That might sound pretty good – we all want our children to leave school with good exam results. But the problem is that these aren’t good exam results; they are meaningless exam results (or at least, they are heading in that direction).

Here’s what the exam results from 1996 look like (why 1996? Because that’s the year I did my GCSEs!):

This is a lot closer to a Normal Distribution. Closer, but not quite there – so the weighting has been shifted a bit to encourage higher marks (an average here is likely to be a C).

Here’s what the 2011 GCSEs would look like if they were Normalised:

Now obviously in a real situation the graph wouldn’t perfectly match normality, but coming closer to this would give us a bit more fairness.

Look: 50% of entrants get A* – C; 50% get D or below. 34% get a C, 34% get a D.
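For anyone who wants to see where those percentages come from, here is a small sketch. It assumes the same thing as my graph: a perfect Normal curve, with each grade one standard deviation wide and the C/D boundary sitting on the average. Lumping everything below an E into a single F is my own simplification here, and the names are illustrative.

```python
from statistics import NormalDist

curve = NormalDist()  # standard Normal curve: average 0, standard deviation 1

# Assumed lower edge of each grade, in standard deviations from the average.
LOWER_EDGES = [("A*", 3), ("A", 2), ("B", 1), ("C", 0), ("D", -1), ("E", -2)]

def grade_shares():
    """Return the fraction of entrants landing in each grade band."""
    shares = {}
    above = 1.0  # cumulative share below the previous (higher) boundary
    for grade, edge in LOWER_EDGES:
        below = curve.cdf(edge)
        shares[grade] = above - below
        above = below
    shares["F"] = above  # everything under the E boundary
    return shares

shares = grade_shares()
for grade, share in shares.items():
    print(f"{grade:>2}: {share:.1%}")

top_half = sum(shares[g] for g in ("A*", "A", "B", "C"))
print(f"A* to C: {top_half:.0%}")
```

On these assumptions C and D each take about 34%, A* to C adds up to exactly half, and A* comes out at roughly 0.1%.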

One of the other problems I think we have with our exam system is too many low grades. Is there any real difference between an F, a G and a U? They are all, apparently, GCSE pass marks. But in the real world, they aren’t. You won’t get a place at college with a G in a required subject. An F in Maths won’t convince an employer you can handle cash without further training.

What these results really mean is ‘you have failed the exam’ – and to the entrant they either mean ‘thank god, I’m done with that subject forever’, or ‘I need to resit that exam’. So why not just replace all the alphabet soup with an F? We all know what F means – it means ‘Fail’.

A clearly defined fail grade also gives real meaning to the other grades, in particular D and E – which at the moment are stuck in a sort of limbo. Are they bad grades? Well, they’re better than a G – but you need at least a C (or really a B) for anyone to give a shit, so they aren’t useful grades… this isn’t fair or meaningful.

You may have noticed that only 0.1% of entrants qualify for an A*. Given that nearly 8% received A* grades this year, that’s quite a drop – but isn’t an A* supposed to be something special? If 1 in 12 people are getting them, then it’s just another grade. If 1 in 1000 people receive them, then it’s something special that only the brightest kids in each year earn.

Recalibrating Expectations

If we normalise exam results properly, it would mean that the vast majority of GCSE results were C or D.

We are currently very used to trumpeting kids for being straight-A students – and that would become a lot more rare. I don’t think that’s a problem – I think an A should be a mark of excellence (and an A* yet more so). B grades would be about as rare as A grades are now; and we’d have to get used to (or go back to) championing our children for their GCSE passes, rather than only focusing on the A – C grades they get.

Because, don’t fool yourselves, we’ve already been recalibrating our expectations over the past few decades – A grades used to be unusual. D grades didn’t always mean ‘failure’. Sixth Form colleges didn’t always demand As and Bs in required subjects – they asked for Cs and Ds; that alone should show us what the real value of each grade is nowadays.

I think we owe it to our children to ensure that their exam results have meaning; inventing new letters at the top of the alphabet simply will not cut it.

And aside from anything else, the other thing the Normal Distribution teaches us is that most people are average – teaching our children that they only matter if they get ‘above average’ results cannot be the lesson we want them to take away from school.

Not when 68% of them are average. Statistically speaking.

 


Dave received 4 As, 3 Bs and 2 Cs in his GCSEs.
One of the Bs was for Mathematics.

5 thoughts on “Are Exams Too Easy?”

  • 9th September 2011 at 12:12

    Much as I loved to see Brandon getting good grades, it does seem like the system is wrong at present.

  • 9th September 2011 at 14:02

    The cause, in my humble opinion, of what you have analysed so thoroughly, is that the GCSE was introduced to replace the O level and the CSE. In my day (and some other aged women of my acquaintance 😉) if you were clever you did O levels and the grades were A-C for passes, with D and E being fails. If you did CSEs it was the same. The problem was that students got divided at 14, or sometimes even younger if they went to a secondary modern school rather than a grammar school. This meant that if you were a bit dozy, or a late developer or whatever, it didn’t matter what you did, you could only get a CSE. Employers knew that only the dimmies did CSEs so even though there was rhetoric about a CSE A grade being the equivalent of a C at O level, everyone really knew (in the way that everyone does know these things) that CSEs were not as good. Also, they tended to be in slightly less demanding sounding subjects, like Combined science, or human biology, rather than hardcore Physics or English Literature.

    So the guvmint came up with the GCSE, which covered the whole academic range, with the A-C being the equivalent of an O level, and D and E the equivalent of CSEs A-C. The theory was that if someone who didn’t show much promise at 14, got a wriggle on and came good at 16 they could potentially get higher than a D. Thusly might social mobility be encouraged. Simples, as meerkats would say, and actually not such a daft idea.

    HOWEVER, then the guvmint, with that amazing talent it has, shot itself neatly and carefully in both feet! They did two things:

    1. They started setting a target for schools of how many students got grades A-C at GCSE. The impacts of this are many and varied, but they include that schools stopped entering students who were likely to only get a D, because then they wouldn’t show up as a bad statistic, and it also sends a message out that only grades A-C count, so the push is on to get everyone into that band.
    2. They started having separate papers for those predicted grades A-C or D-E, known as the hard paper and the easy paper. Can you spot the effect of this??? If you are entered for the easy paper, the most you can get is a D! In other words the CSE is back by another name. The only slight change is that everyone does the same curriculum until nearer the exams when predicted grades might be more accurate. But I have had conversations with students in year 11 where they say “My teacher says I should give the harder paper a go, but really I can’t be bothered.”

    There is also talk of schools carefully picking exam boards, curricula and subjects that generate the grades, rather than demonstrate academic rigour. I suppose there was always an element of this, but with targeting being such a blunt instrument it becomes a much bigger motivating factor than educating children.

    Glad to hear Brandon did well 🙂

    • 6th November 2011 at 16:30

      Targeting is definitely a massive issue in the education system at the moment – too many teachers (and schools) are worried about the appearance of progress (so they look good on the league tables), and less about teaching individual children.

  • 4th November 2011 at 09:55

    I disagree with the very original premise. I think the idea of exams should not be to test students against each other, but against a carefully defined set of skills and knowledge they should possess. One might argue that if there are so many good grades, one could expect more skills and knowledge of students of the same age in the next year (if this turns out to be a general trend), but adjusting the difficulty or the way things are graded in order to achieve an average of C in a class does more harm than good.

    They actually try to do this here in Germany, and press teachers to adjust their tests and/or grading in order to achieve some statistically desired outcome (e.g. a Gaussian curve with its peak at C), without giving any regard to what the students actually learn. I think some teachers have even been fired for constantly giving “too good grades”, despite the fact that indeed their students just really knew what they had learnt very well, i.e. the teacher was exceptionally good and the kids in his class were bright.

    • 6th November 2011 at 16:38

      That does sound like an ideal grading system – but I think it is pretty hard to objectively say “this is the required amount of knowledge for a C” – even with scientific subjects; and the more artsy the subject, the more subjective the grading becomes.

      Especially when you’re changing the exam every year (so students can’t cheat!), and then even more so when the curriculum gets changed as time goes by (more stuff gets added in, as society advances – we don’t teach kids to use slide rules in maths any more, for example – but they’re expected to know trigonometry at an earlier age than 100 years ago).

      I think the Gaussian distribution marking model is a ‘best fit’ solution – it’s not really fair; but it approximates fairness well enough.

      Most of the time.

      And only if the sample group is big enough (i.e. most English children are entered into the same exam for each subject) – it can’t really work on a class or even school level.
