A minor miracle occurred last year at a small rainbow-painted primary school 10 minutes north of Whangarei. At the start of the year, 112 students at Hikurangi School sat the STAR reading test, and 51% ranked average or above. That’s pretty good, considering the school is decile one with a high level of transience and most of the students are Maori. But in term four, when the students sat the test again and principal Bruce Crawford fed those results into the system, he saw something astonishing: 91% of his kids had met the benchmark. He had been hoping for something like 70-80%. “I was just over the moon,” he says, with a fist-pump. “Yee-hah!”
But after the initial elation wore off, Crawford became suspicious. “If 91% of my kids are passing, that’s brilliant. But it’s rubbish. Because I know my kids.”
Meanwhile, at Edmonton School in Auckland, principal John Carrodus was scratching his head over the sudden leaps and bounds his students had made in writing. These students sat a test called e-asTTle. There is no straightforward way to quantify the improvement, but as Carrodus put it, “According to e-asTTle, we’ve got sows’ ears out here that we’ve turned into silk purses, and we’re going to have Shakespeares and Dylan Thomases oozing out of West Auckland.”
STAR reading and e-asTTle writing tests are important in primary and intermediate schools. About 1200 schools are using STAR, and in term four last year about 80,000 students sat e-asTTle. Most students sit the tests two or three times during the year, so progress can be tracked.
STAR is owned and run by the New Zealand Council for Educational Research (NZCER), a highly respected statutory body with a reputation for thorough, credible, unbiased research. The e-asTTle test is owned and run by the Ministry of Education. Both tests play crucial double roles. First, they are used as diagnostic tools for identifying specific strengths and weaknesses in individual students and across whole classes and cohorts. Some schools also show the results to parents.
However, one South Island principal, who did not want to be named, said when she gave e-asTTle results out last year, “we had to say, ‘Well, we can’t actually trust or believe in this, because Catherine isn’t showing this in any of the other forms of testing and evidence we’ve got.’”
Other schools may not make that clear, she says, “because [higher marks] are what their parents want, and it makes them look good, and they don’t want to question it. Some people are happy to inflate results and that’s always been our profession’s downfall.”
Teachers also use these test results – along with lots of other information – when deciding how to rank each student against the National Standards.
These standards are not tests in themselves but a set of specific benchmarks, which can be compared with a ladder. To work out where children sit on that ladder, teachers use a range of tests, including STAR and e-asTTle, as well as what they know of the children in class. They might interview them, observe them working or in conversation with others, look at samples of their work and ask them questions about it.
Both STAR and e-asTTle were revamped recently, with the new versions out in schools last year. Now there appears to be widespread confusion and concern about the unusually high results that these previously trusted and valued tests are producing. Education blogger Kelvin Smythe, who has published some of these concerns on his blog, www.networkonnet.co.nz, says he has received 30-40 emails from concerned principals.
The Ministry of Education is part-way through an investigation into the new e-asTTle test and says so far no major anomalies have been found.
But some principals are worried that less-scrupulous schools – or those whose staff simply don’t understand how the tests have changed – could be using the results to artificially boost their National Standards results. That in turn could give schools a higher ranking in the public league tables.
Paul Drummond, principal of Tahunanui School and outgoing head of the New Zealand Principals’ Federation, says the data “has the potential to be misused, either intentionally or otherwise”.
“I’d like to think there was professional integrity around this, [but] there are going to be enormous pressures to the contrary – to actually spin your data. There is so much pressure put on for schools to look good in those judgments, those scores.”
CONSPIRACY THEORY OR GLITCH?
So what is going on with these tests? Among principals, the answers range from conspiracy theory – voilà, marks are up just when the Government wants them to be – to a benign communication glitch.
Questions are being asked of the Ministry of Education and the NZCER. Both organisations acknowledge more could be done in communicating changes in the tests to the sector but insist their tests are valid and there has been no deliberate match-fixing.
But Crawford believes “a vast number of schools out there now … are still in the dark” about the changes, and that many will be delighted with their high reading and writing results, and will be using them to make decisions about National Standards.
“Conspiracy theory No 345 says that the ministry knows this, the ministry wants it because it’s going to make it look good … The only way it makes sense to me is as a political decision to inflate the outcomes for the big-picture agenda.”
The National Government has repeatedly said its focus is on getting pass rates up. On Wednesday, Education Minister Hekia Parata reiterated this, saying in a press release: “Our Government has an unrelenting focus on lifting achievement for all students.”
It is particularly focused on Maori and Pasifika students, who make up a large part of the “long tail” of underachievement.
This tail is what led former Ministry of Education head Lesley Longstone to make the controversial point in the most recent annual report that New Zealand cannot claim to have a world-class education system.
Chris Harwood, the senior manager for curriculum teaching and learning design for the ministry’s Student Achievement Group, says “we did get some concerns” about the e-asTTle results and gives two possible explanations.
One is that more students are now using the tool, which naturally means the spread of results is wider – there are more high marks and more low marks, because there are more marks in total.
Another explanation is that what the test is actually testing has fundamentally changed. The old e-asTTle test looked at the piece of writing each student did during a test, and took it at face value when producing a result. The new one uses that piece of writing as a starting point, and extrapolates to what the student could probably do with support from his or her teacher and without the pressure of the test.
Harwood: “What the review has done is to have a look at the writing that the student has done in a test situation, and then recalibrate internally in the tool to say, ‘If this is what a student does in an unsupported test situation, then it is likely with the kinds of supports that go on in a classroom writing situation, [that] this is where the student’s learning is up to.’”
Carrodus sees that as a thinly veiled exercise to inflate marks. “It’s rubbish. All they’ve done is shift the bell curve – they’ve given it a nudge sideways.”
So, did politics influence this change to e-asTTle, in a deliberate move to raise marks? Harwood: “I think the key message from the ministry would be that e-asTTle is a tool to help support teachers in their planning for teaching and learning, to really try to help understand where kids are up to. It is only one source of information. Teachers need to use their daily observations and their daily knowledge of children as well. And that when teachers are making their judgments about overall progress and achievement about students, this is only one source of information. So to directly link changes in one tool to any other kind of agenda is patently not true.”
Says Harwood: “Indications to date are that there is not a major differential between what the old tool may have produced as a score, and the new tool.
“Right now we are making decisions about what, if any, future investigation we need to do with the tool. We are aware that because of the tenor of the questions you’re asking and the principals are raising, we should do some more work around communication – making sure that the changes to the tool are well understood by those who are using it, and making sure that our providers who work in terms of professional development with teachers and leaders are also very well aware of the changes and what that means.”
It says something about the atmosphere in the sector that even the NZCER is facing allegations of crunching the raw test results in a way that paints a rosier overall picture.
The South Island principal, who has lost faith in both STAR and e-asTTle after “phenomenal” jumps in results, believes people at the NZCER have great integrity but the organisation “is under huge pressure to do things they don’t feel comfortable doing”.
Hikurangi principal Bruce Crawford says he had a “huge amount” of trust in the NZCER, but his recent experience with STAR has dashed that. He is now considering laying a Commerce Commission complaint against the council, on the basis that the tests are not as robust as they have been made out to be.
NZCER general manager of products and services Graeme Cosslett says he would welcome the chance to visit Crawford’s school and talk about his concerns. “We have extensive communications with schools and we apologise that we’ve not been able to engage with all of the schools as much as we would have liked to. But we’re certainly putting systems in place that we hope will not only allow for greater communications with schools, but also be able to provide greater support on the reporting and analysis side.”
Cosslett believes the issue is that Crawford and many other principals are still feeding their raw results through stanines – a nine-point scale based on the classic bell curve – when another method, scale scoring, is much more helpful in tracking progress. He says stanines can be helpful in other ways, but under the new test, they can give inflated results at the end of the year.
“A stanine is a much cruder measure of progress and it’s a less reliable measure,” he says. “Given that schools have been using scale scores with the Progressive Achievement Tests (PATs) since 2006, we underestimated the need for ongoing learning about the usefulness of scale scores for STAR, as opposed to stanines.”
Most schools would be using a mix of both methods, Cosslett believes.
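For readers unfamiliar with stanines, the short sketch below shows why Cosslett calls them a cruder measure. It assumes only the standard textbook definition of a stanine, in which nine bands hold 4, 7, 12, 17, 20, 17, 12, 7 and 4 percent of a norm group. It does not use NZCER's actual STAR scoring tables, the function name is hypothetical, and the percentile ranks are invented for illustration. How a rank falling exactly on a band boundary is treated is also an assumption here.

```python
from bisect import bisect_right

# Upper edges of stanines 1-8 as cumulative percentile ranks.
# These follow the textbook 4-7-12-17-20-17-12-7-4 split, not any
# table published for STAR.
STANINE_CUTOFFS = [4, 11, 23, 40, 60, 77, 89, 96]


def stanine(percentile_rank: float) -> int:
    """Map a percentile rank (0-100) to a stanine from 1 to 9.

    Assumption: a rank exactly on a boundary goes into the higher band.
    """
    return bisect_right(STANINE_CUTOFFS, percentile_rank) + 1


# Hypothetical students: two far apart, two close together.
for rank in (41, 59, 61):
    print(f"percentile rank {rank} -> stanine {stanine(rank)}")
```

In this illustration, students at the 41st and 59th percentiles land in the same stanine, while a student who moves only from the 59th to the 61st percentile appears to climb a whole band. A finer-grained scale would register both differences in proportion.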
In response to the allegations that the NZCER has bent under political pressure, Cosslett says: “STAR is owned and developed by NZCER. We have spent 18 months to two years investing in its redevelopment, at a cost of thousands of dollars and involving thousands of students. To suggest we would then somehow tinker with it to inflate the results because of political pressure is absurd and an insult to the integrity of NZCER and of the many people involved in the revision.”
Was any political pressure applied? “No, not at all. And in fact if we were asked to do that, to be honest we would ignore it, because we stand by our tests.”
Tahunanui School’s Drummond says “some principals are seeing some kind of conspiracy. I don’t see the conspiracy. I don’t believe that [the NZCER] would compromise their professional integrity around what those results would show … Having met the people involved, I don’t believe that.
“But it’s not NZCER that I fear. It’s what the politicians and/or parents may read into these [results], and what it may do to the way schools work.”
NATIONAL STANDARDS FEAR
A fear voiced by many of the principals interviewed for this story is that some time soon the Government will roll out a single hard-and-fast test for National Standards, ignoring all the nuances that teachers currently use to make their judgments.
The Ministry of Education’s Harwood is adamant that won’t happen. “The whole National Standards framework was based on the professionalism of teachers to make overall teacher judgments using a wide range of sources of information – which includes tests, observations, interviews, samples of work – and any thought that we were reducing National Standards to a test are just not true.”
So there is no discussion about that? “Absolutely none.”
Meanwhile, Crawford and some of the other principals interviewed say they will no longer let STAR or e-asTTle have any influence on their National Standards results. Instead, they will fall back on other assessments, including the old-fashioned running records kept by most teachers, and continue to put their faith in their teachers’ professional judgments.
Crawford: “If I was that type, I would have sat here, shut my mouth: ‘Yahoo, I’ve got the 91%!’ But you can’t.”
He shrugs. “You just can’t do that.”