Cross-posted from The Mandarin
There is a huge catch 22 driving impact measurement in human services. A lot of the evaluation is done because governments seek it, but then it goes nowhere – and for good reason. NGOs and others hoping to ‘scale-up’ innovation can’t escape this without something like an Evaluator-General, writes Nicholas Gruen.
There’s a spectre haunting service provision. For decades now, we’ve presumed that new and better services will emerge from innovation out in the field, with successful pilots and programs initiated by NGOs being grown to their appropriate size and unsuccessful ones being improved or closed down. But it almost never happens.
As Peter Shergold put it in 2013, things don’t actually work out this way:
Too much innovation remains at the margin of public administration. Opportunities are only half-seized; new modes of service delivery begin and end their working lives as ‘demonstration projects’ or ‘pilots’; and creative solutions become progressively undermined by risk aversion and a plethora of bureaucratic guidelines.1
As I’ve pondered this paradox over the past decade, it’s slowly dawned on me that, however well-intentioned we’ve been, it’s all built on a lie – a lie we’re not even admitting to ourselves.
There’s a catch 22 at the heart of the system.
The way it’s done now
Let’s say you’re running a pilot or a not-for-profit offering some innovative new approach to some service. You’re convinced it’s cost-effective and indeed generates all kinds of benefits for the community and, over time, for the government – in lower outlays on health and corrective services, and in higher employment and productivity.
You seek greater support from governments, and the bureaucracy’s response is that you should have the program independently evaluated. Already there’s a problem: there’s no standardised way to conduct this evaluation – no published expectations of what success looks like, or measures of the success or otherwise of the existing services and how they compare.
In any event, you then spend a hundred grand or more having the program independently evaluated: Deloitte or PwC start turning over the meter.
You submit your paperwork, now bolstered by independent evaluation. But nothing happens.
That’s because once your independent evaluation makes its way to the hardheads in the central financial agency, they barely bat an eye. The fact is, your report isn’t independent: you funded and commissioned it, and we all know how creative consultants can be in arriving at conclusions that support their clients’ interests. Catch 22. Your money and time have gone down the drain, and your innovation might linger for years at the periphery of government. Visitors might be taken to see it in all its glory. But it’s not going anywhere.
The required change of perspective
Policymakers have in their heads an idea that markets are spontaneously occurring phenomena, when, in fact, they’re forged over long periods of time and often involve concerted struggles that require the buy-in of the commanding heights of the system at both the administrative and political levels. As I’ll show in a subsequent article, even if innovation started bubbling up from below, validating it is just the start of a process because successful innovations can’t usually be scaled without redesigning the systems all around them.
But there’s wisdom in starting small — so how might we do that?
Already, in numerous areas, governments fund third-sector programs on condition that an evaluation is done. In those areas, funding for evaluation is therefore already in the budget and thus implicitly provided by government. There, governments should take greater responsibility for ensuring that evaluation is commissioned independently – not by the advocate for the new initiative – and they need to commit to taking full account of it. And given this, bringing both monitoring and evaluation into the process would streamline things considerably.
Enter the Evaluator-General
To address the risk that this is done in a cumbersome “top-down” manner, the process should be steered by a board with strong representation from the third sector.
The purpose would be to work towards a close, collaborative relationship between the third-sector organisations running these programs and the evaluators, who nevertheless remain independent. The goal should be for independent experts in monitoring and evaluation to help build the capacity of practitioners themselves to understand the success or otherwise of their own practice and how to continually improve it. Thus an accountability system would not be imposed upon practitioners from above, but built ‘bottom-up’ from their own self-transparency within a system they’ve collaborated in building. This is how Toyota built its own accountability for production line efficiency – workers on the line are trained in statistical control and endlessly measure and optimise it.
Governments would commit to publishing all such evaluations and also to conducting and publishing evaluations of existing practices. The process should be trialled in some specific area and commenced as soon as possible. Action in this area would be welcome from any of Australia’s three levels of government.
Tackling amnesia: The crucial step of following through
There’s another problem, and it’s growing: amnesia.
Governments commit to various actions but then don’t follow through. The problem is an international one, and well documented for Britain by the Institute for Government’s 2017 report All Change. It was brought home to me that same year when the Productivity Commission reported on Data Availability and Use, largely replicating recommendations of the Government 2.0 Taskforce, which I chaired in 2009. The recommendations needed to be reiterated not because ours had been rejected, but despite the fact that they’d been accepted!
With this failure to follow through now ingrained in our system, I suggest an independent source – for instance, the Auditor-General – should conduct a retrospective review of learning from pilot programs in Australian jurisdictions. The review would explore the extent to which intentions to learn from experimentation in pilots and to identify and grow the best of those pilots had been realised. In particular, such an initiative would review the extent to which programs that work well are discovered, learned from and expanded, whilst those that perform less well are improved or defunded.
- On the completion of this review, governments should commit to an explicit strategy to ensure pilots are well-chosen, evaluated and built on where they are successful.
- On the launch of this strategy, the government should commit to a regular Auditor-General review of its progress every two years.
- Whatever size they are, those delivering services should commit to building their own monitoring and evaluation systems in such a way that they:
  - are built upon the self-accountability of those in the field seeking to continually improve their impact
  - build that self-accountability alongside accountability to an independent, expert critical friend who is ultimately responsible for the monitoring and evaluation outputs from which the organisation’s accountability is built.
1. I think things are actually quite a bit worse than this. Ask yourself, when was the last time some small-scale initiative grew to national significance? Perhaps there are more recent examples, but the last one I can think of is Landcare, although its pathway wasn’t really through the system of service delivery so much as the political system. The first local group was formed in Winjallok, near St Arnaud in Victoria, in 1986, and within three years Bob Hawke had announced the year of Landcare, with the ACF’s Philip Toyne and the NFF’s Rick Farley bringing their very different constituencies behind it.